Keywords

1 Introduction

UnderWater Acoustic Sensor Networks (UW-ASNs) [1] have been deployed to carry out collaborative monitoring tasks including oceanographic data collection, disaster prevention, and navigation. To enable advanced underwater explorations, Autonomous Underwater Vehicles (AUVs), equipped with underwater sensors, are used for information gathering. Underwater gliders are one type of battery-powered energy-efficient AUVs that use hydraulic pumps to vary their volume in order to generate the buoyancy changes that power their forward gliding. These gliders are designed to rely on local intelligence with minimal onshore operator dependence. Acoustic communication technology is employed to transfer vital information (data and configuration) among gliders underwater and, ultimately, to a surface station where this information is gathered and analyzed.

Position information is of vital importance in mobile underwater sensor networks as the collected data has to be associated with appropriate location in order to be spatially reconstructed onshore. Even though AUVs can surface periodically (e.g., every few hours) to locate themselves using Global Positioning System (GPS)—which does not work underwater—over time inaccuracies in models for deriving position estimates, self-localization errors, and drifting due to ocean currents will lead to the increase of position uncertainty of underwater vehicle. Such uncertainty may degrade the quality of collected data and also the efficiency, reliability, and data rates of underwater inter-vehicle communications [39]. Besides the need to associate sensor data with 3D positions, position information can also be helpful for underwater communications. For example, underwater geographic routing protocols (e.g., [23, 25]) assume the positions of the nodes are known. AUVs involved in exploratory missions usually follow predicable trajectories, e.g., gliders follow sawtooth trajectories, which can be used to predict position and, therefore, to improve communication.

By leveraging the predictability of the AUVs’ trajectory, the energy consumption for communication can be minimized by delaying packet transmissions in order to wait for a favorable network topology, thus trading end-to-end (e2e) delay for energy and/or throughput.Footnote 1 For instance, Fig. 1 depicts a scenario where glider \(i\) waits for a certain time period \(\varDelta t\) \(\mathrm {[s]}\) to save transmission energy and to achieve higher throughput. Based on \(j\)’s and \(d\)’s trajectory, glider \(i\) predicts a ‘better’ topology with shorter links after \(\varDelta t\) and postpones transmission in favor of lower transmission energy and higher data rate. This approach differs from that proposed for Delay Tolerant Networks (DTNs), where delaying transmission becomes necessary to overcome the temporary lack of network connectivity [11].

To estimate an AUV’s position, in [9] we proposed a statistical approach to estimate a glider’s trajectory. The estimates were used to minimize e2e energy consumption for networks where packets in the queue need to be forwarded right away (delay-sensitive traffic). In this work, we focus on delay-tolerant traffic and propose an optimization framework that uses acoustic directional transducers to reduce the computation and communication overhead for inter-vehicle data transmission. Moreover, we offer the distinction between two forms of position uncertainty depending on the network point of view, i.e., internal and external uncertainty, which refer to the position uncertainty associated with a particular entity/node (such as an AUV) as seen by itself or by others, respectively (see Sect.  5.1 for more details).

Fig. 1
figure 1

Glider \(i\) delays its transmission by \(\varDelta t\) waiting for a better topology so to improve e2e energy and/or throughput to destination \(d\). Wide arrows represent the packet forwarding routes and dashed/dotted simple arrows represent glider trajectories

Based on the estimated external uncertainty, we propose QUO VADIS,Footnote 2 a QoS-aware underwater optimization framework for inter-vehicle communication using acoustic directional transducers. QUO VADIS is a cross-layer optimization framework for delay-tolerant UW-ASNs that jointly considers the e2e delay requirements and constraints of underwater acoustic communication modems, including transducer directivity, power control, packet length, modulation, and coding schemes. Specifically, the proposed framework uses the external-uncertainty region estimates of the gliders and forwards delay-tolerant traffic with large maximum e2e delay, which includes Class I (delay-tolerant, loss-tolerant) traffic and Class II (delay-tolerant, loss-sensitive) traffic [25]. Moreover, our cross-layer communication framework exploits the frequency-dependent radiation pattern of underwater acoustic transducers. By decreasing the frequency band, transducers can change their “directivity” turning from being almost omnidirectional (with a gain of \({\approx } 0\,{\mathrm {dBi}}\))—which is a desirable feature to support neighbor discovery and multicasting, geocasting, anycasting, and broadcasting—to directional (with gains up to \(10\,{\mathrm {dBi}}\))—which is useful for long-haul unicast transmissions.

The contributions of this work are as follows:

  • We offer the distinction between two forms of position uncertainty (internal and external, depending on the view of the different nodes). A statistical approach is then proposed to estimate the position uncertainty and this estimated uncertainty is then used to improve network performance.

  • We exploit the frequency-dependent directivity of the acoustic transducer that is originally used as omnidirectional transducer at one frequency to optimize network performance.

  • We propose a distributed communication framework for delay-tolerant applications where AUVs can conserve energy by waiting for a ‘good’ network topology configuration, e.g., a favorable alignment, before starting to communicate.

The remainder of this chapter is organized as follows. We first introduce the basic knowledge on underwater acoustic sensor networks in Sect.  2 and review the related work in Sect.  3. Then we present the underwater communication model in Sect.  4 and propose our solution, QUO VADIS, in Sect.  5. In Sect.  6, performance evaluation and analysis are carried out, while conclusions are discussed in Sect.  7.

2 Basics of Underwater Acoustic Sensor Networks

UW-ASNs are applied in a broad range of applications, including environmental monitoring, undersea exploration, disaster prevention, assisted navigation and tactical surveillance.

Underwater networking is a rather unexplored area although underwater communications have been experimented since World War II, when, in \(1945\), an underwater telephone was developed in the United States to communicate with submarines [29]. Acoustic communications are the typical physical layer technology in underwater networks. In fact, radio waves propagate at long distances through conductive sea water only at extra low frequencies (\(30{-}300\,{\mathrm {Hz}}\)), which requires large antennae and high transmission power. For example, the Berkeley Mica\(2\) Motes, the most popular experimental platform in the sensor networking community, have been reported to have a transmission range of 120 \({\mathrm {cm}}\) in underwater at \(433\,{\mathrm {MHz}}\) by experiments performed at the Robotic Embedded Systems Laboratory (RESL) at the University of Southern California. Optical waves do not suffer from such high attenuation but are affected by scattering. Moreover, transmission of optical signals requires high precision in pointing the narrow laser beams. Thus, acoustic waves are generally used for underwater communications [34].

The traditional approach for ocean-bottom or ocean-column monitoring is to deploy underwater sensors that record data during the monitoring mission, and then recover the instruments [27]. This approach has the following disadvantages:

1. No real-time monitoring: The recorded data cannot be accessed until the instruments are recovered, which may happen several months after the beginning of the monitoring mission. This is critical especially in surveillance or in environmental monitoring applications such as seismic monitoring.

2. No online system reconfiguration: Interaction between onshore control systems and the monitoring instruments is not possible. This impedes any adaptive tuning of the instruments, nor is it possible to reconfigure the system after particular events occur.

3. No failure detection: If failures or misconfigurations occur, it may not be possible to detect them before the instruments are recovered. This can easily lead to the complete failure of a monitoring mission.

4. Limited storage capacity: The amount of data that can be recorded during the monitoring mission by every sensor is limited by the capacity of the onboard storage devices (memories, hard disks).

Therefore, there is a need to deploy underwater networks that will enable real-time monitoring of selected ocean areas, remote configuration and interaction with onshore human operators. This can be obtained by connecting underwater instruments by means of wireless links based on acoustic communication.

To communicate with each other acoustically, underwater sensor nodes need to use acoustic modems, which are able to convert electrical signals into sound waves and vice versa. As of today, many acoustic modems—such as those designed and manufactured by companies like LinkQuest, Teledyne Benthos, DSPComm are commercially available to provide communication capabilities in different underwater environments. These modems uses communication techniques such as Frequency-Shift Keying (FSK), Phase-Shift Keying (PSK), Direct Sequence Spread Spectrum (DSSS) and Orthogonal Frequency-Division Multiplexing (OFDM), offering data rates up to 38.4 \({\mathrm {kbps}}\) over different communication ranges, i.e., short range (up to about 500 \({\mathrm {m}}\)), medium range (up to about 4000 \({\mathrm {m}}\)), and long range (up to about 10000 \({\mathrm {m}}\)) in different underwater environments (shallow water or deep water) for different communication link setups (vertical or horizontal communication link).

Fig. 2
figure 2

WHOI micro-modem connected to different transducers

These modems have been used in different underwater communication networks. However, they are generally big in size, which is not suitable for underwater vehicles such as the SLOCUM glider. Due to the size constraint, the popular choice for underwater gliders today is the Micro-Modem produced by Woods Hole Oceanography Institution (WHOI), as shown in Fig. 2. The WHOI Micro-Modem is currently the state-of-the-art modem used on the SLOCUM glider. It is compact in size (including the transducer), offering data rates from 80 to 5300 \({\mathrm {bps}}\) with communication range of up to a few kilometers. Such feature makes it an appropriate choice for AUVs like underwater gliders.

Many researchers are currently engaged in developing networking solutions for terrestrial wireless ad hoc and sensor networks. Although there exist many recently developed network protocols for wireless sensor networks, the unique characteristics of the underwater acoustic communication channel, such as limited bandwidth capacity and variable delays [28], require very efficient and reliable new data communication protocols.

Major challenges in the design of underwater acoustic networks are as the following.

  • The available bandwidth is severely limited;

  • The underwater channel is severely impaired, especially due to multi-path and fading problems;

  • Propagation delay in underwater is five orders of magnitude higher than in Radio Frequency (RF) terrestrial channels, and extremely variable;

  • High bit error rates and temporary losses of connectivity (shadow zones) can be experienced, due to the extreme characteristics of the underwater channel;

  • Battery power is limited and usually batteries can not be recharged, also because solar energy cannot be exploited;

  • Underwater sensors are prone to failures because of fouling and corrosion.

Underwater acoustic communications are mainly influenced by path loss, noise, multi-path, Doppler spread, and high and variable propagation delay. All these factors determine the temporal and spatial variability of the acoustic channel, and make the available bandwidth of the underwater acoustic channel limited and dramatically dependent on both range and frequency. Long-range systems that operate over several tens of kilometers may have a bandwidth of only a few \({\mathrm {kHz}}\), while a short-range system operating over several tens of meters may have more than a hundred \({\mathrm {kHz}}\) of bandwidth. In both cases these factors lead to low bit rate [7], in the order of tens of \({\mathrm {kbps}}\) for existing devices.

Here after we analyze the factors that influence acoustic communications in order to state the challenges posed by the underwater channels for underwater sensor networking. These include:

Fig. 3
figure 3

Path loss of short range acoustic channel versus distance and frequency in band 1–50 \({\mathrm {kHz}}\)

Path loss: Attenuation is mainly provoked by absorption due to conversion of acoustic energy into heat. The attenuation increases with distance and frequency. Figure  3 shows the acoustic attenuation with varying frequency and distance for a short range shallow water acoustic channel, according to the Urick’s propagation model in [37] (see Sect.  4 for more details). The attenuation is also caused by scattering and reverberation (on rough ocean surface and bottom), refraction, and dispersion (due to the displacement of the reflection point caused by wind on the surface). Water depth plays a key role in determining the attenuation. Geometric Spreading refers to the spreading of sound energy as a result of the expansion of the wavefronts. It increases with the propagation distance and is independent of frequency. There are two common kinds of geometric spreading: spherical (omni-directional point source), which characterizes deep water communications, and cylindrical (horizontal radiation only), which characterizes shallow water communications.

Noise: Man-made noise is mainly caused by machinery noise (pumps, reduction gears, power plants), and shipping activity (hull fouling, animal life on hull, cavitation), especially in areas encumbered with heavy vessel traffic. Ambient noise is related to hydrodynamics (movement of water including tides, current, storms, wind, and rain), and to seismic and biological phenomena. In [14], boat noise and snapping shrimps have been found to be the primary sources of noise in shallow water by means of measurement experiments on the ocean bottom.

Multi-path: Multi-path propagation may be responsible for severe degradation of the acoustic communication signal, since it generates Inter-Symbol Interference (ISI). The multi-path geometry depends on the link configuration. Vertical channels are characterized by little time dispersion, whereas horizontal channels may have extremely long multi-path spreads. The extent of the spreading is a strong function of depth and the distance between transmitter and receiver.

High delay and delay variance: The propagation speed in the acoustic channel is five orders of magnitude lower than in the radio channel. This large propagation delay (\(0.67\,{\mathrm {s/km}}\)) can reduce the throughput of the system considerably. The very high delay variance is even more harmful for efficient protocol design, as it prevents from accurately estimating the round trip time (RTT), which is the key parameter for many common communication protocols.

Doppler spread: The Doppler frequency spread can be significant in acoustic channels [34], causing a degradation in the performance of digital communications: transmissions at a high data rate cause many adjacent symbols to interfere at the receiver, requiring sophisticated signal processing to deal with the generated ISI. The Doppler spreading generates a simple frequency translation, which is relatively easy for a receiver to compensate for; and a continuous spreading of frequencies, which constitutes a non-shifted signal, which is more difficult to compensate for. If a channel has a Doppler spread with bandwidth \(B_{BW}\) and a signal has symbol duration \(T_{sym}\), then there are approximately \(B_{BW}T_{sym}\) uncorrelated samples of its complex envelope. When \(B_{BW}T_{sym}\) is much less than unity, the channel is said to be underspread and the effects of the Doppler fading can be ignored, while, if greater than unity, it is said to be overspread [17].

Most of the described factors are caused by the chemical-physical properties of the water medium such as temperature, salinity and density, and by their spatio-temporal variations. These variations, together with the wave guide nature of the channel, cause the acoustic channel to be highly temporally and spatially variable. In particular, the horizontal channel is by far more rapidly varying than the vertical channel, in both deep and shallow waters.

3 Related Work

We review the following areas: geographical routing solutions, terrestrial and underwater DTN solutions, solutions using directional transducers and underwater cross-layer optimization solutions, which are related to our work.

Geographic routing protocols rely on geographic position information for message forwarding, which requires that each node can determine its own location and that the source is aware of the location of the destination. In this way the message can be routed to the destination without knowledge of the network topology or a priori route discovery. Geographic routing protocols offer a number of advantages over conventional ad hoc routing protocols. Geographic routing does not require maintenance of routing tables or route construction prior to or during the forwarding process. Packet forwarding also allows a packet to adapt to topology change by selecting the next best hop based on the geographic location. It is also scalable as it does not rely on information that depends on the network size. Here we review some well-known geographic routing schemes that are proposed for terrestrial wireless networks as research on underwater geographic routing is still very limited. Many geographical routing schemes, including some well-known ones such as Most Forward within Radius (MFR) scheme [36], Greedy Routing Scheme (GRS) [12] and Compass Routing Method (CRM) [18], have been proposed for terrestrial wireless networks. In MFR, the message is forwarded to the neighbor that is closest to the destination, while in GRS a node selects the neighbor whose projection on the segment from the source to destination is closest to the destination (i.e., the node with maximum advance to the destination). In the Compass Routing Method (CRM) [18], a message is forwarded to a neighbor whose direction from the transmitter is the closest to the direction to the destination. In [22], a scheme called Partial Topology Knowledge Forwarding (PTKF) is introduced, and is shown to outperform other existing schemes in typical application scenarios. Based on the estimate using local neighborhood information, PTKF forwards packet to the neighbor that has the minimal e2e routing energy consumption. These solutions are proposed for terrestrial wireless networks. In UW-ASNs, they may not work well since propagation of acoustic signals is quite different from that of radio signals. Moreover, localization underwater is generally more difficult than in the terrestrial environment.

Delay tolerant networks are networks that have intermittent connectivity between network nodes, such as networks operating in mobile or extreme terrestrial environments, or interplanetary networks in deep space. In other words, DTNs are characterized by the lack of connectivity, resulting in a lack of instantaneous end-to-end paths. For networks using conventional protocols, such intermittent connectivity causes loss of data, where packets that cannot be forwarded immediately are dropped. For example, in TCP/IP networks, temporary disconnections may cause the slower packet retransmission. If packet dropping is too severe, TCP eventually ends the session, causing the applications to fail. To address this problem, protocols are designed carefully to support such intermittent communications between nodes in DTNs. Using the store-and-forward approach, a packet is incrementally moved and stored across the network so that it will eventually reach its destination. In this way, the reliability of packet forwarding can be guaranteed in DTNs. A common goal in many DTN routing protocols is to maximize end-to-end reliability. A common technique used to achieve this goal is to replicate copies of the message in the hope that it will succeed in reaching its destination.

Solutions for DTNs have been proposed for communications within extreme and performance-challenged environments where continuous e2e connectivity does not hold most of the time [5, 11]. Many approaches such as Resource Allocation Protocol for Intentional DTN (RAPID) routing [2], Spray and Wait [32], and MaxProp [4], are solutions mainly for intermittently connected terrestrial networks. RAPID [2] translates the e2e routing metric requirements such as minimizing average delay, minimizing worst-case delay, and maximizing the number of packets delivered before a deadline into per-packet utilities. At a transfer opportunity, it replicates a packet that locally results in the highest increase in utility. Spray and Wait [32] “sprays” a number of copies per packet into the network, and then “waits” until one of these nodes meets the destination. In this way it balances the tradeoff between the energy consumption incurred by flooding-based routing schemes and the delay incurred by spraying only one copy per packet in one transmission. MaxProp [4] prioritizes both the schedule of packets transmissions and the schedule of packets to be dropped, based on the path likelihoods to peers estimated from historical data and complementary mechanisms including acknowledgments, a head-start for new packets, and lists of previous intermediaries. It is shown that MaxProp performs better than protocols that know the meeting schedule between peers. These terrestrial DTN solutions may not achieve the optimal performance underwater as the characteristics of underwater communications are not considered. Hence, in the rest of this section, we focus on related solutions for UW-ASNs.

Several DTN solutions for UW-ASNs have been proposed in [8, 15, 16, 21]. In [8], an energy-efficient protocol is proposed for delay-tolerant data-retrieval applications. Efficient erasure codes and Low Density Parity Check (LDPC) codes are also used to reduce Packet Error Rate (PER) in the underwater environment. In [15], an adaptive routing algorithm exploiting message redundancy and resource reallocation is proposed so that ‘more important’ packets can obtain more resources than other packets. Simulation results showed that this approach can provide differentiated packet delivery according to application requirements and can achieve a good e2e performance trade-off among delivery ratio, average e2e delay, and energy consumption. A Prediction Assisted Single-copy Routing (PASR) scheme that can be instantiated for different mobility models is proposed in [16]. An effective greedy algorithm is adopted to capture the features of network mobility patterns and to provide guidance on how to use historical information. It is shown that the proposed scheme is energy efficient and cognizant of the underlying mobility patterns.

In [21], an approach called Delay-tolerant Data Dolphin (DDD) is proposed to exploit the mobility of a small number of capable collector nodes (namely dolphins) to harvest information sensed by low power sensor devices while saving sensor battery power. DDD performs only one-hop transmissions to avoid energy-costly multi-hop relaying. Simulation results showed that limited numbers of dolphins can achieve good data-collection requirements in most application scenarios. However, data collection may take a long time as the nodes need to wait until a dolphin moves into the communication ranges of these nodes.

Compared to the number of approaches using directional antennae for terrestrial wireless sensor networks, solutions using directional transducers for UW-ASNs are very limited due to the complexity of estimating position and direction of vehicles underwater. Moreover, these solutions generally assume the transducers are ideally directional, i.e., they assume the radiation energy of the transducer is focused on some angle range with no leaking of radiation energy outside this range. For example, such transducers are used for localization using directional beacons in [19] and for directional packet forwarding in [40]. These solutions also use only one frequency. In our work, rather than using the ideal transducer model, we consider the radiation patterns of existing real-world transducers at different frequencies in order to minimize energy consumption for communications.

Over these years, cross-layer optimization becomes a popular choice to improve the performance in wireless networks. By removing the strict constraints on the communication interfaces between layers that are defined in the standard Open Systems Interconnection (OSI) model, different layers can share more information and interact with each other in order to improve the network performance. For example, in the physical layer, a node can change its channel coding based on the packet error rate from the link layer. Cross-layer optimization has been shown to be an effective way to improve the network performance, especially in a harsh environment such as the underwater [33]. A cross-layer optimization solution for UW-ASNs has been proposed in [25], where the interaction between routing functions and underwater characteristics is exploited, resulting in improvement in e2e network performance in terms of energy and throughput. Another cross-layer approach that improves energy consumption performance by jointly considering routing, MAC, and physical layer functionalities is proposed in [23]. These solutions, however, do not consider uncertainty in the AUV positions and are implemented and tested only by software simulation platforms and are not designed for delay-tolerant applications. On the contrary, we propose a practical uncertainty-aware cross-layer solution that incorporates the functionalities of the WHOI Micro-Modem [13] to minimize energy consumption. Moreover, our solution is implemented on real hardware and tested in our emulator integrating WHOI underwater acoustic modems.

4 Network Model

In this section we introduce the network model that our solution is based on and state the related assumptions. Suppose the network is composed of a number of gliders, which are deployed in the ocean for long periods of time (weeks or months) to collect oceanographic data. For propulsion, they change their buoyancy using a pump and use lift on wings to convert vertical velocity into forward motion as they rise and fall through the ocean. They travel at a fairly constant horizontal speed, typically 0.25 \({\mathrm {m/s}}\) [1]. Gliders control their heading toward predefined waypoints using a magnetic compass.

Assume the gliders need to forward the data they sensed to a collecting glider. The slow-varying and mission-dependent (and, for such reasons, ‘predictable’) trajectory of a glider is used in our solution to estimate another glider’s position using the position and velocity estimate of some time earlier. A glider estimates its own trajectory and position uncertainty using its own position estimates; the parameters of the estimated trajectory and internal-uncertainty region are sent to neighboring gliders. Using these parameters, these gliders can extrapolate the glider’s current position and a confidence region accounting for possible deviation from the extrapolated course.

The Urick model is used to estimate the transmission loss \(TL(l, f)\) [dB] as,

$$\begin{aligned} TL(l, f)=\kappa \cdot 10\log _{10}(l)+\alpha (f)\cdot l, \end{aligned}$$
(1)

where \(l \;{\mathrm {[m]}}\) is the distance between the transmitter and receiver and \(f \;{\mathrm {[Hz]}}\) is the carrier frequency. Spreading factor \(\kappa \) is taken to be \(1.5\) for practical spreading, and \(\alpha (f) \;{\mathrm {[dB/m]}}\) represents an absorption coefficient that increases with \(f\) [35].

The Urick model is a coarse approximation for underwater acoustic wave transmission loss. In reality, sound propagation speed varies with water temperature, salinity, and pressure, which causes wave paths to bend. Acoustic waves are also reflected from the surface and bottom. Such uneven propagation of waves results in convergence (or shadow) zones, which are characterized by lower (or higher) transmission loss than that predicted by the Urick model due to the uneven energy dispersion.

Due to these phenomena, the Urick model is not sufficient to describe the underwater channel for simulation purposes. The Bellhop model is based on ray/beam tracing, which can model these phenomena more accurately. This model can estimate the transmission loss by two-dimensional acoustic ray tracing for a given sound-speed depth profile or field, in ocean waveguides with flat or variable absorbing boundaries. Transmission loss is calculated by solving differential ray equations, and a numerical solution is provided by HLS Research [26]. An example plotted using the Bellhop model is shown in Fig. 4. Interesting enough, if node 1 sends a packet, node 4 has higher probability of receiving the packet than node 3 even though this node is closer. Because the Bellhop model requires more information about the environment than a glider will have, such as sound speed profile and depths of receivers and ocean boundary, it is only used to simulate the acoustic environment for testing (relying on trace files with historic data). Hence, the proposed solution uses the Urick model in the cross-layer optimization (Sect.  5.2), which can be computed online on the glider.

Fig. 4
figure 4

Shadow zone scenario: the left subfigure represents the transmission loss of node 1 located at the origin, while the right subfigure depicts the sound speed profile used to derive the transmission loss (the \(y\)-axis is the depth, which has the same range used in the left; the blue, yellow and red areas denote large, medium and small path losses, respectively)

We adopt the empirical ambient noise model presented in [35], where a ‘V’ structure of the power spectrum density (psd) is shown. The ambient noise power is obtained by integrating the empirical psd over the frequency band in use.Footnote 3

5 Proposed Approach

Our proposed optimization is based on the estimation of the gliders’ trajectories and their external-uncertainty regions. Therefore, in this section, we introduce the estimation of external-uncertainty regions for gliders first. We then present the cross-layer design of our proposed framework.

5.1 Internal and External Uncertainty

We first offer the distinction between two types of position uncertainty, followed by the discussion on the relationship between these two types of uncertainty. Then we present the statistical approach for external-uncertainty region estimation when gliders are used as AUVs and ocean currents are unknown. Since the details have been presented in [10], we just summarize them here.

Internal uncertainty refers to the position uncertainty associated with a particular entity/node (such as an AUV) as seen by itself. Existing approaches such as those using Kalman Filter (KF) [3, 38] may not guarantee the optimality when the linearity assumption between variables does not hold. On the other hand, approaches using non-linear filters such as the extended or unscented KF attempt to minimize the mean squared errors in estimates by jointly considering the navigation location and the sensed states/features such as underwater terrain features, which are non-trivial, especially in an unstructured underwater environment.

External uncertainty, as introduced in this chapter, refers to the position uncertainty associated with a particular entity/node as seen by others. Let us denote the internal uncertainty, a 3D region associated with any node \(j\in {\mathcal {N}} ({\mathcal {N}}\) is the set of network nodes), as \({\mathcal {U}}_{jj}\), and the external uncertainties, 3D regions associated with \(j\) as seen by \(i,k\in {\mathcal {N}}\), as \({\mathcal {U}}_{ij}\) and \({\mathcal {U}}_{kj}\), respectively (\(i\ne j \ne k\)). In general, \({\mathcal {U}}_{jj}, {\mathcal {U}}_{ij}\), and \({\mathcal {U}}_{kj}\) are different from each other; also, due to asymmetry, \({\mathcal {U}}_{ij}\) is in general different from \({\mathcal {U}}_{ji}\). External uncertainties may be derived from the broadcast/propagated internal-uncertainty estimates (e.g., using one-hop or multi-hop neighbor discovery mechanisms) and, hence, will be affected by e2e network latency and information loss.

The estimation of the external-uncertainty region \({\mathcal {U}}_{ij}\) of a generic node \(j\) at node \(i\) (with \(i\ne j\)) involves the participation of both \(i\) and \(j\). Here we use the received \({\mathcal {U}}_{jj}\) as \({\mathcal {U}}_{ij}\) (a delayed version due to propagation delay, transmission delay and packet loss). Better estimation of \({\mathcal {U}}_{ij}\) involves estimation of the change of \({\mathcal {U}}_{jj}\) with time and is left as future work. We provide a solution for internal- and external-uncertainty estimation when (1) gliders are used (following a ‘sawtooth’ trajectory) and (2) ocean currents are unknown.

Internal-uncertainty estimation at \(j\): Assume gliders estimate their own locations over time using dead reckoning. Given glider \(j\)’s estimated coordinates, \(P_n=(x_n,y_n,z_n)\) at sampling times \(t_n\) (\(n=1\dots N\)), as shown in [10], its trajectory segment can be described as \(P(t)=\bar{P}+\overrightarrow{\mathbf {v}}(t-\bar{t})\), where \(\bar{P}=(\bar{x},\bar{y},\bar{z})=\frac{1}{N}\sum _{n=1}^N (x_n,y_n,z_n)\) and \(\overrightarrow{\mathbf {v}} = \frac{\Vert \overrightarrow{\widehat{P_1}\widehat{P_N}}\Vert }{\Vert (a^*,b^*,c^*)\Vert \cdot (t_N-t_1)}\cdot (a^*,b^*,c^*)\). Here, \([a^*,b^*,c^*]^T\) is the singular vector of \(N\times 3\) matrix \(\mathbf A =[[x_1-\bar{x},\dots , x_N-\bar{x}]^T, [y_1-\bar{y},\dots , y_N-\bar{y}]^T, [z_1-\bar{z},\dots , z_N-\bar{z}]^T]\) corresponding to its largest absolute singular value, \(\bar{t}=\frac{1}{N}\sum \nolimits _{n=1}^N t_n\) is the average of the sampling times, and \(\widehat{P_i}\) is the projection of point \(P_i\) on the line segment (Fig. 5a).

The internal-uncertainty region of \(j\) is estimated as a cylindrical region [10] \({\mathcal {U}}\) described by its radius \(R\) and its height \(H_U-H_L\), where \(H_U\) and \(H_L\)—in general different—are the signed distances of the cylinder’s top and bottom surface (i.e., the surface ahead and behind in the trajectory direction, respectively) to glider \(j\)’s expected location on the trajectory. In [9] we demonstrate that:

1. \(H_L\) and \(H_U\) can be estimated as

$$\begin{aligned} \left\{ \begin{array}{l} H_L=\overline{H}-\hat{t}_{\alpha , N-1} S^{(H)}\sqrt{1+1/N}\\ H_U=\overline{H}+\hat{t}_{\alpha , N-1} S^{(H)}\sqrt{1+1/N} \end{array}\right. , \end{aligned}$$
(2)

where \(\overline{H}=\sum \nolimits _{n=1}^N H_n/N\) is the mean of these \(N\) samples, \(S^{(H)}= [\frac{1}{N-1} \sum \nolimits _{n=1}^N (H_n-\overline{H})^2]^{1/2}\) is the unbiased standard deviation, \(1-\alpha \) is the confidence level, and \(\hat{t}_{\alpha , N-1}\) is the \(100(1-\alpha /2)\,\%\) of Student’s t-distribution [6] with \(N-1\) degrees of freedom (here \(H_n\) is the \(n\)th sample calculated from \(P_n\)’s [9]); and

2. \(R\) is estimated by

$$\begin{aligned} R=\frac{\sqrt{N-1}S^{(R)}}{\sqrt{\hat{\chi }_{\alpha , 2(N-1)}}}, \end{aligned}$$
(3)

where \(S^{(R)}= [\frac{1}{N-1} \sum \nolimits _{n=1}^N (R_n-\overline{R})^2]^{1/2}\), \(\overline{R}=\frac{1}{N}\sum \nolimits _{n=1}^N R_n\), and \(\hat{\chi }_{\alpha , 2(N-1)}\) is the \(100(1-\alpha )\,\%\) of \(\chi \) -distribution with \(2(N-1)\) degrees of freedom (here \(R_n\) is the \(n\)th sample calculated from \(P_n\)’s [9]). As shown in Fig. 5b, \(j\)’s internal-uncertainty region becomes smaller over time (from \(T_0\) to \(T_2\)), i.e., as more position estimates are acquired.

External-uncertainty estimation at \(i\): After receiving \(j\)’s trajectory and internal-uncertainty region parameters (\(\bar{P},\bar{t},\overrightarrow{\mathbf {v}},H_U,H_L,R\)), glider \(i\) can update the estimate of \(j\)’s external-uncertainty region. Because AUVs involved in missions show predictable trajectories, information about the sawtooth segment can be used to derive the entire glider trajectory through extrapolation assuming symmetry between glider ascent and descent. Due to packet delays and losses in the network, \(j\)’s external-uncertainty regions as seen by single- and multi-hop neighbors are delayed versions of \(j\)’s own internal uncertainty (Fig. 5b). Hence, when using multi-hop neighbor discovery schemes, the internal uncertainty of a generic node \(j, {\mathcal {U}}_{jj}\), provides a lower bound for all the external uncertainties associated with that node, \({\mathcal {U}}_{ij}\), \(\forall i\in {\mathcal {N}}\). Hence we use the received \({\mathcal {U}}_{jj}\) as \({\mathcal {U}}_{ij}\) (a delayed version due to propagation delay, transmission delay and packet loss).

5.2 Cross-Layer Optimization for Delay-Tolerant Applications

With the external-uncertainty regions, a glider needs to select an appropriate neighbor to forward each packet to its final destination. Because the major part of available energy in battery-powered gliders should be devoted to propulsion [24], acoustic communications should not take a large portion of the available energy. Our proposed protocol minimizes the energy spent to send a message to its destination and considers the functionalities of a real acoustic modem for a practical solution. Specifically, we provide support and differentiated service to delay-tolerant applications with different QoS requirements, from loss sensitive to loss tolerant. Hence, we consider the following two classes of traffic:

Fig. 5
figure 5

External- and internal-uncertainty regions for gliders under the effect of unknown ocean currents. a Estimated internal-uncertainty region by \(j\): a cylinder with circular bottom radius \(R\) and height \(H_U-H_L\). b Change of internal-uncertainty region over time

Class I (delay-tolerant, loss-tolerant). It may include multimedia streams that, being intended for storage or subsequent offline processing, do not need to be delivered within strict delay bounds. This class may also include scalar environmental data or non time-critical multimedia content such as snapshots. In this case, the loss of a packet is tolerable at the current hop, but its e2e PER should still be below a specified threshold.

Class II (delay-tolerant, loss-sensitive). It may include data from critical monitoring processes that require some form of offline post processing. In this case, a packet must be re-transmitted if it is not received correctly.

Our protocol employs only local information to make routing decisions, resulting in a scalable distributed solution (even though the destination information is required for routing, we can use the destination information learned from local neighbors to predict the position of the destination). It is a suboptimal solution instead of a global one since it relies on local information. The external-uncertainty regions obtained as described in Sect.  5.1 are used to select the neighbor with minimum packet routing energy consumption. Here, a framework using the WHOI Micro-Modem [13] is presented. This framework can be extended and generalized in such a way as to incorporate the constraints of other underwater communication modems.

Fig. 6
figure 6

Use of external-uncertainty region in the optimization framework

To be more specific, given the current time \(t_{now}\; {\mathrm {[s]}}\) and a message \(m\) generated at time \(t_0 \;{\mathrm {[s]}}\), glider \(i\) jointly optimizes the time \(\varDelta t \;{\mathrm {[s]}}\) to wait for the best topology configuration, a neighbor \(j^*\), a frequency band \(f_{ij}\), transmission power \(P_{TX}^{(i,j)}(t) \;{\mathrm {[W]}}\), packet type \(\xi \), and number of framesFootnote 4 \(N_F(\xi )\), so that the estimated energy \(E_{id}(t) \;{\mathrm {[J]}}\) to route \(m\) to destined glider \(d\)’s region \({\mathcal {U}}_{id}\) is minimized and message \(m\) reaches it within \(B_{\max } \;{\mathrm {[s]}}\) (i.e., the maximum e2e delay from the source to the destination). We assume power control is possible in the range \([P_{min}, P_{max}]\) although transmission power is currently fixed for the WHOI Micro-Modem. We anticipate more advanced amplifier hardware will make this power optimization possible.

Here, \(E_{id}(t)\) is estimated by the energy to transmit the packet to neighbor \(j\) in one transmission, the average number of transmissions \(\hat{N}_{TX}^{(i,j)}(t)\) to send \(m\) to \(j\), and the estimated number of hops \(\hat{N}_{hop}^{(j, d)}(t)\) to reach region \({\mathcal {U}}_{id}\) via \(j\). We need to estimate the transmission power and the number of hops to destination. The external-uncertainty region is used to estimate the number of hops \(\hat{N}_{hop}^{(j, d)}(t)\) to \(d\) via neighbor \(j\) and the lower bound of the transmission power as follows (Fig. 6). Let \(\hat{l}_{i,p_1,p_2}(t) \;{\mathrm {[m]}}\) be the projected distance of line segment from \(i\) to position \(p_1\) on the line from \(i\) to position \(p_2\), and \(l_{i,p}(t)\) be the distance from \(i\) to position \(p\). \(\hat{N}_{hop}^{(j, d)}(t)\) is estimated by the worst case of \(l_{i,p}(t)/\hat{l}_{i,p_1,p_2}(t)\), i.e., Eq. (8). The lower bound for transmission power is estimated by the average transmission power so that the received power at every point in \({\mathcal {U}}_{ij}\) is above the specified threshold \(P_{TH}\). The transmission power lower bound is the integral of the product of the transmission power to obtain \(P_{TH}\) at a point in \({\mathcal {U}}_{ij}\) and the probability density function (pdf) of \(j\) to be at this point.

Fig. 7
figure 7

Picture of our underwater glider and radiation pattern of the BT-25UR transduce

To estimate the received power, it is necessary to estimate the transducer gains at the transmitter and receiver. To estimate the transmitter’s gain \(G_{TX}(\theta _{ij},\phi _{ij}, f_{ij})\), \(i\) needs to compute the radiation angles—the horizontal angle \(\theta _{ij}\in [-180^\circ , 180^\circ ]\) and the vertical angle \(\phi _{ij}\in [-90^\circ , 90^\circ ]\) with respect to \(j\). Assume the initial position of the transducer is as shown in the top left corner of Fig. 7, then \(i\)’s normalized transducer direction vector is \(\overrightarrow{\mathbf{n }_i}=(0,0,-1)\) with the horizontal plane \(z=z_0^{(i)}\) (defined as the plane perpendicular to \(\overrightarrow{\mathbf{n _i}}\)). While the glider is moving, its pitch, yaw, and roll angles are denoted by \(\varepsilon _i\), \(\zeta _i\), and \(\eta _i\), respectively. From geometry, the direction vector after rotation is \(\overrightarrow{\mathbf{n '}_i}=\mathbf{Q }_x(\eta _i)\mathbf{Q }_y(\zeta _i)\mathbf{Q }_z(\varepsilon _i)\overrightarrow{\mathbf{n }_i}^T\), while the transducer’s horizontal plane is \(\mathbf{Q }_x(-\eta _i)\mathbf{Q }_y(-\zeta _i)\mathbf{Q }_z(-\varepsilon _i)[x,y,z]^T=z_0^{(i)}\), where \(z_0^{(i)}\) is a constant, and \(\mathbf{Q }_x(\eta _i)\), \(\mathbf{Q }_y(\zeta _i)\) and \(\mathbf{Q }_z(\varepsilon _i)\) are

$$\begin{aligned} \left[ \begin{array}{ccc} 1 &{} 0 &{} 0\\ 0 &{} \cos \eta _i &{} -\sin \eta _i\\ 0 &{} \sin \eta _i &{} \cos \eta _i\\ \end{array}\right] , \left[ \begin{array}{ccc} \cos \zeta _i &{} 0 &{} -\sin \zeta _i\\ 0 &{} 1 &{} 0\\ \sin \zeta _i &{} 0 &{} \cos \zeta _i\\ \end{array}\right] , \left[ \begin{array}{ccc} \cos \varepsilon _i &{} -\sin \varepsilon _i &{} 0 \\ \sin \varepsilon _i &{} \cos \varepsilon _i &{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array}\right] , \end{aligned}$$

respectively.

With the position vector \(\overrightarrow{P_iP_j}\) from \(i\) to \(j\), we can derive \(\cos \phi _{ij}=\frac{\widehat{\overrightarrow{P_iP_j}}\circ \overrightarrow{P_iP_j}}{\Vert \widehat{\overrightarrow{P_iP_j}}\Vert \cdot \Vert \overrightarrow{P_iP_j}\Vert }\) and \(\cos \theta _{ij}=\frac{\widehat{\overrightarrow{P_iP_j}}\circ \overrightarrow{\mathbf{v }}_i}{\Vert \widehat{\overrightarrow{P_iP_j}}\Vert \cdot \Vert \overrightarrow{\mathbf{v }}_i\Vert }\), where \(\widehat{\overrightarrow{P_iP_j}}\) is the projection of \(\overrightarrow{P_iP_j}\) on the transducer’s horizontal plane, \(\circ \) is the inner product, and \(\overrightarrow{\mathbf{v }_i}=\Vert \overrightarrow{\mathbf{v }_i}\Vert \cdot [\cos \varepsilon _i \cos \zeta _i, \cos \varepsilon _i\sin \zeta _i, \sin \varepsilon _i]=(a_i^*,b_i^*,c_i^*)\) is the velocity vector of glider \(i\) as estimated in Sect.  5.1. As \(\overrightarrow{\mathbf{n '}_i}\) is perpendicular to the transducer’s horizontal plane, we have \(\sin \phi _{ij}=\cos (90-\phi _{ij})=\frac{\overrightarrow{\mathbf{n '}_i}\circ \overrightarrow{P_iP_j}}{\Vert \overrightarrow{P_iP_j}\Vert }\) and \(\widehat{\overrightarrow{P_iP_j}}=\overrightarrow{P_iP_j}-(\overrightarrow{P_iP_j}\circ \overrightarrow{\mathbf{n '}_i})\cdot \overrightarrow{\mathbf{n '}_i}\). The transducer’s gain at receiver \(j\), \(G_{RX}(\theta _{ji},\phi _{ji},f_{ij})\), can be estimated in a similar way.

Let \(L_m(\xi )\) be \(m\)’s length in bits depending on packet type \(\xi \) and \(B(\xi )\) be the corresponding bit rate. The energy to transmit the packet to neighbor \(j\) in one transmission can therefore be approximated by \(P_{TX}^{(i,j)}(t)\cdot \frac{L_m(\xi )}{B(\xi )}\). Overall, the optimization problem can be formulated as

\(\mathbf P(i, d, t_{now}, \varDelta t_p)\): Cross-layer Optimization Problem

$$\begin{aligned} \mathbf{Given: }&P_{min}, P_{max}, \varXi , \varOmega _\xi , G_{TX}, G_{RX}, \eta , B_{\max }, PER_{\max }^{e2e} \nonumber \\ \mathbf{Computed: }&\varepsilon _i, \zeta _i, \varepsilon _j, \zeta _j, {\mathcal {U}}_{ij}, \forall j \in {\mathcal {N}}_i\cup \{d\} (\text{ i.e., } R^{(i)}_j, H^{(i,j)}_L, H^{(i,j)}_H)\nonumber \\ \mathbf{Find: }&j^*\in {\mathcal {N}}_i, P_{TX}^{(i,j)*}(t)\in [P_{min}, P_{max}],\nonumber \\&\xi ^*\in \varXi , N_F^*(\xi )\in \varOmega _\xi , \varDelta t^*, f_{ij}^*\in [f_L, f_U]\nonumber \\ \mathbf{Minimize: }&E_{id}(t)=P_{TX}^{(i,j)}(t)\cdot \frac{L_m(\xi )}{B(\xi )}\cdot \hat{N}_{TX}^{(i,j)}(t)\cdot \hat{N}_{hop}^{(j, d)}(t).\end{aligned}$$
(4)

In \(\mathbf P(i, d, t _\mathbf{now }, {\varvec{\Delta }}\mathbf{t}_\mathbf{p}, {\mathcal {N}}_i\), \(\varXi \), and \(\varOmega _\xi \) denote the set of \(i\)’s neighbors, the set of packet types, and the set of number of type \(\xi \) frames respectively. The objective function (4) estimates the energy required to send message \(m\) to the destination region \({\mathcal {U}}_{id}\). To solve this problem, we need to derive the relationship between these variables. Let \(L_F(\xi )\; \mathrm{[bit] }\) be the length of a frame of type \(\xi \), \(L_H \;\mathrm{[bit] }\) be the length of message \(m\)’s header, \(PER(SINR_{ij}(t),\xi )\) be the PER of type \(\xi \) at the Signal to Interference-plus-Noise Ratio \(SINR_{ij}(t)\), \(TL(l_{ij}(t), f_{ij})\) be the transmission loss for distance \(l_{ij}(t)\) and carrier frequency \(f_{ij}\; \mathrm{[kHz] }\)—which is calculated using Eq. (1)—\({\mathcal {A}}\backslash \{i\}\) be the set of active transmitters excluding \(i\), and \(P_{TX}^{(i,j)}(t)\) be the transmission power used by \(i\) to reach \(j\), we have the following formulas,

$$\begin{aligned}&{\varvec{(}}\underline{{\varvec{class{-}independent\; relationships}}}{\varvec{)}}\nonumber \\ t&=t_{now}+\varDelta t;\end{aligned}$$
(5)
$$\begin{aligned} t_{TTL}&=B_{\max }-(t_{now}-t_0);\end{aligned}$$
(6)
$$\begin{aligned} L_m(\xi )&=L_F(\xi )\cdot N_F(\xi )+L_H;\end{aligned}$$
(7)
$$\begin{aligned} \hat{N}_{hop}^{(j, d)}(t)&=\frac{\max _{p\in {\mathcal {U}}_{id}}l_{i,p}(t)}{\min _{p_1\in {\mathcal {U}}_{ij}, p_2\in {\mathcal {U}}_{id}} \hat{l}_{i,p_1, p_2}(t)};\end{aligned}$$
(8)
$$\begin{aligned} SINR_{ij}(t)&=\frac{P_{TX}^{(i,j)}(t)\cdot 10^{G_{ij}(l_{ij}(t),f_{ij})/10}}{\sum _{k\in {\mathcal {A}}\backslash \{i\}} P_{TX}^{(k,j)}(t)\cdot 10^{G_{ij}(l_{kj}(t),f_{ij})/10}+N_0};\end{aligned}$$
(9)
$$\begin{aligned} G_{ij}(l_{ij}, f_{ij})&=G_{TX}(\theta _{ij},\phi _{ij}, f_{ij})+G_{RX}(\theta _{ji},\phi _{ji},f_{ij})-L_{AMP}(f_{ij})-TL(l_{ij},f_{ij});\end{aligned}$$
(10)
$$\begin{aligned} \theta _{ij}&=\arcsin \frac{\overrightarrow{\mathbf{n '}_i}\circ \overrightarrow{P_iP_j}}{\Vert \overrightarrow{P_iP_j}\Vert };\end{aligned}$$
(11)
$$\begin{aligned} \phi _{ij}&=\arccos \frac{\widehat{\overrightarrow{P_iP_j}}\circ \overrightarrow{\mathbf{v }}_i}{\Vert \widehat{\overrightarrow{P_iP_j}}\Vert \cdot \Vert \overrightarrow{\mathbf{v }}_i\Vert }. \end{aligned}$$
(12)

Note that \(N_0=\int _{f_L}^{f_U} psd_{N_0}(f,w)df\) is the ambient noise, where \(psd_{N_0}(f,w)\) is the empirical noise power spectral density (psd) for frequency band \([f_L,f_U]\) and \(w\) \(\mathrm [m/s] \) is the surface wind speed as in [35]. \(t_{TTL}\) is the remaining Time-To-Live (TTL) for the packet, \(L_{AMP}(f_{ij})\) \([\mathrm dB ]\) is the power loss of the power amplifier at \(f_{ij}\) and \(PER^{e2e}_{\max }\) is the maximum e2e error rate for packet \(m\). In these relationships, Eq. (5) is the time after waiting \(\varDelta t\); Eq. (6) calculates the remaining TTL for message \(m\); Eq. (7) calculates the total message’s length; Eq. (8) estimates the number of hops \(\hat{N}_{hop}^{(i,j)}(t)\) to reach destination \(d\); Eq. (9) estimates the SINR at \(j\) while Eq. (10) estimates the total transmission gain in \({\mathrm {dB}}\) from \(i\) to \(j\), including the transducer gain at the transmitter and receiver, loss at the power amplifier, and transmission loss; Eqs. (11) and (12) estimate the transducer’s radiation angles of \(j\) with respect to \(i\). The constraints for \(\mathbf P(i, d, t _\mathbf{now }, {\varvec{\Delta }}\,\mathbf{t} _\mathbf{p})\) are,

$$\begin{aligned}&{\varvec{(}}{\underline{{\varvec{class{-}independent \;constraints}}}}{\varvec{)}}\nonumber \\ P_{TX}^{(i,j)}(t)\ge \int _{(x,y,z) \in {\mathcal {U}}_{ij}}&P_{RX}(i,j,x,y,z)\cdot 10^{-G_{ij}(l_{ij}(t),f_{ij})/10}\cdot g_R(x,y)\cdot g_H(z)dx dy dz;\end{aligned}$$
(13)
$$\begin{aligned}&P_{RX}(i,j,x,y,z)\ge P_{TH};\end{aligned}$$
(14)
$$\begin{aligned} 0\le&\varDelta t\le \frac{t_{TTL}}{\hat{N}_{TX}^{(i,j)}(t)\cdot \hat{N}_{hop}^{(j, d)}(t)}. \end{aligned}$$
(15)

In these constraints, \(P_{RX}(i,j,x,y,z)\) is the received signal power at the generic 3D location (\(x,y,z\)) when \(i\) transmits to \(j\). Last, \(g_R(x,y)\) and \(g_H(z)\) are the pdfs of the glider’s position on the horizontal plane (i.e., \(\chi \)-distribution with degree of \(2N-2\)) and on the vertical direction (i.e., Student’s t-distribution with \(N-1\) degrees of freedom), respectively [9], \(P_{TH}\) is the received power threshold so that the packet can be received with a certain predefined probability. Equation (13) estimates the lower bound of the transmission power to cover the external-uncertainty region so that the received power is above a pre-specified threshold, as accounted for in Eqs. (14) and (15) estimates the bounds of \(\varDelta t\), which must be less than the maximum tolerable delay at the current hop. To support the two classes of delay-tolerant traffic, we have the following additional constraints,

$$\begin{aligned}&{\varvec{(}}\underline{{\varvec{{additional\; class{-}dependent\; constraints}}}}{\varvec{)}}\nonumber \\ \quad \quad \quad \quad \mathbf{Class\; I }&=\left\{ \begin{array}{c} \hat{N}_{TX}^{(i,j)}(t)=1\\ 1-\big [1-PER(SINR_{ij}(t), \xi )\big ]^{\hat{N}_{hop}^{(j, d)}(t)}\le PER^{e2e}_{\max } \end{array} \right. \!\!; \end{aligned}$$
(16)
$$\begin{aligned} \mathbf{Class\; II }=\left\{ \begin{array}{c} \hat{N}_{TX}^{(i,j)}(t)=\big [1-PER(SINR_{ij}(t), \xi )\big ]^{-1} \end{array}. \right. \end{aligned}$$
(17)

The first constraint for Class I traffic forces packet \(m\) to be transmitted only once, while the second constraint guarantees the e2e PER of \(m\) should be less than a specified threshold \(PER_{\max }^{e2e}\). The constraint for Class II traffic guarantees message \(m\) will be transmitted for the average number of times for successful reception at \(j\). By solving the local optimization problem every time when the inputs change significantly (not every time when a packet needs to be sent), \(i\) is able to select the optimal next hop \(j^*\) so that message \(m\) is routed (using minimum network energy) to the external-uncertainty region \({\mathcal {U}}_{id}\) where destination \(d\) should be. Obviously different objective functions (e2e delay, delivery ratio, throughput) could be used depending on the traffic class and mission QoS requirements. Note that in fact our solution can be extended to serve two other classes of traffic—(1) delay-sensitive, loss-tolerant traffic, and (2) delay-sensitive, loss-sensitive traffic—by setting \(\varDelta t\) to 0.

Fig. 8
figure 8

Derivation of transducer angles from glider \(i\) to \(j\)

To reduce the complexity, we can convert \(\mathbf P(i, d, t _\mathbf{now }, {\varvec{\Delta }} \mathbf{t}_\mathbf{p})\) into a discrete optimization problem by considering finite sets of \(P_{TX}^{(i,j)}\) and \(\varDelta t\), which can be taken to be a number of equally spaced values within their respective ranges. The problem then can be solved by comparing the e2e energy consumption estimates of different combination of these discrete values. Assuming that transmission power and time are discretized into \(N_P\) and \(N_{time}\) values, respectively, for the case of WHOI modem (3 frequencies and 14 combinations of packet types and number of frames [9]), the processor in node \(i\) needs to calculate the objective value \(42N_P\cdot N_{time}\cdot |{\mathcal {N}}_i|\) times in each round. The embedded Gumstix motherboard (\(400\) \({\mathrm {MHz}}\) processor and \(64\) \({\mathrm {MB}}\) RAM) attached to the Micro-Modem is adequate to solve such a problem. To further reduce the computation, instead of running the solution for every packet, it will be rerun only at \(t_{now}+\varDelta t_p\) for the same class of traffic flow that is sent from \(i\) to the same destination \(d\). Here, \(\varDelta t_p\) is taken as the minimum of the \(\varDelta t\) values of the packets belonging to the same class of traffic and the same destination, estimated from the previous run. Figure 8 depicts an example of how \(\mathbf P(i, d, t _\mathbf{now }, {\varvec{\Delta }} \mathbf{t}_\mathbf{p})\) is solved at \(i\). At time \(t_{now}\), the problem is solved with \(j\) found to be the next hop to \(d\). The minimum of the \(\varDelta t\) values of these packets belonging to the same class of traffic and the same destination observed before \(t_{now}\) is \(\varDelta t_p'\). Packets for \(d\) will then be forwarded to \(j\) with the calculated transmission power at the selected frequency band until \(t_{now}+\varDelta t_p'\). Then, the problem is solved again and \(k\) is found to be the next hop. The minimum \(\varDelta t\) observed so far is \(\varDelta t_p''\) and, hence, the problem will be solved at \(t_{now}+\varDelta t_p'+\varDelta t_p''\).

Once the optimal frequency band is selected, \(i\) needs to notify \(j\) to switch to the selected band. A simple protocol can be used as follows. All AUVs use the same frequency band as the Common Control Channel (CCC) to tell the receiver which band is selected. A short packet or preamble with the selected band number is first sent by the transmitter using the CCC, followed by the data packet using selected frequency band after the time for the transmitter and receiver to finish frequency band switching. The receiver will first listen on the CCC, switch to the selected band embedded in the short control packet or preamble, receive the data packet, and then send back a short ACK packet to acknowledge the reception. Finally, both sides switch back to the CCC if the transmission succeeds or the transmission times out. More sophisticated frequency-band switching protocols, which are out of the scope of this chapter, can be designed to improve network performance. We rely on the Medium Access Control (MAC) scheme with the WHOI modem to send the data. Since the speed of acoustic wave underwater is very slow when compared with radio waves, the propagation delay has to be considered in order to avoid packet collisions. However, it is difficult to estimate the propagation delay since the positions are uncertain. It may not improve the performance much as the actual propagation delay may be different from the estimation. Moreover, the inter-vehicle traffic underwater is generally low. So the problem of packet collisions is not severe and hence we can just use the MAC scheme provided by the WHOI modem.

6 Performance Evaluation

The communication solution is implemented and tested on our underwater communication emulator [9] as shown in Fig. 9. This underwater acoustic network emulator is composed of four WHOI Micro-Modems [13] and a real-time audio processing card to emulate underwater channel propagation. The multi-input multi-output audio interface can process real-time signals to adjust the acoustic signal gains, to introduce propagation delay, to mix the interfering signals, and to add ambient/man-made noise and interference. Due to the limited number of Micro-Modems (four in our case) and audio processing channels, we can only mix signals from up to three transmitters at the receiver modem (one modem as the receiver and the other three as the transmitters). Therefore, we calculate, select for transmission, and mix with ambient noise, only the three most powerful signals the receiver will encounter. We leave the simulation of more than three simultaneously transmitted signals as a problem for further research.

Fig. 9
figure 9

Solving \(\mathbf P(i, d, t _\mathbf{now }, {\varvec{\Delta }}\mathbf{t}_\mathbf{p})\) every \(\varDelta t_p\) at \(i\)

Fig. 10
figure 10

Underwater communication emulator using WHOI micro-modems

Table 1 Emulation scenario parameters

We are interested in evaluating the performance of the proposed solution in terms of e2e energy consumption, e2e reliability (i.e., e2e delivery ratio), average bit rate of a link, and overhead, under an environment that is described by the Bellhop model (and the Munk acoustic speed profile in Fig. 4 as input).

Assume that a glider’s drifting (i.e., the relative displacement from the glider’s trajectory) is a 3D random process \(\{X(t), t\ge 0\}\) as the following [30]: (1) In the beginning of the deployment, the drifting is 0, i.e., \(X(0)=(0,0,0)\); (2) The drifting has independent increments, in that for all \(0\le t_1<t_2<\dots <t_n\), \(X(t_n)-X(t_{n-1})\), \(X(t_{n-1})-X(t_{n-1}), \dots , X(t_2)-X(t_1)\), \(X(t_1)\) are independent; (3) The drifting has stationary increments, in that the distribution of \(X(t+s)-X(t)\) does not depend on \(t\) and is normally distributed with zero mean and covariance matrix \(s\sigma ^2 I_3\), where \(I_3\) is the \(3\times 3\) identity matrix, and \(\sigma \) is a scaling factor that decides the magnitude of drifting. Note that this drifting model is ideal since the drifting in any of the \(x,y,z\) directions is Gaussian. The consideration of realistic drifting pattern is left as future work. Emulation parameters are listed in Table 1. The radiation pattern of the BT-25UF transducer (Fig. 10) is used in the emulations. Every 10 s, a packet is generated in each node. A glider is randomly selected as the collector and half of the other gliders are randomly selected to forward their packets towards it. For statistical relevance, emulations are run for 50 rounds and the average is plotted with 95 % confidence interval. Note that it actually is a scenario for deep water. We will also evaluate the performance in shallow water, where acoustic waves propagate differently.

We are interested in evaluating the performance of our solution for the two classes of traffic in Sect.  5.2, using either the BT-25UF transducer or an ideal omni-directional transducer (with gain equal to \(0\) \({\mathrm {dBi}}\)). We also want to compare the performance of our solution, which delays the transmission for optimal topology configuration, with the solution without delaying the transmission. For convenience, we denote QUO VADIS for Class I traffic using the BT-25UF transducer, for Class I traffic using the ideal omni-directional transducer, for Class II traffic using the BT-25UF transducer, for Class I traffic using the ideal omni-directional transducer, the solution with no delaying of the transmission (i.e., \(\varDelta t=0\) for \(\mathbf P(i, d, t _\mathbf{now },\; {\varvec{\Delta }} \mathbf{t}_\mathbf{p})\)) by ‘QUO VADIS I’, ‘QUO VADIS I - OMNI’, ‘QUO VADIS II’, ‘QUO VADIS II-OMNI’, and ‘QUO VADIS-ND’. We will also compare the performance of our solution (which is closely related to geographic routing and delay-tolerant networking) with geographic routing solutions—MFR, GRS, CRM, and PTKF—and DTN solutions—RAPID, Spray and Wait, and MaxProp—as reviewed in Sect. 3. To make the comparison fair, we use two variant protocols for each of these solutions by adding the constraints of the two classes of traffic to these solution. For example, we denote the MFR solution with Class I constraints in Eq. (16) by ‘MFR I’, and the solution with Class II constraints in Eq. (17) by ‘MFR II’.

The following networking metrics are compared:

  • e2e energy consumption: the average energy consumed to route one bit of data to the destination;

  • e2e delivery ratio: the number of data packets received correctly over the number of data packets sent;

  • link bit rate: the average bit rate between a transmission pair;

  • overhead: the number of bytes used for position and control to facilitate the transmission of payload data.

Fig. 11
figure 11figure 11

Performance comparison for Class I traffic with geographic routing protocols. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Emulations are done for different settings and the results are plotted with 95 % confidence interval and discussed in the following sections.

6.1 Comparison with Geographic Routing Protocols

Our proposed solution forwards packets based on the geographic location. To see how well our solution performs against existing geographic routing protocols, simulations are run and the results are plotted in Figs.  11 and 12 (as existing research on underwater geographic routing is still very limited, the solutions we compare against are taken from those originally designed for terrestrial wireless networks). As shown in these two figures, we can see that QUO VADIS has better performance than QUO VADIS-OMNI and QUO VADIS-ND for the same class of traffic in terms of these three metrics. By delaying packet transmissions to wait for the optimal network topology, the e2e energy consumption is reduced while the e2e delivery ratio and link bit rate increase (e.g., with 5 gliders, the energy consumption for QUO VADIS I is around 30 % of that for QUO VADIS-ND). By exploiting the frequency-dependent radiation pattern of the transducer, received signal power may obtained a gain of up to 20 dB, which we observed in the simulations. Hence QUO VADIS using the BT-25UF transducer has better performance than that using the omni-directional transducer. Due to the QoS requirements, retransmissions are needed to recover link errors, resulting in higher e2e delivery ration for Class II traffic than for Class I traffic. On the other hand, this leads to more energy consumption.

Fig. 12
figure 12figure 12

Performance comparison for Class II traffic with geographic routing protocols. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Fig. 13
figure 13figure 13

Performance comparison for Class I traffic with DTN protocols. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Fig. 14
figure 14figure 14

Performance comparison for Class II traffic with DTN protocols. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Fig. 15
figure 15

Comparison of the overhead

Fig. 16
figure 16figure 16

Shallow water: performance comparison for Class I traffic. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Fig. 17
figure 17figure 17

Shallow water: performance comparison for Class II traffic. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Different versions of our QUO VADIS solutions also perform better than geographic routing protocols GRS, MFR, CRM, and PKTF. This is because that the uncertainty in location leads to errors in route selection, packet transmissions, and transmission power estimates. Also these geographic routing protocols do not consider the propagation delay underwater, which results in degraded communication performance. Interesting enough, we can see that among these geographic routing protocols, PKTF offers the best performance. This is because it jointly considers the transmission power and routing to minimize the e2e energy consumption. Therefore it performs better than the other geographic routing protocol, which only consider the distance or angle metrics for routing (not closely related to network performance). GRS gives the worst performance since it generally needs to forward the packet to the node that is far from the transmitter, which introduces bad link performance. Similarly, CRM performs better than MFR as the CRM has less probability to forward packets to node that is far away than MFR does.

6.2 Comparison with DTN Solutions

Similar to the comparison against the geographic routing solutions, we compare the performance of QUO VADIS against the DTN solutions—RAPID, MaxProp and Spray and Wait. As shown in Figs.  13 and 14, QUO VADIS gives improved performance over RAPID, MaxProp and Spray and Wait. That is mainly due to that these DTN solutions transfer packets once the neighbors are in the transmission range. Such schemes may be good for scenarios where the connectivity is intermittent. However, the performance may not be optimal since this may not be the time to achieve the best link performance. In contrast, QUO VADIS predicts and waits for the best network configuration, where nodes move closer for the best communications. So the e2e delivery ratio and link bit rate of QUO VADIS is the highest while its energy consumption is minimal. Note that among these compared DTN solutions, RAPID performs the best. This is because RAPID prioritizes old packets so they won’t be dropped. MaxProp gives priority to new packets; older, undelivered packets will be dropped in the middle. Spray and Wait works in a similar way, which does not give priority to older packets. On the other hand, Spray and Wait is slightly better than MaxProp. This is because in our scenario, the network connectivity is not disrupt. The way that MaxProp routes (based on the e2e delivery ratio estimation) will be very different from that Spray and Wait does (i.e., just transmits the packet to a neighbor then lets the neighbor continue to forward it). Moreover, MaxProp still needs to pay for the overhead to obtain the global e2e delivery ratio information.

Fig. 18
figure 18figure 18

Uncertainty update interval: performance comparison for Class I traffic. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

Fig. 19
figure 19figure 19

Uncertainty update interval: performance comparison for Class II traffic. a Delivery ratio comparison. b Energy consumption comparison. c Link bit rate comparison

6.3 Overhead Comparison

We plot and compare the overheads (per node) of these protocols in Fig.  15. Note that as QUO VADIS, QUO VADIS-ND, and QUO VADIS-OMNI work almost the same way, i.e., the uncertainty region information is broadcast periodically (here the period is taken to be 60 \(\mathrm {s}\)), their overheads are the same and thus we use QUO VADIS in the figure to represent these variant versions. Similarly, nodes running the geographic routing protocols GRS, MFR and CRM only need to periodically broadcast the position information so their overhead is basically the same. Hence we use GRS/MFR/CRM to represent them.

Surprisingly, even though QUO VADIS achieves the best network performance, its overhead is not the biggest. The protocols with the larger overhead are RAPID and MaxProp. In order to work, RAPID needs the following control information: average size of past transfer opportunities, expected meeting times with nodes, list of packets delivered since last exchange, the updated delivery delay estimate based on current buffer state, and information about other packets if modified since last exchange with the peer, which takes a large number of bytes. MaxProp needs to exchange a list of the probabilities of meeting every other node on each contact, which is basically global information. It also has the neighbor discovery overhead. Compared to RAPID and MaxProp, QUO VADIS only needs to exchange the external uncertainty information of itself and the destination node, which is obviously less. On the other hand, PKTF needs a probe message that has five data fields. Only the nodes in the selected path are required to respond with a probe—whether it is sent for the forwarding or reverse direction. The Spray and Wait protocol reduces transmission overhead by spreading only a few number of data packets to the neighbors. The source node then stops forwarding and lets each node carrying a copy perform direct transmission. In our emulation, we select the number to be one to make the comparison fair and hence the overhead is small. Lastly, for the other geographic routing protocols GRS, MFR, and CRM, the nodes just need to know the geographic locations of the neighbors and the destination. Therefore the overhead required is the least. Note that here it is not necessary to differentiate the two classes of traffic since the overhead difference is small.

6.4 Performance in Shallow Water

So far the results are obtained using the setting in Table 1, which is for the deep water. We change the network scenario to the shallow water scenario by setting the depth of the 3D region to 200 \(\mathrm {m}\). Note that generally there is no definite depth value for shallow water as the sound propagation depends on the corresponding underwater environment. Therefore in some references (e.g., [1]), the shallow water is considered to be less than 100 \(\mathrm {m}\) deep, while for some other acoustic researchers this depth can be up to 500 \(\mathrm {m}\) [20]. Here we use 200 \(\mathrm {m}\), which is used by National Oceanic and Atmospheric Administration (NOAA) as the depth for the first layer (and this depth is used in quite a few cases in the well-known set of benchmark shallow water test cases presented in the Shallow Water Acoustic Modeling Workshop 1999 (SWAM’99) [31]). In this shallow water scenario, the path loss estimated by the Urick’s model is very different from that estimated by the Bellhop model. We had anticipated the performance will degrade because of this mismatch. Surprising enough, as shown in Figs.  16 and 17, we find the performance (in terms of e2e delivery ratio, energy consumption, and link bit rate) in the shallow water is actually better. A more careful analysis reveals the reason—the existence of the surface duct in the shallow water. Surface duct is basically a zone below the sea surface where sound rays are refracted toward the surface and then reflected. The rays alternately are refracted and reflected along the duct out to relatively long distances from the sound source. Hence the acoustic waves are relatively concentrated in the surface duct, leading to less path loss. This consequently leads to improved network performance.

6.5 Performance Using Different Uncertainty Update Intervals

So far the broadcast interval of uncertainty region is fixed to 60 \(\mathrm {s}\). Our last interest is to evaluate the performance of the QUO VADIS variants when different broadcast intervals are used. Therefore we re-run the emulations for two more cases: (i) half of interval (i.e., 30 \(\mathrm {s}\)); and (ii) double of interval (i.e., 120 \(\mathrm {s}\)). From Figs. 18 and 19, we can see that the performance of the QUO VADIS variants becomes worse when the update interval is doubled. This is because when the interval is doubled, the position uncertainty information becomes less accurate. This leads to larger error in neighbor selection for packet forwarding and in the estimation of transmission power. On the other hand, halving the interval leads to performance improvement as the uncertainty information is updated in a more timely manner (therefore routing error becomes smaller and transmission power is better estimated). However, this obviously leads to the increase in overhead. Therefore the tradeoff between overhead and metrics such as delivery ratio, energy consumption, and link bit rate should be carefully considered for different applications. Here we use “QUO VADIS-Half,” “QUO VADIS,” and “QUO VADIS-Twice” to denote the cases with update interval of 30, 60, and 120 s, respectively.

To sum up, our proposed framework QUO VADIS improves the network performance for delay-tolerant applications in terms of e2e energy consumption, delivery ratio, and link bit rate by waiting for a ‘favorable’ topology configuration and by exploiting the gains of directional transducers. Through emulations for different setups, we demonstrated that they can offer better performance than the well-known geographic routing and DTN protocols when serving two classes of delay-tolerant traffic.

7 Conclusion

We proposed QUO VADIS, a QoS-aware underwater optimization framework for inter-vehicle communication using acoustic directional transducers. Based on the trajectory and position uncertainties of the AUVs, an AUV predicts a favorable network topology with relatively short links in the future and postpones transmission in favor of a lower transmission energy and a higher data rate. Communication energy consumption is further reduced by exploiting the frequency-dependent radiation pattern of underwater acoustic transducers. The proposed solution is implemented and tested in our underwater communication emulator, showing improvement over some well-known geographic routing protocols and DTN protocols in terms of e2e energy consumption, reliability, and link bit rate.