Introduction

The dugong (Dugong dugon), a member of the order Sirenia, is a marine mammal that inhabits shallow coastal environments [1]. Dugongs occur in tropical and subtropical coastal waters from east Africa to Vanuatu, between approximately 26\({^{\circ }}\) north and 27\({^{\circ }}\) south [1]. Unlike most marine mammals (e.g. cetaceans and pinnipeds), they are herbivorous, described as benthic-feeding, seagrass specialists [1]. Although dugongs are listed as Vulnerable in the International Union for Conservation of Nature (IUCN) Red List [2], little is known of their true status.

Knowledge of a species’ spatial and temporal habitat requirements (e.g. home range, migratory path, densities and territories) throughout its life cycle is essential if appropriate and effective conservation strategies are to be enacted. Sheppard et al. [3] showed that an individual dugong’s short-term home range was approximately 0.6–\(12.4\,\hbox {km}^{2}\). Within their daily home ranges, dugongs spent 72% of their time within three metres of the surface [4]. However, both tidal and diel cycles influence dugong movement; they tend to be closer to shore at high tide than at low tide, and at night than during the day [5]. To better understand dugong ecology, we need to know which factors drive different uses of habitats, such as feeding, resting, or vocalising at different locations.

Passive acoustic monitoring is useful for augmenting visual surveys to assess temporal and spatial changes in the density and behaviour of some animals (reviewed in [6]), such as whales [7] and seals [8]. This information is particularly useful for locations where animals are difficult to survey using traditional techniques, such as when their behaviour makes them undetectable during conventional visual surveys. Thus, this method would be suitable for dugongs, which are vocalising marine mammals [9]. Passive acoustic monitoring cannot confirm an animal’s absence, but it is an excellent tool for investigating dugong behaviour during submergence.

Although dugongs utter bird-like calls [9], the acoustic characteristics of their vocalisations are not well described. Anderson and Barclay [9] categorised dugong calls into three types: chirps (frequency-modulated signals in the 3–18 kHz range with two or more harmonics, lasting less than 60 ms), trills (frequency-modulated calls lasting up to 2.2 s, in the 3–18 kHz range), and barks (broadband signals between 500 Hz and 2.2 kHz, lasting up to 120 ms). Dugong calls in southern Thailand have also been categorised into short- and long-duration calls [10], which seem to correspond to chirps and trills, respectively. To date, no vocalisations corresponding to barks have been recorded from the Thai population of dugongs, the three dugongs kept in Toba Aquarium in Japan [11], or those at Underwater Seaworld in Singapore [12]. Furthermore, it has been reported that free-ranging dugongs produce callback responses to conspecific chirp playbacks and that the source level and duration of dugong chirps increase significantly as signalling distance increases [13].

Temporal and spatial vocalisation patterns of marine mammals, including dugongs, provide insights into favoured places for communication [14]. Distinctive differences between the distribution patterns of calving herds, non-calving herds, and vocalising dugongs were reported, with the vocalising dugongs in Thai waters clustering in a specific area of approximately \(1\,\hbox {km}^{2}\) [15, 16], far smaller than their recognised daily home range. Dugongs have shown elevated levels of vocal behaviour in this small area, which appears to be a ‘vocal hotspot’ [13, 17] of smaller size than vocal hotspots of other marine mammals [18, 19]. It was assumed that dugong vocalisation in this area was independent of feeding behaviour, given no seagrass existed within the vocal hotspot [20]. It follows that the vocal hotspot is an important location for dugongs, into which they move to communicate, that is separate from other areas, such as those where they feed, otherwise socialise, or rear their young [16].

For a detailed understanding of dugong habitat use, vocalisation should be investigated at a fine scale within home ranges. Vocal hotspots have been observed through short-term stationary acoustic recordings [17] and towed acoustic surveys during the daytime [15, 16]. Long-duration recordings are required to describe the degree of variability in call production by dugongs (i.e. their vocal cue production rate), which is important if passive acoustic monitoring is to be used in conservation assessment. Quantifying vocalisation rates is an essential step towards accurate estimation of animal population density and behaviour [7, 8], but vocalisation rates alone do not enable abundance estimates to be made. It remains unknown whether such vocalisation patterns and their relationship with environmental factors in the vocal hotspot differ from those at nearby locations in different habitats. The direction of dugong movement has been related to current flow [5]; it is likely that dugongs use current flow for daily migration, which may affect the number of calls observed by a fixed recorder as they pass through the recording range [17]. Furthermore, the frequency of occurrence of recorded feeding sounds has been shown to change with water level [21, 22].

Fig. 1 Study site, southern Talibong Island, Trang, Thailand. AUSOMS were deployed in two locations, represented by stars: open star (feeding area), and solid star (vocal hotspot). Shaded areas contained within dashed lines indicate seagrass distribution, as surveyed by Nakanishi et al. [20]

Table 1 Field research details

The purpose of this study was to investigate how different habitats with different environmental factors affected dugong vocalisation rate. We recorded dugong calls within the vocal hotspot and a neighbouring feeding area, compared temporal patterns of vocalisation between these two sites, and correlated environmental factors (i.e. current speed and direction, and water level) with vocalisation rate. This research provides insight into local dugong conservation measures by using acoustic monitoring to elucidate key habitats and activity patterns.

Materials and Methods

Underwater recordings were conducted in the south of Talibong Island, Trang, Thailand (7.2151\({^{\circ }}\)N, 99.4012\({^{\circ }}\)E, Fig. 1) from 5 to 28 February 2015 (Table 1), for 20 days of total recording time (two 10-day recordings). The seafloor around Talibong Island is covered by approximately \(7\hbox { km}^{2}\) of seagrass beds, dominated by Halophila ovalis, Cymodocea serrulata, and Cymodocea rotundata [20]. The seafloor is gently sloping, with few obstacles to obstruct the path of sound waves in the focal area. Some 120 dugongs were estimated to inhabit this area, representing the largest population in Thailand [23, 24]. In recent decades, there has been a single reported instance of dugong death by shark attack, on which basis it has been assumed that predation risk here was very low [24]. Our study was conducted during Thailand’s dry season, with no precipitation during field recordings.

Table 2 Description of variables and categories used for model estimation

Automatic underwater sound monitoring systems (AUSOMS, AquaSound Inc., Kobe, Japan) were deployed on the seafloor in the vocal hotspot and at the offshore edge of an adjacent seagrass bed (Fig. 1) where dugongs are known to regularly feed (K. Kittiwattanawong pers. obs.). The vocal hotspot and feeding area were approximately 1000 and 600 m offshore, respectively. The seagrass bed was 1.3 m deep and the vocal hotspot 4.1 m deep when the AUSOMS were deployed. AUSOMS use a single hydrophone (AQH-20k, AquaSound Inc., Kobe, Japan) with a sensitivity of −195 dB (re 1  V/\({\upmu }\)Pa). The hydrophone had a flat frequency response within 2 dB between 20 Hz and 96 kHz. Because of the amplifier gain of 60 dB and the AUSOMS internal noise floor of \(\sim \)70 dB, the dynamic range was 70–135 dB. The sampling frequency was 48 kHz with a 16-bit resolution. Underwater sound was stored in eight flash memory cards (32 GB microSDs). Recordings were not conducted from 11:00 am on 15 February 2015 to 11:00 am on 17 February 2015, because AUSOMS had to be retrieved for data downloading and system maintenance.
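The recording limits and storage load implied by these settings can be checked with a short calculation. The sketch below is illustrative only: it assumes that the recorder clips at a full-scale input of 0 dBV (1 V) and that audio is stored as uncompressed single-channel PCM, neither of which is stated above, and the function names are hypothetical.

```python
def clipping_level_db(sensitivity_db, gain_db, full_scale_dbv=0.0):
    """Sound level (dB re 1 uPa) at which the amplified signal reaches full scale.

    full_scale_dbv = 0.0 (i.e. 1 V) is an assumption, not a documented AUSOMS value.
    """
    return full_scale_dbv - (sensitivity_db + gain_db)


def pcm_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Size of uncompressed PCM audio in bytes (uncompressed storage is assumed)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


# Values taken from the text: -195 dB re 1 V/uPa, 60 dB gain, 48 kHz, 16 bit.
upper_limit_db = clipping_level_db(sensitivity_db=-195.0, gain_db=60.0)
ten_day_bytes = pcm_bytes(48_000, 16, seconds=10 * 24 * 3600)
card_capacity_bytes = 8 * 32 * 10**9  # eight 32 GB cards, decimal gigabytes

print(f"upper limit: {upper_limit_db:.0f} dB")
print(f"10-day recording: {ten_day_bytes / 1e9:.1f} GB "
      f"of {card_capacity_bytes / 1e9:.0f} GB available")
```

Under these assumptions the upper limit agrees with the 135 dB quoted above, and a single 10-day deployment occupies roughly one third of the available card capacity.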

Current speed and direction were recorded by a current meter (Infinity EM, JFE Advantech Co., Ltd., Hyogo, Japan) deployed in the vocal hotspot. The baseline water level at each location was obtained from a tide table published by the Hydrographic Department of the Royal Thai Navy. The actual water level at each location was calculated by adding a site-specific offset to the baseline water level; this offset was the difference between the actual depth measured when the AUSOMS was deployed and the corresponding baseline water level at that time.
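The water level correction described above amounts to a constant offset per site. A minimal sketch follows, assuming the offset is fixed at deployment time; the function name and the tide-table values are hypothetical, and only the 4.1 m deployment depth comes from the text.

```python
def actual_water_level(baseline_level_m, deployed_depth_m, baseline_at_deployment_m):
    """Shift a tide-table (baseline) level by the offset observed at deployment."""
    offset_m = deployed_depth_m - baseline_at_deployment_m
    return baseline_level_m + offset_m


# Hypothetical tide-table readings; 4.1 m is the vocal hotspot depth at deployment.
hotspot_level_m = actual_water_level(
    baseline_level_m=2.0,          # tide-table level at the hour of interest
    deployed_depth_m=4.1,          # measured depth when the recorder was deployed
    baseline_at_deployment_m=1.5,  # tide-table level at the deployment time
)
print(f"estimated water level: {hotspot_level_m:.2f} m")
```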

Dugong calls were automatically detected from recorded sound using customised software developed in a previous study with MATLAB [25]. To improve the accuracy of automatic detection, obvious false detections were removed following a visual inspection of sonograms by the first author. Onset times of extracted calls were used for analysis. The ambient noise level was approximately 90 dB at both sites. Considering that the root-mean-square (RMS) source levels of two types of dugong calls (i.e. chirp and trill) were 130–140 dB [13, 26], and assuming that the ambient noise level was 90 dB throughout the survey and that sound propagation followed a spherical spreading model, the detection range using this software was approximately 200 m. Thus, calls recorded at each site were assumed to be independent, as the distance between sites was approximately 400 m. The detection rate of this program was 80% for calls for which the received level exceeded 98.2 dB (150-m radius) [27]. Chirps and trills were not distinguished in this study.
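The detection range estimate can be illustrated with the spherical spreading relation, in which the received level falls by 20 log10(r) dB at range r. The sketch below assumes source levels referenced to 1 m and ignores absorption and boundary effects; the function names are hypothetical and this is not the detection software itself.

```python
import math


def received_level_db(source_level_db, range_m):
    """Received level under spherical spreading (source level referenced to 1 m)."""
    return source_level_db - 20.0 * math.log10(range_m)


def detection_range_m(source_level_db, threshold_db):
    """Range at which the received level drops to the threshold."""
    return 10.0 ** ((source_level_db - threshold_db) / 20.0)


NOISE_LEVEL_DB = 90.0  # ambient noise level reported for both sites

for source_level in (130.0, 136.0, 140.0):
    r = detection_range_m(source_level, NOISE_LEVEL_DB)
    print(f"source level {source_level:.0f} dB -> range about {r:.0f} m")
```

With these assumptions, the reported 130–140 dB source levels bracket ranges from 100 m to a little over 300 m, so a figure of approximately 200 m corresponds to a source level near the middle of that interval.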

Table 3 Summary of observed dugong calls

We defined vocalisation rate as the number of calls per hour. Periodicity and periodic cycles were examined for each site by calculating the autocorrelation coefficients and power spectral density (PSD) of the vocalisation rate.
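The periodicity analysis can be sketched in a few lines. The example below is an illustration rather than the analysis code used here: it computes biased autocorrelation coefficients of an hourly series and evaluates their cosine (Fourier) transform at candidate periods. The synthetic 24-h series, the maximum lag, and the list of candidate periods are all arbitrary choices made for the demonstration.

```python
import math


def autocorrelation(series, max_lag):
    """Biased autocorrelation coefficients r_0..r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def psd_at_period(acf, period_h):
    """Fourier (cosine) transform of the autocorrelation at one period, in hours."""
    freq = 1.0 / period_h
    return acf[0] + 2.0 * sum(
        acf[k] * math.cos(2.0 * math.pi * freq * k) for k in range(1, len(acf))
    )


def dominant_period(series, max_lag, candidate_periods):
    """Candidate period with the largest PSD value."""
    acf = autocorrelation(series, max_lag)
    return max(candidate_periods, key=lambda p: psd_at_period(acf, p))


# Synthetic hourly vocalisation rate with a 24-h cycle over 20 days.
demo_rate = [10.0 + 8.0 * math.cos(2.0 * math.pi * t / 24.0) for t in range(480)]
demo_peak = dominant_period(demo_rate, max_lag=120, candidate_periods=range(2, 49))
print(f"dominant period: {demo_peak} h")
```

Evaluating the transform at explicit periods keeps the sketch short; an FFT-based estimate would be the usual choice for longer series.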

Fig. 2 Time series of changes in the number of calls per hour. Recordings were not conducted from 11:00 am on 15 February 2015 to 11:00 am on 17 February 2015 for data downloading and system maintenance

Fig. 3 Power spectral density (PSD) of dugong vocalisation rate. PSD was obtained from the Fourier transform of the autocorrelation coefficients and indicates the dominant periodicity of vocalisation rate (i.e. observed calls per hour). Peaks are indicated with arrows

We applied a zero-inflated negative binomial (ZINB) model to examine the influence of environmental factors on vocalisation rate, as vocalisation rate had many more zeros than expected under a Poisson or negative binomial distribution. The ZINB model has previously been applied to describe the spatial distribution and occurrence of rare species, given that it can account for excess absences in data [28, 29, 30]. In this study, the magnitude of the vocalisation rate was modelled using the negative binomial count model (negative binomial regression), and the occurrence of dugongs in the observation area, as indicated by the presence of vocalisation, was modelled using the binomial zero-inflation model (logistic regression). We used the ‘pscl’ package [31] in R version 3.2.4 [32] for model analysis. This package supports maximum likelihood estimation of zero-inflated and hurdle models for count data.
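To make the two-part structure of the ZINB model concrete, its probability mass function can be written out directly. The sketch below uses a common mean and dispersion parameterisation (mu, theta) with a separate excess-zero probability; the parameter values are hypothetical, and the code illustrates the distribution only, not the maximum likelihood fitting performed by the ‘pscl’ package.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial probability of count y, with mean mu and dispersion theta."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, theta, pi_zero):
    """Zero-inflated negative binomial: extra zeros occur with probability pi_zero."""
    count_part = (1.0 - pi_zero) * nb_pmf(y, mu, theta)
    return pi_zero + count_part if y == 0 else count_part


# Hypothetical parameters chosen only to show the effect of zero inflation.
MU, THETA, PI_ZERO = 13.5, 0.5, 0.3
print(f"P(0 calls) without zero inflation: {nb_pmf(0, MU, THETA):.3f}")
print(f"P(0 calls) with zero inflation:    {zinb_pmf(0, MU, THETA, PI_ZERO):.3f}")
```

In a regression setting, mu and pi_zero would each depend on the explanatory variables through a log link and a logit link, respectively.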

We used time period, current speed and direction, and water level as explanatory variables in analyses, and included all variables in both the count and zero-inflation models (Table 2). We divided time of day into eight 3-h periods (e.g. 00:00–02:59, 03:00–05:59) and treated them as a categorical variable. As the times of high and low tides shifted by approximately 50 min daily, tidal conditions at a fixed time of day differed from day to day. Thus, explanatory variables related to tidal change (current speed and direction, and water level) were assumed to be independent of time of day. Furthermore, tidal range varied from 0.8 m during neap tides to 3.3 m during spring tides. We assumed that water level was independent of current speed, which changed with tidal shift. Current speed and direction were recorded at 10-min intervals, with hourly mean values used in analyses. Current speed was treated as a continuous variable, while direction was represented by two categories, northeast and southwest, which together accounted for 94.3% of values. Water level at both sites was divided into four categories with a 0.75-m step, from 0 to 3.0 m in the feeding area and from 2.5 to 5.5 m in the vocal hotspot, and treated as a categorical variable. The model used in this study is presented as follows (Eq. 1).

$$\hbox {Vocalisation rate} \sim \hbox {time period} + \hbox {current speed} + \hbox {current direction} + \hbox {water level} \tag{1}$$
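The coding of the explanatory variables can be sketched as follows. This is a minimal illustration under stated assumptions: category boundaries are treated as half-open intervals with the top edge folded into the last category, which is not specified in the text, and the function names and example readings are hypothetical.

```python
def time_period_label(hour):
    """Map an hour of day (0-23) to one of eight 3-h periods."""
    start = (hour // 3) * 3
    return f"{start:02d}:00-{start + 2:02d}:59"


def water_level_category(level_m, lower_m, step_m=0.75, n_categories=4):
    """Index (0-3) of the 0.75-m water level category, counted from lower_m."""
    index = int((level_m - lower_m) // step_m)
    return min(max(index, 0), n_categories - 1)  # clamp values on or beyond the edges


def hourly_mean(ten_minute_values):
    """Mean of the 10-min current speed readings within one hour."""
    return sum(ten_minute_values) / len(ten_minute_values)


# Feeding area categories start at 0 m; vocal hotspot categories start at 2.5 m.
print(time_period_label(4))                      # period containing 04:00
print(water_level_category(1.0, lower_m=0.0))    # feeding area, 1.0 m
print(water_level_category(4.1, lower_m=2.5))    # vocal hotspot, 4.1 m
print(hourly_mean([0.10, 0.12, 0.15, 0.13, 0.11, 0.11]))  # hypothetical speeds
```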

Results

A total of 2052 dugong calls were detected in the feeding area and 6607 calls in the vocal hotspot (Table 3). The average vocalisation rate (±SD) was 4.2 (±13.0) per hour in the feeding area and 13.5 (±24.4) per hour in the vocal hotspot. The vocalisation rate was significantly greater in the vocal hotspot than in the feeding area (Fig. 2, Wilcoxon signed-rank test, p < 0.001). In the feeding area, the number of 1-h time windows that contained at least one call accounted for 34.4% (168 of 489 h) of the total recording period. In contrast, this figure was 68.5% (335 of 489 h) in the vocal hotspot. Vocalisation rate showed distinctive periodicity in both sites (Ljung–Box test: feeding area, p < 0.05; vocal hotspot, p < 0.001), with the dominant PSD peak appearing on a 12.00-h cycle in the feeding area, and a 24.47-h cycle in the vocal hotspot (Fig. 3). In the feeding area, other distinctive peaks were apparent on 5.19-, 7.20-, and 23.53-h cycles. In contrast, the dominant PSD peak was far greater than others in the vocal hotspot (Fig. 3).
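The summary rates follow directly from the call totals and the 489 recorded hours per site. The short check below reproduces them from the figures given in the text; the function name is hypothetical.

```python
RECORDED_HOURS = 489  # number of 1-h windows per site, from the text


def summarise(total_calls, hours_with_calls, recorded_hours=RECORDED_HOURS):
    """Return (mean calls per hour, percentage of hours with at least one call)."""
    mean_rate = total_calls / recorded_hours
    percent_active = 100.0 * hours_with_calls / recorded_hours
    return round(mean_rate, 1), round(percent_active, 1)


feeding_area = summarise(total_calls=2052, hours_with_calls=168)
vocal_hotspot = summarise(total_calls=6607, hours_with_calls=335)

print("feeding area:", feeding_area)
print("vocal hotspot:", vocal_hotspot)
```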

Table 4 ZINB model results

From the coefficients of the negative binomial regression model in ZINB, all parameters of the time period, except for 09:00–11:59 in the feeding area, were positive and significant (Table 4). These results indicate that the vocalisation rate during 06:00–08:59 was lower than during almost all other time periods at both locations (Fig. 4).

The environmental factors that were significantly correlated with vocalisation rate differed between the two locations (Table 4). In the feeding area, the vocalisation rate was high when water level was 0.75–2.25 m (Fig. 5). However, the water level parameters were not significant in the vocal hotspot. In the vocal hotspot, the vocalisation rate was high with a northeast current direction. No parameters reached significance in the binomial model, which corresponded to the logistic regression.

Discussion

Although dugongs did not vocalise in every time window, considerable differences were apparent in the observed numbers and times of dugong vocalisation between the vocal hotspot and the nearby feeding area. The vocalisation rate was considerably greater in the vocal hotspot than in the feeding area during long-term observation. The difference in vocalisation patterns and their relationship to environmental factors indicated that dugong acoustic signal use differed between fine-scale habitats. Our findings suggest that dugongs use different locations for different purposes within their home ranges, even though the distance between the two sites was only 400 m. Whereas the vocal hotspot appeared to be important for communication, we assumed that the feeding area was used primarily for feeding. A possible reason for the lower vocalisation rate in the feeding area is therefore that dugongs do not vocalise while feeding, as has been reported for captive dugongs in Toba Aquarium (Y. Matsuo pers. comm.).

Acoustic monitoring has been used to characterise habitats and estimate the distributions of some cetaceans [33, 34]. This method is appropriate in cases where the probability of acoustic detection is sufficiently high that an animal’s presence would correspond to vocal detection. Overall, the average probability of acoustic detection of dugongs around Talibong and Muk Islands in Thai waters was approximately 40% [15]. However, detection probability increased to 73% around the vocal hotspot [27]. Thus, we consider their presence and absence closely matched their vocalisation patterns. On this basis, we discuss vocalisation pattern and habitat use at both sites.

Fig. 4 Vocalisation rate boxplot. The horizontal axis represents time of day, and the vertical axis vocalisation rate. Circles indicate outliers

Fig. 5 Scatter plot of vocalisation rate against water level at the feeding area

Vocalisation Pattern in the Vocal Hotspot

The vocalisation rate showed a distinctive 24-h cycle in the vocal hotspot. As has been reported in previous studies [17, 35], vocalisations were more frequent during the night (Fig. 4). Because of poor night-time visibility, dugongs might rely on acoustic cues for communication to a greater extent than they would during the day. Another possible reason for increased nocturnal communication is that dugongs were avoiding the acoustic interference of vessel traffic, which was more intensive during the day [36].

Dugong vocalisation following sunrise was limited over the entire observation period (Table 4; Fig. 4). This reduced vocalisation, which has not been previously reported, may reflect resting or migration outside of the recording range. Other survey methods, such as tagging, are needed to further investigate the behaviour of dugongs.

Dugongs migrate offshore when the tide falls, and towards the shore when the tide rises [5]. Such tide-driven movement can also be inferred from our ZINB model estimates for the vocal hotspot. The seagrass bed was situated northeast of the vocal hotspot (Fig. 1), and the vocalisation rate was high when the current direction was northeast. This suggests that dugongs vocalised when moving with the northeast current onto this seagrass habitat to feed. As dugongs are specialist feeders on seagrass, a diet very low in calories [37], such a tide-driven migration strategy might help them conserve energy.

Vocalisation Pattern in the Feeding Area

The ZINB model revealed the rate of vocalisation to be high in the feeding area when water level was moderate (0.75–2.25 m), indicating dugongs visited the site during certain stages of the tidal cycle. Our findings are consistent with those of previous studies on dugong feeding behaviour in Thailand. Tsutsumi et al. [21] recorded feeding sounds over an intertidal seagrass bed and reported the lowest water level in which dugongs fed was approximately 0.8 m. Furthermore, Amamoto et al. [22] revealed dugongs significantly favoured feeding at water levels of 1.0–1.5 m. Such water level-dependent feeding may be linked to energy conservation by minimising the distance to descend to feed and ascend to breathe.

PSD results in the feeding area revealed several peaks, supporting the hypothesis that dugongs feed at specific water levels. These multiple peaks might be related to changes in periodicity as the water level in the feeding area changed with the shift from spring to neap tides. For example, water levels of 1.4–1.6 m recurred on a 12-h cycle during spring tides and on 5- and 7-h cycles during neap tides. These tidal cycles matched the distinctive peaks in the PSD analysis, supporting the view that feeding depends on a specific water level range of 0.75–2.25 m.

Conclusion

Based on our results, we suggest that dugongs around Talibong Island use a vocal hotspot primarily for active socialising via acoustic communication during the night and that after sunrise they rest. Dugongs appear to visit the feeding area most when a northeast current runs and the water level is moderate (0.75–2.25 m), suggesting these conditions are the most suitable for feeding. Tide-driven migration to seagrass beds may also be important for energy conservation.

In this study, we focused on differences in the number of calls, without considering the number of vocalising individuals. Additionally, we did not distinguish between different types of calls, such as chirps and trills. To improve understanding of dugong acoustic behaviour, future studies need both to identify the caller and to analyse the acoustic characteristics of calls. Changes in visibility due to the brightness of the moon may also affect vocalisation rate. An observation period of at least 1 month is needed to investigate the influence of the lunar cycle.