9.1 Introduction

In many colony-living species of birds and mammals, parents and offspring have developed the ability to identify each other, leading to mutual benefits (Halliday 1983). Parents avoid misdirected care and thus secure their reproductive success, while for the young such recognition is essential for survival, since most parents/mothers feed only their own offspring. The degree of recognition (mutual or unilateral) varies with the social structure and environmental constraints of the species (Beecher 1989; Aubin and Jouventin 2002; Insley et al. 2003). In mammals, care of the young is mainly provided by the mother: except in rare cases of cooperative mammals, only the mother cares for her offspring, and she rejects, sometimes aggressively, any non-filial young (Le Boeuf and Briggs 1977; Harcourt 1992a; Maestripieri 1992).

Pinnipeds (seals, fur seals, sea lions, and walrus) are an excellent mammalian clade model for comparative studies of individual vocal recognition. First, pinnipeds use vocal signals, in air and/or under water, in most of their social interactions: territorial defense, mate selection, mother–young care, and predator avoidance (Insley et al. 2003). Second, they show a high diversity in their social structures (from solitary to highly colonial species), breeding systems (from serial monogamy to highly polygynous species), and maternal attendance (short to long lactation, high or low level of allonursing) (Table 9.1). Phocids (i.e., true seals) live solitarily or in small groups, with the exception of colonial phocids such as elephant and gray seals, which form large aggregations during the breeding season. In general, phocid females stay permanently with their young, which they suckle for several weeks (4 days to 2.5 months) (Riedman 1990). Allonursing and fostering can be observed but are not common. Otariids (i.e., fur seals and sea lions) show different characteristics: they form large colonies during the breeding season, although the density of animals varies among species. Females exclusively nurse their own young for several months (4–36 months) (Riedman 1990), and they can be highly aggressive toward non-filial pups (Harcourt 1992b). Throughout lactation, females alternate foraging trips at sea with suckling periods ashore, and the first separation occurs as early as 10–15 days after birth. Finally, odobenids (walrus) show characteristics similar to those of otariids: they live in groups of varying size, in which females always stay densely packed together. Walrus females stay permanently with their calf, even while foraging at sea, since the young is able to swim a few hours after birth (Kovacs and Lavigne 1992; Stewart and Fay 2001).
Separations between the mother and her calf can nevertheless be frequent because of their unstable habitat (fast and pack ice), and because predators or other disturbances (aircraft, human activities) around the colony can induce stampedes. Walrus females nurse their young for up to 3 years, and the social bond between the mother and her calf is among the strongest in mammals (Knudtson 1998).

Table 9.1 Biological and social characteristics of the three pinniped families

Such a gradient in the social and breeding systems of pinnipeds results in different selective pressures for mother–pup vocal recognition. A comparative approach allows a better understanding of how social organization can shape individual vocal recognition systems (Fig. 9.1). Drawing on the quantitative and experimental work carried out on several species of pinnipeds, I aim to demonstrate that the gradient found in selective pressures for individual recognition is mirrored in the complexity of their recognition systems. I define the complexity of a vocal recognition system by its characteristics or “dimensions”: occurrence (i.e., presence/absence, mutual or unilateral recognition), ontogeny (rapid or slow development of individual recognition), individual vocal stereotypy (i.e., low or high level of individuality in calls), and signature complexity (number and characteristics of the acoustic parameters involved in identification, and their resistance to degradation during propagation in the natural environment). Based on current knowledge of mother–young vocal recognition in pinnipeds, I will describe the different dimensions of their recognition systems and then discuss the link between social and recognition systems.

Fig. 9.1 Gradient of biological and social traits of pinnipeds, the resulting selective pressures on mother–pup recognition, and their hypothesized influence on the individual recognition system

A prerequisite for individual vocal recognition is the use of vocalizations showing individual stereotypy. A first step is therefore to analyze the signals to determine whether they are sufficiently individualized to allow reliable individual identification. This signal analysis aims to describe the acoustic parameters that could encode individual identity (frequency modulation, spectral features, duration). Many studies have investigated the level of individuality in pups’ and mothers’ calls by performing discriminant function analyses (DFA) or using artificial neural networks (ANN). However, differences among studies in the number of measured acoustic parameters, the number of calls per individual, and the number of individuals make comparisons among classification rates quite difficult (Insley et al. 2003; Khan et al. 2006). For instance, a correct classification rate of 90% obtained on three individuals of species A is not the same as a rate of 90% obtained on ten individuals of species B. To “standardize” the results and compare these studies more reliably, I propose to take into account the number of individuals in these analyses and to calculate the index of vocal stereotypy (IVS), which is the ratio between the correct classification rate and the chance level, with chance (in %) being defined as (1/total number of individuals) × 100. In the previous example, the IVS of species A is thus 2.7 (90/33.3), whereas the IVS of species B is 9 (90/10). Such standardization by the number of individuals gives a better idea of the vocal stereotypy and makes comparisons among species or populations more reliable. On this basis, I calculated the IVS for both mothers’ and pups’ calls from all studies on pinnipeds, and these results are compiled in Table 9.2. For both mothers and pups, IVS varies with the selective pressures for individual vocal recognition.
Indeed, the individual vocal stereotypy decreases when ecological constraints for mother–pup recognition decrease (Fig. 9.2).
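
As a minimal sketch, the IVS calculation described above can be written in a few lines of Python. The function names are hypothetical and chosen for illustration only; the numbers are those of the worked example with species A and B.

```python
def chance_level(n_individuals):
    """Chance classification rate in percent: (1 / number of individuals) x 100."""
    if n_individuals < 1:
        raise ValueError("n_individuals must be at least 1")
    return 100.0 / n_individuals


def index_of_vocal_stereotypy(correct_rate_pct, n_individuals):
    """IVS = correct classification rate (%) divided by the chance level (%)."""
    return correct_rate_pct / chance_level(n_individuals)


# Worked example from the text: the same 90% classification rate in both cases.
ivs_species_a = index_of_vocal_stereotypy(90.0, 3)   # three individuals
ivs_species_b = index_of_vocal_stereotypy(90.0, 10)  # ten individuals
print(round(ivs_species_a, 1), round(ivs_species_b, 1))
```

Because the chance level falls as the number of individuals rises, an identical classification rate yields a higher IVS when more individuals are included in the analysis.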

Table 9.2 Index of Vocal Stereotypy (IVS) of the different pinniped species studied, with regard to their ecological constraints and selective pressures for individual recognition

Fig. 9.2 Index of Vocal Stereotypy (IVS) in mothers’ and pups’ calls of pinniped species showing high, moderate, and low selective pressures for individual vocal recognition

These results are consistent with those found in penguins, whose species differ in their ecological constraints (colony density, presence of territories and nests, background noise). Acoustic analyses of contact calls in four penguin species revealed that penguins without nests (King and Emperor penguins), and thus with the highest constraints for individual recognition, show a higher individual vocal stereotypy than nesting penguin species, which face fewer constraints for individual recognition (Aubin and Jouventin 2002). Although individual vocal characteristics are closely linked to physical traits of the emitter (vocal tract length, resonance cavities) (Riede and Fitch 1999), which can explain a certain level of individual stereotypy in vocalizations, ecological constraints also greatly modulate this level.

9.2 Vocal Recognition and Ontogeny

9.2.1 Evidence for Vocal Recognition

An individual vocal signature revealed by acoustic analysis does not necessarily mean that a given species uses the vocal signal to identify individuals. For instance, in gray seals (Halichoerus grypus), mother–pup vocal recognition occurs in one colony studied in Canada (McCulloch et al. 1999), whereas another colony studied in Scotland did not show such vocal recognition (McCulloch and Boness 2000). Several hypotheses have been suggested to explain such a difference within a given species. The density of animals is very high in Canada and low in Scotland. The habitat also differs between the two colonies: in Canada the colony is located on a sandy beach, an open area without any landmark, whereas in Scotland the colony is established on a rocky area. Strong differences were also found in maternal attendance: females in Scotland often go foraging during lactation and are thus frequently separated from their young, whereas in Canada females fast and stay permanently with their pup. The level of allonursing is quite high in Scotland but low in Canada. Based on these biological and environmental traits, the two populations differ significantly; however, both show a need for mother–pup vocal recognition and should therefore have developed it (frequent separations between mothers and pups in Scotland, and a very dense colony with no landmark in Canada), and the colony in Scotland exhibits an even greater need than the one in Canada. These findings are thus quite paradoxical, as individual recognition has developed only in the Canadian colony. It has been suggested that the individual recognition found in the Canadian colony could be a residual behavior from an ice-breeding ancestry.
Living in an unstable environment such as ice increases the chance of separations between mothers and pups; this would have led to the development of vocal recognition between mothers and pups, which has been maintained even though selective pressures for individual recognition have decreased over time.

Such an example reinforces the idea that it is essential to test the animals experimentally in order to assess the occurrence of such individual discrimination. In the last 30 years, mother–pup vocal recognition has been tested by playback experiments in 13 species, including seven otariids, four phocids, and one walrus subspecies (Table 9.3). Depending on the species, only the mother or only the pup was tested, but for some species both sides were investigated, with mutual vocal recognition demonstrated in northern fur seals (Callorhinus ursinus), subantarctic fur seals (Arctocephalus tropicalis), and Australian sea lions (Neophoca cinerea). Further investigations are still needed, especially in phocids, to draw firm conclusions, but currently Otarioidea (otariids and odobenids) show a well-developed vocal recognition system, as do some colonial phocids (elephant, gray, and harbor seals), whereas most non-colonial phocids do not exhibit mother–pup recognition (Table 9.3).

Table 9.3 Experimental tests on the mother–pup vocal recognition in pinnipeds

9.2.2 Onset of Vocal Identification

If vocal recognition exists, it seems essential to investigate when this identification is established, and also to examine the factors that may affect the development of this cognitive process. Studies on the development of mother–pup vocal recognition are quite rare, and only four otariid species have been studied so far. Pups identify their mother’s calls 10–30 days after birth in the Galapagos sea lion (Zalophus wollebaeki, n = 8 (Trillmich 1981)), 10 days after birth in the Galapagos fur seal (Arctocephalus galapagoensis, n = 4 (Trillmich 1981)), between 2 and 5 days after birth in the subantarctic fur seal (A. tropicalis, n = 9 (Charrier et al. 2001)), and between 10 days and 2 months in the Australian sea lion (N. cinerea, n = 10 (Pitcher et al. 2009)). The species facing the highest ecological constraints (A. tropicalis) is the one in which vocal recognition is established most rapidly, and in particular before the first separation between the mother and her pup. For the other three species, colony densities are lower, so the risk of confusion among individuals is weaker and the development of recognition of the mother’s voice by pups takes more time. Such late vocal discrimination in pups might be compensated for by an early vocal recognition of the pup by the mother. Indeed, observations on different otariid species suggest that females can discriminate their pup’s voice a few hours after birth (Trillmich 1981; Charrier pers. obs.); however, this has only been shown experimentally in the Australian sea lion (n = 17 females) (Pitcher et al. 2010): 48 h after parturition, Australian sea lion females can discriminate between the calls of a non-filial pup and those of their own pup. For species showing mutual vocal recognition, it is likely that recognition is established first in females and later in pups. The time difference between females and pups may vary with the strength of ecological constraints.
Further investigations are still needed in phocids, for which both the occurrence of vocal recognition and the selective pressures for individual recognition vary greatly.

9.3 Individual Vocal Signature

An essential next step, and one that has interested me for years, is to decipher the individual vocal signatures involved in such individual identification processes. By performing playback experiments using modified and/or synthetic signals in which a particular acoustic parameter has been modified or removed, we can determine the different parameters involved in this identification process. Finally, propagation tests in the natural environment of the study species are used to determine the active space of their communication signals. This makes it possible to estimate a theoretical maximal distance at which a vocalization can be reliably detected by the young or the mother in the colony.

9.3.1 Cracking the Code of Individual Recognition

How individual vocal characteristics are coded has been tested experimentally in only three otariid species: the subantarctic and Antarctic fur seals (A. tropicalis and A. gazella, respectively), and the Australian sea lion (N. cinerea). In these three species, both mothers and pups rely on a multi-parametric vocal signature to decode the identity of the emitter. The main acoustic features involved in this identification process are the frequency modulation (FM), the amplitude modulation (AM), and the energy spectrum (ES, i.e., the distribution of energy across the frequency bandwidth, or timbre) of the call (Charrier et al. 2002, 2003, 2009; Pitcher et al. 2012; Aubin et al. 2015). However, some differences can be detected among these three species, which show high to moderate selective pressures for individual recognition (Table 9.4). The two species with high selective pressures for individual recognition (i.e., subantarctic and Antarctic fur seals) perform a temporal analysis of the calls using FM and/or AM, as well as a timbre analysis, to discriminate among individual voices. The species with moderate selective pressures (i.e., Australian sea lion) also performs a temporal analysis using both AM and FM, together with a pitch analysis that attends to the exact frequency values of the calls, but does not use timbre. Before drawing firm conclusions, we need to further investigate species showing low selective pressures for individual recognition, such as non-colonial phocids. These future studies may reveal an individual vocal signature relying mostly on a pitch analysis (a simple code, as the risk of confusion among individuals is quite low).
Indeed, the comparative study on penguins showed that species without nests use a temporal analysis, whereas species with nests use a spectral analysis (using both pitch and timbre), a signature thus considered less complex (Aubin and Jouventin 2002).

Table 9.4 Individual vocal signatures and acoustic features involved in the recognition process

From a production point of view, AM and FM are considered more difficult to produce than spectral cues (Greenwalt 1968; Brackenbury 1982; Aubin and Jouventin 2002), as the emitter needs to control two sound dimensions perfectly at the same time (AM: amplitude and time; FM: frequency and time), and this control has to be maintained over time (either over several years in the case of mate recognition, or over a breeding season for parent–offspring recognition). In contrast, spectral features (either timbre or pitch) do not require as fine a motor control as temporal features, and thus seem less complex to produce. Finally, in terms of coding possibilities, an identity code based on temporal features such as AM and FM offers a larger set of individual signatures, as it combines two dimensions (Greenwalt 1968; Aubin and Jouventin 2002), and thus limits the risk of confusion among individual voices. In contrast, a code based on one dimension, such as spectral cues (pitch and/or timbre), offers a lower diversity of vocal signatures (with pitch cues offering less diversity than timbre cues), and can thus potentially lead to confusion among individuals.

In summary, vocal signatures based on a temporal analysis are considered as more complex (stronger motor control, and high diversity of possible vocal signatures) than those relying on a spectral analysis. However, independently of the type of coding (temporal or spectral), the individual vocal signature used by a species is efficient and thus well adapted to the ecological constraints faced by the species. In other words, a species facing weak ecological constraints, and thus low selective pressures for individual recognition, does not need to develop a complex vocal signature as a “simple” signature offers sufficient vocal signatures to avoid confusion among individuals.

9.3.2 Propagation of the Vocal Signature

The use of multiple parameters is a way to secure the code and thus to optimize the chance of detection and identification, especially for species living in a constraining environment such as a noisy and confusing colony. Propagation tests performed on different pinniped species have shown different propagation efficiencies in their natural environments; however, it is important to consider these results in the context of mother–young reunions and their distance ranges. For instance, in the Australian sea lion, mothers’ and pups’ calls can be reliably identified up to 32 and 64 m, respectively, as the energy spectrum that codes for individual identity is still reliably detected at these distances (Charrier et al. 2009; Pitcher et al. 2012). In the wild, mothers and pups exchange vocalizations over distances of up to 50 m, and exchanges can start from even further away, as mothers begin calling while still in the water. Moreover, pups usually stay around the last suckling spot, which increases their chance of detecting their calling mother when she returns to the colony. Such midrange propagation efficiency of the individual vocal signature in Australian sea lions thus seems sufficient for the natural range of mother–pup reunions. For the Atlantic walrus (Odobenus rosmarus rosmarus), the distance over which calls remain reliable differs strongly between propagation on ice and above water, for both mothers’ and calves’ calls. All studied acoustic features propagate reliably up to 16 and 32 m on ice, and up to 128 m above water (Charrier et al. 2010). Most of the time, mothers and calves stay quite close to each other, and if they become separated during group movements or panic caused by a disturbance or a predator, the distances over which they must reunite are within 10–20 m.
This means that, either on ice or above water, the acoustic features showing a great individual stereotypy, such as FM and energy spectrum, can be assessed at a natural communication range and could thus be reliably used for individual identification. Similarly, propagation tests of harbor seal pup vocalizations above water show that individualized acoustic features remain reliable up to 512 m, a distance well above that at which mother–pup reunions are observed to occur (Sauvé et al. 2015b). These propagation studies have demonstrated that the acoustic features used, or that could be used, in mother–pup vocal recognition show propagation properties adapted to the environmental conditions in which mother–pup reunions occur. Further investigations could assess whether animals actually perceive and identify each other at these propagation distances, or whether they can perform even better, as shown in king penguins, which are able to extract the vocal signal when its intensity is 6 dB below that of the background noise (i.e., cocktail party effect (Aubin and Jouventin 1998)). Indeed, a recent study on the Antarctic fur seal (A. gazella) showed that pups identify their mother’s calls using AM, FM, and energy spectrum. Propagation experiments revealed that while FM propagates reliably up to 64 m, both AM and energy spectrum are degraded at distances over 8 m (Aubin et al. 2015). Playback experiments performed on groups of pups with a female’s calls (each group including the pup whose mother’s calls were used) at different distances (8, 32, and 64 m) showed that as the distance decreased, the number of responding pups also decreased, with only 1 or 2 pups responding at 8 m.
Such behavioral experiments clearly show that at long range the identification of the mother can lead to errors (several pups responded), whereas at short range, when none of the acoustic features of the individual signature is degraded and all are thus reliable, the identification is more robust. Redundancy of information secures the identification process, especially for species living in a constraining environment such as noisy colonies.
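
The logic of the Antarctic fur seal result can be illustrated with a short Python sketch that lists which features of the signature remain reliable at each playback distance. The distance limits are those reported above (Aubin et al. 2015); the dictionary, the function name, and the use of a hard distance threshold are simplifying assumptions made for illustration, since degradation during propagation is in reality gradual.

```python
# Maximum distance (m) at which each feature of the female's call still
# propagates reliably, as reported in the text for Antarctic fur seals.
# Treating these values as hard cut-offs is an assumption of this sketch.
RELIABLE_RANGE_M = {"FM": 64, "AM": 8, "ES": 8}


def reliable_features(distance_m, ranges=RELIABLE_RANGE_M):
    """Return the features still reliable at distance_m, sorted by name."""
    return sorted(name for name, limit in ranges.items() if distance_m <= limit)


# Playback distances used in the experiment described in the text.
for distance in (8, 32, 64):
    print(distance, reliable_features(distance))
```

Under these assumptions all three features are available at 8 m, whereas only FM remains at 32 and 64 m, which is in line with the larger number of pups responding at long range.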

9.4 Interactions of Acoustic with Non-acoustic Cues and Individual Recognition

It has been suggested that sensory cues besides vocalizations, such as olfactory, visual, and/or spatial cues, could be involved in mother–pup individual recognition (Kaufman et al. 1975; Terhune et al. 1979). Anecdotal descriptions of reunions suggested the involvement of acoustic and olfactory cues, with spatial and visual cues helping to localize the individuals (e.g., the area in the colony where the pup was left for spatial cues; body size and fur color pattern for visual cues). However, vocal signals remain the primary signals allowing efficient individual identification at both short and long range (some females call their pups while still in the water). Visual and olfactory cues could be involved in a second step of the identification process, when mothers and pups are at close range, and could be used as an intermediate or final check during reunions. Australian sea lion mothers have been tested experimentally with olfactory cues alone, and they are able to discriminate the smell of their own pup from that of a non-filial pup (Pitcher et al. 2010). Australian sea lion mothers have also been shown to use the body size and pelage color pattern of pups to discriminate among pups of different age classes (Wierucka et al. 2017). This visual discrimination of pup age classes by females likely plays an important role in facilitating mother–pup reunions. Indeed, this species exhibits an extended pupping period (up to 7 months (Marlow 1975)), and thus pups of different age classes (i.e., different body sizes and different contrast/brightness of the fur color pattern) occur at the same time in the colony. When a female comes back from a foraging trip at sea, she has to find her pup among others (pups often form crèches while their mothers are foraging at sea), and distinguishing the appropriate age class of her own pup will facilitate their reunion, and also reduce potential injury to pups by non-mother females (Wierucka et al. 2017).
It is likely that females can individually discriminate their pups using visual cues, as other mammals do (Parr and de Waal 1999; Kendrick et al. 2001), but this has not yet been tested.

Most studies on animal communication and individual recognition focus on a single sensory modality, and thus there is a lack of investigation into the synergy of sensory cues in the identification process. A recent study on Australian sea lions involving several sensory cues highlighted the predominant role of acoustic cues in a multimodal context for both mothers and pups. The addition of visual cues to acoustic cues did not enhance the pups’ responsiveness (Wierucka et al. 2018a). In females, the addition of olfactory and visual cues to acoustic cues enhanced their investigation behavior (i.e., sniffing) but did not enhance their vocal responsiveness (Wierucka et al. 2018b). Finally, when the relative importance of acoustic and olfactory cues in the recognition of pups by mothers was examined, acoustic cues dominated olfactory cues (Wierucka et al. 2018b): the vocal response of females relied only on acoustic cues and was not influenced by the identity or presence of olfactory cues (i.e., females’ responses were similar whether the olfactory cue was filial or non-filial). Such findings highlight the importance of understanding the relative role of sensory cues in communication and recognition processes. In a multimodal context, environmental and biological factors influence the use of cues, such as their active space and the costs and benefits of assessing and integrating them (Hebets and Papaj 2005). In a mother–young recognition context, the costs and benefits of obtaining sensory cues are quite different for mothers and offspring, especially in otariids, where females can be highly aggressive toward non-filial pups (Riedman 1990; Harcourt 1992b). For pups, assessing reliable olfactory and visual cues would require a close approach to the calling female and thus presents a high risk of injury. In contrast, females do not risk anything in approaching calling pups to assess additional cues.
Even if vocal cues alone are sufficient to identify the pups, other cues may serve as a final check before accepting to suckle the pup.

9.5 Conclusions

Based on current knowledge of mother–pup vocal recognition in pinnipeds, we can draw some general conclusions on the link between social and recognition systems. The wide ranges of both social structures and breeding strategies have resulted in differences in selective pressures for mother–pup individual recognition (Fig. 9.1). Even if further investigations are still needed in non-colonial phocids, which show low selective pressures for individual recognition, the gradient found in the selective pressures is also found in the complexity of the individual recognition system. Indeed, across the different levels or dimensions characterizing a recognition system, such as vocal stereotypy, ontogeny, and individual signature, there is clear evidence that species with the highest selective pressures for mother–pup recognition have developed a more complex recognition system (i.e., high index of vocal stereotypy (IVS), rapid onset of vocal recognition, multi-parametric vocal signature, temporal analysis involved in the recognition process) than species showing lower selective pressures (i.e., moderate to low IVS, delayed onset of vocal recognition, multi-parametric signature but spectral analysis involved in the identification process).

Such findings on this mammalian clade, the pinnipeds, are consistent with those found in colonial birds such as penguins. Vertebrate species facing similar communication networks and ecological constraints (e.g., group-living mammals, colonial birds) have developed similar communication systems. This suggests the occurrence of general communication strategies in vertebrates.