1 Introduction

Different sensory modalities enable different effective ranges for communication, defined as the distances over which a signaler and recipient can effectively communicate. The sense of touch can be a powerful medium for communication, but requires sender and receiver to be in direct physical contact. Chemicals can spread from sender to receiver, but this occurs slowly underwater, limiting the speed and distance of communicating by chemical signals in the sea. This is particularly limiting for highly mobile species such as the toothed whales. By contrast, acoustic and visual signals can be transmitted rapidly, and acoustic signals can travel over large distances underwater. Marine organisms can create signals by generating chemicals (pheromones), light (bioluminescence), or sound (vocalizations), but toothed whales are only known to generate acoustic signals.

2 Communication by Touch

Toothed whales have been reported to touch one another in agonistic and affiliative interactions. In agonistic interactions, it can be difficult to tell whether touch is being used to signal to an opponent or to harm it, so this section will focus on affiliative interactions. Tavolga and Essapian (1957) noted affiliative rubbing as part of pre-copulatory behaviors of common bottlenose dolphins (Tursiops truncatus). Norris (1991) and Dudzinski (1998) noted that one toothed whale may rub a conspecific with its flipper, and they suggested that this may take place as part of affiliative behavior. Sakai et al. (2006) described flipper rubbing among wild Indo-Pacific bottlenose dolphins (Tursiops aduncus) in great detail, pointing out that two dolphins may alternate rubbing the leading edge of the other’s flipper or one dolphin may use its flipper to rub the body of the other. The animal being rubbed may move so that a specific part of the body is being rubbed, and this may remove loose skin, suggesting a role similar to grooming in primates. However, although Sakai et al. (2006) report seeing parasites attached to dolphins, they did not observe one dolphin rub off a parasite that was attached to another dolphin. Rather, the way in which dolphins exchanged rubbing roles suggested to Sakai et al. (2006) that rubbing functions as an affiliative signal. This interpretation is strengthened by the observations of Tamaki et al. (2006) that after an aggressive interaction, captive common bottlenose dolphins took longer to engage in later aggression if they engaged in flipper rubbing with their former opponent. These observations suggest that not only is flipper rubbing an affiliative behavior but that it is also a signal for reconciliation after conflict (Weaver 2003).

3 Communication by Visual Sensing

The ability to recognize visual patterns of objects has limited range underwater. Light is attenuated and scattered as it passes through water, limiting vision to ranges of tens of meters underwater. The common bottlenose dolphin eye has a pupil with an unusual shape, which allows it to have relatively good visual acuity in air and in water. However, this limits the range for best underwater acuity to about 1 m (Herman et al. 1975). Dolphin mothers and calves may be able to recognize one another at ranges of 1 m or less, but vision does not play the same role for long-range perception that it does in most terrestrial mammals. Some toothed whales such as the killer whale (Orcinus orca) and the Commerson’s dolphin (Cephalorhynchus commersonii) have evolved high-contrast black and white body surface pigmentation patterns, with large patches that do not demand high acuity; these may have evolved for longer-range visual detection and discrimination. Pigmentation patterns are similar among matrilineally related clans of killer whales and differ across ecotypes (Baird and Stacey 1988), suggesting that pigmentation could provide recognition cues at relatively short ranges.

4 Integration of Information Across Senses

Animals can integrate information across sensory systems to solve communication problems. We know more about this process in terrestrial mammals than in toothed whales. For example, in settings where the young of several mothers can intermix, a female mammal faces the problem of assuring that the young she suckles is her own. The most important cue for a ewe (Ovis aries) is the visual appearance of her lamb, which allows her to reject distant lambs that do not have the right appearance. She can also recognize the voice of her lamb, but the most important adjunct to vision is the ability to smell her lamb once she approaches to <0.25 m (Alexander and Shillito 1977). We know very little about how cetaceans use chemical senses in this manner, but since they evolved from ungulates, they may have adapted chemical sensing to their aquatic environment.

5 Communication by Chemical Sensing

Most mammals have three ways to sense chemicals—the sense of taste discriminates compounds dissolved in water, olfaction senses a broader range of chemicals that are in air, and the vomeronasal sense primarily detects pheromones produced by conspecifics (Pihlström 2008). Toothed whales have a sense of taste that is similar to other mammals (Nachtigall and Hall 1984), but their sense of olfaction is different. Most mammals breathe relatively constantly, and this airflow passes odors across the olfactory mucosa in the nasal passages. Toothed whales hold their breath most of the time and breathe explosively. For example, bottlenose dolphins exhale more than 12× faster than humans (Fahlman et al. 2015). Toothed whales also use nasal structures to produce sound. These high flow rates and adaptations for sound production are not consistent with terrestrial olfaction. Toothed whales have neither olfactory mucosa nor an olfactory bulb in the brain (Pihlström 2008). Odontocete cetaceans have lost about 2/3 of their olfactory receptor genes, consistent with loss of olfactory function (Kishida et al. 2007). The vomeronasal sense is the other major system used by mammals to sense chemicals—it detects pheromones produced by conspecifics along with some other odorants (Baxi et al. 2006). This accessory chemical sense plays an important role in the detection of reproductive state in many mammals, including ungulates that are phylogenetically related to cetaceans (Montgelard et al. 1997). Though there is strong evidence that odontocetes have greatly reduced olfactory capabilities compared to terrestrial mammals, they may have evolved specialized chemical sensory abilities better suited to their aquatic lifestyle, perhaps located away from nasal structures that have specialized for non-olfactory functions.

6 Communication by Acoustic Sensing

When the terrestrial ancestors of cetaceans entered the sea, their senses had to adapt to a new environment. Vision is the best way to sense things that are far away in air, but light is not as good as sound for sensing distant objects in water. Whales and dolphins have adapted to their underwater world by specializing in sound much as we humans specialize in vision. Humans have about 1,159,000 nerve fibers in the optic nerve, 38 for every one in the auditory nerve (30,500), while an Amazon river dolphin, Inia geoffrensis, which inhabits murky, muddy waters, has only 0.15 optic nerve fibers for every auditory fiber (Mass and Supin 1989). Most other odontocetes have about the same number of optic as auditory fibers, and they achieve this by having 2.2 (finless porpoise, Neophocaena phocaenoides) to 5.3 (sperm whale, Physeter macrocephalus) times more auditory fibers than do humans (Ketten 1997).
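The fiber-count comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes that "N times more auditory fibers than humans" means N multiplied by the human count, which the text does not state explicitly.

```python
# Human nerve fiber counts as quoted in the text.
HUMAN_OPTIC_FIBERS = 1_159_000
HUMAN_AUDITORY_FIBERS = 30_500


def optic_to_auditory_ratio(optic_fibers, auditory_fibers):
    """Return the number of optic fibers per auditory fiber."""
    return optic_fibers / auditory_fibers


def auditory_fibers_from_multiple(multiple_of_human):
    """Estimate an auditory fiber count from a multiple of the human count.

    Assumption: "N times more" is read as N x the human count.
    """
    return multiple_of_human * HUMAN_AUDITORY_FIBERS


human_ratio = optic_to_auditory_ratio(HUMAN_OPTIC_FIBERS, HUMAN_AUDITORY_FIBERS)
finless_porpoise_auditory = auditory_fibers_from_multiple(2.2)
sperm_whale_auditory = auditory_fibers_from_multiple(5.3)

print(f"Human optic:auditory ratio = {human_ratio:.1f}")
print(f"Finless porpoise auditory fibers (estimate) = {finless_porpoise_auditory:,.0f}")
print(f"Sperm whale auditory fibers (estimate) = {sperm_whale_auditory:,.0f}")
```

Under that reading, the odontocete estimates are rough orders of magnitude derived from the quoted multiples, not measured counts.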

6.1 Echolocation

Toothed whales not only have specialized hearing, but they also have evolved the ability to detect objects in the dark by listening for echoes from their own sounds. Echolocation in toothed whales and bats has required coevolution of sound production and reception capabilities. One way that human and animal sonars increase the level of sound is to direct sound energy in a narrow beam. They also can reduce interfering noise by aligning directional hearing in the same direction as the sound beam. Au et al. (2012) showed that echolocation clicks recorded in the center of the sound beam of bottlenose dolphins are much higher in level and in peak frequency (~120 kHz) than when recorded at off-axis angles. Not only are dolphins able to hear well at 120 kHz, but they also have highly directional hearing at this frequency, with sensitivity aligned along the axis of sound transmission (Au and Moore 1984). The evolution of a sophisticated biosonar created selection pressures for specialized high-frequency sound production and hearing in toothed whales (Au et al. 2009).

Echolocation allows toothed whales to orient and forage in the dark deep ocean and at night, an advantage for a top predator. As air-breathing mammals, they maintain a body temperature elevated above ambient water, carry oxygen to sustain high metabolism during underwater pursuit, and use air to power sound production. Their mammalian ears allow for hearing of high echolocation frequencies, giving them a predator’s advantage. But by making echolocation sounds to forage, they also provide cues that eavesdroppers can detect. Four taxa of toothed whales have evolved a cryptic anti-predator strategy, producing echolocation signals that are so high in frequency that their primary predator, killer whales, apparently cannot hear them (Madsen et al. 2005; Morisaka and Connor 2007).

A form of communication occurs when members of the same species eavesdrop on each other’s echolocation signals to keep track of one another (Jones and Siemers 2011). For example, Cuvier’s beaked whales, Ziphius cavirostris, synchronize dives within a group. They disperse once they start to use echolocation to forage, perhaps to avoid distracting one another. When pairs of Ziphius are simultaneously tagged with acoustic recording tags, the clicks produced by one whale are often audible on the tag of the other whale, and they appear to use these clicks to reunite before they surface silently (Zimmer et al. 2005a).

6.2 Acoustic Communication

Some toothed whales, such as sperm whales and porpoises, make only click sounds and have adopted click signals derived from echolocation clicks for communication. Porpoises have evolved narrow-beam high-frequency clicks, possibly to avoid detection by eavesdropping predators (Morisaka and Connor 2007). To maintain crypsis, they use high-frequency clicks in different rhythmic patterns for communication (Sørensen et al. 2018), with different patterns produced in different behavioral contexts (Clausen et al. 2010). The directionality of echolocation clicks may be a feature for communication by cryptic animals, allowing them to direct their message to an intended recipient while making it less likely that animals elsewhere will hear them. Sperm whales use rhythmic patterns of clicks for communication, relying more on social defense against predation than on crypsis. They have evolved a sound production apparatus that produces clicks that are highly directional at frequencies of 8–25 kHz (Madsen et al. 2002a) and much less directional at frequencies of about 1–3 kHz (Zimmer et al. 2005b).

The sperm whale and the dwarf and pygmy sperm whales of the genus Kogia have only one pair of phonic lips to produce sound (Cranford et al. 1996). Other toothed whales produce sounds with two pairs of phonic lips, each one set above bony nares in the upper respiratory tract (Cranford et al. 1996). Delphinids make specialized communication sounds in addition to clicks, using the right set of phonic lips to produce echolocation clicks, and the left side to produce longer, tonal whistle sounds that function for communication (Madsen et al. 2013). The left phonic lips vibrate so rapidly under pneumatic pressure from the lungs that they can produce fundamental frequencies of <2 to >20 kHz (Madsen et al. 2011).

6.2.1 Contact Calls

A common function for communication involves maintaining contact between animals who share strong social bonds. For example, all mammals require a mechanism for lactating mothers to maintain contact with their young while they are dependent on suckling, a period that lasts for years in most toothed whales. Unlike terrestrial mammals, toothed whales live in an environment where mother and calf can seldom see one another if they separate by as little as 10 m, and their reduced olfaction hinders use of olfactory cues for recognition. This puts a higher priority on acoustic communication to maintain contact when mother and calf separate.

6.2.2 Signature Whistles in Bottlenose Dolphins

Some of the most detailed observations of communication signaling in toothed whales come from two species of bottlenose dolphin, the common and Indo-Pacific bottlenose dolphins, which have been observed closely in captivity and in the wild. Bottlenose dolphin calves produce tonal whistles and click-like sounds as early as the day they are born (Caldwell and Caldwell 1979). Young dolphins show a fascinating combination of precocial and altricial features. They are born highly mobile with well-developed senses and are able to surface independently to breathe within minutes of birth. Young calves often swim tens of meters from their mothers, well out of visual range. At the same time, dolphin calves have a remarkably long period of dependency. In the wild, they will suckle for many years and typically stay with their mother for 3–5 years, until the next calf is born (Wells 2003; for social structure details, see Wells for common bottlenose dolphin and Chap. 16).

The combination of high mobility, low visibility, and prolonged dependence has selected for the development of an acoustic communication system allowing mother and calf to maintain contact and to indicate a desire to reunite across long separations. Macfarlane (2016) used data from wild common bottlenose dolphin mother-calf pairs that were simultaneously tagged with acoustic recording tags (Johnson and Tyack 2003) to monitor sound production at different stages of separations and reunions. The tags were synchronized, enabling calculation of distance by timing how long it took for a sound emitted by one dolphin to be detected on the other tag. When a dolphin calf was swimming away from a mother who was producing echolocation clicks, the most likely detection range for the clicks was about 100 m, but when the mother was pointing toward the eavesdropping calf, the range was 300–450 m. These data suggest potential problems in maintaining contact once animals separate by more than 100 m. In addition, there is no evidence that a dolphin can tell which individual is making an echolocation click, so the listener could get confused if several dolphins are in acoustic detection range.
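The distance calculation from synchronized tags described above amounts to multiplying the acoustic travel time by the speed of sound. The following is a minimal sketch of that idea, not the method actually implemented by Macfarlane (2016): the function name is hypothetical, and the nominal sound speed of 1500 m/s is an assumed round value for seawater, which in practice varies with temperature, salinity, and depth.

```python
# Assumed nominal speed of sound in seawater (m/s); the real value
# depends on temperature, salinity, and depth.
ASSUMED_SOUND_SPEED_M_PER_S = 1500.0


def separation_distance_m(emit_time_s, detect_time_s,
                          sound_speed_m_per_s=ASSUMED_SOUND_SPEED_M_PER_S):
    """Estimate the distance between two synchronized tags.

    emit_time_s:   time the sound was recorded on the emitter's tag
    detect_time_s: time the same sound was recorded on the partner's tag
    Both times must come from the same synchronized clock.
    """
    travel_time_s = detect_time_s - emit_time_s
    if travel_time_s < 0:
        raise ValueError("detection cannot precede emission on synchronized tags")
    return travel_time_s * sound_speed_m_per_s


# Example: a click recorded 0.2 s later on the partner's tag.
distance = separation_distance_m(emit_time_s=10.0, detect_time_s=10.2)
print(f"Estimated separation: {distance:.0f} m")
```

Under the assumed sound speed, a travel time of 0.2 s corresponds to roughly 300 m, the lower end of the on-axis detection range quoted above. The sketch ignores the small offset between the sound source and the emitter's own tag.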

Each individual bottlenose dolphin produces a diversity of whistle calls, including a stereotyped individually distinctive whistle called the signature whistle, which comprises about 40–70% of the whistle repertoire of wild dolphins (Janik and Sayigh 2013). Signature whistles produced by dolphins have relatively omnidirectional fundamental frequencies and are individually distinctive, allowing closely bonded animals to keep track of their partners. Ever since signature whistles were first described in common bottlenose dolphins, they have been described as contact calls (Caldwell and Caldwell 1965), and these features make them better suited to this task than eavesdropping on echolocation clicks. Evidence from captivity (Janik and Slater 1998) and the wild (Smolker et al. 1993; Watwood et al. 2005) shows that bottlenose dolphins are more likely to produce individually distinctive signature whistles when separated from social partners than when together. The range at which dolphin whistles can be detected is better suited than clicks for reliable detection during separations. Quintana-Rizzo et al. (2006) followed hundreds of separations of common bottlenose dolphin mothers from their dependent calves and found a mean separation distance of about 100 m. The detection range of bottlenose dolphin whistles depends on habitat, but they are detectable in shallow habitats at ranges of many hundreds of meters and in deeper habitats to many kilometers (Quintana-Rizzo et al. 2006; Jensen et al. 2012), well beyond the maximum separation ranges observed. The shortest detection ranges were similar to those for echolocation clicks when a mother was pointing directly at the calf, but the omnidirectional whistles were detectable at any orientation of the whistler, an important feature for most communication.

Macfarlane (2016) identifies three different potential functions for contact calls: maintaining contact, regaining lost contact, and advertising identity. The pattern of call production predicted for maintaining contact is a spontaneous call rate that would allow animals to monitor each other’s location while separated. The function of maintaining contact would not necessarily predict that a listener would modify its behavior upon being informed of the location of a partner. Calls used to regain lost contact would represent a motivation for reuniting; these calls would be predicted to cause a partner to call back and approach. The function of advertising identity is predicted to be important for the last stage of reunion, when animals come close enough to exchange resources or face risk of attack. A common setting where this becomes important is when a parent must ensure that it is feeding the correct offspring, and the offspring must be wary of approaching a non-parent that might attack it. Quick and Janik (2012) provided support for an identity advertisement function when they showed that groups of common bottlenose dolphins in the wild engage in exchanges of signature whistles just before meeting, which they interpret as an exchange of identity information just before the decision to join.

Macfarlane (2016) tested the predictions of these three functions using data from simultaneously tagged mother-calf pairs of common bottlenose dolphins (calf age ranged from 2 to 7 years) during separations of >30 m. Every signature whistle produced by each dolphin during separations was scored as “separating” when animals were moving apart and as “reuniting” when they were coming together again. Logistic regression modeling suggested that whistles were more likely to be produced during reunions than during the separation phase. Another important factor for predicting when a signature whistle would be produced was the delay since the tag on the whistler last detected an echolocation click from its partner. These results are more consistent with the hypothesis that signature whistles are used for reuniting rather than just maintaining contact, a function that can also be supported by eavesdropping on echolocation clicks. Cases of increased whistle rates as a pair joined support the identity advertisement function, but the majority of signature whistles in this dataset were produced when the pair was not separated, suggesting the potential for other whistle functions not examined by Macfarlane (2016).

Most mammals rely upon voice cues for individual identification. Differences in voice produced by differences in the configuration of the vocal tract of different individuals provide sufficient cues for individual discrimination in many species. However, when an animal dives, the air-filled cavities of the vocal tract change shape (Jensen et al. 2011), making voice cues unreliable (Madsen et al. 2011). Bottlenose dolphins do not rely on voice cues to classify a signature whistle, but rather they use the distinctive pattern of modulation of the fundamental frequency, called the contour (Janik et al. 2006), which is under voluntary control of the whistler. Infant common bottlenose dolphins tend to produce simple unstereotyped whistles, but by the end of the first 3 months of life, most develop a stereotyped individually distinctive signature whistle (Caldwell and Caldwell 1979). Newborn dolphins stay so close to their mother that this proximity may reduce the need for an acoustic signal for individual identification, but by 3 months of age, they separate for long enough and far enough to pose problems for reunification.

This need for individual identification is heightened by the fission-fusion society of bottlenose dolphins where each grouping tends to last only for minutes, and a mother and calf may associate with dozens of individuals. Dolphins learn to produce signature whistles that vary in contour more across individuals and less within an individual than is typical for mammals that rely upon voice cues for individual recognition (Tyack 2000). When a wild dolphin hears the signature whistle of an animal with which it shares a strong social bond, it responds more strongly than when it hears a familiar whistle from a less strongly bonded dolphin (Sayigh et al. 1999), and dolphins even recognize the contours of synthetic whistles devoid of voice cues (Janik et al. 2006), demonstrating the ability to recognize signatures by contour rather than voice.

Vocal learning is rare among mammals, which tend to inherit the motor programs that generate species-typical vocalizations (Janik and Slater 1997). The gold standard for demonstrating vocal learning is to measure the pre-exposure repertoire of a subject, create a novel stimulus, and then show that the subject can copy the stimulus after hearing it. The best evidence for vocal learning in dolphins stems from studies of adults trained to imitate novel synthetic contours in the laboratory (Richards et al. 1984). There is also evidence that common bottlenose dolphins learn to incorporate acoustic features of sounds they hear into their signature whistles. The strongest evidence comes from cases where a dolphin calf incorporates a sound that differs from normal dolphin whistles. One common artificial sound in a dolphinarium is the trainer’s whistle, which indicates to a dolphin that it can get a reward. Calves that grow up in dolphinaria are more likely than wild dolphins to develop signature whistles with unmodulated frequency contours, similar to the trainer’s whistle (Miksis et al. 2002).

Once a female common bottlenose dolphin develops a stereotyped signature whistle in the wild, the signature whistle tends to remain stable for the rest of the dolphin’s life (Sayigh et al. 1990, 2007). However, development of a new social relationship can alter the signature whistle of male dolphins. As they mature, some male bottlenose dolphins form stable associations with one other male in the case of common bottlenose dolphins in coastal waters of Florida (Wells 1991) and one or two other males in the case of Indian Ocean bottlenose dolphins in the inshore waters of Western Australia (Connor et al. 1992). These male coalitions develop synchronized and coordinated behaviors that may function in improving foraging, protection from predators, territorial disputes between males from adjacent communities (Wells 2003), and competing with other males for access to females (Connor et al. 2001). Smolker and Pepper (1999) documented how the whistles of three Indo-Pacific bottlenose dolphins became more similar as their alliance bond developed, and Watwood et al. (2004) showed for common bottlenose dolphins that the whistles of established alliance partners are more similar to one another than to the whistles of other alliances. In these cases of convergence, each male retains some individually distinctive features of his whistle as the overall contour becomes more similar.

Dolphins not only use vocal learning in the development of their own signature whistles, but they also copy the signature whistles of dolphins with whom they share strong bonds. Tyack (1986) first discovered signature imitation in two captive bottlenose dolphins. Each dolphin primarily produced a stereotyped whistle contour that differed from the favored whistle of the other dolphin, but each dolphin also occasionally repeated a contour similar to the other dolphin’s signature. Similar copying has been observed in whistle exchanges between pairs of captive dolphins and in wild dolphins temporarily restrained for health assessment (King et al. 2013). In this setting, each dolphin primarily produces its signature whistle, but when two dolphins that share a strong bond are held together, one may copy the signature whistle of the other. Figure 2.1a, b shows spectrograms of whistles from exchanges between two mother-calf pairs recorded during temporary restraint. Figure 2.1c shows spectrograms of whistles from exchanges between two adult males recorded in an aquarium pool. The top row of each sub-figure shows three examples of the signature whistle of the dolphin being copied; the middle row shows the copies; and the bottom row shows three examples of the signature whistle of the dolphin making the copies. In exchanges where copying was detected, the rate of copying was 0.18 copy/min/individual, much lower than the overall whistle rate of 5.3 whistles/min/individual. King et al. (2014) used an interactive playback design where they waited until a dolphin subject produced its own signature whistle, and then they would play back either a synthetic copy of its signature (copy) or a different signature whistle (control). The subjects were significantly more likely to respond with their signature whistle to the copy than to the control, with an optimal time interval of 1 s between the original signature whistle and the matching stimulus. King et al. (2014) conclude that whistle matching is an affiliative signal that allows one dolphin to direct a signal to a particular individual within a large communication network.
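To put the two rates quoted above on a common footing, the copying rate can be expressed as a fraction of the overall whistle rate. This is a back-of-the-envelope sketch with hypothetical names; it assumes the two rates are directly comparable, which the text implies but does not spell out.

```python
# Rates quoted in the text, per minute per individual.
COPY_RATE_PER_MIN = 0.18
WHISTLE_RATE_PER_MIN = 5.3


def copy_fraction(copy_rate, whistle_rate):
    """Return the copy rate as a fraction of the overall whistle rate."""
    if whistle_rate <= 0:
        raise ValueError("whistle_rate must be positive")
    return copy_rate / whistle_rate


fraction = copy_fraction(COPY_RATE_PER_MIN, WHISTLE_RATE_PER_MIN)
print(f"Copies make up about {fraction:.1%} of whistles in these exchanges")
```

On these figures, copies are a small minority of whistles even in exchanges where copying occurs, on the order of a few percent.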

Fig. 2.1

Spectrograms of whistles exchanged by three pairs of common bottlenose dolphins, Tursiops truncatus. (a) and (b) show exchanges of whistles between mother-calf pairs recorded during temporary restraint in waters off Sarasota, FL. In (a), the top row (a-i) shows three examples of the signature whistle of the mother, who was copied; the middle row (a-ii) shows three examples of copies of her whistle made by her male calf; and the bottom row (a-iii) shows three examples of the signature whistle of the calf. In (b), the top row (b-i) shows signature whistles of the male calf, who was copied; the middle row (b-ii) shows copies of his whistle made by his mother; and the bottom row (b-iii) shows signature whistles of the mother. (c) shows whistles exchanged by two adult male dolphins recorded in an aquarium pool. The top row (c-i) shows signature whistles of the male who was copied; the middle row (c-ii) shows copies of his whistle made by the other male; and the bottom row (c-iii) shows signature whistles of the copier. Each middle row has inset numbers that indicate how similar human observers judged the copy to be to the signature that was copied, with 1 being not similar and 5 being very similar. From Fig. 1 of King et al. (2013)

6.2.3 Group-Specific Dialects of Killer Whales

The evidence described above suggests that dolphins use and copy individually distinctive whistles to maintain associations between strongly bonded partners within a fission-fusion society. There is evidence for a different pattern of contact calls in killer whales (Orcinus orca), which have some of the most stable groups known in any mammal. The fish-eating ecotype of killer whales in the northeast Pacific lives in matrilineal groups spanning 2–4 generations (Olesiuk et al. 2005). Both sexes remain with their mothers throughout their life, with neither sex dispersing from their natal group (Bigg et al. 1990). Different matrilineal groups often travel together in an assemblage called a pod, with more closely related matrilines tending to associate more frequently than more distant matrilines (see Chap. 11, for details of killer whale societies).

Killer whales produce stereotyped calls that may include high-frequency components similar to dolphin whistles and pulsed components with energy at lower frequencies corresponding to the repetition rate of the clicks (Ford 1987). When members of a group are separated, they often exchange shared calls with one another (Miller et al. 2004a). Each pod of killer whales studied in the Pacific Northwest, Alaska, and Norway has a group-specific repertoire of these stereotyped calls (Ford and Fisher 1983; Ford 1989, 1991; Yurk et al. 2002; Strager 1995). Each pod averaged about ten call types, with more closely related pods sharing more call types. When one individual separates from its group, recordings show that it produces most of the calls in its pod’s repertoire (Ford 1989). Calls vary somewhat in usage among behavioral states, but no call type was associated exclusively with any one context (Ford 1989). Ford (1989: 727) argues that these “calls probably function as intragroup contact signals to maintain group cohesion and coordinate activities.”

One problem with the interpretation that group-distinctive call repertoires are used to maintain group cohesion is the large size of the repertoire coupled with the extent of overlap of calls between groups. For a listening whale to be sure that a calling group is its own, it might have to listen for long enough to hear a large set of the calls to rule out other groups. Ford (1989) points out that it would seem much more efficient for each group to have one or two group-distinctive calls as occurs in many primate species. Ford (1989: 741) notes that a complex repertoire with many components shared across groups does not seem well suited for the group cohesion function, and he proposes another function: that killer whale calls synchronize specific activities, as calls “often spread contagiously among group members following the spontaneous emission of a call by one animal.” A good example of activity that requires synchronized coordination comes from Norwegian killer whale pods “carousel feeding” on highly mobile schools of herring. Here a group of killer whales herds herring, often found at depth, into a tight ball near the surface, where whales take turns stunning fish with their tail flukes and then eating fish one-by-one while the rest of the group circles the school (Similä and Ugarte 1993). Van Opzeeland et al. (2005) show that Norwegian killer whales call at a higher rate per individual with a higher diversity of calls when carousel feeding than when they feed in a less coordinated fashion.

Da Cunha and Byrne (2009) review functions of contact calls and add another category to those tested by Macfarlane (2016), that of coordinating group movement. As mentioned above, a sound production organ that functions to produce a directional sound at high frequencies produces a less directional sound at lower frequencies. Miller (2002) and Lammers and Au (2003) hypothesize that toothed whales use this phenomenon to communicate whether they are approaching or moving away from group members. If they are approaching and the receiver is within the forward-directed beam of the high-frequency signal, then it will hear high as well as low frequencies; if they are moving away, then the receiver will hear more of the less directional low-frequency component of the call. This ability to provide a cue about whether a signaler is moving toward or away from a listener could be particularly useful for a call that functions to coordinate group movement.

Call repertoires may function not only for cohesion of matrilineal groups but also to help killer whales identify their relationship to other pods. Ford (1991) compared call repertoires of 16 killer whale pods and found that they formed four distinct vocal clans, with some calls shared within each clan and none shared between clans. Even though pods from different clans did not share calls, they often associated closely enough for prolonged acoustic contact between the pods. When pods within a clan shared a call, there often were acoustic differences in the call type that were distinctive for each pod. When Yurk et al. (2002) compared call repertoires of Alaskan pods to genetic relationships, they found two acoustically distinct groups, each of which had different maternal DNA. These results led to the hypothesis that when a pod grows too large, it may split into two and that vocal clans are formed of related pods descended from a common ancestral group. Figure 2.2 shows spectrograms of calls that are shared between Alaskan pods within a clan, showing pod-specific variation. The first two rows show variants of calls AKS01 and AKS05, which are produced by different pods in the AB vocal clan, and the third and fourth rows show variants of two calls produced by the AD clan (from Yurk et al. 2002). The overall similarity of the structure of each call type should be obvious, as well as the differences between variants. Deecke et al. (2000) measured fine-scale acoustic features of two calls from two matrilines within the same vocal clan and showed that one call changed consistently from year to year, with both groups tracking the same change, and the other call showed little change over the 14-year sample. Deecke et al. (2000) suggest that group-specific dialects could develop if, as pods diverge, they add, drop, or modify calls, and that the combination of differentiation coupled with maintenance of similar calls may result from cultural drift that is modulated by an active process in which matching of the acoustic structure of calls occurs across different pods within the same vocal clan but not across different vocal clans. This pattern suggests that killer whales may be able to use call repertoires to assess genetic relatedness. Barrett-Lennard (2000) analyzed the DNA of fish-eating killer whales of the northeast Pacific and found that most mating was between rather than within sympatric vocal clans. Pods within a clan are closely enough related that selecting a mate from a different clan may represent a mechanism to avoid inbreeding.
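The clan structure described above, with some calls shared within a clan and none shared between clans, can be illustrated as a grouping problem: pods that share at least one call type, directly or through intermediate pods, fall into the same group. The following is only a minimal sketch of that idea, not the method used by Ford (1991) or Yurk et al. (2002); the function name `vocal_clans` and the pod and call labels in the example are hypothetical.

```python
from collections import defaultdict


def vocal_clans(repertoires):
    """Group pods into clans: pods are linked if they share at least one call type.

    `repertoires` maps a pod name to the set of call types it produces.
    Returns a list of sets of pod names (connected components).
    """
    # Map each call type to the pods that produce it.
    pods_by_call = defaultdict(set)
    for pod, calls in repertoires.items():
        for call in calls:
            pods_by_call[call].add(pod)

    clans, seen = [], set()
    for start in repertoires:
        if start in seen:
            continue
        # Depth-first search over the "shares a call with" relation.
        clan, stack = set(), [start]
        while stack:
            pod = stack.pop()
            if pod in clan:
                continue
            clan.add(pod)
            for call in repertoires[pod]:
                stack.extend(pods_by_call[call] - clan)
        seen |= clan
        clans.append(clan)
    return clans


# Hypothetical repertoires (pod and call names are invented for illustration).
example = {
    "pod1": {"c1", "c2"},
    "pod2": {"c2", "c3"},
    "pod3": {"c3"},
    "pod4": {"c9"},
}
clans = vocal_clans(example)
```

In this invented example, pod1 and pod3 share no call directly but are grouped together through pod2, while pod4 forms a group of its own. Real repertoire comparisons also weigh acoustic similarity between call variants, which this sketch ignores.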

Fig. 2.2

Spectrograms of call types produced by different matrilines in different pods and vocal clans of Alaskan killer whales, Orcinus orca. Each of the top two rows shows a call type from the AB vocal clan, and each of the bottom two rows shows a call type from the AD clan, with different variants of each call as produced by different matrilines in the clan. From Fig. 5 of Yurk et al. (2002)

The patterns of vocal change described above are most easily explained as resulting from a process of vocal learning in which one animal modifies its own vocalizations based on what it hears from others. This ability has been well established for common bottlenose dolphins (Janik and Slater 1997) but is less well established for killer whales. Ford (1991) describes evidence for killer whales learning calls from conspecifics. However, demonstrating that they are actually modifying their own calls to match the natural call of a conspecific is more difficult than for matching artificial calls or calls of other species. Janik and Slater (1997) distinguish vocal usage learning from vocal production learning. In vocal usage learning, an animal learns to produce a call already in its repertoire in a new context, while vocal production learning requires an animal to modify its pre-exposure vocal repertoire using imitation of a sound to produce a new vocalization that was not in its pre-exposure repertoire. Foote et al. (2006) provide evidence that a wild killer whale may have copied barks of sea lions; however, they do not show that these calls differ from normal killer whale calls nor can they rule out that they might have actually been produced by sea lions. Abramson et al. (2018) trained a captive killer whale to match sounds either from her own calf or from a human. Unfortunately, the Abramson study did not use the same methods to test for matches in the pre- and post-exposure repertoires, which hinders interpretation. All of these killer whale studies face problems in demonstrating that the matches represent vocal production rather than usage learning. None of the studies of vocal learning in killer whales fully meet the gold standard of quantifying a pre-exposure repertoire, designing signals that clearly differ from this repertoire, and demonstrating accurate matching in the exposure or post-exposure repertoires.

Most biologists studying the stereotyped calls of killer whales categorize them by the whole call, defined as an utterance separated by silence. However, Strager (1995) showed that some of the calls produced by Norwegian killer whales were formed of different combinations of call components. The relative positions of components in different calls had a fixed order. For example, call N22 was similar to a suffix, always being added at the end of a sequence of call components. This observation led Strager (1995) to suggest that some killer whale calls may be formed of a sequence of phoneme-like subunits. Yurk (2005) proposed that calls of killer whales from the Pacific Northwest are also made up of subunits. Shapiro et al. (2011) used methods developed for computer processing of human speech to test the pros and cons of analyzing killer whale calls as complete calls or as sequences of subunits. They defined subunit boundaries as a silent gap of >0.1 s or a 500 Hz spectral jump within 0.25 s. They then tested three different ways to classify calls: one based on the whole call, a second that assumed each call is made up of call-specific subunits, and a third that assumed that subunits could be shared across calls. All three methods to classify calls had error rates that were not statistically different. These results suggest that killer whales could construct their call repertoire either by memorizing all of the whole call types, or by memorizing the sequencing of a smaller set of subunits. Several strands of evidence supported the shared subunit model over the unshared subunits:

  1. In the shared subunit analysis, 75% of all calls contained at least one subunit shared across calls.

  2. The shared subunit analysis used only 1/3 the number of subunits generated by the unshared analysis, reducing memory requirements and enabling a more efficient representation.

  3. Nearly half of variable calls analyzed matched a subunit generated by the shared subunit analysis of stereotyped calls, suggesting that some calls classified as variable may be made up of rare sequences of the same subunits as stereotyped calls.
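The boundary rule attributed above to Shapiro et al. (2011), a silent gap of >0.1 s or a 500 Hz spectral jump within 0.25 s, can be sketched as a simple segmentation of a frequency contour. This is an illustrative reading of the rule rather than their implementation: it assumes the call has already been reduced to time-ordered (time, frequency) points for voiced frames only, and it tests the spectral jump between consecutive points. The function name `split_subunits` and the example contour are hypothetical.

```python
def split_subunits(contour, max_gap_s=0.1, jump_hz=500.0, jump_window_s=0.25):
    """Split a time-ordered list of (time_s, freq_hz) points into subunits.

    A new subunit starts when consecutive points are separated by a silent
    gap longer than `max_gap_s`, or when frequency changes by at least
    `jump_hz` between points no more than `jump_window_s` apart.
    """
    if not contour:
        return []
    subunits = [[contour[0]]]
    for prev, cur in zip(contour, contour[1:]):
        dt = cur[0] - prev[0]
        df = abs(cur[1] - prev[1])
        silent_gap = dt > max_gap_s
        spectral_jump = dt <= jump_window_s and df >= jump_hz
        if silent_gap or spectral_jump:
            subunits.append([])  # close the current subunit, open a new one
        subunits[-1].append(cur)
    return subunits


# Invented contour: a 650 Hz jump at 0.10 s and a 0.25 s silence before 0.40 s.
contour = [(0.00, 1000.0), (0.05, 1050.0), (0.10, 1700.0),
           (0.15, 1720.0), (0.40, 1700.0)]
parts = split_subunits(contour)
```

With these thresholds the invented contour splits into three subunits, one boundary from the spectral jump and one from the silent gap.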

In addition, analysis of calls from killer whales from the Northeast Pacific showed some subunits that matched with Norwegian subunits, suggesting that subunits may be shared across populations that do not share whole calls, just as phonemes may be shared across human languages that do not share words. This subunit view suggests a different type of vocal learning in killer whales. It suggests that auditory categorization may start by detecting subunits before categorizing the entire call as a unit and that if whales have pattern generators for subunits, then learning to produce a call may involve memorizing the correct sequence of subunits. Byrne (1999) describes a similar interpretation that nonvocal imitation may also be comprised of parsing a string of behaviors that are observed and learning the sequence of motor actions that produces this string. Even if sequence learning is a critical component of learning whole calls, Deecke et al.’s (2000) demonstration that pods may change the frequency of a section of the call over time suggests that subunits may be modified by classic vocal production learning, just as different human cultures may have slightly different versions of the same phoneme.

6.2.4 How Sperm Whales Use Clicks for Communication

Sperm whales have a vocal apparatus so specialized for echolocation that even though it takes up 1/3 of the body, sperm whales appear to be limited to using clicks for communication as well as for echolocation. Sperm whales produce clicks with one pair of phonic lips at the front of the head, with most of the energy directed backwards through the spermaceti organ (Møhl 2001). Some of this energy escapes into the water, but most reflects off an air-filled sac close to the skull and is directed in an intense narrow beam forward into the water (Møhl et al. 2003; Zimmer et al. 2005b). Some of the energy reflects back toward the skull again, and this reverberation causes the sperm whale click to be made up of a series of pulses, where the inter-pulse interval is proportional to the size of the spermaceti organ (Zimmer et al. 2005c). Bioacousticians can use the inter-pulse interval to estimate the size of the clicking whale (Bøttcher et al. 2018), and it is likely that sperm whales can extract the same information from these clicks.
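The proportionality between inter-pulse interval and the size of the spermaceti organ can be illustrated with a back-of-envelope calculation. The sketch below assumes, as a simplification not stated in the text, that the inter-pulse interval equals the two-way travel time of sound along the organ, and it uses an assumed sound speed of 1370 m/s in spermaceti oil purely for illustration. Published size estimates rely on empirically fitted relationships rather than this idealized model, and the function names here are hypothetical.

```python
def inter_pulse_interval_s(pulse_times_s):
    """Mean interval between successive pulses within one click (seconds)."""
    diffs = [b - a for a, b in zip(pulse_times_s, pulse_times_s[1:])]
    if not diffs:
        raise ValueError("need at least two pulse arrival times")
    return sum(diffs) / len(diffs)


def organ_length_m(ipi_s, sound_speed_m_s=1370.0):
    """Idealized organ length from the inter-pulse interval.

    Assumes the interval is one round trip along the organ, so
    length = sound speed * interval / 2. The default sound speed is an
    assumed illustrative value, not a measured constant from the text.
    """
    return sound_speed_m_s * ipi_s / 2.0


# Invented pulse arrival times within a single click, 4 ms apart.
ipi = inter_pulse_interval_s([0.0, 0.004, 0.008])
length = organ_length_m(ipi)
```

Under these assumptions a 4 ms inter-pulse interval corresponds to an organ roughly 2.7 m long; doubling the interval doubles the estimate, which is the proportionality the text describes.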

Sperm whales typically echolocate during deep foraging dives, producing regular series of clicks every 0.5–2 s, and more rapid series of clicks, called a buzz, as they attempt to capture prey (Miller et al. 2004b; Watwood et al. 2006). The omnidirectional low-frequency component of these regular clicks is typically detectable at ranges of ~5 km (Barlow and Taylor 1997). Male sperm whales also produce clicks with a low-frequency emphasis (centroid frequencies from 2 to 4 kHz), less directionality, and longer duration (0.5–10 ms) than other sperm whale clicks (Madsen et al. 2002b). These clicks tend to have a longer inter-click interval than other sperm whale clicks, leading them to be called “slow clicks.” Oliveira et al. (2013) show that these slow clicks tend not to be produced during echolocation-based foraging, but rather in contexts where communication is the more likely function. Oliveira et al. (2013) also recorded sperm whales engaged in exchanges of slow clicks, suggesting a communicative function. Madsen et al. (2002b) estimate that these slow clicks may be detected as far as 60 km away, an estimate that is supported by recordings of slow clicks from ranges of 37 km (Barlow and Taylor 1997).

Watkins and Schevill (1977) reported that sperm whales also make rhythmic series of clicks in stereotyped sequences, called codas, either in exchanges between whales near one another or in sequences that seemed to be produced by the same whale. Whitehead and Weilgart (1991) showed that sperm whales tend to produce codas when socializing at the surface, in contrast to regular echolocation clicks that tend to be produced during deep foraging dives. Madsen et al. (2002a) showed that coda clicks differ from echolocation clicks in having less than 1/10th the source level and less decay from the main pulse to the reverberation pulses, yielding a less powerful and more reverberant click. Madsen et al. (2002a) propose that when sperm whales use a click for echolocation, they direct most of the energy in a forward-directed beam, but that to make coda clicks, they add air to a reflective sac in the front of the head, which causes more energy to reflect in the nasal complex. This results in clicks that have a longer pulse duration and lower directionality and that appear better suited to a communicative function. Weilgart and Whitehead (1997) suggest that coda clicks have a 600 m detection range, and Schulz et al. (2011: 153) state that codas “are only clearly audible through near-surface hydrophones at ranges of a few hundred meters or less”; these detection ranges are much lower than those estimated for slow clicks or regular echolocation clicks. The context of coda production in combination with the short estimated range of detection led Weilgart and Whitehead (1993: 744) to suggest that codas function “to maintain social cohesion within stable groups of females following periods of dispersion during foraging.” This interpretation is supported by acoustic localization of sperm whales exchanging codas, which estimated that the distance between exchanging whales ranged from 1 to 324 m (Schulz et al. 2008).

Sperm whales have a prolonged period of maternal care, with the young remaining for years with their mother in groups of adult females with young that live in tropical or subtropical waters (Best 1979; see also Chap. 12). As males mature, they leave their natal group and form dispersed groups with other males, often moving seasonally to higher latitudes to feed. Whitehead et al. (1991) report that adult female sperm whales and their young in the Galapagos Archipelago typically are sighted in groups of about 24 whales. Photo identification of individual sperm whales shows that each of these groups tends to be formed of two stable social units that associate for about 1 week at a time. Even though whales within a unit may associate for years at a time, individuals have been observed to transfer between units (Christal et al. 1998). Sperm whale units are not as matrilineal as those of killer whales; they may contain a combination of clusters of related individuals and individuals with no close genetic relationship to any other member of the unit (Mesnick et al. 2003).

Rendell and Whitehead (2003) studied the coda repertoires of 64 groups of sperm whales recorded across the South Pacific and Caribbean. Focusing on 22 stable social units repeatedly sighted in the Galapagos Archipelago, they found that all of these units could be assigned to one of three coda-use patterns, which they called vocal clans. Figure 2.3a shows the three clusters of codas, whose distribution across the units is marked in Fig. 2.3b. One vocal clan produced codas with regularly spaced clicks (“R,” marked green in Fig. 2.3), the second made codas where the last inter-click interval was longer than previous ones (“+1,” marked purple in Fig. 2.3), and the final clan made short codas with just three clicks or rapid bursts of four clicks (“short,” marked orange in Fig. 2.3). These three clans were sympatric, but units tended to associate only with other units from the same vocal clan. Of 26 encounters with groups containing 2 known units, only 1 involved a sighting of 2 units from different clans sighted within a few kilometers for at least 1 h (one was R and the other +1, but only +1 codas were recorded during this time). In the recordings of groups from the broader geographical range, only codas from one clan were typically recorded. The broader recordings yielded five vocal clans in the South Pacific and one vocal clan from the Caribbean. The vocal clans of sperm whales span a distance of about 10,000 km in the Pacific Ocean (including perhaps 10,000 whales on average); these clans are larger than those of killer whales that span about 1000 km and include about 100 whales.

Fig. 2.3

Coda repertoires of 22 social units of sperm whales, Physeter macrocephalus, recorded near the Galapagos Islands. The top panel shows three clusters of codas that show stronger coda similarity within a cluster than between. The lower panel shows the distribution of the most common (>10%) codas for each social unit. Social units in the leftmost cluster, marked in green, produce codas with regular inter-click intervals. Social units from the middle cluster, marked in purple, produce a series of 4+ clicks either ending or starting with one slightly longer interval. The rightmost cluster, marked in orange, comprises just one social unit, which produces either three-click codas or codas with four clicks in rapid succession. From Fig. 1 of Rendell and Whitehead (2003)

Whitehead and Rendell (2004) took inspiration from the observation that different sympatric killer whale ecotypes had different foraging adaptations to study movement and foraging behaviors of two vocal clans of sperm whales in the Galapagos Islands. They show that the +1 clan moved in straighter lines offshore, while the R clan groups had more convoluted tracks and tended to be sighted inshore. Judging foraging success by defecation rates, they suggest that the +1 clan was more successful during an El Niño year, while the R clan was more successful when surface waters were cooler. These observations led them to suggest that clans do not just share vocal behavior, but that they share other behavioral traits such as foraging strategies that have fitness consequences. Male sperm whales foraging in high latitudes vary their foraging behavior to take advantage of different kinds of prey at different depths (Teloni et al. 2008), but there is little evidence of different foraging strategies for female groups in tropical or temperate waters. Detailed analysis of foraging patterns of tagged sperm whales shows differences in dive depth among the Gulf of Mexico, North Atlantic, and Mediterranean, but remarkably little difference in dive/surface durations, time spent foraging, and rate of attempts to capture prey (Watwood et al. 2006). Similar analyses will be required to test the hypothesis that clans differ in foraging strategies.

Whitehead (2003: 309) argues that “… for a sperm whale, membership in a clan has a connotation comparable to that of nationality in humans. … Group identity has benefits for an animal: a well-proven way of behaving and a pool of companions who behave similarly who can be used as models and colleagues in cooperative endeavors.” This interpretation hinges on the prediction that codas function to mediate inter-group interactions, such that only groups that share the same coda repertoire will join one another. This interpretation is at odds with Weilgart and Whitehead (1993) who use the short detection range of codas and the usage patterns of codas when whales within a group are socializing to argue for an intragroup rather than inter-group function. However, codas might serve an identity advertisement function described by Macfarlane (2016) in which groups might advertise their identity at close range just before a possible join. The hypothesis that social units use codas to decide which other units to join with, which I will call the “clan coda join” hypothesis, leads to a set of testable predictions:

  1. Whales within a unit would be likely to produce codas when they hear regular echolocation clicks indicating the presence of another unit nearby.

  2. Whales within a unit would be likely to produce codas when they hear codas from another unit.

  3. A unit should be more likely to swim toward a group producing codas from their clan vs a different clan.

  4. A unit should be more likely to swim toward playback of codas from their clan vs a different clan.

Visual observations and acoustic localization of clicking whales (Watkins and Schevill 1977) are well suited to testing these predictions, but most such observations focus on coda exchanges within, not between, groups (e.g., Schulz et al. 2008). The only tests of these predictions I am aware of involve tests of vocal reactions of sperm whales to coda playback. Initial results do not support the “clan coda join” hypothesis that sperm whales exchange codas between groups to make decisions about which other groups to join. Rendell and Whitehead (2005) report no change in coda production for the majority of coda playbacks; when whales responded, they were more likely to stop making codas on playback than to start. One particularly problematic finding for the clan coda join hypothesis was the observation that the subjects were as likely to match the codas of a different clan as their own clan. However, there was only one instance of each kind of matching; more research on acoustic interactions between groups and responses to coda playbacks is necessary to fully test this hypothesis. Clearly it is equally important to test hypotheses about intragroup communication by codas.

Rendell and Whitehead (2003: 225) argue that the codas produced by vocal clans of sperm whales should be viewed as culture, “defined as group-level information or behavior transmitted by social learning.” After some resistance from social scientists who wanted to reserve the term “culture” for exclusive application to humans, there has been growing interest in how social learning leads to the development of animal cultures such as the vocal traditions of songbirds or tool use in apes (Laland and Galef 2009). Vocal cultures, where animals learn to copy acoustic features of the vocalizations of others, “provide the largest body of evidence for cultural transmission of behavioral traits in the animal kingdom” (Laland and Janik 2006: 543). There is a well-established set of methods to demonstrate vocal production learning, which has helped to establish vocal cultures (Janik and Slater 1997). However, the evidence for vocal learning in sperm whales is quite weak.

Coda repertoires are typically categorized using a simple method of comparing codas as having regular inter-click intervals or as forming a long/short pattern similar to how Morse code intervals are distinguished as “dit” or “dah.” Some regular codas are also distinguished on the basis of the absolute values for inter-click intervals (Antunes et al. 2011; Gero et al. 2016; Schulz et al. 2011). Control of inter-click interval is important for echolocation; all of the inter-click intervals used in codas are also used by sperm whales in slowly changing sequences of echolocation clicks, and the difference between usual and coda clicks is thought by Madsen et al. (2002a) to be caused by inflation of a sac in the sound production organ rather than by the same neural circuits responsible for generating timing of click intervals. These points suggest that if sperm whales learn codas, this may involve no more than remembering a sequence of relative or absolute inter-click intervals that are already part of their repertoire. This makes it hard to rule out vocal usage learning to explain the differences between the codas typical of the different vocal clans of sperm whales. Laland and Janik (2006) advocate that rather than thinking of traits as defined categorically as genetic, learned asocially, or cultural (learned socially), it is more useful to ask how much of the variance in the trait can be attributed to social learning. They argue that this change in focus would help us to study interactions between genes, ecology, and learning in a way that helps answer questions about how cultural behavior affects evolutionary processes, an important reason for comparative studies of culture (e.g., Whitehead 1998). We still do not know how much of the variance in coda repertoires is a product of vocal production learning through matching social models.
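The simple categorization of codas by the rhythm of their inter-click intervals can be sketched as follows. This is a toy illustration of the regular versus long-final-interval distinction, not the classification procedure of Rendell and Whitehead (2003) or later studies; the function name `classify_coda`, the labels, and the 20% tolerance are all assumptions chosen for the example.

```python
def classify_coda(click_times_s, tolerance=0.2):
    """Label a coda by the rhythm of its inter-click intervals (ICIs).

    Returns "regular" if all ICIs lie within `tolerance` (as a fraction)
    of their mean, "plus_one" if all but the last ICI are regular and the
    last is clearly longer, and "other" otherwise.
    """
    icis = [b - a for a, b in zip(click_times_s, click_times_s[1:])]
    if len(icis) < 2:
        raise ValueError("need at least three clicks")

    def is_regular(intervals):
        mean = sum(intervals) / len(intervals)
        return all(abs(i - mean) <= tolerance * mean for i in intervals)

    if is_regular(icis):
        return "regular"
    head, last = icis[:-1], icis[-1]
    head_mean = sum(head) / len(head)
    if is_regular(head) and last > (1 + tolerance) * head_mean:
        return "plus_one"
    return "other"


# Invented click times in seconds.
regular = classify_coda([0.0, 0.2, 0.4, 0.6, 0.8])
plus_one = classify_coda([0.0, 0.2, 0.4, 0.6, 1.0])
other = classify_coda([0.0, 0.1, 0.5, 0.6])
```

Because the function works only on intervals between clicks, it uses nothing beyond timing that the whale already controls during echolocation, which is the point made above about the difficulty of ruling out vocal usage learning.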

7 Conclusion

When comparing the broad range of studies of communication in toothed whales to those of terrestrial mammals, marine mammalogists face difficulties in observing behavioral interactions in enough detail to sort out the pattern of signal and response that make up a communication system, and they face similar difficulties in tracking social relationships among individuals. Our understanding of chemical communication is hampered by our ignorance of chemical senses beyond the most basic taste sense; this suggests the need for more research studying how aquatic lifestyles may have selected for changes in the chemical senses of marine mammals. Toothed whales seldom produce visible cues that indicate when they are vocalizing, and they are often not visible, so it can even be difficult to tell which animal produces which signal, whether it is vocal, visual, or tactile. Advances in acoustic localization and development of acoustic, movement, and image recording tags are helping to solve this problem for vocal signaling and for movement cues, and even make it possible to observe tactile signaling (Aoki et al. 2013). Some of the most interesting patterns of communication involve learning to produce and use individual- and group- or clan-specific vocalizations, and this makes it essential to have long-term studies of identified individuals along with their association patterns to put communication in the context of changing social relationships. The more we know about the history of each individual, the better we can understand the context of communication and the problems communication must solve. Other important methods include field experiments, for example the use of human-controlled sound playbacks to test hypotheses about when and whether a particular signal will evoke a particular response in a particular individual (King 2015).

The more we have learned to appreciate the diverse ways in which toothed whales rely upon sound to solve ecological and social problems, the more we have learned that noise from human activities may disrupt the behavior of toothed whales in diverse and surprising ways (Tyack 2009). Responses of cryptic species appear to be triggered by particularly low levels of sound. For example, harbor porpoises respond to noise at such low levels that they avoid pile driving at ranges of 20 km or more (Tougaard et al. 2009, 2014). Sperm whales do not show horizontal avoidance of pulses from air guns used in seismic surveys, but this exposure causes a reduction in foraging effort at ranges of <10 km (Miller et al. 2009). The most intense acute responses of toothed whales to anthropogenic noise involve atypical mass strandings in which two or more beaked whales strand over tens of kilometers during a few hours that coincide with naval sonar exercises (D’Amico et al. 2009). Even if lethal strandings were prevented, if disturbance causes whales to leave preferred habitats, this could affect the population if large numbers are affected (New et al. 2013). Our understanding of how toothed whales use and respond to sound thus has implications for their conservation (Wartzok et al. 2005).