3.1 Introduction

The special senses are central to the behavior, ecology, and ultimately the survival and reproductive success of primates (Dominy et al. 2001). Through the auditory sense, primates are able to locate sound sources and derive information about the surrounding environment both at close distances and, in general, farther away than the other senses permit. For example, primates can use gustatory and tactile senses to evaluate food sources only up close (Laska et al. 2007) and the tactile sense to communicate only when in direct contact (Weber 1973). Olfaction is useful at close and intermediate ranges and for extended periods of time, including when food resources are obscured by vegetation and leaf litter (Irwin et al. 2007); however, forest substrates are discontinuous, and scent is not useful for immediately conveying time-sensitive information about resources and threats from afar. Enhanced vision is one of the hallmarks of primate evolution (Crompton 1995), and it can be utilized at both close and far distances. However, using vision to communicate across long distances can be challenging when vegetation is dense or at night (Bearder et al. 2006).

Sound can be used to communicate under varying circumstances. Audition allows primates to detect predators and alarm calls of nearby animals and even identify specific predator types and locations (e.g., Blumstein 2002; Zuberbühler 2007). Audition also allows primates to detect vocal signals from conspecifics that indicate divisible food resources; for example, when toque macaques (Macaca sinica) locate abundant food sources, they give specific calls that evoke rapid direct approach from dispersed group members (Dittus 1984). Primates utilize a variety of acoustic cues, such as the sound of rustling leaves, to locate prey (Goerlitz and Siemers 2007), and vocalizations also facilitate social behavior and mating practices (Semple and McComb 2000).

Of the acoustic signals and cues present in primate habitats, vocalizations have been a topic of particularly intensive research, owing in part to their usefulness for identifying species and behaviors even from a distance (e.g., Gautier 1988; Snowdon 1993; Snowdon, Chap. 6; Zuberbühler, Chap. 7) and for studying the evolution of communication in humans (e.g., Owren 2003; Nishimura 2008; Quam, Martínez, Rosa, and Arsuaga, Chap. 8). Variations in primate vocal acoustics have been associated with behavior (e.g., Sekulic and Chivers 1986; Zimmermann, Chap. 5) as well as ecological and habitat conditions (e.g., Masters 1991; Brown et al. 1995; Brown and Waser, Chap. 4). Thus, it is reasonable to suspect that, as the receiving end of vocalizations, the relative auditory sensitivity of primates to varying types of signals, and the frequencies included therein, vary in relation to vocal acoustics, behavior, and ecology. Such relationships are documented in other organisms and are found to be complex and variable. For example, Vélez et al. (2015) found that among nine species of sparrows (Passeriformes), those that had more complex song structure had greater auditory sensitivity to high frequencies than sparrows with pure-trilled or tonal call structure. Some species of freshwater fish may also have evolved enhanced auditory sensitivity as an adaptation to take advantage of quiet ambient noise levels in still waters (Amoser and Ladich 2005).

A few relationships between overall auditory sensitivity and behavioral ecology have been reported for primates. A longstanding model explains variations in auditory sensitivity, specifically to high frequencies, as a function primarily of sound localization acuity (R. S. Heffner 2004) (Sect. 3.4.3.1). Brown and Waser (1984) report that blue monkeys (Cercopithecus mitis) are particularly adept at detecting low frequencies associated with their low-frequency long calls and forested environments (Sect. 3.4.5). Ramsier et al. (2012a) reported a correlation between enhanced auditory sensitivity and sociality among strepsirrhine primates (Sect. 3.4.3.2).

Multiple studies have focused on the neural processing and perception of vocalizations by primates, providing a comparative context for understanding the evolution of speech, language, and social communication in humans (e.g., Ghazanfar and Santos 2004; Rauschecker and Scott 2009) (Sect. 3.2). However, studies of primate vocal communication and ecology do not discuss audition to any significant degree—vocalizations and audition are generally treated separately in the literature—in large part due to the lack of auditory data on many primates of interest and the tendency of auditory studies to take a clinical or biomedical approach. Similarly, the two key areas of primate audition—overall auditory sensitivity (range of audible frequencies reported as an audiogram) and neural processing and perception—are largely treated separately in the literature.

Measuring the overall auditory sensitivity of nonhuman primates is a complicated process that traditionally has involved months of training in laboratory settings (H. E. Heffner and R. S. Heffner 2014). Since the 1930s, audiograms derived using behaviorally based testing methods have been reported for more than twenty primate species; however, major primate taxonomic groups and hundreds of species are still unstudied (Fay 1988; Coleman 2009) (Sect. 3.4.1). Accordingly, few widespread trends in primate auditory sensitivity have been identified in the literature, leading to the supposition that primate auditory sensitivity is unspecialized in terms of range and relative sensitivity to various frequencies (R. S. Heffner 2004) (Sect. 3.4.3). At the same time, the neurobiological literature describes primates as auditory specialists in terms of auditory processing and perception, for example, in their responses to species-specific vocalizations (e.g., Ghazanfar and Santos 2002; Rauschecker and Scott 2009). Taken together, these findings point to the similarity of nonhuman primates to humans in their auditory capabilities. It has become convention (or necessity), therefore, to largely disregard the potential influence of interspecies variations in auditory sensitivity when studying bioacoustic communication among nonhuman primates, a practice that is reinforced by the close evolutionary relationship between humans and nonhuman primates and the tendency to anthropomorphize nonhuman primate behaviors (Asquith 2011). Field workers may be left with little choice but to assume that what is loud or quiet to the human observer is also loud or quiet to the animals being observed when, in reality, this may not be the case. Sounds that humans may not be able to hear well or at all may affect or be utilized by nonhuman primates in ways that are not fully understood (Barber et al. 2010; Kight and Swaddle 2011).

An increasing number of studies seek to build on the solid foundation of decades of behaviorally based auditory testing to better understand the ecological implications of variations in primate auditory sensitivity. This has involved an exploration of physiologically derived auditory testing techniques for constructing audiograms (e.g., Ramsier and Dominy 2010) (Sect. 3.3.3), detailed neurophysiological and anatomical studies (e.g., Micheyl et al. 2005; Coleman and Colbert 2010; Nummela, Chap. 2), and computer modeling (e.g., Quam et al. 2015). These studies have demonstrated that at least some species do indeed have specialized neural structures and processing abilities that share similarities with humans. In addition, the sensitivity of primates to different frequencies may be more variable than previously thought; for example, species that have been described as relatively quiet may in fact be communicating in a realm outside of the range of human hearing (e.g., Ramsier et al. 2012b; Gursky 2015).

This chapter begins with an overview of auditory neurobiological processing and perception in primates (Sect. 3.2). The chapter then discusses ways in which overall auditory sensitivity is conceptualized and measured among primates (Sect. 3.3) and then reviews the current data for primates along with potential explanations for variations (Sect. 3.4). The chapter concludes with implications for future research (Sect. 3.5).

3.2 Auditory Processing and Perception in Primates

3.2.1 The Path of Sound: From Cochlea to Auditory Cortex

After sound is captured by the outer ear, transformed into mechanical energy in the middle ear, and translated into electrical impulses within the cochlea of the inner ear (Nummela, Chap. 2), the central auditory system is responsible for transmitting those signals to various brain centers for processing to determine sound source location, to identify features of the source (e.g., species or sex of an individual that produced a communication call) and, ultimately, to determine the sound’s meaning. Neuroanatomical structures and their physiological workings affect the complexity of information that can be acoustically communicated and the efficiency and specificity of sound localization. A common feature of the primate auditory system is its map-like “tonotopic” organization, wherein specific neurons or groups of neurons fire most strongly in response to particular temporal and spectral characteristics of stimuli along a tonotopic or cochleotopic frequency axis. The following section focuses on pathways for, and processing of, locus cues and vocalizations. It is in these abilities that the specialized nature of the primate auditory system may be indicated.

Within the fluid-filled spiral cochlea of the inner ear, the organ of Corti winds up the basilar membrane of the cochlear duct; this organ is the sensory structure responsible for converting fluidborne vibrations into electrical impulses that can be interpreted by the brain (Webster et al. 1992; Geisler 1998). Sound-induced movement of the basilar membrane causes movement of mechanoreceptor hair cells on the organ of Corti. As in mammals generally, the primate cochlea is tonotopically organized in that the hair cells at the base of the cochlea are more sensitive to high frequencies, and the hair cells at the apex are more sensitive to low frequencies. This occurs largely by virtue of cochlear mechanics, whereby traveling waves peak at certain locations along the basilar membrane in a frequency-dependent manner (von Békésy 1960). Bipolar neurons have cell bodies that lie in the spiral ganglion, which is a string of tens of thousands of neurons along the central axis (modiolus) of the cochlea, and they are the first neurons in the auditory system to fire an action potential. They supply all of the brain's auditory input (Nayagam et al. 2011). The dendrites of bipolar neurons make synaptic contact with the base of hair cells, and their axons form the auditory portion of the vestibulocochlear nerve. The first major center of auditory neural processing is the cochlear nucleus (with a ventral and a dorsal subdivision). Figure 3.1a depicts the pathway of sound (and its neural representations) from the cochlea to the primary auditory cortices of the temporal lobe of the cerebrum, including the major (generally tonotopically organized) relay stations along this path. The pathway is similar in humans and nonhuman primates, such as the common marmoset (Callithrix jacchus) (Aitkin and Park 1993) or the rhesus macaque (Macaca mulatta) (e.g., Hackett 2011), as well as generally similar within the mammals (Webster et al. 1992; Geisler 1998).

Fig. 3.1

Neuroanatomy of the auditory system in primates. (a) Ascending auditory pathway from the cochlea to the auditory cortices. Fibers in blue originate from neurons in the ventral cochlear nucleus, form the lemniscal pathway (LL), and eventually pass through the ventral division of the medial geniculate nucleus on their way to primary auditory cortex. Fibers in red originate from the dorsal cochlear nucleus and form the extralemniscal pathway. Low-frequency (L) and high-frequency (H) pathways are present throughout. (b) Cortical pathways for auditory processing in the macaque. Corticocortical projections of the central auditory system run along two segregated pathways: a ventral pathway (green) runs from the anterolateral belt (area AL) along the anterior superior temporal cortex to the ventrolateral prefrontal cortex, while a dorsal pathway (red) extends from the caudolateral belt (area CL) to superior temporal cortex and inferior parietal cortex and ends in dorsolateral prefrontal cortex. Discrete thalamic input to the two pathways is provided from different medial geniculate (MG) nuclei: The ventral part (MGv) projects only to the core fields A1 and R, whereas the dorsal part (MGd) projects to primary auditory cortex (A1) and the caudomedial field (CM) (Rauschecker et al. 1997). Likewise, feedforward projections from AL and CL are largely separated and target the rostral parabelt (RPB) and caudal parabelt (CPB) regions, respectively (Hackett et al. 1998). Additional pathways involve the middle lateral area (ML), posterior parietal cortex (PP), and RPB areas on the surface of the rostral superior temporal gyrus (Ts1/Ts2) (Pandya and Sanides 1973). Prefrontal cortex projections (PFC) are segregated in Brodmann areas 10 and 12 versus 8a and 46, respectively (Romanski et al. 1999).
(a was modified and reprinted with permission from Henkel 2006; b was modified from Rauschecker and Romanski 2011; reproduced with permission from the original source, Rauschecker and Tian 2000)

In primates, conscious awareness of sound takes place within the various divisions of the auditory cortex (Fig. 3.1b). Within the auditory cortex, acoustic signals first travel to one or more of the primary cortical areas, which are most responsive to pure tones (Ghazanfar and Santos 2004). There are at least two widely agreed on primary cortical areas (A1 and R), but possibly there are as many as three or four (e.g., Kaas and Hackett 2000). Signals then travel to one or more of the surrounding seven (or so) auditory cortical belt areas and subsequently enter the prefrontal cortex of the frontal lobe, either directly from the belt or through functionally specific auditory parabelt areas in auditory and/or auditory-related fields in the superior temporal gyrus (Romanski et al. 1999; Kaas and Hackett 2000; Rauschecker and Tian 2000; Poremba et al. 2003; Hackett 2011; Rauschecker and Romanski 2011).

Like other major partitions of the primate auditory pathway, portions of the human and nonhuman primate auditory cortices work in a map-like fashion to represent frequency. For example, rhesus macaques and common marmosets have a tonotopic map on auditory area A1 (Aitkin et al. 1986; Micheyl et al. 2005). Individual fibers carry information from (and neurons are most responsive to) particular tones, with response strength decreasing sharply as frequencies depart from the preferred frequency. This organization is also present in most other mammals (e.g., cats: Imig and Adrian 1977).

3.2.2 Alternate Pathways for Spectral and Spatial Information

Neural processing of localization cues begins at the superior olivary nuclei of the medulla-pons junction and the inferior colliculus of the auditory midbrain. Later, at the cortical level in human and nonhuman primates, functional divergence of object-related (what) and spatial (where) information takes place after the primary auditory cortex in the superior temporal plane (Rauschecker and Tian 2000). More specifically, in humans, divergence takes place at the planum temporale, after which object-related spectral information is processed in the anterolateral planum temporale, planum polare, lateral Heschl’s gyrus, and the superior temporal gyrus anterior to Heschl’s gyrus (Warren and Griffiths 2003). Spatial information is processed in the posteromedial planum temporale and in the parietal and frontal lobes (Bushara et al. 1999). In macaques (Macaca sp.), divergence occurs in the belt areas (along the superior temporal gyrus): object-related spectral information proceeds from the anterior lateral belt through fields in the anteroventral superior temporal region into ventrolateral prefrontal cortex, whereas spatial information proceeds from the caudolateral belt and through fields in the posterodorsal superior temporal lobe and the posterior parietal cortex into dorsolateral prefrontal cortex (e.g., Romanski et al. 1999; Tian et al. 2001). Mostly based on clinical stroke studies, the posterior part of superior temporal gyrus (STG) in humans has classically been considered as specialized for speech processing (“Wernicke’s area”). Given reports from human imaging that anterior regions of STG are at least as selective for the perception of words as posterior regions (DeWitt and Rauschecker 2012), a redefinition of posterior STG as an area specializing in sensorimotor integration and control seems appropriate (Rauschecker 2011). This would include a role in spatial processing as well as in speech production and perception.

An important aspect of the primate central auditory system is its redundancy. For example, in the macaque lateral belt, signals are largely segregated into spatial (caudolateral belt) and nonspatial (anterior lateral belt) information; however, the streams obviously interact (Kaas and Hackett 1999; Romanski et al. 1999). Some neurons in the primate caudolateral belt respond to both location and specific calls, and the middle lateral belt is approximately equally selective for both call type and sound source location (Tian et al. 2001). Furthermore, each side of the brain receives and processes impulses from both ears, although in primates (human and nonhuman) the left cerebral hemisphere may have greater selectivity for processing temporal information, and the right cerebral hemisphere may have greater selectivity for processing spectral information (Joly et al. 2012; Ortiz-Rios et al. 2015).

3.2.3 Encoding Signals

In humans, the cortical region around Heschl’s gyrus, which also contains primary auditory cortex, is responsible for pitch perception (Schneider et al. 2005). A cortical area analogous to this region has been described for nonhuman primates (Bendor and Wang 2005). In their study on common marmosets, Bendor and Wang demonstrate that an area (restricted to low frequencies) on the border between two of the primary cortical areas (A1 and R) and adjacent to the anterior auditory cortical belt (AL and ML) contains pitch-selective neurons (also see Tomlinson and Schwarz 1988). Each neuron or group of neurons responds best to a specific pitch, whether it is generated by an actual pure tone or by a “missing fundamental” frequency represented by its spectral envelope.
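The "missing fundamental" can be illustrated with a small numerical sketch. It assumes an idealized harmonic complex whose components are exact integer multiples (in Hz) of a common fundamental, and it recovers that fundamental as their greatest common divisor. The function name and the example frequencies are hypothetical, and the calculation is only an arithmetic illustration of the stimulus; it is not a model of how pitch-selective neurons extract pitch.

```python
from functools import reduce
from math import gcd


def implied_fundamental(harmonics_hz):
    """Return the fundamental (Hz) implied by a set of harmonic components.

    Assumes idealized, exact integer frequencies; real stimuli and real
    pitch extraction are considerably more complicated.
    """
    return reduce(gcd, harmonics_hz)


# A pure tone at 200 Hz (hypothetical example value).
pure_tone = [200]

# A harmonic complex at 600, 800, and 1000 Hz contains no energy at 200 Hz,
# yet its components are the 3rd, 4th, and 5th harmonics of 200 Hz.
missing_fundamental_complex = [600, 800, 1000]

pure_tone_pitch = implied_fundamental(pure_tone)
complex_pitch = implied_fundamental(missing_fundamental_complex)
```

Under these assumptions both stimuli imply the same 200 Hz pitch, which is the sense in which a pitch-selective neuron could respond similarly to the two.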

Temporal relationships of signals and signal elements are important for identifying target proximity and location and distinguishing between calls (e.g., Ghazanfar and Santos 2004). In many cases, temporal alteration may affect representation more than spectral manipulation (Nagarajan et al. 2002; Ghazanfar and Santos 2004). Some neurons in the auditory midbrain respond selectively to order and spacing combinations (Wollberg and Newman 1972). This is demonstrated by the differential processing of temporally expanded and compressed vocalizations by the common marmoset (Wang et al. 1995) (Sect. 3.2.5). Other neurons in the auditory midbrain respond selectively to duration of frequency modulation or rates of amplitude modulation (e.g., Casseday et al. 1994). In another example, researchers presented a series of alternating high- and low-frequency tones to awake long-tailed macaques (Macaca fascicularis) and found that increasing the frequency separation, presentation rate, and tone duration improved the spatial differentiation of tonal responses on A1’s tonotopic map (p. 1656 in Fishman et al. 2004).

Studies on auditory cortex in anesthetized primates (e.g., common marmosets: Wang et al. 1995; squirrel monkeys, Saimiri sciureus: Bieser 1998) have reported that neurons mainly detect signal changes (onsets or transients). By contrast, when recording from primary auditory cortical and lateral belt neurons in awake common marmosets, Wang et al. (2005) found that responses are not only phasic but also tonic, indicating that some neurons respond continuously to spectrally and temporally optimal parts of the signal. Thus, cortical responses may be phasic (onset or offset), persistent tonic, inhibitory, and/or excitatory depending on stimulus frequency, intensity, location, and duration, similar to simple and complex cells in visual cortex (Tian et al. 2013). Since responses in anesthetized animals to pure tones are generally only phasic, they may not represent the full range of cortical responses/firing patterns. Considering this, studies of awake rather than anesthetized animals (e.g., Recanzone et al. 2000; Malone et al. 2002) may be preferable, depending on research questions and methods.

3.2.4 Are Primate Brains Specialized for Processing Vocalizations?

The human brain has long been claimed to have specialized neural structures, such as Wernicke’s area, for processing speech and, perhaps, others for interpreting meaning and auditory imagery (Fisher and Marcus 2006), but the notion of areas specialized for speech perception is undergoing some revision. Although primates show evidence of homologous neuroanatomical pathways and structures, a topic of debate is whether the nonhuman primate central auditory system contains regions that are (or, even as a whole, is) specialized for processing vocalizations. First, it is important to distinguish between auditory brain areas being sensitive versus selective. That an area is vocalization sensitive means that its neurons respond especially well to all vocalizations. That an area is vocalization selective means that single neurons or groups of neurons within that area each respond to different vocalizations: some neurons may respond preferentially to contact calls, whereas others may respond to predator warning calls. Based on neurophysiological experiments, authors such as Rauschecker et al. (1995) and Tian et al. (2001) argue convincingly that certain regions of the primate lateral belt may be vocalization selective. However, during the above experiments, responses to vocalizations were not consistently compared with responses to relevant nonvocal complex sounds in the same neurons. Thus, it is possible that neurons in the primate lateral belt are vocalization sensitive but not selective, and that such selectivity is not generated until later in higher processing regions.

Many authors have reviewed vocal communication and parallels with human language in primates. In their study on speech segmentation in cotton-top tamarins (Saguinus oedipus), Hauser et al. (2001) demonstrate that nonhuman primates are able to recognize different sequences of syllables in a speech stream. Humans use this ability to calculate statistical probabilities of sequence occurrence (transitional probabilities) for the segmentation and identification of words in an unknown language (e.g., Chomsky 1975). Interestingly, many authors have pointed out that some facets of speech that are central to speech perception in humans, such as syllable onsets, formant frequencies, glottal-pulse periods, and the spectral profiles of consonants and vowels, are already encoded in peripheral hearing not only of primates but of mammals as a whole (e.g., Delgutte 1997; Lieberman 2006).
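The notion of transitional probabilities can be made concrete with a minimal sketch. It assumes that the transitional probability of syllable B given syllable A is estimated as the count of the pair A-B divided by the count of A as a first element, and that a word boundary is placed wherever this probability falls below a threshold. The syllables, the three invented "words", and the threshold of 0.75 are hypothetical illustration values, not stimuli from any study cited here.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for every adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])  # syllables that have a successor
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(syllables, threshold):
    """Split the stream wherever the transitional probability dips."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:  # low predictability: assume a boundary
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical continuous stream built from three invented words.
stream = ("tu pi ro go la bu bi da ku go la bu "
          "tu pi ro bi da ku tu pi ro").split()

tp = transitional_probabilities(stream)
words = segment(stream, threshold=0.75)
```

In this toy stream, syllable pairs inside a word always co-occur, whereas pairs that span a word boundary are less predictable, so the dips in transitional probability line up with the boundaries between the invented words.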

However, although the mammalian ear may be well-equipped to encode aspects of speech important to human perception, this does not mean that primates are specialized to process the meaning of these features. Many attempts have been made to understand the differences and similarities between human and nonhuman primates with regard to auditory-vocal processing. Because of the complex nature of identified (or as yet unidentified) relationships, Owren and Rendall (2001) rightly warn that, at present, comparisons between (and models of) human language and nonhuman primate vocalizations need to be approached cautiously (also see Ghazanfar and Santos 2004).

3.2.5 Potential Specializations for Processing Species-Specific Vocalizations

Although the communication systems of nonhuman primates do not match human speech and language in either combinatorial power or recursive structure, the auditory cortex of nonhuman primates resembles that of humans, particularly in having a hierarchical structure with tonotopic mapping and specialized streams for processing specific types of information (Rauschecker and Scott 2009). The primate central auditory system shows evidence of specialization for processing location as well as complex bioacoustic communication signals such as conspecific vocalizations. In fact, acoustic sensitivity may decrease when frequencies are not heard in sequences corresponding to biologically meaningful stimuli such as species-specific calls.

The acoustically distinct vocalizations of primate species are well documented and those vocalizations can even be utilized, in some cases, to assess phylogenetic relationships (e.g., Zimmermann 1990). Behavioral studies in the wild provide substantial evidence that primates are able to recognize conspecifics, kin groups, and individuals based on variations in vocal acoustics (e.g., Chapman and Weary 1990). Neurobiological experiments measuring the responses of auditory cortical areas to natural vocalizations versus artificially manipulated or synthesized vocalizations provide a basis for understanding how at least some nonhuman primate species are able to distinguish conspecifics based on their calls. In both human and nonhuman primates, conspecific vocalizations are received in both the left and right cerebral hemispheres, but processing is focused in specific areas of the left hemisphere where some single neurons or groups of neurons may respond particularly well to distinct vocalizations (Ghazanfar and Santos 2004; Poremba et al. 2004). Studies on rhesus macaques demonstrate that the lateral belt systematically represents tones and frequencies and is especially responsive to complex signals such as species-specific vocalizations (e.g., Rauschecker et al. 1995; Romanski et al. 1999). Studies on squirrel monkeys found that neurons in the auditory cortex responded to frequency modulations in both natural and synthesized vocalizations, but responses were greater for natural, strongly amplitude-modulated vocalizations, possibly owing to their syllable-like divisions (Bieser 1998; Ghazanfar and Santos 2004).

In a study on common marmosets, neurons in the primary auditory cortex responded preferentially to normal versus time-reversed, compressed, or expanded conspecific vocalizations. When the same marmoset vocalizations were presented to cats, the evoked responses were relatively small and roughly equal for normal and time-reversed examples (Wang et al. 1995; Wang and Kadia 2001). A behavioral experiment where long calls were played back to cotton-top tamarins found that individuals were more likely to respond to whole rather than parts of conspecific calls (Ghazanfar et al. 2001; Snowdon, Chap. 6). Additional studies indicate that among some animals, neuronal responses to temporally correct combinations of tones are stronger than the summed response to the individual signals presented separately (Viemeister and Wakefield 1991; Alder and Rose 1998). Other studies have shown that, whereas squirrel monkey and cotton-top tamarin auditory cortical areas respond more strongly to conspecific vocalizations than to those of other species, time reversing and pitch shifting did not significantly alter the results, indicating order/spectral insensitivity (e.g., Glass and Wollberg 1983). The preferential processing of and response to species-specific calls may be preprogrammed or dependent on experience and may be related to recognizing signals that are similar to those that are self-produced (Brainard and Doupe 2002). Correctness likely varies at the species level (Alder and Rose 1998).

3.2.6 Interindividual Recognition

Unarguably, humans are able to distinguish between individual voices based on spectral and temporal cues. Two humans saying the same word or phrase (call) can be distinguished from one another. Conversely, tamarin and squirrel monkey studies suggest that the primate auditory system does not respond differently to variants (different examples from different individuals) of the same call (Ghazanfar and Hauser 1999). This suggests that primates may not be universally adept at recognizing individuals based on call structure (Ghazanfar and Santos 2004). However, Wang and colleagues (1995) report that in marmosets, auditory cortical representations from spectrotemporal variants of calls from different individuals were different but overlapping, suggesting some individual recognition might be possible.

Behavioral evidence also indicates that primates can recognize individuals from their calls. For example, vervet monkeys (Chlorocebus aethiops) can organize individuals hierarchically and into kin groups based on individual calls (Cheney and Seyfarth 1990), and Waser (1977) provides evidence from playback studies that monkeys can recognize individuals based on their vocalizations. The results of these studies are perhaps not surprising, considering that individual recognition based on call structure has long been reported in birds (e.g., Thorpe 1968). How the primate brain processes and stores these subtle differences is at present unknown.

3.3 Defining, Representing, and Measuring Overall Auditory Sensitivity in Primates

Comparative audiograms for primates have been gathered primarily via traditional behaviorally based testing and physiological techniques such as the auditory brainstem response (ABR) method (Sect. 3.3.3). Currently, data are available for only a small percentage of the hundreds of nonhuman primate species (Sect. 3.4), and much of the existing data is likely to be incomparable due to issues or inconsistencies with experimental design or data reporting, greater than average interindividual variation, unexpected results that do not fit preconceptions about variation in the order, or philosophical debates with regard to potential incompatibilities between behaviorally and physiologically derived data (Coleman 2009; H. E. Heffner and R. S. Heffner 2014). This section introduces the ways in which auditory sensitivity is defined and represented, the conceptual issues surrounding methods of data collection, and the comparability of the resulting data.

3.3.1 Defining and Representing Auditory Sensitivity in Primates

The term auditory sensitivity is utilized throughout this chapter as the broadest definition of the function of the sense—it can be conceived of herein as a representation of all sounds that are collected via the ear, are received (produce a neural response) in the brain, and have the potential of being utilized by the individual. Although the terms auditory sensitivity (audition) and hearing are often used interchangeably, the term hearing carries additional complex meanings related to perception and psychoacoustics.

The auditory sensitivity of primates can be represented as the range of audible frequencies, measured in hertz (Hz), that are detectable at varying amplitudes, measured in decibels (dB re 20 μPa). Frequencies below 20 Hz are defined as infrasound because they are below the range of human hearing, and frequencies above 20 kHz are defined as ultrasound, or above the range of human hearing. Auditory sensitivity can be represented graphically as an audiogram—a curve showing the lowest audible level (threshold, in dB) at each tested frequency. In this chapter, variation in auditory sensitivity within and between species is considered through the most common audiometric parameters: frequency of best sensitivity, defined as the frequency that can be detected at the lowest level (in dB); and the low-frequency and high-frequency limits, defined as the lowest and highest frequencies, respectively, detectable at reasonable amplitudes (conventionally 60 dB). The audible range, defined as the number of octaves between the low- and high-frequency limits, is also a common audiometric parameter, but it is not considered here since it is highly reliant on both the low- and high-frequency limits, and the former is not available for most subjects. Studies have also sought to formulate additional audiometric parameters to facilitate interspecific comparisons, such as the absolute threshold level at particular frequencies, or measures of overall sensitivity across the audiogram, or sensitivity within low-, mid-, and high-frequency areas (e.g., Coleman and Colbert 2010; Ramsier et al. 2012a); these parameters are yet to be widely adopted and thus are not considered further in this chapter.
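The audiometric parameters defined above can be sketched computationally. The example below assumes an audiogram supplied as a mapping from tested frequency (Hz) to threshold (dB), and it estimates the 60 dB limits by interpolating linearly between adjacent test points on a logarithmic frequency axis; that interpolation scheme, the function name, and the audiogram values are all illustrative assumptions rather than a procedure or data set taken from the studies cited here.

```python
import math


def audiogram_parameters(thresholds, cutoff_db=60.0):
    """Summarize an audiogram given as {frequency_hz: threshold_db}."""
    points = sorted(thresholds.items())  # ascending frequency

    # Frequency of best sensitivity: the lowest threshold on the curve.
    best_frequency = min(points, key=lambda p: p[1])[0]

    def crossing(f1, t1, f2, t2):
        # Interpolate threshold linearly against log2(frequency).
        fraction = (cutoff_db - t1) / (t2 - t1)
        return 2 ** (math.log2(f1) + fraction * math.log2(f2 / f1))

    pairs = list(zip(points, points[1:]))
    low_limit = high_limit = None
    for (f1, t1), (f2, t2) in pairs:  # first downward crossing
        if t1 > cutoff_db >= t2:
            low_limit = crossing(f1, t1, f2, t2)
            break
    for (f1, t1), (f2, t2) in reversed(pairs):  # last upward crossing
        if t1 <= cutoff_db < t2:
            high_limit = crossing(f1, t1, f2, t2)
            break

    # Audible range in octaves; undefined if either limit was not reached.
    octaves = None
    if low_limit is not None and high_limit is not None:
        octaves = math.log2(high_limit / low_limit)

    return {
        "best_frequency_hz": best_frequency,
        "low_limit_hz": low_limit,
        "high_limit_hz": high_limit,
        "audible_range_octaves": octaves,
    }


# Hypothetical audiogram (Hz -> dB), invented purely for illustration.
example = {125: 70, 250: 50, 1000: 20, 8000: 10,
           16000: 30, 32000: 50, 64000: 70}
result = audiogram_parameters(example)
```

For this invented audiogram the sketch places the limits at roughly 177 Hz and 45 kHz, an audible range of eight octaves. When the lowest tested frequency is already audible below 60 dB, the low-frequency limit is returned as `None`, which mirrors the situation in which the audible range cannot be computed because the low-frequency limit is unavailable.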

3.3.2 Determining Threshold

When constructing an audiogram, the precision of the threshold measurement is highly dependent upon the frequency steps used and the accurate calibration of stimuli (Coleman 2009). A free-field speaker is generally considered the ideal transducer for delivering stimuli to primates. The use of headphones, from inserts to circumaural, is also relatively common when testing auditory sensitivity in humans and other animals, as headphones may help minimize interference from subject position, room noise, and electrical artifacts (Martin and Clark 2006). However, earphones that depress or bypass the pinnae may influence or negate the amplification effects of the pinnae (Sinyor and Laszlo 1973; Rosowski 1991). Thus, some workers express concern over the use of headphones, particularly insert varieties, either because of pinna amplification issues or because delivering low-frequency signals through these devices can be problematic (R. S. Heffner 2004; Coleman 2009). Tables 3.1 and 3.2 show data gathered free-field and with headphones for several species. There seems to be good agreement in the high-frequency limit but more variation in the frequency of best sensitivity, which may be more strongly subject to methodological variations. More data are needed to fully evaluate pinna effects and the influence of transducer type on auditory thresholds. Another potential issue is that pure-tone stimuli may only broadly represent auditory sensitivity, given that in at least some primates, neural responses to conspecific vocalizations are enhanced compared to nonspecific noise (Sect. 3.2.5).

Table 3.1 Auditory sensitivity in primate semiorder Strepsirrhini
Table 3.2 Auditory sensitivity in primate semiorder Haplorhini

3.3.3 Testing Methods

After decades of refinement, well-designed behavioral testing regimens produce what are generally considered to be ideal estimates of auditory sensitivity, as the behavior of whole animals is measured (H. E. Heffner and R. S. Heffner 2014). Beginning with Elder’s (1934) audiogram for chimpanzees (Pan troglodytes), most existing data on primate audition have been gathered via behaviorally based methodologies, although very few have been collected in recent decades (Sect. 3.4) (Coleman 2009).

An alternative to behaviorally based testing is minimally invasive, physiologically based testing, such as the ABR method (Jacobson 1985), during which the responses of the auditory system are measured directly. The ABR method has been widely adopted within the biomedical and clinical realms (Burkard and Don 2007) and recently within primatology (Ramsier and Dominy 2010). The ABR method reliably estimates overall audiogram shape (dips and peaks in sensitivity) and the behaviorally derived high-frequency limit and frequency of best sensitivity. However, threshold levels for low-frequency stimuli may be underestimated by the ABR method, and additional data are needed to fully evaluate to what degree it is possible to compare absolute thresholds derived through each method.

3.4 Auditory Sensitivity Among Primates

3.4.1 Primate Audiograms

Reasonably complete and comparable audiograms for twenty-nine nonhuman primate species have been published using either traditional behavioral testing or the ABR method (Sect. 3.3.3). These data are considered together in this section, despite some debate over the degree to which data gathered with different methodologies (e.g., behavioral versus ABR, speaker versus headphones) can be compared (Sect. 3.3).

The sample of published audiograms represents both primate semiorders. The semiorder Strepsirrhini (Table 3.1) is the evolutionarily oldest primate clade and more closely reflects the ancestral primate condition (Masters et al. 2013; Zimmermann, Chap. 5). The Strepsirrhini includes two infraorders, the Lorisiformes and Lemuriformes. The Lorisiformes include relatively small-bodied, nocturnal, highly arboreal species from Africa and Asia; audiograms have been published for six species. The strepsirrhine infraorder Lemuriformes is more variable than Lorisiformes in body size, behavior, and ecology—it consists of small- to medium-bodied, arboreal to semiterrestrial, nocturnal, cathemeral, and diurnal species from the island of Madagascar. Audiograms have been published for nine taxa of Lemuriformes.

The semiorder Haplorhini (Table 3.2) includes primates that are more closely related to humans than are the strepsirrhines. The haplorhine suborder Tarsiiformes includes one infraorder (also Tarsiiformes) and multiple species of small-bodied, nocturnal, arboreal tarsiers (Carlito sp., Cephalopachus sp., Tarsius sp.) from Asia; an audiogram exists for one species (Ramsier et al. 2012b). Due to behavioral and morphological similarities with the semiorder Strepsirrhini, tarsiers were traditionally grouped with them (Masters et al. 2013). The haplorhine suborder Anthropoidea has two infraorders. Infraorder Platyrrhini consists of New World monkeys from Central and South America, which are medium-bodied arboreal species that generally are diurnal, with the exception of the nocturnal owl monkey (Aotus sp.). Comparable audiograms exist for three smaller bodied species, but none exist for the many larger bodied New World monkeys, such as howling monkeys (Alouatta sp.), spider monkeys (Ateles sp.), and capuchins (Cebus sp. and Sapajus sp.), nor the many species of tamarin (Saguinus sp.).

The Anthropoid infraorder Catarrhini consists of Old World monkeys, apes, and humans from Africa, Asia, and Europe. This is a highly diverse group that consists of medium- to large-bodied species that are all diurnal and range from terrestrial to highly arboreal. Much research in this group has focused on common laboratory species such as macaques. No comparable audiograms are published for the speciose Colobinae subfamily of monkeys nor for the apes other than the chimpanzee.

There is notable variation in the auditory sensitivity of the primate species tested to date. This can be conceptualized visually by comparing median behavioral audiograms for each infraorder (Fig. 3.2) and by examining audiometric parameters for both behavioral and ABR audiograms (Tables 3.1 and 3.2; Fig. 3.3).

Fig. 3.2 Median (lines) and range (shading) of behavioral audiograms for the four major primate infraorders (based on Coleman 2009)

Fig. 3.3 The 60-dB high-frequency limit among the primate infraorders. Horizontal lines show range, box limits show first and third quartiles, vertical lines show median, and dots show mean values. For the Tarsiiformes, the one data point is at least 76 kHz but could be higher, as represented by the arrow

Fig. 3.4 Frequency of best sensitivity among the primate infraorders. Horizontal lines show ranges, box limits show first and third quartiles, vertical lines show medians, and dots show mean values

3.4.2 Intraspecies Variation

Coleman (2009) reviewed behaviorally based primate auditory studies and calculated the average within-study intraspecific variation in the threshold for each tested frequency to be ±4.2 dB around the mean (range ±0.95–9.25 dB), with slightly increased variation at frequencies greater than 8 kHz. There was a relationship between the number of individuals included in a study and the reported intraspecies variation: the average intraspecific variation for studies with four or more subjects was higher (mean ±5.7 dB) than the overall average of ±4.2 dB. Given that studies tend to choose similar subjects (e.g., young adult males), it seems likely that intersubject variability is underestimated in tests of auditory sensitivity.

3.4.3 Variation in High-Frequency Limit

There is much variation in high-frequency limit (Tables 3.1 and 3.2; Fig. 3.3). On average, primates of the semiorder Strepsirrhini are more sensitive to high frequencies; within the Strepsirrhini, there is much overlap between the infraorders Lorisiformes and Lemuriformes, with the latter averaging the highest high-frequency limit. Monkeys and apes of the semiorder Haplorhini tend to be relatively less sensitive to high frequencies with the notable exception of the small, nocturnal tarsier, for which the high-frequency limit is the highest reported within the primate order (Table 3.2; Fig. 3.3).

3.4.3.1 High-Frequency Limit and Sound Source Localization

A long prevailing model explains variation in high-frequency auditory sensitivity among mammals as a product of head size and the need for localizing sound sources (Masterton et al. 1969; R. S. Heffner 2004). Auditory localization is the act of determining the directional location of a sound source horizontally (azimuth) and in elevation (Blauert 1997; Popper and Fay 2005). How accurately an animal can localize sources is referred to as localization acuity. Most mammals, other than subterranean species, can localize within a window of 40° or less (R. S. Heffner and H. E. Heffner 1992; R. S. Heffner 2004). Data on Japanese macaques (Macaca fuscata) (Houben and Gourevitch 1979) and squirrel monkeys (Don and Starr 1972) suggest that nonhuman primates are very good localizers with acuity similar to that of cats, pigs, and opossums at around 4–6° azimuth (R. S. Heffner and H. E. Heffner 1988; R. S. Heffner 2004). Humans, like dolphins (Renaud and Popper 1975) and elephants (R. S. Heffner and H. E. Heffner 1982), are especially good localizers, with acuity of around 1° azimuth—in other words, humans can orient directly toward a sound source with almost perfect accuracy (Middlebrooks and Green 1991; R. S. Heffner 2004).

R. S. Heffner (2004) reported that auditory localization acuity is well-matched to the width of the field of best vision among mammals. The narrower the field of best vision is, the better the auditory localization acuity is so that the head can be oriented precisely to put the sound source in the subject’s field of best vision. The especially good auditory localization ability of haplorhine primates, such as humans, corresponds with the presence of a very narrow field of best vision. This relationship is underlain by similarities in auditory and visual neural structures and mechanisms (Rauschecker 2015); for example, responses to stimuli coming from the area that is attended to are amplified, and responses to peripheral stimuli are attenuated (e.g., Bushara et al. 1999; Winkowski and Knudsen 2006). The first localization response allows the head to be turned for subsequent maximum auditory and visual localization acuity. Currently, there are insufficient comparative data on auditory and visual localization acuity among primates to fully investigate trends within the primate order, but further investigation would be interesting given that enhanced vision is one of the hallmarks of primate evolution (Martin and Ross 2005).

Many terrestrial vertebrates can detect the horizontal location of a sound’s source with the aid of binaural cues, that is, differences in the sound received at each ear (Geisler 1998; R. S. Heffner 2004). In general, a sound is perceived as more intense by the ear that is facing more directly toward the sound, at which it also arrives first. Interaural distance, the distance between the tympanic membranes, influences the effectiveness of binaural localization cues at different frequencies. Increasingly lower frequencies have increasingly longer wavelengths such that low-frequency sound waves may pass by the head (especially a small head) with little or no deflection, making low frequencies increasingly difficult or impossible to use for localization. Furthermore, interaural timing cues rely on low frequencies and decrease in usefulness as head size decreases (e.g., Klump and Eady 1956; R. S. Heffner 2004). Thus, the allometric model of auditory sensitivity explains high-frequency sensitivity as a negative function of interaural distance: smaller headed mammals are increasingly reliant on higher frequencies to enable localization through binaural and pinna cues (R. S. Heffner 2004). This is a well-supported model that explains general patterns observed among mammals.
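The physical reasoning behind the allometric model can be sketched numerically. The example below assumes a speed of sound in air of 343 m/s, approximates the largest possible interaural time difference as the straight-line travel time between the ears (ignoring the longer path around the head), and uses two hypothetical interaural distances that are not taken from Tables 3.1 or 3.2. The frequency whose wavelength equals the interaural distance is offered only as a rough marker of where head shadowing starts to become appreciable.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 degrees C


def wavelength_m(frequency_hz):
    """Wavelength of a sound in air: speed of sound divided by frequency."""
    return SPEED_OF_SOUND_M_PER_S / frequency_hz


def max_interaural_time_difference_us(interaural_distance_m):
    """Largest arrival-time difference between the ears, in microseconds.

    Simplification: straight-line travel across the interaural distance.
    """
    return interaural_distance_m / SPEED_OF_SOUND_M_PER_S * 1e6


def frequency_matching_head_hz(interaural_distance_m):
    """Frequency whose wavelength equals the interaural distance."""
    return SPEED_OF_SOUND_M_PER_S / interaural_distance_m


# Hypothetical interaural distances (metres), for illustration only.
SMALL_HEAD_M = 0.03
LARGE_HEAD_M = 0.15

for label, distance in (("small head", SMALL_HEAD_M), ("large head", LARGE_HEAD_M)):
    itd = max_interaural_time_difference_us(distance)
    freq = frequency_matching_head_hz(distance)
    print(f"{label}: max time difference {itd:.0f} microseconds, "
          f"wavelength matches head at {freq / 1000:.1f} kHz")
```

Under these assumptions the smaller head yields a much shorter maximum time difference and a much higher frequency at which the wavelength becomes comparable to the head, which is consistent with the expectation that smaller headed mammals depend on higher frequencies for binaural cues.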

R. S. Heffner (2004) concluded that primate hearing is not specialized in terms of audible frequency range but, rather, follows the typical mammalian pattern with smaller species capable of hearing higher frequencies than larger species. While this relationship holds across mammals and across the primate order in general (R. S. Heffner 2004; Ramsier et al. 2012a), interaural distance does not explain all variation among primates (Coleman 2009; Ramsier et al. 2012a). For example, a relationship between high-frequency sensitivity and interaural distance was not significant within the semiorder Strepsirrhini (Ramsier et al. 2012a). When all data (multimethod) from Tables 3.1 and 3.2 were considered, the relationship was significant among the Catarrhini and Lorisiformes but not among the Lemuriformes nor the Platyrrhini. When all primates were averaged, the relationship was not significant unless the Lemuriformes were averaged prior to order-wide analysis. Some individual primate species depart from the expected pattern as well. The yellow baboon (Papio cynocephalus), for example, which is one of the largest primates for which data on auditory sensitivity exist, has a relatively elevated high-frequency limit—the opposite of what is predicted by the allometric model (Table 3.2).
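One simple way to examine the allometric prediction is to regress the high-frequency limit on interaural distance after log-transforming both variables. The sketch below is a minimal illustration, not the analysis used in the cited studies: the data points are hypothetical, the function name `loglog_fit` is introduced here, and no significance test or phylogenetic correction is attempted.

```python
import math


def loglog_fit(x_values, y_values):
    """Ordinary least-squares fit of log10(y) on log10(x).

    Returns (slope, intercept, pearson_r) for the log-transformed data.
    """
    log_x = [math.log10(x) for x in x_values]
    log_y = [math.log10(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    syy = sum((b - mean_y) ** 2 for b in log_y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical species values, for illustration only:
# interaural travel time (microseconds) and 60-dB high-frequency limit (kHz).
interaural_us = [50.0, 100.0, 200.0, 400.0]
high_limit_khz = [70.0, 55.0, 40.0, 30.0]

slope, intercept, r = loglog_fit(interaural_us, high_limit_khz)
print(f"slope = {slope:.2f}, r = {r:.3f}")
```

A negative slope corresponds to the allometric expectation that the high-frequency limit falls as interaural distance grows. Running the same fit within a single infraorder, or after averaging closely related taxa, is one way to explore why the relationship appears in some groups and not in others.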

R. S. Heffner (2004) noted that animals may take advantage of sensitivity to high frequencies that evolved in relation to localization acuity to communicate via high-frequency signals. However, high-frequency vocal communication is potentially a selective force in itself as well. Both small-headed and large-headed species may experience selective pressure to detect high-frequency sounds, such as those emitted by infants, insect prey, or smaller sympatric species (Sect. 3.4.3.2). Given that individual primates vary in their auditory sensitivity, such selective pressure could certainly operate in addition to, or in the absence of, selective pressure related to sound localization.

Examining limited data available at the time, R. S. Heffner (2004) concluded that intraspecies differences in interaural distance, even the twofold differences present between some dog breeds, did not seem to correlate with differences of equal magnitude in the high-frequency limit and suggested that the high-frequency limit is a species trait, not an individual trait. Along these lines, the lack of a significant relationship between interaural distance and high-frequency limit among the strepsirrhines (especially the lemurs) might be attributed to the close evolutionary relationship among some of the species (and subspecies) in the sample. Thus, the relationship between high-frequency limit and localization acuity may still hold at taxonomic levels above species, and other factors (perhaps after controlling for interaural distance) may further explain the evolution of variation in primate auditory sensitivity. These could be interesting areas for future research.

3.4.3.2 High-Frequency Limit, Behavior, and Ecology

Ramsier et al. (2012a) tested the auditory sensitivity of eleven strepsirrhine primate species and found a relationship between group size and enhanced auditory sensitivity, particularly to high frequencies, indicating that the more social species may benefit from enhanced acoustic communication with conspecifics if higher frequencies are used for communication. For some primate species, it may be particularly beneficial to emit and detect higher frequency alarm calls that are perhaps less audible to common aerial and terrestrial predators (Ramsier et al. 2012b). This model could partially explain why the yellow baboon, a highly social haplorhine species (Semple 2001; Barton et al. 1996), is sensitive to high frequencies despite its large head size and interaural distance. However, the haplorhines are, overall, characterized by relatively poor high-frequency and enhanced low-frequency auditory sensitivity, suggesting that haplorhines as a group may benefit from antipredator strategies other than emitting high-frequency alarm calls (Hill and Dunbar 1998). Lack of use of high-frequency localization cues (R. S. Heffner 2004) and reduced ability to produce high-frequency vocalizations (Fitch 1997) may have contributed to the particularly enhanced low-frequency auditory sensitivity of humans (see Quam, Martínez, Rosa, and Arsuaga, Chap. 8). Perhaps human ancestors relied more heavily on detecting low-frequency sounds produced by avian and felid predators, or perhaps they communicated directly with predators to deter them, similar to the African putty-nosed monkey (Cercopithecus nictitans martini) (Arnold et al. 2008).

3.4.4 Frequency of Best Sensitivity

The frequency of best sensitivity is an indication of the frequency at which a species hears best, and thus this audiometric parameter could provide clues to important selective pressures in a primate’s environment. In the current dataset, the frequency of best sensitivity is higher on average in the strepsirrhines compared to the haplorhines (Fig. 3.4), following the overall pattern for high-frequency limit. Among the haplorhines, the catarrhines have the broadest overall range in frequency of best sensitivity (0.8–16 kHz), but nine of the eleven tested species have a frequency of best sensitivity (or a second peak in sensitivity) in the lower range of 0.8–4 kHz.

Some primates, particularly the platyrrhine monkeys, have a prominent dual peak of best sensitivity (a w-shaped audiogram, Fig. 3.2) (Coleman 2009). This pattern is not uncommon and is also found in other mammalian groups (e.g., Rice et al. 1992; Bohn et al. 2001). Among the platyrrhines, the higher peak (7–12 kHz) is the most sensitive and thus forms the actual frequency of best sensitivity; the lower (less sensitive) peak lies around 2 kHz, close to the lower frequency cluster found in catarrhines. Some authors speculate that a dip in sensitivity between the peaks in animal audiograms is an adaptation to enhance sound localization ability (e.g., Rice et al. 1992; R. S. Heffner 2004). Others hypothesize that the upper peak may be an adaptation for mother-infant communication (Bohn et al. 2001; Sterbing 2002). Given that acoustic communication can be affected by habitat (Brown and Waser, Chap. 4), perhaps the dual peak is also related to broad niche occupation (i.e., high and low strata, densely vegetated and open areas) or shifting niche occupation from the ancestral platyrrhine monkey to the extant species. This may be related to a larger evolutionary explanation, whereby the upper peak represents the ancestral primate condition (still conserved in the strepsirrhines), and the lower peak is a derived condition related to changing behavior, anatomy, and habitat acoustics. Such a pattern might have evolved partially as an adaptation to tune out loud ambient acoustical noise (biological and nonbiological in origin) or take advantage of “sound windows” in forest habitats (see Brown and Waser, Chap. 4). In any case, a larger sample and additional research, including re-evaluating how to report and compare the frequency of best sensitivity, could lead to interesting insights.

Importantly, identification of the frequency of best sensitivity is highly dependent on the frequencies tested—many studies have tested in octave steps, whereas others have been more specific with half-octave steps, intervals of 10 kHz, or other frequencies of interest. Also, the frequency of best sensitivity is sometimes determined within a narrow margin, with the best frequency differing by only 1–2 dB from a secondary peak (Tables 3.1 and 3.2), and this small difference may be within the margin of calibration or testing error (Coleman 2009). Thus, the frequency or frequencies that a species is most sensitive to are important to consider, but the values taken out of context of the whole audiogram should be compared cautiously.
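The dependence of the reported frequency of best sensitivity on the tested frequencies can be shown with a small sketch. The threshold curve below is hypothetical (a smooth curve with its true minimum at 6 kHz), and the function names are introduced here for illustration only. The same curve is sampled in octave steps and in half-octave steps, and the lowest sampled threshold is taken as the estimate.

```python
import math


def stimulus_frequencies(start_khz, stop_khz, steps_per_octave):
    """Frequencies from start_khz to stop_khz at equal fractions of an octave."""
    frequencies = []
    k = 0
    while True:
        f = start_khz * 2.0 ** (k / steps_per_octave)
        if f > stop_khz * (1.0 + 1e-9):  # small tolerance for rounding
            break
        frequencies.append(f)
        k += 1
    return frequencies


def hypothetical_threshold_db(frequency_khz, true_best_khz=6.0):
    """Made-up threshold curve with its minimum at true_best_khz."""
    return 5.0 + 8.0 * math.log2(frequency_khz / true_best_khz) ** 2


def estimated_best_frequency(frequencies):
    """Tested frequency with the lowest threshold on the hypothetical curve."""
    return min(frequencies, key=hypothetical_threshold_db)


octave_steps = stimulus_frequencies(0.5, 32.0, steps_per_octave=1)
half_octave_steps = stimulus_frequencies(0.5, 32.0, steps_per_octave=2)

best_octave = estimated_best_frequency(octave_steps)
best_half_octave = estimated_best_frequency(half_octave_steps)
print(f"octave steps: {best_octave:.2f} kHz; "
      f"half-octave steps: {best_half_octave:.2f} kHz")
```

With octave steps the estimate lands on 8 kHz, whereas half-octave steps place it near 5.7 kHz, although the underlying curve is identical. The gap between the two estimates is produced entirely by the sampling grid, which is one reason to compare reported values in the context of the whole audiogram.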

3.4.5 Low-Frequency Limit

Interspecific variation in the low-frequency limit ranges from 28 Hz in the Japanese macaque to 150 Hz in the fork-marked lemur (Phaner furcifer) (Tables 3.1 and 3.2). The haplorhines have, on average, a lower limit than the strepsirrhines. One of the first studies considering both audition and ecology among primates was that of Brown and Waser (1984), which found that in blue monkeys low-frequency vocalizations were associated with enhanced low-frequency auditory sensitivity.

Currently, it is difficult to draw any broad conclusions about the low-frequency limit in primates given that the existing data overlap, data are unavailable for most species, and data for some species are based on a sample size of one. It is not currently clear whether the observed variation exceeds what is normal for interindividual variation, reflects physical limitations of the primate ear, or is a product of selection. Patterns of existing variation and lack of more data may also reflect methodological issues: it can be particularly difficult to calibrate low-frequency acoustic stimuli in variable testing conditions that can include relatively loud low-frequency background noise.

3.5 Summary and Implications for Future Research

Anthropological, biological, and biomedical studies often use nonhuman primates as models for humans. However, human and nonhuman primates differ in their auditory capabilities (Sect. 3.4). Although researchers have identified 21 hearing-linked genes that differ between chimpanzees and humans (Clark et al. 2003), the intricacies and auditory consequences of these genetic differences are not yet fully understood, in part due to a relatively small sample of nonhuman primate auditory data (Sect. 3.4). Identifying the subtleties that separate human and nonhuman primate audition, and the biological relevance of such differences, will require continued effort to fully explore, integrate, and expand the current dataset. A major aspect of this exploration will be further evaluating and refining methods of data collection and analysis (Sect. 3.3).

With respect to auditory processing and perception, nonhuman primates show evidence of specialization for the processing of vocalizations overall, of species-specific vocalizations in particular, and, in some cases, of the ability to recognize specific individuals (Sect. 3.2). Whether or not the nonhuman primate auditory system is specialized for processing vocalizations in general is still a matter of debate, but studies do indicate that specialized cortical structures for processing vocalizations that were thought to be unique to humans actually have homologous counterparts in nonhuman primates. Numerous studies indicate that, like humans, at least some nonhuman primates are able to distinguish conspecifics, and possibly individuals, based on call structure. Studies on the differential processing of normal versus synthesized and spectrally or temporally modified calls both support and dispute that nonhuman primates possess these abilities. Compared to the majority of mammals, nonhuman primates are excellent sound source localizers, closely approaching humans in their high acuity (Sect. 3.4.3.1). The relationship between excellent sound localization acuity and visual acuity, together with the similarities in underlying neural structures, provides evidence for the coevolution of these two senses in primates. However, these data are based on relatively few species that are common to laboratory settings. The general trend seems to be that as more data accumulate, the auditory abilities of nonhuman primates increasingly appear to be very close to those of humans. Identifying the subtleties that separate human and nonhuman primate auditory abilities, and the biological relevance of those differences, requires consideration of all available data. Additional research into the auditory processing and perception of other primate taxa is needed to fully evaluate patterns and evolutionary relationships within the order.

With respect to overall auditory sensitivity, primates have traditionally been portrayed as unspecialized, which may be a consequence of the overall generalized nature of the auditory sense among mammals. There is variation among the species, though, with respect to both order-wide trends and species that display auditory specializations. A review of the literature shows that strepsirrhines are, on average, more adept at detecting high frequencies, and the haplorhines are, on average, more adept at detecting low frequencies (Sect. 3.4). A few trends relating overall auditory sensitivity and behavior or ecology have been identified in the literature. The well-supported allometric model explains that smaller headed species with smaller interaural (between-ear) distances (such as strepsirrhines) particularly need to utilize high frequencies for sound localization (Sect. 3.4.3.1) (R. S. Heffner 2004). Overall auditory sensitivity, particularly to high frequencies, has also been related to increased sociality in lemurs (Sect. 3.4.3.2) (Ramsier et al. 2012a).

The lack of identification of additional broad trends and relationships may be partially attributed to the limited current dataset, which is lacking representation from major taxonomic subgroups. Additionally, or perhaps alternatively, order-wide trends may be minimal given the many possible reasons why enhanced or reduced sensitivity to low, mid, or high frequencies may be beneficial for different species. For example, small, nocturnal or insectivorous species, such as tarsiers and some strepsirrhines, may benefit from detecting the high-frequency signals of insect prey or by communicating in a high-frequency band that is less audible to potential avian or felid predators (Ramsier et al. 2012a, b; Zimmermann, Chap. 5). For species subject to intensive predation pressure, the reception of alarm calls may be particularly vital to survival (Arnold et al. 2008; Ramsier et al. 2012a). Alternatively, it may be more or less advantageous to detect the calls of infants, which tend to be particularly high in frequency, depending on a species’ body size, behavior, and ecology (Snowdon and Hausberger 1997; Pistorio et al. 2006; Ey et al. 2007). In some species, it may be particularly advantageous to be attuned to low frequencies, and this may be attributed to factors such as the detection of low-frequency acoustic cues or the occupation of forested environments (Brown and Waser, Chap. 4). Some species may also benefit from reduced sensitivity to certain frequencies, such as those produced by forest insects, to enhance the detection of other important sounds.

The above are just a few of the many facets of primate audition that need to be explored in more depth, not only through additional data on auditory sensitivity gathered via continually evaluated and refined methodologies but also through more data documenting habitat acoustics (especially in disturbed habitats), anthropogenic noise, and acoustic signals and cues present in the wild and in captive facilities. These data could be critical to the survival of the endangered primates for which little or no data on auditory sensitivity currently exist.