Introduction

Toothed whales and dolphins (Odontoceti) are often considered hearing specialists with highly derived auditory characteristics. They can detect a wide frequency range, from 150 to 180 kHz (Johnson 1967; Kastelein et al. 2002; Nachtigall et al. 2008; Mooney et al. 2012). This aids in hearing the broadband or high-frequency energy of many echolocation signals. The species tested process sounds rapidly, following individual clicks presented 2000 times/s, and demonstrate an integration time of 264 µs (Au et al. 1988; Mooney et al. 2006, 2011). Such processing may compensate for the speed of underwater sound. Odontocetes have also lost some characteristics of terrestrial mammals, such as the external pinna and instead, they receive sound through specialized regions of head and lower jaw.

Odontocete jaw hearing is well supported. The most parsimonious pathways for sound reception are through fat bodies associated with the lower jaw (Norris 1980). In addition to the masses of acoustic fat, sound reception is also likely influenced by the hollowed mandibles, air-filled pterygoid sinus, and the bony ear (tympanoperiotic) complex (Norris 1968; Ketten 1992). Yet even the most basic roles of these structures are relatively unknown, such as the locations of sound entry into the odontocete head. Most odontocete hearing studies have addressed bottlenose dolphins (Tursiops truncatus). With this species we have learned that the primary path for detecting echolocation signals is the lower jaw (Norris and Harvey 1974; Brill and Harder 1991). Further, there are regions of maximal sensitivity just anterior to the thinning pan bone region (Møhl et al. 1999). An additional area along the ventral midline showed a similar sensitivity, but the relative influence of sound conduction in one vs. both ears was not examined (Møhl et al. 1999). The locations of greatest sensitivity differed from the more posteriorly located areas of shortest response latency, indicating sound reception is complex in structures as broad and massive as an odontocete head. More recently, sound reception has been proposed to be partly frequency-based, with side areas that are better suited for lower communication-range signals, and anterior regions which convey higher-echolocation-range frequencies better (Popov et al. 2008).

Despite the advances of this work, a limitation is that studies have been traditionally focused on the bottlenose dolphin. As suggested by Cranford, “Any extrapolation from the results of work with Tursiops to other species should be undertaken with trepidation” (Cranford et al. 2008). Supporting this notion, a growing body of literature suggests that species vary in how sound is received. Initial work with belugas (Delphinapterus leucas) showed sensitivity not just from the pan bone region but also from the tip of the lower jaw (Mooney et al. 2008). A more complex assessment of the finless porpoise (Neophocaena asiaeorientalis asiaeorientalis) indicated that areas of sensitivity can be frequency dependent, with lower frequencies received better from the side and higher frequencies heard better from anterior parts of head (Mooney et al. 2014). Finally, recent modeling of sound pathways in a beaked whale (Ziphius cavirostris) suggests that there may be a region of particular sensitivity in the throat (“gular”) area, a region on the ventral midline (Cranford et al. 2008). This was not an area of maximal sensitivity in the finless porpoise or beluga. Further, at lower frequencies (e.g., 8 kHz), this ventral region was among the least sensitive areas. But, as noted above (Møhl et al. 1999), there is some indication that the bottlenose dolphin is sensitive in that area.

Overall, these variations suggest that how sound enters odontocete heads may vary by species. Yet with few comparative studies conducted, there is little understanding of the extent or magnitudes of this variation. Such information would not only address classic form-and-function questions, but also reveal how odontocete hearing structures may directly influence ecological needs such as the detection of prey or predators, communication, and navigation. Further, hearing pathways could influence a species’ sound sensitivity. Sound paths conduct, attenuate or amplify certain frequencies. Such conditions could affect how anthropogenic noise impacts anatomical and physiological structures, or how animals detect and respond to certain sounds. Finally, managers often apply data from “representative species” across odontocetes. Yet we have little understanding of which species may in fact be representative, or how biologically justifiable this practice may be.

To address these concerns, this work seeks to address how sound is received across the head of a relatively unique but accessible odontocete species, the Risso’s dolphin (Grampus griesus). Risso’s dolphins are a pelagic species of squid-eating odontocetes that are typically found in deep, temperate, and tropical waters near continental shelf edges and submarine canyons (Leatherwood et al. 1980). In contrast to the rounded melon of most delphinids, Risso’s dolphins have distinctive melons that are broad, somewhat square in profile, and creased by a characteristic longitudinal furrow or indentation extending down the melon to the top of the upper jaw. Their echolocation signals may be unusually oriented at a downward angle (Philips et al. 2003). Their lower jaw is also relatively shorter than that of bottlenose dolphins. While this probably reflects a foraging and prey-capture adaptation, the difference may also influence sound reception. There are only two published Risso’s audiograms, an older female with high-frequency hearing loss, and a neonate male with a sensitive and broad range of hearing, up to 150 kHz (Nachtigall et al. 1995, 2005). There are no published data on their sound reception pathways.

The overall goal of this work was to evaluate how sound is received by a Risso’s dolphin. Using a jawphone suction-cup transducer, click-sound stimuli were presented at specific locations on the animal’s head and lower jaw. Hearing thresholds were then measured for each stimulus location and the results were compared. These relative hearing levels were then compared to computed tomography scans of Risso’s dolphin specimens to place the relative hearing abilities in context with the species’ acoustic fat regions.

Materials and methods

The data collection was carried out at Farglory Ocean Park, Taiwan over a 2-week period in March and April, 2012. Hearing measurements were made using the auditory evoked potential (AEP) technique. This method is a well-established means to examine odontocete cetacean hearing rapidly, passively, and non-invasively (rev. Supin et al. 2001; Nachtigall et al. 2007; Mooney et al. 2012). The subject was a female Risso’s dolphin (Da Hwa). The animal was originally caught from the wild in the Japanese Taiji fishery (ca. end of 2003) and has resided at the site of the investigations since July 2004. No ototoxic drugs were used before this investigation took place. At the time of the experiments, the animal was approximately 15 years old, weighed 294 kg, and was 288 cm in length. The animal was housed in cement pools filled with local seawater, along with several bottlenose dolphins, T. truncatus. The bottlenose dolphins were guided out of the test pool to adjacent pools by the trainer before each experimental session. One-to-two test sessions were conducted per day. Hearing measurements were carried out using one of two formats: either the animal was in the water stationed at the side of the pool (for the baseline audiogram or tests addressing responses to pulses of different durations), or the animal voluntarily beached out of the water, resting entirely on the deck adjacent to the test pool (for hearing pathway studies).

Stimulus presentation, evoked potential recording, and baseline audiogram

Once the animal was properly oriented for the respective experiment, it was fitted with three custom-built silicone suction cups (KE1300T, Shin-Etsu, Tokyo, Japan) embedded with gold-platted electrodes (Grass Technologies, Warwick, RI, USA). A conductive electrode gel was used to enhance AEP signal collection (Signagel, Parker Laboratories, Fairfield, NJ). The active (non-inverting) electrode was placed along the midline of the animal 3–4 cm behind the blowhole, the reference electrode was placed on the dorsal fin of the animal, and the ground electrode was placed on the animal’s caudal peduncle. The electrodes were connected to a biological amplifier (CP511, Grass Technologies, Warwick, RI, USA) set to amplify responses 10,000-fold and bandpass filter them from 300 to 3000 Hz. This bioamplifier was connected to a Krohn-Hite filter (3B series, Brockton, MA) also set at 300–3000 Hz bandpass. The signal was then conducted to a BNC breakout box (2110, National Instruments Corporation, Austin, TX, USA) and a PCMCIA-6062E data acquisition card implemented in a laptop computer. A custom LabView program (National Instruments) converted the analog signal to a digital record at a 16-kHz sampling rate. One thousand sweeps were averaged per stimulus frequency and sound level. This averaged signal was then stored on the laptop for offline analysis.

Acoustic stimuli were created using the same custom LabView program, laptop, and data acquisition card. Outgoing signals were produced at a 512-kHz update rate. Signal amplitudes were controlled using a HP 350D attenuator and projected to the animal through a custom “jawphone”. This jawphone consisted of a Reson 4013 transducer (Slangerup, Denmark) implanted in a custom-built silicone suction cup. The jawphone was attached to the animal using the electrode gel to eliminate reflective air gaps between the cup and the animal’s skin.

For the audiogram, the jawphone was placed along the midline, near the lower jaw tip, allowing sound to travel equally to both ears (Fig. 1, position 2). Stimuli for the audiogram consisted of amplitude-modulated tones presented in 20 ms bouts at a 20 s−1 rate. Stimuli tested included 4, 8, 11.2, 16, 22.5, 32, 45, 54, 80, 100, 120, 128, and 150 kHz. The evoked response recordings began coincident with the stimulus presentation and were 30 ms in duration. Initial tone pips making up the 20 ms amplitude-modulated tone were of 1 ms duration.

Fig. 1
figure 1

Locations of jawphone placement: (1) Melon, (2) rostrum tip, (3) anterior throat, (4) posterior throat, (5) anterior right jaw, (6) posterior right jaw, (7) meatus, (8) flipper. The jawphone is located at the anterior right jaw in the image. The recording (non-inverting) electrode can be seen just behind the blowhole

Stimuli were presented 1000 times for each sound level and a corresponding response was collected for each sound presentation. These 1000 responses were averaged using the custom AEP software and stored for later data analyses. Start amplitude was predetermined using a level approximately 30 dB above the thresholds of the previous two Risso’s dolphin audiograms (Nachtigall et al. 1995, 2005). Sound levels were then increased or decreased in 5 or 10 dB steps depending upon the envelope following response (EFR) and respective fast Fourier transform (FFT) amplitudes that were visible on the custom AEP program. In order to maintain a good signal to noise ratio for the AEP recordings, durations for the higher frequency stimuli (>11.2 kHz) were limited in the cycles per modulation which allowed some frequency spreading around the center tone (Table 1). Thus, as carrier frequency increased, pip duration decreased. Spectrum bandwidths were not increased beyond ±10 kHz or a ±0.25 octave range. The effects of this were noted in the “Results” and stemmed a short experiment on pip-duration, described below. This enhanced potential response amplitudes but did not significantly impact thresholds (Supin and Popov 2007). Yet, this decrease in stimulus duration influences the calculation of stimulus dB rms values. Thus, the audiogram was calculated in two ways: the dB rms values were calculated for the entire 20 ms pip train (including the silent periods) and dB thresholds calculated by including the duration of the pips only (i.e., excluding the silent periods). The former was the preferred method of calculation for pip train stimuli because it followed similar methods, had a negligible effect on threshold calculations, and the odontocete auditory integration time is long enough to sum the stimulus energy of the individual pips (Johnson 1968; Supin and Popov 2007). Thresholds were determined offline by fast Fourier transforming a 16-ms (256 point) portion of the EFR. The spectra were plotted relative to their respective sound pressure level (SPL; dB re: 1 µPa rms; Fig. 2). A regression line was then fitted to the peak values at the modulation rate, and the point where the regression crossed zero was taken as the threshold (e.g., Nachtigall et al. 2005, 2007).

Table 1 Stimulus parameters and signal duration
Fig. 2
figure 2

Evoked potential measurements for 16 kHz. a AEP responses at 90, 85, 80, 75 and 70 dB re 1 µPa for the jawphone stimuli (from top to bottom). b Fast Fourier transform of 16 ms of the respective AEP responses. c The peak value of the FFT at the 1-kHz modulation rate and respective stimuli levels. The open symbols and line show a best-fit regression addressing the points. The crossing threshold was 72 dB re 1 µPa SPL using the jawphone stimulus

Jawphone stimuli were calibrated in the water at the test facility before the experiment using the same sounds as in the hearing tests. While calibration measurements were in the free- and far-fields, it is acknowledged that jawphone-presented stimuli were not received by the animal in this manner. However, this calibration allows for some comparisons with how sounds may be received in the far-field while recognizing the differences between free-field and contact transducer measurements (Cook et al. 2006; Finneran and Houser 2006). Received measurements were made using a Reson 4013 transducer. The jawphone projector and receiver were placed 50 cm apart at 1 m depth. This distance was used in part because it can be compared with measures of other jawphone-measured odontocetes and because the location of the jawphone was similar to the expected distance from dolphin’s jaw tip to the ears. The received signals were viewed on an oscilloscope (Tektronix TPS 2014, Beaverton, OR, USA) and the peak-to-peak voltages (V p–p) were measured. From these values sound pressure levels were calculated (SPL, dBp-p re 1 μPa) as is standard to measure odontocete click intensities due to the inherent brevity of the signals (Au 1993).

Duration- and location-based thresholds

Two additional experiments were used to address how the Risso’s dolphin heard pulsed sounds. The first addressed responses to pulses of increasing duration. Pulses were generated using the custom Labview program by creating signals with the number of cycles varying from 1 to 10, as well as 25 and 50 cycle pips. This experimental set was gathered using two center frequencies: 54 and 100 kHz. The designed duration of these pulses was from 10 to 500 µs (see calibrations below). The AEP responses with methods and analyses windows as noted above. Response amplitudes from a constant peak-to-peak sound level were compared (116 dB for 54 kHz and 140 dB for 100 kHz). The durations were varied to evaluate relative response differences between short, broadband pulses, similar to dolphin clicks, and longer duration, narrower pulses with more cycles. An initial goal of this experiment was to provide a comparison to the duration-varied pulses used to increase AEP signal-to-noise values (see “Results”). This experiment also helped address energy-based sound detection in dolphins and porpoises. The latter uses longer, narrower-band pulses similar to their own echolocation signals and has better hearing thresholds for these sounds (Mooney et al. 2011). But it was uncertain whether this was a feature of specialized porpoises or an auditory trait found in multiple odontocete species including dolphins. In addition to comparing response amplitudes, thresholds were collected and compared at four of these durations (1, 2, 5, and 10 cycles).

To calibrate these signal parameters the stimuli were also recorded using the custom data acquisition program. Sound records were sampled at a rate of 512 kHz and stored as a mean of ten stimuli. From these recorded files and the dBp–p it was possible to calculate and compare the energy flux density of the pulses (dB re 1 µPa2•s), a valuable metric of short signals which vary in duration (Madsen 2005). A FFT of the waveform revealed the spectra of the recorded pulses and helped confirm the center frequency of each pulse type (following Au 1993; Madsen and Wahlberg 2007). Pulses were centered at 100 ± 3 and 54 ± 3 kHz, with the exception of the 100 kHz 3 cycle pulse, which was centered at 93 kHz. Pulse durations were measured from the recorded files and characterized as time between two points at which the wave oscillations rose from and descended into the background noise (Au 1993; Li et al. 2005). Durations for 100 kHz pulses generated with 1 through 10 cycles were 28, 30, 41, 43, 57, 59, 65, 78, 84, and 91 µs, respectively. The 54-kHz signal durations included 26, 31, 42, 44, 55, 59, 66, 79, 85, and 94 µs, respectively (see Mooney et al. 2011). Durations were 252 and 498, and 250 and 501 for the 25- and 50-cycle pulses of the 54- and 100-kHz tones, respectively.

To measure location-based receiving sensitivity, the jawphone was attached at nine specific locations on the animal’s head and body. These included the melon, lower jaw tip, anterior throat (gular), posterior throat, lower jaw fat pad (left and right), anterior lower jaw, meatus, and flipper. The first eight locations were used to ‘map’ the relative thresholds of the animal across its head as well as compare left vs. right sensitivity. The flipper presentations were used as a control. Stimuli were broadband clicks centered at 100 kHz and 31 µs duration, presented in 20 ms bouts at a 20 s−1 rate. The relative thresholds for multiple species were compared for the melon, lower jaw tip, anterior throat, anterior, right jaw, posterior right jaw, and meatus. Species included the Risso’s dolphin, finless porpoise, and bottlenose dolphin. Thresholds were assessed using a K-means clustering analysis (using SPSS) and a one-way ANOVA followed by a Tukey’s pairwise posthoc test.

CT scanning and 3D modeling

Scanning and related anatomical assessments were carried out using a separate mature, 2.83 m male Risso’s dolphin specimen. It was found stranded alive along the north coast of Taipei in November 2011. The dolphin died after 1 day in a temporary rehabilitation pool. A CT scan was performed before carrying out a gross necropsy. A scan of the head and thorax was completed using a 64-section multidetector CT unit (LightSpeed VCT, GE Healthcare). Images were acquired in the transaxial plane (i.e., at right angles to the long axis of the body) and helically by rotating an X-ray source of 120 kV at 320 mA. A total of 800 transverse slices at 0.625 mm thickness were collected, with a matrix size of 512 × 512 and a field of view of 30 × 30 cm. These parameters yielded voxel dimensions of 0.9 × 0.9 × 3.0 mm. Segmentations (i.e., assigning pixels to particular structures), 3-D reconstructions, and volume calculations were conducted using the software program OsiriX 4.1.2 64-bit version (Rosset et al. 2004). Anatomical structures were identified using a head atlas of the bottlenose dolphin (Houser et al. 2004) and beaked whale, Ziphius cavirostris (Cranford et al. 2008). Briefly, 3-D reconstructions were completed for the skull, mandibles, brain, tympanoperiotic complex, outer core of the mandibular fat body, inner core of the mandibular fat body, and cranial air spaces, which consisted of the nasal passages and laryngeal air, pterygoid sinus and peribullary sinus. Segmentations were completed by applying a threshold of tissue density values [represented by Hounsfield units (HU)] that defined each anatomical structure. For example, the values for the inner core of the mandibular fat body ranged from −139 to −91 HU, whereas values for the outer core of the mandibular fat body ranged from −90 to 10 HU. This thresholding procedure was followed by visual inspection and manual editing to ensure that structures were properly defined.

Results

The audiogram

In order to increase AEP signal-to-noise ratios, the stimuli durations were decreased at the upper frequencies. This slightly increased the stimuli spectrum bandwidths compared to pure tones (Table 1) which had an overall effect of proportionally increasing AEP amplitudes (see also below) and improving the threshold detections for the adjusted frequencies (see Supin and Popov 2007). But relatively greater stimuli bandwidths actually occurred at the lower frequencies which were not shortened (≤11.2 kHz).

Overall, the Risso’s dolphin showed generally good hearing abilities. The open triangles represent the thresholds if dB rms values are calculated for the entire 20 ms pip train (including the silent periods; Fig. 3). The closed triangles reflect the threshold calculated by including the duration of the pips only (i.e., excluding the silent periods). Hearing thresholds were most sensitive at 11.2 kHz and between 40 and 80 kHz. There was a slight notch in the sensitivity at 32 kHz. Thresholds increased rapidly above 100 kHz reflecting a decrease in sensitivity at higher frequencies. Responses were detectable and thresholds were measured up to 128 kHz (107 dB), the limit of hearing in this animal. No responses were detected at 150 kHz. Low-frequency thresholds increased gradually below 11.2 kHz.

Fig. 3
figure 3

Audiogram for this Risso’s dolphin and previously measured animals. a The audiogram of this animal. Stimulus shortened above 11 kHz to improve signal detection. Jawphone sound levels were calculated using dB rms values re 1 µPa incorporating the whole 20 ms (open triangles) or only the sections that contained the tone pip (closed triangles). b A comparison of the data from the Risso’s dolphin if this study and the two previously measured animals. The data measured here largely overlap the earlier data [adapted from (Nachtigall et al. 1995, 2005)]

Pulse duration

While bioelectrical noise levels in the measurement location were not unusually high, noise rms values were clearly not as low as in a more controlled laboratory setting (Mooney et al. 2009) or areas without electrical noise interference (Castellote et al. 2014). It was immediately apparent that shortening the duration of the stimulus tended to result in higher amplitude AEPs and lower thresholds (these data and Supin and Popov 2007). In order to quantify the influence of changing signal duration on AEP amplitude we compared reduced AEP amplitudes to the number of cycles (and thus the duration) of the stimulus. This was conducted using 54 and 100 kHz centered signals.

Evoked potential amplitudes were dependent upon the duration of the sound stimulus (Fig. 4). This was most evident by examining the peak values of a 16-ms portion of the AEP signal (Fig. 4b). At lower numbers of cycles (1–3) the peak FFT level was relatively flat. Levels then dropped when cycles were increased to 4–10. The peak amplitudes, while still consistently visible at this constant output level, remained level out to 50 cycles per pip. This was the limit of what was tested and essentially what would fit into a 54-kHz tone-pip of 1 ms duration. The AEP decrease with cycle increase showed a relatively strong negative relationship (r 2 = 0.7135; p < 0.05) using a power function to predict the trend (y = 17.137x −0.385).

Fig. 4
figure 4

Evoked potential response amplitudes based upon number of cycles in a stimulus. a AEP waveforms for a signal centered at 54 kHz with 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, and 50 cycles (only odd number of cycles are labeled for conciseness). b The FFT spectra for the waveforms in a. The peak at 1 kHz reflects the relative following response of the stimulus. c The 1-kHz FFT peak levels as a function of number of cycles showing a decrease in response amplitude as number of cycles increases (y = 17.137x −0.385; r 2 = 0.7135)

Thresholds were calculated for 1, 2, 5, and 10-cycle stimuli at both test frequencies (Fig. 5). Both showed similar relationships. Thresholds were highest with signals of 1 and 10 cycles and conversely, lowest at 2 and 5 cycles. For 54 kHz the threshold drops as a function of the number of cycles up to 5, then increases again at 10 cycles. The lowest threshold was found at 2 cycles when using 100 kHz.

Fig. 5
figure 5

Thresholds relative to the number of cycles in a signal. Stimuli were centered at 54 and 100 kHz using 1, 2, 5 and 10 cycles in the waveform. Thresholds (in dB re 1 µPa) were relatively elevated with fewer and higher numbers of cycles

Relative sensitivities

The overall goal of the work was to address the likely hearing pathways across the head of the Risso’s dolphin. Leveraging the above audiogram and stimulus cycle data, thresholds were measured using a pulse centered at 100 kHz and testing nine different stimulus points. The lowest thresholds overall were found with the jawphone placed at the lower jaw tip and the anterior throat position (right side), 95 and 93 dB, respectively (Fig. 6a). The posterior left jaw showed a similar threshold level. The posterior right jaw was relatively and surprisingly elevated compared to the jaw tip and anterior throat jaw locations. Yet, it was more sensitive than the posterior throat and anterior right jaw. These two locations had sensitivities similar to the jawphone placement over the meatus (107 dB). The flipper (control) did not produce any apparent AEP responses. The melon placement demonstrated the highest threshold, 128 dB.

Fig. 6
figure 6

Thresholds based upon the location of the jawphone presented stimuli. a Click-based AEP response thresholds for the Risso’s dolphin. The lowest thresholds were produced from jawphone-presented clicks at the lower jaw tip and anterior throat locations (located on the animal’s midline), and the posterior left jaw (dB re 1 µPa). b Comparisons of similarly tested animals using click data only plotted in increase in sound levels for each location (presenting similar locations only). Zero dB reflects the lowest thresholds for that particular animal (see also Møhl et al. 1999) and higher numbers represent the dB value for respectively higher thresholds at the other locations tested. Data are plotted in SPL, dB re 1 µPa. Data are shown from a bottlenose dolphin (Møhl et al. 1999); beluga (Mooney et al. 2008); and finless porpoise (Mooney et al. 2014) in addition to the Risso’s dolphin measured here

When location data were pooled across species, mean thresholds from the melon were significantly elevated (by +23.2 dB) compared to the anterior right jaw (+5.7 dB) and anterior throat (+2.4 dB) (one-way ANOVA F 7,34 = 2.83; p < 0.05; Tukey’s pairwise posthoc test). However, most thresholds were not significantly different between the other locations reflecting the substantial between-species variation at each location and few general location-based trends. A cluster analysis of the location-based thresholds suggested that both the Risso’s dolphin and finless porpoise cluster in similarity, whereas the bottlenose dolphin settled in a separate cluster. In other words, some species differed in where they were most sensitive, whereas others were similar. While this generally reflects similarities based upon the shape of the head and telescoping of the mandible, predicting these trends might be difficult without comparative data from additional species.

The results are particularly intriguing when compared to CT 3-D reconstructions of Risso’s dolphin specimens. Lowest thresholds were found along the anterior midline and over the acoustic fat bodies of the lower jaw (Fig. 6). Thresholds were elevated at the anterior portions of these fat bodies, when not on the midline. They were also elevated posterior to the acoustic fat regions, even when very close to the bulla complex (i.e., over the external auditory meatus).

Discussion

Audiograms

Generally, the audiogram was similar to those of previously measured Risso’s dolphins (Nachtigall et al. 1995, 2005). The two previous animals included a neonate with sensitive high-frequency hearing (above 22.5 kHz) and an older female with some high-frequency hearing loss. These higher frequency thresholds of the neonate were consistently below those of the animal measured here— sometimes substantially such as at 32 kHz (25–30 dB). But more often the thresholds were within 5–10 dB of the thresholds for the animal tested here. The neonate and this animal showed a similar frequency range of consistently sensitive hearing from 45 to 80 kHz. At 11 kHz and below, the thresholds of this animal and those of the neonate were comparable. Further, both were elevated relative to the behavioral measurements of the older Risso’s dolphin tested. The similarity of the two AEP-tested animals and the tendency of AEP measurements to be elevated at lower frequencies suggest that the difference between these data and those measured behaviorally tested might be due to the methods applied.

The thresholds of the animal measured here were often generally low (below 70 dB) from 11 to 100 kHz and might be considered relatively sensitive compared to other odontocetes (Mooney et al. 2012). However, the hearing thresholds were often slightly higher than the two previously tested Risso’s dolphins, suggesting either masking during these tests or some degree of hearing loss. Masking of low frequencies might have occurred from the general background noise in the pools (pumps, filters, etc. …) or a result of the physiological noise levels of the animal which could mask the AEP responses, increasing the thresholds across frequencies. The hearing abilities of this animal were similar to those found in bottlenose dolphins tested with similar metrics (Houser and Finneran 2006; Houser et al. 2008). There was a slight but distinct notch in the audiogram at 32 kHz. Because the method is considered to be robust in terms of reducing threshold variability (Supin and Popov 2007), it is likely that this notch may be reflective of the actual sensitivity.

Responses were measured up to 128 kHz, with good hearing (<60 dB) up to 100 kHz. No responses were detected at 150 kHz and sound levels of 130–140 dB. However, the upward trend of the high-frequency region clearly suggests the tests had reached the high-frequency cut-off. This high-frequency hearing extended beyond the range of a previously measured Risso’s hearing abilities (Nachtigall et al. 1995), but was not quite as sensitive, or as high as the previously tested neonate Risso’s dolphin’s (Nachtigall et al. 2005).

Durations and relative sensitivities

Shorter duration signals clearly influence evoked response amplitudes (at higher sound levels) but this does not directly translate to lower thresholds. Figure 4c illustrates that the peak response levels are highly dependent upon the stimulus duration. With the shorter pulses, there is an inherently broader spectrum. This likely induces stimulation of a greater range of hair cells along the basilar membrane and such a response is indicative of the higher AEP peak levels. While the shortest, 1 cycle pulses, result in the highest thresholds, pulses slightly longer in duration allowed for lower thresholds. As noted elsewhere (Mooney et al. 2011), the elevated thresholds with 1 cycle pulses was likely a result of the lower energy flux density in these shorter pulses and less energy available for detection. Thus, while some tuning of stimulus duration can improve threshold detection without actually changing the threshold value calculated; hyper-shortening signal duration has the opposite effect.

The anterior midline and the locations most-directly above the mandibular fat bodies provided the lowest thresholds. As suggested in previous studies, thresholds from this midline location are probably influenced by sound being received by both ears (Mooney et al. 2008). But notably, the anterior midline locations (throat and jaw tip) were more than 10 dB more sensitive than the posterior throat point. At the very least, this suggests these anterior areas conduct sound to the ears more efficiently than the posterior locations. It is possible that these locations vibrate the mandible itself, causing a more exaggerated vibration of the fat bodies adjacent to the mandibles. But we would expect this vibration would be minimal when using sound levels near threshold (as was the case here) and thus would translate to minimal influence on the results. These relatively low thresholds also provide some empirical support for the throat hearing pathway suggested in prior work (Cranford et al. 2008), at least for the Risso’s dolphin. Similar areas have been tested in other species. In porpoises, this area was not particularly sensitive (Mooney et al. 2014; Fig. 6b). However, while the region was not distinguished, the ventral midline area was an area of relatively good hearing for the bottlenose dolphin (Møhl et al. 1999). But the data from bottlenose dolphin and this animal were confined to click-only stimuli. For the Risso’s dolphin, ventral hearing sensitivity matches suggestions of a downward oriented echolocation beam (Philips et al. 2003). Clicks that were projected downward would likely reflect off prey items and hearing well from below would maximize hearing efficiency of the returning echo energy. However, if the throat pathway does predominate in some species, it might mean they might be “monoaural” listeners with little difference in level or time of arrival at each ear.

It is hard to say if these anterior midline locations are more sensitive than the mandibular jaw bodies, in part because this study found differences in the thresholds of the left and right thresholds. Unfortunately, the left jaw measurements were made at the end of the experiment and further measurements were not able to be collected. Left and right ear differences are not uncommon for humans, but animal measurements are relatively rare. Predominant hearing loss in one ear has been noted for both a male and a female bottlenose dolphin (Brill et al. 2001). The female showed relatively minor differences (3–6 dB) while the male demonstrated substantial hearing loss in both ears. Between ears, differences were ca. 30 dB at some frequencies. While the differences here are not as striking, they were 8 dB or more than twofold in intensity and were noted on an animal considered to be of normal health without any history of auditory irregularities. It would be intriguing to compare thresholds at multiple frequencies. Further, noting these left–right differences in multiple animals and across species suggests they can be found relatively often, if not on a “normal” basis. In pinnipeds, otitis media infections have caused hearing impairments in just one ear (Ketten et al. 2011). If binaural hearing differences is common in odontocetes, this suggests jawphone studies might test left and right ear differences before conducting hearing tests, and midline placements might provide reasonable alternative when situations (i.e., strandings) do not permit these tests. It also questions where the amount of hearing loss noted elsewhere (Mann et al. 2010) is binaural or simply a function of primarily measuring from one ear.

The relative similarities of the anterior midline locations and the mean posterior jaw locations are also interesting. When the thresholds are binned into 10 dB groups (i.e., 90–99, 100–109 dB, etc. …) there is a suggestive pathway of sound reception (i.e., green dots, Fig. 7d–f) for the Risso’s dolphin. These data suggest that sound in the anterior midline points may not only be conducted to both ears but conveyed on both sides of the intra and extra mandibular fat regions. The relative density of the mandibular bone (its impedance mismatch between bone and fat) of the anterior jaw area may limit intramadibular sound conduction. But sound entering more posteriorly, where the bone thins [i.e., the acoustic window, or pan bone area (Norris 1968)] may allow better sound transmission. This would then support multiple sound pathways to the ear. However, the sound must still be conducted through multiple changes in a medium (fat to bone to fat), a major challenge to this panbone hearing theory (Norris 1968). This perhaps lends support to the throat hearing hypothesis (Cranford et al. 2008).

Fig. 7
figure 7

Thresholds relative to anatomy. (a, d) show a side view, (b, e) are an offset angle from below, and (c, f) show directly below. Some substantial differences were found across the head of the Risso’s dolphin. Some of the lowest hearing thresholds were found along the ventral midline. Differences were also noted between the left and right side placements of the jawphone. Numbers in subplots (a–c) are the dB SPL level of the click threshold at that location (re 1 µPa). The gray “91 dB” circles in a and c refer to the posterior left jaw measurement. Color dots reflect the same jawphone placement but binning the thresholds into 10 dB increments to illustrate the relatively sensitivity across the head

Such pathways have more often been tested invasively or modeled for the bottlenose dolphin and modeled for some beaked whales (Bullock et al. 1968; Cranford et al. 2008, 2013). Comparisons of modeling to these in vivo data would provide an intriguing examination of both methods. But it is necessary to combine both methods using multiple species to begin to resolve whether new data such as these are applicable across species, or whether different species receive sound in different ways. In either case, this growing breadth of received sensitivity data seems to reflect differences between species and a limitation to the use of representative species. For example, ‘which species is representative?’ The taxa tested to date (bottlenose dolphin, Yangtze finless porpoise, beluga and Risso’s dolphin) vary widely in social structure, foraging strategies and soundscapes. Their differences in hearing pathways likely influence (and are perhaps shaped by) their acoustic behavior and ecology. As ocean noise increases, differences in sound reception may also influence noise impacts. Establishing how sounds are received becomes vital as we seek to understand acoustic ecologies, evaluate the differences between species, and seek to mitigate potential noise influences.