Introduction

The order in which flight and echolocation evolved in bats is contentious, with no clear consensus emerging from the limited fossil record or from molecular analyses of contemporary species. Onychonycteris finneyi, one of the oldest complete bat fossils, representing a species that lived approximately 52 mya, was clearly capable of flight, but its ability to echolocate is contested (Simmons et al. 2008, 2010). Initially, O. finneyi was thought to represent a nonecholocating volant ancestor to modern day microchiropterans based upon its relatively small cochleae (similar to those of nonecholocating extant Chiroptera) (Simmons et al. 2008). However, further analysis of this fossil using CT scans showed an apparent linkage between the stylohyal and tympanic bones, which is indicative of laryngeal echolocation in modern day bats (Veselka et al. 2010). Although this finding is intriguing, definitive determination of this connection is confounded by the somewhat poor condition of the flattened skull (Simmons et al. 2010).

Despite the advent of genetic analyses, a molecular signal that links all contemporary echolocating bats has proven elusive. However, these data have produced testable phylogenies of extant bats that offer alternative hypotheses of origins and diversification. Several researchers have proposed that echolocation evolved independently between two and four times in echolocating bats (Teeling et al. 2005). In addition, and perhaps most puzzling, such analyses group the most advanced bats with laryngeal emission, the Rhinolophidae, with pteropodids. This grouping is counterintuitive, especially in light of the significant differences in brains, skulls, jaw suspension, dentition, cranial vasculature, neuroacoustic systems, and flight musculature (see Pedersen and Timm 2012).

One avenue of molecular research that does provide fascinating insight into the convergent evolution of echolocation in mammals concerns the protein Prestin, which occurs in the membranes of the outer hair cells (OHCs) found on the basilar membrane (BM) of the cochlea. Prestin provides the electromotility of the OHCs and is thought to play a role in cochlear amplification, which confers frequency sensitivity and selectivity to the mammalian auditory system (Dallos 2008). By comparing the protein sequences in echolocating bats and cetaceans, Li et al. (2008, 2010) and Parker et al. (2013) showed that mutations cluster these two groups together even though sonar clearly evolved independently in each. As for bats, Li et al. (2008) united all laryngeal echolocating bat species into a single clade, providing significant support for the original suborder designations of Megachiroptera (Pteropodidae, typically nonecholocating) and Microchiroptera (all families of laryngeal echolocating bats) and refuting the more recently derived suborders Yangochiroptera (most laryngeal echolocating families except Rhinopomatidae, Rhinolophidae, Hipposideridae, and Megadermatidae) and Yinpterochiroptera (Pteropodidae, Rhinopomatidae, Rhinolophidae, Hipposideridae, and Megadermatidae) (Teeling et al. 2002). However, a more recent study has supported independent evolution of the hearing gene KCNQ4 in rhinolophids, potentially providing some support for the existence of the Yinpterochiroptera (Lui et al. 2011). Of course, both proteins are involved in the perception of high frequency sound, not in its production. These data may therefore muddy the waters.

What is almost entirely lacking in the literature on this topic is an ontogenetic approach to questions regarding the relationship between echolocation and flight evolution, and to how traceable heterochronic shifts in regulation during development act as historical markers of evolutionary change (Liem and Wake 1985; Müller 1990; Jablonka and Lamb 1998; Klingenberg 1998; Adams and Pedersen 2000; True and Haag 2001; Baguñà and Garcia-Fernández 2003; Minelli 2003; Young and Badyaev 2007; Cretekos et al. 2008; Dial et al. 2008).

In addition, investigations of the ontogenetic and evolutionary genesis of derived morphologies indicate that derived states are products of developmental events that are manifested downstream of those that establish the more generalized form of a given taxon. For example, positioning of the limb bones underneath the body is acquired in a stepwise manner in mammals, wherein newborn rats first crawl with their venter in contact with the surface. Initially, paw strikes are plantigrade with lateral bending of the spine, similar to what is expected in the first terrestrial vertebrates (Romer 1959; Williams 1981; Westerga and Gramsbergen 1990; Ischer and Ireland 2009). The evolution of highly derived morphologies required for more complicated behaviors such as jumping and climbing has also been shown to transition through fixed developmental sequences expected in ancestral forms (Ferron 1981; Eilam 1997; Eilam and Shefer 1997; Fischer et al. 2002; Lammers and German 2002; Witte et al. 2002; Schilling and Petrovitch 2006; Cretekos et al. 2008).

Furthermore, Cooper and Sears (2013) showed how a simple shift in the regulation of BMP, BMP2, and BMPS genes is responsible for finger elongation and wing development in bats. Adams and Shaw (2013) went on to relate the development of the musculoskeletal system, wing growth, and flight performance in juvenile bats to derive a compelling model for the evolution of flight in bats.

Herein, we use an ontogenetic approach to test the hypothesis that echolocation evolved before flight in bats. We predict that the ontogenetic emergence of echolocation will consistently precede the emergence of flight in all species studied to date, suggesting a fundamental ontogenetic pattern. In addition, morphology and physiology associated with high frequency sound production and hearing will also develop before the emergence of flight in all species studied.

Sound Production in Juvenile Bats

Gould (1971) provided the first in-depth description of the ontogeny of echolocation in bats using Eptesicus fuscus and Myotis lucifugus (Vespertilionidae). His study revealed several interesting patterns regarding the development of communication and echolocation calls with regard to locomotion. Short duration, frequency modulated (FM) calls are emitted by neonates soon after birth (day one in M. lucifugus and day six in E. fuscus) and before the onset of flight. The repetition rate of these calls is commensurate with levels of excitation, and different bat species may have modified different excitation indicator sounds to produce or evolve echolocation. Gould (1971) argued that “self-excitation,” such as walking or flying, results in the emission of vocalizations and that these vocalizations have been co-opted, both ontogenetically and evolutionarily, into a form of sonar. He argued that laryngeal output and locomotion are to some extent coupled, resulting in vocalization when the bat is crawling or flying. Interestingly, deafened bats emit echolocation calls at regular intervals when flying, indicating that locomotion promotes echolocation emission despite the inability to hear the echoes (Woolf 1973).

Other investigators have found similar results regarding the ontogeny of flight and echolocation in M. lucifugus and E. fuscus (Buchler 1980; Moss et al. 1997; Monroy et al. 2011). However, Gould (1971) concluded that, in E. fuscus, isolation calls mature into both echolocation calls and social calls. Interestingly, echolocation calls develop at a faster rate than the communication calls, indicating that the maturation of echolocation calls is decoupled from that of communication calls. The onset of production of echolocation-like sounds in the second week coincides with the time at which young bats begin crawling (Monroy et al. 2011), which is consistent with the findings of Gould (1971). Moss et al. (1997) also detected high frequency FM sonar-like calls from four-day-old M. lucifugus when they were dropped from a one-meter-high perch onto a padded surface.

Other vespertilionids exhibit similar ontogenetic patterns. Brown (1976) showed that seven-day-old Antrozous pallidus produce echolocation precursors while crawling. The emission rate of these echolocation-like sounds increases as young bats begin flexing and extending their wings and performing push-ups, presumably to strengthen flight muscles. Wang et al. (2014) showed that Myotis macrodactylus produce echolocation precursors during the first week of life, well before the onset of flight at 6 weeks. Hiryu and Riquimaroux (2011) showed that, in Pipistrellus abramus, echolocation-like calls also precede flight, with the former beginning during the second week and the latter during the fifth week. In two species of Tylonycteris, T. robustula and T. pachypus, echolocation precursors first appear between weeks one and two, and first flights start during week four (Zhang et al. 2005).
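The onset ages reported in the paragraph above can be tabulated to make the lead of vocalization over flight explicit. The following is only an illustrative sketch: the dictionary name, the function name, and the rounding of each reported age to a whole week are assumptions made here, not part of the cited studies. Where the text gives a range (weeks one to two for Tylonycteris), the later bound is used so that the lead time is not overstated.

```python
# Approximate onset ages in weeks, read from the preceding paragraph.
# Rounding to whole weeks and using the later bound of a reported range
# are simplifying assumptions made for this illustration only.
ONSET_WEEKS = {
    "Myotis macrodactylus": {"echolocation": 1, "flight": 6},
    "Pipistrellus abramus": {"echolocation": 2, "flight": 5},
    "Tylonycteris spp.": {"echolocation": 2, "flight": 4},
}


def lead_times(onsets):
    """Return weeks by which echolocation precursors precede first flight."""
    return {
        species: ages["flight"] - ages["echolocation"]
        for species, ages in onsets.items()
    }


for species, lead in lead_times(ONSET_WEEKS).items():
    print(f"{species}: echolocation precursors precede flight by about {lead} weeks")
```

A positive lead time for every species is what the prediction stated in the Introduction requires; a zero or negative value would count against it.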

In our lab, we have shown that adult-like echolocation is present in nonvolant young well before the onset of flight in two phyllostomids, Artibeus jamaicensis and Carollia perspicillata. By recording vocalizations of individuals as they dropped from a perch onto a padded surface, Carter et al. (2014) showed that one-day-old A. jamaicensis can emit echolocation-like calls that are very similar to those emitted by volant adults (Fig. 1). Although one-day-old bats are unable to fly, they exhibit intended wing movements (Adams and Shaw 2013) and echolocation behavior while in free fall. Week-old C. perspicillata emit echolocation-like calls when oriented toward an approaching target (Sterbing 2002) and while in free fall (Carter personal observation).

Fig. 1

Sonograms and oscillograms of communication calls (top row and third row down) and sonar calls (second row down and bottom row) recorded from Artibeus jamaicensis at different ages and flight development stages. Flight ability is defined as flop (nonvolant), flutter (nonvolant), flap (semi-volant), flight (volant), or adult (volant) based on drop tests. Not only are adult-like echolocation calls present at day 1, but echolocation calls do not develop from the longer duration communication calls (from Carter et al. 2014, with permission).

Echolocating species of the Yinpterochiroptera utilize high duty cycle calls, defined as sequences of intermittent calls in which ≥25 % of the time is occupied by the calls themselves (Fenton et al. 2012). These high duty cycle species emit calls that consist of either long constant frequency (CF) calls or CF components that may begin and/or end with short FM segments and are therefore referred to as narrowband calls. Despite the differences in echolocation call structure, emission rate, foraging strategy, and evolutionary history among high duty cycle echolocating bat species (Simmons et al. 1979), temporal relationships between the ontogeny of echolocation and locomotion are similar. For example, Rhinolophus rouxi first emits echolocation-like calls during the third week of life and begins to fly during the fourth week (Rübsamen 1987). During the third week, these high frequency calls are emitted through the nostrils while the pups crawl and when they start moving their heads from side to side to scan the environment. Rhinolophus ferrumquinum are inactive and relatively quiet during the first week of life, whereas during the second week they become more active and emit calls considered to be underdeveloped echolocation calls (Liu et al. 2007). By week three, young R. ferrumquinum have begun to fly short distances and emit echolocation calls similar to those of adults. Pteronotus parnellii (Mormoopidae) is the only species within the Yangochiroptera that produces high duty cycle calls, but it does so orally rather than nasally and likely evolved high duty cycle calls independently from those in the Yinpterochiroptera (Fenton et al. 2012). Vater et al. (2003) found that P. parnellii emit echolocation precursors between the first and second week of life and begin flying during the fourth week of life. During the first week, these echolocation-like calls can be elicited by moving the neonate through the air. During the second week, nonvolant activity and echolocation vocalization increase, and during the fourth week, flight and Doppler shift compensation begin. Noctilio albiventris (Noctilionidae) emit echolocation-like sounds almost from birth, although these are lower in frequency and repetition rate than those of adults (Brown et al. 1983). The emission of these echolocation precursors precedes flight by almost 5 weeks, whereas crawling behavior appears during the second week of life.
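The duty cycle criterion given at the start of this passage (calls occupying ≥25 % of the time) amounts to a simple ratio of summed call durations to the length of the recording window. The sketch below illustrates that arithmetic only. The function names and the call durations in the example are hypothetical and are not measurements from any cited study.

```python
HIGH_DUTY_CYCLE_THRESHOLD = 0.25  # threshold stated in the text (Fenton et al. 2012)


def duty_cycle(call_durations_ms, window_ms):
    """Fraction of the recording window occupied by emitted calls."""
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return sum(call_durations_ms) / window_ms


def is_high_duty_cycle(call_durations_ms, window_ms):
    """True when calls fill at least 25 % of the window."""
    return duty_cycle(call_durations_ms, window_ms) >= HIGH_DUTY_CYCLE_THRESHOLD


# Hypothetical sequences over a 1000 ms window (illustrative values only):
long_cf_calls = [50.0] * 10   # ten long narrowband calls
short_fm_calls = [3.0] * 10   # ten brief FM sweeps

print(duty_cycle(long_cf_calls, 1000.0), is_high_duty_cycle(long_cf_calls, 1000.0))
print(duty_cycle(short_fm_calls, 1000.0), is_high_duty_cycle(short_fm_calls, 1000.0))
```

Under these assumed values, the long narrowband sequence meets the high duty cycle criterion and the brief FM sequence does not, which mirrors the distinction between high and low duty cycle species drawn in the text.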

Mystacina tuberculata spend a significant amount of time foraging on the ground and so are ideal candidates for investigations into the link between walking locomotion and echolocation. Terrestriality is a secondarily derived condition, but is the novel hallmark of this taxon (Hand et al. 2009) that appeared in the fossil record 51–41 mya. Mystacina tuberculata emit echolocation calls at a significantly higher rate when walking than when stationary, and at a rate 120 % faster than when in flight (Parsons et al. 2010). The practical use of such a system to locate prey in leaf litter is questionable (Parsons et al. 2010), and the high emission rate may simply reflect the ancestral state. However, the selective pressures that drove the evolution of this taxon are more complicated than once assumed, thereby warranting considerable interest in understanding the ontogeny of flight and vocalization in this group (Hand et al. 2009).

Ontogeny of Echolocation Call Structure

Coincident with the development of vocalization are changes in call structure that exhibit similar patterns among taxa. Pups of all species studied exhibited multi-harmonic FM calls or multi-harmonic narrow bandwidth calls. In most cases, the echolocation precursors emitted by pups of low duty cycle species have lower fundamental frequencies, a lower call emission rate, and a narrower harmonic bandwidth compared to those of adults. These structurally different calls of neonates are emitted while crawling, performing intention-to-fly movements, and/or moving the head. The frequency of the fundamental, call emission rate, and harmonic bandwidth all increase as crawling becomes more vigorous and ultimately become adult-like as flight is achieved. The echolocation precursor sounds produced by rhinolophid pups are initially emitted orally and therefore also include the fundamental harmonic. Once nasal emission is achieved, several harmonics, including the fundamental, are suppressed and calls are limited to the second harmonic (Pedersen and Timm 2012). Regardless, nasally emitted echolocation-like sounds produced by nonvolant pups follow the same developmental pattern across species, in which the emission rate and the frequencies of the harmonics present increase with pup mobility.

In some species, it appears that echolocation calls develop from lower frequency, harmonically rich communication calls (Fanis and Jones 1995; Moss et al. 1997; Sterbing 2002; Wang et al. 2014), whereas in others, echolocation-like calls are already present at birth (Brown 1976; Brown et al. 1983; Jones et al. 1991; Vater et al. 2003; Knörschild et al. 2007; Liu et al. 2007; Jin et al. 2011; Monroy et al. 2011; Carter et al. 2014). Interestingly, the parts of the brain stem that control echolocation calls are very different from those that control the emission of isolation calls (Metzner and Schuller 2010), suggesting that echolocation and communication calls have independent developmental origins. If the development of echolocation calls is independent of that of social calls, the two likely have different evolutionary origins (Monroy et al. 2011; Carter et al. 2014).

Echolocation Call Production: the Larynx

With the exception of tongue clicking (Rousettus) and, potentially, wing clicking in some pteropodids (Eonycteris spelea, Cynopterus brachyotis, and Macroglossus sobrinus) (Boonman et al. 2014), echolocation calls are initially formed by the larynx (Griffin 1946). Variable tension on the vocal folds is achieved by the cricothyroid muscles, providing the different vibration frequencies required for tonal sound production. Many echolocating bats produce calls that sweep from high to low frequency by finely controlling laryngeal subglottic pressure and vocal fold tension (Fattu and Suthers 1981). Adaptations associated with laryngeal emission of echolocation include enlarged cricothyroid muscles and calcified or ossified laryngeal cartilages (Denny 1976) that apparently increase the tension on the vocal folds. However, Carter and Adams (2014) have shown that newborn A. jamaicensis are capable of echolocation-like calls (although at a slower emission rate and sweeping over a narrower bandwidth) with underdeveloped larynges (Figs. 1 and 2). Increases in the emission rate of echolocation by young bats correlate with increasing muscular demands on the larynx during growth and development. In addition, several species of rodents produce high frequency sounds without any apparent specializations of the larynx (Roberts 1974). Thus, the initial high frequency echolocation-like calls produced by nonvolant young bats are primitive in structure compared with adult bat calls and are produced by an underdeveloped larynx. As growth and development continue, changes in laryngeal morphology result in modifications of call structure, which eventually becomes adult-like in concert with the onset of sustained flight. It appears that in bats, the ontogenetic increase in sophistication of echolocation that accompanies the shift from a nonvolant crawling lifestyle to one dominated by flying is manifested as significantly faster emission rates with increased bandwidth, supported by a stronger, sturdier larynx.

Fig. 2

Laryngeal calcification in developing Artibeus jamaicensis (Alcian blue and Alizarin red stains). Graduations shown on the left sides represent 1 mm. The right lateral (top) and dorsal sides (bottom) of each larynx are shown for nonvolant (a), semi-volant (b), and volant (c) individuals. Red (dark) represents calcified cartilage and blue (light) represents uncalcified cartilage. Calcification begins on the posterior-superior regions of the cricoid. At no developmental stage do the thyroid or arytenoid cartilages show signs of calcification (from Carter and Adams 2014, with permission).

Ontogeny of Auditory Response

Unlike the temporal relationship between locomotion and sonar development, it appears that the emission of echolocation does not necessarily coincide with the ability to hear the returning echoes (Woolf 1973; Gould 1975). Newborns that do not exhibit neurophysiological activity in the inferior colliculus in response to sound are often very vocal, and bats that have been deaf their entire lives can emit echolocation calls similar to those of non-deafened bats (Woolf 1973). This ability of pups to vocalize before they are able to hear suggests that, at a fundamental level, some stimulus other than hearing, perhaps self-excitation, is responsible for triggering vocalization (Gould 1971; Rübsamen 1987).

The postnatal onset of hearing has been described for a handful of bat species, including A. pallidus (onset at 7 days) (Brown et al. 1978), M. velifer (onset at 2 days), P. parnellii (onset at 1 day) (Brown and Grinnell 1980), R. ferrumquinum (onset at day 7), M. oxygnathus (onset at day 10) (Konstantinov 1973), and C. perspicillata (onset at 1 day) (Sterbing 2002). In all cases, the auditory frequency range of young bats is lower than that of adults and in many cases corresponds to the frequency range of the communication calls and echolocation precursors produced by neonates. In high duty cycle species, the tuned frequency range of the cochlea corresponds to the energetically dominant harmonic of the call. As bats develop the capacity for producing adult-like echolocation calls, the range of frequencies heard expands accordingly. This developmental link between hearing and the changing frequency of echolocation calls suggests a functional link between these two systems, with one system driving change in the other. Interestingly, bats are born with relatively mature cochleae that already contain many of the structures associated with high frequency hearing (Figs. 3 and 4), meaning that the ontogenetic increase in hearing frequency range does not result from gross changes of the cochlea (Vater 2000; Vater and Kössl 2011; Carter and Adams 2015). In addition, many mammal species can hear high frequencies but cannot produce comparable sounds, suggesting that the ability to hear high frequency sounds is incidental to the capacity to produce them.

Fig. 3

Cross sections through the modiolus of cochleae from nonvolant (a) and volant (b) Artibeus jamaicensis at 40× (hematoxylin and eosin), showing similarities in cochlear morphology. The dotted line represents cochlea height, dashed-dotted line represents basal turn diameter, and the dashed line represents apical turn diameter. Primary spiral lamina (PSL) and secondary spiral lamina (SSL) are indicated with arrows. The first half turn is indicated with T1, the second with T2, third with T3, and fourth with T4. Cochlear dimensions, gross morphology, and number of turns are no different between nonvolant and volant individuals (from Carter and Adams 2015, with permission)

Fig. 4

Cross sections through the first half turn (T1) of cochleae from nonvolant (a) and volant (b) Artibeus jamaicensis at 400× (hematoxylin and eosin), showing similarities in basilar membrane (BM) structure. The dotted box surrounds the pars tecta and the dashed box surrounds the pars pectinata of the BM. PSL, SSL, and the tectorial membrane (TM) are indicated by arrows. BM structure and anchoring through PSL and SSL are no different between nonvolant and volant individuals (from Carter and Adams 2015, with permission).

Although bats often cannot hear at birth, the onset of hearing occurs within the first ten days of life, during a period of increased mobility and vocalization. Indeed, nonvolant young exhibit many of the morphological and behavioral requirements of a functioning echolocation system. These include emission of echolocation-like calls (e.g., Gould 1971; Brown 1976; Rübsamen 1987; Zhang et al. 2005; Carter et al. 2014) with cochleae and a central nervous system (CNS) that are sensitive to the returning high frequency echoes (Rübsamen 1987; Vater 2000; Sterbing 2002; Vater and Kössl 2011; Carter and Adams 2015). Furthermore, neonate P. parnellii have been shown to possess an auditory cortex with functional circuits capable of calculating distance based on temporal separation of pulse and echo (Kössl et al. 2012). Thus, it seems apparent that the ability to hear the higher frequencies associated with echolocation is present before the onset of flight in most species, and the same order likely holds for the evolutionary sequence of these adaptations in bats.

Evolutionary Implications

Developmentally, nascent echolocation precedes flight in nearly all bat species studied to date, therefore supporting our initial predictions and our hypothesis that echolocation evolved before flight. In most cases, these early echolocation-like calls develop into calls exhibiting adult structure and capacity by a disappearance of the lower frequency harmonics and an increase in frequency of the remaining high intensity harmonics. In low duty cycle species, there is also a decrease in call duration during development, which allows for the emission of more calls in a call sequence without masking of the returning echoes (Carter et al. 2014). Interestingly, in FM emitting species, echolocation call structure is maintained throughout development, as is the use of narrow bandwidth calls and CF calls by high duty cycle species. This means that adult FM echolocation does not develop (Fig. 1), nor did it evolve, from the long duration, harmonic-rich, narrow bandwidth communication calls (Brown 1976; Brown et al. 1983; Jones et al. 1991; Vater et al. 2003; Knörschild et al. 2007; Liu et al. 2007; Jin et al. 2011; Monroy et al. 2011; Carter et al. 2014). Interestingly, the emission of unstructured high frequency sounds by tenrecs, shrews, and rodents has been described as a by-product of locomotion or excitation level (Gould 1971; Thiessen and Kittrell 1979). Hemicentetes semispinosus, H. nigriceps, Suncus, Blarina, Setifer, and Tenrec all emit repetitive sounds that vary in rate with body movements such as extending the body, head raising, and intention to walk (Gould and Eisenberg 1966; Gould 1969). Repetition rates range from 2 to 16 pulses/s and all begin with a fast rise time, an important quality for sound localization (Gould 1971). 
Rodents also emit high frequency sounds in various contexts throughout their lives (Noirot 1972; Nyby and Whitney 1978), some of which appear to serve no communicatory role and are thus hypothesized to be by-products of locomotion (Thiessen and Kittrell 1979). Mongolian gerbils (Meriones unguiculatus) emit 90 % of high frequency sounds during a hop, suggesting that compression of the lungs forces air out through the larynx. In fact, there is considerable evidence that high frequency sound production and locomotion are associated across many rodent species (Blumberg 1992). This suggests that echolocation may have evolved from high frequency sounds that were initially a by-product of locomotion in early mammals, rather than from communication calls. It may also explain why the asymmetrical loads of walking produce a higher rate of echolocation emission than the symmetrical loads of flying in M. tuberculata. This scenario is at odds with the interesting hypothesis proposed by Boonman et al. (2014), in which wing-clicking fruit bats potentially represent behavioral fossils, suggesting that sophisticated echolocation evolved as a by-product of powered flight.

In addition, aspects of the cochlea and CNS exhibit morphology and functionality associated with the perception of returning echolocation echoes before the onset of flight. The interpretation of these ontogenetic data in an evolutionary framework suggests that echolocation preceded flight and was likely being used by the nonvolant ancestor to bats. This is supported by the use of echolocation in the shrew genera Sorex, Blarina, and Crocidura (Sales and Pye 1974; Buchler 1976; Tomasi 1979; Forsman and Malquist 1988; Siemers et al. 2009), which potentially share their ancestry with bats in basal laurasiatherians (Gunnell and Simmons 2005). In addition, high frequency hearing is thought to not only be an ancestral trait to all bats (Davies et al. 2013) but to many early and contemporary mammals (Meng and Fox 1995).

The coincidence of excitation level and high frequency sound production may have provided an exaptation for the eventual evolution of echolocation in bats. The evolutionary co-opting of sounds that originally served no communicatory purpose but were instead by-products of locomotion would also explain the possible ontogenetic decoupling of echolocation and communication calls seen in extant bats (Fig. 1). We feel the most parsimonious explanation is that bats inherited a primitive echolocation system that evolved in earlier insectivorans for nighttime terrestrial navigation and later became integrated with flight in what came to be the only true flying mammals, bats (Adams and Shaw 2013). Thus, developmental data on morphology, functional integration, and behavior, when compared with living and fossil groups of mammals, indicate a clear evolutionary pathway: high frequency hearing preceded the capacity for high frequency sound emission, which in turn preceded flight, and these sounds were subsequently refined into the complex echolocation and flight abilities of present day bats.