
2.1 Introduction

This chapter is an attempt to reconcile interpretations of the structures of fossil mammalian middle ears with what is known about the development, anatomy, and physiology of modern mammalian and nonmammalian ears. As Bennett and Ruben (1986) wrote: “It is obviously difficult to ascertain physiological characters from dead animals. It is even more difficult to infer those characters from fossilized animals” (p. 207). In spite of these truisms, it is possible, when taking all known paleontological, developmental, anatomical, and physiological data into account and observing the traditional rules pertaining to the interpretations of each set of data, to come to a consistent view of the changes in structure and function of the hearing of mammals over geological time. Detailed overviews of the structure and physiology of amniote middle ears already exist (see, e.g., Rosowski, Chap. 3 and Rosowski 1994).

The term middle ear applies to any structure that improves the transmission of sound energy between a conductive medium outside the body and the inner ear. Strictly speaking, the term could be applied where water or air is the conductive medium, thus also in certain kinds of fishes, even though they swim in a medium whose acoustic impedance is essentially the same as that of the inner-ear fluids. In those animals, the presence of a gas-filled swim bladder creates an interface within the body where there is a large change in acoustic impedance, and stronger acoustic vibrations occur at that interface. Connecting the inner ear to this interface, as with the Weberian ossicles in certain fish groups, greatly improves sensitivity to water-borne sound (Ladich and Popper 2004) and fulfills the definition of a middle ear. In the present discourse, however, coverage is restricted to the middle ears of land vertebrates.

The emergence of vertebrate animals onto the land was, without doubt, one of the most far-reaching events in evolution. As so often in science, early concepts of this “event” have had to be strongly modified in the face of newer evidence. For example, examination of the first fossils of this period led early to a number of dogmata that have since been shown to be false. One example is the idea that the earliest vertebrates transitional to the amphibians were at least partially land-living and possessed pentadactile, or five-toed, appendages. It has since been shown that limbs, as opposed to fins, in fact developed in water-living animals, limbs that were presumably used to move around more easily among water plants and that these animals possessed more than five toes on their appendages (Coates and Clack 1990; Clack 2009). Another dogma, which is very relevant to our understanding of middle ears, is that vertebrates developed a tympanic (or eardrum-bearing) middle ear at the time of the water-to-land transition and that all subsequent vertebrates inherited this kind of middle ear and modified it accordingly. In fact, the history of hearing in land vertebrates is, at least for the first half of their evolutionary story, much more varied than expected. As described later, most lacked a tympanic middle ear and were presumably “hard-of-hearing.”

A second “auditory” dogma has also fallen victim to the clarity that has emerged from newer fossils. The mammalian middle ear did not emerge by the addition of two more ossicles to an existing, one-ossicle middle ear, for the simple reason that mammalian ancestors, like all other vertebrate lineages of those late Permian-early Triassic times, lacked a tympanic middle ear. These and other issues are the topics briefly discussed in the text that follows.

2.2 The Water–Land Transition and Early Attempts at Middle Ears

It is not the intention of this chapter to go deeply into paleontological issues, but of course the history of land vertebrate middle ears is being discussed and—besides comparisons between modern lineages—fossils are the main source of information. Older textbooks reiterate the story that developed from the early descriptions in Paleozoic amphibians of a deep notch in the back of the skull that, among the various changes to sensory organs that were necessary when vertebrates emerged on to land, was assumed to be the start of the evolution of a tympanic, impedance-matching middle ear. Air-borne sound reflects strongly from a surface with a higher impedance and this development would have improved hearing sensitivity by at least 40 dB compared to the absence of such a middle ear (Manley 2011; Puria and Steele 2008). As it turns out, however, although there is evidence of some highly interesting innovations for hearing in air and water in early fish (e.g., Clack et al. 2003; Clack and Allin 2004; Brazeau and Ahlberg 2006), none of these innovations survived very long or they were found only in lineages that themselves died out. Reinterpretation of some early fossils led to the conclusion that at least some of the skull notches interpreted as tympana instead housed a spiracle, an open passage for water between the buccal cavity and the outside world (e.g., Clack 2002). For the best part of 100 million years (Ma) after vertebrates emerged onto land, fossil indications of a tympanic middle ear are scattered and provide no evidence for the early development of a middle ear that was inherited by all later forms.
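The impedance argument above can be illustrated with a minimal numerical sketch. It uses the standard plane-wave formula for power transmission across a flat boundary at normal incidence and treats the inner-ear fluids as plain water. The density and sound-speed values are textbook approximations assumed here, not values taken from the cited sources. Under these assumptions the loss is roughly 30 dB, the same order of magnitude as the figure quoted above; the sketch is not the basis of that figure.

```python
import math

# Assumed textbook values (not from the chapter): characteristic acoustic
# impedance rho * c, in Pa*s/m, for air and for water near room temperature.
Z_AIR = 1.2 * 343.0         # about 4.1e2
Z_WATER = 1000.0 * 1480.0   # about 1.5e6


def power_transmission(z1: float, z2: float) -> float:
    """Fraction of incident sound power that crosses a flat boundary
    between media of impedance z1 and z2 at normal incidence."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


def transmission_loss_db(z1: float, z2: float) -> float:
    """Loss in decibels implied by the power transmission fraction."""
    return -10.0 * math.log10(power_transmission(z1, z2))


if __name__ == "__main__":
    fraction = power_transmission(Z_AIR, Z_WATER)
    loss = transmission_loss_db(Z_AIR, Z_WATER)
    print(f"transmitted power fraction: {fraction:.5f}")
    print(f"loss without impedance matching: {loss:.1f} dB")
```

The formula is symmetric in the two media, so the same loss applies to sound passing from water into air.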

2.3 Middle Ears Developed Late in Evolution and Many Times Independently

Over the course of land vertebrate evolution, several kinds of tympanic middle ears developed, only to be lost again or to disappear with lineages that died out. Some forms in the late Carboniferous (310 Ma; e.g., Clack 2002) and late Permian (265 Ma; Müller and Tsuji 2007) show evidence of possessing a middle ear, but died out during, for example, the great extinction event of the Permian-Triassic, at the transition from the Paleozoic to the Mesozoic. Until the beginning of the Triassic (~250 Ma ago) the majority of land vertebrate lineages showed no history of a tympanic middle ear (Clack and Allin 2004). During the Triassic period, probably over a period of tens of millions of years, however, all lineages of tetrapods that survive until today developed a tympanic middle ear, and all did so independently of each other (Clack and Allin 2004; Manley and Clack 2004). Although the skeletal elements that were used to create these middle ears were common to all groups, the formation of these elements into a functional tympanic middle ear was independent in all cases, as it has been shown that their respective ancestors did not have a middle ear and presumably heard only louder, lower frequencies (e.g., Kemp 2007).

The aforementioned conclusions mean that the middle ears of amphibians, of archosaurs (birds and their crocodilian relatives), of lepidosaurs (tuataras, lizards, and snakes), and of mammals do not have a common ancestry, although their individual components do. The independent emergence of middle ears and the scattered attempts at middle ears in earlier vertebrate history were possible thanks to an amazing flexibility in development provided by a cell type unique to vertebrate animals, the neural crest cells (see Sect. 2.6). A close look at the middle ear of amphibians shows clearly that, among middle ears, it is unusual (Smotherman and Narins 2004). Among other interesting features, there is a unique linkage in the columellar system such that, in contrast to all other middle ear systems, when the eardrum is pushed inwards, the columellar footplate is pulled outwards. In spite of their independent origins, the middle ears of mammals and nonmammals share important features in individual development or ontogeny (see Sect. 2.6). The mammalian middle ear is, of course, the only one that uses three ossicles to connect the eardrum to the inner ear, and the above discussion makes clear that it developed de novo and was not an “improvement” on a preexisting, single-ossicle middle ear (Manley 2010). In fact, as shown later, it also arose multiply and independently within several related groups of early mammals, some of which did not survive until modern times.

2.4 The Single-Ossicle Middle Ear of Archosaurs and Lepidosaurs

In these two groups, as also perhaps in the others, a change in jaw-movement patterns during evolution led to adjustments in the structures bracing the jaws against the rest of the skull. For our purposes, the most important change was that the columella (“stapes”) bone lost its most important function. At that time, it was a substantial skeletal element that had until Triassic times braced the rear part of the outer skull (specifically the quadrate bone, later to become the incus in mammals) against the braincase. The columella thinned greatly and changed its orientation, the outer end migrating dorsally, where an eardrum evolved and connected to the columella via a new extension, the extracolumella. This apparatus lay directly behind the skull, above and behind the jaw joint. Thus in these lineages, the changes in skull and head structure necessary to evolve a tympanic middle ear were not very great, as the columella-stapes had always connected on its inner end to the bones surrounding the inner ear at a location that later became the oval window. It has been suggested that the relatively massive columella-stapes bones of the amniote ancestors might have worked as an inertial system (Manley 1973, based on Hotton 1959). Thus head vibration caused by low-frequency sound or ground vibrations might have been accompanied by a delay in the movement of the (large) stapes, which would have vibrated out-of-phase with the rest of the head and thus provided a stimulus to the inner ear.

There has, in the past, been considerable confusion in the literature with regard to the performance of the ears of mammals and nonmammals, also with regard to their middle ears. Earlier, the multiple-ossicle middle ear was considered to be responsible for the fact that mammals heard “better” than nonmammals, “better,” however, generally not being clearly defined (Masterton et al. 1969; Taylor 1969). The middle ear of nonmammals was supposed to be inferior to that of mammals, and this idea was based partly on the belief that (supposedly) mammals added two ossicles to a preexisting middle ear and this presumably would not have happened if it had not led to an improvement in performance. We now know that in fact the mammalian three-ossicle middle ear evolved de novo (see later) and thus the relationship between the two types of middle ear must be discussed quite independently of any assumptions of “improvement.” All three mechanisms that are used by the three-ossicle middle ear to match impedances (area ratio between the eardrum and the footplate, lever ratio between the malleus and incus “arms,” and the curved-membrane lever system) are also all found in single-ossicle middle ears (Manley 1972; Fig. 2.1). The only difference is that, in contrast to the primary lever system of mammals, the single-ossicle system uses a secondary lever along the extracolumella–columella system (Fig. 2.1a). The “performance” at the level of the eardrum is equivalent (Fig. 2.1c), but above about 4 kHz, the secondary lever system is less efficient at passing along the stimulus, resulting in an increasingly large loss at the footplate for the higher frequencies. This is, however, at least partly due to an increase in inner-ear impedance at higher frequencies (Manley 1972). In the guinea pig, there is also a dramatic decrease in middle ear performance at frequencies exceeding those processed by the inner ear (Manley and Johnstone 1974).
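How the three matching mechanisms combine can be sketched with the idealized transformer model commonly used in textbooks: the pressure gain is the product of the eardrum-to-footplate area ratio, the ossicular lever ratio, and any additional factor from the curved-membrane lever. The function name, parameter names, and the numbers in the usage lines are hypothetical and chosen only for illustration; they are not measurements from the studies cited above, and the model ignores frequency dependence, mass, stiffness, and friction.

```python
import math


def ideal_pressure_gain_db(area_eardrum: float,
                           area_footplate: float,
                           lever_ratio: float,
                           membrane_factor: float = 1.0) -> float:
    """Idealized, lossless pressure gain of a middle ear in decibels.

    area_eardrum, area_footplate: effective areas in the same units.
    lever_ratio: force gain of the ossicular (or extracolumellar) lever.
    membrane_factor: extra gain attributed to the curved-membrane lever
        (1.0 means this mechanism is ignored).
    """
    values = (area_eardrum, area_footplate, lever_ratio, membrane_factor)
    if min(values) <= 0:
        raise ValueError("all parameters must be positive")
    gain = (area_eardrum / area_footplate) * lever_ratio * membrane_factor
    # Pressure is an amplitude quantity, hence 20 * log10.
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    # Hypothetical illustrative values, not data from any species.
    print(f"{ideal_pressure_gain_db(20.0, 1.0, 1.3):.1f} dB")
    print(f"{ideal_pressure_gain_db(20.0, 1.0, 1.3, membrane_factor=2.0):.1f} dB")
```

Because the three factors simply multiply, the model does not distinguish a single-ossicle from a three-ossicle ear. This matches the point made above that both types rely on the same mechanisms.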

Fig. 2.1

Schematic representation of middle ear function, comparing (a) nonmammalian amniote and (b) mammalian middle ears, both in (c). In both cases, a diagram of the lever system involved is shown, with the capital letters corresponding to the positions of force application (A, idealized to the middle of the eardrum), load (B), and fulcrum (C). The axis of rotation is shown as a circle around the fulcrum. The necessity for transforming a rotation of the extracolumella in the nonmammalian middle ear into a piston-like movement of the columella is enabled by a flexible joint between the extracolumella and the columella. The amplitude and force at the eardrum (longer black arrow) is changed by the lever into a smaller amplitude and greater force at the footplate of the columella/stapes (shorter but wider black arrow). (c) Comparison of the displacement amplitudes of the middle of the eardrum in (continuous line) the Tokay gecko and (dashed line) the guinea pig over the same frequency range and using the same apparatus for stimuli at 100 dB SPL. In both cases, the outer ear was driven by a closed sound system. Although these are similar measurement conditions, the relative amplitudes may be influenced by the different impedance conditions on the inside of the eardrum (opened mouth floor in the gecko, open bulla condition in the guinea pig) (Partially after Manley 2011; Tokay gecko data from Manley 1972; guinea pig data from Manley and Johnstone 1974)

Manley (1973), comparing the inner and middle ears of mammals and nonmammals, came to the conclusion that in general, the mammalian ear was superior to that of nonmammals only with respect to its frequency-hearing range. Generally, but not always, the upper frequency range of hearing in mammals is higher or much higher—leaving aside new evidence for ultrasonic hearing in frogs (Feng et al. 2006) and an upper frequency limit in lizards of 14 kHz (Manley and Kraus 2010). The upper frequency limits of inner and middle ears in all species have apparently coevolved and, despite earlier concepts to the contrary, the upper frequency limit of the middle ear does not alone determine the upper limit of hearing. Instead, middle ear performance also depends on the frequency range “accepted” by the inner ear. Above the highest frequencies of the inner-ear receptor, the impedance of the inner ear rises and this influences the upper limit of the middle ear (Manley 1972). The discussion concerning the relative importance of inner and middle ears regarding the shape of the audiogram has more recently been extended and strengthened by Hemilä et al. (1995) and Ruggero and Temchin (2002). A discussion of the evolution of the mammalian middle and inner ears must be carried out fully free of preconceptions of “better” or “poorer” and concentrated on the status of inner and middle ears during the fascinating evolutionary innovations of the Triassic period.

2.5 The Origins of Mammalian Middle Ears

The title of this section is couched in the plural to emphasize that the mammalian type of three-ossicle middle ear originated several times, perhaps indeed many times. Modern (extant) mammals are divided into three groups: the placental (eutherian), marsupial (metatherian), and egg-laying monotreme mammals. Placentals and marsupials together are termed therian mammals. Before the origin of true mammals in the late Triassic (Lucas and Luo 1993), the ancestral synapsid “reptiles” had already developed some features that are considered uniquely mammalian. Indeed, the features that today are considered as mammalian (some of which were present in now-extinct nonmammals) arose over a very long period of time: there was no “big bang” origin for mammals. One of the first of the features typical of mammals (but that had its origin in the lineage well before true mammals arose) is a heterodont set of teeth, which indicated a substantial change in diet. This change in diet was accompanied by a coordinated series of changes in the muscles that moved the jaws and the bones that made up the lower jaw. The lower jaw progressively became simplified, from originally seven bones to one single bone, the dentary, which was later part of a new, secondary jaw joint. All of the jaw muscles thus became attached to the dentary, a process that involved migration of the muscle–tendon attachments. The final stage brought forth a jaw suitable for chewing, correlated with the processing of food in the mouth cavity, rather than the typical nonmammalian bite-and-swallow technique. Detailed, comparative examination of individual development in nonmammals and mammals strongly supports the ideas generated from paleontological evidence and indicates that changes in the genetic control of the ontogenetic processes that led to the jaw-joint and middle ear components could gradually re-mold this region of the head (see Sect. 2.6).

A further, parallel, development was the growth of a bony plate, the secondary palate, separating the mouth from the nasal cavity. This structural feature is also, with the exception of its independent evolution in crocodilians, uniquely mammalian and arose more or less parallel to the loss of the primary jaw joint (Carroll 1988). The secondary palate prevented food particles entering the nasal cavity and thus permitted uninterrupted breathing during chewing. This innovation permitted mammals to begin the masticatory and enzymatic digestive processes in the mouth itself. It has been suggested that this palate, along with other changes, would also have played an important role in separating the middle ears of mammals from each other and from the mouth cavity, thus leading to the loss of a previously existing pressure-gradient receiver system (Christensen-Dalsgaard 2010; Manley 2010). A reinterpretation of the evidence indicates, however, that the immediate ancestors of mammals did not in fact have a tympanic middle ear, and thus had no pressure-gradient receiver that they could lose.

Thus the immediate ancestors of true mammals had changed their jaw construction and eliminated six bones from the lower jaw, making it more stable. During the transition period from a primary to a secondary jaw joint (the latter between the squamosal in the upper jaw and the dentary), species with a double jaw joint existed. The primary jaw joint was gradually eliminated because its lower-jaw component, the articular bone, which connected to the upper-jaw quadrate, was moved medial to and out of the lower jaw. The secondary jaw joint evolved lateral to the primary joint, and contemporary species such as Diarthrognathus used both joints simultaneously (Allin and Hopson 1992). With time, the old joint moved deeper and entered the middle ear while retaining a connection to the lower jaw over a long period of time. There is a general consensus that the mammalian middle ear, including its eardrum, evolved at a completely different location from that of the single-ossicle middle ear (e.g., Allin 1986). Instead of directly behind the head, the tympanum originated near the rear end of the lower jaw, over those bones that were in transition out of the jaw and into the middle ear. The angular bone of the lower jaw became known as the ectotympanic, and grew into a circular support for the eardrum; the articular became the malleus. The incus originated from the upper-jaw quadrate. This series of events was, in basic form, elucidated long ago by Reichert (1837) and later Gaupp (1912) and provided an early and very convincing case of evolutionary transformation of function. Since then, this research area has been enormously enriched by new fossil material but has not been free of controversy. Some authors suggested, for example, that early mammals had a double middle ear, with two tympana, or that the early tympana were perhaps also sound-producing, rather than only sound-absorbing organs (see, e.g., Allin 1986). Maier (1990), however, considered it unlikely that early mammals had anything other than a single tympanum behind the lower jaw.

The three ossicles of the mammalian middle ear evolved independently at least three times. In monotremes, for example, the jaw depressor muscles and thus the relative placements of middle ear structures, differ from the therian situation, indicating independent evolutionary acquisition (Rich et al. 2005). In therian mammals, the three ossicles of the middle ear did not suddenly detach from the lower jaw and become freely suspended in a middle ear space. Although middle ear spaces are difficult to find in early mammals, it is obvious that the malleus, in particular, remained attached to the inside of the lower jaw via an ossified Meckel’s cartilage (a remnant of the embryonic lower jaw of vertebrates). This condition is considered as an intermediate stage in the evolution of freely suspended ossicles and persisted for a remarkably long time (transitional mammalian middle ear [TMME]; Allin and Hopson 1992; Figs. 2.2 and 2.3). This morphological stage can be seen in a very similar form today in embryonic monotreme (egg-laying) mammals, as the ossicles in modern monotreme mammals separate fully from the lower jaw only around the time of hatching (Luo 2007) but remain very stiff throughout life (Aitkin and Johnstone 1972).

Fig. 2.2

Different morphological states of mammalian middle ears, illustrating the transitions from a mandibular middle ear (MME, a, d) to the transitional mammalian middle ear (TMME, b, e) and finally to the definitive mammalian middle ear (DMME, c, f, g). (a–c) Medial views of the mandibles of the fossil species Morganucodon (early Jurassic) and Liaoconodon (early Cretaceous) and a generalized modern therian mammal, showing the relationship with the ossified Meckel’s cartilage (in yellow, absent in modern adult mammals) and the ear ossicles (see color coding). (d–g) Ventral views of the ear regions in Morganucodon, Liaoconodon, Ornithorhynchus (modern Platypus), and Didelphis (modern marsupial), illustrating the relationship of the ossified Meckel’s cartilage, ear ossicles, the dentary bone, and the nearby cranium. The black arrow in (e) points to the external auditory meatus, the red arrow to the gap between the ossicles and the inside of the dentary (From Meng et al. 2011. Reprinted by permission from Macmillan Publishers Ltd. Nature 472, 181–185, copyright 2011.)

Fig. 2.3

A schematic diagram of the events occurring in the middle ears of mammalian lineages over 230 million years of evolutionary time. The time scale is to the left, in millions of years before the present. The four main geological time intervals shown (the Triassic, Jurassic, and Cretaceous periods of the Mesozoic era, and the Cenozoic era) are color-coded. Dashed lines indicate the approximate times of origin and extinction of the various lineages. Only placental, marsupial, and monotreme lineages survived to modern times. Small boxes enclose time blocks during which major events occurred or important fossil finds indicate the acquisition of new features, as shown in the appropriate labels

In some very early mammalian groups, such as the genus Morganucodon of the early Jurassic, this condition prevailed. In a study of the Morganucodon middle ear, Rosowski and Graybeal (1991) came to the conclusion that it was so stiff that it very likely best transmitted higher frequencies to the inner ear. This correlated with the analysis by Masterton et al. (1969) of mammalian hearing, in which they speculated that the earliest mammals perhaps heard only high frequencies. However, since 1991, new fossil finds of Morganucodon indicate that Rosowski and Graybeal’s specimens were distorted and in fact the ossicles were not so confined as they thought (Hurum 1998), which influences any functional interpretation. In Morganucodon, the cochlear canal was straight and less than 3 mm in length (Graybeal et al. 1989). Aitkin and Johnstone (1972) studied the middle ear of the Echidna or “spiny anteater” Tachyglossus aculeatus of Australia and showed that, although its best frequency was at 6 kHz, it had a very low upper frequency limit near 14 kHz and was about 20 dB less sensitive than the middle ears of other mammals and, indeed, than those of lizards. Hearing in the related Platypus Ornithorhynchus anatinus is very similar (Gates et al. 1974). Rosowski (1992) suggested that Morganucodon had an audiogram similar to that of modern monotreme mammals. If anything, Morganucodon is more related to modern monotremes than to therians (Fig. 2.3).

Studies of fossil middle ears with a view to understanding their frequency response are, however, bound to be a very difficult and inconclusive enterprise because the frequency response of the middle ear is strongly influenced by what the inner ear can process (Manley 1973; Ruggero and Temchin 2002). What is very clear, however, is the fact that from the early beginnings of mammalian middle ears, during which the malleus was still connected to the lower jaw, it took something like 100 Ma before the ancestors of placental and marsupial mammals had ossicles that were freely suspended in the middle ear. There is evidence that this free suspension did occur in some other early forms (Hadrocodium; Luo et al. 2001; Martin and Luo 2005; Fig. 2.3) but either it never happened, or it was not sustained, in the ancestors of the later-evolving therian mammals. Meng et al. (2011), following Allin and Hopson (1992), define a “transitional” middle ear seen, for example, in Liaoconodon, an early Cretaceous (~120 Ma) mammal. We cite their definition of the TMME to illustrate the state of these structures almost 100 Ma after the origin of mammals. Meng et al. (2011) wrote: “The TMME can be characterized by several features: the articular, prearticular and angular lose their direct contact with the dentary (thus called the malleus and ectotympanic) and are supported anteriorly by a persistent Meckel’s cartilage, but not by cranial structures, in adult; the malleo-incudal articulation is hinge-like and lost its primary function for jaw suspension; all ear ossicles are primarily auditory structures but are not completely free from the feeding effect; the tympanic membrane is not fully suspended by the ectotympanic, and the manubrium of the malleus has not developed” (p. 184).
In contrast to this situation, the direct ancestors of placental and marsupial mammals, which split from each other about 100 Ma after the actual origin of the group Mammalia, possessed what has been termed the “definitive mammalian middle ear,” or DMME. As seen in modern representatives, in which all ossicles are freely suspended, the malleus had developed a manubrium and other middle ear structures as per the aforementioned TMME definition. Apart from its evolution in the lineage leading to placental and marsupial mammals, this DMME evolved independently (i.e., is homoplastic; Martin and Luo 2005; Rich et al. 2005) in monotreme mammals (as indicated by the structural relationships; Rich et al. 2005) and in their earlier relatives, the multituberculates (Fig. 2.3), and perhaps also independently in other related lineages that did not survive until today. Even late Cretaceous multituberculates had cochleae that were only approximately 6 mm long, with some evidence of low-frequency hearing (Luo and Ketten 1991).

2.6 Middle Ear Development in the Ontogeny of Mammals and Nonmammals

The historical evidence from fossils has been extended and corroborated by comparative developmental studies, more recently using general genetic and gene misexpression techniques. The homologies between bones of the ancestral jaw and of the mammalian middle ear have been confirmed through these studies, which traced the cellular origin of each structure and the genetic control of its development.

The development of the middle ear during embryogenesis involves complex morphogenetic processes and reciprocal interactions between mesenchymal (mesodermal) and epithelial (both ectodermal and endodermal) cells (Fig. 2.4). During middle ear evolution, the process of homeosis, that is, the transformation of one body part into another due to mutation(s) in, or altered expression of, specific developmentally critical genes, has been centrally important. In the course of middle ear development, both the origin of the cells and the local signaling environment mutually specify cell fates and thus the final identity of the forming structures. All three germ layers contribute to the formation of middle ear elements, and the foundation for these elements is laid by hindbrain-derived neural crest cells that migrate into the embryo’s branchial (“gill”) arches (Fig. 2.5a). Neural crest cells are unique to vertebrates and originate along the lateral edges of the developing central neural tube. Branchial cleft (“gill slit”) ectoderm, pharyngeal pouch (mouth-throat cavity) endoderm, and branchial arch (“gill support”) mesenchyme together give rise to the middle ear. The “gill” structures are truly homologous to fish gill arches and slits and the arches form the ventral component of the vertebrate skull known as the viscerocranium.

Fig. 2.4

Embryonic development of the middle ear as exemplified by a single-ossicle avian ear (chicken). (a–c) Schematics of transverse sections through the middle ear region at different stages of development between embryonic day (E) 6.5 and E10. (a) At E6.5, middle ear ossicles such as the columella (Co) and the forming otic capsule (Oc) are visible as condensing mesenchymal cells (green clusters). (b) Morphogenesis appears as chondrogenesis of the otic capsule (Oc) and primordial columella (Co) takes place. From the outside, the embryo’s ectoderm invaginates to form the external auditory meatus (EAM). At the same time the pharyngeal pouch (P) extends and establishes the middle ear cavity (Mc). (c) At E10, before endochondral ossification takes place, the columella morphology differentiates, and the processes of the distal extracolumella (eCo) extend into the mesenchyme. Toward the inner ear (Ie) the columella footplate inserts into the oval window, which has formed by separation of the cartilage of columella (Co) and otic capsule (Oc). After narrowing, the descending connection of the middle ear cavity (Mc) communicates with the pharynx via the remaining Eustachian tube (e). a artery, Co columella, E embryonic day, e Eustachian tube, EAM external auditory meatus, eCo extracolumella, Ie inner ear, Mc middle ear cavity, Oc otic capsule, P pharyngeal pouch, v vein

Fig. 2.5

Developmental origin of vertebrate middle ear structures. (a–e) Vertebrate embryo head region, lateral view. (a) Neural crest streams (arrows) originating from hindbrain rhombomeres (r) migrate into the branchial arches (I, II), where they give rise to middle ear structures. Expression domains of Hox (Hoxa2) (a) and Dlx (such as Dlx5/6) (d) genes influence the migrating neural crest cells and thus pattern the branchial arches. (b) Broad aberrant rostral expansion of the Hoxa2 domain causing misguided r4-derived neural crest cells to invade first branchial arch. (c) Missing Hoxa2 expression domain leading to neural crest cells from r1 and r2 entering and populating the second branchial arch. (d) Nested expression pattern (different densities of green color) of vertebrate Dlx gene pairs (Dlx1/2, Dlx5/6, Dlx3/7). (e) Missing Dlx5/6 expression domain in Dlx5/6 −/− double mutant mice. (a'–c', e') Schematics of middle ear structures resulting from expression patterns a–c, e. (a') Wild type (WT) mouse (Modified after Mallo 2001). (b') Chicken. Hoxa2 overexpression causes reduction of first branchial arch structures such as the quadrate (Qu) leaving an unarticulated rudiment (rudQu) and supernumerary second arch structures (asterisk) (After data from Grammatopoulos et al. 2000). (c') Hoxa2 mutant mice (Hoxa2 −/−) display homeotic and mirror image transformation of first arch derived middle ear structures (Gendron-Maguire et al. 1993; Rijli et al. 1993; modified after Mallo 2001). (e') Dlx5/6 −/− double mutant mice lack distal branchial arch elements but duplicate structures (asterisk) derived from the proximal branchial arch (maxillary), such as the incus. The hyomandibular stapes, lacking a foramen (mS), is associated with ectopic cartilages (After data from Depew et al. 2002). Co columella, I, II branchial arches, I incus, M malleus, mS malformed stapes, Oc otic capsule, Qu quadrate, r1, r2, r4 hindbrain rhombomeres, rudQu rudimental quadrate, S stapes, Sq squamosal, St styloid process, Tr tympanic ring, WT wild type, arrows neural crest streams, * supernumerary structures resulting from gene misexpression

The development of the eardrum (tympanic membrane) clearly reveals this assemblage of different tissues, as it is composed of epithelia of the branchial arch and of the pouch, with mesenchymal cells sandwiched between (Chin et al. 1997; Mallo et al. 2000). From the outside, invagination of the first pharyngeal cleft surface ectoderm creates the external auditory meatus, whereas the medially lying tympanic cavity results from expansion of the pharyngeal pouch. Thus the entire epithelium of the mature middle ear cavity is of endodermal origin. The middle ear cavity later communicates with the pharynx via the auditory (Eustachian) tube (Fig. 2.4), the latter being a narrow extension of the pharyngeal pouch (Jaskoll and Maderson 1978) that permits air-pressure equalization within the middle ear space.

The segmental structure of the embryonic head region, as reflected in the serial hindbrain segments known as rhombomeres and in the embryonic branchial arches, was fundamental to the evolution of the vertebrate viscerocranium. During development, distinct homeobox (Hox) gene expression domains in the hindbrain rhombomeres and in the branchial arches establish segmental identity. Segmental identity in turn provides separate, novel regulatory clusters of gene-interaction cascades, and this allows the differentiation of visceral arch cartilage into, for example, jaws, as well as the derived middle ear ossicles. Hox-dependent branchial arch identity becomes obvious after experimental alteration of Hox gene expression, as this results in homeotic transformations of branchial arch derivatives, swapping tissue fates between first and second arches (see later for details).

Early in development, neural crest cells originating from dorsal neural tube rhombomeres migrate in so-called streams laterally and ventrally, populate the branchial arches and give rise to the middle ear (Koentges and Lumsden 1996). The segmental anterior–posterior organization of the midbrain–hindbrain regions that is set up by homeobox genes allows tracing the origin of middle ear structures to sequential neural crest streams. Rhombomeres r1 and r2 (numbered from anterior) form the mandibular neural crest stream into the first branchial arch and give rise to the incus (maxillary part of the branchial arch) and the malleus (mandibular part of the arch) of mammals. In nonmammals, these same neural crest streams give rise to the articular and quadrate bones of the skull—the primary jaw joint. Derivatives from the second branchial arch stream, originating from rhombomere r4, give rise to the stapes or the columella in mammals and nonmammals, respectively. As shown by tissue transplantation experiments, neural crest cells and branchial arch endoderm mutually specify each other by reciprocal interactions. When neural crest cells from the hindbrain are ectopically introduced into a branchial arch environment, they lose their original (Hox) gene expression and take on the fate dictated by their new environment. Conversely, transplantation of larger portions of neural crest results in the cells keeping their original identity (Trainor and Krumlauf 2000; Schilling et al. 2001). These neural crest cells instruct gene expression in the surrounding cells and influence patterning and growth of the branchial arch (Noden 1991; Schneider and Helms 2003).

Pluripotent migrating neural crest cells are thus a prerequisite for the evolution of the middle ear. In birds, the application of retinoic acid to the hindbrain can induce the loss of the columella and ectopic formation of a retroarticular process cartilage in the first branchial arch (equivalent to the lateral process of the mammalian malleus [O’Gorman 2005]). Rhombomere r4–derived neural crest cells are, in this situation, misguided into the first branchial arch, concomitantly with the expression of Hoxa2 extending into the first arch (which normally has no endogenous Hox gene expression; Fig. 2.5b). Thus, after migrating cells encounter an altered Hox gene expression and, subsequently, changed local signals, the first branchial arch cartilage transforms into second arch derivatives (Plant et al. 2000). Broad misexpression of Hoxa2 in the hindbrain and in the branchial arch leads to duplication of tongue skeleton (also viscerocranium) elements as a result of transformation of Meckel’s cartilage (forerunner of the lower jaw) and of the quadrate (Grammatopoulos et al. 2000) (Fig. 2.5b'). Loss of Hoxa2 in mutant mice mimics first branchial arch identity and misguides neural crest cells ectopically into the second branchial arch (Fig. 2.5c), resulting in a mirror-image duplication of first arch derivatives such as malleus and incus in the second branchial arch (Fig. 2.5c').

The evolution of a different set of homeobox genes, the Dlx genes, was crucial for neural crest patterning. Whereas vertebrates exhibit expression of three pairs of Dlx genes (Dlx1/2, Dlx5/6, Dlx3/7), responsible for the elaborate proximodistal patterning of branchial arches, the ancient cephalochordate relative of vertebrates, Branchiostoma, has only a single copy of Dlx expressed in the epidermis and nervous system. Thus, probably via gene duplication, vertebrates extended and diversified the Dlx expression domain. Dlx2, for example, is expressed in the surface ectoderm as well as in the neural crest–derived mesenchyme and plays a key role in ectoderm–mesenchyme interaction. Mutation of this gene leads to malformation of the incus and stapes (Qiu et al. 1995). Further candidate genes for locally operating signals that mediate epithelial–mesenchymal interactions include Endothelin1 and Fgf8, as both, when mutated, result in the absence of the first arch-derived malleus and incus (reviewed in Mallo 2001). Growing evidence supports the interpretation that the multiplication of Dlx genes was also crucial for the evolution of jaw diversification, as an independent regulatory system for upper and lower jaw structures became available, allowing the exploitation of new feeding niches (Koentges and Matsuoka 2002).

Early in development, initial cues from the branchial endoderm impose a first proximodistal patterning axis onto the arriving neural crest cells, leading to a nested expression pattern of Dlx genes. This nested Dlx expression (Fig. 2.5d), however, appears to be a novelty of jawed vertebrates, as lampreys (jawless relatives) exhibit the full range of Dlx genes, but not in a nested expression pattern (Shigetani et al. 2002). The gene pair Dlx5/6 controls distal branchial arch identities and is implicated in the elaboration of the lower jaws, as a double mutation of Dlx5 and Dlx6 (and thus failure to form this expression domain, Fig. 2.5e) leads to a transformation of the lower jaw into an upper jaw. In the forming middle ear, loss of distal branchial arch elements due to Dlx5/6 knock-down is concomitant with a duplication of structures originally derived from the proximal branchial arch, such as the incus (Depew et al. 2002) (Fig. 2.5e').

Middle ear ossicles undergo endochondral ossification; thus their development is visible as foci of mesenchymal condensation within the embryonic branchial arches, the sites where cartilage is generated (Fig. 2.4). A single condensation gives rise to both the malleus and incus (Hall and Miyake 1995). Separation of these ossicle primordia from each other appears to be influenced by the gene Eya1; hence mice mutant for this gene display different forms of malleus–incus fusions (Xu et al. 1999). First and second branchial arch mesenchyme are each characterized by expression of specific genes, such as Ptx1, Six2, Lhx6, Alx, and Bapx1 in the first arch and Msx1 in the second (reviewed in Chapman 2011). Loss of distal identity also leads to subsequent down-regulation of second arch expressors such as Wnt5a (Depew et al. 2002). Although the expression of a plethora of Wnt-related genes, most prominently Wnt11, Fzd9, Frzb, and SFRP2, in the developing middle ear has been described (Sienknecht and Fekete 2008), their functional interactions remain to be elucidated.

Developing middle ear ossicles influence the positioning of the outer ear meatus and tympanic membrane, as well as, at the other end, the positioning of the oval window on the inner ear (reviewed in Mallo 2001). Vice versa, the otic capsule contributes to the columellar footplate (Jaskoll and Maderson 1978). Thus, evolution and development of the vertebrate middle ear appear to be a playground of various gene-interaction scenarios that determine structure–function relationships.

Comparisons of the embryonic development of different extant mammalian groups (monotremes, marsupials, and placentals) present sufficient variation to exemplify the phylogenetic separation of the middle ear from the jaw. In all mammalian groups, similar morphogenetic processes contribute to this separation of the middle ear, but to a varying extent. These processes are (1) embryonic displacement of middle ear anlagen (in the medial and posterior direction); (2) negative allometry of middle ear structures (i.e., early ossification of middle ear components, thus limiting their relative increase in size); and (3) developmental resorption of Meckel’s cartilage, leading to separation of the middle ear and achievement of the DMME status (reviewed in Luo 2011).

In summary:

  1. There is a common developmental origin for the primary jaw joint and the middle ear elements in all amniotes.

  2. These elements arise through processes controlled by common gene patterning and developmental pathways.

  3. Modifications in the number of genes and in their temporal and spatial expression during development and over evolutionary time are sufficient to lead to the observed morphological transformations.

2.7 Function of the Early Mammalian Middle and Inner Ear

One of the most important questions relating to the functional aspects of the earliest three-ossicle middle ears is: What was the hearing organ like in the earliest mammals? If we can assume that in these animals, as in all modern amniotes, the middle and inner ear performances were well matched (Ruggero and Temchin 2002), the configuration of the inner ear should help us decide whether early mammalian middle ears transmitted high frequencies or not. The main evolutionary events are summed up on the appropriate time scale in Fig. 2.3. For the present purposes, it will suffice to note the following conclusions about the early mammalian inner ear:

  1. A cladistic outgroup analysis of hearing in amniotes clearly reaches the conclusion that the ancestral condition of hearing in stem amniotes, before the several lineages split from each other, was low-frequency hearing only (below ~1 kHz; Manley and Köppl 1998; Manley 2000).

  2. The cochlea of the earliest mammals (defined by those animals that had three ossicles and thus a secondary jaw joint) was very short indeed (Fig. 2.3; Hadrocodium, Dryolestes, Henkelotherium, 2–3 mm; Luo et al. 2001, 2010; Ruf et al. 2009) and not coiled. Although in some cases a short cochlea (e.g., mouse, 7 mm) can be correlated (in modern mammals) with high-frequency hearing (Rosowski 1992), the kind of cochlear structure and the cochlear ancestry (see 4) tell a different story in the earliest mammals.

  3. The space available in the earliest mammalian cochlea was restricted further by the presence of a lagenar macula (which is still found in modern monotreme mammals), reducing the length of the putative basilar membrane to approximately 1.5 mm.

  4. A short hearing organ (~1–1.5 mm) is compatible with the kind of structure that must have been the ancestral hearing organ of stem amniotes (Kemp 2007) and, also because of the absence of a tympanic middle ear in those animals, could have responded to only loud, low-frequency sounds.

  5. Any middle ear cavity present was surrounded only by membranous tissues. Auditory bullae partly or wholly surrounded by cartilage and/or bone(s) evolved only late in therian mammal history, vary greatly in their structural components, and probably arose many times independently (Novacek 1977).

All of the foregoing indicates that, contrary to the conclusions reached via a comparison of modern mammals (Masterton et al. 1969), the hearing of the earliest mammals (~220 Ma ago) was low frequency only, even though these were small animals (Kemp 2007). Smallness does not necessarily correlate with high-frequency hearing. Indeed, almost all modern nonmammals are small, and all survive while hearing only frequencies well below 10–15 kHz. Among modern mammals, size is not a reliable indicator of frequency responses (Heffner et al. 2001), and modern mammals of course have a quite different inner ear from that of their earliest forebears.

The fossil evidence indicates that it took more than 50 Ma for mammals to achieve full coiling of the cochlea, before a peak in mammalian diversity approximately 90 Ma ago (Bininda-Emonds et al. 2007; Fig. 2.3). At that time the lengths of the hearing organ were approximately 5 mm, which is shorter than the cochlea of any modern mammal. Nonetheless, almost all Cretaceous mammals were small to very small and because of this, their middle ear structures were also small and not well suited for the transmission of low frequencies. Achievement of a coiled cochlea in such small animals likely led to a gradual improvement in higher-frequency hearing in parallel in placentals and marsupials. In more recent evolutionary history, following the demise of the dinosaurs at the end of the Cretaceous, mammalian radiations led in many parallel lines to increases in body size. Even very large changes in body size were possible in 10–20 Ma (Evans et al. 2012). Such increases in body size almost certainly led to an improvement in low-frequency hearing, in some cases (e.g., elephants) accompanied by a reduction in the upper frequency limit. This phenomenon has been clearly demonstrated during the history of primates, including humans (Coleman and Boyer 2012), but was seen in many lineages, the results for the audiograms of each group differing as a result of their independent evolutionary trajectories. Thus the evolution of hearing ranges in the mammalian cochlea had two major phases: (1) an early phase lasting more than 100 Ma, before the cochlea coiled, marked by an extension of the initial low-frequency hearing range toward higher frequencies; and (2) after coiling, a major differentiation during the next 100 Ma between many diverse lineages, resulting also in the extremes of ultrasonic hearing in some (bats, some rodents, whales) and infrasonic hearing in others (large mammals).
The origin and evolution of the mammalian cochlea and the consequences of coiling will be discussed in a further publication (Manley 2012).

2.8 Mammals: Correlations of Middle and Inner Ear and Brain Evolution

Although the evolution of the middle and inner ear of mammals correlates with changes in brain size, so do many other features of early mammals. Thus, for example, the evolution of hair and vibrissae, together with the accompanying homeothermy and the olfactory expansion associated with the nocturnal lifestyle of early mammals, also correlates with (and was likely largely responsible for) an early expansion of the brains of mammals, including the olfactory lobes (Rowe et al. 2011). A correlation has also been described between the changes in middle ears and forebrain expansion in early mammals (Rowe 1996). Rowe (1996) suggested that the expansion of the forebrain, pushing the middle ear toward the rear of the head (the middle ear completes its individual development before the brain does), was ultimately responsible for the separation of the middle ear from the lower jaw, forming the DMME. However, as Takechi and Kuratani (2010) point out, “Although it was suggested that brain expansion increased the distance between the postdentary elements and dentary to separate them during mammalian evolution (Rowe 1996; Luo et al. 2001), this view was refuted by a fossil with a small brain size and with postdentary elements separate from the dentary (Wang et al. 2001)” (p. 421).

In spite of these correlations, Rowe et al. (2011) imply that, in the first phases of brain expansion following the origin of the first mammals, the auditory system played no obvious role. Thus features that later take up large volumes of the brain, such as well-developed auditory nuclei, did not undergo significant expansion following the origin of the mammalian middle ear. For the first phase of mammalian evolution, the auditory system was far less important than, for example, olfaction and somatosensation. This supports the idea that the earliest mammalian inner ears were essentially the same as those of their immediate therapsid ancestors, processing lower-frequency sounds for which the requisite neural capacities were essentially already present. During the first 100 Ma of mammalian evolution, the brain expanded slowly until, in the mid-Cretaceous, it was two to three times more voluminous than in the first mammals. The brain capacity necessary for processing extensive auditory information was apparently a very late development in mammalian evolution, most of the early evolution of the mammalian brain being dominated by olfaction (Rowe et al. 2011).

These ideas make it difficult to interpret the significance of correlations between brain size and auditory structural characteristics (e.g., Rowe 1996) because recent fossil data suggest that there was little or no causality between these developments. It appears likely that high-frequency hearing in small mammals, which first made it possible for them to localize sounds efficiently and rapidly and was processed by an appropriate expansion of auditory brain centers, did not arise until the late Cretaceous. Remarkably, this was 30–40 Ma after the cochlea had achieved full coiling (Fig. 2.3; Manley 2012) and the middle ear had reached the DMME status. Because mammals developed their middle ear de novo and it was connected to the upper buccal cavity only by narrow Eustachian tubes to allow for periodic ventilation, they never had a pressure-gradient system such as that of lizards, which provides sound lateralization information at the level of the auditory nerve (Christensen-Dalsgaard and Manley 2008). The mammalian brain thus had to develop methods for the neural computation of sound-level and arrival-time differences between the two ears, aided by the (newly evolved) pinnae.

2.9 Pinnae

One other major difference exists between mammalian and nonmammalian hearing systems, in that mammals generally have pinnae. Because pinnae are a soft-tissue feature, we do not know when—and how many times—they evolved. They are absent in monotreme mammals, which probably indicates that pinnae evolved after the early major split of lineages in mammalian evolution. Pinnae of course act as sound collectors, increasing the sound pressure at the eardrum in some frequency ranges by up to 20 dB or more, but they also enlarge the distance between the ears, which increases the time-of-arrival difference of sounds at the two eardrums. Thus features of pinnae, once evolved, undoubtedly played an important role in the subsequent evolution of mammalian hearing. Their origin is likely to have been well after the evolution of cochlear coiling, accompanying an increase in the upper frequency limit of hearing.
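The two quantitative effects named above can be made concrete with a short calculation. The following is a minimal sketch, not a model from this chapter: it converts a pressure gain in dB to a linear ratio and estimates the largest interaural time difference from the simplest free-field geometry (path difference divided by the speed of sound). The function names, the speed of sound (343 m/s, an assumed value for dry air at about 20 °C), and the example ear separations are illustrative assumptions, and the estimate ignores diffraction around the head.

```python
import math

# Assumed speed of sound in dry air at about 20 °C (not a value from the text).
SPEED_OF_SOUND_AIR = 343.0  # m/s


def pressure_ratio_from_db(gain_db: float) -> float:
    """Convert a sound-pressure gain in dB into a linear pressure ratio."""
    return 10.0 ** (gain_db / 20.0)


def interaural_time_difference(ear_separation_m: float,
                               azimuth_deg: float = 90.0) -> float:
    """Simplest free-field estimate of the arrival-time difference (seconds).

    The extra path to the far ear is taken as d * sin(azimuth); diffraction
    around the head is ignored, so this is only a first-order sketch.
    """
    path_difference = ear_separation_m * math.sin(math.radians(azimuth_deg))
    return path_difference / SPEED_OF_SOUND_AIR


if __name__ == "__main__":
    # A 20 dB gain at the eardrum corresponds to a tenfold pressure increase.
    print(pressure_ratio_from_db(20.0))

    # Hypothetical ear separations: doubling the effective separation
    # doubles the maximum time-of-arrival difference.
    for separation_m in (0.02, 0.04):
        itd_us = interaural_time_difference(separation_m) * 1e6
        print(f"{separation_m * 100:.0f} cm -> {itd_us:.1f} microseconds")
```

Under these assumptions the time difference scales linearly with the effective ear separation, which is why even a modest widening by the pinnae matters for a very small animal, whose available time differences are only tens of microseconds.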

Lacking a pressure-gradient hearing system, mammals were mainly dependent on interaural cues and neural computation for sound localization. In bats, which evolved relatively late in mammalian evolution and in which we might expect the greatest specializations for sound localization, there are indeed correlations between pinnae performance and emitted echolocation sounds, at least among the constant-frequency bats (e.g., Obrist et al. 1993). The largest increase in sound pressure due to the pinna, however, is found in bats that do not echolocate, but capture their prey using passive sound localization. Such bats fly very slowly in the vicinity of their prey and their very large pinnae are thus less of an aerodynamic problem than in bats that capture prey during fast flight. Evolutionary pressures on the size and shape of pinnae have been diverse and were certainly not confined to optimal sound reception. In general, pinnae tend to be smaller in species living in colder regions of the world, reducing heat loss. In a few large species, such as elephants, the pinnae play an important role in increasing heat loss. Among birds, the owls evolved facial disks that function as sound collectors that are, in some cases such as the barn owl, as efficient as mammalian pinnae.

2.10 Summary

All amniote vertebrates of the Mesozoic era inherited a small, low-frequency (upper limit estimated at ~1 kHz) hearing organ from their Paleozoic ancestors. Mammals and nonmammals evolved tympanic middle ears independently during late Triassic times. Despite the polyphyletic relationship of amniote middle ears, deep homology of the constituent structures is supported by both the ontogenetic development and the fossil record. Developmental studies clearly demonstrate a common origin for the middle ear ossicles of all land vertebrates and thus for the columella/stapes, the bones of the primary jaw joint of nonmammals, and the malleus and incus of mammals. The development of these structures is controlled by processes of common gene patterning and cellular interactions. Over evolutionary time, changes in the number and the temporal and spatial expression of genes led to the observed morphological transformations. In mammals, a three-ossicle middle ear evolved at least three times. After these events, all groups independently enlarged and specialized their inner ears toward higher frequencies and better parallel processing of signals. The primary lever system of the therian mammalian three-ossicle middle ear was fortuitously preadapted to transmit higher frequencies more easily. In spite of this, it took almost 100 Ma after the origin of mammals for the mammalian middle ear ossicular chain to free itself fully from the lower jaw and for the cochlea to fully coil and achieve hearing organ lengths of more than 5–7 mm. This coincided roughly with the time of the origin of eutherian (placental) mammals. The (independent) evolution of the monotreme mammal middle ear and cochlea substantially lags that of therian mammals. The major period of cochlear differentiation in therian mammals, including the evolution of echolocation capabilities and of middle ear bullae, postdates the split between placental and marsupial lineages in the early Cretaceous.
Brain size increases associated with hearing specializations occurred first during the Cretaceous. Thus, contrary to expectations, high-frequency hearing arose in mammals over a very long period of time, perhaps, especially in small mammals, to the detriment of low-frequency hearing. An improved sensitivity to low frequencies in large mammals (as a correlate of large size) and in specialized small mammals evolved later and independently in the various lineages.