Animal and Human Models of Fetal Hearing

Science teaches skepticism in applying the results of animal studies to humans. However, some questions about humans cannot be answered directly in human studies. They can, however, be approached in carefully designed and conducted animal studies. Conversely, animal studies can suggest avenues of understanding that can only be pursued with humans.

Philbin, Lickliter, and Graven (2000) and Lickliter and Bahrick (2000) make the case for pursuing both human and animal studies, citing well-tested parallels between human fetal hearing and behavioral responsiveness to sound and those shown in several animal models. However, some recent literature has dismissed the relevance of one of the most studied models of the fetal sound environment, the pregnant ewe (e.g., Lahav, 2015). The purpose of this section, therefore, is to give a few brief examples of established concordance and to outline the back and forth between questions and answers that makes animal studies invaluable for advancing knowledge about both humans and animals.

Sheep Model of Fetal Development

The ewe and the human have a fetus and pregnant uterus of similar size. There are also “similarities in the physics of transmission and early development of inner ear function prenatally” (Abrams & Gerhardt, 2000). Older but still valid studies using this model (e.g., Gerhardt & Abrams, 2000; Abrams & Gerhardt, 2000; Armitage, Baldwin, & Vince, 1980) show, for example, that sounds available to the head of a fetal sheep are similar to those recorded by hydrophone near the neck of a human fetus at term (Richards, Frentzen, Gerhardt, McCann, & Abrams, 1992). Recordings made with a hydrophone (a device designed for use in fluid) show both a slight but significant reduction in the intrauterine levels of airborne, external sounds and a slight but significant amplification of internal sounds, particularly the maternal voice. (Sound level is perceived as loudness or quietness and is measured in decibels, dB.) Both models show a varied intrauterine sound environment of internal and external sounds, including voices surrounding the mother.

Avian Models

The science of hearing development has been furthered in major respects with the use of avian models. Other mammals are more closely related to humans than birds are, but not with respect to developmental psychoacoustics. Avian hearing development is like human hearing development in that it is precocial, developing in many ways in a fluid environment before hatching or birth. Most mammals, by contrast, are born with relatively undeveloped hearing that subsequently matures in air. Additional advantages are that avians develop quickly, are readily available, and are easily manipulated in the shell. Many studies of the bobwhite quail and ducks show discrimination of and approach behavior toward the calls of the chick’s specific mother hen compared with other hens, as well as the disruption of psychoacoustic development by other sensory stimuli (e.g., Harshaw & Lickliter, 2011). Rubel et al. (1984) developed an avian model of the physical development of the cochlea and hair cells and their related brain structures. Following on these findings, Philbin et al. used the native (untrained) peeping of the domestic chick to describe the development of many psychoacoustic abilities found in the human neonate, including disruption of habituation (Philbin, Ballweg, & Gray, 1994).

Research in habituation illustrates the ability of an animal model to answer questions suggested by human studies and, in turn, to provide a platform for furthering human research. Habituation is important because it enables an organism to discriminate the novel from the typical, thus improving safety, the location of food, and other complex learning. The ability to habituate, or decrease responding to repeated sounds, is used as a marker of neurobehavioral development in the widely accepted newborn and preterm human infant neurobehavioral development assessments designed by Brazelton et al. (1987), Als et al. (2005), and Lester et al. (2004). However, while most healthy term newborns habituate to repeated sounds and other stimuli, most preterm newborns do not. The scientific question is whether the newborn intensive care unit (NICU) environment might have an influence on preterm infants’ development of habituation.

Hatchlings of the domestic chicken show a very sturdy ability to habituate. Many studies make use of this ability to document attention to novel events in the acoustic environment. To answer the question of habituation abilities suggested by the human developmental assessments, Philbin et al. (1994) exposed chick embryos to tape recordings of NICU sound while keeping all other sensory experience the same as that of a control group. Unexpectedly, the experimental group of hatchlings, unlike the controls, did not habituate reliably to repeated bursts of white noise. A more recent animal study using brain mapping via functional magnetic resonance imaging (fMRI) shows that environmental noise, such as might make up a NICU sound environment, retards auditory cortical development (e.g., Chang & Merzenich, 2003).

Findings such as these lead back to the question about preterm humans and their environments. Is there a difference in habituation to sound between preterm infants cared for in a noisy NICU and those in a quiet one? Stated another way, is the unreliable habituation found in preterm humans a product of nature or nurture? With the advent of very quiet, single-patient rooms and the continued use of crowded, noisy ones, this question could now be asked and perhaps answered with human infants. An answer in the positive would then suggest a line of research regarding the effect of newborn intensive care on preterm infants’ responses to the other stimuli well described in neurobehavioral testing.

The Development of Sound Perception: Developmental Psychoacoustics

Hearing is developing actively by the 20th week of gestation, and the fetus is shown to respond to sound by the 24th week (National Institute for Occupational Safety and Health, NIOSH). The development of species-typical perception is complex. It occurs in conjunction with the particularly specialized acoustic environment of the uterus. In the inner ear, physical vibrations are translated to nerve conduction by the hair cells of the cochlea and from there to the auditory centers of the brain.

The Development of the Perception of Frequency: Tonotopic Organization

Sound frequencies are initially registered as different but without an ordered pattern, distinct but disorganized. With maturation and experience, as each hair cell in the cochlea becomes associated with a specific counterpart in the auditory brain centers, the individual frequencies are organized precisely in both inner ear and brain, like the keys on a piano (Gray, 1991; Appler & Goodrich, 2011). Similar organization occurs with the physical registration and central nervous system organization of level, rhythm, duration, and other attributes of sound perception.

The phenomenon of hair cell maturation, the place principle, was first described by Rubel et al. (1984). Low-frequency sounds are the earliest to register at the large end of the basilar membrane near the oval window that connects the middle and inner ears (Querleu, Verspy, & Vervoort, 1988; review: Gerhardt & Abrams, 2000). Sound and vibration from both the external and internal maternal environments have been recorded in the pregnant ewe near term from 60 Hz (i.e., very low) to 8000 Hz, the high end of the frequencies of speech. As its complex structures develop, the basilar membrane becomes increasingly flexible and capable of moving the emerging hair cells further toward the apex. The registration of lower frequencies moves along simultaneously, leaving the large end to register ever-higher frequencies (Rubel et al., 1984). Stated another way, the hair cells that once registered lower-frequency sounds eventually register higher-frequency sounds until, at some point in early infancy, the entire membrane and all hair cells and their related central functions are developed. While the exact timing of this progression is known in avian, ewe, and other models, it is not well known in humans.

This development of lower frequencies first is advantageous for fetal hearing development as the sounds most available in the uterine environment are also in the lower frequencies due to the loss of energy of higher frequencies in the transition through the maternal tissue and amniotic fluid barrier. However, high-energy (i.e., high decibel; loud) sounds of all frequencies transit into and through the uterus with less loss of energy and affect the fetus at all frequencies it is capable of perceiving.

Perception of Sound Level: The Development of Hearing Acuity

During fetal life the ossicles (three small bones) of the middle ear cannot perform their amplifying, translational functions because outer, middle, and inner ear are all fluid filled. At this time the hair cells of the inner ear are stimulated directly by vibrations of the fetal skull. Bone-conducted stimulation of the cochlear hair cells produces actual hearing and is not a “separate sense” as stated in some contemporary publications. The consequence of transmission of sound waves in this manner is similar to an adult hearing loss of about 40 dBA. (Note: dBA refers to a particular, nonlinear decibel conversion – the A scale – that represents sound levels as they are perceived by humans. The dBC scale does not make a conversion and is seldom used in reports relative to human hearing. dB usually refers to dBA if the topic relates to humans.) The relatively quiet world of the uterus is biologically compatible with the development of fetal ear and auditory centers during maturation. The ability to perceive sound level develops over time in utero and after full-term birth (Rabinowitz, Willmore, Schnupp, & King, 2011). Considering this, one must use caution in deciding what constitutes auditory deprivation for a preterm infant as the same levels of 40 dBA to 60 dBA are common in new, not crowded, or single-room NICUs.
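The note on dBA above can be made concrete with a short sketch. It uses the commonly published analog formula for the A-weighting curve, which returns the correction (in dB) applied at each frequency before levels are summed into a dBA reading. The function names and the 60 dB tone are illustrative assumptions, not values from the studies cited here.

```python
import math


def a_weighting_db(frequency_hz: float) -> float:
    """A-weighting correction in dB at one frequency (about 0 dB at 1000 Hz)."""
    f2 = frequency_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.0


def tone_level_dba(level_db: float, frequency_hz: float) -> float:
    """A-weighted level of a single pure tone (hypothetical helper)."""
    return level_db + a_weighting_db(frequency_hz)


if __name__ == "__main__":
    # Hypothetical 60 dB tone at a low and a mid frequency.
    for freq in (100.0, 1000.0):
        print(freq, round(tone_level_dba(60.0, freq), 1))
```

The sketch shows why the same unweighted level yields a much lower dBA value at low frequencies than at 1000 Hz: the A scale discounts low frequencies heavily, which matters when low-frequency sounds dominate the environment being measured.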

Perception of a Signal in Noise

A signal can be any distinct stimulus. In the case of hearing, it is usually a sound with characteristics that set its intelligibility apart from the rest of the ambient environment. While most aspects of hearing mature at a rapid pace during the first and second years of life, the perception of a signal in noise is not adult-like until late childhood or early adolescence, depending on the test used. The lateness of this ability has broad implications for all children’s environments including child care centers and schools.

The immature auditory system (end organ and brain) is best able to detect a sound signal if parts of it are at a higher frequency (perceived as pitch) and a higher level (perceived as loudness) than other sounds, if it has a more varied frequency pattern, or if it has different tonal qualities. A high signal-to-noise ratio occurs in utero fairly often with respect to the maternal voice because it differs in these respects from other sounds.

The Effect of the Predictability of the Sound Environment on Attention and Distraction

Attention to a sound signal varies with the predictability or orderliness of the background. Infants and children require a more orderly ambient sound environment than adults in order to maintain attention (Gray & Philbin, 2004). This is to say, prior to adulthood distractibility increases and, therefore, attention decreases as predictability of the acoustic environment and age decrease. An environment perceived as predictable (and, therefore, not distracting) by an adult may be unpredictable and distracting for children and even more so for an infant. These differences between infant and adult perception make adults poor judges of which sound signals are intelligible to preterm infants. This is important in considering the orderliness of the sound environment required to enable a preterm infant to discriminate the mother’s voice, music, or other environmental sounds.

The Acoustic Environment and Auditory Perception of the Fetus

The acoustic environment of the fetus consists of a closed chamber filled with a fetal body and fluid more dense than ocean water. The chamber wall provides a solid but flexible boundary with the approximate density of muscle that changes in shape, thickness, and tautness with fetal movement and growth in body size. Prior to the engagement in the pelvis, the head can be in any location (review: Abrams & Gerhardt, 2000). These multiple changes result in variable deflective and absorptive acoustic properties and a variable acoustic environment for the fetus.

Intrauterine sounds originate in three sources. Internally generated sounds originate in the maternal organs, voice, and movement. They are conducted via tissue and bone through the body, cross the uterine wall, and travel through intrauterine fluid directly stimulating the hair cells of the cochlea via vibratory compression of the fetal head. They lose little energy in the process of transmission through the body and uterus and can even amplify (Richards et al., 1992). Externally generated sounds are conducted via air and must cross the maternal-uterine tissue barrier, losing considerable energy in the process if levels are only moderately loud. Sounds in the uterus are usually of low frequency, but sounds of any frequency can pass through if the level is high enough. However, these high frequencies may not be perceived by the fetus because of the gradual development of perception of frequencies described above. (The same restrictions on perception are the case for young preterm infants; high frequencies in the NICU may not be perceived.) The third sound source is external vibroacoustic stimulation (VAS) by mechanical objects making physical contact with the maternal body.

Maternal Heartbeat Sounds and the Maternal Voice

Some researchers, clinicians, and laymen believe that heartbeat sounds dominate the uterine sound environment (Panagiotidis & Lahav, 2010; Salk, 1960, 1962). The purpose of this section is to examine the science supporting the belief.

The theory of prominent intrauterine heartbeat sounds originated in 1960 with Lee Salk, a psychiatrist in Queens, New York (Salk, 1960, 1962). Although it seems quaint now, this nearly 60-year-old theory of imprinting was persuasive and consistent with science at the time (e.g., Hess, 1959; Moltz, 1960). The idea came to him during a visit to the Central Park Zoo in Manhattan, an old-style zoo of mostly isolated animals in small cages. He does not cite methods of primate behavior research but simply asserts that in 40 out of 42 of his own observations, a single rhesus monkey carried her infant against her left side (“closest to her heart”). The entire theory of heartbeat sounds was based on the way in which mother monkeys and newly delivered women held their infants.

Salk’s idea caught on and is so emotionally attractive that it perhaps will require a type of scientific revolution – a fundamental shift in thinking – to see past the data that don’t support it and thus dislodge it from popular and scientific culture (Kuhn, 2012). There have been few studies based on the singular experience of imprinting to heartbeat sounds, although DeCasper and Sigafoos (1983) present a literature review of work emphasizing a broader picture of imprinting and the effects of fetal auditory experience.

Overreaching his data, Salk asserted that imprinting by intrauterine heartbeat sounds was “the basis of all later learning” (Salk, 1962, p. 762) and persisted into adulthood. The rhythms of music and dance are given as examples of the “universal” and “biological tendency [of man]” to seek proximity to a heartbeat sound because it “has survival value … [and] involves mutual satisfaction” (p. 762). He proposed that a monkey or human mother holds her infant so that she can hear her own heart because she then had “the sensation of her own heartbeat reflected back” (p. 762). In other words, her own imprinting to heartbeat sounds leads her to hold the next-generation infant in such a way that it too will imprint to heartbeat sounds.

Salk tested his imprinting theory by comparing group (whole room) responses of healthy toddlers in cribs in a hospital for orphans. Individual infants were not tested, and groups did not serve as their own controls. In his several studies, each of the two groups received only the experimental (heartbeat) or a very different control stimulus (e.g., unspecified lullaby or room sounds, including sounds made by other infants/toddlers).

A methodological error in these studies, as well as in some contemporary studies (e.g., Panagiotidis & Lahav, 2010; Ullal-Gupta, Van den Bosch der Nederlanden, Tichko, Lahav, & Hannon, 2013), is that the experimental and control stimuli are too different from one another; there are too many variables to isolate the discrimination of heartbeats specifically. For example, one variable, such as the voice, may be of a significantly higher frequency or level than heartbeat (e.g., Panagiotidis & Lahav, 2010). In other studies, the simpler heartbeat sound could attract the infant’s attention simply because it has fewer tone changes and is more rhythmic than, say, the “no control” of random room sounds (e.g., Doheny, Hurwitz, Insoft, Ringer, & Lahav, 2012; Rand & Lahav, 2014).

It is possible that any two sounds with features similar to heartbeat could elicit attention or orientation equally. For example, a rhythmic waltz beat (lub dub-dub, lub dub-dub) as well as the quarter time beat of the heart itself (lub dub pause pause, lub dub pause pause) could both be attracting stimuli. If, for example, both heartbeat-like conditions showed clinically significant, lower heart rates than no stimulus, all in a very quiet acoustic chamber, the infant listener would be responding to the rhythmic nature or some other feature of the signals rather than to heartbeat per se.

It is reasonable to assume that preterm and term infants could make fine discriminations between heartbeat-like sounds. Shahidullah and Hepper (1994) showed that fetuses at 27–35 weeks gestational age could discriminate between “baba” and “bibi.” Moon et al. (2013) showed that newly born infants could discriminate subtle differences in sounds of the same vowel in two languages. DeCasper and Fifer (1980) showed that newborns could discriminate between their mother’s voice and the voices of other women. DeCasper and Prescott (1984) showed that they can discriminate between female and male voices, and Spence and DeCasper (1987) showed discrimination between different frequencies of the same words. If preterm and term infants can make these fine discriminations, surely they could discriminate between “lub dub-dub, lub dub-dub” and “lub dub pause pause, lub dub pause pause.”

Following from the work of Salk, Bench (1968) measured heartbeat sounds from pregnant women prior to the onset of labor. The state of consciousness of the women is not given, and there is no information about the presence of the maternal voice. These studies are now recognized as invalid because the microphone (not a hydrophone, which is designed to function in fluid) was covered with a rubber sleeve and thereby made inaccurate. It was passed through the cervix and moved around the inside of the uterus before and after rupturing membranes. The researchers particularly tried for placement over the ear of the unborn infant whose head was engaged in the pelvis. Probably because of the placement of the microphone, only heartbeat and other cardiovascular sounds could be identified, and these were reported to be loud. Bench concluded, and the scientific and lay communities accepted, that these measurements represented acoustic conditions throughout fetal life.

Grimwade et al. (1970) also attempted to measure intrauterine sounds. It is important to understand this and the Bench studies because the findings are at times the rationale for the levels and frequencies produced by speakers attached to the belly and inserted in the vagina. Their study included 16 pregnant women, at term but not in active labor, before and after rupture of the membranes. Again, the microphone was positioned near the fetal head engaged in the pelvis. It also included seven nonpregnant women undergoing uterine curettage. For both groups of subjects, the covered microphone was passed through the cervix, and the vagina was packed with gauze for the purpose of excluding extrauterine room sounds. However, it is uncertain from the report whether this packing was effective. The state of consciousness of the women is not given, and the emphasis is on measuring noise and using other studies to interpret it as primarily the sounds of pulse.

The microphone was calibrated to 55 dB and above and between 100 and 1000 Hz, the mid-dB range of comfortable sound levels, and very low to low, mid-frequency range of sounds perceptible by humans. The authors assumed that this calibration guaranteed accuracy at other levels and frequencies. The capabilities of the microphone and its calibration appear to guarantee exclusion of more quiet and high-frequency sounds. Using complex calculations based on the actual measurements, the authors conclude that sound levels in the pregnant uterus have an arithmetic mean of 95 dB – very loud – for sounds assumed to be the maternal pulse. (See Section 6.6 for the inaccuracy involved in calculating an arithmetic mean for decibels.) The authors speculated that these sounds were important for sensory development.
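The problem with an arithmetic mean of decibel values can be illustrated with a minimal sketch. Because the decibel scale is logarithmic, one widely used alternative is to average the underlying sound energy and convert the result back to decibels. The readings below are invented for illustration and are not data from any study discussed here; the function names are likewise hypothetical.

```python
import math


def arithmetic_mean_db(levels_db):
    """Plain average of the decibel numbers themselves."""
    return sum(levels_db) / len(levels_db)


def energy_mean_db(levels_db):
    """Average the relative sound energy, then convert back to decibels."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


if __name__ == "__main__":
    # Hypothetical readings: three moderate levels and one loud level.
    readings = [60.0, 60.0, 60.0, 90.0]
    print("arithmetic mean:", round(arithmetic_mean_db(readings), 1))
    print("energy mean:    ", round(energy_mean_db(readings), 1))
```

With these made-up readings the two summaries differ by more than 15 dB, so the choice of averaging method alone can change whether an environment is described as moderate or loud.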

Armitage et al. (1980) specifically addressed the Bench studies using a hydrophone (designed for measurement in fluid) placed inside the amniotic sac of pregnant ewes. Their methodology and equipment were well established. They report:

…we have found that the sounds of the mother’s eating, drinking, rumination, breathing, and muscular movements were discernable, as also were sounds from outside the mother; external sounds were attenuated by 30 dB on average. Sounds from the cardiovascular system were not perceptible, however. (p. 1173)

All sounds between 100 and 1000 Hz were recorded at 40 dB or less, many within the likely acuity level of fetal hearing.

Richards et al. (1992) studied intrauterine sounds of conscious women with a spinal block in early-stage labor. They used a hydrophone passed through the cervix, and their findings were very different from the high sound level intrauterine environments reported by Bench and by Walker, Grimwade, and Wood. Their results were as follows:

Low-frequency sounds (0.125 kHz) generated outside the mother were reduced by an average of 3.7 dB. There was a gradual increase in attenuation for increasing frequencies, with a maximum attenuation of 10 dB at 4 kHz, [within the range of human speech]….Intrauterine sound levels of the mother’s voice were enhanced by an average of 5.2 dB whereas external male and female voices were attenuated by 2.1 and 3.2 dB, respectively. ….[All] were statistically significant. (p. 186)
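To give a sense of scale for the decibel changes quoted above, the following sketch converts a change in level to the corresponding ratio of sound pressure and of sound intensity. It assumes the reported values are differences in sound pressure level; the function names are illustrative.

```python
def pressure_ratio(change_db: float) -> float:
    """Ratio of sound pressures for a given change in level (dB)."""
    return 10 ** (change_db / 20)


def intensity_ratio(change_db: float) -> float:
    """Ratio of sound intensities (energy) for a given change in level (dB)."""
    return 10 ** (change_db / 10)


if __name__ == "__main__":
    # Changes quoted from Richards et al. (1992); negative means attenuation.
    changes = {
        "external sound at 0.125 kHz": -3.7,
        "external sound at 4 kHz": -10.0,
        "maternal voice": 5.2,
    }
    for label, change in changes.items():
        print(label, round(pressure_ratio(change), 2), round(intensity_ratio(change), 2))
```

Under this assumption, a 3.7 dB reduction leaves roughly two thirds of the original sound pressure, while a 5.2 dB enhancement nearly doubles it.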

Abrams and Gerhardt (2000), using their well-developed ewe model, also found no evidence of heartbeat sounds but did document a prominent maternal voice. They write,

mother’s voice… [is] the most significant and common mode of potential auditory stimulation [by]… non-air-induced acoustic stimulation in the uterus.

Vibroacoustic Stimulation (VAS)

VAS may originate in air with enough energy to cross the air-tissue barrier into the uterus, or it may be conducted to the fetal head (and thereby the cochlea) through coupling of the sound source with the woman’s body.

Unintentional VAS: Work and Recreation

VAS in work and recreation can originate in air and, simultaneously, through direct coupling of the woman’s body with a vibrating source: a stadium floor and seats, a farm vehicle frame and seat, a rifle, machinery or an instrument leaned against (e.g., belly against the piano keyboard, shoulder against the bridge of the double bass), or a closed car with a boom box radio. The National Institute for Occupational Safety and Health (NIOSH, 2016b) advises women to “avoid noise you can feel as a rumble, …. noisy jobs: machines, guns, loud music, crowds of people, sirens, trucks, airplanes.” However, professionals are advised to exercise caution in recommending changes in work conditions as this may cause the loss of family income.

For preterm infants in transport vehicles such as helicopters or ambulances (less so in fixed-wing aircraft), directly coupled VAS is delivered via the vehicle body and the attached incubator and ventilator and then to the infant itself, with the head being exposed to the potential for intraventricular hemorrhage. Homemade anti-vibration pads and gel-positioning devices are ill advised as the material can amplify as well as dampen vibration. Accelerometer studies and specifically matched anti-acceleration pads can be made in collaboration with qualified engineers.

Intentional VAS: Commercial and Parental Sources

A particular word of caution is offered concerning the practice of attaching audio headphones to the belly of a pregnant woman (Abrams & Gerhardt, 2000; Abrams, Hutchinson, Gerhardt, Evans, & Pendergast, 1987) or, by reasonable extension, of inserting a sound (vibration) source into the vagina (www.Babypod.net/en/babypod). The level at the fetal ear is impossible to control because of acoustic dynamics in the uterine space. Further, objects placed in the vagina have potential for trauma to the urethra and cervix and expose bladder, vagina, and cervix to infection from a foreign body. The purpose is typically teaching or instilling a preference for music or the acquisition of a nonnative language assumed to be advantageous for the child. The practice is promoted for commercial gain and loosely based on the invalid research of Bench and Walker, Grimwade, and Wood reviewed above.

Intentional VAS: Diagnostic Purposes

Obstetric clinicians may use VAS from an adapted artificial larynx. The purpose is to assess the well-being of the fetus by stimulating heart rate, gross motor movement, and facial reflexes. Abrams, Gerhardt, Peters, and their associates (e.g., Abrams & Gerhardt, 2000) conducted extensive and still valid studies of VAS by the artificial larynx. They showed that the highest-energy VAS stimuli are in the low-frequency ranges, the frequencies first actively transmitting signal in the developing auditory system. Gagnon found that fetuses between 33 and 40 weeks gestation had an increase in gross movements beginning 10 min after stimulation by an artificial larynx and lasting up to an hour (Gagnon, 1989). Philbin et al. (1996) show heart rate changes including bradycardia and tachycardia in term infants in response to the sounds and low-frequency vibrations of an MRI machine.

See Table 6.1 for a summary of the information in this section.

Table 6.1 Effects of the intrauterine environment on the fetus

The Sound Environment, Listening Conditions, and Sound Measurements in Newborn Intensive Care

Frequencies Available in the Acoustic Environment: Questionable Effects on Hearing, Language Development, and Music Perception

Frequencies in the environment of the preterm infant are relevant only to the extent that the infant can perceive them; higher frequencies available in the environment may be only partially perceived because of the gradual development of the basilar membrane (and, therefore, frequency registration) with postmenstrual age. This information is relevant to a current controversy about the nonspecific effects on hearing and language development of newborn intensive care unit (NICU) frequencies above 500 Hz, termed “high” (e.g., Lahav, 2015; Lahav & Skoe, 2014). As it happens, there is no authority that defines low, mid, upper mid, and high frequencies. The automotive industry, the National Aeronautics and Space Administration, the audio equipment industry, and the American National Standards Institute (ANSI) all have different definitions (e.g., Smith, 2013). The frequencies of speech are in a range including and well above 500 Hz and are not considered high in most definitions.

See Table 6.2.

Table 6.2 Definitions of low to high frequencies used by the automotive industry, the National Aeronautics and Space Administration (NASA), the audio equipment industry, and the American National Standards Institute (ANSI)

Language development and musical abilities are far more complex than frequency perception. They are dependent on the infant’s and child’s innate capabilities and, equally important, on environmental influences such as the language and music available in the environment and, particularly, on interaction that involves the child. Infant- and child-directed speech is termed motherese by some researchers and clinicians. Socialization with children and early childhood education also influence language, social development, and musical ability. Such development occurs over many years in many settings.

Sound Levels Within the Infant Incubator and Mother’s Voice

The range of sound levels in the acoustically sealed, empty infant compartment of a newer-designed incubator tends to be narrow. Motor noise in such an incubator is an essentially constant broad band of low frequencies of relatively low sound pressure level (SPL; about 50 dBA), well tolerated by most preterm infants. However, due to its cubic shape, static size, and stiff shell, the same incubator is a reverberant chamber and an effective amplifier of additional low, middle-range, and high-frequency sounds and vibrations. Additionally, sounds of the NICU environment entering through portholes opened many times each day, respiratory equipment inside the shell, and, particularly, the infant’s own cry can make the chamber a remarkably high SPL environment, with short-duration but frequently occurring levels above 100 dB as measured by this author. The infant is also removed from the incubator many times each day, for example, for feeding, skin care, and procedures. In sum, the infant is living in a very quiet environment only part of the time.

While the mother’s voice cannot penetrate a closed incubator with tight seals, she can be heard if her head is close to an open incubator porthole. Her voice is obviously also heard when she speaks near the infant’s head on an open warmer and when the infant is held. The natural, automatic adjustment of any speaker’s vocal effort is based on distance from the listener, background sound levels, speech privacy intentions, and the listener’s behavior. This adjustment will likely make a mother’s voice level high enough above background to be heard but low enough to be comfortable for the infant. The tonal quality and prosody of the voice will also aid in distinguishing it from other sounds in the environment.

Startling Short-Duration Sound in Old-Style and Newly Designed NICUs

High-level, short-duration, unexpected sounds are perceived as sudden and distinct from the background. They are distracting and unpleasant for both adults and infants. At times they evoke a startle response and vital sign changes in both infants and adults (e.g., Philbin et al., 1994). Such sounds can occur in old-design, crowded NICUs and also in newer designed NICUs with ample space for each bed, including single rooms. These newer yet acoustically unaccommodating NICU rooms typically lack sound-absorptive surfaces on walls and ceiling and flooring materials that prevent impact strike noise. Such brief sounds are lost in equivalent sound level measurements of the room (Leq), a measure of central tendency, but can be captured by the human ear and verified in measurements of L10, the level exceeded 10% of the time, and Lmax, the level lasting 1/20 of a second.
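The difference between these three measures can be sketched with a simplified calculation over a list of sampled levels. This is an illustration under stated assumptions, not a substitute for a calibrated sound level meter: the samples are invented, equal-length intervals are assumed, L10 is estimated by a simple nearest-rank rule, and Lmax is taken as the highest sample rather than a time-weighted maximum.

```python
import math


def leq(levels_db):
    """Equivalent continuous level: energy average of equal-length samples."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def l10(levels_db):
    """Nearest-rank estimate of the level reached by the loudest 10% of samples."""
    ordered = sorted(levels_db, reverse=True)
    count = max(1, len(ordered) // 10)
    return ordered[count - 1]


def lmax(levels_db):
    """Highest sampled level."""
    return max(levels_db)


if __name__ == "__main__":
    # Hypothetical room: mostly 50 dBA, some 70 dBA events, one 85 dBA strike.
    samples = [50.0] * 88 + [70.0] * 11 + [85.0]
    print("Leq: ", round(leq(samples), 1))
    print("L10: ", l10(samples))
    print("Lmax:", lmax(samples))
```

In this invented example the single 85 dBA event appears only in Lmax, and the recurring 70 dBA events are reflected in L10, while Leq reports one intermediate value from which neither kind of event can be identified.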

Lack of Auditory Deprivation in the NICU

All things considered, a very quiet NICU room or incubator may as likely be an advantage as a disadvantage for hospitalized preterm and term infants. In addition to sounds, they are exposed to pain and massive stimulation of other sensory systems (i.e., touch, kinesthetic, vestibular, olfactory, gustatory, and visual) multiple times each day. Looking at the infant holistically, low sound levels may be a respite allowing sleep and recovery from other stimuli.

Regardless of these facts, some investigators and clinicians (e.g., Rand & Lahav, 2014; McMahon, Wintermark, & Lahav, 2012) suggest that a NICU incubator or new NICU single room may be too quiet or have too few auditory stimuli and constitute conditions of auditory and language deprivation. However, new NICU single-bed rooms with ample sound-absorbing surface materials typically meet the Recommended Standards for Newborn ICU Design, described below. These and other authors propose benefits of adding recorded sound to the acoustic environment. However, studies and standard assessments of the fetus and preterm newborn, such as the NIDCAP, NNNS, and APIB, generally indicate that any purposefully added stimulation must be carefully considered and administered. Additionally, the long- and short-term negative effects of hospital stimuli on infants and parents are unknown; the amount and type of auditory stimulation “good enough” for language and social development are also unknown. The history of neonatology is rife with examples of attempts to solve a problem without fully understanding it or the proposed solutions.

Mother’s Voice and Music: Live Versus Recorded

Although the upper frequencies are largely lost in utero, the maternal voice, like external live voices, carries a tonal quality and prosody unlike other sounds, as described above. The same phrase or word is rarely produced in the same way twice but is constantly novel and attracts attention within the constraints of a single exemplar, the mother. This is the biologically expected manner of exposure, sustained attention, and increasing recognition of the maternal voice and the language and social competence it carries. These stimuli are quite different from recorded sounds.

Repetitious sounds elicit less attention over time if levels are in the low-moderate range. All animals, from the neurologically most simple to the more complex (e.g., newly hatched chicks in Philbin et al., 1994), to the most complex (humans), habituate to repeated, moderately strong stimuli. One might ask whether habituation is the desired effect of exposure to mother’s voice and music.

If the infant does not habituate to a tape-recorded voice or music, one might ask whether the recorded sound is played at high levels, or with variations in frequency and tempo, that sustain attention long after the infant’s response would otherwise be fatigued. One would hope that a live speaker, vocalist, or instrumentalist, or the person responsible for monitoring recorded sounds, would be attentive to the infant and make the necessary adjustments, including stopping, to facilitate behavioral organization and state stability.

NICU Conditions of Auditory Masking and Distraction: Perceptibility of Mother’s Voice and Skin-to-Skin Holding

The preterm infant’s limited ability to discriminate signal from noise means that sounds at significantly high levels in the near environment (e.g., old-style, crowded, reverberant NICU rooms) can mask the mother’s voice and cause distraction (Gray & Philbin, 2004). In a crowded, noisy NICU, the voice signal level can be raised by decreasing distance and air transmission time (i.e., bringing mother and baby closer together) and, most effectively, by direct, soft tissue transmission through the mother’s body to the infant’s body and cochlea with skin-to-skin holding. Staff speaking with the mother during this care are best advised to speak quietly to avoid interrupting the infant’s discrimination of the mother’s similar voice signal. During skin-to-skin holding, the mother may sleep, rest, or otherwise not talk for periods of time. Many people read in a monotonous tone with less emphasis and rhythm than their speech. However, speech behaviors natural to the mother are the infant’s basis for future language acquisition. Clinicians are cautioned to avoid interfering with them.

See Table 6.3 for a summary of the information in this section.

Table 6.3 Effects of the intensive care unit sound environment on preterm infants

Sound Measurement of Voice and Music: Research and Clinical Interventions

Researchers and clinicians rely on accuracy in the literature and aspire to valid studies and clinical interventions. However, many studies report inaccurate and misleading measurements of sound, thereby contributing to confusion in the literature, failed studies, and ineffective or detrimental clinical interventions. Gray (2000) and Gray and Philbin (2000) provide complete descriptions of the properties of sound and sound measurement in the NICU. A summary of information particular to voice and music is provided here.

Accuracy of the Equipment

Some studies are flawed by the use of inexpensive sound level meters (SLMs) with microphones that collect only a narrow range of frequencies leaving much of the sound unmeasured. One might say that if an intervention or study is worth doing, it is worth having the equipment necessary to do it right. Type II is a designation for a very accurate microphone. Such a microphone is necessary for a valid study or clinical intervention. Type I is a designation for a microphone of extremely fine accuracy of the type needed for measuring the acoustic environments of some scientific equipment. These microphones are unnecessarily sensitive, and expensive, for the more gross measurement of room acoustics. Methods sections of studies and clinical manuals should specify the type of microphone used to ensure that a quality Type II (or Type I) microphone and not an inaccurate toy-type or subprofessional microphone is being used.

Sound Level Measurements

The Recommended Standards for Newborn ICU Design are not intended to apply to all conditions but only to the design of NICUs (White, Smith, & Shepley, 2013). They are based on studies of the wake-up thresholds of multiple term infants (Philbin, Robertson, & Hall, 1999), not on the perceptions of individual preterm infants, which are best determined by the infant’s behavior.

The sound equivalent level (L eq) is a measure of central tendency. It gives a general picture of the sound levels (perceived as loudness or quietness) in a room over a given period of time. The standards (White et al., 2013) define an appropriate L eq for the design of a NICU infant room as 45 dBA over 1 h. The Recommended Standards are not particularly useful in evaluating specific sound levels as they affect a specific listener, adult or infant, at a specific time because they are intended for evaluating general room conditions occurring over an entire given period. For example, an hour is too long to be relevant to conditions during live interventions and some clinical studies.

The level exceeded 10% of the time (L 10) indicates the irregular, individual levels higher than 90% of other levels over a given period of time. It is an indicator of the general range of relatively loud sounds. An L 10 almost never occurs over a sustained length of time in a NICU and is 50 dBA over an hour in the Recommended Standards. These sounds would affect individuals; their sources are worth identifying and eliminating or reducing if the goal is to decrease the L eq and other levels.

Unlike the other two amalgamated levels, L max describes the single highest individual sound level occurring in a given period of time and lasting 1/20th of a second, a duration easily perceived by humans. There are usually a number of levels close to this one indicator, as readily seen on the graphs of time periods of, for example, 1 min produced by professional-grade, Type II dosimeter SLMs. L max is absorbed in L eq, making L eq an inappropriate measurement of individual, annoying, and distracting sounds. If the environment is perceived as noisy despite the L eq and L 10 being within limits, the sources of the individual L max sounds are worth locating and eliminating. The human ear is generally a good detector of these sound sources.
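The relationship among these three levels can be illustrated with a minimal sketch. It assumes a hypothetical hour of equal-duration 1/20 s readings in dBA; the function names and the data are illustrative only and are not taken from the Recommended Standards or from any SLM. A real SLM applies frequency and time weighting internally, so this is a simplification.

```python
import math

def leq(levels_dba):
    """Energy-equivalent level of equal-duration readings, in dBA."""
    mean_energy = sum(10 ** (level / 10) for level in levels_dba) / len(levels_dba)
    return 10 * math.log10(mean_energy)

def level_exceeded(levels_dba, percent):
    """Level exceeded about `percent` of the time (L 10 when percent is 10)."""
    ordered = sorted(levels_dba, reverse=True)
    index = min(len(ordered) - 1, int(len(ordered) * percent / 100))
    return ordered[index]

# Hypothetical hour of 1/20 s readings: a steady 44 dBA background
# plus five brief 70 dBA readings (for example, a dropped object).
readings = [44.0] * 71995 + [70.0] * 5

hour_leq = leq(readings)
hour_l10 = level_exceeded(readings, 10)
hour_lmax = max(readings)

print(f"L eq  = {hour_leq:.1f} dBA")
print(f"L 10  = {hour_l10:.1f} dBA")
print(f"L max = {hour_lmax:.1f} dBA")
```

In this invented hour the L eq stays close to the 44 dBA background while L max reports the 70 dBA event, which is the sense in which a brief, startling sound is absorbed in the L eq.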

Other Sound Levels of Interest: L 90 and L pk

The standards are focused on designing against high sound levels. However, the level of quiet is another measure important to research and clinical intervention (Philbin & Gray, 2001). This can be measured as L 90, the level exceeded 90% of the time. Old crowded NICUs often have a small difference between the noise floor (e.g., L 90) and the L eq; there may never be moments of quiet. Alternatively, a more quiet NICU room could have only a somewhat lower L eq but a much lower L 90. In other words, the spread between loud and quiet could be wider, but that reality is hidden in the L eq. A lower L 90 would indicate a lower noise floor and many episodes of relative quiet. Interesting quality improvement projects can be devised to lower the L 90.
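The contrast between two rooms with similar L eq but different noise floors can be sketched as follows. The two sets of readings are invented for illustration and assume equal-duration samples in dBA; the function names are hypothetical.

```python
import math

def leq(levels_dba):
    """Energy-equivalent level of equal-duration readings, in dBA."""
    mean_energy = sum(10 ** (level / 10) for level in levels_dba) / len(levels_dba)
    return 10 * math.log10(mean_energy)

def level_exceeded(levels_dba, percent):
    """Level exceeded about `percent` of the time (L 90 when percent is 90)."""
    ordered = sorted(levels_dba, reverse=True)
    index = min(len(ordered) - 1, int(len(ordered) * percent / 100))
    return ordered[index]

# Invented readings: a crowded room that never drops below 54 dBA,
# and a quieter room that rests at 42 dBA between busy periods.
old_room = [54.0] * 50 + [58.0] * 50
quiet_room = [42.0] * 70 + [58.0] * 30

for name, readings in (("old room", old_room), ("quieter room", quiet_room)):
    room_leq = leq(readings)
    room_l90 = level_exceeded(readings, 90)
    print(f"{name}: L eq {room_leq:.1f} dBA, L 90 {room_l90:.1f} dBA, "
          f"spread {room_leq - room_l90:.1f} dB")
```

With these numbers the two L eq values differ by only a few decibels, while the L 90 of the quieter room is 12 dB lower, so the episodes of relative quiet appear only in the L 90.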

Some studies report the peak level (L pk), a technical term for the highest individual sound level during a specific period of time. Graphs produced by dosimeter SLMs can include L pks, and their high levels look dramatic. However, these are instantaneous measurements and may not last long enough to be perceptible. They are not appropriate to descriptions of the sound level in a NICU room or of the sound available to a particular listener. L pk is available on a SLM for the purpose of protecting delicate instruments, such as an electron microscope, that are sensitive to high-level sounds. It is also useful in heavy industry where many brief and high-level sounds could affect hearing acuity (National Institute for Occupational Safety and Health – NIOSH, 2016a, 2016b). No known NICU can produce sounds that reduce hearing acuity, and reporting L pk in studies of the NICU environment misrepresents the conditions in the room.

The Error of Averaging Sound Level Measurements

A common error in the literature and in practice is to interpret a series of sound level measurements as a mathematical average. However, sound levels are logarithmic measurements; the numbers are multiples of each other. For example, 60 dBA is perceived as twice as loud as 53 or 50 dBA. Mathematical averages of several L eq measurements will always underestimate actual conditions (i.e., indicate the space to be more quiet than it actually is). To arrive at the L eq, a SLM performs complex calculations to equally distribute the energy reflected in the measurements over the specified time period. Therefore, to avoid confusion and misrepresentation of conditions, the L eq should not be thought of, referred to, or summarized as an average. The range of L eq measurements taken at the times of interest can serve as a correct representation of the central tendency of room conditions.
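The size of the underestimate can be seen in a short sketch. It assumes three hypothetical L eq readings taken over intervals of equal length; the values and the function name are illustrative only. The energy-based combination shown here is the standard way of combining decibel values and is offered only to show why the arithmetic mean misleads, not as a substitute for reporting the range.

```python
import math

def energy_combined_level(levels_dba):
    """Combine equal-duration levels by averaging energy, not decibels."""
    mean_energy = sum(10 ** (level / 10) for level in levels_dba) / len(levels_dba)
    return 10 * math.log10(mean_energy)

# Hypothetical L eq readings from three intervals of equal length.
interval_leqs = [45.0, 50.0, 65.0]

arithmetic_mean = sum(interval_leqs) / len(interval_leqs)
energy_level = energy_combined_level(interval_leqs)

print(f"arithmetic mean of dBA values: {arithmetic_mean:.1f}")
print(f"energy-based level:            {energy_level:.1f}")
print(f"range: {min(interval_leqs):.0f} to {max(interval_leqs):.0f} dBA")
```

For these invented readings the arithmetic mean is about 53 dBA, whereas the energy-based level is about 60 dBA, because the loudest interval dominates the total energy.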

Determining Sound Levels for Live and Recorded Speech and Music

To avoid being masked, recorded and live music must be played at a level perceptible to the infant. Keeping the signal below or at the L eq or L 10 limits in the standards may make it unintelligible to an infant and conceal positive findings, as may have happened in the study by Dearn and Shoemark (2014). The most appropriate limits for added stimuli should be based on an accurate, sensitive observation of the infant’s behavior. This might be done using behavioral observations such as those in the Newborn Individualized Developmental Care and Assessment Program (NIDCAP) (Als & McAnulty, 2014). If one is determined to use the standards for research and clinical purposes, levels of a song or mother’s speech might be above L 10 (50 dBA in the Recommended Standards) but not above L max (65 dBA in the Recommended Standards).

Location of the Microphone: Sound Levels and Distance Between the SLM Microphone and Listener

Some published studies place the microphone in the center of the room or at an unspecified distance from the infant listener and proceed as if the measured levels are those at the infant’s ear (e.g., Lahav, 2015). However, sound levels increase or decrease over distance, and the change is not linear but geometric. In an open field, sound is reduced by ~6 dB with every doubling of the distance between the microphone and the sound source (e.g., 1 m, 2 m, 4 m, 8 m, etc.) and, conversely, increases ~6 dB with every halving of that distance (e.g., 64 m, 32 m, 16 m, 8 m, etc.) as the microphone is brought closer to the sound source. The decrement in an old-style, crowded NICU with hard, reverberant surfaces varies from this relationship because sound is produced in many places and the room is reverberant; it is not a single sound source in an open field. In such an environment, sound levels and frequencies are best measured close to the infant but not so close as to bump the microphone with normal care activities. In sum, if levels are not measured at a specific distance from the infant, in a room of specific reverberation qualities, the measurements will be uninterpretable.
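The open-field rule of thumb can be written as a one-line formula. The sketch below assumes a single point source in a free field with no reflections, which, as noted above, does not describe a reverberant NICU room; the function name and the 60 dBA starting level are hypothetical.

```python
import math

def level_at_distance(level_db, ref_distance_m, new_distance_m):
    """Free-field level at a new distance from a single point source.

    Assumes inverse-square spreading: about 6 dB lost per doubling
    of distance from the source, and gained per halving.
    """
    return level_db - 20 * math.log10(new_distance_m / ref_distance_m)

# Hypothetical source measured at 60 dBA, 1 m away.
for distance_m in (0.5, 1, 2, 4, 8):
    level = level_at_distance(60.0, 1.0, distance_m)
    print(f"{distance_m:>4} m from source: {level:.1f} dBA")
```

Each doubling of distance from the source lowers the computed level by about 6 dB, and each halving raises it by the same amount.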

For example, consider a new NICU with widely spaced beds or single rooms and ample sound absorbent surface materials (i.e., without significant reflections; close to free field conditions). If an L eq of 70 dBA is measured 16 m from a specific infant (e.g., in the center of the room), the level would be about 64 dBA at 8 m (less than L max in the Standards), 58 dBA at 4 m, 52 dBA at 2 m, 46 dBA at 1 m (less than L 10 in the Standards), 40 dBA at 0.5 m (less than L eq in the Standards), and 34 dBA at 0.25 m. The last level is below most background sounds of the heating, ventilation, and air conditioning (HVAC) system and is probably not perceptible to many preterm infants, depending on the characteristics of the sound. In other words, the center-room level of 70 dBA is 34 dBA or less near the infant, depending on how close to the ear the microphone is placed.

SLM instruction manuals can be misleading for NICU room measurements because they are typically written for industry conditions. In industry, measurements are often taken in the center of the room because the point is to safeguard hearing from damage by high, sustained sound levels. Sound conditions at the sound sources may overrepresent the sounds at the workers’ ears. Surfaces tend to be hard and reverberant in these conditions and approximately the same throughout the space. These spaces are not relevant to any known NICU or to home conditions.

Placement of the SLM Microphone

Many studies and clinical interventions are flawed by placement of the microphone on padding laid atop the infant’s bed mattress. A Type I or Type II microphone is designed to collect sound waves from all directions. In a room it should be dangled at least 3 feet from a large reflective surface such as a wall or tall cabinet to avoid including reflections. It should not be suspended beneath an air-handling register to avoid overrepresenting HVAC noise. In an incubator the cable can be taped to the center of the inside top of the shell. Because of the small size of the infant compartment and its reverberant character, sound levels are virtually the same everywhere in an incubator, and there is no need to place the microphone near the infant’s head or to use two speakers for delivering sound. In any case the features of stereophonic sound are lost in the reverberant incubator shell.

While the microphone cannot be washed, the high-grade stainless steel of the exterior can be wiped clean with a near-dry alcohol pad. The soft microphone cover cannot be cleaned effectively for infection control purposes and is not needed in these conditions if the operators are careful not to bump the microphone. The operating instructions of the equipment usually specify methods of cleaning.

Summary and Conclusions

Animal models of the development of fetal and newborn sound perception and behavioral responses provide useful information impossible to obtain through direct study of the human. Such studies add knowledge regarding the perceptions of level, frequency, and other characteristics of sound that develop gradually during gestation and infancy. However, the perception of signal in noise is not completely developed until late childhood or early adolescence.

Sounds in the pregnant uterus are of low frequency and low level but nonetheless varied and detectable by the fetus. Sounds received at the fetal cochlea are lower than in the uterus itself. Intrauterine sounds form a mixed ambient background of eating, gastrointestinal activity, breathing, and moderately loud external sounds, including voices. Heartbeat sounds are detected occasionally but are no more distinct than other internal sounds. In some studies they are not detected at all by sensitive intrauterine hydrophones. However, the mother’s voice is easily detected by the infant due to its consistent prosody and the relative loudness of the higher frequencies.

It is important to protect fetal hearing by avoiding high-level vibroacoustic stimulation in the workplace, entertainment, and sport and by not coupling recorded sounds to the mother’s body via speakers attached to the belly or inserted in the vagina. Current, incomplete knowledge suggests caution regarding frequent exposure of the fetus to assessment techniques such as vibroacoustic stimulation.

The ambient sound levels of older-design, highly reverberant NICUs are overstimulating for the preterm infant and interfere with detecting the maternal voice. New-design NICUs, including single-room units, may produce annoying, distracting, and overstimulating brief sounds if the surfaces are hard and reflective even though the ambient sound level (L eq) may be within the limits of the Recommended Standards for Newborn ICU Design. Given the myriad, strong stimuli of all sensory systems experienced by the preterm infant, “sound deprivation” in a new or single-room NICU may be a misnomer. The amount of sound necessary for the development of hearing, other sensory systems, language, and organized neurobehavioral responsiveness is not known. The most likely sound experiences eliciting attention and recognition are those of the mother and family members during skin-to-skin holding. As described in classic and contemporary research, native (not managed) sounds specific to the family are the scaffold upon which future language and social competence build.

Key Messages

  1. The fetus typically lives in a low sound level (perceived as quiet) environment in which mother’s voice is a prominent signal but heartbeat sounds are not.

  2. Although the ambient sound equivalent level (L eq) of a NICU room may be low, hard, reverberant room surfaces can produce startling, distracting, and annoying individual sounds. These are best captured by measures of brief time periods, such as L max.

  3. Given the numerous strong and disorganizing multisensory experiences of intensive care, “sound deprivation” may be a misnomer if some responsive, live speech and singing are available, particularly during skin-to-skin holding.

  4. Natural conversation and singing by family members expose infants to the scaffold on which language acquisition is built.