Introduction

When residential smoke alarms were first developed and widely distributed in the 1970s the focus was on the technology to detect heat and/or smoke and little attention was paid to the nature of the audible signal. A loud, high frequency signal was easily generated by a small piezoelectric siren and this was often included as the normal alarm signal. As noted by Berry [1], the issue of the audibility of fire warning equipment was relegated to an appendix of the NFPA (74–1975) and the assurances about the ability of the signal to awaken people were at variance with the published auditory threshold data available at the time.

Fire code standards often include a requirement that the alarm in a bedroom must be received at the pillow with a minimum volume of 75 dBA (e.g. USA, Canada, UK and Australia). A caution that this level may not be adequate to awaken all sleepers is often included (e.g. AS1670.1-2004). A temporal three pattern (T-3) of beeps was defined in 1987 (ISO 8201 “Acoustics––Audible Emergency Signal”) and this was adopted by the NFPA in July 1996 (and later by many other countries) as the required fire notification signal, including in smoke alarms. The T-3 signal has three 0.5 sec signals with 0.5 sec intervals, and a 1.5 sec interval between each group of three. No recommendation as to a frequency of the sound is included.

Studies on auditory arousal from sleep have shown that most unimpaired adults will awaken quickly to low volume (60 dBA) hallway smoke alarms [2] and low level auditory fire cues [3]. It seems that sleep in “normal” populations is not in itself the major risk factor for fire fatality but that additional risk factors need to be present to substantially increase the chance of sleeping through an alarm. Studies of smoke alarms, auditory arousal and sleep [4] tells us that significant risk factors include being a child, being under the influence of hypnotics, being alcohol intoxicated, being hearing impaired, being sleep deprived and having high levels of background noise.

The ability to wake to auditory signals changes with age. Older people are likely to awaken more easily than younger people and children are generally the hardest to arouse [5, 6, 7]. Table 1 shows the gradual reduction in arousal thresholds across three different adult age groups, in two stages of sleep. Stage 4 is the deepest stage of sleep, while stage 2 can be described as a more intermediate level of sleep. Both are part of nonREM sleep. REM sleep is typically associated with reports of story-like dreams and its auditory arousal threshold is similar to stage 2 sleep [7]. Zepelin [7] found that the decline in arousal thresholds with age was sharpest in stage 4 sleep, but occurred in stage 2 and REM as well. The authors concluded that age was not as influential as individual differences in accounting for the auditory arousal threshold levels, but age differences were nevertheless substantial, with the decline becoming evident by the 40s.

Table 1 Auditory Awakening Thresholds (dB) to a 5 sec 800 Hz Tone at Three Different Age Levels by Stage of Sleep (n = 52)

There may be several factors operating that result in arousal thresholds declining with advancing age. Perhaps the most important is the age related change in levels of brain energy (using electroencephalogram (EEG) power spectrum density analyses) within sleep. Adult EEG energy levels (documented across ages 18–43 years) show a decline with increasing age [8]. Secondly, the duration of the deeper parts of sleep (known as slow wave sleep, SWS, consisting of both stage 3 and 4 sleep) reduces with age so that younger adults spend more time in SWS than older adults. The decrease is especially evident in the amount of stage 4 sleep in the older individuals and more so in men than women. In some cases stage 4 may disappear in people over the age of 60 [9]. A recent meta-analysis concluded that the minutes of SWS decline with age such that at age 65, 75 and 85 we could expect 67, 50, and 25 min respectively of SWS [10].

Previous research has raised the possibility that an alarm of a different frequency and/or different sound may be more effective for waking sleeping individuals at lower volumes than the current high pitched alarm. Comparisons of the waking effectiveness of different auditory sounds in unimpaired young adults aged 18–26 years (using a repeated measures design, wherein the efficacy of all signals were compared across the same participants) have shown that a female voice had a lower arousal threshold than naturalistic fire sounds [11]. Further comparisons of a high pitched alarm (4000–5000 Hz), a female voice, and a mixed frequency T-3 alarm (500–2500 Hz) found that the performance of the latter two signals was approximately equivalent, and both were significantly better than the high pitched signal [12]. Subsequent pilot work (M.Ball and D.Bruck, unpublished data) suggested that the male voice required a lower volume to wake sleepers than the female voice for most young adults tested (in only one of six participants was the female voice more effective).

Comparisons across different signals were also conducted for young adults after they had ingested alcohol and achieved a blood alcohol content of either 0.05 or 0.08 [12]. Alcohol significantly increased arousal thresholds and comparisons across the different signals of the high pitched alarm, female voice, and mixed frequency T-3 again found the latter two performed equally well and significantly better than the high pitched signal. The difference was greatest in the sober condition where the voice and mixed T-3 alarms elicited a response at around 20 dBA lower volume than the high pitched alarm.

Moreover, only 6% of children aged 6–15 years were found to awaken on both nights to a hallway high pitched alarm of 60 dBA [2]. When a subset of younger children (aged 6–10 years) were tested with an alarm above their bed (89 dBA), only 29% reliably awoke [13]. This younger age group was then tested with a variety of signals and it was found that, at 89 dBA, the three signals of a mother’s voice (saying the child’s name), a female voice and a mixed frequency T-3 alarm all performed significantly better than a high pitched alarm [14].

Thus the previous research suggests that when the high pitched alarm is compared to other signals, such as a voice or beeps of a lower, mixed frequency, it consistently performs more poorly. In response to such data the U.S. Consumer Product Safety Commission initiated a review [15] on the sound effectiveness of residential smoke alarms, with a focus on children and older adults. One recommendation was to explore the effectiveness of alternative signals to the current smoke alarm signal.

The ability of sleeping older adults to hear the current high pitched smoke alarm is of particular concern given the extent and nature of the age related decline in hearing. A large normative study of 48–92 year olds (n = 3753) found that 46% had some hearing loss [16] with older people most likely to lose their sensitivities to higher frequencies first. This gradual age related hearing loss, known as presbycusis, affects males more than females. Figure 1 shows that hearing thresholds (when awake) for a tone at 3000 Hz are much higher than for a 500 Hz tone for middle age and older male adults. For example, in order for a 70 year old man to hear a 3000 Hz signal it would need to be over 30 dBA louder than a 500 Hz signal. Presbycusis also occurs in females, but is less marked (see Table 4 in Methodology).

Figure 1
figure 1

Hearing threshold values (dBA) for tones at two different frequencies for males of different ages (right ear) when awake (data from [16])

Some studies have considered the ability of individuals of different ages to hear different sounds (when awake) encountered in medical environments and in home telephone ringers. One study [17] tested the ability of anesthesiologists across ages 25–74 years to hear alarms of different frequencies (31.5, 63, 125, 250, 500, 1000, 2000 and 4,000 Hz) in an operating room. They found that the inability to hear alarms occurred only with those alarms that had a frequency of 4,000 Hz and concluded that high frequency alarms may go undetected by the ageing human ear. Three acoustically different electronic telephone ringers were compared for 20–30 year olds and participants over 70 years of age [18]. For the older group it was found that signals with prominent mid range frequency components (1000–1600 Hz) could be more easily heard than higher frequency ringers (with peaks at 3150 and 20,000 Hz). The authors cite an early conference paper by Hunt (1970) which noted that the most effective ringers have at least two spectral components between 500 and 4500 Hz with a prominent component below 2000 Hz.

While the above findings consider how easily signals of certain frequencies may be heard across different age groups when awake, others have considered the urgency of a signal, with the assumption that an urgent signal will improve the effectiveness of emergency notifications in awake individuals [19, 20]. It has been reported that signals that produce the highest ratings of perceived urgency were those with a higher frequency, a fast speed (tested across 0–500 ms), and a high level of loudness [21]. The frequencies tested were across the range of fundamental frequencies from 200 Hz to 800 Hz, where each also had higher component frequencies. The one that was perceived as most urgent had a fundamental frequency of 800 Hz with components also at 1600, 2400, 3200 and 4000 Hz. However, it seems that perceived urgency is not important for the waking effectiveness of a signal. This conclusion is supported by the findings of Ball and Bruck [11, 12] wherein an actor’s voice recording, designed to convey a message urgently and with emotion, yielded similar arousal thresholds to a mixed frequency T-3 beeping signal across different conditions. (The influence of perceived urgency of the signal on behaviour after awakening is not known.)

Older adults are at higher risk for fire fatality than the national average, with people aged over 65 years having twice the risk, those over 75 having three times the risk and those over 85, four times the risk [22]. A breakdown of activity when fatally injured by fire shows that more older adults died while sleeping (36% of all >65 years fire deaths) than while engaged in other activities (31% were attempting escape and 22% were acting irrationally or unable to act) [23]. The key reasons for the overall high fire fatality rate for older people are unclear. Some fire deaths during the sleeping period may arise from sleeping through their smoke alarm and/or increased cognitive confusion on awakening, with both factors possibly exacerbated by the much higher rate of hypnotic use among older adults.

Other important factors may include their higher rates of disabilites impairing evacuation and socio-demographic factors affecting safety in their residences (e.g. no working smoke alarms). One issue that needs careful evaluation is whether older adults are significantly more likely to wake up to a different signal (especially one of a lower frequency), compared to the current high pitched alarm. The data reported above on young adults (sober and intoxicated) and children suggests that an alternative alarm may be better for a variety of population groups. Given the age related decline in the ability to hear high frequencies the difference between signals for sleeping older adults may be particularly large.

No studies have been published to date that investigate the extent to which older individuals will waken to the current smoke alarm signal, or how their responsiveness to other signals may compare. The first step is to investigate such questions in a group of unimpaired older people, who are within the normal hearing limits for their age. This project empirically investigates this issue with regard to sleeping individuals aged 65–85 years. The results may have implications for the development of a more effective fire notification signal.

Research Aim

The aim was to investigate the arousal thresholds from sleep in older adults (aged 65–85 years) to the current US smoke alarm emitting the high pitched T-3 and compare these thresholds to several alternative signals. A general description of each signal and their rationale are set out in Table 2. Further details of each signal can be found in ‘Apparatus’.

Table 2 Description and Rationale for the Four Signals Delivered in this Study

Hypothesis 1

The older adult sample would have significantly higher arousal thresholds to the high pitched T-3 signal than to the two signals of mixed frequency (the mixed T-3 and the male voice).

Note: The inclusion of the 500 Hz pure tone was exploratory as there had been no previous research using such a tone. However, it was felt to be particularly valuable to determine whether a pure low frequency tone performed equally well, or better, than mixed frequency signals that incorporated 500 Hz levels.

Hypothesis 2

The older adult sample would have significantly lower arousal thresholds to all signals than a young adult sample previously tested under comparable conditions. (This applies particularly to the mixed T-3 and male voice signals where identical comparisons are available.)

Methodology

Participants

Forty five adults aged 65–83 years were involved in the sleep research. The overall mean age was 73.1 years (S.D. = 5.6). Table 3 shows the age and sex distribution of participants who completed the study. As three people dropped out before completing all four signals, the repeated measures analyses involved only 42 participants. The numbers of those who did complete all four signals is shown in brackets in Table 3. Three participants had insufficient English to follow instructions and a member of their family volunteered to provide translations (they all spoke Arabic).

Table 3 Numbers of Participants as a Function of Their Age and Sex

Recruitment was conducted by a graduate psychology student, predominately through social groups. All participants were paid $200 (Australian) for their involvement. Where recruitment was from a social group, the group was paid $150 for each person who completed the study.

Inclusion criteria for participants were that they had to be independently mobile (although use of a walking stick or walking frame was permitted), not taking medication affecting their sleep, cognitively capable (screened using the Mini Mental State if doubts existed), report that they normally do not have significant difficulties falling asleep, and report that they considered their hearing to be average or above average for someone their age.

A total of 59 potential participants underwent the hearing screening test and nine failed (15%). Five people passed their hearing test but then chose not to commence the study. The hearing of each potential participant was screened across five auditory frequencies (500, 1000, 2000, 3000 and 4000 Hz) by a professional audiologist from HEAR Service Victoria. The criteria levels are shown in Table 4 below. They are based on the requirement to perform within, or better than, one standard deviation of the mean age and sex-matched normative threshold at each of the five frequencies in each ear, with those who performed in the lowest 16% for their age and sex (at any of the five frequencies in either ear) being excluded. Table 4 shows that older participants had a much higher allowable auditory threshold and this was especially so for the 3000 Hz and 4000 Hz frequencies. For example, an acceptable hearing threshold for an 82 year old male for 3000 Hz was any value below 80 dBA. Normative mean values and standard deviations were from data collected in 1993–1995 from Beaver Dam, Wisconsin USA, involving 3,753 people aged 48–92 years [16]. Inspection of the reasons why the nine people failed their hearing test shows a near even spread across age groups (above or below 75 years) and sex, as well as a spread across various frequencies failed and a mix of hearing problems in one ear or both.

Table 4 Minimum Pure Tone Air Conduction dBHL Thresholds Required to Pass Screening Criteria for Females (F) and Males (M) as a Function of Age, Across Different Frequencies

Apparatus and Materials

Signals

Two sets of equipment were used. Each set consisted of portable sleep stage monitoring equipment (Compumedics Siesta), a laptop computer, two speakers to deliver the alarm signals and a hand held sound meter (Lutron SI-4001). The latter were professionally recalibrated immediately prior to the study. Full details of the process by which the multiple sound files were created for delivery during sleep, as well as information on sound measurement, calibration, and delivery can be found in Bruck et al. [5].

The origin of the four sounds are as below:

  • Mixed T-3 was from Simplex 1996, 4100 Fire Alarm Audio Demonstration CD.Footnote 1

  • Male voice was recorded in a radio studio with a male actor, chosen for his particularly deep voice. The 10 sec message said “Danger! Danger! There is a fire. Wake up! You must get up and investigate. There is fire. Get up now!”

  • High T-3 was recorded from a current US smoke alarm (Kidde).

  • 500 Hz T-3 was generated by a computer program.

Frequency details of the four sounds are as follows:

  • Mixed T-3 had a fundamental frequency of around 520 Hz (+/− 4 Hz) with odd harmonics (3rd, 5th etc.) It was a distorted square wave. The spectral analysis and wave shape of the mixed T-3 is shown in Figure 2.

  • Male voice had dominant frequencies in the range from 500 Hz to 2,500 Hz, with some additional frequencies from 2,500 to 4,000 Hz.

  • High T-3 had a single fundamental frequency just above 3000 Hz.

  • 500 Hz T-3 was a pure tone of just below 500 Hz.

Figure 2
figure 2

Sound wave shape (upper graph) and spectral analysis (lower) of the A-weighted 60 dB mixed T-3 sound file where SPL is Sound Pressure Level in Pascals and RMS is Root Mean Square

Spectral analyses of all signals are reproduced elsewhere [5]. The spectral analysis shown in Figure 2 is from the computer file. Spectral properties of signals change with different volumes, different environments, different amplifiers and different speakers. In this study the latter two were held constant, while the volume was manipulated and the environments differed with different bedrooms.

General Forms

In addition to a screening questionnaire (based on the selection criteria) a short questionnaire was devised to check that (i) the sleep of each participant on the previous night had been within normal limits and (ii) that their alcohol consumption after 4 pm that day was none or minimal (no more than one standard drink).

Procedure

Each selected participant had their sleep monitored on two separate nights in their own homes. Two different signals were presented each night. Tests were normally one week apart to allow for recovery from any sleep deprivation, with the minimum being three nights. The participant was required to sleep on their own with the bedroom door closed.

All participants were told they needed to have an average or above average sleep (for them) the night before testing and that only a very moderate quantity of alcohol, if any, could be consumed earlier in the day and that it was important that both days of testing were as similar as possible. A questionnaire was completed each testing night to check these requirements (and in all cases they were met). The sleep technician (ST) arrived at the participant’s home about 1.5 h prior to their usual bedtime. After setting up the equipment the ST measured the level of background noise in decibels (using an average reading with the meter on a slow response). They then calibrated the sounds in the bedroom. The speakers were placed approximately 1 m from the pillow. A file of the mixed T-3 sound was played which had previously been recorded to be received at 60 dBA when 1 m from the speakers and the delivery levels calibrated (with the sound meter on maximum hold). The sounds to be delivered that night were played to the participant without comment. This was to ensure that all sounds had been heard beforehand, rather than just the sound that was used for calibration and/or the usual smoke alarm signal. The sleep technician applied the electrodes to the scalp and face.

A series of performance tasks were completed prior to sleep and after the final awakening each night. These were to assess the possible effect of sleep inertia. The results of this testing will be reported in a separate paper.

After preparation for the sleep monitoring the participant went to bed to sleep. The ST was in the hallway outside the bedroom monitoring their sleep patterns on a laptop computer. Signals were delivered in slow wave sleep, either stage 3 or 4. (The option of either stage 3 or 4 sleep was chosen in contrast to only stage 4 sleep, which has been used with younger adults, because of concern that not all participants would enter stage 4 sleep sufficiently to consistently present all signals in that stage.) When the participant entered stage 3 sleep the ST waited 90 sec before delivering the sound. If the participant moved to a lighter sleep (e.g. stage 2 or 1) then the ST waited till they again reached stage 3 sleep and maintained it for 90 sec (or went into stage 4 sleep). They then commenced the automatic sound delivery program, set to play the required auditory signal. Sounds were delivered via a modified method of limits procedure. This meant that each signal was presented at each volume level for 30 sec at a time, beginning at a low pillow volume level (35 dBA) and increasing by 5 dBA until awakening occurred. The loudest signal was 95 dBA and this continued for a total of three and a half minutes, or until awakening occurred, whichever occurred first. The participant was instructed that when they awoke to a signal they should press the button by their bedside three times to indicate that they were awake. After the first signal they should try to return to sleep immediately. With the second (and final) signal presentation each night they were required to do the sleep inertia tasks once they woke up. Upon completion the electrodes were removed and the ST departed. All participants had the same ST on both nights. The order of the presentation of the signals was counterbalanced across both subjects and nights.

This research was approved by the Victoria University Human Experimentation Ethics Committee.

Data Analysis

All data was analysed using SPSS for Windows 11.0 and the alpha level required for significance was set at 0.05. For each of the four signals the following dependent variables were available for analysis across the whole group and as a function of age group (above and below 75 years) and sex:

  • Auditory arousal threshold (AAT), or the mean decibel level at which participants awoke––mean, standard deviation, range and median.

  • Behavioural response time––mean, standard deviation, range and median. The behavioural response time was the accumulated time to press the bedside button from when the signals were first presented at incremental volumes (i.e., increasing from 35 dBA every 30 sec to maximum of 95 dBA).

For the purposes of data analysis it was important to be able to incorporate the data of those who slept through the presented signals in such a way as to allow statistical comparisons. This was operationalised as in previous studies [11, 12]. Specifically, if a participant slept through the full three and a half minutes of signal presentation at 95 dBA the volume at which they awoke was arbitrarily assigned as 105 dBA and their behavioural response time as 600 sec. In considering mean values the effect of such an assignation must be kept in mind as it may underestimate the mean values of those signals where people are more likely to sleep through very loud volumes. This is because it assumes that everyone will wake up at 105 dBA, but this in fact may not be the case at all. It may also mean that statistical comparisons may fail to find a difference when there really is a difference. Median values are, of course, unaffected by this.

Comparisons were made across the four signals (using repeated measures analysis of variance) and across the categories of age and sex (using independent t-tests). Some descriptive frequency analyses were also conducted on the AAT in terms of how many people woke at different decibel levels.

Data from this study were directly compared to data collected by M. Ball and D. Bruck (partially published in [12]) in sober 18–26 year olds.

Results

Differences Across the Four Signals

There was a highly significant difference between all four signals presented for both of the dependent variables measured (behavioural response time and auditory arousal threshold, AAT). Table 5 presents the relevant data and results of analyses. Participants awoke most readily to the mixed T-3 signal, while the highest AAT was to the high T-3 (the current US smoke alarm signal). Consideration of the median AATs showed a 20 dBA difference, from 45 to 65 dBA, between the most and least effective signals.

Table 5 Summary of Descriptive Statistics and Repeated Measures ANOVA Analyses for Auditory Arousal Threshold (AAT) and Behavioural Response Time for the Four Signals Presented

Table 5 also shows the percentage of participants who slept through the 75 dBA level (the minimum recommended level at the pillow in the US). Between 14 and 18% slept through the three signals that performed most poorly (high T-3, 500 Hz T-3 and the male voice), while 5% slept through the mixed T-3 at 75 dBA.

It can be seen in Table 5 that three of the older adult group did not awaken at all to the male voice (at 95 dBA). On closer inspection of the raw data it was determined that two of these people were from a non-English speaking background (NESB, Arabic) and had participated in the study with the help of a translator. They had not slept through any other signal presented. It was decided to re-run the key analyses across the four signals omitting the three NESB participants. This re-analysis changed the mean AAT to the male voice from 55.9 dBA (S.D. = 19.2) to 53.6 dBA (16.4) but did not change the level of significances of the overall analyses, including the pair-wise comparisons shown in Table 6. Thus the results for the male voice signal were not confounded in any important or significant way by the inclusion of the three NESB participants. (Although they certainly raise an issue to be researched further if a voice alarm is being considered.)

Table 6 Matrix Showing the Level of Significance for Pair-wise Comparisons Across the Four Signals (Using Least Significant Difference Statistic) (n = 42)

Table 6 shows the pair-wise comparisons between all four signals across the whole sample (using the Least Significant Difference statistic). The same pattern of differences was found for most comparisons whether the dependent variable was mean AAT or mean behavioural response time. Importantly, the mixed T-3 fared better than ALL other signals presented, with either a significant difference being found between comparisons, or a trend.

Sex and Age Differences

Further analyses were conducted to consider sex differences and differences between the 65–74 year olds and 75–85 year olds. Table 7 shows that no significant differences were found between males and females for AATs to any of the four signals. Table 8 shows that there was a significant difference between the 65–74 and 75–85 age group on AATs to the high T-3, with the 75–85 year old adults having higher AATs. For the older group the median was 70, compared to a median of 60 for the 65–74 year olds. It was found that 5/18 (28%) of the 75–85 year old participants slept through the high T-3 at 75 dBA, while 1/18 (6%) slept through the 95 dBA high T-3. Median high T-3 AAT levels of males and females in the 75–85 year old age group were 70 dBA for each sex.

Table 7 Summary of Descriptive Statistics and Independent t-test Analyses for AATs for the Four Signals Presented for Males Versus Females
Table 8 Summary of Descriptive Statistics and Independent t-test Analyses for AATs for the Four Signals Presented for 65–74 year olds Versus 75–85 year olds

To further investigate the relationship between age and AAT a correlation was performed comparing age with AAT for each of the four signals presented. A moderate correlation was found between the high T-3 AAT and age (r = .47, P = .001, n = 44). All other correlations were less than 0.4.

Comparisons Between Older Adults and Young Adults

In this study several of the signals presented were the same as presented earlier to a group of 18–26 year olds, as reported in [12], except that more participant data had become available. The young adult data is based on n = 14 for all signals, except the male voice where n = 10. Thus comparisons could be made across age groups, and they are shown in Figure 3. Whilst both studies used similar methodology and identical mixed T-3 and male voice signals, there were two differences. First, for the older adults signals were presented mostly in stage 3 sleep (see below), whereas for the young adults all were presented in stage 4. Nevertheless, for both groups this sleep represents their dominant deepest sleep stage (as stage 4 declines considerably in older adults). Second, across the two age groups the “high beeps” were both high pitched, but with the young adult group the sound was a 4000–5000 Hz continuous fast beeping (as found in the older US alarms), whilst in the older adult group the 3000 Hz beeps were in a T-3 sequence.

Figure 3
figure 3

Comparison of AAT (dBA level at which awoke) for the older adult sample with a sample of young adults (see text)

Independent T-tests were conducted comparing the young adult sample with the older adult sample for the three signals where data was available for both age groups. For the mixed T-3 signal a significant difference was found (t = 2.31, df = 55, P = .03), indicating that the young adults had a significantly higher mean AAT (57.9 dBA, S.D. = 13.9) than the older adult sample (48.0 dBA; as in Table 5). Comparisons between the two age groups for the male voice and the high pitched alarms, showed no significant differences. For the latter signal a minor, non-significant difference between means was evident in the predicted direction. The expectation that the older adults would have lower AATs to all signals compared to the young adults was not fulfilled. If the NESB participants are excluded from the male voice data for the older adults, the means are very similar to the male voice data for the young adult group.

Other Data

Presentation Order

It is possible that the AATs may have been influenced by the order of presentation of the signals. In other words participants may have found it easier, or harder, to wake up quickly with the first signal presented compared to the fourth signal. To check for the possible existence of such a confound a repeated measures ANOVA was conducted where the independent variable was order of presentation. No significant order effect was found (F(3,39) = 1.16, P = .34).

Time to Fall Asleep

Participants fell asleep at night without too much difficulty. The mean time taken to fall asleep for the first opportunity to sleep each night (initial sleep onset latency) was 18.4 min (S.D. = 8.3) with a minimum of 5.5 min and a maximum of 39.5 min. Perusal of the intervals between the first and second awakening each night showed that in almost all cases the time needed to fall asleep after the first awakening was within the range of the initial sleep onset times.

Background Sound Levels

The average decibel levels measured in the participants’ bedroom (with the door closed) was 39.4 dBA (S.D. = 4.6) with a range from 33.5 to 50.5 dBA. Frequencies were not measured.

Stage of Sleep/Response Time

The aim was to present all the signals in slow wave sleep (stage 3 or 4). The signals were to be presented 90 sec after the first appearance of stage 3 sleep, provided the person did not move into a lighter stage of sleep (e.g. stage 2). During this 90 sec interval some participants would have moved into stage 4. Independent scoring of a representative sample of the sleep stage data was subsequently conducted and it was determined that approximately 92% of all signals were initially presented during stage 3 sleep and about 8% were presented during stage 4 sleep. No systematic differences in response time between these two groups were evident.

From this sample it was further determined that in approximately two thirds of all participants less than 15 sec elapsed between the first moment of electroencephlagraphic (EEG) defined wakefulness to the time of first pressing the bedside button. About one third took between 16 and 60 sec.

Discussion

Differences Between Signals

The most important finding was that the mean arousal threshold (AAT) for the mixed T-3 was lower than for the other three signals tested, suggesting the mixed T-3 was the most effective signal for arousal. The ranking of the four signals, in terms of best to worst responsiveness was; the mixed T-3, the 500 Hz T-3, the male voice and the high T-3. The first hypothesis stated that the older adult sample would have higher AATs to the high pitched T-3 than to the two signals of mixed frequency (the mixed T-3 and the male voice). This hypothesis was strongly supported for the mixed T-3 but less so for the male voice signal. The results showed that mean AATs to the high T-3 were significantly higher than to the mixed T-3. In other words responsiveness during sleep to the high T-3 was significantly worse than to the mixed T-3. The difference between the median levels was 20 dBA, with the mixed T-3 having a median AAT of 45 dBA and the high T-3 median AAT being 65 dBA. This is a substantial difference.

The male voice signal did not differ statistically from the high T-3 in analyses with the behavioural response time variable, but did show a low level significant difference (P = 0.03) when the AATs for the two signals were compared. This suggests a possible marginal superiority of the male voice over the high T-3, but not a robust difference. Importantly, the mixed T-3 showed a clear superiority over the male voice signal (i.e. significantly lower AAT, P < .01).

The 500 Hz T-3 performed significantly better than the high T-3, with highly significant differences in responsiveness to the two signals being found. It did not perform significantly better than the male voice signal. AATs of the mixed T-3 were found to be marginally superior than the 500 Hz signal (statistical trend). (See section on mixed T-3 and voice signal below for further discussion.)

Sex and Age Analyses

No sex differences were evident in the AAT data. A significant difference was found between the 65–74 year olds and 75–85 year olds on AATs to the high T-3, with the older group needing increased volumes to wake up. This older group had a median AAT of 70 dBA for the high T-3, compared to a median AAT of 40 dBA for the mixed T-3. This substantial difference is most likely related to the normal age related decline in ability to hear high frequencies. In this context it should be noted that the hearing screen criteria (as set out in Table 4) had threshold criteria at 3000 Hz of 70 dBA for those males aged 70–79 and a threshold of 50 dBA for females in the same age group. This sex difference in auditory thresholds when awake did not translate to a sex difference in AATs to the high T-3 when asleep, even for the 75–85 year old group. There was not even a trend for a difference between the sexes of the over 75 year olds on the high T-3 when they were asleep.

Older Adults Versus Young Adults

The second hypothesis predicted that the older adults would have lower AATs to all signals than a young adult sample tested under comparable conditions. Identical signals (under similar presentation methods) were available for the mixed T-3 and the male voice. The hypothesis was supported for the mixed T-3 signal, where the young adults were found to need significantly higher volumes to wake up compared to the older adults. It was not supported for the male voice signal, where the young adults had similar AATs to the older adults. Mean values suggested a minor non-significant difference in the predicted direction for the high frequency signals.

The Mixed T-3

In this sample of older adults the mixed T-3 signal clearly woke individuals more readily than the alternatives. The mixed T-3 has now been found to perform significantly better in awakening sleepers at lower volumes than the current high pitched T-3 for various population groups. These groups are children (aged 6–10 years [14]), sober young adults and alcohol intoxicated young adults [12] and, from the current study, older adults.

The fact that the participants in the current study did not wake as readily to the 500 Hz T-3 signal shows that this advantage does not simply arise from the inclusion of low pitched frequencies in the mixed T-3. Comparison of the spectral analyses of the mixed T-3 and the 500 Hz signal show that both have the same dominant frequency at around 500 Hz, but the mixed T-3 also includes the 3rd, 5th etc. harmonics, with the most dominant harmonic being at just over 1500 Hz. It would seem that the inclusion of these harmonics is important in helping people wake up at lower volumes. It needs to be determined if other mixed frequency signals, with perhaps higher dominant frequencies and higher harmonics are even more effective at waking people up.

It is not immediately obvious why the mixed T-3 signal should be the most effective signal tested so far for waking individuals up. Perhaps it is because human responsiveness to sounds when asleep is best when the signal includes a range of frequencies. However, because the male voice did not wake people as easily as the mixed T-3 it is not the case that any pattern with a mixed frequency profile between 500 Hz and 2500 Hz will give similar AATs. The fact that the best signal for waking up tested so far includes a range of spectral components in the mid range is consistent with earlier data testing responsiveness to different ringers when awake [18].

Voice Signal

The research literature, using comparatively small sample sizes, suggested that voice alarms performed as well as the mixed T-3 with the children and young adult samples, while the current study suggests voice alarms perform significantly more poorly than the mixed T-3 with older adults. The reduced responsiveness to the male voice in these older adults was a somewhat unexpected finding which did not arise from the inclusion of three non-English speaking (NESB) participants. However, the fact that two of the NESB participants did not wake at all to the male voice suggests that voice alarms with an English text may be unsuitable for use in a population that includes NESB people.

While it is possible that the increased AATs to the male voice are related to the reduced ability to discriminate speech with increasing age [25, 26], this seems unlikely to be the full explanation, as the task did not require understanding the words of the signal (which they had heard just prior to sleep), only waking up when it occurred. One possibility is that older people may have developed more of a capacity to screen out irrelevant (and/or familiar?) noises occurring during sleep, and many irrelevant noises relate to people talking (e.g. background TV or radio sounds, conversations). Alternatively, perhaps the higher arousal thresholds to the voice signal was a function of increased dream incorporation of the voice compared to beeps, and older people may be more likely to do this than younger adults. In the absence of any evidence this remains speculative.

Signal Continuity

Comparisons of this data with the AATs found by Zepelin et al. [7] (where a 5 sec 800 Hz signal was presented at 2 min intervals), show much lower AAT levels in the current data compared to Zepelin et al. and a much smaller gap in AATs between the different age groups (44 dBA gap between AATs across the two studies for the young adults versus 22 dBA gap across the two studies for the older group). These differences suggest there is an arousal advantage in a signal presenting a sequence of short, 0.5 sec signals (as in the T-3), (compared to lone signals of 5 sec duration) and this advantage may be especially important for the younger adults.

Experimental Versus Field Settings

The sleep study was designed with one main intention. This was to compare arousal thresholds to different signals in older adults and to do so in a well controlled manner to reduce heterogeneity and therefore obtain the most valid result with a manageable sample size. Some of the screening criteria were also desirable for ethical and safety reasons.

Extrapolating this data to actual field situations is highly problematic. This is the case for both arousal thresholds and estimates of what percentage of the population might wake up at what volume. There are multiple reasons for this. One set of limitations relates to the population used, while another relate to the methodology.

The selection criteria make it clear that this sleep study was conducted in a highly selected population of older adults. In many ways they represent the most highly functioning adults in the over 65 year age range. All older adults were omitted from the study who were taking medication affecting sleep, cognitively not fully capable, not independently mobile, reported sometimes having difficulties falling asleep, and considered their hearing to be below average for someone their age. It was also required that alcohol intake was minimal or none during the study. In addition all those whose hearing at any of the five selected frequencies in either ear was in the lowest 16% of the population were excluded. Furthermore, the fact that recruitment was conducted through social groups meant that those older adults who were more isolated were not given the opportunity to be involved.

The methodology of the signal presentation in the study is not the same as may be expected in an emergency awakening situation in the home and three points are especially relevant. Firstly, the modified methods of limits methodology meant that each volume level of a signal was not presented from silence, as would be the case when a smoke alarm signal sounds. Examination of the literature on awakenings to alarms [4, 5] suggests that people may have lower AATs (i.e. wake up more readily at a certain decibel level) when that sound cuts in from silence. Thus the present AAT levels may be slightly inflated, with the sleepers becoming habituated to the gradually increasing volume of a signal up until a certain level, which then arouses them. The advantage of not including silences between the signals at different decibel levels is that it more readily allows statistical comparisons across signals, without the confounding effects of dealing with arousals that may happen in the intervening silence. This confound was discussed as a problem in an earlier paper [11]. An additional reason why generalisations are problematic is that all these participants were awoken from their deepest sleep, slow wave sleep (SWS). Previous literature suggests that AATs from other stages of sleep would be reduced (i.e. woken more easily) from other parts of sleep, with one study suggesting the reduction would be by an average of 10 dBA in older adults [7]. In older adults most of their sleep consists of lighter sleep, with the minutes of SWS declining from an average of 67 min per night for 65 year olds to 25 min in 86 year olds [10]. Another consideration, however, is that most fatal fires occur earlier in the sleeping period, which is when SWS is most likely to occur [27]. The third, and most important, methodological difficulty in generalising AATs from this study to real field situations is that in this study all the participants were primed, that is, they knew they would be awoken by a signal, and knew which signals would be played on a testing night. Priming will significantly increase the likelihood of awakening, for example from 25% to 90% [28], and this would decrease AATs.

Overall, the population factor acts to decrease the AATs (in the study, relative to the field), while within the methodological factor, the lack of silence and the use of the deepest sleep stage probably both act to increase the AATs. However, priming within the sleep study would decrease AATs. The differential effect of all these factors is not known for certain but the literature would suggest that the issues exerting the greatest effect are the population and priming (both of which decrease AATs). Thus in an unprimed, field population that includes people with a range of risk factors (e.g. intoxicated, sleep deprived, poorer hearing, being a child, etc.) the responsiveness to signals would be reduced, compared to what has been found in this study. In other words, the AATs and percentage of those who slept through sounds in the study would be underestimates compared to a field setting.

Thus it is not valid to form any conclusions or recommendations on the notion that this sleep study suggests a certain percentage of older adults would wake to any particular alarm in the bedroom. In reality it is not known what percentage of adults over 65 would wake up in an emergency situation in their home. We do know, however, that of the alternatives tested and under the conditions of the sleep study the current high T-3 signal performs the worst and the mixed T-3 signal performs the best.

Conclusions and Recommendations

The present study, using a rigorous design and sufficient sample size of sleeping adults aged over 65 years, has found a substantial difference in the median auditory arousal threshold between the current high frequency T-3 and the best performing alternative signal tested. The difference was found to be approximately 20 dBA. The data from this and other studies show that a high frequency alarm signal, performs the most poorly of the alternatives tested for waking all the different population groups tested so far (i.e. children, sober and alcohol intoxicated young adults, older adults aged over 65 years). The evidence is sufficient to lead to the following recommendation:

The high frequency alarm signal currently found in smoke alarms should be replaced by an alternative signal that performs significantly better in awakening most of the adult population, once the nature of the best signal has been determined.

The findings of the current study, together with previous literature, indicate that a mixed frequency T-3 signal has performed significantly better than a high frequency signal in its ability to awaken sleepers in every sample group tested so far. This includes children, young adults (sober and alcohol intoxicated) and older adults. Voice signals appear to be as effective as the mixed T-3 in the children and young adult groups, but are less effective than the mixed T-3 in the older adults.

Unfortunately, direct extrapolations from the present data to the field in terms of expected arousal thresholds in a real emergency or percentages of the population that may awaken to certain signals at certain volumes are not possible from this study due to the limitations imposed by the highly selected sample and methodology used.

The implications of introducing a signal frequency recommendation into the standards for smoke alarm notifications are considerable, involving a retooling of the entire industry. In view of this, any signal change that is mandated must be done on the basis of rigorous evidence that the best signal has in fact been found. The research is not yet at this point. Suggestions of future research issues are set out below. These may take two to three years to complete. In the meantime there are some recommendations that can increase the chance of sleeping individuals waking to a fire.

  • Encourage interconnected alarms. Interconnected alarms that include an alarm in the bedroom will mean that the volume at the pillow is likely to be above 85 dBA. Increased volume of any signal that can be perceived will increase the chance of waking up. In addition, the more people in a household that are exposed to alarm signals, the more chance that one of them will wake up.

  • Consider the special hearing needs of “normal hearing” older adults. It is inadequate to require the smoke alarm (of 3000 Hz) for older adults to be a minimum level of 75 dBA at the pillow. The current study shows that both males and females aged over 75 were particularly poor at waking to the current high T-3 (median AAT of 70 dBA). (If their smoke alarm was a mixed T-3 signal the median AAT value would be 40 dBA.) One possibility would be to recommend that older adults should have interconnected alarms, or at the very least stand alone alarms (with the current signal) in their bedroom. An additional, more satisfactory, possibility is for smoke alarm manufacturers to market special alarms for this age group that emit a mixed T-3 signal and suggest placement, at a minimum, in the bedroom.Footnote 2

Future Research

The future research that should be completed prior to the mandating of a specific signal encompasses a variety of issues.

  • Determine the optimal pitch and pattern of an alternative signal to wake people up. Given the large individual differences in waking thresholds this should be done as a repeated measures design in an accessible population that sleeps well, such as young adults. From the results of the current study it is recommended that the option of a voice alarm be no longer considered for adult populations, both because it performed poorly with the older adults and because of the indicated problems with individuals who do not speak English. In view of the infinite number of different sound patterns that an alarm could take and the current dominance of the T-3 as the mandated temporal pattern it is suggested that alternative pitches (e.g. pure and mixed) and pitch patterns (e.g. changing frequencies across the 0.5 sec tone) be investigated within the T-3 temporal pattern, at least in the first instance.

  • Once several signals have been shown to have the lowest auditory arousal thresholds (AATs) in the one population tested, they need to be tested in other sleeping populations, especially those most at risk of dying in a fire or of sleeping through an alarm signal. These groups include prepubescent children, older adults (65+years), adults impaired with alcohol or hypnotic medication and people who are sleep deprived (such as is frequently the case with shift workers and adolescents). The signals should also be tested for salience and/or urgency as an emergency notification signal requiring action in awake individuals.

  • Because of the inability to generalise data from the current study to field estimates, further research is needed using large numbers of non-primed, unselected groups to yield population based estimates of waking effectiveness. It seems most likely that the research to date may be underestimating the proportion of people who will not wake up to an alarm. This arises from a range of factors but especially because almost all of the participants in the relevant empirical studies on alarms and sleep have been primed to expect that a signal will go off on one of several nights. Studies are needed where there is a long time frame (e.g. one or two months) within which a test alarm may be activated and an unselected large sample is tested. This will yield valuable field estimates of alarm effectiveness where conditions influencing responsiveness are uncontrolled (e.g. alcohol intake, prior sleep deprivation, prior time in bed, sleep stage). A challenge would be the accurate monitoring of the latency to wake up.

  • A study characterising the spectral characteristics of the background noises in a range of “typical” bedrooms would be informative and relevant. When this information is put together with the acoustical information about which signal is most likely to awaken sleepers, the extent of possible masking can be determined.