Introduction

“Deutsch’s illusion” (Deutsch, 1974a, b, 1975), also often called the “Octave illusion”, occurs when a dichotic pair of tones (i.e., two different tones presented simultaneously, one to the right ear and the other to the left ear) spaced an octave apart is repeatedly presented in alternation, so that when the right ear receives the high tone, the left ear receives the low tone, and when the right ear receives the low tone, the left ear receives the high tone. In other words, two identical tone sequences are presented, one to the right ear and the other to the left ear. Each sequence is composed of two alternating tones, a low-pitch tone and a high-pitch tone, spaced an octave apart. One ear receives the sequence shifted in time by one tone position with respect to the other ear, so that when it receives the high tone, the other ear receives the low tone, and vice versa. Under this stimulation, subjects typically perceive a single low-pitch tone at one ear alternating with a single high-pitch tone at the other ear (see Fig. 1). The percept contains two illusory elements: (a) only one tone is perceived at a time, whereas throughout the stimulation two different tones are presented simultaneously, one to each ear; (b) one of the two tones (the low-pitch or the high-pitch one) is perceived at the ear where it is actually presented, but the other tone is perceived at the wrong ear, i.e., at the ear that is not actually receiving it. Right-handed listeners perceive the high tone in the right ear and the low tone in the left ear significantly more frequently than left-handed listeners. Familial handedness background also plays a role: the tendency to hear the high tone on the right and the low tone on the left is lower among those with a sinistral familial background than among those with a purely dextral familial background (Deutsch, 1983). Moreover, the proportion of listeners obtaining complex and less stable percepts is higher in the left-handed population (Deutsch, 1974b). Musical training can also influence the perception of the illusion (Craig, 1979).

Fig. 1 Top: Stimuli that elicit Deutsch’s illusion (numbers indicate tone frequencies in Hz). Bottom: Typical percept of a right-handed subject; “low” and “high” refer to the perceived pitch. Note that, outside the context of the illusion, the 400-Hz tone would be perceived as the “low tone” and the 800-Hz tone as the “high tone”

Originally the illusion was observed with sinusoidal tones of 400 and 800 Hz, and almost all subsequent work on the topic has used these two frequencies (Deutsch, 1978, 1983; Ross, Tervaniemi, & Näätänen, 1996; Lamminmäki & Hari, 2000; Brennan & Stevens, 2002). The rare exceptions include a report by Deutsch (1988), who shifted the octave interval to higher pitches (600–1,200 Hz) and still obtained the same perceptual effect, and a report by McClurkin and Hall (1981), who established that the illusion is pitch-based rather than frequency-based, since it persists when the 400-Hz sinusoid is replaced by a high-frequency complex tone whose residue pitch is perceived as the low stimulus (missing fundamental effect). In the same paper, the authors reported that the illusion is negligibly affected by alterations of timbre.

The present study aims to test a theoretical argument put forward in previous works (Bregman & Steiger, 1980; Chambers, Mattingley, & Moss, 2002, 2004), according to which the most reliable explanation of the perceptual mechanism underlying the octave illusion is based on the primary harmonic relationship between the tones composing the dichotic pairs. Surprisingly, however, little research has tested this explanation directly, and there is no evidence on whether the genesis of the illusion depends on the pitch interval between the tones composing the dichotic pairs. In other words, no study reports whether the illusory percept occurs only with tones having a frequency ratio of 2:1. Such evidence could indicate whether the origin of the illusion depends on the frequency relationship between the tones. Indeed, if the illusion also occurs with dichotic pairs composed of tones whose frequency ratios differ from 2:1, the explanation of the illusion based on harmonic factors should be rejected (Bregman & Steiger, 1980). To answer this question, we presented alternating dichotic pairs arranged in sequences as described above, in which the interval between the two tones varied from sequence to sequence, from a minor third (frequency ratio 6:5) to an eleventh (frequency ratio 8:3), and asked the subjects to describe their percept. The tones were presented with durations of either 200 or 500 ms in order to verify the results of one study reporting that the illusion is better perceived in rapid sequences composed of very short tones lasting 200 ms each (Zwicker, 1984).

Materials and methods

Subjects

Seventy-four healthy subjects, 57 females and 17 males, aged 18–39 years (average age, 29.4 years) participated in the experiment. Subjects’ scores on a hand-preference questionnaire (Edinburgh Inventory; Oldfield, 1971) were as follows: 66 subjects scored ≥25, 4 subjects scored ≥−25 and <25, and 4 subjects scored ≤−25. None of the subjects was a professional musician. Subjects reported no auditory impairment; an audiometric assessment was performed, and subjects were recruited only if their hearing thresholds did not differ (within ±5 dBA) between the left and right ears.

Stimuli and procedure

Stimuli were synthesized on a personal computer with a Sound Blaster audio card (Creative, Model AWE 32; Microwave, Rome, Italy) by means of the CSound language for sound synthesis (Vercoe, 1992). All tones were sinusoids with a duration of 200 or 500 ms. The amplitude envelope had an attack of 10 ms and a decay of 190 or 490 ms, depending on stimulus duration (200 or 500 ms, respectively). Tones had the following frequencies: 400, 480, 533, 600, 667, 711, 750, 800, 853, 900, 960, and 1,067 Hz. These frequencies were chosen in order to achieve temperate musical intervals in which the low tone was always the 400-Hz tone and the high tone had one of the other listed frequencies. Dichotic pairs were then created in which the tones were separated by the following 11 musical intervals: minor third (frequency ratio 6:5), fourth (4:3), fifth (3:2), major sixth (5:3), minor seventh (16:9), major seventh (15:8), octave (2:1), minor ninth (32:15), major ninth (9:4), minor tenth (12:5), and eleventh (8:3).
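As a quick check on the stimulus set, the following sketch recomputes the high-tone frequencies from the 400-Hz low tone and the frequency ratios listed above, rounding to the nearest Hz. The names BASE_HZ, INTERVALS, and high_tone_hz are illustrative only; they are not taken from the original CSound code, which is not reproduced here.

```python
from fractions import Fraction

BASE_HZ = 400  # low tone of every dichotic pair (from the text)

# Interval names and frequency ratios exactly as listed in the text.
INTERVALS = [
    ("minor third", Fraction(6, 5)),
    ("fourth", Fraction(4, 3)),
    ("fifth", Fraction(3, 2)),
    ("major sixth", Fraction(5, 3)),
    ("minor seventh", Fraction(16, 9)),
    ("major seventh", Fraction(15, 8)),
    ("octave", Fraction(2, 1)),
    ("minor ninth", Fraction(32, 15)),
    ("major ninth", Fraction(9, 4)),
    ("minor tenth", Fraction(12, 5)),
    ("eleventh", Fraction(8, 3)),
]


def high_tone_hz(ratio, base_hz=BASE_HZ):
    """Return the high-tone frequency, rounded to the nearest Hz."""
    return round(base_hz * ratio)


# The low tone followed by the eleven high tones, in ascending order.
frequencies = [BASE_HZ] + [high_tone_hz(ratio) for _, ratio in INTERVALS]

for name, ratio in INTERVALS:
    print(f"{name:>14}: {ratio.numerator}:{ratio.denominator} "
          f"-> {high_tone_hz(ratio)} Hz")
```

Under these assumptions the computed values coincide with the twelve frequencies reported above.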

Subjects were presented with 44 sequences of these dichotic pairs presented repeatedly in alternation, so that when the right ear received the high tone, the left ear received the low tone, and vice versa. Each sequence lasted 15 s. Half of the sequences (22) were composed of dichotic pairs lasting 200 ms and half of dichotic pairs lasting 500 ms. In each sequence the tones composing the dichotic pairs were separated by one of the 11 intervals listed above. Each sequence was presented twice, the second time with the headphones (Philips SHP5400) inverted. The order of the sequences was counterbalanced across subjects. The tone sequences are available at the following internet page: http://www.unich.it/facolta/psicologia/edcbnl/research-chieti.htm.
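The structure of one sequence, and the count of 44 sequences, can be illustrated with the short sketch below. It is not the authors' code: the function names are hypothetical, and it assumes that the tones follow one another without silent gaps and that the first pair delivers the high tone to the right ear, neither of which is stated in the text.

```python
from itertools import product


def dichotic_sequence(low_hz, high_hz, tone_ms, total_s=15.0):
    """Return one (left_hz, right_hz) pair per tone slot.

    Assumptions (not stated in the text): tones are contiguous, and the
    first slot sends the high tone to the right ear.
    """
    n_slots = round(total_s * 1000 / tone_ms)
    pairs = []
    for slot in range(n_slots):
        if slot % 2 == 0:
            pairs.append((low_hz, high_hz))   # left: low, right: high
        else:
            pairs.append((high_hz, low_hz))   # left: high, right: low
    return pairs


def invert_headphones(pairs):
    """Swap the two channels, as when the headphones are worn inverted."""
    return [(right, left) for left, right in pairs]


# 11 intervals x 2 tone durations x 2 headphone positions = 44 sequences.
conditions = list(product(range(11), (200, 500), ("standard", "inverted")))

rapid = dichotic_sequence(400, 800, tone_ms=200)
slow = dichotic_sequence(400, 800, tone_ms=500)
print(len(conditions), len(rapid), len(slow))
```

With contiguous tones, a 15-s sequence holds 75 dichotic pairs at 200 ms and 30 pairs at 500 ms.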

Subjects were tested individually by the experimenter. After listening to each sequence, subjects had to answer the following question, written on a sheet of paper: What did you hear? There were two forced-choice responses: (a) two alternating sounds, the high-pitch one at one ear and the low-pitch one at the other ear, and (b) other percepts. The order of the two response options was counterbalanced across subjects. Choosing response (a) indicated that the subject had perceived the typical Deutsch’s illusion; choosing response (b) indicated that the subject had not.

To investigate the role of tone duration, we performed both an implicit and an explicit inquiry. The implicit inquiry used the responses given to the question described above for the different frequency intervals, which were all presented at both 200 and 500 ms. The explicit inquiry concerned duration directly: 44 subjects were asked to indicate, on a scale ranging from 1 to 10, whether they perceived the tone alternation from ear to ear better in the rapid sequences (composed of 200-ms tones) or in the slow sequences (composed of 500-ms tones).

Each session lasted about 30 min.

Results

Results are summarized in Fig. 2 and Table 1. The left panel of Fig. 2 shows the number of subjects who perceived (black bars) or did not perceive (white bars) Deutsch’s illusion in sequences composed of 200-ms tones, with headphones placed in the standard (top) or inverted (bottom) position. The right panel shows the same for the sequences composed of tones lasting 500 ms. Asterisks indicate that the number of subjects who perceived Deutsch’s illusion differed significantly from the number of subjects who did not perceive it (P < 0.05; one-tailed χ², with Bonferroni correction for 44 observations). The illusion was perceived better, in at least three cases out of four (200-ms tones with standard headphones; 500-ms tones with standard and inverted headphones), when the dichotic pairs were composed of tones separated by a major seventh, octave, minor ninth, major ninth, or minor tenth. For the minor third interval, by contrast, the illusion was reported less frequently than other percepts.
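The kind of test behind each asterisk can be sketched as follows: a χ² test (1 degree of freedom) of the perceived/not-perceived split against an equal split, with the 0.05 level divided by 44. This is an illustration under stated assumptions, not the authors' analysis script. In particular, the text does not say how the one-tailed P value was obtained, so halving the two-tailed value is an assumption, and the counts in the example are hypothetical rather than taken from Table 1.

```python
import math

N_SUBJECTS = 74       # sample size (from the text)
N_COMPARISONS = 44    # Bonferroni factor (from the text)
ALPHA = 0.05


def chi2_equal_split(perceived, total):
    """Chi-square statistic for a two-way split against a 50/50 expectation."""
    expected = total / 2
    not_perceived = total - perceived
    return ((perceived - expected) ** 2
            + (not_perceived - expected) ** 2) / expected


def chi2_p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


def is_significant(perceived, total=N_SUBJECTS, one_tailed=True):
    """Compare the P value with the Bonferroni-corrected threshold."""
    p = chi2_p_value_df1(chi2_equal_split(perceived, total))
    if one_tailed:
        p /= 2  # assumption: one-tailed P taken as half the two-tailed P
    return p < ALPHA / N_COMPARISONS


# Hypothetical counts, for illustration only (not values from Table 1).
for hypothetical_count in (40, 60):
    print(hypothetical_count, is_significant(hypothetical_count))
```

Because the threshold is 0.05/44 (about 0.0011), only a strongly unbalanced split among the 74 subjects reaches significance under this scheme.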

Fig. 2 Histograms showing the number of subjects who perceived (black bars) or did not perceive (white bars) Deutsch’s illusion for the eleven musical intervals presented (minor third, fourth, fifth, major sixth, minor seventh, major seventh, octave, minor ninth, major ninth, minor tenth, and eleventh). Left panel: Sequences composed of 200-ms tones, with headphones placed in the standard (top) or inverted (bottom) arrangement. Right panel: Sequences composed of 500-ms tones, with headphones placed in the standard (top) or inverted (bottom) arrangement

Table 1 Number of subjects who perceived or did not perceive the illusion for the eleven musical intervals presented, in the two duration conditions (tones lasting 200 or 500 ms)

We also analyzed the data by gender. Among females (n = 57), statistically significant results indicating the typical illusory percept were observed with the following intervals: major sixth, minor seventh, and major seventh (500 ms: standard and inverted headphone arrangement), octave (200 ms: standard arrangement; 500 ms: standard and inverted arrangement), minor ninth (200 ms: standard arrangement; 500 ms: standard and inverted arrangement), major ninth and minor tenth (200 and 500 ms: standard and inverted arrangement), and eleventh (200 and 500 ms: standard arrangement). Among males (n = 17), no significant effect was found, possibly because of the small sample size.

Regarding the effect of tone duration on the perception of the illusion, we analyzed the responses both to the question that did not directly concern duration (implicit question) and to the question that asked explicitly whether the illusion was perceived better in the slow sequences (tones lasting 500 ms) or in the rapid sequences (tones lasting 200 ms). The first analysis (implicit question) was performed on the same dataset used to determine the frequency-interval effect and considered the intervals at which Deutsch’s illusion was perceived significantly more often, i.e., major seventh, octave, minor ninth, major ninth, and minor tenth. The illusion was perceived significantly more often with tones lasting 500 ms (χ² = 5.15, P = 0.04 with standard headphones; χ² = 7.60, P = 0.01 with inverted headphones). Notably, the same result holds when all tested intervals are considered (χ² = 23.4, P < 0.001 with standard headphones; χ² = 30.9, P < 0.001 with inverted headphones).

According to the second analysis (explicit question regarding duration), 30 out of 44 subjects reported a better perception of the illusion when tones lasted 500 ms (χ² = 5.82, P = 0.02). The mean (±standard error) response on a scale ranging from 1 (better perception of the illusion with 500-ms tones) to 10 (better perception of the illusion with 200-ms tones) was 4.67 (±0.32). A one-sample t test showed that this value differed significantly (t = −2.57, P = 0.01) from the midpoint of the scale.
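The two statistics of the explicit-question analysis can be approximately recomputed from the summary values given above. The sketch assumes that the χ² test compared the 30/14 split with an equal split (22/22) and that the midpoint of the 1 to 10 scale is 5.5; neither assumption is stated explicitly in the text, and the variable names are illustrative.

```python
import math

# Chi-square for 30 of 44 subjects against an assumed 22/22 split (1 df).
preferred_slow, total = 30, 44
expected = total / 2
chi2 = ((preferred_slow - expected) ** 2
        + ((total - preferred_slow) - expected) ** 2) / expected  # 128/22
p_chi2 = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 df

# One-sample t statistic from the reported mean and standard error.
mean_rating, std_error = 4.67, 0.32
scale_midpoint = (1 + 10) / 2  # assumed reference value: 5.5
t_stat = (mean_rating - scale_midpoint) / std_error

print(f"chi2 = {chi2:.2f}, P = {p_chi2:.3f}, t = {t_stat:.2f}")
```

The χ² value reproduces the reported 5.82. The t value obtained from the rounded mean and standard error (about −2.59) is close to, but not identical with, the reported −2.57, which is the kind of discrepancy expected when recomputing from rounded summary statistics.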

Discussion

The results showed that Deutsch’s illusion is not confined to tones separated by an octave, but can also be perceived at least with tones separated by a major seventh, minor ninth, major ninth, and minor tenth. These intervals all lie around the octave, with frequency ratios between the tones ranging from 15:8 to 12:5.

Another question of interest was the role of tone duration in the perception of the illusion. Originally, the illusion was demonstrated using tones lasting 250 ms. According to Zwicker (1984), however, the illusion can be perceived with tones ranging from 200 ms to 2 s in duration and is perceived best with 200-ms tones, which in the opinion of the present authors give rise to sequences that are too fast. Here we demonstrated, using both an implicit and an explicit judgment of duration, that the illusion is perceived better when tones last 500 ms rather than 200 ms.

The results of this study extend previous findings demonstrating that the perception of the illusion is resistant to changes in intensity (Deutsch, 1978), duration (Zwicker, 1984), and timbre (McClurkin & Hall, 1981), by adding evidence that the illusion is also resistant to variations of the frequency interval between the tones. In addition, this study reveals that when the tones composing the dichotic pair are separated by a much smaller interval, such as a minor third, the perception of the illusion becomes very unlikely (although it is not completely abolished).

Many attempts have been made to explain the perceptual mechanism(s) underlying the illusion, which remains a rather complex issue (Deutsch & Roll, 1976; Deutsch, 1981, 1988, 2004; Chambers et al., 2002, 2004, 2005). Two of these explanations are closely related to the frequency of the tones composing the sequence that generates the illusion. Bregman and Steiger (1980) proposed that the auditory system treats the 800-Hz tone as a harmonic of the 400-Hz tone and localizes the perception of the tone at the ear receiving the “more reliable higher harmonic”. Chambers et al. (2002) suggested that the illusory percept is based on dichotic fusion, a perceptual effect that occurs with dichotic pairs composed of tones having very similar frequencies or frequencies in harmonic relationships, which elicit a fusion of the stimulus into a single auditory image. According to the current study, however, explanations that attribute the illusion to the harmonic relationship between the two tones of the dichotic pair have to be rejected, since the present findings demonstrate that the illusion also arises when the tones have negligible harmonic relationships. In particular, dichotic fusion has to be rejected, since it would predict that the illusion persists when the tones of the dichotic pair have similar frequencies, whereas the present results demonstrate that with similar frequencies (e.g., the minor third interval) the illusory percept almost disappears.

In conclusion, we suggest that the illusion investigated in the present study always be labeled “Deutsch illusion” rather than “Octave illusion”, since the same perceptual outcome originally discovered using the octave interval can be obtained with other musical intervals.