Introduction

Acoustic communication signals are usually distorted and masked during propagation through the biotope and thus arrive at the receiver in a degraded form (Richards and Wiley 1980; Römer 1992, 1998; Klump 1996; Lang 2000). As a rule, the most severe masking problems are caused by conspecific signalers, because the spectra of the masking sounds and the signal then overlap, which prevents any improvement of the signal-to-noise ratio by frequency filtering (Gerhardt and Klump 1988; Grafe 1999). In such cases the temporal structure of the stimuli becomes especially important.

An often-used way to improve signal transmission is to repeat signals over and over again or to use signals with repetitive substructures. This strategy enables a potential receiver to take repeated samples and thereby improves the chances of detecting and correctly interpreting a signal. Signal repetition is indeed a very common phenomenon in acoustically communicating insects, frogs and birds (Kroodsma and Miller 1996; Römer 1998; Gerhardt and Huber 2002). A factor possibly constraining the evolution of redundant signals is that signal repetition by many conspecific individuals may lead to a severe degradation of the signals' temporal structure (Römer et al. 1989; Römer 1998; Grafe 1999; Römer and Krusch 2000). Chorusing animals are often found either to synchronize or to alternate their calls, thereby mitigating such masking effects (e.g. Greenfield 1994; Grafe 1996; Gerhardt and Huber 2002). Apart from signal interference, redundancy implies additional costs, as redundant signals may attract not only conspecifics but also predators or parasitoids (e.g. Ryan 1985; Lehmann and Heller 1998; Zuk and Kolluru 1998).

In grasshoppers, the recognition of the species-specific communication signals is based predominantly on temporal cues present in the amplitude modulations of the signal (von Helversen and von Helversen 1994, 1997, 1998; Pollack 2000). This conclusion is primarily based on behavioural evidence (e.g. von Helversen and von Helversen 1997) and is supported by the peripheral organization of the grasshoppers' auditory pathway. On the basis of their frequency tuning, the tympanal receptors of gomphocerine grasshoppers can be classified into two groups: low-frequency receptors with best frequencies around 5–7 kHz and high-frequency receptors with best frequencies between 15 and 25 kHz (Römer 1976; Jacobs et al. 1999; see also Fig. 4 in von Helversen 1997). Within the rather broad frequency bands covered by each of the two receptor classes, an analysis of finer differences in carrier spectra is highly unlikely (von Helversen and von Helversen 1997). Hence the communication system of grasshoppers provides excellent opportunities to investigate noise tolerance in a system operating in the temporal domain.

In two earlier papers we tested whether and how grasshopper males (Chorthippus biguttulus) exploit the redundancy that is present in the repetitive structure of the females' long stridulation signals (von Helversen and von Helversen 1994). Under low-noise conditions, signals as short as 165–250 ms (corresponding to two or three song subunits of ca. 80 ms in this species) are sufficient for reliable signal recognition (Ronacher and Krahe 1998). Even under noisy conditions the animals do not exploit the full duration of the species-specific signal, which is around 1–1.5 s (von Helversen and von Helversen 1975). Recognition is improved with signal durations between 250 and 420 ms (i.e. three to five subunits), but stimuli longer than five subunits do not lead to a further decrease in detection threshold (Ronacher et al. 2000).

In these latter experiments the interfering noise had no additional low-frequency amplitude modulation, in contrast to the temporally structured ambient noise in natural environments (e.g. Römer 1992, 1998). In the receiver's auditory pathway an amplitude-modulated noise may evoke different adaptation levels compared to continuous noise. Furthermore, the amplitude modulation of the noise may interfere with the signal's amplitude modulation pattern. Hence, the repetition of signal subunits in longer signals may become more important for signal recognition under these conditions. Indeed, the relatively long signals produced by Ch. biguttulus females could be interpreted as an evolutionary adaptation to the amplitude modulations of ambient noise (cf. Ronacher et al. 2000). Here we investigate how signal detection in Ch. biguttulus males is affected by the presence of temporally structured noise. We tested recognition of signals in the presence of a masker noise that was sinusoidally amplitude modulated. Several modulation frequencies between 1.5 and 150 Hz were used in order to see whether some frequencies might impede signal recognition more strongly than others.

Materials and methods

The experimental procedures were basically the same as described in Ronacher et al. (2000) and in Ronacher and Krahe (1998). The principle of the test paradigm was to stimulate a Ch. biguttulus male with female stridulation signals via a laterally situated speaker while the animal was continually exposed to noise broadcast from a speaker situated 65 cm above the animal. The setup and the female song models were the same as described in Ronacher et al. (2000); signals were composed of 4 subunits or 12 subunits (corresponding to durations of 335 ms and 1015 ms). These signals reliably elicit a turning response of the males, which is the first step in the males' phonotaxis towards a stridulating female (cf. Ronacher and Krahe 1998; von Helversen 1997).

The experiments were run in a room whose walls were covered with foam material that attenuates sound reflections. The temperature during the tests was kept at 29–32°C. The female signals were stored digitally and broadcast at an intensity of 50 dB sound pressure level (SPL) as described in Ronacher et al. (2000). The masker was broadband noise (low-pass filtered at 10 kHz, Kemo VBF8) which was amplitude modulated by multiplying it with the output of a sine wave generator at appropriate d.c. offset (Toellner 7706A). The noise was amplified (Blaupunkt GTA 200) and its level was adjustable in 1-dB steps by means of an attenuator (Kenwood RA 920 or KAY 837). Sound intensities were measured at the position of the animal with a Brüel & Kjaer sound-level meter (2235, 1/2" condenser microphone 4133) and are given in dB SPL (re 2×10⁻⁵ N m⁻²). The signal's intensity was kept at 50 dB SPL, while the noise level was changed (see Results and Discussion). The signal and the masking noise had no fixed temporal relation.
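The construction of the masker can be sketched in a few lines. The following Python fragment is a minimal illustration and not the analogue signal chain used in the experiments: it multiplies Gaussian white noise by a sine wave with a d.c. offset so that the envelope swings between 0 and 1, i.e. a modulation depth of 100%. The function name `am_noise`, the sample rate, the `depth` parameter and the random seed are assumptions introduced for the sketch, and the 10-kHz low-pass filter is omitted.

```python
import math
import random


def am_noise(mod_freq_hz, duration_s, sample_rate_hz=44100, depth=1.0, seed=0):
    """Return (samples, envelope) for white noise multiplied by an offset sine.

    Assumptions of this sketch: Gaussian white noise as the carrier, no
    low-pass filtering, and an envelope normalised to a maximum of 1.
    """
    rng = random.Random(seed)
    n = int(round(duration_s * sample_rate_hz))
    samples, envelope = [], []
    for i in range(n):
        t = i / sample_rate_hz
        # Sine with d.c. offset; depth=1.0 makes the envelope reach zero
        # in each trough, depth=0.0 leaves the noise unmodulated.
        env = 1.0 - depth * 0.5 * (1.0 - math.sin(2.0 * math.pi * mod_freq_hz * t))
        envelope.append(env)
        samples.append(env * rng.gauss(0.0, 1.0))
    return samples, envelope


# Example: 200 ms of noise modulated at 15 Hz, at a reduced sample rate.
samples, envelope = am_noise(15, 0.2, sample_rate_hz=8000)
```

Setting `depth=0.0` yields the unmodulated reference condition within the same function, which keeps the two masker types comparable in the sketch.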

The tests were performed on males of Ch. biguttulus L. (Acrididae, Orthoptera) caught in the field in Berlin or near Göttingen. Nineteen males participated in the three test series with 1.5, 15 and 150 Hz modulation frequency. A different group of 17 males was tested with unmodulated noise, another group of 24 males was tested with 5 and 50 Hz (14 of which participated in both frequency regimes), and another 16 males were tested with 2.5 and 70 Hz (9 participating in both test series). At the beginning of each experiment a male was tested with a 4- and a 12-subunit stimulus without interfering noise. Only animals that showed vigorous turning responses in those pretests were used. Tests with short and long stimuli were alternated in order to exclude hysteresis effects. At each noise level the two stimuli were presented at least ten times. As statistical tests we used Friedman's test combined with the Wilcoxon-Wilcox test for multiple comparisons, or Wilcoxon's test for paired comparisons. To compare independent samples we applied the Wilcoxon-Mann-Whitney test with Bonferroni correction.

Choice of modulation frequency range

Which Fourier components of the signal envelope of a grasshopper's song are important for signal recognition? This question has been answered in behavioural experiments by testing females of Ch. biguttulus with song models whose envelopes were systematically modified (von Helversen and von Helversen 1998). In these experiments amplitude modulations in the range up to 50–70 Hz turned out to be crucial for the recognition of the syllable-pause structure of the song models. Higher frequencies up to 150 Hz were important for the rejection of songs that comprised small gaps within the syllables. In the experiments described here we intended to cover this frequency range of amplitude modulations in order to confront the animals with a most difficult task.

Comparison of noise and signal levels

In our test paradigm the animals experienced the signal and the noise coming from different directions. The noise was delivered from above in order not to provide any directional cues in its amplitude modulations that could interfere with the grasshopper's turning decisions (cf. Ronacher et al. 2000). When comparing the levels of signal and noise we therefore have to take into account the directionality of the ears. The ears of Ch. biguttulus exhibit a ca. 2.5–3 dB higher threshold when stimulated from above compared to lateral stimulation (Wolf 1986). Hence, for a comparison with the signal, the measured SPL values of the noise given in Figs. 1 and 2a have to be reduced by ca. 3 dB to arrive at the effective noise levels. This effect of different sound directions, however, is the same for unmodulated and modulated noise.

Fig. 1.

Responses of a single male to female signals played back in the presence of amplitude modulated noise. The level of the noise is shown on the abscissa (in decibels sound pressure level, SPL), the ordinate shows the percentage of turning responses (re. total number of stimulus presentations). Two different signal durations (12 and 4 subunits, circles and triangles) and two modulation frequencies (5 Hz and 50 Hz, open and filled symbols) were applied. The curves for 5 Hz and 50 Hz are based on 99 and 88 stimulus presentations, respectively. Further explanations in text

Fig. 2a, b.

Influence of modulation frequency on recognition thresholds. a Critical noise levels (mean value+95% confidence interval) as derived from the curves of Fig. 1 are shown for the two stimulus types (12-subunit stimulus: circles, 4-subunits: triangles). Arrow points to results obtained with unmodulated noise. b ΔdB Values measured at 50% response level (cf. Fig. 1). Mean+95% confidence interval is shown. Asterisks in a and b indicate significance levels for comparison with unmodulated noise: large asterisk P<0.001; small asterisk P<0.01 (with Bonferroni correction). For the number of individuals tested see Materials and methods

Results

At relatively low noise levels most males respond to every presentation of a female song. In Fig. 1 the percentages of turning responses of a single male are shown for different levels of the amplitude modulated noise. Up to noise levels of 47 dB this animal responded to every stimulus presentation, both with a 4-subunit and a 12-subunit female song model. With increasing noise levels, the response percentages show a steep drop towards zero response. These curves obtained with two signal durations and two modulation frequencies exemplify several general observations. First, with a 5-Hz modulation frequency this male tolerated a much higher noise level than with a noise modulated at 50 Hz (Fig. 1, open symbols: 5 Hz; filled symbols: 50 Hz). Second, compared with the 4-subunit stimulus the curves for the 12-subunit stimulus are shifted towards higher noise levels, confirming that a longer stimulus allows for a more robust signal detection. The difference between the 12-subunit and the 4-subunit curves is much larger for a noise modulated at 5 Hz compared with 50 Hz (circles: 12-subunit stimulus; triangles: 4-subunit stimulus). To compare the responses to different stimulus regimes the following criterion was chosen: we interpolated the "critical noise level" that led to a turning response in 50% of the stimulus presentations (Fig. 1, arrows). This noise level is a good indicator for the detection threshold (see Ronacher et al. 2000 for a detailed discussion). The critical noise levels are plotted in Fig. 2a for several modulation frequencies and the two stimulus durations used (12 and 4 subunits, i.e. 1015 and 335 ms duration). Also shown are the corresponding values for tests with unmodulated noise (arrow at the left). Both curves exhibit the same tendency. At low modulation frequencies (1.5–5 Hz) the tolerated noise level is in the same range as for unmodulated noise (stippled line in Fig. 2a), or even higher (open symbol at 2.5 Hz). 
Towards higher modulation frequencies there is a steep drop between 5 and 15 Hz, after which the curves level off. The curve for the 4-subunit stimulus is shifted towards lower noise levels compared to that for 12 subunits, confirming earlier results that a long stimulus allows signal detection at somewhat higher levels of interfering noise. The two curves are not quite parallel. This is detailed in Fig. 2b which shows the ΔdB values, i.e. the differences between the critical noise levels for a 12- and 4-subunit stimulus (see Fig. 1). The mean ΔdB value was around 5–6 dB for modulation frequencies of 1.5–5 Hz, and 2–3 dB for 50–150 Hz. These latter values are not significantly different from that obtained with unmodulated noise (P>0.05), while the values for 1.5–5 Hz are significantly different (Fig. 2b); the 15-Hz value is intermediate (though not significantly different after Bonferroni correction). The improvement of signal detection with the longer signal is around twice as large for the low modulation frequencies as compared to the high frequency range.
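The 50% criterion and the ΔdB measure can be illustrated with a short sketch. The text states only that the critical noise level was interpolated; linear interpolation between neighbouring noise levels is an assumption here, as are the function name `critical_noise_level` and the response percentages, which are invented for illustration and are not data from this study.

```python
def critical_noise_level(levels_db, response_pct, criterion=50.0):
    """Interpolate the noise level at which responses fall to `criterion` percent.

    Assumes linear interpolation between adjacent measurement points and
    returns None if the response curve never crosses the criterion.
    """
    points = sorted(zip(levels_db, response_pct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 >= criterion > y1:
            return x0 + (y0 - criterion) * (x1 - x0) / (y0 - y1)
    return None


# Hypothetical response curves of one male (percent turning responses).
noise_levels_db = [47, 49, 51, 53, 55]
long_stimulus_pct = [100, 100, 80, 20, 0]   # 12-subunit signal
short_stimulus_pct = [100, 70, 10, 0, 0]    # 4-subunit signal

critical_long = critical_noise_level(noise_levels_db, long_stimulus_pct)
critical_short = critical_noise_level(noise_levels_db, short_stimulus_pct)

# Delta-dB: how much more noise is tolerated with the long signal.
delta_db = critical_long - critical_short
```

In this form ΔdB is computed per animal and per modulation frequency, so that the values can afterwards be averaged across individuals as in Fig. 2b.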

When comparing the critical levels of unmodulated and sinusoidally modulated noise, one has to take into account that with the sinusoidally modulated noise the peak level had to be 3 dB higher to arrive at the same SPL as an unmodulated noise. Thus, considering peak levels, the values for amplitude-modulated noise in Fig. 2a have to be shifted to 3 dB higher levels for a comparison with the critical levels found with unmodulated noise. The principal picture (significance values indicated in Fig. 2a), however, would not change much with this correction: for the long stimulus the values at 1.5 and 15 Hz are then not significantly different from the unmodulated reference, while at 2.5 and 5 Hz the values are significantly higher (P<0.001 and 0.01), and at 50–150 Hz they are significantly lower (P<0.001, 0.001 and 0.01). For the 4-subunit stimulus the 2.5-Hz modulation led to a higher noise tolerance (P<0.01) compared to the reference value, while 50–150 Hz led to lower values (P<0.001, 0.001 and 0.05), all other differences being not significant (after Bonferroni correction). Whether peak levels or RMS levels give a more realistic description of the animals' task is difficult to estimate. It depends on neuronal integration times, which unfortunately are not known for Ch. biguttulus. For auditory receptors of noctuid moths a rather large range of time constants between 2 and 70 ms has been reported (for a detailed discussion see Tougaard 1998; cf. also Ronacher et al. 2000), and hence the question of peak or RMS levels must remain open.

Discussion

This study focused on the question of how sinusoidally amplitude modulated noise affects the detection and recognition of the female communication signals by Ch. biguttulus males.

Modulation frequency is decisive

The effects of the amplitude modulation depended strongly on the modulation frequency (Fig. 2a). At relatively low frequencies (1.5–5 Hz) the modulated masker did not raise the detection threshold relative to the baseline obtained with unmodulated noise. With the long stimulus there was even an indication of a moderate improvement towards tolerating higher noise levels (Fig. 2a). In contrast, the critical noise level decreased markedly at high frequencies, indicating a strong detrimental effect of modulation frequencies higher than 15 Hz (this effect persisted when peak levels were considered, see above). The most detrimental maskers were modulated at frequencies of 50–150 Hz and consisted of sound pulses of 10–3 ms duration (halfwidth; the amplitude modulation depth was 100%). By filling the pauses between syllables, such sound pulses masked the species-specific song structure very efficiently (Fig. 3c–e). Earlier experiments had revealed that the elimination of the pauses makes a female signal unattractive (Ronacher and Krahe 1998; cf. also von Helversen and von Helversen 1997). Hence, the relatively high thresholds for these stimuli are in accord with these observations.

Fig. 3a–e.

Female stridulation signal in the presence of noise modulated at frequencies of 1.5 Hz (b), 15 Hz (c) and 150 Hz (d, e). In a the unmasked female signal is shown, and in e a detail of the signal masked with the 150-Hz noise. The RMS level of the female signal was 50 dB SPL, noise levels were 45 dB SPL. Scale bars: 200 ms

The results obtained at low frequencies (up to 5 Hz) can be explained as follows: males of this species are able to reliably recognize conspecific signals consisting of only two to three subunits (165 and 250 ms, Ronacher and Krahe 1998). As the depth of the amplitude modulation was 100%, at low frequencies long troughs existed in which the signal was relatively undisturbed (Fig. 3b). If these troughs were long enough to accommodate at least two to three subunits of the signal, then a lower detection threshold is expected and can account for the relatively high noise levels tolerated at low modulation frequencies with the long signal (Fig. 2a). This effect corresponds to the "dip listening hypothesis" brought forward by Buus (1985) for hearing thresholds of human subjects. Since there was no fixed temporal relation between signal and noise, evidently with the four-subunit signal the probability that the two to three subunits necessary for signal recognition fell in a trough was substantially smaller (the duration of a four-subunit stimulus—335 ms—corresponds to a half period of 1.5 Hz).
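The timing argument can be made concrete with a small calculation. The sketch below treats the half period of the modulation, 1/(2f), as the duration of one trough (and of one noise pulse at half amplitude), which is a simplification for a sinusoidal envelope with 100% depth, and compares it with the 165–250 ms needed for two to three subunits. The function names and the table layout are assumptions made for illustration.

```python
# Modulation frequencies used in the test series (Hz).
TESTED_FREQUENCIES_HZ = [1.5, 2.5, 5, 15, 50, 70, 150]

# Durations of two and of three subunits (ms), after Ronacher and Krahe (1998).
TWO_SUBUNITS_MS = 165
THREE_SUBUNITS_MS = 250


def half_period_ms(mod_freq_hz):
    """Half period of a sinusoidal modulation in milliseconds."""
    return 1000.0 / (2.0 * mod_freq_hz)


def trough_summary(frequencies_hz=TESTED_FREQUENCIES_HZ):
    """List (frequency, half period, fits two subunits, fits three subunits)."""
    rows = []
    for f in frequencies_hz:
        half = half_period_ms(f)
        rows.append((f, half, half >= TWO_SUBUNITS_MS, half >= THREE_SUBUNITS_MS))
    return rows


for f, half, fits_two, fits_three in trough_summary():
    print(f"{f:>5} Hz: half period {half:6.1f} ms, "
          f"fits two subunits: {fits_two}, fits three subunits: {fits_three}")
```

The calculation reproduces the two figures quoted in the text: a half period of about 333 ms at 1.5 Hz, close to the 335-ms four-subunit stimulus, and pulse halfwidths of 10 ms to about 3 ms at 50–150 Hz. It ignores that parts of a subunit may remain detectable on the flanks of the envelope, so it should be read as a rough guide only.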

The responses of the animals to noise modulated at low or high frequencies differed also in another respect. At frequencies of 1.5–5 Hz the ΔdB values (i.e. the separation of the 12-subunit and the 4-subunit curves, see Fig. 1) were two to three times larger than those obtained with unmodulated noise (Fig. 2b). For frequencies between 50 and 150 Hz the ΔdB values were not significantly different from the reference value (unmodulated noise). The larger ΔdB values at low modulation frequencies can probably be attributed to the increased likelihood that a sufficient part of the long signal was present in the troughs of the noise. However, apart from this "dip listening effect", in the presence of a high-frequency amplitude-modulated noise long songs did not improve signal recognition beyond the level attained with unmodulated noise (Fig. 2b).

Spatial release from masking?

For various vertebrate species a spatial release from masking has been reported (e.g. Schwartz and Gerhardt 1989; Klump 1996), that is, an improved signal detectability if signal and masker arrive from different directions. Tests in the horizontal plane with a "split song" paradigm have shown that, for pattern recognition, these grasshoppers summate over signals stemming from two sources located on the left and the right of the animal (von Helversen 1984). This is true for males and females of Ch. biguttulus and is unlike the situation in crickets and bushcrickets (Pollack 1988; von Helversen and von Helversen 1995; Schul et al. 1998; Römer and Krusch 2000). Hence, although it cannot be completely excluded, it is not likely that in our paradigm the grasshoppers could have exploited the differences in sound direction (signal from the side and noise from above) to improve the signal-to-noise ratio further than the 3-dB difference provided by the ear's directionality (see Materials and methods; Wolf 1986).

Comparison with comodulation masking release

For various vertebrates comodulation masking release (CMMR) has been reported. This term describes the observation that an amplitude-modulated wideband masker yields considerably less masking than an unmodulated masker with comparable spectrum (Moore 1992; Klump and Langemann 1995; Buus 1985, 1998). The masking release is thought to result mainly from comparisons across different critical bands, although within-channel cues may contribute as well (Buus 1985; Klump and Langemann 1995; Klump 1996; Moore and Borrill 2002). In starlings comodulation masking release was highest for low modulation frequencies (12–50 Hz) but persisted up to bandwidths of 400 Hz (Klump and Langemann 1995). Hence, the frequency range investigated here was in the right order of magnitude to detect any CMMR.

Given the sensory equipment of a grasshopper's auditory periphery (cf. Introduction) and the similar carrier frequency contents of signal and noise used in our paradigm, however, CMMR was not to be expected in these animals. Indeed, the results gave no indication that such a sophisticated processing mechanism exists in grasshoppers. High frequencies (15–150 Hz) yielded higher thresholds (i.e. less noise tolerance) instead of the lowered thresholds expected in the case of masking release (Fig. 2a). Only for the long stimulus and at low modulation frequencies was a moderate beneficial effect found (Fig. 2a). The fact that the expected improvement was absent for the short stimulus, however, points to a different explanation, the "dip listening hypothesis" (see above and Buus 1985, 1998; cf. also the concept of auditory windows: Wolf and von Helversen 1986; Hedwig 1990).

Taken together, the drop beyond ca. 5 Hz in the curves of Fig. 2a is evidently due to a strong detrimental influence of sound pulses delivered with little spacing. This masking destroys the syllable-pause structure of the signal and thus impedes signal recognition (Ronacher and Krahe 1998). Our results indicate that a temporally structured noise, as will normally occur in choruses in the field, is a more efficient masker than unmodulated noise with the same carrier frequency spectrum. Unmodulated noise does reduce the signal's modulation depth, but it affects the signal's temporal structure to a lesser extent than high-frequency modulated noise does. Effects similar to those reported here have been found in humans in tests in which both signal and masker were amplitude modulated. Masking was strongest when the modulation frequencies of the masker were similar to the signal's modulation frequencies (Houtgast 1989). This has been explained by postulating a bank of neuronal modulation filters for sound envelopes (Ewert et al. 2002). Present evidence from behavioural experiments, however, does not support the concept of a set of tuned modulation filters in the auditory pathway of grasshoppers (von Helversen and von Helversen 1998). As an alternative to modulation filters, continuous noise may enable a more efficient adaptation compared to the habituation evoked by pulsed noise (cf. Givois and Pollack 2000; Römer and Krusch 2000).

In contrast to high modulation frequencies, noise modulated at very low frequencies had a beneficial effect on signal detectability, by opening "auditory windows" in its troughs during which signal detection is facilitated. In this context the long signals yield a distinct improvement compared to the short ones (Fig. 2b), and this could explain the hitherto enigmatic observation that the females do produce songs longer than necessary for recognition by the male (Ronacher et al. 2000).