1 Introduction

In 1954, Manfred Schroeder published two seminal papers [1, 2] on sound distribution in rooms. In the first paper, refuting the accepted theory of the time, he showed that the resonance frequencies of a room are random if one introduces into the room small objects whose dimensions are comparable to the wavelength of interest. Up to then, practitioners believed that big diffusing objects were necessary to randomize the modal distribution. In the second paper, he went on to study the statistical properties of such a randomized sound field, using advanced statistics recently developed by Rice. He could thus show that, above a cut-off frequency dependent on volume and reverberation time, modes overlap and combine so that the average distance between peaks, the average distance between troughs, and even the standard deviation are predicted by the theory of random noise.

Several years later, in 1962, he and his collaborators went one step further [3]. They simulated the random distribution of sound fields in rooms by means of a powerful tool of random process theory: Monte-Carlo simulation. In analogy with roulette, the game of chance that made the Monte-Carlo casino world famous, the Monte-Carlo technique simulates complex processes by choosing values at random, but following a given probability distribution, and combining them according to the properties of the process to be simulated. Using this technique, Schroeder and his collaborators were able to simulate the frequency response of a room, showing that it displayed the same characteristics as real measurements.

The assumptions upon which Schroeder based his simulation are now well known. They are twofold. Firstly, the resonance frequencies of the modes are distributed randomly, that is, they locally follow a uniform distribution. Secondly, the transfer functions of the modes at their resonant frequency follow exponential distributions with imaginary random arguments. Thus, the transfer function at any frequency can be obtained by superposition of the different modes that respond at that frequency. When modes overlap, the transfer function becomes Gaussian distributed by virtue of the Central Limit Theorem of probability. Of course, this is only valid at high frequencies, when many modes overlap, but the approximation is considered satisfactory as soon as ten modes respond at a given frequency.

The present paper develops this idea and extends it to the time domain, as was suggested by Moorer [4] when he wrote that an impulse response is similar to white noise exponentially decreasing with time. Indeed, the properties of the Fourier transform ensure that the time domain response, that is, the impulse response of a room, is also Gaussian distributed. Several researchers [5, 6] have independently tried to check this property on measured impulse responses, with mixed success. We shall review some attempts, and stress the reason for their failure: the absence of a proper model of the impulse response. Indeed, statistics teaches us that it is almost impossible to prove statistical properties ex post. Only the existence of a model, that is, analysis ex ante, can prove the statistical properties of room responses.

As a consequence, the next step is a review of Schroeder’s model for the transfer function, and its translation into the time domain, generalizing Moorer’s model. The paper then carries on with the analysis of a measured impulse response, arbitrarily selected among some 200 obtained during a recent campaign in Paris [7]. It stresses the necessity of compensating for the decay, in order to reconstruct a stationary signal, before carrying out a detailed statistical analysis in both the time and frequency domains. It concludes with the necessity of improving the deconvolution algorithms currently used in room acoustics to obtain impulse responses from sine sweeps.

2 Results from Previous Research

In 1989, the present author offered a generalization of Schroeder’s model [8] that gave the theoretical framework of Jot’s approach to digital reverberation filters [9]. As a consequence, several authors [5, 6] have recently tried to check the validity of the model. They used different statistical tests, but all of them rely on some statistical estimator: the ratio of the kurtosis to the variance [5], or the Kolmogorov-Smirnov test [6]. We briefly present the results of Defrance [6].

Defrance used a set of impulse responses measured in Salle Pleyel with pistol shots (Fig. 5.1), and checked whether their samples were distributed according to a Gaussian distribution with the same variance and mean value (null hypothesis). He used the Kolmogorov-Smirnov test, that is, he compared the distribution of the experimental values to the empirical distribution of data obtained with a Gaussian random generator fed with the same mean value and the same variance. Most of the time, the impulse response passes the test (probability P(t) > 0.05, black values in Fig. 5.1), although the probability remains low, but every now and then the null hypothesis is rejected, meaning that the two distributions are not the same. A striking feature of Fig. 5.1 is that the results of the test depend on the length of the window used (200 or 120 samples), with slightly better results for longer windows. Other tests give similar results [6], leading to the rejection of the hypothesis that impulse responses are distributed according to Gaussian laws (see also [5]).

Fig. 5.1

Impulse response from Salle Pleyel (top) and the probability that samples are distributed according to Gaussian law of same variance and mean value (Kolmogorov-Smirnov test)—middle curve: using 200 samples; bottom curve: using 120 samples—gray values correspond to intervals where the test fails

In order to check the correctness of the procedure, Defrance also evaluated simulated impulse responses constructed by weighting Gaussian random noise with an exponential window. The results of the Kolmogorov-Smirnov test are given in Fig. 5.2. This time, the impulse response passes the test most of the time, with rather high probability. But every now and then, it fails the test. Once again, the results depend on the length of the analysis window used for the test.

Fig. 5.2

Gaussian impulse response (top) and the probability that samples are distributed according to Gaussian law of same variance and mean value (Kolmogorov-Smirnov test)—middle curve: using 200 samples; bottom curve: using 120 samples—gray values correspond to intervals where the test fails
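The control experiment described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Defrance's implementation: the function names, the reverberation time, and the sampling rate are hypothetical, and the test used here is the one-sample variant of the Kolmogorov-Smirnov test (comparison with the Gaussian cumulative distribution) with the asymptotic p-value, rather than the comparison with a simulated Gaussian sample described in the text.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian law."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_one_sample(values):
    """KS distance between a sample and the Gaussian law with the same mean and variance."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    d = 0.0
    for i, v in enumerate(sorted(values)):
        c = normal_cdf(v, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at v.
        d = max(d, c - i / n, (i + 1) / n - c)
    return d


def kolmogorov_pvalue(d, n):
    """Asymptotic p-value of the KS distance d for n samples (alternating series)."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:  # the series converges poorly here; the p-value is essentially 1
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))


def simulated_ir(num_samples, fs, rt60, rng):
    """Gaussian white noise weighted by an exponential window (60 dB decay in rt60 s)."""
    decay = 3.0 * math.log(10.0) / rt60  # amplitude decay rate in 1/s
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * n / fs) for n in range(num_samples)]


def windowed_pass_rate(signal, window, alpha=0.05):
    """Fraction of consecutive windows for which the Gaussian hypothesis is not rejected."""
    passed = total = 0
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        passed += kolmogorov_pvalue(ks_one_sample(chunk), window) > alpha
        total += 1
    return passed / total
```

Calling `windowed_pass_rate(simulated_ir(8000, 44100, 1.0, random.Random(0)), 200)` and repeating with a window of 120 samples gives the pass rates for the two window lengths. Because the mean and variance are estimated from each window, the asymptotic p-value used in this sketch is conservative, so the rejection rate it produces need not match the one visible in Fig. 5.2.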

The conclusion of this survey of previous research is that the methodology used in these studies is not appropriate. Indeed, the studies are not built on a theoretical framework, like Schroeder’s model, but only attempt to test a property: that impulse responses are Gaussian. In other words, they try to prove statistical properties ex post, whereas an analysis ex ante, making use of the properties of a model, is necessary to prove the statistical properties of room responses.

3 Generalizing Schroeder’s Model to Time Domain

As stressed in the previous section, no proof of the statistical properties of impulse responses can be achieved without a proper model of impulse responses. This model relies on Schroeder’s assumptions:

  • The resonance frequencies of the modes locally follow a uniform distribution.

  • The transfer functions of the modes at their resonant frequency follow exponential distributions with imaginary random arguments.

At arbitrary frequencies, therefore, one must take into account the bandwidth of each mode, and superpose the modes accordingly (Fig. 5.3). The result is a random walk.

Fig. 5.3

Superposition of modes with random initial phases, taking into account their bandwidth (left); it results in a random walk (right)

As shown in Fig. 5.3, it results in a complex transfer function, which is best decomposed into its real part X(ω) and its imaginary part Y(ω):

$$ H\left(\omega \right)= X\left(\omega \right)+ jY\left(\omega \right) $$
(5.1)

The theory of random walks then predicts that both the real and imaginary parts follow centered Gaussian distributions with the same variance:

$$ \begin{array}{c}\left\langle X\left(\omega \right)\right\rangle =\left\langle Y\left(\omega \right)\right\rangle =0\\ {}\left\langle {X}^2\left(\omega \right)\right\rangle =\left\langle {Y}^2\left(\omega \right)\right\rangle \end{array} $$
(5.2)

In other words, the real and imaginary parts are equidistributed. Further, they are decorrelated [3]:

$$ \left\langle X\left(\omega \right) Y\left(\omega \right)\right\rangle =0 $$
(5.3)
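Equations (5.1) to (5.3) can be checked numerically with a small Monte-Carlo experiment in the spirit of [3]. The sketch below is not Schroeder's original simulation: the single-pole resonance shape, the number of modes, the bandwidth, and all function names are assumptions chosen for illustration. Each trial draws modes with uniformly distributed resonance frequencies and uniformly distributed phases, and sums their contributions at the centre of the band.

```python
import cmath
import math
import random


def modal_response(num_modes, half_band, damping, rng):
    """One random draw of H at the band centre.

    Assumed mode shape: exp(j*phase) / (damping + j*(omega - omega_n)),
    with omega - omega_n uniform in [-half_band, half_band].
    """
    total = 0j
    for _ in range(num_modes):
        offset = rng.uniform(-half_band, half_band)  # distance to the resonance
        phase = rng.uniform(0.0, 2.0 * math.pi)      # random initial phase
        total += cmath.exp(1j * phase) / (damping + 1j * offset)
    return total


def moments(num_trials=2000, seed=1):
    """Empirical counterparts of the averages in Eqs. (5.2) and (5.3)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(num_trials):
        h = modal_response(100, 50.0, 3.0, rng)  # roughly ten overlapping modes
        xs.append(h.real)
        ys.append(h.imag)
    n = float(num_trials)
    return {
        "mean_x": sum(xs) / n,
        "mean_y": sum(ys) / n,
        "var_x": sum(x * x for x in xs) / n,
        "var_y": sum(y * y for y in ys) / n,
        "cross": sum(x * y for x, y in zip(xs, ys)) / n,
    }
```

With these settings, `moments()` should return means and a cross term that are small compared with the two mean squares, and two mean squares of comparable size. The residual values are not exactly zero because the averages are taken over a finite number of trials.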

Now, Schroeder’s model is defined in the frequency domain, and applies to modes taken individually, with due consideration of their bandwidth, that is, of their decaying nature. As a consequence, when translated into the time domain, it means that each mode decays exponentially with a random initial phase. By summing up all the modal contributions, one obtains an exponentially decaying random noise. However, since modes last the whole duration of the impulse response, the random distribution of their superposition must be checked over the whole duration of the impulse response. Locally, at some instant, nothing stops the impulse response from deviating from a Gaussian distribution.
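The time-domain picture can be illustrated with a minimal sketch, assuming that all modes share the same decay rate and unit amplitude (a simplification that the text does not impose). The function name, the frequency band, and the parameter values are hypothetical.

```python
import math
import random


def modal_impulse_response(num_modes, fs, duration, rt60, band, rng):
    """Sum of exponentially decaying cosines with random frequencies and phases.

    Assumptions of this sketch: unit amplitude for every mode and a common
    decay rate corresponding to a 60 dB decay in rt60 seconds.
    """
    delta = 3.0 * math.log(10.0) / rt60  # amplitude decay rate in 1/s
    modes = [
        (rng.uniform(band[0], band[1]), rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(num_modes)
    ]
    response = []
    for n in range(int(duration * fs)):
        t = n / fs
        s = sum(math.cos(2.0 * math.pi * f * t + phi) for f, phi in modes)
        response.append(math.exp(-delta * t) * s)
    return response
```

For example, `modal_impulse_response(200, 8000, 0.25, 1.0, (500.0, 2000.0), random.Random(5))` produces a noise-like signal whose envelope decays exponentially. Multiplying it by the inverse envelope yields a signal whose mean square stays roughly constant over time, which is the property exploited by the decay compensation.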

In the following, we keep in mind the long-term characteristics of room impulse responses. As a consequence, we compensate for time variation, using the property of exponential decay. In a similar way, we must take into account the spectrum of the source when checking the Gaussian distribution of the transfer function; therefore, the next section is devoted to the presentation of room impulse responses.

4 Raw Analysis of Impulse Responses

For the purpose of illustrating the properties of room impulse responses, we arbitrarily chose an impulse response measured at Opéra Garnier in Paris during a recent campaign in 16 Parisian theaters and concert halls [7]. This response, sampled at 44.1 kHz, is presented in Fig. 5.4. It was obtained by deconvolution of a logarithmic sine sweep of 30 s duration, using the Aurora suite developed by Farina [10].

Fig. 5.4

Impulse response measured at Opéra Garnier in Paris

From this response, we computed the transfer function by Fourier transformation, and estimated the running variances and cross-correlation according to Eqs. (5.2) and (5.3). These estimates were obtained by averaging 44 adjacent values of the squared real part, of the squared imaginary part, and of the product of the two. These estimates are presented in Fig. 5.5, where the real part is traced in blue, the imaginary part in red, and the cross-correlation in green. It is evident from Fig. 5.5 that the averaged squared real and imaginary parts coincide, since the blue curve is almost completely hidden behind the red one. On the other hand, the cross-correlation does not vanish, contrary to the prediction of Eq. (5.3). This is a consequence of the finite averaging: proper estimation of the empirical process corresponding to averaging 44 adjacent values of the product of the real and imaginary parts leads to the prediction that the green curve should lie 16.5 dB below the two others [8], a value which agrees with Fig. 5.5.
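A figure of this order can be recovered by a short calculation, given here only as a plausible reading; the derivation in [8] may differ. Assume that the N = 44 averaged products are independent (an assumption not stated in the text), with X and Y independent, centered, and of common variance σ² (a symbol introduced here). The cross terms then vanish on average, and the mean square of the cross-correlation estimate is

$$ \left\langle {\left(\frac{1}{N}\sum_{k=1}^N{X}_k{Y}_k\right)}^2\right\rangle =\frac{1}{N^2}\sum_{k=1}^N\left\langle {X}_k^2\right\rangle \left\langle {Y}_k^2\right\rangle =\frac{\sigma^4}{N} $$

Relative to the square of the mean square value, σ⁴, this is a factor 1/N, that is, 10 log₁₀ 44 ≈ 16.4 dB, close to the value quoted above.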

Fig. 5.5

Transfer function measured at Opéra Garnier. Blue: real part; red: imaginary part; green: cross-correlation

It is also instructive to have a look at the logarithmic display of the impulse response. Once again, we estimated the running variance and the running average on 44 adjacent values. They are displayed in Fig. 5.6, where the running average is traced in blue, and the running variance in red. This time, due to correlation between successive values, the difference between the two curves is much less than 16 dB [8].

Fig. 5.6

Logarithmic plot of the impulse response. Blue: mean value; red: quadratic value

It can be seen from Figs. 5.5 and 5.6 that Schroeder’s model needs adaptation before any Gaussian distribution can be checked. First, the decay in time must be compensated, so that the impulse response approximates a stationary signal; then the spectrum must be compensated, to account for the spectrum of the source. These two issues are successively addressed in the next sections.

5 Compensating for Decay

The compensation procedure for the decay is illustrated in Fig. 5.7. Firstly, it is necessary to detect the background noise, or noise floor, to which the impulse response decays at late times. This background noise is traced in dark blue for the mean square value, and in red for the mean value. Only the mean square value is of interest, and the noise floor is estimated by linear fit (green line in Fig. 5.7). By convention, any value within 10 dB of the noise floor is considered as background noise [11]. Therefore, the impulse response is reduced to the light blue part for its mean square, and the brown part for its mean.

Fig. 5.7

The different linear fits used in the decay compensation process

The next step consists in estimating the decay by a linear fit of the mean square (magenta line in Fig. 5.7). Notice that the linear fit of the mean decay (yellow line) does not run parallel to the magenta line. The reason for this difference is discussed below, with Fig. 5.9.

Once the mean decay is obtained by linear fit, it is straightforward to compensate for the decay by multiplying the impulse response by the exponential of the logarithmic decay, while windowing it from 0.04 to 0.96 s. However, inspection of the compensated mean square (Fig. 5.8) reveals that the procedure is not sufficient. There remain abnormal values at the onset of the response, corresponding to coherent reflections. In a similar fashion, the mean square value increases toward the end of the window, revealing that some background noise is influencing the response. A shorter window must be used.
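The core of this compensation step can be sketched as follows. The sketch is not the code used for the figures: it assumes that the response has already been truncated above the noise floor, it fits the running mean square in decibels over blocks of 44 samples by ordinary least squares, and all function names are hypothetical.

```python
import math
import random


def block_levels_db(signal, block):
    """Mean square level in dB over consecutive blocks of `block` samples."""
    levels = []
    for start in range(0, len(signal) - block + 1, block):
        ms = sum(v * v for v in signal[start:start + block]) / block
        levels.append(10.0 * math.log10(ms))
    return levels


def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def compensate_decay(signal, fs, block=44):
    """Multiply the response by the inverse of the fitted exponential decay.

    Returns the compensated signal and the fitted slope in dB per second.
    """
    levels = block_levels_db(signal, block)
    times = [(k + 0.5) * block / fs for k in range(len(levels))]  # block centres
    slope, _ = linear_fit(times, levels)
    # A slope of s dB/s in level corresponds to an amplitude gain of 10**(-s*t/20).
    compensated = [v * 10.0 ** (-slope * (n / fs) / 20.0) for n, v in enumerate(signal)]
    return compensated, slope
```

Applied to Gaussian noise weighted by a known exponential window, the fitted slope should come out close to the imposed decay, and the mean square of the compensated signal should be about the same at the beginning and at the end. On a measured response, coherent early reflections and residual background noise break this flatness, as described above.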

Fig. 5.8

Mean square of compensated impulse response

Figure 5.9 presents the mean compensated decay over the same window extending from 0.04 to 0.96 s. Now, the trend is different, with a positive mean slope of the curve. This corresponds to the fact that the yellow fit curve is not parallel to the magenta one in Fig. 5.7, having a smaller slope. In fact, it simply reflects the fact that computing mean values amounts to low-pass filtering, albeit a very rough one, and that reverberation times at low frequencies are usually longer than at higher frequencies.

Fig. 5.9

Mean compensated impulse response

As a consequence, it is not sufficient to look at mean and mean square values to check that compensated impulse responses follow a Gaussian distribution, as was originally proposed in [8]. One must refine the analysis.

6 Detailed Analysis

The Artefact (cf. Oxford Dictionary) observed in the previous section led us to slightly amend the compensation procedure by selecting a more conservative time window in its last step. Figure 5.10 presents the thus selected portion of the impulse response used in this section.

Fig. 5.10

Selected portion of compensated impulse response

6.1 Impulse Response

Visual inspection of Fig. 5.10 reveals that the selected portion of the compensated impulse response looks indeed like white noise. It is therefore meaningful to compare its properties to the properties of white noise, for example by looking at the histogram of the values it takes (Fig. 5.11).

Fig. 5.11

Histogram of compensated impulse response

Figure 5.11 compares the histogram of the compensated impulse response, traced in red, to the theoretical histogram of a Gaussian distribution with zero mean and the same variance, traced in blue, computed for the same number of samples as contained in the impulse response. The two traces look similar. Therefore, we decided to check the distribution with a statistical test. We used the Kolmogorov-Smirnov test, which compares the empirical distribution to a sample of the same number of random values that follow the theoretical distribution. We repeated the test several times with different simulated samples of the Gaussian distribution, and the compensated impulse response always passed the Kolmogorov-Smirnov test.
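The repeated comparison with simulated Gaussian samples can be sketched as a two-sample Kolmogorov-Smirnov test. The code below is an illustration under assumptions, not the procedure actually used for the results reported here: the p-value is the usual asymptotic approximation, the reference samples are drawn with zero mean and the same mean square as the signal, and the function names are hypothetical.

```python
import math
import random


def ks_statistic(a, b):
    """Largest distance between the empirical distribution functions of a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:  # step past all values equal to x
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic p-value of the two-sample KS distance d for sample sizes n and m."""
    ne = n * m / (n + m)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:  # the series converges poorly here; the p-value is essentially 1
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * s))


def repeated_ks(signal, trials, rng, alpha=0.05):
    """Fraction of trials in which the signal passes the test against a fresh
    zero-mean Gaussian sample of the same size and the same mean square."""
    n = len(signal)
    sigma = math.sqrt(sum(v * v for v in signal) / n)
    passes = 0
    for _ in range(trials):
        reference = [rng.gauss(0.0, sigma) for _ in range(n)]
        passes += ks_pvalue(ks_statistic(signal, reference), n, n) > alpha
    return passes / trials
```

A call such as `repeated_ks(samples, 10, random.Random(3))` returns the fraction of repetitions that pass at the 5% level. Since each repetition draws a new reference sample, the outcome varies from one run to the next unless the random generator is seeded.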

However, a zoom on the histogram around the peak of the distribution reveals a slight skew of the impulse response. Indeed, its distribution does not peak at zero, as expected, but slightly below it (Fig. 5.12). We interpret this as a misalignment of the deconvolution filter that recovers the impulse response from the sine sweep measurement. Great care must be taken in the alignment of the filter in order to ensure a Gaussian distribution, as well as the right amount of background noise.

Fig. 5.12

Zoom on the histogram revealing misalignment of the deconvolution filter

6.2 Transfer Function

In order to check the distribution of the transfer function, it is also necessary to carry out some compensation of the source spectrum. Indeed, the transfer function computed from the impulse response by Fourier transformation is far from being flat, as shown in Fig. 5.5, but only the compensated part of the impulse response must be taken into account.

Figure 5.13 presents the raw spectrum of the compensated part of the impulse response (red). This estimate of the transfer function is far from being flat, and needs smoothing before any compensation can be envisaged. Therefore, Fig. 5.13 also displays several methods for smoothing the spectrum, from the Welch spectrum computed with window lengths of 64 samples and 50% overlap (green), to Burg spectral estimation based on a 32nd order autoregressive process (light blue), and the Yule-Walker spectral estimator with the order of the autoregressive model set to 16 (dark blue).

Fig. 5.13

Spectrum of compensated part of impulse response, and its estimation according to several procedures

Since little difference remains between the Burg and Yule-Walker estimators, we decided to carry out the spectral compensation using the 16th order Yule-Walker estimator. In a similar way to the decay compensation of the impulse response, the complex spectrum computed from the compensated impulse response is then divided by the Yule-Walker estimator. After truncation to the central part of the spectrum, between 200 Hz and 14 kHz, this yields real and imaginary parts of the spectrum (Fig. 5.14) that look very similar to each other and, once again, similar to white noise. Thus, proper distribution analysis can now be carried out.
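A minimal sketch of a Yule-Walker spectral compensation is given below. It is not the code used for Fig. 5.14: the autoregressive coefficients are obtained with the Levinson-Durbin recursion from the biased autocorrelation estimate, and dividing each spectral value by the square root of the autoregressive power spectrum is an assumption about how the division by the estimator is carried out. All function names are hypothetical.

```python
import cmath
import math
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) / n for lag in range(max_lag + 1)]


def yule_walker(x, order):
    """AR coefficients phi (x[n] = sum_k phi[k-1] * x[n-k] + e[n]) and innovation variance."""
    r = autocorrelation(x, order)
    phi, err = [], r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(phi[k] * r[m - 1 - k] for k in range(m - 1))
        kappa = acc / err  # reflection coefficient of the Levinson-Durbin recursion
        phi = [phi[k] - kappa * phi[m - 2 - k] for k in range(m - 1)] + [kappa]
        err *= 1.0 - kappa * kappa
    return phi, err


def ar_psd(phi, err, omega):
    """AR power spectrum at angular frequency omega (radians per sample)."""
    denom = 1.0 - sum(p * cmath.exp(-1j * omega * (k + 1)) for k, p in enumerate(phi))
    return err / abs(denom) ** 2


def whitened_bin(x, phi, err, omega):
    """Fourier coefficient of x at omega, divided by the AR amplitude estimate."""
    dft = sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(x))
    return dft / math.sqrt(len(x) * ar_psd(phi, err, omega))
```

With `phi, err = yule_walker(samples, 16)`, the values `whitened_bin(samples, phi, err, omega)` evaluated over the retained band are the compensated complex spectrum whose real and imaginary parts are then submitted to the distribution analysis. With this normalisation, the squared modulus of a whitened value is expected to be of order one when the autoregressive model fits the spectrum well.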

Fig. 5.14

Compensated complex spectrum. Blue: real part; red: imaginary part

Figure 5.15 presents the histogram of the real part of the compensated transfer function, traced in blue, and compares it to the theoretical histogram of a Gaussian distribution with zero mean and the same variance, traced in red, computed for the same number of samples as contained in the transfer function. The two traces look similar. Therefore, we decided to check the distribution with a statistical test. Once again, we used the Kolmogorov-Smirnov test, and repeated it several times with different simulated samples of the Gaussian distribution. The compensated transfer function passed the Kolmogorov-Smirnov test only some of the time.

Fig. 5.15

Histogram of real part of compensated transfer function

This time, a zoom on the histogram around the peak of the distribution reveals a skew of both the empirical and theoretical distributions. Neither of them peaks at zero, as expected: the empirical distribution peaks slightly below it, and the theoretical distribution slightly above it (Fig. 5.16). Here, an interpretation of the discrepancy is less evident, although it is obvious from Fig. 5.14 that the variance of the distribution slightly decreases with frequency, which probably explains why the Kolmogorov-Smirnov test sometimes fails.

Fig. 5.16

Zoom on the histogram revealing abnormal values of both curves

Figure 5.17 presents the histogram of the imaginary part of the compensated transfer function, traced in blue, and compares it to the theoretical histogram of a Gaussian distribution with zero mean and the same variance, traced in red, computed for the same number of samples. The two traces look similar, except around the center of the distribution where the empirical distribution visibly has a higher peak than the theoretical one. As a consequence, several repetitions of the Kolmogorov-Smirnov test with different simulated samples of the Gaussian distribution always failed.

Fig. 5.17

Histogram of imaginary part of compensated transfer function

A zoom on the histogram around the peak of the distribution confirms that the empirical distribution has a higher, and simultaneously narrower, peak than the theoretical one. Further, both peak at the same slightly negative value, and not at zero as expected (Fig. 5.18). This behavior points to a non-constant variance over the whole frequency range, as is visible in Fig. 5.14: the variance of the distribution decreases with frequency. This is why the Kolmogorov-Smirnov test always fails.

Fig. 5.18

Zoom on the histogram revealing abnormal peak of empirical distribution

Indeed, complete analysis of the impulse response [8] reveals that the variance of the frequency response at a given frequency is proportional to the reverberation time at that frequency. Since reverberation times at high frequencies are usually shorter, the variance at high frequencies is also smaller. Proper compensation of the spectrum should therefore take this property into account.

7 Conclusion

In this paper, we hope to have convinced the reader that, despite some unsuccessful previous attempts to prove it, impulse responses are Gaussian processes, provided that a global analysis is carried out on the basis of a proper model of impulse responses. In this process, three points are essential: discard the early part with strong reflections, and the late part which simply is background noise; accurately compensate for the decay; and compensate for the source spectrum. Further, we have shown that such an analysis can reveal the shortcomings of the measurement procedure, especially of the inverse filtering used to recover the impulse response from the measurement signal: care must be taken that it accurately provides zero mean.

Moreover, the analysis has also revealed that Schroeder’s simple model is not sufficient, especially in the frequency domain. There remains a frequency-dependent variance which Schroeder’s model does not account for. As a consequence, a more complex time-frequency compensation is needed to improve the model so that it passes the statistical tests. This sets the goal for further research in the domain.