7.1 Sound Source Distance and Sound Image Distance

First, the experimental results on the relationship between the sound source distance and the sound image distance are introduced.

Figure 7.1 shows the relationship between the distance of a loudspeaker placed up to 10 m in front of the listener and the sound image distance (Von Bekesy 1949). Note that, in this figure, the horizontal axis is the sound image distance, and the vertical axis is the sound source distance. The sound image distance coincides with the sound source distance up to approximately 3 m. Beyond that, the sound image distance increases only slightly, even as the sound source distance continues to increase.

Fig. 7.1
figure 1

Relationship between the distance from a listener to a speaker and the sound image distance perceived by the listener (Von Bekesy 1949). Open circles and × symbols denote the responses of two subjects. The bold line denotes the average of five subjects. The subjects were blindfolded.

In other words, sound images are not perceived over an unlimited range of distances; the auditory space within which the sound image distance is perceived is limited.

Why does such a phenomenon occur? As described above, in direction perception, an HRTF, the characteristics of which change significantly depending on the sound source direction, is an important cue. However, the HRTF depends on the distance only in the near sound field within approximately 1 m of the sound source and changes only slightly at greater distances.

In other words, for a sound source at a distance of 1 m or more, the HRTF does not become a cue for distance perception. The absence of “physical cues derived from the human body” reflecting the difference in sound source distance makes distance perception difficult.

7.2 Physical Characteristics that Affect Sound Image Distance

As sound waves propagate through space, several physical characteristics change with distance and thereby affect the sound image distance. The principal characteristics are described below.

7.2.1 Sound Pressure Level

When the output sound pressure of the sound source is kept constant and the distance from the sound source is changed, the sound pressure level at the receiving point changes, and as a result, the loudness also changes.
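As a rough illustration of this level change, the following sketch assumes an idealized point source in a free field, for which the sound pressure level falls by about 6 dB for each doubling of distance. This model, the function name, and the reference level of 70 dB at 1 m are assumptions made for illustration and are not taken from the experiments described here.

```python
import math


def spl_at_distance(spl_ref_db, ref_distance_m, distance_m):
    """Level at distance_m, given the level at ref_distance_m.

    Assumes a point source in a free field (inverse-square law).
    """
    if ref_distance_m <= 0 or distance_m <= 0:
        raise ValueError("distances must be positive")
    return spl_ref_db - 20.0 * math.log10(distance_m / ref_distance_m)


# Hypothetical source producing 70 dB at 1 m.
for d in (1.0, 2.0, 3.0, 9.0):
    print(f"{d:4.1f} m: {spl_at_distance(70.0, 1.0, d):5.1f} dB")
```

In a real room, reflections add energy at the receiving point, so the level usually falls off more slowly than this sketch predicts.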

Figure 7.2 shows subjects’ responses to the sound image distance obtained in the following experiment (Gardner 1969). Five loudspeakers were arranged in a line at equal intervals between 3 m and 9 m in front of the subject, and only the two loudspeakers at 3 m and 9 m emitted voices, at various sound pressure levels. The sound image distance is independent of the actual distance of the sound source and depends on the sound pressure level at the position of the subject. Similar results have been obtained in a number of studies, and it is clear that the sound pressure level at the listener’s position affects the sound image distance.

Fig. 7.2
figure 2

Relationship between the sound pressure level at the listener and the sound image distance for loudspeakers placed at 3 m and 9 m. (Gardner 1969)

Here, let us consider the mechanism by which the sound pressure level is related to the sound image distance. In order for the sound pressure level to be a cue for distance perception, it is necessary for the listener to have knowledge of the listening sound pressure level or the loudness of the target sound source at a certain distance in advance. Since this condition is not always satisfied, it is not always possible to accurately perceive the distance of the sound source based on the sound pressure level.
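The role of this prior knowledge can be sketched as follows. Assuming again an idealized free-field point source (an illustrative assumption, not a model proposed in the text), a listener who expects a certain level at 1 m can invert the level-distance relation to obtain a distance estimate. The function name and all numerical values are hypothetical.

```python
def estimated_distance_m(received_db, assumed_db_at_1m):
    """Distance implied by the received level under an assumed source level.

    Inverts L(d) = L(1 m) - 20*log10(d) for a free-field point source.
    """
    return 10.0 ** ((assumed_db_at_1m - received_db) / 20.0)


# The listener assumes the source produces 70 dB at 1 m (hypothetical).
assumed = 70.0

# Correct assumption: a received level of 60 dB implies about 3.2 m.
print(estimated_distance_m(60.0, assumed))

# If the source is actually 10 dB louder than assumed, the same received
# level comes from a source that is farther away by a factor of
# 10 ** (10 / 20), yet the estimate is unchanged.
true_distance = estimated_distance_m(60.0, assumed + 10.0)
print(true_distance)
```

Under this sketch, an error of 10 dB in the assumed source level changes the implied distance by a factor of about 3.2, which is one way to see why level alone is an unreliable distance cue for unfamiliar sources.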

Figure 7.3 shows the relationships between the sound source distance and the sound image distance in an anechoic room for different types of live voice, such as “whisper”, “shout”, “low level”, and “conversation level”. Since the subject was blindfolded, there was no visual information on the sound source distance. This figure shows that even for the same sound source distance, the sound image distance for “shout” is farther than that for “conversation”, and that for “whisper” is closer than that for “conversation”. This suggests that humans learn the relationship between the sound source distance and the listening sound pressure level or loudness for each type of sound source and use this learned relationship for distance perception.

Fig. 7.3
figure 3

Relationships between the sound source distance and the sound image distance for different types of live voice. (Gardner 1969)

7.2.2 Time Delay of Reflections

In a usual sound field, in addition to direct sound, many reflected sounds are incident. It has been reported that sound image distance increases with an increase in the delay time of a reflection, as shown in Fig. 7.4 (Gotoh et al. 1977).

Fig. 7.4
figure 4

Relationship between the time delay of a single reflection and the sound image distance. (Gotoh et al. 1977)

Furthermore, experiments were performed in which the reflections of an actual sound field were simulated while the distance between the sound source and the receiving point was changed. The results showed that the perceived sound image distances followed the same order as the distances between the sound source and the receiving point.

These results suggest that humans use reflections as a cue for distance perception.
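For orientation, the delay of a reflection corresponds to the extra path that the reflected sound travels relative to the direct sound. The sketch below converts between the two, assuming a speed of sound of 343 m/s (air at roughly 20 °C). The constant, the function names, and the example values are assumptions for illustration and do not come from the cited experiments.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 °C


def extra_path_m(delay_ms):
    """Extra path length travelled by a reflection arriving delay_ms late."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    return SPEED_OF_SOUND_M_S * delay_ms / 1000.0


def reflection_delay_ms(direct_path_m, reflected_path_m):
    """Delay of a reflection relative to the direct sound, in milliseconds."""
    if reflected_path_m < direct_path_m:
        raise ValueError("reflected path cannot be shorter than direct path")
    return (reflected_path_m - direct_path_m) / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical example: a reflection arriving 10 ms after the direct sound.
print(extra_path_m(10.0))
print(reflection_delay_ms(3.0, 3.0 + extra_path_m(10.0)))
```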

7.2.3 Incident Direction

A. Relationship between incident azimuth angle and sound image distance

The sound image distance is also influenced by the incident direction. Experiments were conducted in which white noise was presented in pairs to a subject from loudspeakers placed at twelve azimuth angles in the horizontal plane (in 30° steps) in an anechoic chamber.

Figure 7.5 shows the sound image distance obtained by Ura’s variation of Scheffé’s method of paired comparisons. The curve is concave at the front and widens toward the rear. That is, the sound images at 0° and ±30° are near, and the sound images at 180° and ±150° are far.

Fig. 7.5
figure 5

Relationship between the azimuth angle of a sound source and the sound image distance

Figure 7.6 shows groups of azimuth angles, in which there exists no significant difference in sound image distance (p < 0.01). The twelve azimuth angles were divided into five groups: front, diagonally front, sideways, diagonally back, and rear. The sound sources located at symmetrical positions are in the same group, and the distance perception of a sound source in the horizontal plane is considered to have left-right symmetry.

Fig. 7.6
figure 6

Groups of azimuth angles, in which there exists no significant difference in sound image distance (p < 0.01)

Figure 7.7 shows the relationship between the azimuth angle and the sound image distance for each subject. There are some individual differences in how the sound image distance depends on the incident azimuth angle.

Fig. 7.7
figure 7

Relationship between the azimuth angle of a sound source and the sound image distance for each subject

Figure 7.8 shows, for each subject, groups of azimuth angles in which there exists no significant difference in sound image distance (p < 0.01). The azimuth angle at which the sound image distance was the nearest was 0° for seven out of ten subjects. The azimuth angle at which the sound image distance was the farthest was 150°, −150°, or 180° for eight out of ten subjects. This tendency agrees with that of the average sound image distance for all subjects.

Fig. 7.8
figure 8

Groups of azimuth angles, in which there exists no significant difference in sound image distance (p < 0.01) for each subject

Although the relationship between the incident azimuth angle of the sound source and the sound image distance differs somewhat depending on the subject, the tendency to perceive a front sound image as being near and a rear sound image as being far is common. For subject F, however, no pair of azimuth angles differed significantly.

B. Relationship between incident vertical angle and sound image distance

The sound image distance was obtained for seven vertical angles in the upper median plane (in 30° steps) for ten subjects, in the same manner as for the azimuth angle. The results are shown in Fig. 7.9. Sound images at 0° and 30° were perceived to be near, and those at 90°, 120°, and 150° were perceived to be far.

Fig. 7.9
figure 9

Relationship between the vertical angle of a sound source and the sound image distance

Figure 7.10 shows groups of vertical angles, in which there exists no significant difference in sound image distance (p < 0.01). The seven vertical angles were divided into three groups. The vertical angle at which the sound image distance was the nearest was 0°, and the direction in which the sound image distance was the farthest was diagonally back.

Fig. 7.10
figure 10

Groups of vertical angles, in which there exists no significant difference in sound image distance (p < 0.01)

Figure 7.11 shows the relationship between the vertical angle and the sound image distance for each subject. The vertical angle at which the sound image distance was the nearest was 0° or 30° for nine out of ten subjects. All ten subjects responded that 90°, 120°, or 150° was the farthest.

Fig. 7.11
figure 11

Relationship between the vertical angle of the sound source and the sound image distance for each subject

Figure 7.12 shows, for each subject, groups of vertical angles in which there exists no significant difference in sound image distance (p < 0.01). The vertical angle group for which the sound image distance was the nearest included 0° for seven out of ten subjects. The vertical angle group for which the sound image distance was the farthest included 120° for eight out of ten subjects.

Fig. 7.12
figure 12

Groups of vertical angles, in which there exists no significant difference in sound image distance (p < 0.01) for each subject

These results imply that most of the subjects perceived the front sound images as being near and the diagonally rear sound images as being far. For subject J, however, no pair of vertical angles differed significantly.

C. Relationship between binaural sound pressure level and sound image distance

One reason for the difference in sound image distance depending on the sound source direction is considered to be the binaural summation of sound pressure level (BSPL), which is defined as follows (Robinson and Whittle 1960):

$$ \mathrm{BSPL} = 6\log_2\left(2^{L_l/6} + 2^{L_r/6}\right) $$
(7.1)

where $L_l$ and $L_r$ indicate the sound pressure levels at the left ear and the right ear, respectively.
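Equation (7.1) can be evaluated directly, as in the following minimal sketch. The function name and the example levels are hypothetical and serve only to show how the formula behaves.

```python
import math


def bspl(left_db, right_db):
    """Binaural summation of sound pressure level, Eq. (7.1)."""
    return 6.0 * math.log2(2.0 ** (left_db / 6.0) + 2.0 ** (right_db / 6.0))


# Hypothetical ear levels in dB.
print(bspl(60.0, 60.0))  # equal levels at both ears
print(bspl(60.0, 54.0))  # right ear 6 dB lower
print(bspl(60.0, 30.0))  # right ear much lower
```

Two properties follow from the formula: equal levels at both ears give a BSPL 6 dB above the level at each ear, and when one ear receives a much lower level, the BSPL approaches the level at the louder ear.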

Figure 7.13 shows the average values of all subjects’ relative BSPLs in the horizontal plane and the upper median plane, with the BSPL of the front direction set as 0 dB. This figure suggests that the BSPL in the front direction is large in both the horizontal and median planes, and that there is a negative correlation between the BSPL and the sound image distance.

Fig. 7.13
figure 13

Relative BSPL in the horizontal plane and the upper median plane

Figure 7.14 shows the relative BSPL of each subject in the upper median plane. For all subjects, the BSPL at the front direction is large, and, in the eight subjects excluding subjects G and J, the BSPL at the diagonally back direction is small.

Fig. 7.14
figure 14

Relative BSPL in the upper median plane for each of ten subjects

Furthermore, the result of a simple regression analysis with the relative BSPL as the objective variable and the sound image distance as the explanatory variable is shown in Fig. 7.15. With the exception of several subjects, the sound image tends to be perceived as nearer as the BSPL increases.
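A simple regression of this kind can be sketched with ordinary least squares, as below. The data are made-up placeholder values chosen to lie exactly on a line, not measurements from the experiment, and the function name is hypothetical. The sketch only shows how a slope, an intercept, and a correlation coefficient would be obtained for one subject.

```python
import math


def simple_regression(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r), where r is Pearson's correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder data only (NOT experimental values):
# explanatory variable: sound image distance scale value (arbitrary units)
# objective variable: relative BSPL in dB
distance_scale = [-1.0, -0.5, 0.0, 0.5, 1.0]
relative_bspl_db = [0.0, -1.0, -2.0, -3.0, -4.0]

slope, intercept, r = simple_regression(distance_scale, relative_bspl_db)
print(slope, intercept, r)
```

A negative slope, as in this placeholder example, corresponds to the tendency described above: the larger the BSPL, the nearer the sound image.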

Fig. 7.15
figure 15

Relationship between relative BSPL and sound image distance for each of ten subjects

D. Summary of the relationship between incident direction and sound image distance

In summary, the sound image distance differs depending on the incident direction of sound both in the horizontal plane and in the median plane. In the horizontal plane, the sound image distance in the front direction is near, and that in the rear direction (150° to 210°) is far. In the median plane, the sound image distance in the front direction is near, and those in the above and diagonally back directions (90° to 150°) are far, as shown in Fig. 7.16.

Fig. 7.16
figure 16

Sound image distance in the horizontal plane and the median plane

It has been reported that the “sound image distance of the front direction is perceived nearer than those in other directions” when a three-dimensional acoustic reproduction system is used. The reason was thought to be that the signal processing was not realized precisely. However, the experimental results suggest that the frontal sound image is perceived as near not only because of a signal processing problem, but also because of human auditory characteristics.

This suggests that equidistant sound images could be generated by presenting sounds so that the BSPL is the same for every direction (Fig. 7.17).
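One way to picture such BSPL equalization is sketched below. It relies on a property of Eq. (7.1): adding the same gain g (in dB) to both ear levels raises the BSPL by exactly g. The sketch assumes that changing the output level of a source shifts the levels at both ears equally. The ear levels are hypothetical values, not measured data, and the function names are illustrative.

```python
import math


def bspl(left_db, right_db):
    """Binaural summation of sound pressure level, Eq. (7.1)."""
    return 6.0 * math.log2(2.0 ** (left_db / 6.0) + 2.0 ** (right_db / 6.0))


def equalizing_gain_db(target_bspl_db, left_db, right_db):
    """Gain in dB to apply to a source so that its BSPL matches the target."""
    return target_bspl_db - bspl(left_db, right_db)


# Hypothetical ear levels (left, right) in dB for equal output level.
ear_levels = {
    "front": (60.0, 60.0),
    "side": (62.0, 50.0),
    "rear": (57.0, 57.0),
}

target = bspl(*ear_levels["front"])  # use the front direction as reference
for name, (left, right) in ear_levels.items():
    gain = equalizing_gain_db(target, left, right)
    print(f"{name}: gain {gain:+.2f} dB, "
          f"BSPL after gain {bspl(left + gain, right + gain):.2f} dB")
```

Whether equal BSPL actually yields equal sound image distance for a given listener is an empirical question; the sketch only shows how the required level adjustments could be computed.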

Fig. 7.17
figure 17

Sound image distance in case of equal output sound pressure and equal BSPL