11.1 Representative HRTF Database

Table 11.1 lists representative publicly available HRTF database sites.

Table 11.1 Publicly available sites of HRTF databases

In this chapter, the databases of the following five research institutes are compared.

  1. Acoustics Research Institute (ARI), Austria
  2. Center for Image Processing and Integrated Computing Interface Laboratory (CIPIC), U.S.A.
  3. Spatial Hearing Laboratory (SHL), Chiba Institute of Technology, Japan
  4. Institut de Recherche et Coordination Acoustique/Musique (IRCAM), France
  5. Research Institute of Electrical Communication (RIEC), Tohoku University, Japan

An outline of these databases is shown in Table 11.2 (Yan et al. 2014). All research institutes measured HRTFs under the blocked-entrance condition (Shaw and Teranishi 1968).

Table 11.2 Outline of the five HRTF databases considered herein. (Yan et al. 2014)

The number of subjects ranges from 45 (CIPIC) to 105 (RIEC). At the four research institutes other than IRCAM, a vertical array of loudspeakers is installed, and HRIRs for various three-dimensional directions were measured by rotating either the subject or the array horizontally. At IRCAM, HRIRs were measured by moving a single loudspeaker vertically and then rotating the subject horizontally.
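As a rough illustration of how a vertical loudspeaker array combined with horizontal rotation covers three-dimensional directions, the following Python sketch enumerates an idealized measurement grid. The function name `measurement_directions`, the elevation list, and the 10° rotation step are hypothetical values chosen for illustration. They are not the settings of any of the five databases, whose grids are generally not uniform.

```python
from itertools import product


def measurement_directions(elevations_deg, azimuth_step_deg):
    """Enumerate (azimuth, elevation) pairs for an idealized setup.

    elevations_deg: elevations of the loudspeakers in the vertical array
                    (hypothetical values, in degrees).
    azimuth_step_deg: horizontal rotation step of the subject or the array
                      (hypothetical value, must divide 360).
    """
    if azimuth_step_deg <= 0 or 360 % azimuth_step_deg != 0:
        raise ValueError("azimuth_step_deg must be a positive divisor of 360")
    # One azimuth per rotation position over a full turn.
    azimuths = [k * azimuth_step_deg for k in range(360 // azimuth_step_deg)]
    # Every rotation position is measured with every loudspeaker of the array.
    return list(product(azimuths, elevations_deg))


# Illustrative setup: five loudspeakers, 10-degree rotation step.
directions = measurement_directions([-30, -15, 0, 15, 30], 10)
print(len(directions))  # 36 azimuths x 5 elevations = 180
```

In this idealized case, the number of measurement directions is simply the number of loudspeakers multiplied by the number of rotation positions.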

There are large differences in the lengths of the measured HRIRs: 200 samples at CIPIC versus 8192 samples at IRCAM. The number of measurement directions is 7 to 148 at SHL, 1250 at CIPIC, 1550 at ARI, 187 at IRCAM, and 865 at RIEC.

11.2 Comparison of Spectral Cues

Next, the N1, N2, and P1 frequencies are compared among the databases. They were calculated from the front-direction HRIRs of the five databases using the method described in Sect. 10.2, and their histograms are shown in Fig. 11.1 (Yan et al. 2014).
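To make the extraction step concrete, the following Python sketch picks P1, N1, and N2 candidates from an HRIR. It is not the method of Sect. 10.2 but a simplified stand-in: it assumes that P1 is the lowest local maximum of the magnitude spectrum at or above a threshold `f_min`, and that N1 and N2 are the first two local minima above P1. The function names, the 3 kHz threshold, and the synthetic two-tap HRIR are all assumptions made for illustration.

```python
import cmath
import math


def magnitude_db(hrir, fs, freqs):
    """Evaluate the HRTF magnitude in dB at the given frequencies (Hz).

    A direct DFT keeps the sketch dependency-free; an FFT routine such as
    numpy.fft.rfft would be the usual choice in practice.
    """
    out = []
    for f in freqs:
        w = -2j * math.pi * f / fs
        h = sum(x * cmath.exp(w * n) for n, x in enumerate(hrir))
        out.append(20.0 * math.log10(max(abs(h), 1e-12)))
    return out


def pick_p1_n1_n2(freqs, mag_db, f_min=3000.0):
    """Return (P1, N1, N2) frequencies, or None if they cannot be found.

    Assumption: P1 is the lowest local maximum at or above f_min, and
    N1 and N2 are the first two local minima above P1.
    """
    inner = range(1, len(mag_db) - 1)
    peaks = [i for i in inner
             if freqs[i] >= f_min and mag_db[i - 1] < mag_db[i] >= mag_db[i + 1]]
    if not peaks:
        return None
    p1 = peaks[0]
    notches = [i for i in inner
               if i > p1 and mag_db[i - 1] > mag_db[i] <= mag_db[i + 1]]
    if len(notches) < 2:
        return None
    return freqs[p1], freqs[notches[0]], freqs[notches[1]]


# Synthetic example (not measured data): a direct sound plus one reflection
# six samples later produces a comb-like spectrum at fs = 48 kHz.
fs = 48000.0
freqs = [250.0 * k for k in range(97)]  # 0 Hz to 24 kHz in 250 Hz steps
hrir = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
print(pick_p1_n1_n2(freqs, magnitude_db(hrir, fs, freqs)))
# (8000.0, 12000.0, 20000.0)
```

Applying such a function to the front-direction HRIR of every subject gives one P1, N1, and N2 value per ear, from which histograms like those in Fig. 11.1 can be built.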

Fig. 11.1 Histograms of P1, N1, and N2 frequencies of HRTFs for the front direction for the five HRTF databases. (Yan et al. 2014)

The histogram for each database is approximately normally distributed. However, the peaks of the ARI histograms lie at higher frequencies than those of the other databases.

The average, minimum, and maximum frequencies of N1, N2, and P1 for the front direction of each database are shown in Table 11.3. For N1 and P1, RIEC has the lowest average frequency and ARI the highest. For N2, SHL has the lowest average frequency and ARI the highest.

Table 11.3 Average, minimum, and maximum frequencies of N1, N2, and P1 for the front direction for the five databases (Hz). (Yan et al. 2014)

Thus, compared with the other databases, the N1, N2, and P1 frequencies are low in the Japanese databases (SHL and RIEC) and high in ARI.

Furthermore, statistical tests were performed to determine whether the average frequencies differ significantly among the databases. The results are shown in Table 11.4.

Table 11.4 Results of statistical tests for N1, N2, and P1 frequency. (Yan et al. 2014)

The average frequency of ARI was significantly higher than those of the other four databases for N1, N2, and P1 (p < 0.01). For P1, the average frequency of RIEC was significantly lower than those of the other four databases (p < 0.05).
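The text does not state which statistical test was used for Table 11.4. As one possible way of comparing the average frequency of two databases, the following Python sketch applies a two-sided permutation test to the difference of means. The function name `permutation_test_mean_diff` and the frequency values are hypothetical. The numbers are made up for illustration and are not taken from any of the databases.

```python
import random
from statistics import mean


def permutation_test_mean_diff(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value: the fraction of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (count + 1) / (n_perm + 1)


# Made-up N1 frequencies (Hz) for two hypothetical groups of subjects.
group_a = [7200, 7450, 7600, 7850, 8000, 8100, 8300, 8450]
group_b = [8200, 8500, 8650, 8800, 9000, 9150, 9300, 9600]
p_value = permutation_test_mean_diff(group_a, group_b, n_perm=2000)
print(f"p = {p_value:.4f}")
```

When all pairs of the five databases are compared in this way, a correction for multiple comparisons would also be needed before interpreting the individual p-values.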

11.3 Comparison of Pinna Shape

As mentioned in Chap. 3, N1, N2, and P1 are caused by resonances of the cavities in the pinna. The differences in the N1, N2, and P1 frequencies among the databases are therefore presumably related to differences in pinna size.

Of the five HRTF databases for which the N1, N2, and P1 frequencies were analyzed, detailed pinna anthropometric data for the subjects are available for ARI, CIPIC, and SHL. The numbers of subjects for ARI, CIPIC, and SHL are 40 (80 ears), 34 (74 ears), and 28 (56 ears), respectively.

Histograms of each pinna anthropometric dimension are shown in Fig. 11.2, and their statistics are shown in Table 11.5. The average pinna anthropometric dimensions of SHL are larger than those of ARI and CIPIC, except for x7.

Fig. 11.2 Histograms of pinna anthropometric dimensions. (Yan et al. 2014)

Table 11.5 Statistics for pinna anthropometric dimensions (mm). (Yan et al. 2014)

Statistical tests were performed to determine whether the average pinna anthropometric dimensions differ significantly among the databases. The results are shown in Table 11.6. Significant differences (p < 0.05) were observed for almost all combinations. In other words, ear size differs among the databases.

Table 11.6 Results of statistical tests for pinna anthropometric dimensions. ∗: p < 0.05, ∗∗: p < 0.01 (Yan et al. 2014)

In the previous section, we found that the N1, N2, and P1 frequencies were higher for ARI than for the other four databases. Let us consider the reason. Since N1, N2, and P1 are generated by resonances in the pinna cavities, it is inferred that the pinna dimensions of ARI are smaller than those of the other databases. The pinna anthropometric parameter for which the average dimension of ARI was significantly smaller than in the other databases was x6 (length of the cavity of the concha) (p < 0.01). As described in Sect. 3.5, x6 has a significant effect on the N1, N2, and P1 frequencies. The smaller x6 of the ARI subjects is therefore considered to be one of the reasons why the N1, N2, and P1 frequencies for ARI are higher than those for the other databases.