1 Introduction

Auralization is a subject of high scientific interest due to its extensive applications in areas such as research and consultancy [1,2,3]. In that sense, the development of new methods for interactive spatial sound reproduction that allow a more realistic listening experience is desirable. Binaural reproduction can be achieved by using headphones or loudspeakers with crosstalk cancellation [4]. Arrays of loudspeakers can also be used to synthesize the acoustic field based on spatial reproduction techniques such as Ambisonics or Wave Field Synthesis [5].

A methodology commonly used to reconstruct acoustic fields that satisfy the homogeneous Helmholtz equation is based on a Plane Wave Expansion (PWE) [6]. This mathematical representation allows for the use of multiple sound reproduction techniques and the generation of interactive auralizations in which the listener can interact with the virtual space [7]. For example, translation can be achieved simply by applying delays that modify the phase of the plane waves according to the listener’s movement [8].

In terms of rotation, several methods can be applied to a plane wave expansion. The interpolation of Head Related Transfer Functions (HRTFs) is widely used by the scientific community, but it is restricted to binaural reproduction [9]. Other approaches that enable multiple sound reproduction techniques are spherical harmonics [10] and Vector Base Amplitude Panning (VBAP) [11]. An analysis of these last two methods as rotation operators is presented in this paper.

The paper is organized as follows: the theoretical bases of the plane wave expansion and the derivation of the rotation operators are described in Sect. 2. Section 3 addresses the methods and experiments. An analysis of the outcomes is carried out in Sect. 4. Finally, in Sect. 5, the conclusions are presented.

2 Theoretical Bases

2.1 Plane Wave Expansion

A sound field that satisfies the homogeneous Helmholtz equation can be described in terms of plane waves as [7]

$$\begin{aligned} p(\mathbf {x},\omega )=\int _{\mathbf {\widehat{y}} \in \varOmega } e^{jk\mathbf {x}\cdot \mathbf {\widehat{y}}} q(\mathbf {\widehat{y}},\omega ) d\varOmega (\mathbf {\widehat{y}}), \end{aligned}$$
(1)

where j is the imaginary unit, k is the wavenumber, \(\mathbf {x}\) corresponds to the evaluation point, \(\mathbf {\widehat{y}}\) is a unit vector identifying the direction of arrival of each plane wave, q is the amplitude density function and \(\varOmega \) denotes a sphere of unit radius. This continuous distribution can be related to an infinite number of loudspeakers located far from the listener’s position. Nevertheless, the use of an infinite number of loudspeakers is not feasible in practice, so Eq. (1) must be discretized into a finite number of L plane waves, leading to

$$\begin{aligned} p(\mathbf {x},\omega )=\sum ^{L}_{l=1} e^{jk\mathbf {x}\cdot \mathbf {\widehat{y}}_{l}} q(\mathbf {\widehat{y}}_{l},\omega )\varDelta \varOmega _{l}, \end{aligned}$$
(2)

in which \(\varDelta \varOmega _{l}\) is the area attributed to each direction \(\mathbf {\widehat{y}}_{l}\). A consequence of discretizing Eq. (1) is that the accuracy of the reconstructed acoustic field depends on the location: the field is accurate only within a limited region around the origin of the expansion. In that sense, [12] proposes the following relation between the number of plane waves, the frequency of the field and the area of accurate reconstruction

$$\begin{aligned} L = \left( \left\lceil 2\pi \frac{R}{\lambda } \right\rceil + 1\right) ^{2}, \end{aligned}$$
(3)

where \(\left\lceil \cdot \right\rceil \) is the ceiling operator, L is the number of plane waves, \(\lambda \) is the wavelength and R is the radius of a sphere within which the reconstruction is accurate. Finally, it is important to point out that the use of plane waves as the propagation kernel imposes constraints, for example on the reconstruction of near fields.
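As an illustration of Eq. (2), the following minimal Python sketch evaluates a discretized plane wave expansion at a point and checks that translating the origin amounts to a phase shift of each plane wave amplitude, as mentioned in the Introduction. The four directions, the amplitudes, the equal area weights and the speed of sound of 343 m/s are assumptions chosen only for illustration; they are not the configuration used in this paper.

```python
import cmath
import math

def unit_vector(theta, phi):
    """Unit vector for polar angle theta (measured from +z) and azimuth phi, in radians."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def pwe_pressure(x, k, directions, q, areas):
    """Eq. (2): sum over l of exp(j k x.y_l) * q_l * dOmega_l."""
    return sum(cmath.exp(1j * k * dot(x, y)) * q_l * a
               for y, q_l, a in zip(directions, q, areas))

def translate(q, k, directions, x0):
    """Phase-shift each amplitude so that the new origin sits at x0."""
    return [q_l * cmath.exp(1j * k * dot(x0, y)) for q_l, y in zip(q, directions)]

# Illustrative values (assumed, not taken from the paper).
k = 2 * math.pi * 250.0 / 343.0          # wavenumber at 250 Hz, c = 343 m/s assumed
directions = [unit_vector(math.pi / 2, math.radians(a)) for a in (0, 90, 180, 270)]
q = [1.0 + 0j, 0.5j, 0.25 + 0j, 0j]
areas = [math.pi] * 4                    # equal areas that add up to 4*pi

x, x0 = (0.2, 0.4, 0.0), (0.3, -0.1, 0.0)
p_moved = pwe_pressure(x, k, directions, translate(q, k, directions, x0), areas)
p_direct = pwe_pressure((x[0] + x0[0], x[1] + x0[1], x[2] + x0[2]),
                        k, directions, q, areas)
```

In this sketch `p_moved` and `p_direct` agree up to rounding, because multiplying each amplitude by \(e^{jk\mathbf {x}_{0}\cdot \mathbf {\widehat{y}}_{l}}\) is equivalent to evaluating the original expansion at \(\mathbf {x}+\mathbf {x}_{0}\).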

2.2 Rotation Operators

Spherical Harmonics

The spherical harmonics are the angular component of the solution of the wave equation when it is expressed in spherical coordinates. For an interior case, in which there are not acoustic sources inside of the reconstructed area, the acoustic pressure is given by [13]

$$\begin{aligned} p(\mathbf {r},\omega )=\sum ^{\infty }_{n=0}\sum ^{n}_{m=-n} A_{nm}(\omega )j_{n}(kr)Y_{n}^{m}(\theta ,\phi ), \end{aligned}$$
(4)

where \(j_{n}\) is the spherical Bessel function of the first kind of order n and \(Y_{n}^{m}(\theta ,\phi )\) are the spherical harmonics, defined as

$$\begin{aligned} Y_{n}^{m}(\theta ,\phi )=\sqrt{\frac{(2n+1)}{4\pi }\frac{(n-m)!}{(n+m)!}}P_{n}^{m}(\cos \theta )e^{jm\phi }, \end{aligned}$$
(5)

in which \(P_{n}^{m}\) is the associated Legendre function. The discretized plane wave expansion, namely Eq. (2), can be described in terms of spherical harmonics using the Jacobi-Anger expansion as [14]

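$$\begin{aligned} p(\mathbf {x},\omega )=4\pi \sum ^{L}_{l=1}\sum ^{\infty }_{n=0}\sum ^{n}_{m=-n}j^{n}j_{n}(kr)Y_{n}^{m}(\theta ,\phi )Y_{n}^{m}(\theta _{l},\phi _{l})^{*}q_{l}(\omega )d\varOmega (\widehat{\mathbf {y}_{l}}), \end{aligned}$$
(6)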

where \((\cdot )^{*}\) denotes the complex conjugate. Based on the orthogonality relation of the spherical harmonics, Eq. (6) can be simplified as

$$\begin{aligned} A_{nm}(\omega )= 4\pi j^{n}\sum ^{L}_{l=1}q_{l}(\omega )Y_{n}^{m}(\theta _{l},\phi _{l})^{*}d\varOmega (\widehat{\mathbf {y}_{l}}). \end{aligned}$$
(7)

Equation (7) describes the plane wave expansion in terms of complex spherical harmonic coefficients. Based on this representation, it is possible to apply a rotation operator to the sound field and return to the plane wave domain after the rotation has been performed. A shift of \(\phi _{0}\) in the azimuthal angle can be expressed as

$$\begin{aligned} p(r,\theta ,\phi -\phi _{0},\omega )=\sum ^{\infty }_{n=0}\sum ^{n}_{m=-n} A_{nm}(\omega )j_{n}(kr)Y_{n}^{m}(\theta ,\phi -\phi _{0}). \end{aligned}$$
(8)

Expanding the right side of Eq. (8)

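$$\begin{aligned} p(r,\theta ,\phi -\phi _{0},\omega )=\sum ^{\infty }_{n=0}\sum ^{n}_{m=-n} A_{nm}(\omega )j_{n}(kr)\sqrt{\frac{(2n+1)}{4\pi }\frac{(n-m)!}{(n+m)!}}P_{n}^{m}(\cos \theta )e^{jm\phi }e^{-jm\phi _{0}} \end{aligned}$$
(9)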

yields

$$\begin{aligned} p(r,\theta ,\phi -\phi _{0},\omega )=\sum ^{\infty }_{n=0}\sum ^{n}_{m=-n}j_{n}(kr)Y_{n}^{m}(\theta ,\phi )A_{\phi _{0}nm}(\omega ), \end{aligned}$$
(10)

in which

$$\begin{aligned} A_{\phi _{0}nm}(\omega )=A_{nm}(\omega )e^{-jm\phi _{0}}. \end{aligned}$$
(11)

Equation (11) indicates that the rotation of the sound field in the azimuthal plane can be performed by multiplying the complex spherical harmonic coefficients by a complex exponential whose argument depends on the angle of rotation. A decoding approach can be used to return to the plane wave domain after the rotation has been conducted [14]. This is achieved by truncating the spherical harmonic series, namely Eq. (7), to an order N, which gives

$$\begin{aligned} A_{nm}(\omega )= 4\pi j^{n}\sum ^{L}_{l=1}q_{l}(\omega )Y_{n}^{m}(\theta _{l},\phi _{l})^{*}d\varOmega (\widehat{\mathbf {y}_{l}}), \end{aligned}$$
(12)

for \(n=0,\ldots ,N\) and \(\left| m\right| \le n\). This is a finite set of linear equations that can be solved in the least squares sense by formulating an inverse problem [14]. In order to have at least one solution, the number of spherical harmonic coefficients \((N+1)^{2}\) is required to be lower than, or equal to, the number of plane waves, namely \(L\ge (N+1)^{2}\). Equation (12) can be written in matrix notation as

$$\begin{aligned} \mathbf {a}=\mathbf {Y}\mathbf {q}. \end{aligned}$$
(13)

The relation between the number of spherical harmonic coefficients and the number of plane waves defines the dimensions of matrix \(\mathbf {Y}\), which has \((N+1)^{2}\) rows and L columns. For the case of \(L>(N+1)^{2}\), there are more unknown plane wave amplitudes than equations, so the problem is underdetermined and the matrix is not square. The solution for \(\mathbf {q}\) is given by

$$\begin{aligned} \mathbf {q}=\mathbf {Y}^{\dagger }\mathbf {a}, \end{aligned}$$
(14)

where \((\cdot )^{\dagger }\) indicates the Moore-Penrose pseudo-inverse, which selects the solution with the minimum L2 norm.
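The azimuthal rotation of Eq. (11) can be sketched in a few lines of Python. The dictionary layout that maps each pair (n, m) to its coefficient and the first-order coefficient values are assumptions made only for illustration; the sketch covers the rotation step alone, not the encoding of Eq. (7) or the decoding of Eq. (14).

```python
import cmath
import math

def rotate_azimuth(coeffs, phi0):
    """Apply Eq. (11): A_nm -> A_nm * exp(-j * m * phi0).

    coeffs maps (n, m) to the complex spherical harmonic coefficient A_nm;
    phi0 is the rotation angle in radians.
    """
    return {(n, m): a * cmath.exp(-1j * m * phi0)
            for (n, m), a in coeffs.items()}

# Hypothetical first-order coefficient set (values chosen only for illustration).
coeffs = {(0, 0): 1.0 + 0j, (1, -1): 0.5j, (1, 0): 0.2 + 0j, (1, 1): -0.5j}

rotated = rotate_azimuth(coeffs, math.radians(45))
restored = rotate_azimuth(rotated, -math.radians(45))  # undo the rotation
```

Each coefficient is multiplied by a factor of unit modulus, so its magnitude is unchanged and the coefficients with m = 0 are not affected at all; rotating back by the opposite angle recovers the original set.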

Vector Base Amplitude Panning (VBAP)

VBAP is a sound reproduction technique based on the formulation of amplitude panning functions in terms of vectors and vector bases. It allows the incoming direction of a wave to be controlled over a unit sphere. For 3D sound reproduction, the three loudspeakers closest to the target incoming direction are selected to reproduce the sound. In that sense, the sound field generated by a PWE can be rotated in the azimuthal plane by \(\phi \) degrees simply by shifting the plane waves in the direction opposite to the rotation of the listener:

$$\begin{aligned} p(\mathbf {x_{\text {rotated}}},\omega )=\sum ^{L}_{l=1} e^{jk\mathbf {x}\cdot \mathbf {\widehat{y}}_{(l-\phi )}} q(\mathbf {\widehat{y}}_{(l-\phi )},\omega )\varDelta \varOmega _{(l-\phi )}. \end{aligned}$$
(15)

These rotated plane waves can be recreated by using multiple sets of three different plane waves, whose incoming directions are restricted to the directions established by the original discretized plane wave expansion. This means that for each plane wave of the PWE, a set of three plane waves must be used to generate the rotated version. The amplitude weightings of each set of three plane waves are estimated by means of an inverse method. The formulation to rotate one plane wave is presented as follows (the same principle applies to the remaining plane waves) [15].

The direction of the target “rotated” plane wave is determined by the unit vector \(\mathbf {\widehat{y}}= [y_{1}, y_{2}, y_{3}]^{T}\). Likewise, the amplitude weightings of the three plane waves used to generate this “rotated” plane wave are represented by the vector \(\mathbf {q} = [q_{1}, q_{2}, q_{3}]^{T}\). Finally, the matrix that contains the directions of the three selected plane waves closest to the target incoming direction is denoted as \(\mathbf {L} \in \mathbb {R}^{(3 \times 3)}\), in which each column holds the coordinates of one plane wave, i.e. \( \mathbf {l}_{1}=\mathbf {L}(:,1)\). Therefore, the following relation is established

$$\begin{aligned} \mathbf {\widehat{y}} = \mathbf {L}\mathbf {q}, \end{aligned}$$
(16)

whose solution for \(\mathbf {q}\) is given by

$$\begin{aligned} \mathbf {q} = \mathbf {L}^{-1}\mathbf {\widehat{y}}. \end{aligned}$$
(17)

In addition, the amplitude weightings are normalized assuming a coherent summation, so that they sum to unity, namely,

$$\begin{aligned} \mathbf {q}_{\text {normalized}}= \frac{\mathbf {q}}{q_{1}+q_{2}+q_{3}}. \end{aligned}$$
(18)
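Equations (16) to (18) can be illustrated with the following minimal Python sketch, which solves the \(3 \times 3\) system with Cramer's rule and then applies the coherent normalization. The function names and the three orthogonal example directions are hypothetical and chosen only to keep the arithmetic easy to follow; in practice the three directions would be the plane waves of the PWE closest to the target direction.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def vbap_weights(l1, l2, l3, target):
    """Solve y = L q (Eqs. 16-17) and normalize q to unit sum (Eq. 18)."""
    # Columns of L are the directions of the three selected plane waves.
    L = [[l1[r], l2[r], l3[r]] for r in range(3)]
    d = det3(L)
    if abs(d) < 1e-12:
        raise ValueError("the three directions are linearly dependent")
    q = []
    for c in range(3):
        # Cramer's rule: replace column c by the target vector.
        Lc = [row[:] for row in L]
        for r in range(3):
            Lc[r][c] = target[r]
        q.append(det3(Lc) / d)
    total = sum(q)
    return [qi / total for qi in q]

# Hypothetical example: target direction halfway between l1 and l2.
l1, l2, l3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
target = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0), 0.0)
weights = vbap_weights(l1, l2, l3, target)
```

For this example the two plane waves adjacent to the target share the weight equally and the third one receives no weight; when the target coincides with one of the three directions, that plane wave alone is used.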

3 Methods and Results

Numerical simulations have been conducted in Matlab to evaluate the performance of the rotation operators. Firstly, sound fields corresponding to a plane wave of 250 Hz coming from an elevation of \(\theta =90^\circ \) and azimuth angles of \(\phi =45^\circ \) and \(\phi =170^\circ \) were analytically synthesized in a free field domain with dimensions of 5 m \(\times \) 10 m \(\times \) 3 m. Samples of the sound fields were extracted by using a cubic virtual microphone array with linear dimensions of 1.6 m and a spatial resolution of 0.2 m (729 microphone positions). This information was used to estimate the complex amplitudes of a PWE by means of an inverse method. The number of plane waves was chosen to be \(L = 64\) because it corresponds to the number of complex spherical harmonic coefficients for an order \(N=7\), which facilitates the implementation and assessment of the rotation operators.

Figure 1 shows the comparison between the real part of the analytical (A) and the reconstructed (B) acoustic pressure (Pa) in a cross-section of the domain (z = 1.5 m). The black circle corresponds to the area of expected accurate reconstruction, obtained by solving Eq. (3) for R. The results indicate that the plane wave expansion is able to accurately synthesize the target acoustic field but, as expected, only within a specific area of the domain. A good match between the area of accurate reconstruction and the radius predicted by Eq. (3) was also found.
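The radius of the black circle can be estimated by inverting Eq. (3), as in the short Python sketch below. The speed of sound of 343 m/s is an assumption, since its value is not stated in the text, and the function names are hypothetical. Under that assumption, \(L=64\) plane waves at 250 Hz give a radius of about 1.53 m.

```python
import math

def plane_waves_needed(radius, wavelength):
    """Eq. (3): number of plane waves for accurate reconstruction within radius."""
    return (math.ceil(2 * math.pi * radius / wavelength) + 1) ** 2

def max_radius(num_waves, wavelength):
    """Largest radius for which Eq. (3) requires at most num_waves plane waves."""
    # sqrt(L) - 1 is the largest admissible value of ceil(2*pi*R/lambda).
    order = math.isqrt(num_waves) - 1
    return order * wavelength / (2 * math.pi)

c = 343.0                 # speed of sound in m/s (assumed)
wavelength = c / 250.0    # wavelength at 250 Hz
R = max_radius(64, wavelength)
```

Because of the ceiling operator, any radius slightly below this bound still requires 64 plane waves, whereas a larger radius requires at least 81.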

Fig. 1. Acoustic field reconstructed by means of a plane wave expansion (\(L=64\)). The target field corresponds to a plane wave coming from \(\theta =90^\circ \) and \(\phi =45^\circ \).

Two cases have been evaluated. The first corresponds to the plane wave incoming from \(\theta =90^\circ \) and \(\phi = 45^\circ \), which is rotated by \(\phi _{0}= 45^\circ \). The second case is the plane wave incoming from \(\theta =90^\circ \) and \(\phi = 170^\circ \), which is rotated by \(\phi _{0}= 60^\circ \).

3.1 Rotation by Means of Spherical Harmonics

Based on the reconstructed acoustic field illustrated in Fig. 1B, a shift of \(45^{\circ }\) and \(60^{\circ }\) in the azimuthal angle was carried out to evaluate the rotation of sound fields using a description in terms of spherical harmonics. Figure 2 illustrates a diagram of the implementation.

Fig. 2. Diagram of the implementation of the spherical harmonic rotation operator.

Figures 3 and 4 show the reconstructed acoustic pressure compared to their analytical references. (A) corresponds to the initial reference sound field, (B) is the reconstructed sound field by the plane wave expansion, (C) is the reference rotated sound field and (D) is the reconstructed and rotated sound field by the implementation of a spherical harmonic transformation.

Fig. 3. Rotation of an acoustic field by means of spherical harmonics. The reference sound field is a plane wave coming from \(\theta =90^\circ \) and \(\phi =45^\circ \). The rotation angle is \(\phi _{0}=45^\circ \).

Fig. 4. Rotation of an acoustic field by means of spherical harmonics. The reference sound field is a plane wave coming from \(\theta =90^\circ \) and \(\phi =170^\circ \). The rotation angle is \(\phi _{0}=60^\circ \).

The results confirm the suitability of the spherical harmonic transformation to rotate the acoustic field. No relevant differences were found between the acoustic fields of plots (B) and (D) close to the central point of the expansion, which indicates that rotating the sound field using spherical harmonics does not affect the initial accuracy achieved by the discretized plane wave expansion, namely the radius predicted by Eq. (3). However, this statement is true only if the number of spherical harmonic coefficients is equal to the number of plane waves, \((N+1)^{2}=L\). Otherwise, the area of accurate reconstruction is reduced according to the number of spherical harmonic coefficients used.

3.2 Rotation by Means of VBAP

An implementation based on VBAP has been carried out to rotate the acoustic field directly in the plane wave domain. Figure 5 describes the signal processing flow of the algorithm. A comparison between the acoustic fields rotated using VBAP and their analytical references is presented in Figs. 6 and 7. The same angles used for the spherical harmonic case have been considered.

Fig. 5. Diagram of the implementation of the VBAP rotation operator.

Fig. 6. Rotation of an acoustic field by means of VBAP. The reference sound field is a plane wave coming from \(\theta =90^\circ \) and \(\phi =45^\circ \). The rotation angle is \(\phi _{0}=45^\circ \).

Fig. 7. Rotation of an acoustic field by means of VBAP. The reference sound field is a plane wave coming from \(\theta =90^\circ \) and \(\phi =170^\circ \). The rotation angle is \(\phi _{0}=60^\circ \).

The results indicate that VBAP is also a suitable approach to rotate acoustic fields described by a plane wave expansion. Nevertheless, a more detailed analysis is performed in the following section to compare both approaches.

4 Metrics for Performance

In this section, the rotation methods are compared in terms of the spatial distribution of the energy in the plane wave expansion, the energy required for the synthesis of the acoustic field and the normalized error.

4.1 Spatial Distribution of the Energy

An analysis of the spatial distribution of the energy has been conducted to assess whether the rotation operator affects its integrity. The PWE is discretized by means of a “uniform” sampling, so this spatial distribution is not expected to change. Figures 8 and 9 show the interpolated energy density function q plotted over an unwrapped unit sphere for the two cases considered. (A) corresponds to the initial spatial energy distribution of the PWE, (B) is the spatial energy distribution of the PWE after rotation has been performed using spherical harmonics and (C) is the spatial energy distribution of the PWE after rotation has been performed using VBAP.

Fig. 8. Spatial energy distribution. The reference sound field is a plane wave coming from \(\theta =90^\circ \) and \(\phi =45^\circ \). The rotation angle is \(\phi _{0}=45^\circ \).

Fig. 9. Spatial energy distribution. The reference sound field is a plane wave coming from \(\theta =90^\circ \) and \(\phi =170^\circ \). The rotation angle is \(\phi _{0}=60^\circ \).

The outcomes indicate that the energy is mainly focused on the direction in which rotation has been performed. Nevertheless, the spatial distribution of the energy of the PWE tends to remain unmodified in the case of the spherical harmonic rotation operator. In contrast, for VBAP, a change in the spatial energy distribution is found. This suggests that the relation between the amplitudes and phases of the plane waves is modified, yielding a rotated but different synthesized sound field.

4.2 Energy Required for the Synthesis of the Rotated Sound Field

An evaluation of the total energy required by the plane wave expansion to synthesize the acoustic field when rotation operators are implemented is carried out. The total energy of the PWE is estimated from Eq. (19).

$$\begin{aligned} \text {E}(\omega )\sim \sum _{l=1}^{L} \left| q_{l} \right| ^{2} (\omega ). \end{aligned}$$
(19)

Table 1 lists the energy of the PWE before and after the rotation operators are applied. The results indicate that the energy remains similar for the spherical harmonic operator. However, the energy is lower when VBAP is implemented. This is because the reference acoustic field corresponds to a single plane wave, which means that only 3 plane waves of the PWE generate the sound field in the case of VBAP. For more complex acoustic fields in which all the plane waves are used, the energy is expected to be higher.

Table 1. Energy of the plane wave expansion before and after rotation of the acoustic field.

Fig. 10. Normalized error for the spherical harmonic case. (A) is the reference sound field (\(\phi =45^\circ \)), (B) is the rotated sound field (\(\phi _{0}=45^\circ \)), (C) is the reference sound field (\(\phi =170^\circ \)) and (D) corresponds to the rotated field (\(\phi _{0}=60^\circ \)).

Fig. 11. Normalized error for the VBAP case. (A) is the reference sound field (\(\phi =45^\circ \)), (B) is the rotated sound field (\(\phi _{0}=45^\circ \)), (C) is the reference sound field (\(\phi =170^\circ \)) and (D) corresponds to the rotated field (\(\phi _{0}=60^\circ \)).

4.3 Normalized Error

A normalized error is used to compare the rotated reconstructed acoustic field with the target one. This makes it possible to evaluate whether the area of accurate reconstruction given by the PWE is affected by the rotation operators. The normalized error is defined as:

$$\begin{aligned} \text {e}(\mathbf {x},\omega )=10\log _{10}\left[ \frac{\left| p(\mathbf {x},\omega )-\tilde{p}(\mathbf {x},\omega ) \right| ^{2}}{\left| p(\mathbf {x},\omega ) \right| ^{2}} \right] , \end{aligned}$$
(20)

where \(p(\mathbf {x},\omega )\) is the target acoustic pressure and \(\tilde{p}(\mathbf {x},\omega )\) is the reconstructed acoustic pressure. Figures 10 and 11 show the normalized errors in a cross-section of the domain (z = 1.5 m) for the spherical harmonic and VBAP rotation operators, respectively. The white contour defines the region within which the normalized error is smaller than −20 dB.
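Equation (20) can be evaluated point by point as in the following Python sketch. The function name and the example pressures are assumptions made for illustration; the handling of a zero target pressure and of a perfect reconstruction is a design choice of this sketch, not something specified above.

```python
import math

def normalized_error_db(p_target, p_reconstructed):
    """Eq. (20): 10*log10(|p - p_tilde|^2 / |p|^2) for complex pressures."""
    den = abs(p_target) ** 2
    if den == 0.0:
        raise ValueError("the target pressure is zero at this point")
    num = abs(p_target - p_reconstructed) ** 2
    if num == 0.0:
        return float("-inf")  # perfect reconstruction
    return 10 * math.log10(num / den)

# Illustrative values: a reconstruction that deviates by 10 % in amplitude.
p = 1.0 + 0.0j
p_tilde = 0.9 + 0.0j
error = normalized_error_db(p, p_tilde)
```

A deviation of 10 % of the target amplitude corresponds to an error of −20 dB, which is the threshold marked by the white contour.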

An analysis of the acoustic errors indicates that the area of accurate reconstruction remains similar when the spherical harmonic operator is used. In contrast, the outcomes show that the implementation of VBAP as a rotation operator reduces the area in which the synthesis of the sound field is correct. These findings suggest that a lower order of spherical harmonics is required to achieve the same accuracy as VBAP when both are used as rotation operators.

5 Conclusions

Two different methodologies have been evaluated to perform the rotation of acoustic fields based on a plane wave representation. The suitability of these approaches extends the use of the PWE as a kernel for interactive auralizations. Applications of this method include real-time sound field processing for video games and listening tests, among others.

An implementation of VBAP as an interpolation tool validates the suitability of this method to rotate sound fields in the plane wave domain. However, a comparison with the rotation operator based on a spherical harmonic transformation reveals that the latter approach is more accurate in terms of sound field reconstruction.

The outcomes also indicate that the spherical harmonic rotation operator is required to preserve the initial accuracy given by the PWE. However, this statement holds only if the number of spherical harmonic coefficients is equal to the number of plane waves.