Signal-Independent Array Processing

Jarrett, Daniel P.; Habets, Emanuël A. P.; Naylor, Patrick A.

doi:10.1007/978-3-319-42211-4_6


Part of the book series: Springer Topics in Signal Processing ((STSP,volume 9))


Abstract

The process of combining signals acquired by a microphone array in order to ‘focus’ on a signal in a specific direction is known as beamforming or spatial filtering. This chapter considers signal-independent (fixed) beamformers, controlled by weights only dependent on the direction of arrival of the source to be extracted, and which do not otherwise depend on the desired signal. Because the weights of these beamformers are given by simple expressions, they present the advantages of being straightforward to implement and of having low computational complexity.


The process of combining signals acquired by a microphone array in order to ‘focus’ on a signal in a specific direction is known as beamforming or spatial filtering. We present in this chapter a number of such beamforming methods that are specifically controlled by weights dependent only on the direction of arrival (DOA) of the desired source. They are otherwise signal-independent such that they do not depend on the statistics of the desired or noise signals. We derive maximum directivity and maximum white noise gain beamformers that establish performance bounds for spherical harmonic domain (SHD) beamformers. Because the weights of these beamformers are given by simple expressions, they present the advantages of being straightforward to implement and of having low computational complexity.

6.1 Signal Model

The sound pressure P captured at a position $\mathbf {r} = (r,\varOmega ) = (r,\theta ,\phi )$ (in spherical coordinates, where $\theta $ denotes the inclination and $\phi $ denotes the azimuth) on a spherical microphone array of radius r is commonly expressed as the sum of a desired signal X and a noise signal V [12, 15]. In the spatial domain, the signal model is expressed as

$$\begin{aligned} P(k,\mathbf {r}) = X(k,\mathbf {r}) + V(k,\mathbf {r}), \end{aligned}$$

(6.1)

where k denotes the wavenumber.^{Footnote 1} The desired signal X is assumed to be spatially coherent, while the noise signal V models background noise or sensor noise, for example, and may be spatially incoherent, coherent or partially coherent.

When using spherical microphone arrays, it is convenient to work in the SHD [1, 17]. In this chapter, we assume error-free spatial sampling by Q microphones at positions $\mathbf {r}_q = (r,\varOmega _q), q \in \left\{ 1, \ldots , Q \right\} $, and refer the reader to Chap. 3 for information on spatial sampling and aliasing. By applying the complex spherical harmonic transform (SHT) to the signal model in (6.1), we obtain the SHD signal model

$$\begin{aligned} P_{lm}(k) = X_{lm}(k) + V_{lm}(k), \end{aligned}$$

(6.2)

where $P_{lm}(k)$, $X_{lm}(k)$ and $V_{lm}(k)$ are respectively the spherical harmonic transforms of the spatial domain signals $P(k,\mathbf {r}_q)$, $X(k,\mathbf {r}_q)$ and $V(k,\mathbf {r}_q)$, as defined in (3.6), and are referred to as eigenbeams to reflect the fact that the spherical harmonics are eigensolutions of the wave equation in spherical coordinates [26]. The order and degree of the spherical harmonics are respectively denoted as l and m.

By combining the eigenbeams $P_{lm}(k)$ in a particular way, the noise V can be suppressed and the desired signal X can be extracted from the noisy mixture P. This is accomplished using a spatio-temporal filter or beamformer. In the spatial domain, the output of a beamformer is obtained as the weighted sum of the pressure signals at each of the microphones [3, 4]; in the SHD, the beamformer output is given by a weighted sum of the eigenbeams $P_{lm}(k)$ [14, 21]. The output of an Lth-order SHD beamformer can thus be expressed as [21, Eq. 12]^{Footnote 2}

$$\begin{aligned} Z(k) = \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) P_{lm}(k), \end{aligned}$$

(6.3)

where $W_{lm}(k)$ denotes the beamformer weights and $(\cdot )^{*}$ denotes the complex conjugate.

Beamformers can either be signal-independent (fixed) or signal-dependent; their weights are chosen in order to achieve specific performance objectives. Signal-independent beamformers apply a constraint to a specific steering direction and optimize the beamformer weights with respect to array performance measures such as the white noise gain (WNG) and directivity. They can also, more generally, attempt to achieve a specific spatial response in all directions by minimizing the difference between the beamformer’s spatial response and the desired spatial response, according to some distance measure (see [6, Sects. 8.3 and 8.4] for examples). Signal-dependent beamformers optimize the weights taking into account characteristics of the desired signal and noise. In this chapter, we will discuss signal-independent beamformers and later address signal-dependent beamformers in Chap. 7.

A block diagram of a signal-independent beamformer is shown in Fig. 6.1. We begin by capturing the sound pressure signals $P(k,\mathbf {r}_q)$ at microphones $q \in \{ 1, \ldots , Q \}$, and applying the SHT to obtain the SHD sound pressure signals, the eigenbeams $P_{lm}(k)$, gathered together to form a vector $\mathbf {p}(k)$. The output Z(k) of the beamformer is obtained by taking the weighted sum of these eigenbeams, where the weights $W_{lm}(k,\varOmega _{\text {u}})$ depend only on the steering direction $\varOmega _{\text {u}}$ and do not otherwise depend on the sound pressure signals P.

The signal-independent beamformers presented in this chapter are designed assuming anechoic conditions with a single active sound source, though these assumptions are unlikely to be valid in practical use scenarios. Depending on the distance between this source and the array, the desired signal is either assumed to consist of a plane wave or a spherical wave. Under farfield conditions, the eigenbeams of a unit amplitude plane wave incident from a direction $\varOmega _{\text {s}}$ are given by (3.22a). The SHD sound pressure $X_{lm}(k,\varOmega _{\text {s}})$ related to a plane wave with power $P_{\text {pw}}(k)$ can then be written as [18, 20, 26]

$$\begin{aligned} X_{lm}(k,\varOmega _{\text {s}}) = {\sqrt{P_{\text {pw}}(k)}} b_l(k) Y_{lm}^*(\varOmega _{\text {s}}), \end{aligned}$$

(6.4)

where $Y_{lm}(\varOmega _{\text {s}})$ denotes the complex spherical harmonic^{Footnote 3} of order l and degree m evaluated at an angle $\varOmega _{\text {s}}$, as defined in (2.14), and the mode strength $b_l(k)$ captures the eigenbeams’ dependence on the array properties, such as microphone type or array configuration, and is discussed in more detail in Sect. 3.4.2.

All the beamformers designed in this chapter seek to suppress the noise while maintaining a distortionless constraint on the signal originating from the steering direction $\varOmega _{\text {u}}$. This constraint is expressed as

$$\begin{aligned} \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) b_l(k) Y_{lm}^*(\varOmega _{\text {u}}) = 1. \end{aligned}$$

(6.5)

It is important to note that this distortionless constraint depends only on the steering direction $\varOmega _{\text {u}}$. It is different from the distortionless constraint imposed in Chap. 7, which takes into account the complex multipath propagation effects of a reverberant environment. Using the constraint in (6.5) can be appealing, as it does not require the estimation of the acoustic transfer functions (ATFs) or relative transfer functions; however, this comes at the expense of sensitivity to errors in the steering direction and reduced robustness to reverberation.

For convenience, the SHD signal model in (6.2) can also be expressed in vector form as

$$\begin{aligned} \mathbf {p}(k) = \mathbf {x}(k) + \mathbf {v}(k) \end{aligned}$$

(6.6)

where the SHD signal vector $\mathbf {p}(k)$ of length $(L+1)^2$ is defined as

$$\begin{aligned} \mathbf {p}(k)&= \left[ P_{00}(k)\,\, P_{1(-1)}(k)\,\, P_{10}(k)\,\, P_{11}(k)\,\, P_{2(-2)}(k) \,\cdots \, P_{LL}(k)\right] ^{\text {T}}, \end{aligned}$$

and $\mathbf {x}(k)$ and $\mathbf {v}(k)$ are defined similarly to $\mathbf {p}(k)$. The beamformer output signal Z(k) can be expressed as

$$\begin{aligned} Z(k) = \mathbf {w}^{\text {H}}(k) \mathbf {p}(k), \end{aligned}$$

(6.7)

where the filter weights vector is defined as

$$\begin{aligned} \mathbf {w}(k)&= \left[ W_{00}(k)\,\, W_{1(-1)}(k)\,\, W_{10}(k)\,\, W_{11}(k)\,\, W_{2(-2)}(k) \,\cdots \, W_{LL}(k)\right] ^{\text {T}}. \end{aligned}$$

In matrix form the desired signal is written as

$$\begin{aligned} \mathbf {x}(k,\varOmega _{\text {s}}) = {\sqrt{P_{\text {pw}}(k)}} \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {s}}), \end{aligned}$$

(6.8)

where the vector of spherical harmonics $\mathbf {y}(\varOmega _{\text {s}})$ of length $(L+1)^2$ is defined as

$$\begin{aligned} \mathbf {y}(\varOmega _{\text {s}}) = \left[ Y_{00}(\varOmega _{\text {s}})\,\, Y_{1(-1)}(\varOmega _{\text {s}})\,\, Y_{10}(\varOmega _{\text {s}})\,\, Y_{11}(\varOmega _{\text {s}})\,\, \cdots \, Y_{LL}(\varOmega _{\text {s}})\right] ^{\text {T}}, \end{aligned}$$

(6.9)

and the $(L+1)^2 \times (L+1)^2$ matrix of mode strengths $\mathbf {B}(k)$ is defined as

$$\begin{aligned} \mathbf {B}(k)&= \text {diag}\left\{ b_{0}(k), b_{1}(k), b_{1}(k), b_{1}(k), b_{2}(k), \ldots , b_{L}(k)\right\} , \end{aligned}$$

(6.10)

therefore $\mathbf {B}(k)$ consists of $2l+1$ repetitions of $b_l(k)$ for $l \in \left\{ 0, \ldots , L \right\} $ along its diagonal. Finally, the distortionless constraint is given by

$$\begin{aligned} \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) = 1. \end{aligned}$$

(6.11)
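The stacking order used for $\mathbf {p}(k)$, $\mathbf {w}(k)$ and $\mathbf {B}(k)$ can be made concrete with a short Python sketch. It assumes the coefficients are stored in flat lists ordered as in the definitions above, so that coefficient (l, m) sits at index $l^2 + l + m$. The function names and the numerical values are illustrative placeholders, not part of the chapter.

```python
def flat_index(l, m):
    """Position of coefficient (l, m) in the stacked vectors p(k), w(k) and y."""
    if abs(m) > l:
        raise ValueError("degree m must satisfy -l <= m <= l")
    return l * l + l + m


def mode_strength_diagonal(b):
    """Diagonal of B(k) as in (6.10): b[l] repeated 2l+1 times for l = 0..L."""
    diagonal = []
    for l, b_l in enumerate(b):
        diagonal.extend([b_l] * (2 * l + 1))
    return diagonal


def beamformer_output(w, p):
    """Z(k) = w^H(k) p(k) as in (6.7), for flat lists of (complex) numbers."""
    if len(w) != len(p):
        raise ValueError("w and p must both have length (L+1)^2")
    return sum(complex(w_i).conjugate() * p_i for w_i, p_i in zip(w, p))


# Placeholder values for a first-order (L = 1) example at a single wavenumber.
b = [1.0, 0.5j]                      # hypothetical mode strengths b_0(k), b_1(k)
B_diag = mode_strength_diagonal(b)   # length (L+1)^2 = 4
w = [1.0, 0.0, 1j, 0.0]              # hypothetical weights
p = [2.0, 1.0, 1j, -1.0]             # hypothetical eigenbeams
Z = beamformer_output(w, p)
```

Storing only the diagonal of $\mathbf {B}(k)$ is a design choice of this sketch; since the matrix is diagonal, products such as $\mathbf {B}(k) \mathbf {y}^{*}(\varOmega )$ reduce to element-wise multiplications.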

6.2 Design Criteria

In this section, we introduce a number of measures that can be used to design optimal beamformers as in Sect. 6.3. It should be noted that these measures are defined with respect to the signals with physical significance, namely the spatial domain signals, and not with respect to the eigenbeams. Nevertheless, these measures will still depend on the eigenbeams as they form a part of the spherical harmonic expansion (SHE) of the spatial domain signals.

6.2.1 Directivity

Directivity is a measure of a beamformer’s spatial selectivity and quantifies its ability to suppress sound waves that do not originate from a specifically chosen steering direction. It is defined as the ratio of the power of the beamformer output due to a plane wave arriving from the steering direction $\varOmega _{\text {u}}$ to the power of the beamformer output averaged over all directions [28]. The directivity $\mathcal {D}(k)$ is therefore written as

$$\begin{aligned} \mathcal {D}(k)&= \frac{\left| Z(k,\varOmega _{\text {u}})\right| ^2}{\frac{1}{4 \pi } \int _{\varOmega \in \mathcal {S}^2} \left| Z(k,\varOmega ) \right| ^2 \text {d}\varOmega } \end{aligned}$$

(6.12)

$$\begin{aligned}&= \frac{\left| \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) X_{lm}(k,\varOmega _{\text {u}})\right| ^2}{\frac{1}{4 \pi } \int _{\varOmega \in \mathcal {S}^2} \left| \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) X_{lm}(k,\varOmega ) \right| ^2 \text {d}\varOmega }, \end{aligned}$$

(6.13)

where the notation $\int _{\varOmega \in \mathcal {S}^2} \text {d}\varOmega $ is used to denote compactly the solid angle $\int _{\phi = 0}^{2\pi } \int _{\theta = 0}^{\pi } \sin \theta \text {d}\theta \text {d}\phi $. Applying the distortionless constraint (6.5), and by substituting the expression for a plane wave (6.4) into (6.12), we find

$$\begin{aligned} \mathcal {D}(k)&= \frac{4 \pi {P_{\text {pw}}(k)}}{\int _{\varOmega \in \mathcal {S}^2} \left| \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) {\sqrt{P_{\text {pw}}(k)}} b_l(k) Y_{lm}^*(\varOmega ) \right| ^2 \text {d}\varOmega } \nonumber \\&= {\frac{4 \pi }{\int _{\varOmega \in \mathcal {S}^2} \left| \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) b_l(k) Y_{lm}^*(\varOmega ) \right| ^2 \text {d}\varOmega }}. \end{aligned}$$

(6.14)

Using the orthonormality of the spherical harmonics (2.18), this can be simplified to^{Footnote 4}

$$\begin{aligned} \mathcal {D}(k)&= 4 \pi \left( \sum _{l=0}^{L} \sum _{m=-l}^{l} \left| W^*_{lm}(k) b_l(k)\right| ^2\right) ^{-1}, \end{aligned}$$

(6.15)

or in vector form

$$\begin{aligned} \mathcal {D}(k)&= 4 \pi \left| \left| \mathbf {B}(k) \mathbf {w}^{*}(k) \right| \right| ^{-2}, \end{aligned}$$

(6.16)

where $\left| \left| \cdot \right| \right| $ denotes the 2-norm. The directivity is therefore a function of the array properties, such as radius or microphone type, and the beamformer weights $W_{lm}(k)$.

The directivity is frequently expressed in dB and is then referred to as the directivity index (DI),

$$\begin{aligned} \text {DI}(k) = 10 \log _{10} \mathcal {D}(k). \end{aligned}$$

(6.17)
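As a minimal sketch, (6.15) and (6.17) can be evaluated directly from the weights and the mode strengths. The code below assumes that the weights already satisfy the distortionless constraint (6.5), as (6.15) does, and that they are stored in a flat list with coefficient (l, m) at index $l^2 + l + m$. The function names are illustrative.

```python
import math


def directivity(w, b):
    """Directivity (6.15) from flat weights w and mode strengths b[l], l = 0..L."""
    L = len(b) - 1
    if len(w) != (L + 1) ** 2:
        raise ValueError("w must have length (L+1)^2")
    total = 0.0
    for l in range(L + 1):
        for m in range(-l, l + 1):
            total += abs(w[l * l + l + m] * b[l]) ** 2
    return 4.0 * math.pi / total


def directivity_index_db(w, b):
    """Directivity index (6.17) in dB."""
    return 10.0 * math.log10(directivity(w, b))


# Zeroth-order check: Y_00 = 1/sqrt(4 pi), so with b_0 = 1 the distortionless
# weight is W_00 = sqrt(4 pi), which should give an omnidirectional response.
w_omni = [math.sqrt(4.0 * math.pi)]
d_omni = directivity(w_omni, [1.0])
di_omni = directivity_index_db(w_omni, [1.0])
```

For this zeroth-order case the directivity evaluates to 1 (a DI of 0 dB), as expected for a beamformer with no spatial selectivity.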

6.2.2 Front-to-Back Ratio

The front-to-back ratio is an alternative measure of a beamformer’s spatial selectivity and quantifies its ability to differentiate between sound waves that originate from the front and the back. It is defined as the ratio of the average power of the beamformer output due to plane waves arriving from the front to the average power of the beamformer output due to plane waves arriving from the back. The front-to-back ratio $\mathcal {F}(k)$ is therefore written as [7]

$$\begin{aligned} \mathcal {F}(k)&= \frac{\frac{1}{4 \pi } \int _{\varOmega \in \mathcal {S}_{\text {F}}^2} \left| \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) X_{lm}(k,\varOmega ) \right| ^2 \text {d}\varOmega }{\frac{1}{4 \pi } \int _{\varOmega \in \mathcal {S}_{\text {B}}^2} \left| \sum _{l=0}^{L} \sum _{m=-l}^{l} W^*_{lm}(k) X_{lm}(k,\varOmega ) \right| ^2 \text {d}\varOmega }, \end{aligned}$$

(6.18)

where for a beamformer steered to $(\pi /2,\pi /2)$ we have

$$\begin{aligned} \int _{\varOmega \in \mathcal {S}_{\text {F}}^2} \text {d}\varOmega = \int _{\phi = 0}^{\pi } \int _{\theta = 0}^{\pi } \sin \theta \text {d}\theta \text {d}\phi \end{aligned}$$

(6.19)

and

$$\begin{aligned} \int _{\varOmega \in \mathcal {S}_{\text {B}}^2} \text {d}\varOmega = \int _{\phi = \pi }^{2\pi } \int _{\theta = 0}^{\pi } \sin \theta \text {d}\theta \text {d}\phi . \end{aligned}$$

(6.20)
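The integrals in (6.18) to (6.20) can be approximated numerically. The sketch below is one possible approach rather than the chapter's method: it assumes a beamformer steered to $(\pi /2,\pi /2)$ whose response depends only on the cosine of the angle to the steering direction, and it uses a simple midpoint rule. The cardioid response used as an example is an illustrative choice.

```python
import math


def front_to_back_ratio(response, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of (6.18) for a beamformer steered to (pi/2, pi/2).

    `response` maps the cosine of the angle between the DOA and the steering
    direction to the (possibly complex) beamformer output.
    """
    d_theta = math.pi / n_theta
    d_phi = math.pi / n_phi  # each half-space spans pi in azimuth
    front = 0.0
    back = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Cosine of the angle to the steering direction (the +y axis).
            cos_angle = math.sin(theta) * math.sin(phi)
            area = math.sin(theta) * d_theta * d_phi
            front += abs(response(cos_angle)) ** 2 * area
            # Shifting phi by pi (back half-space) flips the sign of sin(phi).
            back += abs(response(-cos_angle)) ** 2 * area
    return front / back


def cardioid(cos_angle):
    """First-order cardioid response, used here only as a test pattern."""
    return 0.5 * (1.0 + cos_angle)


ratio = front_to_back_ratio(cardioid)
ratio_db = 10.0 * math.log10(ratio)
```

For the cardioid the integrals can be evaluated in closed form, giving a front-to-back ratio of 7, which the numerical estimate should reproduce closely.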

6.2.3 White Noise Gain

White noise gain (WNG) is a measure of a beamformer’s robustness against sensor noise and errors in microphone placement and steering direction [10], and is defined as the array gain in the presence of spatially incoherent noise [28], i.e., the ratio of the signal-to-noise ratio (SNR) at the beamformer output ($\text {oSNR}$) to the SNR at the beamformer input ($\text {iSNR}$).

We now derive the WNG for a spherical microphone array employing a set of microphones uniformly distributed on the sphere. The desired signal power is different at each microphone, particularly for a rigid sphere where the scattering effects depend on the angle of incidence [16]. When calculating the $\text {iSNR}$, the desired signal power is therefore averaged over the sphere.

Let us assume that the noise at each microphone has equal power $\sigma _{v}^2(k)$. The input SNR is then given by

$$\begin{aligned} \text {iSNR}_{\text {w}}(k)&= \frac{\frac{1}{4 \pi } \int _{\varOmega \in \mathcal {S}^2} \left| X(k,\mathbf {r}) \right| ^2 \text {d}\varOmega }{\sigma _{v}^2(k)} \end{aligned}$$

(6.21a)

$$\begin{aligned}&= \frac{\frac{1}{4 \pi } \int _{\varOmega \in \mathcal {S}^2} \left| \sum _{l=0}^{\infty } \sum _{m=-l}^l X_{lm}(k) Y_{lm}(\varOmega ) \right| ^2 \text {d}\varOmega }{\sigma _{v}^2(k)}, \end{aligned}$$

(6.21b)

where (6.21b) is obtained using the spherical harmonic decomposition of $X(k,\mathbf {r})$. Assuming plane-wave incidence from a direction $\varOmega _{\text {s}}$, by substituting (6.4) into (6.21), we find

$$\begin{aligned} \text {iSNR}_{\text {w}}(k)&= \frac{\int _{\varOmega \in \mathcal {S}^2} \left| \sum _{l=0}^{\infty } \sum _{m=-l}^l {\sqrt{P_{\text {pw}}(k)}} b_l(k) Y_{lm}^*(\varOmega _{\text {s}}) Y_{lm}(\varOmega ) \right| ^2 \text {d}\varOmega }{4 \pi \sigma _{v}^2(k)}. \end{aligned}$$

(6.22)

Using Unsöld’s theorem [29], a special case of the spherical harmonic addition theorem (2.23), and the orthonormality of the spherical harmonics, we simplify (6.22) to

$$\begin{aligned} \text {iSNR}_{\text {w}}(k)&= \frac{ \sum _{l=0}^{\infty } \sum _{m=-l}^l \left| {\sqrt{P_{\text {pw}}(k)}} b_l(k) Y_{lm}^*(\varOmega _{\text {s}}) \right| ^2}{4 \pi \sigma _{v}^2(k)} \end{aligned}$$

(6.23a)

$$\begin{aligned}&= \frac{{P_{\text {pw}}(k)} \sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)}{(4 \pi )^2 \sigma _{v}^2(k)}. \end{aligned}$$

(6.23b)

The input SNR is therefore a function of the plane wave power $P_{\text {pw}}(k)$, the array properties, via the mode strength $b_l(k)$, and the noise power $\sigma _{v}^2(k)$.

The output SNR is given by

$$\begin{aligned} \text {oSNR}_{\text {w}}(k)&= \frac{\left| \sum _{l=0}^{L} \sum _{m=-l}^l W_{lm}^{*}(k) X_{lm}(k) \right| ^2}{\text {E} \left\{ \left| \sum _{l=0}^{L} \sum _{m=-l}^l W_{lm}^{*}(k) V_{lm}(k) \right| ^2 \right\} }. \end{aligned}$$

(6.24)

Applying the distortionless constraint (6.5), this reduces to

$$\begin{aligned} \text {oSNR}_{\text {w}}(k)&= \frac{{P_{\text {pw}}(k)}}{\text {E} \left\{ \left| \sum _{l=0}^{L} \sum _{m=-l}^l W_{lm}^{*}(k) V_{lm}(k) \right| ^2 \right\} }. \end{aligned}$$

(6.25)

With Q microphones uniformly distributed on the sphere, the cross power spectral density of the noise is given by [31, Eq. 7.31]

$$\begin{aligned} \text {E} \left\{ V_{lm}(k) V^{*}_{l'm'}(k) \right\} = \sigma _{v}^2(k) \frac{4 \pi }{Q} \delta _{l,l'} \delta _{m,m'}, \end{aligned}$$

(6.26)

where $\delta $ denotes the Kronecker delta, and $\text {oSNR}$ simplifies to

$$\begin{aligned} \text {oSNR}_{\text {w}}(k)&= {P_{\text {pw}}(k)} \left( \frac{4 \pi }{Q} \sigma _{v}^2(k) \sum _{l=0}^{L} \sum _{m=-l}^l \left| W_{lm}^{*}(k)\right| ^2\right) ^{-1}. \end{aligned}$$

(6.27)

The output SNR is a function of the beamformer weights $W_{lm}(k)$, the plane wave power $P_{\text {pw}}(k)$, the noise power $\sigma _{v}^2(k)$, and the beamformer order L. The beamformer order can be increased by adding microphones, as discussed in Sect. 3.4.

Finally, the WNG can be expressed as

$$\begin{aligned} \text {WNG}(k)&= \frac{\text {oSNR}_{\text {w}}(k)}{\text {iSNR}_{\text {w}}(k)} \end{aligned}$$

(6.28a)

$$\begin{aligned}&= \frac{ 4 \pi Q}{\left| \left| \mathbf {w}(k) \right| \right| ^2 \sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)}. \end{aligned}$$

(6.28b)

The WNG is a function of the beamformer weights $W_{lm}(k)$, array order L and the array properties. As expected, it is also an increasing function of the number of microphones Q. In the case of an open sphere, $b_l(k) = i^l j_l(kr)$, and since $\sum _{l=0}^{\infty } \left| j_l(kr)\right| ^2 (2l+1) = 1$ [2, 13], the WNG is given by the simple expression

$$\begin{aligned} \text {WNG}(k)&= \frac{ 4 \pi Q}{\left| \left| \mathbf {w}(k) \right| \right| ^2}. \end{aligned}$$

(6.29)
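A minimal sketch of (6.28b) and (6.29) in Python is given below. The argument `mode_energy` stands for $\sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)$; it is an assumption of this sketch that the caller supplies it, with the default of 1 corresponding to the open sphere case (6.29). The function name is illustrative.

```python
import math


def white_noise_gain(w, num_mics, mode_energy=1.0):
    """WNG (6.28b) for flat weights w and Q = num_mics microphones.

    mode_energy is sum_{l>=0} (2l+1) |b_l(k)|^2; it equals 1 for an open
    sphere, in which case the expression reduces to (6.29).
    """
    norm_sq = sum(abs(w_i) ** 2 for w_i in w)
    return 4.0 * math.pi * num_mics / (norm_sq * mode_energy)


# Zeroth-order check in the low-frequency limit of an open sphere, where
# b_0 is close to 1 and the distortionless weight is W_00 = sqrt(4 pi).
wng = white_noise_gain([math.sqrt(4.0 * math.pi)], num_mics=32)
wng_db = 10.0 * math.log10(wng)
```

In this limiting case the WNG equals the number of microphones, Q = 32, which matches the familiar gain obtained by averaging Q sensors with uncorrelated noise.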

6.2.4 Spatial Response

The output of the beamformer in the presence of a single unit amplitude plane wave originating from a DOA $\varOmega $ is given by

$$\begin{aligned} \mathcal {B}(k,\varOmega ) = \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega ), \end{aligned}$$

(6.30)

and is known as the spatial response of the beamformer. The squared magnitude of the spatial response $\mathcal {B}(k,\varOmega )$ is referred to as the beam pattern [4].^{Footnote 5} The beam pattern describes the beamformer’s ability to select signals originating from a direction of interest, while suppressing signals that do not. Beam patterns typically exhibit multiple peaks or lobes; the largest lobe, in the direction of interest, is referred to as the main lobe, while the other lobes are referred to as sidelobes. Due to the effects of spatial aliasing, some sidelobes may have an amplitude equal to that of the main lobe, and they are then referred to as grating lobes [27].

Due to the spherical symmetry of the SHD, the beam pattern can also be expressed as a function of the angle between the DOA $\varOmega $ and the beamformer’s steering direction $\varOmega _{\text {u}}$, denoted as $\varTheta $. Ideally, the response in the steering direction, $\mathcal {B}(k,\varTheta = 0)$, should be as large as possible compared to the response in other directions, i.e., the sidelobe levels should be minimized. We refer to the width of the region that has a higher response than the maximum sidelobe level as the main lobe width,^{Footnote 6} as illustrated in Fig. 6.2.

6.3 Signal-Independent Beamformers

Having established our signal model in Sect. 6.1, we now develop a number of signal-independent beamformers based on the design criteria introduced in Sect. 6.2. The beam patterns of all the beamformers presented in this section are rotationally symmetric about the steering direction.

6.3.1 Farfield Beamformers

In this section, we derive three beamformers suitable for use in farfield conditions: a maximum directivity beamformer, a maximum WNG beamformer, and a multiply constrained beamformer.

6.3.1.1 Maximum Directivity Beamformer

The beamformer that maximizes the directivity while imposing a distortionless constraint in the steering direction satisfies

$$\begin{aligned} \max _{ \mathbf {w}(k) } \,\mathcal {D}(k) \quad&\text {subject to} \quad \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) = 1, \end{aligned}$$

or equivalently,

$$\begin{aligned} \min _{ \mathbf {w}(k) } \,\left| \left| \mathbf {B}(k) \mathbf {w}^{*}(k) \right| \right| ^2 \quad&\text {subject to} \quad \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) = 1, \end{aligned}$$

where $\mathbf {y}(\varOmega _{\text {u}})$ is the vector of spherical harmonics defined in (6.9).

Following the approach proposed by Brandwood [5], if we use a Lagrange multiplier to adjoin the constraint to the cost function, the weights of the maximum directivity beamformer are then given by

$$\begin{aligned} \mathbf {w}_{\mathrm {maxDI}}(k)&= \underset{\mathbf {w}(k)}{\arg \min } \, \mathcal {L}(\mathbf {w}(k), \lambda ), \end{aligned}$$

(6.31)

where $\mathcal {L}$ is the complex Lagrangian given by

$$\begin{aligned} \mathcal {L}(\mathbf {w}(k), \lambda )&= \left[ \mathbf {B}(k) \mathbf {w}^{*}(k) \right] ^{\text {H}} \left[ \mathbf {B}(k) \mathbf {w}^{*}(k) \right] \nonumber \\&\quad + \lambda \left( \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) - 1 \right) + \lambda ^{*} \left( \mathbf {y}^{\text {T}}(\varOmega _{\text {u}}) \mathbf {B}^{\text {*}}(k) \mathbf {w}(k) - 1 \right) \quad \end{aligned}$$

(6.32)

and $\lambda $ is the Lagrange multiplier. Setting the gradient of $\mathcal {L}(\mathbf {w}_{\mathrm {maxDI}}(k), \lambda )$ with respect to $\mathbf {w}^{*}_{\mathrm {maxDI}}$ to zero yields

$$\begin{aligned} \nabla _{\mathbf {w}^{*}_{\mathrm {maxDI}}} \mathcal {L}(\mathbf {w}_{\mathrm {maxDI}}(k), \lambda )&= \mathbf {0}_N\nonumber \\ \mathbf {B}(k) \mathbf {B}^{*}(k) \mathbf {w}_{\mathrm {maxDI}}(k) + \lambda \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}})&= \mathbf {0}_N, \end{aligned}$$

(6.33)

where $\mathbf {0}_N$ is a column vector of N zeros. Using the constraint in (6.31), we then find

$$\begin{aligned} \mathbf {w}_{\mathrm {maxDI}}(k) = \frac{\left[ \mathbf {B}^{*}(k)\right] ^{-1} \mathbf {y}^{*}(\varOmega _{\text {u}})}{\left| \left| \mathbf {y}(\varOmega _{\text {u}}) \right| \right| ^2}. \end{aligned}$$

(6.34)

Using Unsöld’s theorem [29], this simplifies to

$$\begin{aligned} \mathbf {w}_{\mathrm {maxDI}}(k) = \frac{4 \pi }{(L+1)^2} \left[ \mathbf {B}^{*}(k)\right] ^{-1} \mathbf {y}^{*}(\varOmega _{\text {u}}), \end{aligned}$$

(6.35)

or in scalar form

$$\begin{aligned} W_{lm}^{\mathrm {maxDI}}(k) = \frac{4 \pi }{(L+1)^2} \frac{Y_{lm}^{*}(\varOmega _{\text {u}})}{b_l^{*}(k)}. \end{aligned}$$

(6.36)

A well-known farfield SHD beamformer is the plane-wave decomposition (PWD) beamformer, also sometimes known as a regular beamformer [24], whose weights are given by [22]

$$\begin{aligned} \mathbf {w}_{\mathrm {PWD}}(k) = \left[ \mathbf {B}^{*}(k)\right] ^{-1} \mathbf {y}^{*}(\varOmega _{\text {u}}). \end{aligned}$$

(6.37)

As the (frequency-independent) scaling factor does not affect the directivity, the PWD beamformer is also a maximum directivity beamformer. The reason for the name PWD will become clear in the next paragraph.

Assuming a single unit amplitude plane wave is incident upon the array from a direction $\varOmega _{\text {s}}$, the output Z(k) of the PWD beamformer is given by

$$\begin{aligned} Z(k)&= \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {s}}) \end{aligned}$$

(6.38a)

$$\begin{aligned}&= \mathbf {y}^{\text {T}}(\varOmega _{\text {u}}) \mathbf {B}^{-1}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {s}}) \end{aligned}$$

(6.38b)

$$\begin{aligned}&= \sum _{l=0}^{L} \sum _{m=-l}^{l} Y_{lm}(\varOmega _{\text {u}}) Y_{lm}^{*}(\varOmega _{\text {s}}) \end{aligned}$$

(6.38c)

$$\begin{aligned}&= \frac{L+1}{4 \pi \left( \cos \varTheta - 1\right) } \left[ P_{L+1}(\cos \varTheta ) - P_{L}(\cos \varTheta )\right] , \end{aligned}$$

(6.38d)

where $\varTheta $ is the angle between $\varOmega _{\text {s}}$ and $\varOmega _{\text {u}}$ and $P_L$ is the Legendre polynomial of order L. Applying the spherical harmonic addition theorem (2.23) to (6.38c), followed by the Christoffel summation formula [11, Sect. 8.915], yields (6.38d) [20]. The beamformer output Z(k) reaches its maximum when $\varTheta = 0$, such that the steering direction $\varOmega _{\text {u}}$ is equal to the arrival direction $\varOmega _{\text {s}}$, as desired. We normalize the beamformer output with respect to its value for $\varTheta = 0$, and plot it as a function of $\varTheta $ in Fig. 6.3. We see that as L increases, the distribution of Z(k) narrows around $\varTheta = 0$, tending towards a delta function for $L \rightarrow \infty $ [31, Eq. 6.47].
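The dependence of the PWD beamformer output on $\varTheta $ can be explored numerically. Since $\mathbf {B}^{-1}(k) \mathbf {B}(k)$ in (6.38b) is the identity, the output reduces to $\mathbf {y}^{\text {T}}(\varOmega _{\text {u}}) \mathbf {y}^{*}(\varOmega _{\text {s}})$, which the spherical harmonic addition theorem turns into $\frac{1}{4\pi } \sum _{l=0}^{L} (2l+1) P_l(\cos \varTheta )$. The sketch below evaluates this sum and, as a cross-check, the closed form $\frac{L+1}{4\pi (\cos \varTheta - 1)} \left[ P_{L+1}(\cos \varTheta ) - P_L(\cos \varTheta ) \right] $ that follows from the Christoffel summation formula. The function names are illustrative.

```python
import math


def legendre(n_max, x):
    """Return [P_0(x), ..., P_n_max(x)] using Bonnet's recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]


def pwd_output_sum(L, big_theta):
    """PWD output for a unit amplitude plane wave, as a Legendre sum."""
    x = math.cos(big_theta)
    p = legendre(L, x)
    return sum((2 * l + 1) * p[l] for l in range(L + 1)) / (4.0 * math.pi)


def pwd_output_closed(L, big_theta):
    """Same quantity via the Christoffel summation formula."""
    x = math.cos(big_theta)
    if abs(x - 1.0) < 1e-12:
        # Limit for big_theta -> 0, where the closed form is 0/0.
        return (L + 1) ** 2 / (4.0 * math.pi)
    p = legendre(L + 1, x)
    return (L + 1) * (p[L + 1] - p[L]) / (4.0 * math.pi * (x - 1.0))


def pwd_output_normalized(L, big_theta):
    """Output normalized by its value at big_theta = 0."""
    return pwd_output_sum(L, big_theta) / pwd_output_sum(L, 0.0)


# Normalized response 30 degrees off the steering direction for L = 1..4.
off_axis = [pwd_output_normalized(L, math.radians(30.0)) for L in range(1, 5)]
```

The closed form suffers from cancellation very close to $\varTheta = 0$, which is why the sketch uses the Legendre sum for the normalized response and keeps the closed form only as a consistency check.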

The directivity of the maximum directivity beamformer is given by substituting (6.35) into (6.16)^{Footnote 7}

$$\begin{aligned} \mathcal {D}(k)&= 4 \pi \left| \left| \frac{4 \pi }{(L+1)^2} \mathbf {B}(k) \mathbf {B}^{-1}(k) \mathbf {y}(\varOmega _{\text {u}}) \right| \right| ^{-2} \end{aligned}$$

(6.39a)

$$\begin{aligned}&= \frac{(L+1)^4}{4 \pi } \left| \left| \mathbf {y}(\varOmega _{\text {u}}) \right| \right| ^{-2} \end{aligned}$$

(6.39b)

$$\begin{aligned}&= (L+1)^2. \end{aligned}$$

(6.39c)

The directivity of the maximum directivity beamformer is therefore frequency-independent and only depends on the beamformer order L.
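The results above can be checked numerically. The following sketch builds the weights (6.36) from a small pure-Python implementation of the complex spherical harmonics (with the Condon-Shortley phase, an assumption about the convention of (2.14)), then evaluates the distortionless constraint (6.5) and the directivity (6.15). The mode strengths are arbitrary non-zero placeholders rather than a physical array model, and all function names are illustrative.

```python
import cmath
import math


def assoc_legendre(l, m, x):
    """Associated Legendre function P_l^m(x) for 0 <= m <= l (Condon-Shortley phase)."""
    p_mm = 1.0
    if m > 0:
        root = math.sqrt((1.0 - x) * (1.0 + x))
        odd = 1.0
        for _ in range(m):
            p_mm *= -odd * root
            odd += 2.0
    if l == m:
        return p_mm
    p_next = x * (2 * m + 1) * p_mm
    for n in range(m + 2, l + 1):
        p_mm, p_next = p_next, ((2 * n - 1) * x * p_next - (n + m - 1) * p_mm) / (n - m)
    return p_next


def sph_harm(l, m, theta, phi):
    """Complex spherical harmonic Y_lm at inclination theta and azimuth phi."""
    am = abs(m)
    norm = math.sqrt((2 * l + 1) / (4.0 * math.pi)
                     * math.factorial(l - am) / math.factorial(l + am))
    y = norm * assoc_legendre(l, am, math.cos(theta)) * cmath.exp(1j * am * phi)
    return (-1) ** am * y.conjugate() if m < 0 else y


def max_di_weights(L, b, theta_u, phi_u):
    """Weights (6.36) in flat order (index l*l + l + m); every b[l] must be non-zero."""
    scale = 4.0 * math.pi / (L + 1) ** 2
    return [scale * sph_harm(l, m, theta_u, phi_u).conjugate() / complex(b[l]).conjugate()
            for l in range(L + 1) for m in range(-l, l + 1)]


L = 2
b = [1.0, 0.5j, -0.25]        # placeholder mode strengths b_0, b_1, b_2
theta_u, phi_u = 1.1, 0.4     # arbitrary steering direction (radians)
w = max_di_weights(L, b, theta_u, phi_u)

constraint = 0.0
norm_sq = 0.0
for l in range(L + 1):
    for m in range(-l, l + 1):
        w_lm = w[l * l + l + m]
        constraint += w_lm.conjugate() * b[l] * sph_harm(l, m, theta_u, phi_u).conjugate()
        norm_sq += abs(w_lm * b[l]) ** 2
directivity = 4.0 * math.pi / norm_sq
```

With these inputs the constraint evaluates to 1 and the directivity to $(L+1)^2 = 9$, independently of the placeholder mode strengths, in line with (6.39c).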

Since at least $(L+1)^2$ microphones are required to sample a sound field up to order L without spatial aliasing, the directivity is upper bounded by the number of microphones Q. This is also the maximum directivity of a spatial domain beamformer based on a standard linear array [28, Eq. 2.160].

The WNG of the maximum directivity beamformer is given by substituting (6.35) into (6.28)

$$\begin{aligned} \text {WNG}(k)&= \frac{Q (L+1)^4}{4 \pi \sum _{l=0}^{L} \sum _{m=-l}^l \left| \frac{Y_{lm}(\varOmega _{\text {u}})}{b_l(k)}\right| ^2 \sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)} \end{aligned}$$

(6.40a)

$$\begin{aligned}&= \frac{Q (L+1)^4}{\sum _{l=0}^{L} \left| b_l(k)\right| ^{-2} (2l+1) \sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)}. \end{aligned}$$

(6.40b)

In the open sphere case, this simplifies to^{Footnote 8}

$$\begin{aligned} \text {WNG}(k)&= \frac{Q (L+1)^4}{\sum _{l=0}^{L} \left| b_l(k)\right| ^{-2} (2l+1)}, \end{aligned}$$

(6.41)

or in matrix form

$$\begin{aligned} \text {WNG}(k)&= Q (L+1)^4 \left| \left| \mathbf {B}^{-1}(k) \right| \right| ^{-2}. \end{aligned}$$

(6.42)

In Fig. 6.4, we plot the WNG of the maximum directivity beamformer of order $L = 4$ as a function of the product of the wavenumber k and array radius r, kr, for an array of $Q = 32$ microphones. Assuming a speed of sound of 343 $\text {m}\cdot \text {s}^{-1}$, a kr value of 1 corresponds to a frequency of approximately 546 Hz for an array radius of $r = 10$ cm, for example, since $f = kc/(2\pi )$. It can be seen that the beamformer’s WNG is low except at high frequencies or large array radii. When an open sphere is used, the maximum directivity beamformer has particularly poor robustness at certain values of kr; this is due to the presence of zeros in the open sphere mode strength (see Sect. 3.4.2). The rigid sphere does not present this issue, and in addition provides an increase in WNG of approximately 3.7 dB over the open sphere at low values of kr.
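The open sphere behaviour described above can be reproduced with a short sketch. It evaluates (6.41) with $\left| b_l(k)\right| = \left| j_l(kr)\right| $ and converts kr to frequency through $f = kr \cdot c / (2 \pi r)$. The spherical Bessel function is computed from its power series, which is an implementation choice of this sketch that is adequate for moderate kr only; the function names are illustrative.

```python
import math


def sph_bessel_j(l, x, n_terms=60):
    """Spherical Bessel function j_l(x) from its power series (moderate x only)."""
    double_fact = 1.0
    for odd in range(1, 2 * l + 2, 2):  # (2l+1)!! = 1 * 3 * ... * (2l+1)
        double_fact *= odd
    term = x ** l / double_fact
    total = term
    for k in range(1, n_terms):
        term *= -(x * x / 2.0) / (k * (2 * l + 2 * k + 1))
        total += term
    return total


def kr_to_frequency(kr, radius, speed_of_sound=343.0):
    """Frequency in Hz corresponding to a given kr for an array of the given radius (m)."""
    return kr * speed_of_sound / (2.0 * math.pi * radius)


def max_di_wng_open_sphere(L, num_mics, kr):
    """WNG (6.41) of the maximum directivity beamformer for an open sphere.

    Diverges numerically near the zeros of j_l(kr), where the WNG collapses.
    """
    denom = sum((2 * l + 1) / sph_bessel_j(l, kr) ** 2 for l in range(L + 1))
    return num_mics * (L + 1) ** 4 / denom


freq_hz = kr_to_frequency(1.0, radius=0.1)
wng_db_by_kr = {kr: 10.0 * math.log10(max_di_wng_open_sphere(4, 32, kr))
                for kr in (0.5, 1.0, 2.0)}
```

The values in `wng_db_by_kr` increase with kr over this range, reflecting the poor low-frequency robustness of the maximum directivity beamformer.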

6.3.1.2 Maximum White Noise Gain Beamformer

The beamformer that maximizes the WNG while imposing a distortionless constraint in the steering direction satisfies

$$\begin{aligned} \max _{ \mathbf {w}(k) } \,\text {WNG}(k) \quad&\text {subject to} \quad \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) = 1, \end{aligned}$$

or equivalently,

$$\begin{aligned} \min _{ \mathbf {w}(k) } \,\left| \left| \mathbf {w}(k) \right| \right| ^2 \quad&\text {subject to} \quad \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) = 1. \end{aligned}$$

Proceeding in a similar way as for the analysis of the maximum directivity beamformer, if we use a Lagrange multiplier to adjoin the constraint to the cost function, the weights of the maximum WNG beamformer are then given by

$$\begin{aligned} \mathbf {w}_{\mathrm {maxWNG}}(k)&= \underset{\mathbf {w}(k)}{\arg \min } \, \mathcal {L}(\mathbf {w}(k), \lambda ), \end{aligned}$$

(6.43)

where $\mathcal {L}$ is the complex Lagrangian given by

$$\begin{aligned} \mathcal {L}(\mathbf {w}(k), \lambda )&= \left[ \mathbf {w}(k) \right] ^{\text {H}} \left[ \mathbf {w}(k) \right] \nonumber \\&\quad + \lambda \left( \mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) - 1 \right) + \lambda ^{*} \left( \mathbf {y}^{\text {T}}(\varOmega _{\text {u}}) \mathbf {B}^{\text {*}}(k) \mathbf {w}(k) - 1 \right) \quad \end{aligned}$$

(6.44)

and $\lambda $ is the Lagrange multiplier. Setting the gradient of $\mathcal {L}(\mathbf {w}_{\mathrm {maxWNG}}(k), \lambda )$ with respect to $\mathbf {w}^{*}_{\mathrm {maxWNG}}$ to zero yields

$$\begin{aligned} \nabla _{\mathbf {w}^{*}_{\mathrm {maxWNG}}} \mathcal {L}(\mathbf {w}_{\mathrm {maxWNG}}(k), \lambda )&= \mathbf {0}_N\nonumber \\ \mathbf {w}_{\mathrm {maxWNG}}(k) + \lambda \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}})&= \mathbf {0}_N, \end{aligned}$$

(6.45)

where $\mathbf {0}_N$ is a column vector of N zeros. Using the constraint in (6.43), we then find

$$\begin{aligned} \mathbf {w}_{\mathrm {maxWNG}}(k) = \frac{\mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}})}{\mathbf {y}^{\text {T}}(\varOmega _{\text {u}}) \mathbf {B}^{*}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}})}. \end{aligned}$$

(6.46)

Using Unsöld’s theorem [29], this simplifies to

$$\begin{aligned} \mathbf {w}_{\mathrm {maxWNG}}(k) = 4 \pi \frac{\mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}})}{\left| \left| \mathbf {B}(k) \right| \right| ^2}, \end{aligned}$$

(6.47)

or in scalar form

$$\begin{aligned} W_{lm}^{\mathrm {maxWNG}}(k) = 4 \pi \frac{Y_{lm}^{*}(\varOmega _{\text {u}}) b_l(k)}{\sum _{l=0}^{L} |b_l(k)|^2 (2l+1)}. \end{aligned}$$

(6.48)

A well-known farfield SHD beamformer is the delay-and-sum beamformer, whose weights are given by [22]

$$\begin{aligned} \mathbf {w}_{\mathrm {DSB}}(k) = \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}). \end{aligned}$$

(6.49)

In the case of an open sphere, $b_l(k) = i^l j_l(kr)$, and since $\sum _{l=0}^{\infty } \left| j_l(kr)\right| ^2 (2l+1) = 1$ [2], the following relationship between the maximum WNG and delay-and-sum beamformers is obtained:

$$\begin{aligned} \lim _{L \rightarrow \infty } \mathbf {w}_{\mathrm {maxWNG}}(k) = 4 \pi \, \mathbf {w}_{\mathrm {DSB}}(k). \end{aligned}$$

(6.50)

When an open sphere is used, the delay-and-sum beamformer therefore approaches a maximum WNG beamformer as $L \rightarrow \infty $ (ignoring the $4 \pi $ scaling factor, which does not affect the WNG). For a finite L and/or if another microphone type or array configuration is used (such as a rigid sphere), the delay-and-sum beamformer is slightly suboptimal.
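For a finite order, comparing (6.48) with (6.49) for an open sphere shows that the two sets of weights differ by the factor $4 \pi / S_L$, where $S_L = \sum _{l=0}^{L} |j_l(kr)|^2 (2l+1)$. The short Python sketch below, in which the function names and the chosen values of kr are illustrative assumptions, evaluates this partial sum from the power series of $j_l$ and shows how it approaches one as L grows.

```python
import math

def sph_jn(l, x, terms=60):
    """Spherical Bessel function j_l(x) from its ascending power series
    (adequate for the moderate arguments used here, not for large x)."""
    lead = 1.0
    for n in range(1, l + 1):
        lead *= x / (2 * n + 1)          # x^l / (2l+1)!!
    term, total = 1.0, 1.0
    for s in range(1, terms):
        term *= -0.5 * x * x / (s * (2 * l + 2 * s + 1))
        total += term
    return lead * total

def partial_sum(kr, L):
    """S_L = sum_{l=0}^{L} (2l+1) |j_l(kr)|^2, i.e. ||B||^2 for an open sphere."""
    return sum((2 * l + 1) * sph_jn(l, kr) ** 2 for l in range(L + 1))

for kr in (1.0, 3.0, 6.0):
    print(kr, partial_sum(kr, 4), partial_sum(kr, 40))
```

For small kr the sum is already close to one at $L = 4$, whereas for larger kr more orders are needed before the delay-and-sum weights match the maximum WNG weights up to the $4 \pi $ factor.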

The delay-and-sum beamformer owes its name to the fact that for an open sphere as $L \rightarrow \infty $, its output converges to the output of the widely known spatial domain delay-and-sum beamformer [22].

The directivity of the maximum WNG beamformer is given by substituting (6.47) into (6.16)

$$\begin{aligned} \mathcal {D}(k)&= 4 \pi \left| \left| \frac{4 \pi }{\left| \left| \mathbf {B}(k) \right| \right| ^2} \mathbf {B}(k) \mathbf {B}^{*}(k) \mathbf {y}(\varOmega _{\text {u}}) \right| \right| ^{-2} \end{aligned}$$

(6.51a)

$$\begin{aligned}&= \frac{4 \pi }{(4 \pi )^2} \left| \left| \mathbf {B}(k) \right| \right| ^4 \left| \left| \mathbf {B}(k) \mathbf {B}^{*}(k) \mathbf {y}(\varOmega _{\text {u}}) \right| \right| ^{-2} \end{aligned}$$

(6.51b)

$$\begin{aligned}&= \left| \left| \mathbf {B}(k) \right| \right| ^4 \left| \left| \mathbf {B}(k) \mathbf {B}^{*}(k) \right| \right| ^{-2}. \end{aligned}$$

(6.51c)
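As a numerical illustration of (6.51c), the sketch below evaluates the directivity index of the maximum WNG beamformer for an open sphere, for which $|b_l(k)| = |j_l(kr)|$. It assumes that the norms in (6.51c) are Frobenius norms of the diagonal matrix $\mathbf {B}(k)$, in which each $b_l(k)$ is repeated $2l+1$ times, so that $||\mathbf {B}||^2 = \sum _l (2l+1) |b_l|^2$ and $||\mathbf {B} \mathbf {B}^{*}||^2 = \sum _l (2l+1) |b_l|^4$; the function names are hypothetical.

```python
import math

def sph_jn(l, x, terms=60):
    """Spherical Bessel function j_l(x) from its ascending power series."""
    lead = 1.0
    for n in range(1, l + 1):
        lead *= x / (2 * n + 1)          # x^l / (2l+1)!!
    term, total = 1.0, 1.0
    for s in range(1, terms):
        term *= -0.5 * x * x / (s * (2 * l + 2 * s + 1))
        total += term
    return lead * total

def max_wng_directivity_open(kr, L):
    """Directivity (6.51c) of the maximum WNG beamformer, open sphere."""
    b2 = [sph_jn(l, kr) ** 2 for l in range(L + 1)]                   # |b_l|^2
    norm_B_sq = sum((2 * l + 1) * b2[l] for l in range(L + 1))        # ||B||^2
    norm_BB_sq = sum((2 * l + 1) * b2[l] ** 2 for l in range(L + 1))  # ||B B*||^2
    return norm_B_sq ** 2 / norm_BB_sq

for kr in (0.01, 0.1, 1.0, 2.0, 4.0):
    di_db = 10 * math.log10(max_wng_directivity_open(kr, 4))
    print(f"kr = {kr:5.2f}  DI = {di_db:6.2f} dB")
```

As kr tends to zero only the $l = 0$ mode remains, so the directivity tends to one (0 dB); by the Cauchy-Schwarz inequality the value can never exceed $(L+1)^2$.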

The WNG of the maximum WNG beamformer is given by substituting (6.47) into (6.28)

$$\begin{aligned} \text {WNG}(k)&= \frac{4 \pi Q \left| \left| \mathbf {B}(k) \right| \right| ^4}{(4 \pi )^2 \left| \left| \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) \right| \right| ^2 \sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)} \end{aligned}$$

(6.52a)

Using Unsöld’s theorem [29], this simplifies to

$$\begin{aligned} \text {WNG}(k)&= \frac{Q \left| \left| \mathbf {B}(k) \right| \right| ^2}{\sum _{l=0}^{\infty } \left| b_l(k)\right| ^2 (2l+1)} \end{aligned}$$

(6.53)

In the open sphere case, the WNG approaches Q as $L \rightarrow \infty $ (as in [22]), so it can be seen that the maximum WNG beamformer achieves a constant WNG of Q that is independent of frequency. This is also the highest achievable WNG for a distortionless beamformer in the spatial domain [28].

In Fig. 6.5, we plot the DI of the maximum directivity and maximum WNG beamformers of order $L = 4$ as a function of kr for an array of $Q = 32$ microphones. As expected, the maximum directivity beamformer provides the highest directivity, while the maximum WNG beamformer has poor directivity at low values of kr (i.e., low frequencies or small array radii). Due to the effects of scattering introduced by the rigid sphere (see Sect. 3.4.1), the maximum WNG beamformer has better directivity with a rigid array than with an open array. The directivity of the maximum directivity beamformer is independent of kr, while for the maximum WNG beamformer the directivity decays as kr decreases, tending towards 0 dB (i.e., no directivity).

The WNG of the maximum WNG beamformer of order $L = 4$ is shown in Fig. 6.4; as expected, it provides the highest WNG. Using Figs. 6.4 and 6.5, it can be observed that there is a tradeoff between WNG and directivity. The maximum directivity and WNG beamformers provide performance bounds for SHD beamformers in terms of directivity and WNG, and are attractive due to their low computational complexity. However, in practice a compromise solution is desirable, such as the multiply constrained beamformer presented in Sect. 6.3.1.3, or the signal-dependent beamformers in Chap. 7, which adaptively control the tradeoff between these two objectives depending on the nature of the noise to be suppressed.

6.3.1.3 Multiply Constrained Beamformer

Another approach to the design of a signal-independent beamformer is to minimize its sidelobe levels for a given main lobe width, to ensure that interfering signals that do not originate from the steering direction are effectively suppressed. However, in order to obtain a beamformer that is robust to errors in sensor position and steering direction, and to sensor noise, it is desirable to introduce a constraint on the beamformer’s WNG.

In [25], the authors propose a robust minimum sidelobe beamformer, which minimizes the maximum sidelobe level, subject to a distortionless constraint in the steering direction and a minimum WNG constraint. The objective can therefore be expressed in the form of a minimax criterion as

$$\begin{aligned} \min _{ \mathbf {w}(k) }&\quad \max _{\varTheta > \varDelta /2} { \left| \mathcal {B}(k,\varTheta )\right| } \quad \text {subject to}\nonumber \\&\mathbf {w}^{\text {H}}(k) \mathbf {B}(k) \mathbf {y}^{*}(\varOmega _{\text {u}}) = 1, \quad \text {WNG}(k) \ge \zeta (k), \end{aligned}$$

(6.54)

where $\mathcal {B}(k,\varTheta )$ is the spatial response of the beamformer, $\varTheta $ denotes the angle between the steering direction and the DOA, $\varDelta $ denotes the main lobe width (as defined in Sect. 6.2.4), and $\zeta $ is the minimum WNG. The sidelobe region is defined as $\varTheta _{\text {SL}} = \left\{ \varTheta | \varTheta > \varDelta /2 \right\} $.

As shown in [25], the problem in (6.54) can be reformulated as a convex optimization problem, solvable using second-order cone programming. The sidelobe region is approximated using a finite grid $\varTheta _{n_\text {g}} \in \varTheta _{\text {SL}}, n_\text {g} \in \{ 1, \ldots , N_\text {g}\}$; the approximation then improves as $N_\text {g}$ increases.

Finding a solution to (6.54) can be computationally intensive. However, a significant advantage of SHD beamforming is that if the desired beam pattern is rotationally symmetric about the steering direction $\varOmega _{\text {u}}$, the process of computing the beamformer weights and steering of the beamformer can be decoupled. In this case, the beamformer weights are expressed as $W_{lm}(k) = C_{l}(k) Y_{lm}^{*}(\varOmega _{\text {u}})$, and the weights $C_l(k)$ then become the quantities to be optimized. If the desired beam pattern is not rotationally symmetric about the steering direction, the beam pattern can be rotated by multiplying the SHD beamformer weights by Wigner-D functions that depend on the rotation angles, as proposed in [23].

6.3.2 Nearfield Beamformers

In this chapter, we have until now assumed that the desired signal was due to a single plane wave, i.e., farfield conditions. However, under nearfield conditions, the plane wave assumptions cannot be considered valid. The SHD sound pressure due to a spherical wave originating from a source at a position $\mathbf {r}_{\text {s}} = (r_{\text {s}},\varOmega _{\text {s}})$ is given by

$$\begin{aligned} X_{lm}(k,\mathbf {r}_{\text {s}})&= X_{\text {sw}}(k) b_l^{\text {nf}}(k,r_{\text {s}}) Y_{lm}^*(\varOmega _{\text {s}}), \end{aligned}$$

(6.55)

where $X_{\text {sw}}(k)$ denotes the spherical wave amplitude and the nearfield mode strength $b_l^{\text {nf}}(k,r_{\text {s}})$ is given by

$$\begin{aligned} b_l^{\text {nf}}(k,r_{\text {s}}) = -i k i^{-l} h_l^{(2)}(kr_{\text {s}}) b_l(k), \end{aligned}$$

(6.56)

and $h_l^{(2)}$ is the spherical Hankel function of the second kind and of order l.
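A quick numerical check of (6.56) is to confirm that, for a distant source, the nearfield mode strength reduces to the farfield mode strength multiplied by the spherical-wave factor $e^{-ikr_{\text {s}}}/r_{\text {s}}$. The Python sketch below does this for an open sphere, using $b_l(k) = i^l j_l(kr)$ as stated earlier; the function names and the numerical values of k, r and $r_{\text {s}}$ are assumptions chosen for illustration, and the upward recurrence used for the Bessel functions is only intended for the low orders considered here.

```python
import cmath
import math

def sph_jn_yn(l_max, x):
    """Spherical Bessel functions j_l(x) and y_l(x), l = 0..l_max (upward recurrence)."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    y = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for l in range(1, l_max):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        y.append((2 * l + 1) / x * y[l] - y[l - 1])
    return j[: l_max + 1], y[: l_max + 1]

def mode_strength_open(l_max, k, r):
    """Farfield open-sphere mode strength b_l(k) = i^l j_l(kr)."""
    j, _ = sph_jn_yn(l_max, k * r)
    return [1j**l * j[l] for l in range(l_max + 1)]

def mode_strength_nearfield(l_max, k, r, r_s):
    """Nearfield mode strength (6.56): -i k i^{-l} h_l^(2)(k r_s) b_l(k)."""
    j, y = sph_jn_yn(l_max, k * r_s)
    b = mode_strength_open(l_max, k, r)
    # Spherical Hankel function of the second kind: h_l^(2) = j_l - i y_l.
    return [-1j * k * 1j**(-l) * (j[l] - 1j * y[l]) * b[l] for l in range(l_max + 1)]

k, r, L = 2.0, 0.5, 4  # hypothetical wavenumber (rad/m), array radius (m) and order
b_ff = mode_strength_open(L, k, r)
for r_s in (1.0, 5000.0):
    b_nf = mode_strength_nearfield(L, k, r, r_s)
    # Remove the spherical-wave factor exp(-i k r_s) / r_s before comparing.
    ratio = [b_nf[l] * r_s * cmath.exp(1j * k * r_s) / b_ff[l] for l in range(L + 1)]
    print(r_s, [round(abs(x), 3) for x in ratio])
```

For the distant source the ratios are close to one for all orders, whereas for the nearby source the higher orders are strongly amplified, which is the property that nearfield beamformers exploit.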

Beamformers suitable for nearfield conditions [8, 9, 19] can be designed by replacing the farfield mode strength expression $b_l(k)$ with the nearfield mode strength $b_l^{\text {nf}}(k,r_{\text {s}})$ in the beamformer weights. For example, the weights of a nearfield plane-wave decomposition beamformer are given by

$$\begin{aligned} W_{lm}^{\text {PWD,nf}}(k) = \frac{Y_{lm}^{*}(\varOmega _{\text {u}})}{\left[ b_l^{\text {nf}}(k,r_{\text {s}})\right] ^{*}}, \end{aligned}$$

(6.57)

instead of (6.37). While this process is straightforward, it does require knowledge of the source-array distance $r_{\text {s}}$. If the source-array distance is not known, $r_{\text {s}}$ becomes a controllable parameter, which is effectively a look distance and enables radial discrimination [9].

An appropriate boundary between the farfield and nearfield regions can be determined by comparing the magnitudes of the farfield mode strength $b_l(k)$ and the nearfield mode strength $b_l^{\text {nf}}(k,r_{\text {s}})$, as proposed in [8]. Using this criterion, the cut-off distance $r_{\text {nf}}$ is determined as

$$\begin{aligned} r_{\text {nf}}(k) = \frac{L}{k}. \end{aligned}$$

(6.58)

The extent of the nearfield region therefore decreases with frequency. An array with good radial discrimination, i.e., a large nearfield region, can be realized either at low frequencies (small k), or by oversampling the array (large N) [9].

Example: At a frequency of 100 Hz, assuming a speed of sound of 343 $\text {m}\cdot \text {s}^{-1}$ and an array order $L = 4$, the cut-off distance is $r_{\text {nf}}(k) = 2.2$ m, while at a frequency of 4 kHz it is 5.5 cm.
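The figures in this example follow directly from (6.58) with $k = 2 \pi f / c$. A minimal Python sketch of the calculation is given below; the function name `nearfield_cutoff` is a hypothetical choice for illustration.

```python
import math

def nearfield_cutoff(order, freq_hz, c=343.0):
    """Cut-off distance r_nf = L / k from (6.58), with wavenumber k = 2*pi*f/c."""
    k = 2 * math.pi * freq_hz / c
    return order / k

r_100 = nearfield_cutoff(4, 100.0)    # metres
r_4k = nearfield_cutoff(4, 4000.0)    # metres
print(f"{r_100:.2f} m, {100 * r_4k:.1f} cm")
```

The two values round to the 2.2 m and 5.5 cm quoted in the example.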

6.4 Chapter Summary

An overview of beamforming in the SHD using signal-independent beamformers has been presented. We introduced a number of performance measures, which were then used to derive beamformer weights that are optimal with respect to these measures. We also showed the relationship between these optimal beamformers and two well-known SHD beamformers: the PWD and delay-and-sum beamformers. Finally, where similarities existed, the performance bounds for SHD beamformers were related to previously derived bounds for spatial domain beamformers.

Notes

1. The dependency on time is omitted for brevity. In practice, the signals acquired using a spherical microphone array are usually processed in the short-time Fourier transform domain, as explained in Sect. 3.1, where the discrete frequency index is denoted by $\nu $.
2. We use the complex conjugate weights $W^*_{lm}$ rather than the weights $W_{lm}$; this notational convention originates in the spatial domain [30].
3. If the real SHT is applied instead of the complex SHT, the complex spherical harmonics $Y_{lm}$ used throughout this chapter should be replaced with the real spherical harmonics $R_{lm}$, as defined in Sect. 3.3.
4. It should be noted that this simplified expression is only valid for beamformers that satisfy the distortionless constraint given in (6.5). It therefore does not apply to the plane-wave decomposition beamformer presented in Sect. 6.3.1.1, which satisfies a scaled version of this constraint.
5. Note that in some publications, such as [28], $\mathcal {B}(k,\varOmega )$ is referred to as the beam pattern, and its square magnitude is referred to as the power pattern.
6. The main lobe width is sometimes also defined as the width of the region where the beam pattern is no less than half of its maximum value, or equivalently, no more than 3 dB below its maximum value.
7. This expression is identical to (12) in [22] if we substitute $d_n = 1$.
8. This expression is identical to (11) in [22] if we substitute $d_n = 1$, with the exception of the $(4 \pi )^2$ scaling factor, which is required due to the fact that in [22] a $4 \pi $ scaling factor is included in the definition of the mode strength.

References

1. Abhayapala, T.D., Ward, D.B.: Theory and design of high order sound field microphones using spherical microphone array. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 2, pp. 1949–1952 (2002). doi:10.1109/ICASSP.2002.1006151
2. Abramowitz, M., Stegun, I.A. (eds.): Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications, New York (1972)
3. Benesty, J., Chen, J., Huang, Y.: Microphone Array Signal Processing. Springer, Berlin (2008)
4. Brandstein, M.S., Ward, D.B. (eds.): Microphone Arrays: Signal Processing Techniques and Applications. Springer, Berlin (2001)
5. Brandwood, D.H.: A complex gradient operator and its application in adaptive array theory. Proc. IEEE 130(1, Parts F and H), 11–16 (1983)
6. Doclo, S.: Multi-microphone noise reduction and dereverberation techniques for speech applications. Ph.D. thesis, Katholieke Universiteit Leuven, Belgium (2003)
7. Elko, G.W.: Differential microphone arrays. In: Huang, Y., Benesty, J. (eds.) Audio Signal Processing for Next-Generation Multimedia Communication Systems, pp. 2–65. Kluwer (2004)
8. Fisher, E., Rafaely, B.: The nearfield spherical microphone array. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5272–5275 (2008). doi:10.1109/ICASSP.2008.4518849
9. Fisher, E., Rafaely, B.: Near-field spherical microphone array processing with radial filtering. IEEE Trans. Audio, Speech, Lang. Process. 19(2), 256–265 (2011). doi:10.1109/TASL.2010.2047421
10. Gilbert, E., Morgan, S.: Optimum design of directive antenna arrays subject to random variations. Bell Syst. Tech. J. 34, 637–663 (1955)
11. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products, seventh edn. Academic Press, Cambridge (2007)
12. Habets, E.A.P., Benesty, J., Cohen, I., Gannot, S., Dmochowski, J.: New insights into the MVDR beamformer in room acoustics. IEEE Trans. Audio, Speech, Lang. Process. 18, 158–170 (2010)
13. Jarrett, D.P., Habets, E.A.P.: On the noise reduction performance of a spherical harmonic domain tradeoff beamformer. IEEE Signal Process. Lett. 19(11), 773–776 (2012)
14. Jarrett, D.P., Habets, E.A.P., Benesty, J., Naylor, P.A.: A tradeoff beamformer for noise reduction in the spherical harmonic domain. In: Proceedings of the International Workshop Acoustics Signal Enhancement (IWAENC). Aachen, Germany (2012)
15. Jarrett, D.P., Habets, E.A.P., Naylor, P.A.: Spherical harmonic domain noise reduction using an MVDR beamformer and DOA-based second-order statistics estimation. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 654–658. Vancouver, Canada (2013)
16. Jarrett, D.P., Thiergart, O., Habets, E.A.P., Naylor, P.A.: Coherence-based diffuseness estimation in the spherical harmonic domain. In: Proceedings of the IEEE Convention of Electrical and Electronics Engineers in Israel (IEEEI). Eilat, Israel (2012)
17. Meyer, J., Agnello, T.: Spherical microphone array for spatial sound recording. In: Proceedings of the Audio Engineering Society Convention, pp. 1–9. New York (2003)
18. Meyer, J., Elko, G.: A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 2, pp. 1781–1784 (2002)
19. Meyer, J., Elko, G.W.: Position independent close-talking microphone. Signal Processing 86(6), 1254–1259 (2006). doi:10.1016/j.sigpro.2005.05.036
20. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution. J. Acoust. Soc. Am. 116(4), 2149–2157 (2004)
21. Rafaely, B.: Analysis and design of spherical microphone arrays. IEEE Trans. Speech Audio Process. 13(1), 135–143 (2005). doi:10.1109/TSA.2004.839244
22. Rafaely, B.: Phase-mode versus delay-and-sum spherical microphone array processing. IEEE Signal Process. Lett. 12(10), 713–716 (2005). doi:10.1109/LSP.2005.855542
23. Rafaely, B., Kleider, M.: Spherical microphone array beam steering using Wigner-D weighting. IEEE Signal Process. Lett. 15, 417–420 (2008). doi:10.1109/LSP.2008.922288
24. Rafaely, B., Peled, Y., Agmon, M., Khaykin, D., Fisher, E.: Spherical microphone array beamforming. In: Cohen, I., Benesty, J., Gannot, S. (eds.) Speech Processing in Modern Communication: Challenges and Perspectives, Chap. 11. Springer, Heidelberg (2010)
25. Sun, H., Yan, S., Svensson, U.P.: Robust minimum sidelobe beamforming for spherical microphone arrays. IEEE Trans. Audio, Speech, Lang. Process. 19(4), 1045–1051 (2011). doi:10.1109/TASL.2010.2076393
26. Teutsch, H.: Wavefield decomposition using microphone arrays and its application to acoustic scene analysis. Ph.D. thesis, Friedrich-Alexander Universität Erlangen-Nürnberg (2005)
27. van Trees, H.L.: Detection, Estimation, and Modulation Theory, Optimum Array Processing, vol. IV. Wiley, New York (2002)
28. van Trees, H.L.: Optimum Array Processing. Detection Estimation and Modulation Theory. Wiley, New York (2002)
29. Unsöld, A.: Beiträge zur Quantenmechanik der Atome. Annalen der Physik 387(3), 355–393 (1927). doi:10.1002/andp.19273870304
30. van Veen, B.D., Buckley, K.M.: Beamforming: a versatile approach to spatial filtering. IEEE Acoust. Speech Signal Mag. 5(2), 4–24 (1988)
31. Williams, E.G.: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography, 1st edn. Academic Press, London (1999)


Author information

Authors and Affiliations

Kilburn & Strode LLP, London, UK
Daniel P. Jarrett
International Audio Laboratories Erlangen, Erlangen, Germany
Emanuël A. P. Habets
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Patrick A. Naylor

Corresponding author

Correspondence to Daniel P. Jarrett.

About this chapter

Cite this chapter

Jarrett, D.P., Habets, E.A.P., Naylor, P.A. (2017). Signal-Independent Array Processing. In: Theory and Applications of Spherical Microphone Array Processing. Springer Topics in Signal Processing, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-319-42211-4_6

DOI: https://doi.org/10.1007/978-3-319-42211-4_6
Published: 27 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42209-1
Online ISBN: 978-3-319-42211-4
eBook Packages: Engineering, Engineering (R0)
