1 Introduction

Acoustic vector sensors measure the acoustic pressure together with all three orthogonal components of the acoustic particle velocity at a single point in space. Compared with traditional acoustic pressure sensor arrays, vector-sensor arrays offer several considerable advantages, such as collecting more acoustic information, better exploitation of beamforming, and enhanced system performance (Sun et al. 2006, 2003; Nehorai and Paldi 1994; Hawkes and Nehorai 1998; Chen and Zhao 2004; Hochwald and Nehorai 1996). Since the measurement model for the acoustic vector-sensor array (Nehorai and Paldi 1994) was developed, research has concentrated on direction of arrival (DOA) estimation of incoming signals. DOA estimation is a key problem in array signal processing (Zhang et al. 2012a, b; Yamada and Oguchi 2011; Liu et al. 2013; He et al. 2011). DOA estimation algorithms for acoustic vector-sensor arrays include the Capon technique (Hawkes and Nehorai 1998), estimation of signal parameters via rotational invariance technique (ESPRIT) algorithms (Wong and Zoltowski 1997a, b; He et al. 2009), root multiple signal classification (MUSIC) (Wong and Zoltowski 1999), self-initiating MUSIC (Wong and Zoltowski 2000), hypercomplex MUSIC (Wang et al. 2008), quaternion-MUSIC (Miron et al. 2006; Bihan et al. 2007; Gong et al. 2008), trilinear decomposition (Zhang et al. 2012a, b), the propagator method (PM) (He and Liu 2008), the cross-correlation method (Liu et al. 2013), as well as others (Palanisamy et al. 2012; Hawkes and Nehorai 2001; Yuan et al. 2008; Arunkumar and Anand 2007; Tam and Wong 2009; Abdi and Guo 2009; Hawkes and Nehorai 2003).

The ESPRIT algorithm requires eigen-value decomposition (EVD) of the cross-spectral matrix or singular value decomposition of the received data, and it has been used for two-dimensional (2D) DOA estimation with acoustic vector-sensor arrays (Wong and Zoltowski 1997a). However, when ESPRIT is used for 2D-DOA estimation, the estimated signal parameters must be paired. This pair matching adds computational load and usually fails at low signal-to-noise ratio (SNR). Trilinear decomposition-based 2D-DOA estimation for acoustic vector-sensor arrays is investigated in Zhang et al. (2012a); compared with ESPRIT, the trilinear decomposition algorithm requires no pair matching. The root-MUSIC algorithm is suitable only for uniformly spaced arrays, which limits its application. Miron et al. (2006) used a biquaternion formalism to model vector-sensor array signals and proposed a quaternion-based MUSIC algorithm for DOA estimation. The bi-quaternion MUSIC algorithm (Bihan et al. 2007) and the quad-quaternion MUSIC algorithm (Gong et al. 2008) were presented for DOA estimation with vector-sensor arrays. When quaternion-like algorithms are used for 2D-DOA estimation, they require 2D searches, which are computationally intensive and time-consuming. Liu et al. (2013) proposed a cross-correlation-based coherent 2D-DOA estimation algorithm for sparse acoustic vector-sensor arrays; the method does not require the computationally cumbersome eigen-decomposition of the array data into signal and noise subspaces.

The 2D-MUSIC algorithm is a possible implementation of 2D-DOA estimation for acoustic vector-sensor arrays. However, its 2D search leads to much higher computational complexity. In this paper, we propose a successive MUSIC algorithm for angle estimation in acoustic vector-sensor arrays. The proposed algorithm obtains initial estimates of the azimuth and elevation angles from the signal subspace, and then uses successive one-dimensional local searches to achieve joint estimation of the 2D-DOA. The proposed algorithm has the following advantages: (1) it obtains automatically paired two-dimensional angle estimates; (2) it requires only one-dimensional local searches, whereas the 2D-MUSIC algorithm needs a two-dimensional global search; (3) it has better DOA estimation performance than the ESPRIT method and the trilinear decomposition algorithm; (4) its angle estimation performance is close to that of the 2D-MUSIC algorithm; (5) it imposes less constraint on the sensor spacing, which does not have to be restricted to within half a wavelength.

Fig. 1 The structure of the array

The remainder of this paper is structured as follows. Section 2 develops the data model, and Sect. 3 presents the proposed algorithm. In Sect. 4, the estimation error and the Cramér-Rao bound (CRB) are derived. In Sect. 5, simulation results are presented to verify the improvement of the proposed algorithm, and conclusions are drawn in Sect. 6.

Notation

Bold lower (upper) case letters are adopted to represent vectors (matrices). \((.)^{*}\), \((.)^{T}\), \((.)^{H}\), \((.)^{-1}\), and \((.)^{+}\) denote the complex conjugation, transpose, conjugate-transpose, inverse and pseudo-inverse, respectively. \({\vert }{\vert }.{\vert }{\vert }_{F}\) stands for the Frobenius norm. \(diag(\mathbf{v})\) stands for the diagonal matrix whose diagonal is the vector \(\mathbf{v}\). \(\mathbf{I}_{P}\) and \(\mathbf{0}_{P\times Q} \) denote a \(P \times P\) identity matrix and a \(P\times Q\) matrix of zeros, respectively. \(\otimes ,\circ \), and \(\odot \) stand for the Kronecker product, Khatri–Rao product and Hadamard product, respectively. \(\hat{{\mathbf{A}}}\) is an estimate of \(\mathbf{A}\).

2 Data model

A total of \(K\) narrowband plane waves impinge on a linear array containing \(M\) acoustic vector sensors, as shown in Fig. 1. The reference acoustic vector sensor is located at the origin of coordinates, and the distance between the \(m\)th acoustic vector sensor and the reference element is \(d_m \left( {m=1,\ldots ,M} \right)\) with \(d_1 =0\). We consider far-field signals, for which the sources are far enough away that the arriving waves are essentially planar over the array. We assume that the noise is independent of the sources and is additive, independent and identically distributed (i.i.d.) Gaussian. The \(k\)th signal arrives from direction \((\phi _k ,\varphi _k )\), where \(\phi _k \) and \(\varphi _k \) stand for the azimuth angle and the elevation angle, respectively. Let \(\varvec{\uptheta }_k =[\phi _k ,\varphi _k ]^{T}\) denote the 2D-DOA of the \(k\)th source. In free space, the output of the acoustic vector sensor at \(d_m \) is given by

$$\begin{aligned} \mathbf{x}_m (t)=\sum _{k=1}^K {\left[ {{\begin{array}{c} 1 \\ {\mathbf{u}(\phi _k ,\varphi _k )} \\ \end{array} }} \right]e^{-i2\pi d_m \sin \varphi _k /\lambda }b_k (t)} +\mathbf{n}_m (t) \end{aligned}$$
(1)

where \(b_{k}(t)\) denotes the transmitted signal of the \(k\)th source, \(\mathbf{n}_m (t)\) is the received noise, and \(\mathbf{u}(\phi _k ,\varphi _k )\) is given by

$$\begin{aligned} \mathbf{u}(\phi _k ,\varphi _k )=\left[ {{\begin{array}{c} {\cos \phi _k \cos \varphi _k } \\ {\sin \phi _k \cos \varphi _k } \\ {\sin \varphi _k } \\ \end{array} }} \right] \end{aligned}$$
(2)

The output of the linear array containing \(M\) acoustic vector sensors is

$$\begin{aligned} \begin{array}{ll} \mathbf{x}(t)&= \left[ \begin{array}{c} {\mathbf{x}_1 (t)} \\ {\mathbf{x}_2 (t)} \\ \vdots \\ {\mathbf{x}_M (t)} \\ \end{array} \right]\\&= \mathop {\underbrace{\left[ \mathbf{a}(\varphi _1 )\otimes \mathbf{h}(\phi _1 ,\varphi _1 )\quad \mathbf{a}(\varphi _2 )\otimes \mathbf{h}(\phi _2 ,\varphi _2 )\quad \ldots \quad \mathbf{a}(\varphi _K )\otimes \mathbf{h}(\phi _K ,\varphi _K ) \right]}}\limits _{\varvec{\Psi }} \mathbf{b}(t)+\mathbf{n}(t) \\ \end{array} \end{aligned}$$
(3)

where \(\mathbf{b}(t)\) contains the \(K\) source signals, and \(\mathbf{n}(t)\) is the received additive white Gaussian noise (AWGN) vector with zero mean and covariance matrix \(\sigma ^{2}\mathbf{I}_{4M} \). \(\mathbf{a}\left( {\varphi _k } \right) = [ 1,\exp ( {-i2\pi d_2} \sin \varphi _k /\lambda ),\ldots ,\exp \left( {-i2\pi d_M \sin \varphi _k /\lambda } \right)]^{T}\), with \(\lambda \) being the wavelength, is the \(M \times 1\) steering vector for the \(k\)th signal of an acoustic pressure sensor array with the same geometry as the acoustic vector-sensor array. \(\mathbf{h}(\phi _k ,\varphi _k )=[1,\mathbf{u}(\phi _k ,\varphi _k )^T ]^{T}\) is the bearing vector of the \(k\)th source. \(\otimes \) stands for the Kronecker product. The matrix \({\varvec{\Psi }}\) can be written as

$$\begin{aligned} {\varvec{\Psi }} =\mathbf{A}\circ \mathbf{H}=\left[ {{\begin{array}{c} {\mathbf{H}D_1 (\mathbf{A})} \\ {\mathbf{H}D_2 (\mathbf{A})} \\ \vdots \\ {\mathbf{H}D_M (\mathbf{A})} \\ \end{array} }} \right] \end{aligned}$$
(4)

where \(D_{m}(.)\) extracts the \(m\)th row of its matrix argument and constructs a diagonal matrix from it. \(\mathbf{A}=[\mathbf{a}(\varphi _1 ),\mathbf{a}(\varphi _2 ),\ldots ,\mathbf{a}(\varphi _K )]\in C^{M\times K}\) and \(\mathbf{H}=[\mathbf{h}(\phi _1 ,\varphi _1 ),\mathbf{h}(\phi _2 ,\varphi _2 ),\ldots , \mathbf{h}(\phi _K ,\varphi _K )]\in C^{4\times K}\). \(\mathbf{A}\circ \mathbf{H}\) represents the Khatri–Rao product. There exists a transformation matrix \(\mathbf{C}\) corresponding to a finite number of row interchange operations such that

$$\begin{aligned} \mathbf{C}{\varvec{\Psi }}=\left[ {{\begin{array}{c} {\mathbf{A}D_1 (\mathbf{H})} \\ {\mathbf{A}D_2 (\mathbf{H})} \\ {\mathbf{A}D_3 (\mathbf{H})} \\ {\mathbf{A}D_4 (\mathbf{H})} \\ \end{array} }} \right]=\left[ {{\begin{array}{c} \mathbf{A} \\ {\mathbf{A}\varvec{\Phi }_x } \\ {\mathbf{A}\varvec{\Phi }_y } \\ {\mathbf{A}\varvec{\Phi }_z } \\ \end{array} }} \right] \end{aligned}$$
(5)

where \({\varvec{\Phi }}_x =diag(\cos \phi _1 \cos \varphi _1 ,\cos \phi _2 \cos \varphi _2 ,\ldots ,\cos \phi _K \cos \varphi _K ),{\varvec{\Phi }}_z =diag(\sin \varphi _1 , \sin \varphi _2 ,\ldots ,\sin \varphi _K )\), and \({\varvec{\Phi }}_y =diag(\sin \phi _1 \cos \varphi _1 ,\sin \phi _2 \cos \varphi _2 ,\ldots ,\sin \phi _K \cos \varphi _K ). \mathbf{C}\in C^{4M\times 4M}\) is

(6)
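The row interchange in (5) can be checked numerically. The following is a minimal NumPy sketch, assuming only the row ordering implied by (4) and (5); the helper names `row_interchange_matrix` and `khatri_rao`, the sizes, and the random test factors are illustrative and not part of the original derivation. It builds one permutation matrix that maps each column \(\mathbf{a}\otimes \mathbf{h}\) to \(\mathbf{h}\otimes \mathbf{a}\) and compares \(\mathbf{C}{\varvec{\Psi }}\) with the stacked blocks \(\mathbf{A}D_p (\mathbf{H})\).

```python
import numpy as np

def row_interchange_matrix(M, P=4):
    """Permutation C (PM x PM) that maps a kron h to h kron a, with a in C^M, h in C^P."""
    C = np.zeros((P * M, P * M))
    for m in range(M):          # sensor index
        for p in range(P):      # component index (pressure, then x, y, z velocity)
            C[p * M + m, m * P + p] = 1.0
    return C

def khatri_rao(A, H):
    """Column-wise Kronecker product: column k is A[:, k] kron H[:, k]."""
    return np.column_stack([np.kron(A[:, k], H[:, k]) for k in range(A.shape[1])])

# Illustrative sizes and random factors (assumptions, not values from the paper).
rng = np.random.default_rng(0)
M, K = 5, 3
A = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
H = rng.standard_normal((4, K))

C = row_interchange_matrix(M)
Psi = khatri_rao(A, H)                                # blocks H D_m(A), as in (4)
stacked = np.vstack([A * H[p] for p in range(4)])     # blocks A D_p(H), as in (5)
print(np.allclose(C @ Psi, stacked))
```

Because \(\mathbf{C}\) is a permutation matrix, it is orthogonal, so applying it to the data does not change the noise statistics.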

We collect \(L\) snapshots and define \(\mathbf{X}=[\mathbf{x}(1),\mathbf{x}(2),\ldots ,\mathbf{x}(L)]\), which can be written as

$$\begin{aligned} \mathbf{X}={\varvec{\Psi }}\mathbf{B}^{T}+\mathbf{N} \end{aligned}$$
(7)

where \(\mathbf{B}\in C^{L\times K}\) is the source matrix for \(L\) samples, \(\mathbf{N}\in C^{4M\times L}\) is the noise matrix. For the signal model in (7), the covariance matrix \(\mathbf{R}_{x}\) can be estimated with \(L\) snapshots by \({\hat{\mathbf{R}}}_x =\mathbf{XX}^{H}/L\). Using eigen-value decomposition, \({\hat{\mathbf{R}}}_x \) is denoted by

$$\begin{aligned} {\hat{\mathbf{R}}}_x =\mathbf{E}_s \mathbf{D}_s \mathbf{E}_s^H +\mathbf{E}_n \mathbf{D}_n \mathbf{E}_n^H \end{aligned}$$
(8)

where \(\mathbf{D}_s \) is a \(K \times K\) diagonal matrix whose diagonal elements are the \(K\) largest eigen-values and \(\mathbf{D}_n \) is a diagonal matrix whose diagonal entries are the \(4M -K\) smallest eigen-values. \(\mathbf{E}_s \) is the matrix composed of the eigen-vectors corresponding to the \(K\) largest eigen-values of \({\hat{\mathbf{R}}}_x \), while \(\mathbf{E}_n \) is the matrix composed of the remaining eigen-vectors. The columns of \(\mathbf{E}_s \) and \(\mathbf{E}_n\) span the signal subspace and the noise subspace, respectively.
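As an illustration of the data model in (1)-(8), the sketch below simulates snapshots for a uniform half-wavelength array and splits the sample covariance into signal and noise subspaces. The array size, angles, noise level, and the function names `steering_a`, `bearing_h` and `manifold` are assumptions chosen for the example, not values from the paper.

```python
import numpy as np

def steering_a(el, d, lam):
    """Pressure-array steering vector a(varphi) for sensor positions d, as in Eq. (10)."""
    return np.exp(-1j * 2 * np.pi * d * np.sin(el) / lam)

def bearing_h(az, el):
    """Bearing vector h(phi, varphi) = [1, u^T]^T, as in Eq. (11)."""
    return np.array([1.0, np.cos(az) * np.cos(el), np.sin(az) * np.cos(el), np.sin(el)])

def manifold(az, el, d, lam):
    """Psi = A o H: column k is a(varphi_k) kron h(phi_k, varphi_k), Eqs. (3)-(4)."""
    return np.column_stack(
        [np.kron(steering_a(e, d, lam), bearing_h(a, e)) for a, e in zip(az, el)]
    )

# Illustrative scenario (all values are assumptions).
rng = np.random.default_rng(0)
M, K, L, lam, sigma = 8, 2, 200, 1.0, 0.1
d = 0.5 * lam * np.arange(M)                 # sensor positions, d_1 = 0
az = np.deg2rad([10.0, 40.0])                # azimuth angles phi_k
el = np.deg2rad([15.0, 35.0])                # elevation angles varphi_k

Psi = manifold(az, el, d, lam)               # 4M x K
B = (rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))) / np.sqrt(2)
N = sigma * (rng.standard_normal((4 * M, L))
             + 1j * rng.standard_normal((4 * M, L))) / np.sqrt(2)
X = Psi @ B.T + N                            # Eq. (7)

R_hat = X @ X.conj().T / L                   # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(R_hat)     # eigenvalues in ascending order
E_n = eigvecs[:, : 4 * M - K]                # noise subspace (4M-K smallest eigenvalues)
E_s = eigvecs[:, 4 * M - K:]                 # signal subspace (K largest eigenvalues)
```

`numpy.linalg.eigh` is used because the sample covariance matrix is Hermitian; it returns real eigenvalues in ascending order, which makes the split into the two subspaces a simple column slice.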

3 2D-DOA estimation for acoustic vector-sensor array

3.1 2D-MUSIC algorithm

We construct the 2D-MUSIC spatial spectrum function in this form

$$\begin{aligned} f_{2dmusic} ({\varvec{\uptheta } })=\frac{1}{[\mathbf{a}(\varphi )\otimes \mathbf{h}(\phi ,\varphi )]^{H}\mathbf{E}_n \mathbf{E}_n^H [\mathbf{a}(\varphi )\otimes \mathbf{h}(\phi ,\varphi )]} \end{aligned}$$
(9)

where

$$\begin{aligned} \mathbf{a}\left( \varphi \right)&= \left[ {1,\exp \left( {-i2\pi d_2 \sin \varphi /\lambda } \right),\ldots ,\exp \left( {-i2\pi d_M \sin \varphi /\lambda } \right)} \right]^{T},\end{aligned}$$
(10)
$$\begin{aligned} \mathbf{h}(\phi ,\varphi )&= [1,\cos \phi \cos \varphi ,\sin \phi \cos \varphi ,\sin \varphi ]^{T}. \end{aligned}$$
(11)

We take the \(K\) largest peaks of \(f_{2dmusic} ({\varvec{\uptheta } })\) as the estimates of the DOAs of the sources. Since 2D-MUSIC requires an exhaustive 2D search, the approach is normally inefficient due to its high computational cost. In the following subsection, we present another MUSIC algorithm, which estimates the DOAs using only one-dimensional local searches.
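A minimal sketch of the exhaustive search in (9) is given below. To keep the result easy to verify, it uses the exact covariance matrix of two unit-power uncorrelated sources rather than sampled data, a 1 degree grid that contains the true angles, and a simple neighbourhood-masking peak picker; these choices, and the names `steer`, `music_2d` and `top_k_peaks`, are assumptions made for illustration.

```python
import numpy as np

def steer(az, el, d, lam):
    """a(varphi) kron h(phi, varphi), the vector inside Eq. (9)."""
    a = np.exp(-1j * 2 * np.pi * d * np.sin(el) / lam)
    h = np.array([1.0, np.cos(az) * np.cos(el), np.sin(az) * np.cos(el), np.sin(el)])
    return np.kron(a, h)

def music_2d(E_n, az_grid, el_grid, d, lam, floor=1e-12):
    """Evaluate the 2D-MUSIC spectrum of Eq. (9) on a rectangular angle grid."""
    P = np.empty((az_grid.size, el_grid.size))
    for i, az in enumerate(az_grid):
        for j, el in enumerate(el_grid):
            v = E_n.conj().T @ steer(az, el, d, lam)
            P[i, j] = 1.0 / max(np.vdot(v, v).real, floor)  # floor avoids division by zero
    return P

def top_k_peaks(P, K, guard=3):
    """Pick K maxima, masking a (2*guard+1)^2 neighbourhood after each pick."""
    P = P.copy()
    peaks = []
    for _ in range(K):
        i, j = np.unravel_index(np.argmax(P), P.shape)
        peaks.append((int(i), int(j)))
        P[max(i - guard, 0): i + guard + 1, max(j - guard, 0): j + guard + 1] = -np.inf
    return peaks

# Illustrative scenario (assumed values): exact covariance, unit-power sources.
M, K, lam, sigma2 = 8, 2, 1.0, 0.01
d = 0.5 * lam * np.arange(M)
az_true = np.deg2rad([10.0, 40.0])
el_true = np.deg2rad([15.0, 35.0])
Psi = np.column_stack([steer(a, e, d, lam) for a, e in zip(az_true, el_true)])
R = Psi @ Psi.conj().T + sigma2 * np.eye(4 * M)
_, V = np.linalg.eigh(R)
E_n = V[:, : 4 * M - K]

az_grid = np.deg2rad(np.arange(0.0, 91.0))   # 1 degree steps
el_grid = np.deg2rad(np.arange(0.0, 91.0))
P = music_2d(E_n, az_grid, el_grid, d, lam)
peaks_deg = sorted(
    (float(np.rad2deg(az_grid[i])), float(np.rad2deg(el_grid[j])))
    for i, j in top_k_peaks(P, K)
)
print(peaks_deg)
```

The double loop evaluates the spectrum at every grid point, so the cost grows with the product of the two grid sizes; this is the burden that the successive algorithm of Sect. 3.2 is designed to avoid.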

3.2 Successive MUSIC algorithm for 2D-DOA estimation

He et al. (2011) used a successive MUSIC algorithm for angle estimation in multiple-input multiple-output radar. In this paper, we extend the idea to parameter estimation for acoustic vector-sensor arrays. In the noise-free case,

$$\begin{aligned} \mathbf{E}_s ={\varvec{\Psi }}\mathbf{T} \end{aligned}$$
(12)

where \(\mathbf{T}\) is a \(K\times K\) full-rank matrix. We form the matrix \(\mathbf{E}_c \mathop {=}\limits ^\Delta \mathbf{CE}_s\) and partition it as

$$\begin{aligned} \mathbf{E}_c =\left[ {{\begin{array}{c} {\mathbf{E}_{c1} } \\ {\mathbf{E}_{c2} } \\ {\mathbf{E}_{c3} } \\ {\mathbf{E}_{c4} } \\ \end{array} }} \right] \end{aligned}$$
(13)

where \(\mathbf{E}_{cm} \in C^{M\times K} (m=1, 2, 3, 4)\). In the noise-free case,

$$\begin{aligned} \mathbf{E}_c =\left[ {{\begin{array}{c} {\mathbf{E}_{c1} } \\ {\mathbf{E}_{c2} } \\ {\mathbf{E}_{c3} } \\ {\mathbf{E}_{c4} } \\ \end{array} }} \right]=\left[ {{\begin{array}{c} {\mathbf{A}D_1 (\mathbf{H})} \\ {\mathbf{A}D_2 (\mathbf{H})} \\ {\mathbf{A}D_3 (\mathbf{H})} \\ {\mathbf{A}D_4 (\mathbf{H})} \\ \end{array} }} \right]\mathbf{T}=\left[ {{\begin{array}{c} \mathbf{A} \\ {\mathbf{A}\varvec{\Phi }_x } \\ {\mathbf{A}\varvec{\Phi }_y } \\ {\mathbf{A}\varvec{\Phi }_z } \\ \end{array} }} \right]\mathbf{T} \end{aligned}$$
(14)

And we get \(\mathbf{E}_{c1} =\mathbf{AT}, \mathbf{E}_{c4} =\mathbf{A}\varvec{\Phi }_z \mathbf{T}\), and then

$$\begin{aligned} \begin{array}{ll} \mathbf{E}_{c4}&=\mathbf{ATT}^{-1}{\varvec{\Phi }}_z \mathbf{T} \\&=\mathbf{E}_{c1} \mathbf{T}^{-1}{\varvec{\Phi }}_z \mathbf{T} \\ \end{array} \end{aligned}$$
(15)

According to (15), we have

$$\begin{aligned} \mathbf{E}_{c1} {^{+}}\mathbf{E}_{c4} =\mathbf{T}^{-1}{\varvec{\Phi }}_z \mathbf{T} \end{aligned}$$
(16)

Using (16), we obtain \(\mathbf{E}_{c1} {^{+}}\mathbf{E}_{c4} \mathbf{T}^{-1}=\mathbf{T}^{-1}{\varvec{\Phi }}_z \). The diagonal elements of \({\varvec{\Phi }}_z \) are therefore the eigen-values of \(\mathbf{E}_{c1} {^{+}}\mathbf{E}_{c4} \), and the corresponding eigen-vectors can be used as estimates of the columns of \(\mathbf{T}^{-1}\). Let \({\hat{\mathbf{T}}}\) denote the estimate of \(\mathbf{T}\). In the noise-free case, \({\hat{\mathbf{T}}}={\varvec{\Pi } }\mathbf{T}\), where \({\varvec{\Pi } }\) is a column permutation matrix. The estimate of \({\varvec{\Phi }}_z \) is \({\hat{{\varvec{\Phi } }}}_z ={\varvec{\Pi } }^{-1}{\varvec{\Phi }}_z {\varvec{\Pi } }\). The initial estimates of the elevation angles are then

$$\begin{aligned} \hat{{\varphi }}_k^{ini} =\sin ^{-1}(z_k ),\quad k=1,\ldots ,K \end{aligned}$$
(17)

where \(z_k \) is the \(k\)th eigen-value of \(\mathbf{E}_{c1} {^{+}}\mathbf{E}_{c4} \). We form the following matrix,

$$\begin{aligned} \mathbf{E}_d =\mathbf{E}_c {\hat{\mathbf{T}}}^{-1}=\left[ {{\begin{array}{c} {\mathbf{A}D_1 (\mathbf{H})} \\ {\mathbf{A}D_2 (\mathbf{H})} \\ {\mathbf{A}D_3 (\mathbf{H})} \\ {\mathbf{A}D_4 (\mathbf{H})} \\ \end{array} }} \right]{\varvec{\Pi } } \end{aligned}$$
(18)

We partition the matrix \(\mathbf{E}_d \) as

$$\begin{aligned} \mathbf{E}_d =\left[ {{\begin{array}{c} {\mathbf{E}_{d1} } \\ {\mathbf{E}_{d2} } \\ {\mathbf{E}_{d3} } \\ {\mathbf{E}_{d4} } \\ \end{array} }} \right] \end{aligned}$$
(19)

where \(\mathbf{E}_{dm} \in C^{M\times K} (m=1, 2, 3, 4)\). And we get

$$\begin{aligned} \mathbf{E}_{d2}&= \mathbf{E}_{d1} {\varvec{\Pi } }^{-1}{\varvec{\Phi }}_x {\varvec{\Pi } }\\ \mathbf{E}_{d3}&= \mathbf{E}_{d1} {\varvec{\Pi } }^{-1}{\varvec{\Phi }}_y {\varvec{\Pi } } \end{aligned}$$

Let \(x_k \) and \(y_k \) be the \(k\)th diagonal elements of \(\mathbf{E}_{d1}^{+} \mathbf{E}_{d2} \) and \(\mathbf{E}_{d1}^{+} \mathbf{E}_{d3} \), respectively. The initial estimates of the azimuth angles are then obtained through

$$\begin{aligned} \hat{{\phi }}_k ^{ini}=angle\left( {x_k +iy_k } \right),\quad k={1},\ldots ,K \end{aligned}$$
(20)

The initial estimates of the azimuth and elevation angles are thus obtained, and they are automatically paired.
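The steps in (12)-(20) can be sketched as follows. This is a minimal illustration under stated assumptions: it works on the exact (noise-free signal plus white noise) covariance so that the recovered angles can be compared with the true ones, it reorders rows with a reshape that has the same effect as multiplying by \(\mathbf{C}\), and it takes real parts of the eigen-values and diagonal entries, which are real in the noise-free case. The function names and scenario values are hypothetical.

```python
import numpy as np

def manifold(az, el, d, lam):
    """Columns a(varphi_k) kron h(phi_k, varphi_k), as in Eq. (3)."""
    cols = []
    for a_k, e_k in zip(az, el):
        a = np.exp(-1j * 2 * np.pi * d * np.sin(e_k) / lam)
        h = np.array([1.0, np.cos(a_k) * np.cos(e_k),
                      np.sin(a_k) * np.cos(e_k), np.sin(e_k)])
        cols.append(np.kron(a, h))
    return np.column_stack(cols)

def initial_estimates(E_s, M):
    """Paired initial azimuth/elevation estimates from the signal subspace."""
    K = E_s.shape[1]
    # Row reordering with the same effect as E_c = C E_s:
    # rows grouped by component (pressure, x, y, z), then by sensor.
    E_c = E_s.reshape(M, 4, K).transpose(1, 0, 2).reshape(4 * M, K)
    E_c1, E_c4 = E_c[:M], E_c[3 * M:]
    # Eq. (16): eigenvalues give sin(varphi_k), eigenvectors estimate T^{-1}.
    z, T_inv = np.linalg.eig(np.linalg.pinv(E_c1) @ E_c4)
    el0 = np.arcsin(np.clip(z.real, -1.0, 1.0))          # Eq. (17)
    E_d = E_c @ T_inv                                     # Eq. (18)
    E_d1, E_d2, E_d3 = E_d[:M], E_d[M:2 * M], E_d[2 * M:3 * M]
    x = np.diag(np.linalg.pinv(E_d1) @ E_d2).real
    y = np.diag(np.linalg.pinv(E_d1) @ E_d3).real
    az0 = np.arctan2(y, x)                                # angle(x_k + i y_k), Eq. (20)
    return az0, el0

# Illustrative scenario (assumed values).
M, K, lam, sigma2 = 8, 2, 1.0, 0.01
d = 0.5 * lam * np.arange(M)
az_true = np.deg2rad([10.0, 40.0])
el_true = np.deg2rad([15.0, 35.0])
Psi = manifold(az_true, el_true, d, lam)
R = Psi @ Psi.conj().T + sigma2 * np.eye(4 * M)
_, V = np.linalg.eigh(R)
E_s = V[:, 4 * M - K:]

az0, el0 = initial_estimates(E_s, M)
order = np.argsort(el0)
print(np.rad2deg(az0[order]), np.rad2deg(el0[order]))
```

The pairing is automatic because the \(k\)th eigen-value and the \(k\)th column of the eigen-vector matrix belong to the same source, so `az0[k]` and `el0[k]` always refer to one source regardless of the order returned by the eigen-solver.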

Using the spatial spectrum of one-dimensional (1D) MUSIC shown in (21), the azimuth angles can be estimated

$$\begin{aligned}&\hat{{\phi }}_k =\mathop {\arg \max }\limits _{\phi \in [\hat{{\phi }}_k ^{ini}-\Delta \phi ,\hat{{\phi }}_k ^{ini}+\Delta \phi ]} \frac{1}{[\mathbf{a}(\hat{{\varphi }}_k^{ini} )\otimes \mathbf{h}(\phi ,\hat{{\varphi }}_k^{ini} )]^{H}\mathbf{E}_n \mathbf{E}_n^H [\mathbf{a}(\hat{{\varphi }}_k^{ini} )\otimes \mathbf{h}(\phi ,\hat{{\varphi }}_k^{ini} )]},\quad \nonumber \\&\quad k={1},{2},\ldots ,K\nonumber \\ \end{aligned}$$
(21)

By searching \(\phi \) locally within \(\phi \in [\hat{{\phi }}_k^{ini} -\Delta \phi ,\hat{{\phi }}_k^{ini} +\Delta \phi ]\), where \(\Delta \phi \) is a small value, we obtain a more accurate estimate \(\hat{{\phi }}_k \) of the azimuth angle.

Then \(\varphi _k \) can be estimated via (22) by locally searching \(\varphi \) within \([\hat{{\varphi }}_k ^{ini}-\Delta \varphi ,\hat{{\varphi }}_k ^{ini}+\Delta \varphi ]\), where \(\Delta \varphi \) is a small value,

$$\begin{aligned} \hat{{\varphi }}_k =\mathop {\arg \max }\limits _{\varphi \in [\hat{{\varphi }}_k ^{ini}-\Delta \varphi ,\hat{{\varphi }}_k ^{ini}+\Delta \varphi ]} \frac{1}{[\mathbf{a}(\varphi )\otimes \mathbf{h}(\hat{{\phi }}_k ,\varphi )]^{H}\mathbf{E}_n \mathbf{E}_n^H [\mathbf{a}(\varphi )\otimes \mathbf{h}(\hat{{\phi }}_k ,\varphi )]},\quad k={1},{2},\ldots ,K\nonumber \\ \end{aligned}$$
(22)

where \(\hat{{\phi }}_k \) is the estimate of \(\phi _k \) obtained via (21).
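The two local searches in (21) and (22) can be sketched as below. The sketch minimizes the denominator of the MUSIC spectrum, which is equivalent to maximizing the spectrum itself and avoids dividing by very small numbers. The grid density, the search half-width of 3 degrees, and the function names are assumptions; in the demonstration the azimuth start is offset by 1 degree while the elevation start is exact, purely so that the outcome is easy to check against the true angles.

```python
import numpy as np

def steer(az, el, d, lam):
    """a(varphi) kron h(phi, varphi)."""
    a = np.exp(-1j * 2 * np.pi * d * np.sin(el) / lam)
    h = np.array([1.0, np.cos(az) * np.cos(el), np.sin(az) * np.cos(el), np.sin(el)])
    return np.kron(a, h)

def music_denominator(E_n, az, el, d, lam):
    """Denominator of the MUSIC spectrum; small values correspond to spectrum peaks."""
    v = E_n.conj().T @ steer(az, el, d, lam)
    return np.vdot(v, v).real

def refine(E_n, az0, el0, d, lam, half_width, n_steps=601):
    """Successive 1D local searches: azimuth first (Eq. 21), then elevation (Eq. 22)."""
    az_grid = np.linspace(az0 - half_width, az0 + half_width, n_steps)
    az_hat = az_grid[np.argmin([music_denominator(E_n, az, el0, d, lam) for az in az_grid])]
    el_grid = np.linspace(el0 - half_width, el0 + half_width, n_steps)
    el_hat = el_grid[np.argmin([music_denominator(E_n, az_hat, el, d, lam) for el in el_grid])]
    return az_hat, el_hat

# Illustrative scenario (assumed values): exact covariance, unit-power sources.
M, K, lam, sigma2 = 8, 2, 1.0, 0.01
d = 0.5 * lam * np.arange(M)
az_true = np.deg2rad([10.0, 40.0])
el_true = np.deg2rad([15.0, 35.0])
Psi = np.column_stack([steer(a, e, d, lam) for a, e in zip(az_true, el_true)])
R = Psi @ Psi.conj().T + sigma2 * np.eye(4 * M)
_, V = np.linalg.eigh(R)
E_n = V[:, : 4 * M - K]

half_width = np.deg2rad(3.0)
refined = [
    refine(E_n, az_true[k] + np.deg2rad(1.0), el_true[k], d, lam, half_width)
    for k in range(K)
]
print(np.rad2deg(refined))
```

Each source needs two searches of `n_steps` points, so the total number of spectrum evaluations is \(2Kn_1\) with \(n_1\) the number of steps per local search, compared with a full two-dimensional grid for 2D-MUSIC.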

This completes the proposed algorithm for 2D-DOA estimation with an acoustic vector-sensor array. Its major steps are as follows:

  1. Estimate the covariance matrix of the received data through \({\hat{\mathbf{R}}}_x =\frac{1}{L}\sum \nolimits _{t=1}^L {\mathbf{x}\left( t \right)\mathbf{x}^{H}\left( t \right)} \).

  2. Perform eigen-value decomposition of \({\hat{\mathbf{R}}}_x \) to get \({\hat{\mathbf{E}}}_s \) and \({\hat{\mathbf{E}}}_N \), form the matrix \({\hat{\mathbf{E}}}_c \), partition \({\hat{\mathbf{E}}}_c \) to get \({\hat{\mathbf{E}}}_{c1} ,{\hat{\mathbf{E}}}_{c2} ,{\hat{\mathbf{E}}}_{c3} ,{\hat{\mathbf{E}}}_{c4} \), and obtain the initial estimate of the elevation angle \(\hat{{\varphi }}_k ^{ini}\) from the eigen-values of \({\hat{\mathbf{E}}}_{c1}^{+} {\hat{\mathbf{E}}}_{c4} \) and \({\hat{\mathbf{T}}}^{-1}\) from the corresponding eigen-vectors.

  3. Compute \({\hat{\mathbf{E}}}_d \), and get the initial estimate of the azimuth angle \(\hat{{\phi }}_k ^{ini}\).

  4. Obtain the estimate of the azimuth angle \(\hat{{\phi }}_k \) through 1D-MUSIC via (21) while keeping \(\hat{{\varphi }}_k ^{ini}\) fixed.

  5. Obtain the estimate of the elevation angle \(\hat{{\varphi }}_k \) through 1D-MUSIC via (22) while keeping \(\hat{{\phi }}_k \) fixed.

Remark 1

For angle searching within \([\hat{{\varphi }}_k ^{ini}-\Delta \varphi ,\hat{{\varphi }}_k ^{ini}+\Delta \varphi ]\), the inter-element spacing of the sensor array does not have to be restricted within a half-wavelength. In order to eliminate angle ambiguity, it is required that \(2\pi d\sin \Delta \varphi /\lambda \le \pi \), and then we obtain \(\Delta \varphi \le \pi \lambda /4d\), and \(d\le \pi \lambda /4\Delta \varphi \). The maximum spacing between adjacent receive elements is \(d_{r\max } =\lambda \pi /(4\Delta \varphi )\).
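As a quick numerical reading of Remark 1, the short sketch below evaluates the stated bound \(d_{r\max } =\lambda \pi /(4\Delta \varphi )\) with \(\Delta \varphi \) in radians. The function name `max_spacing` and the 5 degree search half-width are assumptions chosen for illustration.

```python
import numpy as np

def max_spacing(lam, delta_el_rad):
    """Largest spacing d = pi * lambda / (4 * delta_varphi) stated in Remark 1."""
    return np.pi * lam / (4.0 * delta_el_rad)

# Spacing bound in units of the wavelength for a 5 degree local search half-width.
d_max = max_spacing(lam=1.0, delta_el_rad=np.deg2rad(5.0))
print(d_max)
```

For a half-width of 5 degrees the bound evaluates to 9 wavelengths, which is far larger than the usual half-wavelength limit; a narrower local search permits an even larger spacing.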

Remark 2

We sort the initial estimation of angle \(\hat{{\varphi }}_k^{ini} \left( {k={1},{2},\ldots ,K} \right)\), and assume that \(\hat{{\varphi }}_i^{ini} \) and \(\hat{{\varphi }}_j^{ini} \) are adjacent. The local searching range \(\Delta \varphi \) is

$$\begin{aligned} \left\{ {{\begin{array}{c} {\Delta \varphi =\frac{1}{2}\left| {\hat{{\varphi }}_i^{ini} -\hat{{\varphi }}_j^{ini} } \right|,\quad \Delta \varphi \le 5^{\circ }} \\ {\Delta \varphi =5^{\circ },\quad \Delta \varphi >5^{\circ }} \\ \end{array} }} \right.\!\!\!. \end{aligned}$$

\(\Delta \phi \) is obtained in the same way.
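One way to read Remark 2 is sketched below: each source gets a search half-width equal to half the distance to its nearest neighbouring initial estimate, capped at 5 degrees. Applying the rule per source to the nearest neighbour, and using the cap when there is only one source, are interpretations assumed here; the function name is hypothetical.

```python
import numpy as np

def local_search_halfwidths(est_deg, cap_deg=5.0):
    """Half-width per source: half the gap to the nearest other estimate, capped."""
    est = np.asarray(est_deg, dtype=float)
    if est.size < 2:
        return np.full(est.shape, cap_deg)      # single source: use the cap
    gaps = np.abs(est[:, None] - est[None, :])  # pairwise distances between estimates
    np.fill_diagonal(gaps, np.inf)              # ignore the distance of a source to itself
    return np.minimum(0.5 * gaps.min(axis=1), cap_deg)

halfwidths = local_search_halfwidths([10.0, 14.0, 40.0])
print(halfwidths)
```

With this choice the search intervals of neighbouring sources do not overlap, so each local search contains a single candidate peak.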

3.3 Complexity analysis

The proposed algorithm has much lower computational complexity than the 2D-MUSIC algorithm. The major computational complexity of the proposed algorithm is \(O (16M^{2}L+64M^{3}+2K^{2}M+3K^{3}+2Kn_{1} (16M^{2}+4M-4MK-K))\), where \(n_{1}\) is the number of steps within the local searching range, while the 2D-MUSIC algorithm requires \(O (16M^{2}L+64M^{3}+n^{2} (16M^{2}+4M-4MK-K))\), where \(n\) is the number of steps within the global searching range, and \(n\gg n_{1}\). The proposed algorithm has higher computational complexity than the ESPRIT algorithm, which needs \(O(16M^{2}L+64M^{3}+2K^{2}M+3K^{3})\). In the trilinear decomposition algorithm, the complexity of each iteration is \(O(3K^{3}+12MLK+K^{2} (4M+4L+ML+4+M+L))\) (Zhang et al. 2012a), while the number of iterations depends on the trilinear model.
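To give a feel for the size of these expressions, the sketch below evaluates the three operation counts for one set of parameters. The parameter values (\(M=8\), \(K=2\), \(L=200\), \(n_1 =100\), \(n=900\)) and the function names are assumptions for illustration only; the counts are the leading terms quoted above, not measured run times.

```python
def per_point(M, K):
    """Cost of one spectrum evaluation: 16M^2 + 4M - 4MK - K."""
    return 16 * M**2 + 4 * M - 4 * M * K - K

def cost_esprit(M, K, L):
    """Covariance estimation, EVD, and the K x K eigen-problems."""
    return 16 * M**2 * L + 64 * M**3 + 2 * K**2 * M + 3 * K**3

def cost_proposed(M, K, L, n1):
    """ESPRIT-like initialization plus 2K local searches of n1 steps each."""
    return cost_esprit(M, K, L) + 2 * K * n1 * per_point(M, K)

def cost_music_2d(M, K, L, n):
    """Covariance estimation, EVD, and an n x n global grid search."""
    return 16 * M**2 * L + 64 * M**3 + n**2 * per_point(M, K)

M, K, L, n1, n = 8, 2, 200, 100, 900   # illustrative values
print(cost_esprit(M, K, L), cost_proposed(M, K, L, n1), cost_music_2d(M, K, L, n))
```

For these values the count for the proposed algorithm is 633656, between that of ESPRIT (237656) and that of 2D-MUSIC, whose grid term dominates because it grows with \(n^{2}\).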

3.4 Advantages of the proposed algorithm

The proposed algorithm has the following advantages.

  (1) The proposed algorithm obtains automatically paired two-dimensional angle estimates, because the initial estimates of the azimuth and elevation angles are automatically paired.

  (2) The proposed algorithm requires only one-dimensional local searches, whereas the 2D-MUSIC algorithm needs a two-dimensional global search.

  (3) The proposed algorithm has better DOA estimation performance than the PM algorithm, the ESPRIT method and the trilinear decomposition algorithm, as will be shown in Sect. 5.

  (4) The angle estimation performance of the proposed algorithm is close to that of the 2D-MUSIC algorithm, as will be shown in Sect. 5.

  (5) The proposed algorithm imposes less constraint on the sensor spacing, which does not have to be restricted to within half a wavelength. The reason is given in Remark 1.

4 Performance analysis

This section aims at analyzing estimation performance of the proposed algorithm. We establish the large-sample mean-square error (MSE) of 2D-DOA estimation of the proposed algorithm, and derive the CRB of 2D-DOA estimation.

4.1 Error analysis

For initial estimations of the elevation angles, we use eigen-value decomposition of \({\hat{\mathbf{R}}}_c =\sum \nolimits _{t=1}^L {\mathbf{Cx}\left( t \right)\left( {\mathbf{Cx}\left( t \right)} \right)^{H}} \), which is denoted by

$$\begin{aligned} {\hat{\mathbf{R}}}_c ={\hat{\mathbf{E}}\hat{\varvec{\Lambda }}\hat{\mathbf{E}}}^{H} \end{aligned}$$
(23)

where \({\hat{\mathbf{E}}}=\left[ {{\hat{\mathbf{S}}}_1 ,{\hat{\mathbf{S}}}_2 ,\ldots ,{\hat{\mathbf{S}}}_{4M}} \right], {\hat{\varvec{\Lambda }}} = diag\left( {\hat{{\lambda }}_1 ,\hat{{\lambda }}_2 ,\ldots ,\hat{{\lambda }}_{4M}} \right)\), with \({\hat{\mathbf{S}}}_i \) being the estimated eigen-vector and \(\hat{{\lambda }}_i \) the estimated eigen-value. Let \({\hat{\mathbf{S}}}_i =\mathbf{S}_i +{\varvec{\upeta } }_i \), where \(\mathbf{S}_i \) and \({\varvec{\upeta } }_i \) are perfect eigen-vector and estimation error vector, respectively. \(\hat{{\lambda }}_i =\lambda _i +\xi _i \), where \(\lambda _i\) and \(\xi _i\) are perfect eigen-value and estimation error, respectively. We have

$$\begin{aligned}&E\left[ {{\varvec{\upeta }}_i {\varvec{\upeta }}_j ^{H}} \right]\approx \frac{\lambda _i }{L}\mathop {\mathop {\sum }\limits _{l=1}}\limits _{l\ne i}^{4M} {\frac{\lambda _l }{\left( {\lambda _i -\lambda _l } \right)^{2}}\mathbf{S}_l \mathbf{S}_l ^{H}} \delta _{ij} ,\quad i,j=1,\ldots K\end{aligned}$$
(24)
$$\begin{aligned}&E\left[ {{\varvec{\upeta }}_i {\varvec{\upeta }}_j ^{T}} \right]\approx -\frac{\lambda _j \lambda _i }{L\left( {\lambda _i -\lambda _j } \right)^{2}}\mathbf{S}_j \mathbf{S}_i ^{T}\left( {1-\delta _{ij} } \right)\quad i,j=1,\ldots K \end{aligned}$$
(25)

The matrix \({\hat{\mathbf{E}}}\) in (23) can be partitioned as \({\hat{\mathbf{E}}}=[{\hat{\mathbf{E}}}_c ,{\hat{\mathbf{E}}}_N ]\), where \({\hat{\mathbf{E}}}_c \in C^{4M\times K}\) is the estimate of the signal subspace \(\mathbf{E}_c \), and \({\hat{\mathbf{E}}}_N \in C^{4M\times (4M-K)}\) is the estimate of the noise subspace \(\mathbf{E}_N \). Let the \(i\)th diagonal element of \({\varvec{\Phi }}_z \) be \(z_i,i=1,\ldots ,K\), and define \({\varvec{\Psi }}_z \mathop {=}\limits ^\Delta {\hat{\mathbf{E}}}_{c1} {^{+}}{\hat{\mathbf{E}}}_{c4} =\mathbf{T}^{-1}{\varvec{\Phi }}_z \mathbf{T}\); then \(z_i\) is the \(i\)th eigen-value of \({\varvec{\Psi }}_z \). The estimation error of \(z_i \) is

$$\begin{aligned} \Delta z_i =\mathbf{q}_i \Delta {\varvec{\Psi }}_{z} \mathbf{x}_i \end{aligned}$$
(26)

where \(\mathbf{x}_i \) and \(\mathbf{q}_i \) are the right eigen-vector and left eigen-vector corresponding to \(z_i \), respectively, i.e., \({\varvec{\Psi }}_z \mathbf{x}_i =z_i \mathbf{x}_i ,\mathbf{q}_i {\varvec{\Psi }}_z =z_i \mathbf{q}_i \), and \(\mathbf{q}_i \mathbf{x}_i =1\). Since \(\left( {\mathbf{E}_{c1} +\Delta \mathbf{E}_{c1} } \right)\left( {{\varvec{\Psi }}_z +\Delta {\varvec{\Psi }}_z } \right)\approx \mathbf{E}_{c4} +\Delta \mathbf{E}_{c4} \), \(\Delta {\varvec{\Psi }}_z \) can be written approximately as

$$\begin{aligned} \Delta {\varvec{\Psi }}_z \approx \mathbf{E}_{c1}^{+} \Delta \mathbf{E}_{c4} -\mathbf{E}_{c1}^{+} \Delta \mathbf{E}_{c1} {\varvec{\Psi }}_z \end{aligned}$$
(27)

According to (26) and (27), we get

$$\begin{aligned} \Delta z_i =\mathbf{q}_i \mathbf{E}_{c1}^{+} \left[ {\Delta \mathbf{E}_{c4} \mathbf{x}_i -\Delta \mathbf{E}_{c1} {\varvec{\Psi }}_z \mathbf{x}_i } \right]=\mathbf{q}_i \mathbf{E}_{c1}^{+} \left( {\mathbf{W}^{{\prime }{\prime }}-z_i \mathbf{W}{^\prime }} \right)\Delta \mathbf{E}_c \mathbf{x}_i \end{aligned}$$
(28)

where \(\mathbf{{W}^{\prime }}=\left[ {\mathbf{I}_{M} ,\mathbf{0}_{M\times 3M} } \right]\) and \(\mathbf{{W}^{\prime \prime }}=\left[ {\mathbf{0}_{M\times 3M} ,\mathbf{I}_{M} } \right]\) select the first and last \(M\) rows of \(\mathbf{E}_c \), respectively.
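The first-order eigen-value perturbation in (26) can be checked numerically. The sketch below builds a small matrix \({\varvec{\Psi }}_z =\mathbf{T}^{-1}{\varvec{\Phi }}_z \mathbf{T}\) with known eigen-values, adds a small random perturbation, and compares the actual eigen-value shifts with \(\mathbf{q}_i \Delta {\varvec{\Psi }}_z \mathbf{x}_i \). The particular \(\mathbf{T}\), the eigen-values, and the perturbation scale are arbitrary assumptions; the left eigen-vectors are taken as the rows of the inverse of the right eigen-vector matrix, which enforces \(\mathbf{q}_i \mathbf{x}_i =1\).

```python
import numpy as np

# Arbitrary well-conditioned T and distinct eigenvalues (illustrative assumptions).
z = np.array([0.2, 0.5, 0.8])
T = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5j],
              [0.3, 0.0, 1.0]], dtype=complex)
Psi_z = np.linalg.inv(T) @ np.diag(z) @ T

rng = np.random.default_rng(1)
K = z.size
dPsi = 1e-7 * (rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K)))

w, X = np.linalg.eig(Psi_z)          # right eigenvectors are the columns of X
Q = np.linalg.inv(X)                 # rows are left eigenvectors with q_i x_i = 1

# First-order prediction of Eq. (26) for each eigenvalue.
predicted = np.array([Q[i] @ dPsi @ X[:, i] for i in range(K)])

# Actual shifts: match each perturbed eigenvalue to the nearest unperturbed one.
w_pert = np.linalg.eigvals(Psi_z + dPsi)
actual = np.array([w_pert[np.argmin(np.abs(w_pert - wi))] - wi for wi in w])

print(np.max(np.abs(actual - predicted)))
```

The residual between the two is second order in the perturbation, so it shrinks quadratically as the perturbation scale is reduced, which is what justifies using (26) in a large-sample analysis.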

The mean squared error of \(z_i \) is given by

$$\begin{aligned} E\left[ {\left| {\Delta z_i } \right|^{2}} \right]=\mathbf{q}_i \mathbf{E}_{c1}^{+} \left( {\mathbf{{W}^{\prime \prime }}-z_i \mathbf{{W}}^{\prime }} \right)E\left[ {\Delta \mathbf{E}_c \mathbf{x}_i \mathbf{x}_i^H \Delta \mathbf{E}_c ^{H}} \right]\left( {\mathbf{{W}^{\prime \prime }}-z_i \mathbf{{W}}^{\prime }} \right)^{H}\left( {\mathbf{E}_{c1}^{+} } \right)^{H}\mathbf{q}_i^H \end{aligned}$$
(29)

Combining (24) and (29), we get

$$\begin{aligned} E\left[ {\left| {\Delta z_i } \right|^{2}} \right]&= \mathbf{q}_i \mathbf{E}_{c1}^{+} \left( {\sum _{j=1}^K {\left| {\mathbf{x}_{ij} } \right|^{2}\mathbf{F}_i E\left[ {{\varvec{\upeta }}_j {\varvec{\upeta }}_j^H } \right]\mathbf{F}_i ^{H}} } \right)\left( {\mathbf{q}_i \mathbf{E}_{c1}^{+} } \right)^{H} \nonumber \\&= \mathbf{q}_i \mathbf{E}_{c1}^{+} \mathbf{F}_i \left[ {\sum _{j=1}^K {\left| {\mathbf{x}_{ij} } \right|^{2}\frac{\lambda _j }{L}\mathop {\mathop {\sum }\limits _{k=1}}\limits _{k\ne j}^{4M} {\frac{\lambda _k }{\left( {\lambda _j -\lambda _k } \right)^{2}}\mathbf{S}_k \mathbf{S}_k ^{H}} } } \right]\mathbf{F}_i ^{H}\left( {\mathbf{q}_i \mathbf{E}_{c1}^{+} } \right)^{H} \nonumber \\&= \frac{1}{L}\mathbf{r}_i ^{H}\left[ {\mathbf{x}_i ^{H}\mathbf{W}_s \mathbf{x}_i \mathbf{E}_w \mathbf{E}_w ^{H}+\mathbf{E}_c \mathbf{WE}_c ^{H}} \right]\mathbf{r}_i \end{aligned}$$
(30)

where \(\mathbf{W}_s \mathop {=}\limits ^\Delta diag\left\{ {\frac{\lambda _1 \sigma ^{2}}{\left( {\lambda _1 -\sigma ^{2}} \right)^{2}},\ldots ,\frac{\lambda _K \sigma ^{2}}{\left( {\lambda _K -\sigma ^{2}} \right)^{2}}} \right\} \); \(\mathbf{F}_i =\mathbf{{W}^{\prime \prime }}-z_i \mathbf{{W}^{\prime }}\); \(\mathbf{r}_i ^{H}=\left( {\mathbf{A}_{c1}^{+} \mathbf{F}_i } \right)^{(i)}\) is the \(i\)th row of \(\mathbf{A}_{c1}^{+} \mathbf{F}_i \), with \(\mathbf{A}_{c1}^{+} =\left( {\mathbf{A}_{c1}^H \mathbf{A}_{c1} } \right)^{-1}\mathbf{A}_{c1}^H \); \(\mathbf{E}_w \) is the noise subspace corresponding to \(\mathbf{E}_c \); and \(\mathbf{W}\) is a diagonal matrix whose \(i\)th diagonal element is \(\mathop {\mathop {\sum }\nolimits _{k=1}}\nolimits _{k\ne i}^K {\frac{\lambda _k \lambda _i }{\left( {\lambda _k -\lambda _i } \right)^{2}}} \left| {\mathbf{x}_{ik} } \right|^{2}\).

Similarly, we have

$$\begin{aligned} E\left[ {\left( {\Delta z_i } \right)^{2}} \right]=\mathbf{r}_i ^{H}\left( {\sum _{j=1}^K {\mathop {\mathop {\sum }\limits _{k=1}}\limits _{k\ne i}^K {x_{ij} x_{ik} E\left[ {{\varvec{\upeta }}_j {\varvec{\upeta }}_k^T } \right]}}} \right)\mathbf{r}_i^*=\mathbf{r}_i ^{H}\left( {\sum _{j=1}^K {\mathop {\mathop {\sum }\limits _{k=1}}\limits _{k\ne i}^K {x_{ij} x_{ik} \frac{\lambda _j \lambda _k }{L\left( {\lambda _j -\lambda _k } \right)^{2}}\mathbf{S}_k \mathbf{S}_j^T } } } \right)\mathbf{r}_{i}^{*}\nonumber \\ \end{aligned}$$
(31)

We derive the mean square error of the initial estimation of \(\varphi _i \) as follows.

$$\begin{aligned} E\left[ {\left( {\Delta \varphi _i^{ini} } \right)^{2}} \right]=\frac{1}{2\cos ^{2}\varphi _i }\left( {E\left[ {\left| {\Delta z_i } \right|^{2}} \right]+\text{ Re}\left( {E\left[ {\left( {\Delta z_i } \right)^{2}} \right]} \right)} \right) \end{aligned}$$
(32)

Define \(\mathbf{V}_r (\phi )=[\mathbf{a}(\hat{{\varphi }}_k^{ini} )\otimes \mathbf{h}(\phi ,\hat{{\varphi }}_k^{ini} )]^{H}\mathbf{E}_n \mathbf{E}_n^H [\mathbf{a}(\hat{{\varphi }}_k^{ini} )\otimes \mathbf{h}(\phi ,\hat{{\varphi }}_k^{ini} )]\). Since \(\hat{{\phi }}_k \) is the minimum point of \(\mathbf{V}_r (\phi )\), we have \(\mathbf{{V}^{\prime }}_r (\hat{{\phi }}_k )=0\). Using a first-order Taylor series expansion, we have

$$\begin{aligned} 0=\mathbf{{V}^{\prime }}_r (\hat{{\phi }}_k )\cong \mathbf{{V}^{\prime }}_r (\phi _k )+\mathbf{{V}^{\prime \prime }}_r (\phi _k )(\hat{{\phi }}_k -\phi _k ) \end{aligned}$$
(33)

where “\(\cong \)” is a symbol used to denote items that are approximately equal.

Then the estimation error of \(\phi _k \) will be

$$\begin{aligned} \hat{{\phi }}_k -\phi _k =-\frac{\mathbf{{V}^{\prime }}_r (\phi _k )}{\mathbf{{V}^{\prime \prime }}_r (\phi _k )} \end{aligned}$$
(34)

According to the asymptotic analysis of MUSIC in Stoica and Nehorai (1989), mean square error of \(\phi _k \) estimation can be expressed as

$$\begin{aligned} E[(\hat{{\phi }}_k -\phi _k )^{2}]={\frac{\sigma ^{2}}{2L}\left[ {\sum _{k=1}^K {\frac{\lambda _k }{(\sigma ^{2}-\lambda _k )^{2}}\left| {\mathbf{a}_1^H (\phi _k )\mathbf{s}_k } \right|^{2}} } \right]}\Bigg /{\left[ {\sum _{k=1}^{4M-K} {\left| {\mathbf{d}_1^H (\phi _k )\mathbf{h}_k } \right|^{2}} } \right]} \end{aligned}$$
(35)

where \(\mathbf{a}_1 (\phi _k )=\mathbf{a}(\hat{{\varphi }}_k^{ini} )\otimes \mathbf{h}(\phi _k ,\hat{{\varphi }}_k^{ini} )\), \(\mathbf{d}_1 (\phi _k )=d\mathbf{a}_1 (\phi _k )/d\phi _k \), \(\mathbf{s}_k, k=1,\ldots ,K\) are the columns of the signal subspace \(\mathbf{E}_s \), and \(\mathbf{h}_k,k=1,\ldots ,4M-K\) are the columns of the noise subspace \(\mathbf{E}_n \).

Similarly, we obtain the mean square error of the \(\varphi _k \) estimate

$$\begin{aligned} E[(\hat{{\varphi }}_k -\varphi _k )^{2}]={\frac{\sigma ^{2}}{2L}\left[ {\sum _{k=1}^K {\frac{\lambda _k }{(\sigma ^{2}-\lambda _k )^{2}}\left| {\mathbf{a}_2^H (\varphi _k )\mathbf{s}_k } \right|^{2}} } \right]}\Bigg /{\left[ {\sum _{k=1}^{4M-K} {\left| {\mathbf{d}_2^H (\varphi _k )\mathbf{h}_k } \right|^{2}} } \right]} \end{aligned}$$
(36)

where \(\mathbf{a}_2 (\varphi _k )=\mathbf{a}(\varphi _k )\otimes \mathbf{h}(\hat{{\phi }}_k ,\varphi _k )\) and \(\mathbf{d}_2 (\varphi _k )=d\mathbf{a}_2 (\varphi _k )/d\varphi _k \).

4.2 CRB

In this subsection, we derive the CRB of angle estimation for the acoustic vector-sensor array. We assume that the signal \(\mathbf{b}\left( t \right)\) is deterministic, so the parameter vector to be estimated is

$$\begin{aligned} {\varvec{\zeta }}=\left[ {\varphi _1 ,\ldots ,\varphi _K ,\phi _1 ,\ldots ,\phi _K ,\mathbf{b}_R ^{T}\left( 1 \right),\ldots ,\mathbf{b}_R ^{T}\left( L \right),\mathbf{b}_I ^{T}\left( 1 \right),\ldots ,\mathbf{b}_I ^{T}\left( L \right),\sigma ^{2}} \right]^{T} \end{aligned}$$
(37)

where \(\mathbf{b}_R \left( l \right)\) and \(\mathbf{b}_I \left( l \right)\) denote the real and imaginary parts of \(\mathbf{b}\left( l \right)\), respectively. According to (7), the output can be rewritten as

$$\begin{aligned} \mathbf{y}=\left[ {\mathbf{x}^{T}\left( 1 \right),\ldots ,\mathbf{x}^{T}\left( L \right)} \right]^{T} \end{aligned}$$
(38)

The mean \({\varvec{\upmu }}\) and the covariance matrix \({\varvec{\Gamma }}\) of \(\mathbf{y}\) are

$$\begin{aligned} {\varvec{\upmu }}=\left[ {{\begin{array}{c} {\left( {\mathbf{A}\circ \mathbf{H}} \right)\mathbf{b}\left( 1 \right)} \\ \vdots \\ {\left( {\mathbf{A}\circ \mathbf{H}} \right)\mathbf{b}\left( L \right)} \\ \end{array} }} \right],\quad {\varvec{\Gamma }}=\left[ {{\begin{array}{ccc} {\sigma ^{2}\mathbf{I}_{4M} }&\,&0 \\&\ddots&\\ 0&\,&{\sigma ^{2}\mathbf{I}_{4M} } \\ \end{array} }} \right] \end{aligned}$$
(39)

From Stoica and Nehorai (1990), we know that the \((i, j)\) element of the inverse of the CRB matrix \(\mathbf{P}_{cr} \) can be expressed as

$$\begin{aligned} \left[ {\mathbf{P}_{cr}^{-1} } \right]_{ij} =tr\left[ {{\varvec{\Gamma }}^{-1}{{\varvec{\Gamma }^{\prime }}}_i {\varvec{\Gamma }}^{-1}{{\varvec{\Gamma }^{\prime }}}_j } \right]+2\text{ Re}\left[ {{\varvec{\upmu }^{\prime }}_i^H {\varvec{\Gamma }}^{-1}{\varvec{\upmu }^{\prime }}_j } \right] \end{aligned}$$
(40)

where \({{\varvec{\Gamma }}^{\prime }}_i \) and \({\varvec{\upmu }^{\prime }}_i \) are the derivatives of \({\varvec{\Gamma }}\) and \({\varvec{\upmu }}\) with respect to the \(i\)th element of \({\varvec{\zeta }}\), respectively. Since the covariance matrix depends only on \(\sigma ^{2}\), the first term of (40) vanishes for every parameter other than \(\sigma ^{2}\) and can be ignored. Then

$$\begin{aligned} \left[ {\mathbf{P}_{cr}^{-1} } \right]_{ij} =2\text{ Re}\left[ {{\varvec{\upmu }^{\prime }}_i^H {\varvec{\Gamma }}^{-1}{\varvec{\upmu }^{\prime }}_j } \right] \end{aligned}$$
(41)

And we have

$$\begin{aligned} \frac{\partial {\varvec{\upmu }}}{\partial \varphi _k }&= \left[ {{\begin{array}{c} {\frac{\partial \left( {\mathbf{A}\circ \mathbf{H}} \right)}{\partial \varphi _k }\mathbf{b}\left( 1 \right)} \\ \vdots \\ {\frac{\partial \left( {\mathbf{A}\circ \mathbf{H}} \right)}{\partial \varphi _k }\mathbf{b}\left( L \right)} \\ \end{array} }} \right]=\left[ {{\begin{array}{c} {\mathbf{d}_{k\varphi } \mathbf{b}_k \left( 1 \right)} \\ \vdots \\ {\mathbf{d}_{k\varphi } \mathbf{b}_k \left( L \right)} \\ \end{array} }} \right],\quad k=1,\ldots ,K\end{aligned}$$
(42a)
$$\begin{aligned} \frac{\partial {\varvec{\upmu }}}{\partial \phi _k }&= \left[ {{\begin{array}{c} {\frac{\partial \left( {\mathbf{A}\circ \mathbf{H}} \right)}{\partial \phi _k }\mathbf{b}\left( 1 \right)} \\ \vdots \\ {\frac{\partial \left( {\mathbf{A}\circ \mathbf{H}} \right)}{\partial \phi _k }\mathbf{b}\left( L \right)} \\ \end{array} }} \right]=\left[ {{\begin{array}{c} {\mathbf{d}_{k\phi } \mathbf{b}_k \left( 1 \right)} \\ \vdots \\ {\mathbf{d}_{k\phi } \mathbf{b}_k \left( L \right)} \\ \end{array} }} \right],\quad k=1,\ldots ,K \end{aligned}$$
(42b)

where \(\mathbf{b}_k \left( t \right)\) is the \(k\)th element of \(\mathbf{b}\left( t \right),\mathbf{d}_{k\varphi } =\frac{\partial \left( {\mathbf{A}\circ \mathbf{H}} \right)}{\partial \varphi _k },\mathbf{d}_{k\phi } =\frac{\partial \left( {\mathbf{A}\circ \mathbf{H}} \right)}{\partial \phi _k }\).

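Define

$$\begin{aligned} {\varvec{\Delta }}=\left[ {\frac{\partial {\varvec{\upmu }}}{\partial \varphi _1 },\ldots ,\frac{\partial {\varvec{\upmu }}}{\partial \varphi _K },\frac{\partial {\varvec{\upmu }}}{\partial \phi _1 },\ldots ,\frac{\partial {\varvec{\upmu }}}{\partial \phi _K }} \right] \end{aligned}$$
(43)

Let

$$\begin{aligned} \mathbf{G}=\left[ {{\begin{array}{ccc} {\mathbf{A}\circ \mathbf{H}}&\,&0 \\&\ddots&\\ 0&\,&{\mathbf{A}\circ \mathbf{H}} \\ \end{array} }} \right],\quad \mathbf{b}=\left[ {\mathbf{b}^{T}\left( 1 \right),\ldots ,\mathbf{b}^{T}\left( L \right)} \right]^{T} \end{aligned}$$
(44)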

then \({\varvec{\upmu }}=\mathbf{Gb}\), and

$$\begin{aligned} \frac{\partial {\varvec{\upmu }}}{\partial \mathbf{b}_R ^{T}}=\mathbf{G},\quad \frac{\partial {\varvec{\upmu }}}{\partial \mathbf{b}_I ^{T}}=i\mathbf{G} \end{aligned}$$
(45)

where \(i\) denotes the imaginary unit. Now we have

$$\begin{aligned} \frac{\partial {\varvec{\upmu }}}{\partial {\varvec{\zeta }}^{T}}=\left[ {{\varvec{\Delta }},\mathbf{G},i\mathbf{G},0} \right] \end{aligned}$$
(46)

According to (41),

$$\begin{aligned} 2\text{ Re}\left\{ {\frac{\partial {\varvec{\upmu }}^{*}}{\partial {\varvec{\zeta }}}{\varvec{\Gamma }}^{-1}\frac{\partial {\varvec{\upmu }}}{\partial {\varvec{\zeta }}^{T}}} \right\} =\left[ {{\begin{array}{cc} \mathbf{J}&\quad 0 \\ 0&\quad 0 \\ \end{array} }} \right] \end{aligned}$$
(47)

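where \(\mathbf{J}=\frac{2}{\sigma ^{2}}\text{ Re}\left\{ {\left[ {{\begin{array}{c} {{\varvec{\Delta }}^{H}} \\ {\mathbf{G}^{H}} \\ {-i\mathbf{G}^{H}} \\ \end{array} }} \right]\left[ {{\begin{array}{ccc} {\varvec{\Delta }}&\quad \mathbf{G}&\quad {i\mathbf{G}} \\ \end{array} }} \right]} \right\} \). Define

$$\begin{aligned} \mathbf{Q}=\left( {\mathbf{G}^{H}\mathbf{G}} \right)^{-1}\mathbf{G}^{H}{\varvec{\Delta }} \end{aligned}$$
(48)

$$\begin{aligned} \mathbf{F}=\left[ {{\begin{array}{ccc} {\mathbf{I}_{2K} }&\quad {\mathbf{0}_{2K\times 4ML} }&\quad {\mathbf{0}_{2K\times 4ML} } \\ {-\mathbf{Q}_R }&\quad {\mathbf{I}_{4ML} }&\quad {\mathbf{0}_{4ML\times 4ML} } \\ {-\mathbf{Q}_I }&\quad {\mathbf{0}_{4ML\times 4ML} }&\quad {\mathbf{I}_{4ML} } \\ \end{array} }} \right] \end{aligned}$$
(49)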

where \(\mathbf{Q}_R ,\mathbf{Q}_I \) are the real and imaginary parts of \(\mathbf{Q}\), respectively.

We can demonstrate that

$$\begin{aligned} \left[ {{\begin{array}{ccc} {\varvec{\Delta }}&\quad \mathbf{G}&\quad {i\mathbf{G}} \\ \end{array} }} \right]\mathbf{F}=\left[ {{\begin{array}{ccc} {\left( {{\varvec{\Delta }}-\mathbf{GQ}} \right)}&\quad \mathbf{G}&\quad {i\mathbf{G}} \\ \end{array} }} \right]=\left[ {{\begin{array}{ccc} {{\varvec{\Pi }}_\mathbf{G}^\bot {\varvec{\Delta }}}&\quad \mathbf{G}&\quad {i\mathbf{G}} \\ \end{array} }} \right] \end{aligned}$$
(50)

where \({\varvec{\Pi }}_\mathbf{G}^\bot =\mathbf{I}-\mathbf{G}\left( {\mathbf{G}^{H}\mathbf{G}} \right)^{-1}\mathbf{G}^{H}\) and \(\mathbf{G}^{H}{\varvec{\Pi }}_\mathbf{G}^\bot =0\).

$$\begin{aligned} \begin{array}{ll} \mathbf{F}^{T}\mathbf{JF}&=\frac{2}{\sigma ^{2}}\text{ Re}\left\{ {\mathbf{F}^{H}\left[ {{\begin{array}{c} {{\varvec{\Delta }}^{H}} \\ {\mathbf{G}^{H}} \\ {\mathbf{-}i\mathbf{G}^{H}} \\ \end{array} }} \right]\left[ {{\begin{array}{ccc} {\varvec{\Delta }}&\quad \mathbf{G}&\quad {i\mathbf{G}} \\ \end{array} }} \right]\mathbf{F}} \right\} \\&=\frac{2}{\sigma ^{2}}\text{ Re}\left\{ {\left[ {{\begin{array}{c} {{\varvec{\Delta }}^{H}{\varvec{\Pi }}_\mathbf{G}^\bot } \\ {\mathbf{G}^{H}} \\ {\mathbf{-}i\mathbf{G}^{H}} \\ \end{array} }} \right]\left[ {{\begin{array}{ccc} {{\varvec{\Pi }}_\mathbf{G}^\bot {\varvec{\Delta }}}&\quad \mathbf{G}&\quad {i\mathbf{G}} \\ \end{array} }} \right]} \right\} \\&=\frac{2}{\sigma ^{2}}\text{ Re}\left\{ {\left[ {{\begin{array}{ccc} {{\varvec{\Delta }}^{H}{\varvec{\Pi }}_\mathbf{G}^\bot {\varvec{\Delta }}}&\quad {\mathbf{0}_{2K\times 4ML} }&\quad {\mathbf{0}_{2K\times 4ML} } \\ {\mathbf{0}_{4ML\times 2K} }&\quad {\mathbf{G}^{H}\mathbf{G}}&\quad {i\mathbf{G}^{H}\mathbf{G}} \\ {\mathbf{0}_{4ML\times 2K} }&\quad {-i\mathbf{G}^{H}\mathbf{G}}&\quad {\mathbf{G}^{H}\mathbf{G}} \\ \end{array} }} \right]} \right\} \\ \end{array} \end{aligned}$$
(51)

so \(\mathbf{J}^{-1}\) can be written as

$$\begin{aligned} \mathbf{J}^{-1}&= \mathbf{F}\left( {\mathbf{F}^{T}\mathbf{JF}} \right)^{-1}\mathbf{F}^{T} \nonumber \\&= \frac{\sigma ^{2}}{2}\left[ {{\begin{array}{ccc} {\mathbf{I}_{2K} }&\quad {\mathbf{0}_{2K\times 4ML} }&\quad {\mathbf{0}_{2K\times 4ML} } \\ {-\mathbf{Q}_R }&\quad {\mathbf{I}_{4ML} }&\quad {\mathbf{0}_{4ML\times 4ML} } \\ {-\mathbf{Q}_I }&\quad {\mathbf{0}_{4ML\times 4ML} }&\quad {\mathbf{I}_{4ML} } \\ \end{array} }} \right]\left[ {{\begin{array}{ccc} {\text{ Re}\left( {{\varvec{\Delta }}^{H}{\varvec{\Pi }}_\mathbf{G}^\bot {\varvec{\Delta }}} \right)}&\quad {\mathbf{0}_{2K\times 4ML} }&\quad {\mathbf{0}_{2K\times 4ML} } \\ {\mathbf{0}_{2K\times 4ML} }&\quad \kappa&\quad \kappa \\ {\mathbf{0}_{2K\times 4ML} }&\quad \kappa&\quad \kappa \\ \end{array} }} \right] \nonumber \\&\left[ {{\begin{array}{ccc} {\mathbf{I}_{2K} }&\quad {-\mathbf{Q}_R^T }&\quad {-\mathbf{Q}_I^T } \\ {\mathbf{0}_{2K\times 4ML} }&\quad {\mathbf{I}_{4ML} }&\quad {\mathbf{0}_{4ML\times 4ML} } \\ {\mathbf{0}_{2K\times 4ML} }&\quad {\mathbf{0}_{4ML\times 4ML} }&\quad {\mathbf{I}_{4ML} } \\ \end{array} }} \right] =\left[ {{\begin{array}{ccc} {\frac{\sigma ^{2}}{2}\left[ {\text{ Re}\left( {{\varvec{\Delta }}^{H}{\varvec{\Pi }}_\mathbf{G}^\bot {\varvec{\Delta }}} \right)} \right]^{-1}}&\quad \kappa&\quad \kappa \\ \kappa&\quad \kappa&\quad \kappa \\ \kappa&\quad \kappa&\quad \kappa \\ \end{array} }} \right] \end{aligned}$$
(52)

where \(\kappa \) denotes blocks that are not of interest.

We can now give the CRB matrix

$$\begin{aligned} CRB=\frac{\sigma ^{2}}{2}\left[ {\text{ Re}\left( {{\varvec{\Delta }}^{H}{\varvec{\Pi }}_\mathbf{G}^\bot {\varvec{\Delta }}} \right)} \right]^{-1} \end{aligned}$$
(53)
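
The step from (47) to (53) can be checked numerically. The sketch below is only a sanity check under stated assumptions: random complex matrices stand in for \({\varvec{\Delta }}\) and \(\mathbf{G}\), and the sizes `n`, `p`, `q`, the noise power `sigma2`, and the helper `crandn` are hypothetical choices, not values from the simulations. It inverts \(\mathbf{J}\) by brute force and compares the angle block with the closed form (53).

```python
import numpy as np

rng = np.random.default_rng(0)


def crandn(rows, cols):
    """Random complex matrix used as a stand-in for Delta or G."""
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))


n, p, q = 12, 2, 3      # assumed sizes: data length, angle parameters, signal samples
sigma2 = 0.5            # assumed noise power
Delta = crandn(n, p)
G = crandn(n, q)

# J as in (47): Fisher information of [angles, b_R, b_I] under white noise.
jac = np.hstack([Delta, G, 1j * G])
J = (2.0 / sigma2) * np.real(jac.conj().T @ jac)

# Angle block of J^{-1}, obtained by inverting the full matrix.
crb_full = np.linalg.inv(J)[:p, :p]

# Closed form (53), using the projector onto the orthogonal complement of range(G).
proj = np.eye(n) - G @ np.linalg.solve(G.conj().T @ G, G.conj().T)
crb_closed = (sigma2 / 2.0) * np.linalg.inv(np.real(Delta.conj().T @ proj @ Delta))
```

The two matrices `crb_full` and `crb_closed` agree to numerical precision, which is the content of (50) to (52): the signal parameters can be eliminated through the projector \({\varvec{\Pi }}_\mathbf{G}^\bot \) without inverting the full matrix.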

After further simplification, we can rewrite the CRB matrix as

$$\begin{aligned} CRB=\frac{\sigma ^{2}}{2L}\left\{ {\text{ Re}\left[ {\mathbf{D}^{H}{\varvec{\Pi }}_{\mathbf{A}\circ \mathbf{H}}^\bot \mathbf{D}\odot {\hat{\mathbf{P}}}^{T}} \right]} \right\} ^{-1} \end{aligned}$$
(54)

where \(\mathbf{D}=\left[ {\mathbf{d}_{1\varphi } ,\mathbf{d}_{2\varphi } ,\ldots ,\mathbf{d}_{K\varphi } ,\mathbf{d}_{1\phi } ,\mathbf{d}_{2\phi } ,\ldots ,\mathbf{d}_{K\phi } } \right]; {\hat{\mathbf{P}}}=\left[ {{\begin{array}{cc} {{\hat{\mathbf{P}}}_s }&{{\hat{\mathbf{P}}}_s } \\ {{\hat{\mathbf{P}}}_s }&{{\hat{\mathbf{P}}}_s } \\ \end{array} }} \right]\) with  \({\hat{\mathbf{P}}}_s =\frac{1}{L}\sum \nolimits _{t=1}^L {\mathbf{b}\left( t \right)\mathbf{b}^{H}\left( t \right)} ; {\varvec{\Pi }}_{\mathbf{A}\circ \mathbf{H}}^\bot =\mathbf{I}_{4M\times 4M} -\left( {\mathbf{A}\circ \mathbf{H}} \right)\left[ {\left( {\mathbf{A}\circ \mathbf{H}} \right)^{H}\left( {\mathbf{A}\circ \mathbf{H}} \right)} \right]^{-1}\left( {\mathbf{A}\circ \mathbf{H}} \right)^{H}\).
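
A minimal sketch of evaluating (54) is given below. It assumes the caller supplies the matrix written \(\mathbf{A}\circ \mathbf{H}\) in the text and the derivative matrix \(\mathbf{D}\); the function name `crb_angles`, its argument names, and the random stand-in inputs in the usage lines are hypothetical and do not reproduce the array manifold used in the simulations.

```python
import numpy as np


def crb_angles(B, D, Ps, sigma2, L):
    """CRB matrix of (54).

    B      : (4M, K)  matrix written A o H in the text (supplied by the caller)
    D      : (4M, 2K) derivative columns, varphi derivatives first, then phi
    Ps     : (K, K)   sample source covariance
    sigma2 : noise power
    L      : number of snapshots
    """
    n = B.shape[0]
    # Projector onto the orthogonal complement of the column space of B.
    proj = np.eye(n) - B @ np.linalg.solve(B.conj().T @ B, B.conj().T)
    # Block matrix [[Ps, Ps], [Ps, Ps]].
    P_hat = np.kron(np.ones((2, 2)), Ps)
    # '*' is the element-wise (Hadamard) product.
    inner = np.real((D.conj().T @ proj @ D) * P_hat.T)
    return sigma2 / (2.0 * L) * np.linalg.inv(inner)


# Usage with random stand-in inputs (assumed values, for illustration only).
rng = np.random.default_rng(1)
M, K, L0 = 8, 2, 100
B = rng.standard_normal((4 * M, K)) + 1j * rng.standard_normal((4 * M, K))
D = rng.standard_normal((4 * M, 2 * K)) + 1j * rng.standard_normal((4 * M, 2 * K))
S = rng.standard_normal((K, L0)) + 1j * rng.standard_normal((K, L0))
Ps = S @ S.conj().T / L0

crb_100 = crb_angles(B, D, Ps, sigma2=1.0, L=100)
crb_200 = crb_angles(B, D, Ps, sigma2=1.0, L=200)
```

With the source covariance held fixed, doubling \(L\) halves every entry of the bound, which reflects the \(1/L\) factor in (54).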

5 Simulation results

To assess the angle estimation performance of the proposed algorithm, we present Monte Carlo simulations with 1,000 trials. Define the root mean squared error (RMSE) as

$$\begin{aligned} RMSE=\frac{1}{K}\sum \limits _{k=1}^K {\sqrt{\frac{1}{1000}\sum \limits _{l=1}^{1000} {\left[ {\left( {\hat{{\phi }}_{k,l} -\phi _k } \right)^{2}+\left( {\hat{{\varphi }}_{k,l} -\varphi _k } \right)^{2}} \right]} }} \end{aligned}$$
(55)

where \(\hat{{\varphi }}_{k,l} \) is the estimate of \(\varphi _k \) of the \(l\)th Monte Carlo trial, and \(\hat{{\phi }}_{k,l} \) is the estimate of \(\phi _k \) of the \(l\)th Monte Carlo trial. Note that \(L\) is the number of snapshots; \(M\) is the number of array elements.
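
The RMSE of (55) can be computed as in the following sketch. The function name `rmse_2d` and the array layout (one row per Monte Carlo trial, one column per source) are assumptions for illustration, and the number of trials is taken from the input rather than fixed at 1,000.

```python
import numpy as np


def rmse_2d(phi_hat, varphi_hat, phi_true, varphi_true):
    """RMSE of (55).

    phi_hat, varphi_hat   : (trials, K) angle estimates
    phi_true, varphi_true : (K,) true angles
    """
    phi_hat = np.asarray(phi_hat, dtype=float)
    varphi_hat = np.asarray(varphi_hat, dtype=float)
    phi_true = np.asarray(phi_true, dtype=float)
    varphi_true = np.asarray(varphi_true, dtype=float)
    # Squared 2D error of every trial and source.
    sq_err = (phi_hat - phi_true) ** 2 + (varphi_hat - varphi_true) ** 2
    per_source = np.sqrt(sq_err.mean(axis=0))   # inner average over trials
    return per_source.mean()                    # outer average over the K sources


# Two trials, two sources (made-up estimates, in degrees).
value = rmse_2d(
    phi_hat=[[18.0, 35.0], [12.0, 35.0]],
    varphi_hat=[[14.0, 20.0], [6.0, 20.0]],
    phi_true=[15.0, 35.0],
    varphi_true=[10.0, 20.0],
)
```

In this made-up example the first source has a 2D error of 5 degrees in both trials and the second source is estimated exactly, so `value` is 2.5.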

In all simulations except Fig. 8, the number of non-coherent sources is \(K = 2\). The source signals impinge upon the acoustic vector-sensor array from \(\left( {\phi _1 ,\varphi _1 } \right)=\left( {15^{\circ },10^{\circ }} \right)\) and \(\left( {\phi _2 ,\varphi _2 } \right)=\left( {35^{\circ },20^{\circ }} \right)\). In most cases, a non-uniform linear array with \([d_1 ,d_2 ,\ldots ,d_M ]=\left[ {0, 1, 1.7, 2.5, 3.4, 4.2, 5.2, 6.1, 7, 7.6} \right]\times 0.5\lambda \) is used.

Figure 2 displays the angle estimation results of the proposed algorithm for two sources over 100 Monte Carlo simulations with \(M = 8, L = 200\) and \(\text{ SNR} = 5~\text{ dB}\). Figure 3 presents the estimation results with \(M = 8, L = 100\) and \(\text{ SNR} = 15~\text{ dB}\). From Figs. 2 and 3, we find that the proposed algorithm is able to estimate the DOAs and still works at the lower SNR.

Fig. 2 DOA estimation of the proposed algorithm with SNR = 5 dB

Fig. 3 DOA estimation of the proposed algorithm with SNR = 15 dB

We compare the proposed algorithm against the PM algorithm [17], the ESPRIT algorithm [7], the 2D-MUSIC algorithm, the trilinear decomposition algorithm [16], and the CRB. Figure 4 shows the DOA estimation performance of the algorithms with \(M = 6\) and \(L = 100\), while Fig. 5 presents the angle estimation performance of the algorithms with \(M = 8\) and \(L = 50\). Figures 4 and 5 indicate that the proposed algorithm has better angle estimation performance than the PM algorithm, the ESPRIT method and the trilinear decomposition algorithm. The proposed algorithm also performs very close to the 2D-MUSIC algorithm.

Figure 6 depicts the performance of the proposed algorithm for different values of \(L\) with \(M=8\). The angle estimation performance of the proposed algorithm improves as \(L\) increases.

Figure 7 illustrates the angle estimation performance of the proposed algorithm with \(L = 100\) and different values of \(M\). The angle estimation performance of the proposed algorithm improves gradually as the number of sensors increases. Multiple sensors improve angle estimation performance because of diversity gain.

Fig. 4 DOA estimation comparison with \(M = 6\) and \(L = 100\)

Fig. 5 DOA estimation comparison with \(M = 8\) and \(L = 50\)

Fig. 6 DOA estimation performance with different \(L\)

Fig. 7 DOA estimation performance with different \(M\)

Fig. 8 DOA estimation performance for two sources with same azimuth angle

Figure 8 shows the angle estimation performance of the proposed algorithm for two sources with the same azimuth angle. The source signals impinge upon the acoustic vector-sensor array from \(\left( {\phi _1 ,\varphi _1 } \right)=\left( {15^{\circ },10^{\circ }} \right)\) and \(\left( {\phi _2 ,\varphi _2 } \right)=\left( {15^{\circ },20^{\circ }} \right)\). We set \(M=8\) and \(L=100\) in Fig. 8. From Fig. 8, we find that the proposed algorithm works well for sources with the same azimuth angle.

6 Conclusions

We have presented a successive MUSIC algorithm for 2D-DOA estimation in the acoustic vector-sensor array. The proposed algorithm obtains initial estimates of the azimuth and elevation angles from the signal subspace, and then employs successive one-dimensional local searches to achieve joint 2D-DOA estimation. Our approach produces automatically paired 2D-DOA estimates and enjoys a significant computational advantage over 2D-MUSIC. The proposed algorithm has better DOA estimation performance than the PM algorithm, the ESPRIT method and the trilinear decomposition algorithm, while its angle estimation performance is close to that of the 2D-MUSIC algorithm. Furthermore, it is suitable for non-uniform linear arrays, works well for sources with the same azimuth angle, and imposes less constraint on the sensor spacing, which does not have to be restricted to within half a wavelength. Numerical experiments illustrate the accuracy and efficacy of the proposed algorithm across a variety of parameters and scenarios.