1 Introduction

Adaptive array design has played an important role in many fields of application, such as radar, sonar, and wireless communications, in recent years [4, 16, 20, 21, 28]. The conventional adaptive beamforming algorithm is used to extract the desired signal while simultaneously suppressing interference and noise at the array output. However, it is well recognized that the conventional beamformer is sensitive to imprecise signal information, especially when the desired signal component is present in the received data (i.e., the self-null phenomenon in the direction of the desired signal [17]), which results in severe performance degradation of adaptive beamformers.

As a solution, various robust techniques combining different principles have been developed to improve performance [6,7,8, 10, 11, 14, 22, 25, 27, 29, 30]. One of the most popular robust approaches is the diagonal loading (DL) algorithm, which adds a scaled identity matrix to the sample covariance matrix [6]. However, a suitable diagonal loading factor is hard to select in practice. In the eigenspace-based projection beamformer developed in [8], the desired signal steering vector is obtained by projecting the presumed steering vector onto the estimated signal-plus-interference subspace. This algorithm may suffer considerable performance degradation at low signal-to-noise ratio (SNR) because the signal subspace may be corrupted by the noise subspace. In [7], a robust method is proposed that chooses the diagonal loading level automatically, which has the potential to enhance robustness. The worst-case technique of [25] delimits the uncertainty set by upper bounding the norm of the mismatch vector; it has been proved that the worst-case technique is a variant of the DL technique. In [27], a robust adaptive beamforming (RAB) technique with magnitude response constraints employs convex optimization to restrict the region of interest, whose robust region can be flexibly controlled.

In order to remove the signal of interest from the sample covariance matrix, some recent approaches are based on reconstructing the interference-plus-noise covariance matrix. An iterative shrinkage-based algorithm that estimates the steering vector and covariance matrix was investigated in [22]; this beamforming approach offers excellent performance when the interference power is weak. A classic design approach based on desired signal steering vector estimation and interference-plus-noise covariance matrix reconstruction was originally studied by Gu and Leshem in [10]. Their algorithm, based on spatial power spectrum sampling, is shown to be more robust than beamformers that use the sample covariance matrix directly. Motivated by this landmark result, several further studies have designed efficient adaptive beamformers on the same principle of desired signal steering vector estimation and interference-plus-noise covariance matrix reconstruction [11, 14, 29, 30]. A sparse reconstruction of the interference covariance matrix was proposed in [11], where the robust design is solved via compressive sensing techniques [23, 31]. In [29, 30], robust designs for coprime array adaptive beamforming are proposed, and the authors demonstrate that good performance can be achieved by reconstruction-based adaptive beamformers. Hence, covariance matrix optimization plays an important role in further improving system performance. This fact motivates our research on adaptive robust design.

On the basis of the studies mentioned above, we devise a novel robust design algorithm that estimates both the desired signal steering vector and the covariance matrix. We first utilize the shrinkage method to estimate the theoretical covariance matrix. Then, the steering vector is corrected by maximizing the array output power under a correlation coefficient constraint and a norm constraint. The hidden convexity of the design is proved analytically, so the formulated nonconvex quadratic programming problem can be globally solved using the semidefinite relaxation (SDR) technique [5]. Moreover, in order to remove the desired signal component from the estimated covariance matrix, the interference-plus-noise covariance matrix is estimated based on the corrected steering vector and the subspace theorem. The proposed algorithm is expected to improve performance in terms of the achieved signal-to-interference-plus-noise ratio (SINR). Theoretical analysis and numerical simulations exhibit satisfactory performance.

The remainder of this paper is organized as follows. The signal model is described in Sect. 2. In Sect. 3, a novel steering vector method is proposed. Then, a new method to estimate the interference-plus-noise covariance matrix is introduced. In Sect. 4, we evaluate the performance via numerical simulations. Finally, conclusions are drawn in Sect. 5.

Notation

\(\mathcal {CN}\left( {\mu , \varvec{\Sigma }} \right) \) denotes the circularly symmetric complex Gaussian distribution with mean \(\mu \) and covariance matrix \(\varvec{\Sigma }\). Vectors are denoted by boldface lowercase letters, e.g., \(\mathbf{a}\); matrices are denoted by boldface uppercase letters, e.g., \(\mathbf{A}\). The Hermitian, transpose and conjugate operators are denoted by \(\left( \cdot \right) ^{H}\), \(\left( \cdot \right) ^{\mathrm{T}}\) and \(\left( \cdot \right) ^{*}\), respectively. \(\left\| \cdot \right\| _{F}\) denotes the Frobenius norm, \(\left\| \cdot \right\| \) the Euclidean norm, and \(\left| \cdot \right| \) the absolute value. \({\mathbf{tr}}\left( \mathbf{A }\right) \) denotes the trace of \(\mathbf{A}\). \(\mathbf{I}\) and \(\mathbf{0}\) denote the identity matrix and the all-zero matrix, respectively (their sizes are determined from the context). \(\mathbb {R}^{m}\), \(\mathbb {C}^{m}\), and \(\mathbb {H}^{m}\) are the sets of m-dimensional real vectors, m-dimensional complex vectors, and \(m \times m\) Hermitian matrices, respectively.

2 System Model and Problem Statement

Assume a uniform linear array (ULA) equipped with M omnidirectional antennas. The \(P+1\) narrowband far-field signals received by the array are

$$\begin{aligned} \begin{array}{l} {\mathbf{x}}(t) = {{\mathbf{x}}_s}(t) + {{\mathbf{x}}_i}(t) + {\mathbf{n}}(t)= \sum \limits _{i = 0}^P {{s_i}(t){\mathbf{a}}({\theta _i})} + {\mathbf{n}}(t),\\ \end{array} \end{aligned}$$
(1)

where t is the time index and \({\mathbf{n}}(t)\) denotes the additive white Gaussian noise vector, distributed as \( \mathcal {CN}\left( {\mathbf{0},{\sigma _n^2}\mathbf{I}} \right) \), where \(\sigma _{n}^2\) represents the noise power. \({s_i}(t)\) and \({\theta _i}\) are the ith signal waveform and the corresponding signal direction, respectively. The signal sources and noise are assumed to be statistically independent of each other. \({{{\mathbf{x}}_s}(t)={s_0}(t){\mathbf{a}}({\theta _0})}\) and \({{\mathbf{x}}_i}(t)=\sum \nolimits _{i = 1}^P {{s_i}(t){\mathbf{a}}({\theta _i})}\) denote the desired signal and interference vectors, respectively. \({\mathbf{a}}(\cdot )\in \mathbb {C}^{M}\) represents the steering vector, which has the general form

$$\begin{aligned} {\mathbf{a}}({\theta }) = {\left[ {1 \ \ {e^{{{j2\pi d\sin \theta } /\lambda }}}\ldots {e^{{{j2\pi \left( {M - 1} \right) d\sin \theta } / \lambda }}}} \right] ^{\mathrm{T}}}, \end{aligned}$$
(2)

where \(\lambda \) and d denote the carrier wavelength and the array element spacing, respectively. The output of the beamformer is expressed as

$$\begin{aligned} {\mathbf{y}}(t) = {{\mathbf{w}}^H}{\mathbf{x}}(t), \end{aligned}$$
(3)

where \({\mathbf{w}} \in \mathbb {C}^{M}\) stands for the weight vector. Under the assumption that both the signal steering vector and the interference-plus-noise covariance matrix are known precisely, the beamforming weight vector \({\mathbf{w}}\) can be obtained by maximizing the output SINR

$$\begin{aligned} \text {SINR} =\frac{{{\sigma _{0}^{2}}{{\left| {{{\mathbf{w}}^H}{\mathbf{a}}({\theta _0})} \right| }^2}}}{{ {{\mathbf{w}}^H}{{\mathbf{R}}_\text {in}}{\mathbf{w}}}}, \end{aligned}$$
(4)

where \(\sigma _{0}^{2}\triangleq {{{\text {E}}\left\{ {{{\left| {{s_0}(t)} \right| }^2}} \right\} }}\) denotes the desired signal power, and \(\sigma _{i}^{2}\) is the ith interference power. The interference-plus-noise covariance matrix \({\mathbf{{R}}_{\text {in}}}\) can be defined as

$$\begin{aligned} {\mathbf{R}_\text {in}}={\text {E}}\left[ \left( {\mathbf{x}_{i}}(t)+\mathbf{n}(t) \right) {{\left( {\mathbf{x}_{i}}(t)+\mathbf{n}(t) \right) }^{H}} \right] ={\mathbf{R}_i}+\sigma _{n}^{2}{} \mathbf{I}, \end{aligned}$$
(5)

where \({{\mathbf{{R}}}_{i}}\) denotes the covariance matrix of the interference. It can also be seen that the optimal weight vector maintains a distortionless response toward the desired signal while minimizing the output interference-plus-noise power, i.e.,

$$\begin{aligned} \begin{array}{l} \mathop {\min }\limits _{\mathbf{w}} {{\mathbf{w}}^{H}}{{\mathbf{R}}_\text {in}}{} \mathbf{w}, \ \ \text {s.t.}\ {{\mathbf{w}}^{H}}{{\mathbf{a}}({\theta _0})}=1. \end{array} \end{aligned}$$
(6)

The adaptive weight vector based on the minimum variance distortionless response (MVDR) principle is given by

$$\begin{aligned} {{\mathbf{w}}_{\text {MVDR}}} = \frac{{{\mathbf{R}}_\text {in}^{ - 1}{\mathbf{a}}({\theta _0})}}{{{{\mathbf{a}}^H}({\theta _0}){\mathbf{R}}_\text {in}^{ - 1}{\mathbf{a}}({\theta _0})}}. \end{aligned}$$
(7)

In practice, the true covariance matrix is difficult to obtain and is usually replaced by the sample covariance matrix \({{\hat{{\mathbf{R}}}}_\text {x}}\), which is calculated from the received signal vectors as

$$\begin{aligned} {{\hat{{\mathbf{R}}}}_\text {x}} = \frac{1}{N}\sum \limits _{t = 1}^N {{{\mathbf{x}}}(t){\mathbf{x}}^H(t)}, \end{aligned}$$
(8)

where N is the number of snapshots. In addition, the adaptive beamformer is sensitive to desired signal steering vector errors, which may cause the self-null phenomenon in the direction of the desired signal. Thus, an imperfect covariance matrix or imprecise signal information results in dramatic performance degradation. Motivated by these issues, this paper focuses on estimating the desired signal steering vector and the covariance matrix for a robust design.
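The steering vector model (2), the sample covariance (8), and the MVDR weights (7) can be sketched in a few lines of numpy. This is a hypothetical illustration, not the authors' code; the function names and the half-wavelength spacing \(d/\lambda = 0.5\) are our own choices:

```python
import numpy as np

def steering_vector(theta, M, d_over_lambda=0.5):
    """ULA steering vector a(theta) as in (2); theta in radians."""
    m = np.arange(M)
    return np.exp(1j * 2 * np.pi * d_over_lambda * m * np.sin(theta))

def sample_covariance(X):
    """Sample covariance matrix (8); X is M x N with snapshots in columns."""
    N = X.shape[1]
    return (X @ X.conj().T) / N

def mvdr_weights(R, a0):
    """MVDR weight vector (7) for covariance R and steering vector a0."""
    Ri_a = np.linalg.solve(R, a0)      # R^{-1} a(theta_0), without explicit inversion
    return Ri_a / (a0.conj() @ Ri_a)   # normalize so that w^H a0 = 1
```

By construction, the returned weights satisfy the distortionless constraint \({{\mathbf{w}}^{H}}{{\mathbf{a}}({\theta _0})}=1\) of (6); in practice `R` would be replaced by the sample covariance from (8).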

3 Proposed Algorithm

In this section, we propose a novel approach to obtain the beamforming weight vector \({\mathbf{w}}\). The main idea is to estimate the theoretical covariance matrix based on the shrinkage method first and then use the shrinkage estimate covariance matrix in the following estimation of the desired signal steering vector and the interference-plus-noise covariance matrix.

As discussed in previous work [17], the sample covariance matrix is a poor estimate of the theoretical covariance matrix when the sample size is small. Compared with classical adaptive approaches, shrinkage-based methods have the potential to enhance covariance matrix estimation with a small number of samples [7]. Following this approach, we take

$$\begin{aligned} {\tilde{\mathbf{R}}_{\text {x}}} = \alpha _0 {\mathbf{I}} + \beta _0 {{{ \hat{{\mathbf{R}}}}_{\text {x}}}}, \end{aligned}$$
(9)

where \( \alpha _0\) and \( \beta _0\) are combination coefficients (\(\beta _0 \in \left[ {0,1} \right] , \alpha _0 \ge 0\)), which are solutions to the minimization of the Mean Squared Error (MSE) formulation

$$\begin{aligned} {\mathrm{MSE}}\left( {{{\tilde{\mathbf{R}}}_{\mathrm{x}}}} \right)= & {} {\mathrm{E}}\left\{ {{{\left\| {{{\tilde{\mathbf{R}}}_{\mathrm{x}}} - {{\mathbf{R}}_{\mathrm{x}}}} \right\| }^2_{F}}} \right\} \nonumber \\= & {} {\mathrm{E}}\left\{ {{{\left\| {{{ \alpha }_0}{\mathbf{I}} - \left( {1 - {{ \beta }_0}} \right) {{\mathbf{R}}_{\mathrm{x}}} + {{ \beta }_0}\left( {{{\hat{\mathbf{R}}}_{\mathrm{x}}} - {{\mathbf{R}}_{\mathrm{x}}}} \right) } \right\| }^2_{F}}} \right\} \nonumber \\= & {} {\left\| {{{ \alpha }_0}{\mathbf{I}} - \left( {1 - {{ \beta }_0}} \right) {{\mathbf{R}}_{\mathrm{x}}}} \right\| ^2} + \beta _0^2{\mathrm{E}}\left\{ {{{\left\| { {{{\hat{\mathbf{R}}}_{\mathrm{x}}} - {{\mathbf{R}}_{\mathrm{x}}}} } \right\| }^2_{F}}} \right\} \nonumber \\= & {} \alpha _0^2M - 2{{ \alpha }_0}\left( {1 - {{ \beta }_0}} \right) {\mathbf{tr}}\left( {{{\mathbf{R}}_{\mathrm{x}}}} \right) + \left( {1 - {{ \beta }_0}} \right) ^2{\left\| {{{\mathbf{R}}_{\mathrm{x}}}} \right\| ^2_{F}} \nonumber \\&+ \,\beta _0^2{\mathrm{E}}\left\{ {{{\left\| { {{{\hat{\mathbf{R}}}_{\mathrm{x}}} - {{\mathbf{R}}_{\mathrm{x}}}}} \right\| }^2_{F}}} \right\} , \end{aligned}$$
(10)

where \({{\mathbf{R}}_{\text {x}}}\) represents the theoretical covariance matrix of the array output. As suggested in [7], the optimal shrinkage parameters \(\alpha _0\) and \(\beta _0\) are given by

$$\begin{aligned} \begin{array}{l} {\hat{\alpha } _0} = \min \left[ {\frac{{\hat{\nu }\hat{\rho } }}{{{{\left\| {{{\hat{\mathbf{R}}}_{\text {x}}} - \hat{\nu }{\mathbf{I}}} \right\| }^2_{F}}}},\hat{\nu }} \right] , \; {\hat{\beta } _0} = 1 - \frac{{{{\hat{\alpha } }_0}}}{{\hat{\nu }}}, \end{array} \end{aligned}$$
(11)

where \(\hat{\rho } = \frac{1}{{{N^2}}}\sum \limits _{t = 1}^N {{{\left\| {{\mathbf{x}}\left( {{t}} \right) } \right\| }^4}} - \frac{1}{N}{\left\| {{{ \hat{{\mathbf{R}}}}_{\text {x}}}} \right\| ^2_{F}}\) and \(\hat{\nu }= {{{\mathbf{tr}}\left( {{{ \hat{{\mathbf{R}}}}_{\text {x}}}} \right) } / M}\). Substituting (11) into (9), we obtain an enhanced form of the estimate \({\tilde{\mathbf{R}}_{\text {x}}}\) [12, 15]:

$$\begin{aligned} {\tilde{\mathbf{R}}_{\text {x}}} = \hat{\alpha }_0 {\mathbf{I}} + \hat{\beta }_0 {{{ \hat{{\mathbf{R}}}}_{\text {x}}}}. \end{aligned}$$
(12)

The estimate in (12) is a completely automatic diagonal loading approach with diagonal loading factor \(\hat{\alpha }_0/\hat{\beta }_0\). The shrinkage estimate \({\tilde{\mathbf{R}}_{\text {x}}}\) will be used in the subsequent estimation of the desired signal steering vector and the interference-plus-noise covariance matrix.
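The shrinkage estimator (9)–(12) is easy to prototype. The sketch below is ours, not the authors' code; clipping \(\hat{\alpha }_0\) at zero simply enforces the constraint \(\alpha _0 \ge 0\) stated after (9):

```python
import numpy as np

def shrinkage_covariance(X):
    """Shrinkage estimate (12) with the parameters of (11).

    X is M x N with snapshots in columns.
    Returns (R_tilde, alpha0, beta0) with R_tilde = alpha0*I + beta0*R_hat.
    """
    M, N = X.shape
    R_hat = (X @ X.conj().T) / N                       # sample covariance (8)
    nu = np.real(np.trace(R_hat)) / M                  # nu_hat = tr(R_hat)/M
    rho = (np.sum(np.linalg.norm(X, axis=0) ** 4) / N**2
           - np.linalg.norm(R_hat, 'fro') ** 2 / N)    # rho_hat
    denom = np.linalg.norm(R_hat - nu * np.eye(M), 'fro') ** 2
    alpha0 = min(max(nu * rho / denom, 0.0), nu)       # clip to [0, nu_hat]
    beta0 = 1.0 - alpha0 / nu                          # beta0 in [0, 1]
    return alpha0 * np.eye(M) + beta0 * R_hat, alpha0, beta0
```

The returned \({\tilde{\mathbf{R}}_{\text {x}}}\) is Hermitian by construction and reduces to pure diagonal loading of the sample covariance, with automatically chosen loading level.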

3.1 Desired Signal Steering Vector Estimation

From the standpoint of the covariance matrix, the Capon spatial spectrum \({P}(\theta )\) is usually used to estimate the direction of arrival, i.e.,

$$\begin{aligned} {P}(\theta )=\frac{1}{{\mathbf{a}^{H}}(\theta ){{{{\tilde{{\mathbf{R}}}}}_\text {x}^{-1}}}{} \mathbf{a}(\theta )}. \end{aligned}$$
(13)

Using the definition in (13), we seek the steering vector that maximizes the array output power, namely \(\underset{\mathbf{a}}{\mathop {\min }}\,{\mathbf{a}^{H}}{{{\tilde{{\mathbf{R}}}}_\text {x}^{-1}}}{} \mathbf{a}\), where \(\mathbf{{a}}\) represents the actual desired steering vector. We assume that \({\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \) is the presumed desired steering vector, where \({{{\bar{\theta } }_0}}\) denotes the presumed direction of the desired signal. The correlation coefficient of two given vectors \(\mathbf{a}_{1}\) and \(\mathbf{a}_{2}\) is defined as

$$\begin{aligned} \mathrm{cor}\left( {\mathbf{a}_{1}},{\mathbf{a}_{2}} \right) =\frac{\left| \mathbf{a}_{1}^{H}{\mathbf{a}_{2}} \right| }{\left\| {\mathbf{a}_{1}} \right\| \left\| {\mathbf{a}_{2}} \right\| }. \end{aligned}$$
(14)

In this paper, we assume that the steering vector satisfies the norm constraint \(\left\| {\mathbf{a}} \right\| = \left\| {\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \right\| = \sqrt{M} \). This constraint is reasonable in the cases of direction error or phase perturbations, and it still holds approximately for small gain perturbations [17]. From (14), it is straightforward to deduce that

$$\begin{aligned} \rho \le \frac{\left| {{\mathbf{a}}}^{H}{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \right| }{M}\le 1, \end{aligned}$$
(15)

where \(\rho \) is an appropriate scalar factor. For robust beamforming, the constraint on the steering vector correlation coefficient can be exploited to restrict the desired signal to the region of interest [24]. As a reference, we set \(\rho ={{\bar{\mathbf{a}}}^{H}\left( {{\theta _\rho }} \right) {\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) }/{M}\), where \({\bar{\mathbf{a}}}\left( {{\theta _\rho }} \right) \) is a reference vector determined by the region of interest; for a symmetric array structure, the correlation coefficient is symmetric about \({{\bar{\theta } }_0}\). Without loss of generality, we assume that the possible angular sector of the desired signal is \(\mathrm {\Theta } = \left[ {{{\bar{\theta } }_0}- {{\bar{\theta } }'_0}, {{\bar{\theta } }_0}+{{\bar{\theta } }'_0}} \right] \), where \({{\bar{\theta } }'_0}\) accounts for the uncertainty in \({{\bar{\theta } }_0}\). Then, \({\rho }\) can be set according to:

$$\begin{aligned} \rho \le \min \left\{ {\frac{{\left| {{\mathbf{a}}{{\left( {{{\bar{\theta } }_0} - {{\bar{\theta } '}_0}} \right) }^H}{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) } \right| }}{M},\frac{{\left| {{\mathbf{a}}{{\left( {{{\bar{\theta } }_0} + {{\bar{\theta } '}_0}} \right) }^H}{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) } \right| }}{M}} \right\} . \end{aligned}$$
(16)
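A minimal numerical sketch of the correlation coefficient (14) and the sector bound (16) follows; the helper names are hypothetical and a half-wavelength ULA is assumed:

```python
import numpy as np

def steering_vector(theta, M, d_over_lambda=0.5):
    """ULA steering vector a(theta) as in (2)."""
    m = np.arange(M)
    return np.exp(1j * 2 * np.pi * d_over_lambda * m * np.sin(theta))

def correlation(a1, a2):
    """Correlation coefficient of two vectors, as in (14)."""
    return np.abs(a1.conj() @ a2) / (np.linalg.norm(a1) * np.linalg.norm(a2))

def rho_bound(theta0, dtheta, M):
    """Upper bound on rho from (16) for the sector [theta0-dtheta, theta0+dtheta].

    Since all steering vectors here have norm sqrt(M), |a^H a_bar| / M equals
    the correlation coefficient (14), so the bound is the smaller correlation
    at the two sector edges.
    """
    a_bar = steering_vector(theta0, M)
    return min(correlation(steering_vector(theta0 - dtheta, M), a_bar),
               correlation(steering_vector(theta0 + dtheta, M), a_bar))
```

A wider uncertainty sector gives a smaller `rho_bound`, i.e., a looser correlation constraint and a broader robust region.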

As for \({\bar{\mathbf{a}}}\left( {{\theta _\rho }} \right) \), we can choose \({\theta _\rho }\) such that \(\mathrm {\Theta } \subseteq \left[ {{{\bar{\theta } }_0} - {\theta _\rho },{{\bar{\theta } }_0} + {\theta _\rho }} \right] \), so that (16) is satisfied. Thus, (15) can be used to flexibly control the beamwidth of the robust region by choosing the parameter \(\rho \). We then impose the correlation coefficient and norm constraints. Proceeding in this way, the estimation problem for the steering vector \({{\mathbf{a}}} \in \mathbb {C}{^{M}}\) can be formulated as

$$\begin{aligned} \begin{array}{l} {{\mathcal {P}1}}\;\left\{ {\begin{array}{l} {\mathop {\mathrm {minimize}}\limits _{{{\mathbf{a}}}} \quad \;{{\mathbf{a}}^H}\tilde{\mathbf{R}}_{\mathrm{x}}^{ - 1}{\mathbf{a}}},\\ {\mathrm {subject}\;\mathrm {to}\quad \;\nu \le \left| {{\mathbf{a}}_{}^H{{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) }} \right| \le M,\left( {\nu = \rho M} \right) },\\ {\qquad \qquad \left\| {\mathbf{a}} \right\| = \sqrt{M} }. \end{array}} \right. \end{array} \end{aligned}$$
(17)

The original optimization problem is a nonconvex quadratic program. The difficulty in solving (17) lies in its nonconvexity, due to the left-hand side of the inequality constraint \(\nu \le \left| \mathbf{a}^{H}{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \right| \) and the nonlinear equality constraint \(\left\| {\mathbf{a}} \right\| =\sqrt{M}\). Such problems are in general NP-hard, and thus the original design cannot be solved directly by convex optimization. Next, we focus on solving this nonconvex problem. The homogenized version of problem \({\mathcal {P}1}\) is

$$\begin{aligned} \begin{array}{l} \mathcal {P}2\left\{ \begin{array}{l} \mathop {\mathrm {minimize}}\limits _{{{\mathbf{a}}},s} \ {\mathbf{tr}}\left( {\left[ {\begin{array}{cc} {\tilde{\mathbf{R}}_{\mathrm{x}}^{ - 1}}\,&{}{\mathbf{0}}\\ {\mathbf{0}}\,&{}0 \end{array}} \right] \left[ {\begin{array}{cc} {{\mathbf{a}}{{\mathbf{a}}^H}}\,&{}{{\mathbf{a}}{s^*}}\\ {{{\mathbf{a}}^H}s}\,&{}{{{\left| s \right| }^2}} \end{array}} \right] } \right) ,\\ \mathrm {subject}\;\mathrm {to}\ \;{\nu ^2} \le {\mathbf{tr}}\left( {\left[ {\begin{array}{cc} {{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \left( {\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \right) ^H}\,&{}{\mathbf{0}}\\ {\mathbf{0}}\,&{}0 \end{array}} \right] \left[ {\begin{array}{cc} {{\mathbf{a}}{{\mathbf{a}}^H}}\,&{}{{\mathbf{a}}{s^*}}\\ {{{\mathbf{a}}^H}s}\,&{}{{{\left| s \right| }^2}} \end{array}} \right] } \right) \le {M^2},\\ \qquad \qquad {\mathbf{tr}}\left( {\left[ {\begin{array}{cc} {{{\mathbf{I}}_M}}\,&{}{\mathbf{0}}\\ {\mathbf{0}}\,&{}0 \end{array}} \right] \left[ {\begin{array}{cc} {{\mathbf{a}}{{\mathbf{a}}^H}}\,&{}{{\mathbf{a}}{s^*}}\\ {{{\mathbf{a}}^H}s}\,&{}{{{\left| s \right| }^2}} \end{array}} \right] } \right) = M,\\ \qquad \qquad {\mathbf{tr}}\left( {\left[ {\begin{array}{cc} {\mathbf{0}}\,&{}{\mathbf{0}}\\ {\mathbf{0}}\,&{}1 \end{array}} \right] \left[ {\begin{array}{cc} {{\mathbf{a}}{{\mathbf{a}}^H}}\,&{}{{\mathbf{a}}{s^*}}\\ {{{\mathbf{a}}^H}s}\,&{}{{{\left| s \right| }^2}} \end{array}} \right] } \right) = 1. \end{array} \right. \end{array} \end{aligned}$$
(18)

where s is a complex-valued scalar used to construct the homogenized version of problem \({\mathcal {P}1}\). Precisely, problems \({\mathcal {P}1}\) and \({\mathcal {P}2}\) have the same optimal value, i.e., \(\nu \left( {\mathcal {P}1}\right) = \nu \left( {\mathcal {P}2}\right) \). Specifically, suppose that \({{\tilde{\mathbf{a}}}}^{\#}=\left[ {{{{\mathbf{a}}}}^\# ,{s^\# }} \right] ^{\mathrm{T}}\) is an optimal solution of problem \(\mathcal {P}2\); then \({{{{{\mathbf{a}}}}^\# } /{{s^\# }}}\) solves \(\mathcal {P}1\). Thus, we can obtain the optimal solution of \({\mathcal {P}1}\) by solving problem \({\mathcal {P}2}\). The SDP relaxation of \({\mathcal {P}2}\) is

$$\begin{aligned} \begin{array}{l} {{\mathcal {P}3}}\left\{ \begin{array}{l} \mathop {\mathrm {minimize}} \limits _{{\mathbf{A}}}\ \ {\mathbf{tr}}\left\{ {{{\mathbf{Q}}_0}{\mathbf{A}}} \right\} ,\\ \mathrm {subject}\;\mathrm {to} \; {\nu ^2} \le {\mathbf{tr}}\left\{ {{{\mathbf{Q}}_1}{\mathbf{A}}} \right\} \le {M^2},\\ \qquad \qquad {\mathbf{tr}}\left\{ {{{\mathbf{Q}}_2}{\mathbf{A}}} \right\} = M,\\ \qquad \qquad {\mathbf{tr}}\left\{ {{{\mathbf{Q}}_3}{\mathbf{A}}} \right\} = 1,\\ \qquad \qquad {\mathbf{A}}\succeq {\mathbf{0}}. \end{array} \right. \end{array} \end{aligned}$$
(19)

where \({\mathbf{Q}}_{i} \in {\mathbb {H}^{M + 1}}\) and \( {\mathbf{A}} \in {\mathbb {H}^{M + 1}}\) are defined as follows

$$\begin{aligned} \begin{array}{l} {{\mathbf{Q}}_{0}} = \begin{bmatrix} {\tilde{{\mathbf{R}}}_{\text {x}}^{ - 1}}\,\ &{}\quad {\mathbf{0}}\\ {\mathbf{0}} \, &{} \quad 0 \end{bmatrix}, {{\mathbf{Q}}_{1}} = \begin{bmatrix} {{\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \left( {\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) \right) ^H}\ \, &{}\quad {\mathbf{0}}\\ {\mathbf{0}} \, &{} \quad 0 \end{bmatrix}, {{\mathbf{Q}}_{2}} = \begin{bmatrix} {{\mathbf{I}}_{M}}\ \, &{}\quad {\mathbf{0}}\\ {\mathbf{0}} \, &{}\quad 0 \end{bmatrix}, \end{array} \end{aligned}$$
(20)

and

$$\begin{aligned} \begin{array}{l} {{\mathbf{Q}}_{3}} = \begin{bmatrix} {\mathbf{0}} \ \, &{}\quad {\mathbf{0}}\\ {\mathbf{0}} \, &{}\quad 1 \end{bmatrix}, {{\mathbf{A}}} = \begin{bmatrix} {{{\mathbf{a}}}}{\mathbf{a}}^H\ \, &{}\quad {{\mathbf{a}}}{s^*}\\ {{\mathbf{a}}^H}s \, &{} \quad {\left| s \right| ^2} \end{bmatrix}. \end{array} \end{aligned}$$
(21)

In general, the solution of a relaxed SDP problem may not be exactly rank-one; nevertheless, a better approximation is obtained when the rank of the solution is lower. The relationship between the rank of the matrix and the number of constraints has been addressed in [18]. For a complex-valued problem, solving the relaxed SDP is equivalent to solving the original quadratically constrained quadratic programming (QCQP) problem when the number of constraints is no more than three [13]. Furthermore, Ai et al. [1] proved a rank-one decomposition theorem and used it to show that the SDRs of a large class of complex-valued homogeneous QCQPs with no more than four constraints are in fact tight. Hence, the relaxed SDP problem \({\mathcal {P}3}\) is tight because the original design, with three homogeneous constraints, is hidden convex, and the objective function of \({\mathcal {P}1}\) evaluated at \({{{{{\mathbf{a}}}}^\# } / {{s^\# }}}\) equals the optimal value of \({\mathcal {P}3}\), provided that \(\left[ {{{{\mathbf{a}}}}^\# ,{s^\# }} \right] ^{\mathrm{T}}\) is optimal for \({\mathcal {P}3}\) [2, 9].

Thus, once an optimal solution \({\mathbf{A}}^\# \) is obtained, we can check the rank of \({{{\mathbf{A}}}}^\# \). If the rank of \({{{\mathbf{A}}}}^\# \) equals one, then \({{\tilde{\mathbf{a}}}}^{\#}\) can be obtained exactly, and \({{\mathcal {P}1}}\) is solved. If \({{{\mathbf{A}}}}^\# \) has rank higher than one, we can construct a rank-one optimal solution via the matrix decomposition theorem [1, Theorem 2.3]. Specifically, let us check the conditions of the matrix decomposition theorem for \({{\mathcal {P}}3}\): First, it is easy to verify that for any nonzero complex Hermitian positive semidefinite matrix \(\mathbf{Y}\) of size \((M+1)\times (M+1)\), \(\left( {{\mathrm{Tr}}\left( {\mathbf{Q}}_{\mathbf{0}}{} \mathbf{Y} \right) ,{\mathrm{Tr}}\left( {{{\mathbf{Q}}_1}{\mathbf{Y}}} \right) ,{\mathrm{Tr}}\left( {{{\mathbf{Q}}_2}{\mathbf{Y}}} \right) ,{\mathrm{Tr}}\left( {{{\mathbf{Q}}_3}{\mathbf{Y}}} \right) } \right) \ne \left( {0,0,0,0} \right) \); more precisely, there exists \(\left( {{a_1},{a_2},{a_3},{a_4}} \right) \in \mathbb {R}_ + ^4\) such that \({a_1}{{\mathbf{Q}}_0} + {a_2}{{\mathbf{Q}}_1} + {a_3}{{\mathbf{Q}}_2} + {a_4}{{\mathbf{Q}}_3} \succ \mathbf{0}\). Second, the condition \((M+1)\ge 3\) is mild and holds in practice. Thus, the assumptions of [1, Theorem 2.3] are satisfied. Then,

\(\bullet \) if \({\mathrm{rank}}\left( {\mathbf{A}}^\# \right) \ge 3\), there is a rank-one decomposition \({\mathbf{A}}^\# = \sum \nolimits _{i = 1}^r {\tilde{{\mathbf{a}}}_i {{ {\tilde{{\mathbf{a}}}_i^H}}}}\), where r denotes the rank of \({\mathbf{A}}^\#\), and for all \({\mathbf{z}} \in {\mathrm{Null}}\left( {\mathbf{A}}^\# \right) \), \({ {\tilde{{\mathbf{a}}}^H_i } }{\mathbf{z}} = 0\), i.e., \(\tilde{{\mathbf{a}}}_i \in {\mathrm{Range}}\left( {\mathbf{A}}^\# \right) \), \(i = 1, \ldots ,r\). Thus, one can find a nonzero vector \({\tilde{{\mathbf{a}}}^{\# }}\in \mathrm {Range}\left( {\mathbf{A}}^\# \right) \) (synthetically denoted as \({\tilde{{\mathbf{a}}}^{\# }}= {\mathcal{D}_1}\left( {{\mathbf{A}}^\# ,{{\mathbf{Q}}_0},{{\mathbf{Q}}_1},{{\mathbf{Q}}_2},{{\mathbf{Q}}_3}} \right) \)) satisfying

$$\begin{aligned}&\left( {{\mathrm{Tr}}\left( {{\mathbf{Q}}_0} {\mathbf{A}}^\#\right) ,{\mathrm{Tr}}\left( {{\mathbf{Q}}_1} {{\mathbf{A}}^\#} \right) ,{\mathrm{Tr}}\left( {{\mathbf{Q}}_2} {{\mathbf{A}}^\#} \right) ,{\mathrm{Tr}}\left( {{\mathbf{Q}}_3} {{\mathbf{A}}^\#} \right) } \right) \nonumber \\&\quad = \left( {\left( {\tilde{{\mathbf{a}}}^{\# }}\right) ^H}{{\mathbf{Q}}_0}{\tilde{{\mathbf{a}}}^{\# }},{\left( {\tilde{{\mathbf{a}}}^{\# }} \right) ^H}{{\mathbf{Q}}_1}{\tilde{{\mathbf{a}}}^{\# }} ,{\left( {\tilde{{\mathbf{a}}}^{\# }} \right) ^H}{{\mathbf{Q}}_2}{\tilde{{\mathbf{a}}}^{\# }} ,{\left( {\tilde{{\mathbf{a}}}^{\# }} \right) ^H}{{\mathbf{Q}}_3}{\tilde{{\mathbf{a}}}^{\# }} \right) . \end{aligned}$$
(22)

\(\bullet \) if \({\mathrm{rank}}\left( {\mathbf{A}}^\# \right) = 2\), there exists a rank-one decomposition \({\mathbf{A}}^\# = {\tilde{\mathbf{a}}}_1 { {{\tilde{\mathbf{a}}}^H_1 }} + {\tilde{\mathbf{a}}}_2 { {{\tilde{\mathbf{a}}} ^H_2 }}\), and \({{{\mathbb {C}^{(M+1)}}} \setminus {{\mathrm{Range}}\left( {\mathbf{A}}^\# \right) }} \ne \emptyset \) since \((M+1)\ge 3\). Thus, for any \({\mathbf{z}} \notin \mathrm {Range}\left( {\mathbf{A}}^\#\right) \), one can find a nonzero vector \({{\tilde{\mathbf{a}}}}^{\#}\) in the linear subspace spanned by \( \{{\mathbf{z}}\} \cup {\mathrm{Range}}\left( {\mathbf{A}}^\#\right) \) (synthetically denoted as \({{\tilde{\mathbf{a}}}}^{\#}= {\mathcal{D}_2}\left( {{\mathbf{A}}^\# ,{{\mathbf{Q}}_0},{{\mathbf{Q}}_1},{{\mathbf{Q}}_2},{{\mathbf{Q}}_3}} \right) \)), satisfying (22).

Consequently, we obtain

$$\begin{aligned} \begin{array}{l} {\mathrm{Tr}}\left( {{{\mathbf{Q}}_0}{{{\tilde{\mathbf{a}}}}^{\# }} \left( {{{\tilde{\mathbf{a}}}}^{\# }} \right) ^{H}} \right) = {\mathrm{Tr}}\left( {{\mathbf{Q}}_0} {\mathbf{A}}^\#\right) , \end{array} \end{aligned}$$
(23)

and

$$\begin{aligned} \begin{array}{l} {\nu ^2} \le {\mathrm{Tr}}\left( {{{\mathbf{Q}}_1}{{{\tilde{\mathbf{a}}}}^{\# }} \left( {{{\tilde{\mathbf{a}}}}^{\# }} \right) ^{H}} \right) = {\mathrm{Tr}}\left( {{\mathbf{Q}}_1} {\mathbf{A}}^\#\right) \le {M^2},\\ {\mathrm{Tr}}\left( {{{\mathbf{Q}}_2}{{{\tilde{\mathbf{a}}}}^{\# }} \left( {{{\tilde{\mathbf{a}}}}^{\# }} \right) ^{H}} \right) = {\mathrm{Tr}}\left( {{{\mathbf{Q}}_2}{\mathbf{A}}^\#} \right) =M,\\ {\mathrm{Tr}}\left( {{\mathbf{Q}}_3} {{{\tilde{\mathbf{a}}}}^{\# }} \left( {{{\tilde{\mathbf{a}}}}^{\# }} \right) ^{H}\right) = {\mathrm{Tr}}\left( {{\mathbf{Q}}_3}{\mathbf{A}}^\#\right) = 1.\\ \end{array} \end{aligned}$$
(24)

Therefore, \({{{\tilde{\mathbf{a}}}}^{\# }}{{\left( {{{\tilde{\mathbf{a}}}}^{\# }}\right) }^H}\) is an optimal rank-one solution of \({{\mathcal {P}3}}\), and the optimal solution of \({{\mathcal {P}1}}\) (i.e., \({{{{{\mathbf{a}}}}^\# } /{{s^\# }}}\)) can be obtained. Finally, the procedure for solving \({\mathcal {P}1}\) is summarized in Algorithm 1. In our steering vector estimation, the theoretical covariance matrix is first estimated by the shrinkage method [7]. We assume that the desired signal region and the interference region can be distinguished via the correlation coefficient and norm constraints; this constraint set also restricts the desired signal to the region of interest. We then obtain the corrected desired signal steering vector from the Capon spatial spectrum estimator, which is efficient because the extended Capon beamformer can accurately determine the power of the desired signal even when only imprecise knowledge of its steering vector is available. Thus, under the constraint set, the steering vector is estimated by maximizing the beamformer output power, and the estimate does not converge to any of the interference steering vectors or their linear combinations. Furthermore, by exploiting the hidden convexity of the original design, the optimal steering vector can be found via the SDR and matrix decomposition algorithms [1, Theorem 2.3].
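The final de-homogenization step, recovering \({{{{{\mathbf{a}}}}^\# } /{{s^\# }}}\) from a rank-one \({\mathbf{A}}^\# \), can be sketched in numpy. This is our own illustrative sketch, assuming the solver has returned an exactly rank-one \({\mathbf{A}}^\# \):

```python
import numpy as np

def recover_steering_vector(A):
    """Recover a = a#/s# from a rank-one solution A# of P3.

    A is the (M+1) x (M+1) Hermitian optimum; its principal eigenpair gives
    the rank-one factor a_tilde = [a#; s#] (up to a common phase, which
    cancels in the ratio a#/s#). The result is rescaled to norm sqrt(M),
    matching the constraint ||a|| = sqrt(M) in P1.
    """
    M = A.shape[0] - 1
    w_eig, V = np.linalg.eigh(A)               # ascending eigenvalues
    a_tilde = np.sqrt(w_eig[-1]) * V[:, -1]    # principal rank-one factor
    a = a_tilde[:M] / a_tilde[M]               # de-homogenize: a# / s#
    return np.sqrt(M) * a / np.linalg.norm(a)  # enforce ||a|| = sqrt(M)
```

When \({\mathbf{A}}^\# \) has rank two or higher, the decomposition \({\mathcal{D}_1}\)/\({\mathcal{D}_2}\) of [1, Theorem 2.3] would first be applied to obtain a rank-one factor satisfying (22) before this step.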

[Algorithm 1]

3.2 Covariance Matrix Estimation

As stated above, although shrinkage-based methods have the potential to enhance covariance matrix estimation, the estimated covariance matrix \({\tilde{\mathbf{R}}_{\text {x}}}\) still includes the desired signal. Next, we attempt to eliminate the desired signal component from \({\tilde{\mathbf{R}}_{\text {x}}}\). The eigendecomposition of \({\tilde{\mathbf{R}}_{\text {x}}}\) is

$$\begin{aligned} {\tilde{\mathbf{R}}_{\text {x}}} = \sum \limits _{j = 1}^M {{\lambda _j}{{\mathbf{u}}_j}{{\mathbf{u}}_j}^H} = {{\mathbf{U}}_s}{{\varvec{\Lambda }} _s}{{\mathbf{U}}_s}^H + {{\mathbf{U}}_n}{{\varvec{\Lambda }} _n}{{\mathbf{U}}_n}^H, \end{aligned}$$
(25)

where \({\lambda _1} \ge {\lambda _2} \ge \cdots \ge {\lambda _{P+1}} \ge {\lambda _{P + 2}} \ge \cdots \ge {\lambda _M}\) are the eigenvalues in descending order and \({\mathbf{u}}_j^{}\) denotes the corresponding eigenvector of \({\tilde{\mathbf{R}}_{\text {x}}}\). For notational simplicity, we write \({{\mathbf{U}}_s} = [{{\mathbf{u}}_1},\ldots ,{{\mathbf{u}}_{P+1}}]\) for the signal subspace, which includes the desired signal and the interference, \({{\varvec{\Lambda }} _s} =\text {diag}[{\lambda _1}, \ldots ,{\lambda _{P+1}}]\), \({{\mathbf{U}}_n} = [{{\mathbf{u}}_{P + 2}},\ldots ,{{\mathbf{u}}_M}]\) for the noise subspace, and \({{\varvec{\Lambda }} _n}=\text {diag}[{\lambda _{P + 2}}, \ldots ,{\lambda _M}]\). By the subspace theorem, the signal subspace spans the same space as the steering matrix, i.e.,

$$\begin{aligned} {\mathrm{span}}\left\{ {{{\mathbf{u}}_1},{{\mathbf{u}}_2}, \ldots ,{{\mathbf{u}}_{P+1}}} \right\} = {\mathrm{span}}\left\{ {{\mathbf{a}} ({\theta _0}),{\mathbf{a}} ({\theta _1}), \ldots , {\mathbf{a}} ({\theta _{P}})} \right\} . \end{aligned}$$
(26)
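The eigendecomposition and subspace partition in (25) can be sketched in NumPy as follows, assuming \(P\) interferers plus one desired signal so that the first \(P+1\) eigenvectors span the signal subspace; `subspace_split` is an illustrative helper name.

```python
import numpy as np

def subspace_split(R, P):
    """Partition the eigendecomposition of a Hermitian covariance matrix R
    into the signal-plus-interference and noise subspaces of Eq. (25)."""
    vals, vecs = np.linalg.eigh(R)
    order = np.argsort(vals)[::-1]            # sort eigenvalues descending
    vals, vecs = vals[order], vecs[:, order]
    Us, Ls = vecs[:, :P + 1], vals[:P + 1]    # signal subspace U_s, Lambda_s
    Un, Ln = vecs[:, P + 1:], vals[P + 1:]    # noise subspace  U_n, Lambda_n
    return Us, Ls, Un, Ln
```

By construction, `(Us * Ls) @ Us.conj().T + (Un * Ln) @ Un.conj().T` reproduces `R`, matching the two-term decomposition in (25).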

Moreover, adaptive beamforming approaches are able to synthesize array patterns with deep nulls at the interference directions. Such a pattern can be expressed as the quiescent array pattern minus a weighted sum of eigenbeams [6], written as

$$\begin{aligned} {{\mathbf{G}}_a}(\theta ) = {{\mathbf{G}}_q}(\theta ) - \sum \limits _{j = 1}^M {\frac{{{\lambda _j} - {\lambda _{M }}}}{{{\lambda _j}}}\left[ {{\mathbf{w}}_q^H{\mathbf{u}} _j^{}} \right] } {{\mathbf{G}}_j}(\theta ), \end{aligned}$$
(27)

where \({{\mathbf{G}}_q}(\theta ) = {\mathbf{w}}_{q}^H{\mathbf{a}}(\theta )\) is the quiescent array pattern, which has no ability to null interference, and \({\mathbf{w}}_{q}\) is the quiescent weight vector. \({{\mathbf{G}}_j}(\theta ) = {\mathbf{u}}_j^H{\mathbf{a}}(\theta )\) denotes the eigenbeam of the \(j\)th eigenvector, and \({{\mathbf{w}}_q^H{\mathbf{u}}_j}\) scales each eigenbeam appropriately. An eigenbeam usually appears as a beam pointed in the direction of the corresponding interference, so that nulls are produced in the interference directions when the eigenbeams are subtracted from the quiescent pattern [19, 26]. Thus, we get

$$\begin{aligned} {{\mathbf{G}}_j}({\theta _j}) = \mathop {\max }\limits _{\theta } \left\{ {{\mathbf{u}}_j^H{\mathbf{a}}(\theta )} \right\} , \quad j = 1,2, \ldots ,P+1. \end{aligned}$$
(28)

From (27), it can also be seen that for the large eigenvalues associated with powerful interferences, the factor \(\frac{{{\lambda _j} - {\lambda _M}}}{{{\lambda _j}}}\) approaches 1 and causes the corresponding interference energy to be nulled almost completely, i.e.,

$$\begin{aligned} \frac{{{\lambda _j} - {\lambda _M}}}{{{\lambda _j}}} \approx 1, j = 1, 2,\ldots ,P+1. \end{aligned}$$
(29)

For the small eigenvalues near the noise level, this factor approaches 0 and no subtraction occurs, i.e.,

$$\begin{aligned} \frac{{{\lambda _j} - {\lambda _M}}}{{{\lambda _j}}} \approx 0, j = P+2, \ldots ,M. \end{aligned}$$
(30)
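The behavior of the null-depth factor in (29)-(30) is easy to verify numerically; the eigenvalues below are illustrative, with two interference-level values well above a noise floor of 1.

```python
import numpy as np

# Illustrative eigenvalue profile: two strong (interference-level)
# eigenvalues followed by noise-level eigenvalues.
lams = np.array([1000.0, 500.0, 1.2, 1.1, 1.0])

# Null-depth factor (lambda_j - lambda_M) / lambda_j from Eq. (27).
factors = (lams - lams[-1]) / lams
# Strong eigenvalues give factors near 1 (deep nulls, Eq. (29));
# noise-level eigenvalues give factors near 0 (no subtraction, Eq. (30)).
```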

As mentioned above, if an eigenvalue of \({\tilde{\mathbf{R}}_{\text {x}}}\) decreases to the level of the background noise, a large part of the corresponding signal source component in \({\tilde{\mathbf{R}}_{\text {x}}}\) is eliminated, because this eigenvalue contains most of that signal's energy. Next, we focus on obtaining the eigenvector and eigenvalue corresponding to the desired signal. It is known that the maximum correlation coefficient is achieved when \(\mathbf{e}_{d}\) is the eigenvector associated with the corrected desired signal steering vector \({\mathbf{a}}\). Thus, \(\mathbf{e}_{d}\) is given by

$$\begin{aligned} {{\mathbf{e}}_d} = \arg \mathop {\max }\limits _{{{\mathbf{u}}_j}} \frac{{\left| {{\mathbf{u}}_j^H{\mathbf{a}}} \right| }}{{\left\| {{{\mathbf{u}}_j}} \right\| \left\| {\mathbf{a}} \right\| }}, \quad j = 1,\ldots ,P + 1. \end{aligned}$$
(31)
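Selection rule (31) amounts to picking the signal-subspace eigenvector most correlated with the corrected steering vector. A minimal sketch, with `desired_eig_index` as an illustrative name:

```python
import numpy as np

def desired_eig_index(Us, a):
    """Return the column index j maximizing |u_j^H a| / (||u_j|| ||a||),
    i.e., the correlation-coefficient criterion of Eq. (31), over the
    P+1 signal-subspace eigenvectors stacked as columns of Us."""
    corr = np.abs(Us.conj().T @ a) / (np.linalg.norm(Us, axis=0) *
                                      np.linalg.norm(a))
    return int(np.argmax(corr))
```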

Then, the eigenvalue corresponding to \({{\mathbf{e}}_d}\), i.e., the eigenvalue of the desired signal, can be obtained. Each signal steering vector is assumed to lie in the signal subspace, which implies that it is a linear combination of \({{\mathbf{u}}_j}\), \(j=1,\ldots ,P+1\). As stated above, if the eigenvalue of the desired signal is decreased to the level of the background noise, a large portion of the desired signal is eliminated; the eigenvalue of the desired signal is therefore replaced by the noise power. Furthermore, to mitigate the effect of noise perturbation, the noise eigenvalues are replaced by the average of the small eigenvalues of the sample covariance matrix

$$\begin{aligned} \tilde{\sigma }_{0}^{2}=\tilde{\sigma }_{n}^{2}=\frac{1}{{\left( M-P-1 \right) }} \sum \nolimits _{j=P+2}^{M}{{{\lambda }_{j}}\left( {{\tilde{{\mathbf{R}}}_{\text {x}}}}\right) }, \end{aligned}$$
(32)

where \({{\lambda }_{j}}\) denotes the jth largest eigenvalue of the matrix within braces. Consequently, the interference-plus-noise covariance matrix \(\tilde{\mathbf{R}}_{\mathrm{in}}\) is given by

$$\begin{aligned} \tilde{\mathbf{R}}_{\mathrm{in}} = {{\mathbf{U}}}{{\tilde{\varvec{\Lambda }}}}{{\mathbf{U}}}^H, \end{aligned}$$
(33)

where \({\tilde{\varvec{\Lambda }}}=\text {diag}[{\lambda _1},\ldots , \tilde{\sigma }_{0}^{2}, \ldots , {\lambda _{P+1}}, \tilde{\sigma }_{n}^{2}, \ldots , \tilde{\sigma }_{n}^{2}]\) is the reconstructed eigenvalue matrix, in which \(\tilde{\sigma }_{0}^{2}\) replaces the desired-signal eigenvalue, and the eigenvector matrix remains unchanged, namely \({{\mathbf{U}}} = [{{\mathbf{U}}_s}, {{\mathbf{U}}_n}]\). Finally, the proposed beamforming weight vector based on the estimated covariance matrix \({{\tilde{\mathbf{R}}}}_{{\mathrm{in}}}\) and the steering vector \({\mathbf{a}}\) is given by

$$\begin{aligned} {{\mathbf{w}}} = \frac{{\tilde{\mathbf{R}}_{\mathrm{in}}^{-1}{\mathbf{a}}}}{{{{\mathbf{a}}^H}\tilde{\mathbf{R}}_{\mathrm{in}}^{-1}{\mathbf{a}}}}. \end{aligned}$$
(34)
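Steps (32)-(34), namely replacing the desired-signal eigenvalue and the noise eigenvalues by the averaged noise power, rebuilding \(\tilde{\mathbf{R}}_{\mathrm{in}}\), and forming the weight vector, can be sketched as follows; the function names are illustrative, and `d_idx` denotes the index found via (31).

```python
import numpy as np

def reconstruct_R_in(U, vals, d_idx, P):
    """Rebuild the interference-plus-noise covariance matrix of Eq. (33).
    vals must be in descending order; the desired-signal eigenvalue at
    d_idx and the M-P-1 noise eigenvalues are replaced by their average
    noise power (Eq. (32))."""
    sigma2 = np.mean(vals[P + 1:])          # Eq. (32): averaged noise power
    new_vals = vals.astype(float).copy()
    new_vals[d_idx] = sigma2                # remove the desired signal
    new_vals[P + 1:] = sigma2               # flatten the noise floor
    return (U * new_vals) @ U.conj().T      # Eq. (33)

def beamformer_weight(R_in, a):
    """Distortionless weight vector of Eq. (34)."""
    Ri_a = np.linalg.solve(R_in, a)
    return Ri_a / (a.conj() @ Ri_a)
```

Because \(\tilde{\mathbf{R}}_{\mathrm{in}}\) is Hermitian positive definite, the resulting weight satisfies \({\mathbf{w}}^H{\mathbf{a}} = 1\), the distortionless constraint.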

In summary, we obtain an estimate of the desired steering vector by maximizing the array output power under the correlation coefficient and norm constraints; the optimal steering vector is found via SDR and matrix decomposition algorithms. Note that the smaller \(\rho \) is, the larger the constraint region becomes, i.e., the robust design can flexibly control the beamwidth of the robust region through the constraints on the steering vector. Then, with the corrected steering vector and the subspace theorem, the interference-plus-noise covariance matrix can be estimated effectively, and finally the adaptive beamformer weight vector is obtained. As for the computational complexity of the algorithm, the modified interference-plus-noise covariance matrix estimation costs \(O\left( {{M}^{3}} \right) \) due to the eigendecomposition of the Hermitian matrix \({{\tilde{{\mathbf{R}}}}_{\text {x}}}\). The main complexity of the proposed algorithm is attributed to the desired steering vector estimation, which can be solved by interior-point approaches in \(O\left( {{M^{4.5}}\log \left( {{1 / \eta }} \right) } \right) \) operations (where \(\eta > 0\) is a prescribed accuracy [3]), while the rank-one decomposition costs \(O\left( {{M^3}} \right) \).

4 Simulation Results

In the following simulations, a ULA of \(M = 17\) omnidirectional elements with half-wavelength inter-element spacing is considered. Without loss of generality, we assume three far-field sources: the desired signal and two interferences with directions of arrival \(-25{}^\circ \) and \(35{}^\circ \), respectively. The interference-to-noise ratio (INR) is 30 dB. The nominal direction of the desired signal is \(0{}^\circ \), and the SNR is fixed at 10 dB (except in the figures where the SNR varies). The number of snapshots is fixed at \(N =100\) (except in the figures where the number of snapshots varies), and 200 Monte-Carlo trials are performed for each scenario. For reference, we set \(\rho = 0.8\) in the proposed method.
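The simulation setup can be reproduced with a short NumPy sketch; the helper names and the unit noise power (so that SNR and INR in dB translate directly into source powers) are assumptions of this sketch, not code from the paper.

```python
import numpy as np

def ula_steering(M, theta_deg):
    """Steering vector of an M-element half-wavelength ULA for a
    direction theta_deg (degrees from broadside)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def snapshots(M, thetas_deg, powers, N, rng):
    """N array snapshots: uncorrelated complex Gaussian sources at
    thetas_deg with the given powers, plus unit-power white noise."""
    A = np.stack([ula_steering(M, t) for t in thetas_deg], axis=1)
    S = (rng.standard_normal((len(powers), N)) +
         1j * rng.standard_normal((len(powers), N))) / np.sqrt(2)
    S *= np.sqrt(np.asarray(powers))[:, None]
    noise = (rng.standard_normal((M, N)) +
             1j * rng.standard_normal((M, N))) / np.sqrt(2)
    return A @ S + noise

# Scenario from the text: M = 17, desired signal at 0 deg (SNR 10 dB),
# interferers at -25 and 35 deg (INR 30 dB), N = 100 snapshots.
rng = np.random.default_rng(0)
X = snapshots(17, [0.0, -25.0, 35.0], [10.0, 1000.0, 1000.0], 100, rng)
```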

In all simulation runs, the performance of the proposed beamformer is compared with several typical robust adaptive beamformers given as follows.

  1. The DL method in [6]; the diagonal loading factor is set to 10 times the noise power.

  2. The signal subspace projection (SSP) method in [8].

  3. The worst-case optimization method in [25]; the value \(\varepsilon =0.3M\) is used in the figures.

  4. The robust design with magnitude response constraints (RAB-MRC) in [27]; the ripple is 0.2 dB, the fixed beamwidth is \(8{}^\circ \), the relative regularization factor \(\gamma \) is 6, and the designed beamwidth is discretized with a step size of \(1{}^\circ \).

  5. The reconstruction-based beamformer (Reconstruction) in [10]; the possible angular sector of the desired signal is \({\Theta } = \left[ {{{\bar{\theta } }_0}- 5{}^\circ , {{\bar{\theta } }_0}+5{}^\circ } \right] \), the corresponding complement sector is \(\bar{\Theta } = \left[ { - {{90}^\circ }, {{\bar{\theta } }_0}- 5{}^\circ } \right) \cup \left( {{{\bar{\theta } }_0}+5{}^\circ , {{90}^\circ }} \right] \), and the region of interference is discretized with a step size of \(1{}^\circ \).
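As a reference for baseline 1, the diagonal-loading beamformer with a loading of 10 times the noise power can be sketched as follows; this is a minimal sketch assuming unit noise power, and `dl_weight` is an illustrative name.

```python
import numpy as np

def dl_weight(X, a, noise_power=1.0):
    """Diagonal-loading beamformer [6]: the sample covariance matrix is
    loaded with 10x the noise power (the factor assumed in the text)
    before forming the distortionless weight vector."""
    M, N = X.shape
    R_hat = (X @ X.conj().T) / N                  # sample covariance matrix
    R_dl = R_hat + 10.0 * noise_power * np.eye(M) # diagonal loading
    Ri_a = np.linalg.solve(R_dl, a)
    return Ri_a / (a.conj() @ Ri_a)
```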

In the first example, we investigate the output SINR versus the input SNR in the case of a fixed signal direction error; the optimal performance is also provided for comparison. The actual direction is fixed at \(3^\circ \). As shown in Fig. 1, the proposed method and the reconstruction-based beamformer in [10] achieve better performance over a large range of SNR compared with the beamformers in [6, 8, 25, 27], and this improvement is especially remarkable at high SNRs. This implies that the proposed method is not sensitive to the power of the desired signal, i.e., the desired signal steering vector and interference-plus-noise covariance matrix estimates in our method are effective in this situation. The DL method is sensitive to the direction mismatch, which may be due to an ill-suited DL factor. The performance of the proposed method falls slightly below that of the reconstruction beamformer in [10] when the SNR is larger than 5 dB, mainly because a small part of the desired signal energy still remains in the interference-plus-noise covariance matrix of the proposed design.

Fig. 1 Output SINR versus SNR in the case of the fixed signal direction error

In the second example, the SINR performance of these methods is shown versus the number of training snapshots. Other parameters are the same as in the first experiment except the number of snapshots. The result is shown in Fig. 2. Compared with the existing algorithms, the proposed method provides a satisfactory convergence rate and output performance, implying that the proposed covariance estimator is well conditioned under small sample sizes. The reconstruction-based beamformer in [10] also has an excellent sample convergence rate.

Fig. 2 SINR versus the number of snapshots in the case of the fixed signal direction error

In the third example, we analyze the impact of the correlation coefficient constraint parameter \(\rho \) on the output SINR of the proposed algorithm. The direction mismatch is assumed to be random and uniformly distributed over \(\left[ -5{}^\circ , 5{}^\circ \right] \). As seen from Fig. 3, the proposed method experiences serious performance degradation when \(\rho \) is small, because the constraint region is larger for smaller \(\rho \). If the region of interest includes an interference signal, the interference may be regarded as the desired signal, and the beamformer may attempt to suppress the desired signal as if it were interference; that is, the calculated steering vector may converge to an interference steering vector or a linear combination of interference steering vectors due to the large constraint set. Therefore, \(\rho \) should not be chosen too small. The proposed method also suffers a performance degradation in output SINR when \(\rho \) is too large, because the constraint region may then fail to cover the entire uncertainty region.

Fig. 3 Output SINR versus the correlation coefficient constraint parameter \(\rho \)

The SINR performance versus the pointing error is illustrated in Fig. 4, where the pointing error varies from \(-5{}^\circ \) to \(5{}^\circ \). Based on the results of the third example, \(\rho \) is set to 0.7 in the proposed method here, and the beamwidth of [27] is set to \(10^\circ \); other parameters remain the same as in the first experiment. As expected, a larger mismatch angle leads to a worse SINR. Even a small pointing error leads to severe performance degradation for the DL beamformer, whereas the other approaches are more robust against pointing error. The proposed algorithm achieves performance very close to the beamformer in [10]. This indicates that the proposed algorithm remains robust over a large pointing error range because it can flexibly control the beamwidth of the robust region by choosing the parameter \(\rho \).

Fig. 4 SINR versus pointing error

For the more practical scenario of random look direction mismatch, we consider the influence of a random signal direction error on the array output SINR. The random direction-of-arrival mismatch of the desired signal is assumed to be uniformly distributed in \(\left[ {- 5{}^\circ , 5{}^\circ } \right] \); other parameters remain the same as in the first experiment. Figure 5 shows the output SINR of the beamformers versus the input SNR. The proposed method still performs well and maintains a satisfactory performance, showing that it is robust against the desired signal look direction mismatch.

Fig. 5 Output SINR versus SNR in the case of the random direction errors

Fig. 6 Output SINR versus SNR in the case of the steering vector random error

So far, only the direction mismatch of the desired signal has been considered. For further insight, a more general mismatch scenario is considered in the following, modeled as \({\mathbf{a}} = {\bar{\mathbf{a}}}\left( {{{\bar{\theta } }_0}} \right) +{\mathbf{e}}\), where \({\mathbf{e}}\) is a zero-mean complex Gaussian random vector distributed as \(\mathcal {CN}\left( {0,\sigma _e^2} \right) \). In this example, \(\sigma _e^2=0.1\); other parameters are identical to the case of the fixed signal direction error. Figures 6 and 7 show the performance curves versus the SNR and the number of snapshots, respectively, in the case of the steering vector random error. The proposed technique still achieves satisfactory performance for this more general type of mismatch, indicating that it has the potential to play a significant role in more general mismatched situations.
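The random steering-vector mismatch \({\mathbf{a}} = {\bar{\mathbf{a}}} + {\mathbf{e}}\) can be drawn as below. This sketch assumes \(\sigma _e^2\) is the per-element variance of a circularly symmetric perturbation (the text leaves this convention implicit), and `perturbed_steering` is an illustrative name.

```python
import numpy as np

def perturbed_steering(a_bar, sigma2_e, rng):
    """Mismatched steering vector a = a_bar + e with e ~ CN(0, sigma2_e I),
    taking sigma2_e as the per-element variance (split equally between
    the real and imaginary parts)."""
    M = a_bar.shape[0]
    e = np.sqrt(sigma2_e / 2.0) * (rng.standard_normal(M) +
                                   1j * rng.standard_normal(M))
    return a_bar + e

# Example with the text's value sigma_e^2 = 0.1 on a 17-element nominal vector.
rng = np.random.default_rng(0)
a_bar = np.ones(17, dtype=complex)
a = perturbed_steering(a_bar, 0.1, rng)
```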

Fig. 7 SINR versus the number of snapshots in the case of the steering vector random error

Fig. 8 Output SINR versus \(\sigma _e^2\) in the case of steering vector error

In this example, we investigate the impact of the steering vector random error on the proposed algorithm. The output SINR of the proposed algorithm is displayed versus \(\sigma _e^2\) in Fig. 8. It can be observed that the proposed algorithm is sensitive to the steering vector error \(\sigma _e^2\). The main reason is that the proposed algorithm constrains the desired signal steering vector to satisfy the norm constraint, which may lead to an inaccurate approximation and performance degradation of the beamformer. In addition, we assume that the correlation coefficient constraint can distinguish the desired signal region from the interference region based on the norm constraint; if the steering vector norm is violated severely, the correlation coefficient constraint may also become invalid. Hence, the proposed algorithm is more suitable for applications where the steering vectors satisfy, or approximately satisfy, the norm constraint.

5 Conclusion

In this paper, we have derived a novel robust adaptive beamforming technique that achieves satisfactory performance by jointly estimating the signal steering vector and the covariance matrix. The desired signal steering vector is designed to guarantee robustness against steering vector mismatch and finite sample effects, and its optimal solution is found by exploiting the hidden convexity of the original design. Moreover, a novel formulation for interference-plus-noise covariance matrix estimation has been developed based on the subspace theorem. Several numerical examples demonstrate the effectiveness of the proposed design.