1 Introduction

The scaled total least squares (STLS) problem (or technique) was first proposed in [20] to give a unified treatment of the ordinary least squares (OLS) problem, the total least squares (TLS) problem and the data least squares (DLS) problem. Paige and Strakoš [19] reformulated the STLS problem and presented a detailed analysis of conditions that guarantee the STLS problem has a unique solution. With their formulation, the STLS problem is given as follows

$$\begin{aligned} \min \left\| \begin{bmatrix} E,&f \\ \end{bmatrix}\right\| _F,\quad \text {subject to }\;\lambda b-f \in {\mathscr {R}}(A+E), \end{aligned}$$
(1)

where \(A\in {\mathbb {R}}^{m\times n}\), \(b\in {\mathbb {R}}^m\), \(\lambda \) is a positive real number, \(\Vert \cdot \Vert _F\) denotes the Frobenius norm and \({\mathscr {R}}(\cdot )\) is the range space. Let \([E_{S},\; f_{S}]\) be the solution to (1), then the solution to the linear system \((A+E_{S})\lambda x=\lambda b-f_{S}\) is called the STLS solution and denoted by \(x_{S}\). As shown in [19], when \(\lambda =1\), \(\lambda \rightarrow 0\) and \(\lambda \rightarrow \infty \), \(x_{S}\) becomes the TLS solution \(x_{T}\), the OLS solution \(x_{\mathrm {O}}\) and the DLS solution \(x_D\), respectively.

The condition number gives a quantitative measurement of the maximum amplification of the resulting change in solution with respect to a perturbation in the data, and has been extensively studied. The interested reader is referred to the comprehensive survey [5]. For the STLS problem, Zhou et al. [28] considered its perturbation analysis and presented the normwise, mixed and componentwise condition numbers. With the perturbation theory of singular value decomposition (SVD) given in [24], Li and Jia [16] gave a different approach to derive the normwise and componentwise condition numbers of the STLS problem, and the corresponding structured condition numbers were also discussed. It should be noted that the normwise condition number in [28] contains Kronecker products which make it impractical to compute for large-scale problems. Based on the fact that \(\Vert A\Vert _2=\Vert A^TA\Vert _2^{1/2}\), where \(\Vert \cdot \Vert _2\) denotes the spectral norm of matrix or Euclidean norm of vector, some closed formulas and upper (or lower) bounds of the normwise condition number for the TLS problem were given in [1, 14], and these results are easy to compute and do not contain Kronecker product any more. Xie et al. [27] showed that the expressions of the condition numbers given in [1, 14] are mathematically equivalent. However, computing exact value of the condition number with the formula given in [1, 27] needs to calculate the matrix cross product, which is a source of rounding error and potentially numerical unstable [11, p. 386]. By designing iterative procedure or exploiting the SVDs in solving the TLS problem, some progress to avoid computing matrix cross product was made in [1, 14, 27]. In this paper, we present a new explicit expression of the normwise condition number of the STLS problem. The new expression is easy to compute and does not need to compute Kronecker product or matrix cross product. Meanwhile, the new expression should be of particular interest to the practitioners from applied disciplines, who are more likely to directly compute the normwise condition number of the STLS problem.

The rest of the paper is organized as follows: Sect. 2 contains the main results of the paper. Numerical experiments are presented in Sect. 3. Concluding remarks are given in Sect. 4. Before proceeding to the following sections, we introduce some notation first. For any matrix B, \(A\otimes B=[a_{ij}B]\) denotes the Kronecker product of A and B. \(A\circ B=[a_{ij}b_{ij}]\) denotes the Hadamard product of A and B. \(\mathrm {vec}(\cdot )\) is a linear map defined by \(\mathrm {vec}(A)=[a_{1,1},\ldots , a_{m,1},\ldots , a_{1,n},\ldots , a_{m,n}]^T\).

2 Main results

As stated in the Introduction, the TLS problem can be treated as a special case of the STLS problem. An interesting result is that we can solve the STLS problem by finding the solution to a special TLS problem. When \(\lambda =1\), we get the following TLS problem

$$\begin{aligned} \min \left\| \begin{bmatrix} E,&f \\ \end{bmatrix}\right\| _F,\quad \text {subject to }\;b-f \in {\mathscr {R}}(A+E). \end{aligned}$$
(2)

It is easy to check that when \(x_{S}\) is the solution of (1), \(\lambda x_{S}\) is the TLS solution to the following TLS problem

$$\begin{aligned} \min \left\| \begin{bmatrix} E,&f \\ \end{bmatrix}\right\| _F,\quad \text { subject to }\;\lambda b-f \in {\mathscr {R}}(A+E). \end{aligned}$$
(3)

Let the SVDs of the matrices \([A, \; \lambda b]\) and A be

$$\begin{aligned} U^T\begin{bmatrix} A,&\lambda b \\ \end{bmatrix}V={\varSigma },\quad {\hat{U}}^TA{\hat{V}}={\hat{{\varSigma }}}, \end{aligned}$$

where \(U=[u_1, \ldots , u_{n+1}]\in {\mathbb {R}}^{m\times (n+1)}\), \(V=[v_1,\ldots , v_{n+1}]\in {\mathbb {R}}^{(n+1)\times (n+1)}\), \({\varSigma }=\mathrm {diag}(\sigma _1,\ldots ,\sigma _{n+1})\) with \(\sigma _1\ge , \ldots ,\ge \sigma _{n+1}\ge 0\), \({\hat{U}}=[{\hat{u}}_1, \ldots , {\hat{u}}_{n}]\in {\mathbb {R}}^{m\times n}\), \({\hat{V}}=[{\hat{v}}_1,\ldots , {\hat{v}}_{n}]\in {\mathbb {R}}^{n\times n}\), and \({\hat{{\varSigma }}}=\mathrm {diag}({\hat{\sigma }}_1,\ldots ,{\hat{\sigma }}_{n})\) with \({\hat{\sigma }}_1\ge , \ldots ,{\hat{\sigma }}_{n}\ge 0\). Analogous to the Golub and Van Loan condition [9] for the TLS problem to guarantee the existence and uniqueness of a solution, Zhou et al. [28] presented the following sufficient condition to ensure the STLS problem has a unique solution

$$\begin{aligned} {\hat{\sigma }}_{n}>\sigma _{n+1}>0, \end{aligned}$$
(4)

which implies that both A and [Ab] are of full column rank. Therefore, by (4) the TLS solution to (3) is

$$\begin{aligned} \lambda x_{S}=(A^TA-\sigma _{n+1}I_n)^{-1}A^T(\lambda b), \end{aligned}$$

which gives

$$\begin{aligned} x_{S}=(A^TA-\sigma _{n+1}I_n)^{-1}A^Tb. \end{aligned}$$
(5)

From (5), we get the normal equation of the STLS problem

$$\begin{aligned} (A^TA-\sigma _{n+1}I_n)x_{S}=A^Tb. \end{aligned}$$
(6)

When \(\lambda =1\), the Rayleigh quotient iteration (RQI) and preconditioned conjugate gradient method (PCG) were combined to solve the TLS problem with the normal equation (6) in [4]. For a given \(\lambda \), by substituting the block matrix [Ab] with \([A,\lambda b]\), the RQIPCG method can be directly applied to solve the STLS problem.

Let \({\varDelta } A\) and \({\varDelta } b\) be the corresponding perturbations to A and b, then we have the following perturbed STLS problem

$$\begin{aligned} \min \left\| \begin{bmatrix} E,&f \\ \end{bmatrix}\right\| _F,\text { subject to }\lambda (b+{\varDelta } b)-f \in {\mathscr {R}}\left( (A+{\varDelta } A)+E\right) . \end{aligned}$$
(7)

For the perturbed STLS problem, Zhou et al. [28] and Li and Jia [16] presented two different approaches to show that when the perturbation \([{\varDelta } A, \; {\varDelta } b]\) is sufficiently small, the perturbed STLS problem admits a unique solution. We adapt the result given in [16] as the following theorem.

Theorem 1

Under the assumption (4), if \(\Vert [{\varDelta } A,\; {\varDelta } b]\Vert _F\) is small enough, then the perturbed STLS problem (7) has unique solution. Moreover, if we denote the solution by \(x_{PS}\), then

$$\begin{aligned} {\varDelta } x=x_{PS}-x_S= K\begin{bmatrix} \mathrm {vec}({\varDelta } A) \\ {\varDelta } b \\ \end{bmatrix}+{\mathscr {O}}\left( \left\| \left[ {\varDelta } A, {\varDelta } b\right] \right\| _F^2\right) , \end{aligned}$$
(8)

where

$$\begin{aligned} K&=M^{-1}\left( \left( \frac{2}{\Vert r\Vert _2^2}A^Trr^T-A^T\right) \left( \begin{bmatrix} x_S^T,&-1 \\ \end{bmatrix}\otimes I_m \right) -\begin{bmatrix} I_n\otimes r^T,&0_{n\times m} \\ \end{bmatrix}\right) \end{aligned}$$
(9)

with \(M=A^TA-\sigma ^2_{n+1}I_n\) and \(r=Ax_S-b\).

Li and Jia [16] also presented a very detailed comparison of the above results with those given in [28], and showed that their perturbation estimate is the same as that given in [28]. According to Theorem 1, it can be easily deduced that if we set

$$\begin{aligned} F: {\mathbb {R}}^{m\times n}\times {\mathbb {R}}^{m}\rightarrow & {} {\mathbb {R}}^{n}, \\ \begin{bmatrix} A,&\lambda b \\ \end{bmatrix}\rightarrow & {} x_{S}=M^{-1}A^Tb, \end{aligned}$$

then the map F is Fréchet differentiable at \([A,\; \lambda b]\) under the Assumption (4) and the Fréchet derivative of F at \([A,\; \lambda b]\) is given by

$$\begin{aligned} DF(A,\lambda b):=K. \end{aligned}$$

According to the definition of condition number given in [8, 21], the relative normwise condition number of the STLS problem is given by

$$\begin{aligned} \kappa _{rF}(A,\lambda b)=\lim _{\delta \rightarrow 0}\sup _{\left\| [{\varDelta } A, \; \lambda {\varDelta } b]\right\| _F<\delta }\frac{\frac{\left\| F\left( A+{\varDelta } A,\lambda (b+{\varDelta } b)\right) -F(A, \lambda b)\right\| _2}{\Vert F(A,\lambda b)\Vert _2}}{\frac{\left\| [{\varDelta } A, \; \lambda {\varDelta } b]\right\| _F}{\left\| [A, \; \lambda b]\right\| _F}}. \end{aligned}$$
(10)

When F is Fréchet differentiable, \(\kappa _{rF}(A,\lambda b)\) reduces to

$$\begin{aligned} \kappa _{rF}(A,\lambda b)=\frac{\left\| DF\left( A,\lambda b\right) \right\| _2\left\| [A, \; \lambda b]\right\| _F}{\Vert F(A,\lambda b)\Vert _2}, \end{aligned}$$

and \(\kappa _{F}(A,\lambda b)=\left\| DF\left( A,\lambda b\right) \right\| _2\) is the absolute condition number. For the convenience of presentation, we summarize the above discussion as the following theorem.

Theorem 2

Under the assumption (4), the relative normwise condition number of the STLS problem defined by (10) is

$$\begin{aligned} \kappa _{rF}(A,\lambda b)=\frac{\left\| K\right\| _2\left\| [A, \; \lambda b]\right\| _F}{\Vert F(A,\lambda b)\Vert _2}, \end{aligned}$$

and its absolute condition number is

$$\begin{aligned} \kappa _{F}(A,\lambda b)=\left\| K\right\| _2, \end{aligned}$$
(11)

where K is given by (9).

It should be noted that the Kronecker product enlarges the size of matrix and may make it impractical to explicitly form K when m and n are large. To eliminate the influence of the Kronecker product, we present a new compact form of the normwise condition number for the STLS problem in the following theorem. Considering the relationship between relative and absolute condition numbers, we only focus on \(\kappa _{F}(A,\lambda b)\) in the following parts.

Theorem 3

The absolute condition number \(\kappa _{F}(A,\lambda b)\) for the STLS problem has the following two equivalent forms

$$\begin{aligned}&\kappa _{F1}(A,\lambda b)\nonumber \\&\quad =\left\| M^{-1}\left( (1+\Vert x_S\Vert ^2_2)A^TA-A^Trx_S^T-x_Sr^TA+\Vert r\Vert _2^2I_n\right) M^{-1}\right\| _2^{\frac{1}{2}}, \end{aligned}$$
(12)

and

$$\begin{aligned} \kappa _{F2}(A,\lambda b)=\left\| M^{-1} \begin{bmatrix} A^T,&\Vert x_S\Vert _2A^T\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) ,&\Vert r\Vert _2\left( I_n-\frac{1}{\Vert r\Vert _2^2}A^Trx_S^T\right) \\ \end{bmatrix}\right\| _2, \end{aligned}$$
(13)

where M and r are given in (9).

Proof

For a real matrix X, \(\Vert X\Vert _2=\Vert X^TX\Vert _2^{1/2}=\Vert XX^T\Vert _2^{1/2}\) holds. Thus with (9) and (11) we have

$$\begin{aligned} \kappa _{F}(A,\lambda b)=\left\| K\right\| _2=\left\| KK^T\right\| _2^{\frac{1}{2}}. \end{aligned}$$

Since M is symmetric, by the equality \(\mathrm {vec}(AXB)=(B^T\otimes A)\mathrm {vec}(X)\) [13, Ch. 4] we can get

$$\begin{aligned} KK^T= & {} M^{-1}\left( \left( \frac{2}{\Vert r\Vert _2^2}A^Trr^T-A^T\right) \left( \begin{bmatrix} x_S^T,&-1 \\ \end{bmatrix}\otimes I_m\right) -\begin{bmatrix} I_n\otimes r^T,&0_{n\times m} \\ \end{bmatrix}\right) \nonumber \\&\times \left( \left( \begin{bmatrix} x_S \\ -1 \\ \end{bmatrix} \otimes I_m\right) \left( \frac{2}{\Vert r\Vert _2^2}rr^TA-A\right) -\begin{bmatrix} I_n\otimes r \\ 0_{m\times n} \\ \end{bmatrix}\right) M^{-1}\nonumber \\= & {} M^{-1}\left( \left( 1+\Vert x_S\Vert ^2_2\right) A^TA-A^Trx_S^T-x_Sr^TA+\Vert r\Vert _2^2I_n\right) M^{-1}\end{aligned}$$
(14)
$$\begin{aligned}= & {} M^{-1}\left( \begin{bmatrix} A^T,&I_n \\ \end{bmatrix} \begin{bmatrix} \left( 1+\Vert x_S\Vert _2^2\right) I_m,&-rx_S^T \\ -x_Sr^T,&\Vert r\Vert _2^2I_n \\ \end{bmatrix} \begin{bmatrix} A \\ I_n \\ \end{bmatrix} \right) M^{-1}. \end{aligned}$$
(15)

Since

$$\begin{aligned} \begin{bmatrix} \left( 1+\Vert x_S\Vert _2^2\right) I_m,&-rx_S^T \\ -x_Sr^T,&\Vert r\Vert _2^2I_n \\ \end{bmatrix}= & {} \begin{bmatrix} I_m,&-\frac{1}{\Vert r\Vert _2^2}rx_S^T \\ 0_{n\times m},&I_n \\ \end{bmatrix} \begin{bmatrix} I_m+\Vert x_S\Vert _2^2\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) ,&0_{m\times n} \\ 0_{n\times m},&\Vert r\Vert _2^2I_n \\ \end{bmatrix}\nonumber \\&\quad \times \begin{bmatrix} I_m,&0_{m\times n} \\ -\frac{1}{\Vert r\Vert _2^2}x_Sr^T,&I_n \\ \end{bmatrix} \end{aligned}$$
(16)

and

$$\begin{aligned}&I_m+\Vert x_S\Vert _2^2\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) \nonumber \\&\quad =\begin{bmatrix} I_m,&\Vert x_S\Vert _2\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) \\ \end{bmatrix} \begin{bmatrix} I_m \\ \Vert x_S\Vert _2\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) \\ \end{bmatrix}, \end{aligned}$$
(17)

we substitute (16) and (17) into (15) and get

$$\begin{aligned} KK^T=M^{-1}WW^TM^{-1}, \end{aligned}$$
(18)

where

$$\begin{aligned} W=\begin{bmatrix} A^T,&\Vert x_S\Vert _2A^T\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) ,&\Vert r\Vert _2\left( I_n-\frac{1}{\Vert r\Vert _2^2}A^Trx_S^T\right) \\ \end{bmatrix}. \end{aligned}$$

Using the formula \(\Vert X\Vert _2=\Vert XX^T\Vert _2^{1/2}\) again, (12) and (13), the two equivalent forms of \(\kappa _{F}(A,\lambda b)\), follow from (14) and (18), respectively. \(\square \)

Remark 1

Theorem 3 shows that the two equivalent forms do not contain Kronecker product any more, and the sizes of the matrices in (11), (12) and (13) are \(n\times m(n+1)\), \(n\times n\) and \(n\times (2m+n)\), respectively. When m and n are comparable and large, if we compute the normwise condition number of the STLS problem with its explicit expressions, which is always preferred by the practitioners from applied disciplines, we note that the computation of (12) and (13) requires much less storage space compared with (11). However, as pointed out in [1, 11], computing the matrix cross product may lead to large rounding errors. The explicit formulation of (12) and (13) needs to calculate \(A^TA\) and \(M^{-1}\), which is not desired. But, it should be claimed that \(M^{-1}\) is often an intermediate result when the STLS problem is solved with its normal equation (6). For example, in the RQIPCG method, finding \(M^{-1}\) can be transformed into solving triangular linear systems which can be efficiently computed and preserves better numerical stability [11, Ch. 8].

The TLS problem is a special case of the STLS problem, with Theorems 2 and 3 we get something new on the normwise condition number of the TLS problem.

Corollary 1

When \(\lambda =1\), the STLS problem degenerates into the TLS problem. From Theorems 2 and 3, the absolute condition number of the TLS problem has the following equivalent expressions

$$\begin{aligned} \kappa _{TLSF}(A, b)= & {} \left\| M^{-1}\left( \left( \frac{2}{\Vert r\Vert _2^2}A^Trr^T-A^T\right) \left( \begin{bmatrix} x_{T}^T,&-1 \\ \end{bmatrix}\otimes I_m\right) -\begin{bmatrix} I_n\otimes r^T,&0_{n\times m} \\ \end{bmatrix}\right) \right\| _2,\nonumber \\ \kappa _{TLSF1}(A, b)= & {} \left\| M^{-1}\left( (1+\Vert x_{T}\Vert ^2_2)A^TA-A^Trx_{T}^T-x_{T}r^TA+\Vert r\Vert _2^2I_n\right) M^{-1}\right\| _2^{\frac{1}{2}}, \end{aligned}$$
(19)

and

$$\begin{aligned} \kappa _{TLSF2}(A, b)&=\left\| M^{-1}\begin{bmatrix} A^T,&\Vert x_{T}\Vert _2A^T\left( I_m-\frac{1}{\Vert r\Vert _2^2}rr^T\right) ,&\Vert r\Vert _2\left( I_n-\frac{1}{\Vert r\Vert _2^2}A^Trx_{T}^T\right) \\ \end{bmatrix}\right\| _2, \end{aligned}$$
(20)

where \(M=A^TA-\sigma ^2_{n+1}I_n\), \(\sigma _{n+1}\) is the smallest singular value of [Ab], and \(r=Ax_T-b\).

Remark 2

We note that \(\kappa _{TLSF1}(A, b)\) was an intermediate result in the proof of Theorem 1 in [1, Equation 3.8], and \(\kappa _{TLSF}(A, b)\) was given by Jia and Li [14, Theorem 2]. Based on the normal equation \(Mx_{T}=A^Tb\) and its variants, Baboulin and Gratton [1] showed that

$$\begin{aligned} \kappa _{TLSF1}(A, b)&=\left\| (1+\Vert x_{T}\Vert _2^2)M^{-1}\left( A^TA+\sigma _{n+1}\left( I_n-\frac{2}{1+ \Vert x_{T}\Vert _2^2}x_{T}x_{T}^T\right) \right) M^{-1}\right\| _2^{\frac{1}{2}}. \end{aligned}$$
(21)

To avoid computing the matrix cross product in (21), a power method [10, Ch. 7] based iterative procedure was proposed to compute the normwise condition number. Baboulin and Gratton [1] also suggested that when the TLS problem is solved by the SVD method, the computation of (21) can be further simplified. But their simplified expression needs the SVDs of both A and [Ab], which may be expensive. Jia and Li [14] further showed that only the SVD of [Ab] will be enough. Based on the SVDs of A and/or [Ab], some computable upper and lower bounds of the condition number were also presented in [1, 14]. In addition, it can be easily checked that

$$\begin{aligned} A^TA+\sigma _{n+1}\left( I_n-\frac{2}{1+\Vert x_{T}\Vert _2^2}x_{T}x_{T}^T\right) \end{aligned}$$

is positive definite. Xie et al. [27, Remark 2] suggested to use Cholesky decomposition to further simplify the expression of normwise condition number, but no explicit expression was given there. According to Remark 1, our new compact form \(\kappa _{TLSF2}(A, b)\) requires less storage space, and does not need to calculate Cholesky decomposition and the SVD. So we may say that the \(\kappa _{TLSF2}(A, b)\) is a new and more efficient result on the explicit computation of the normwise condition number of the TLS problem.

As in [16, 28], when \(\lambda \rightarrow 0\), we get \(\sigma _{n+1}\rightarrow 0\). Therefore, \(M^{-1}=(A^TA-\sigma _{n+1}I_n)^{-1}\rightarrow (A^TA)^{-1}\) and \(x_S\) converges to \(x_{\mathrm {O}}\). By the equality \(A^Tr=0\) with \(r=Ax_{\mathrm {O}}-b\), from Theorems 2 and 3 we get the following three equivalent expressions of the normwise condition number for the OLS problem

$$\begin{aligned} \kappa _{OLSF}(A, b)= & {} \left\| (A^TA)^{-1}\left( -A^T \left( \begin{bmatrix} x_{\mathrm {O}}^T,&-1 \\ \end{bmatrix}\otimes I_m\right) -\begin{bmatrix} I_n\otimes r^T,&0_{n\times m} \\ \end{bmatrix}\right) \right\| _2,\nonumber \\ \kappa _{OLSF1}(A,\lambda b)= & {} \left\| (A^TA)^{-1}\left( (1+\Vert x_{\mathrm {O}}\Vert ^2_2)A^TA+\Vert r\Vert _2^2I_n\right) (A^TA)^{-1}\right\| _2^{\frac{1}{2}}, \end{aligned}$$
(22)

and

$$\begin{aligned} \kappa _{OLSF2}(A,\lambda b)=\left\| (A^TA)^{-1}\begin{bmatrix} A^T,&\Vert x_{\mathrm {O}}\Vert _2A^T,&\Vert r\Vert _2I_n\\ \end{bmatrix}\right\| _2. \end{aligned}$$
(23)

With a little algebra, we can check that \(\kappa _{OLSF}(A, b)\) can be rewritten as follows

$$\begin{aligned} \kappa _{OLSF}(A, b)=\left\| \begin{bmatrix} -(x_{\mathrm {O}}^T\otimes A^{\dagger })-(A^TA)^{-1}\otimes r^T,&A^{\dagger } \\ \end{bmatrix}\right\| _2, \end{aligned}$$
(24)

where \(A^{\dagger }=(A^TA)^{-1}A^T\) is the Moore-Penrose inverse of matrix A (see [2, 26]). It should be noted that (24), (22) and (23) have been given by Li and Wang [18] in investigating the condition numbers for the indefinite least squares problem.

3 Numerical experiment

The computation and/or estimation of the condition numbers for the TLS problem has been extensively studied [1, 6, 14, 27], which can be easily adapted to compute the normwise condition number of the STLS problem. From computational aspect, direct calculation of the condition number is not desired due to the heavy computation burden. When we solve the STLS problem, some intermediate results will be produced. Using these intermediate results to compute the condition number will largely reduce the computational burden. For example, in the implementation of RQIPCG method for solving the TLS problem, the approximate singular value of \(\sigma _{n+1}\) and the Cholesky factor R of \(A^TA\) are available. So with these intermediate results the formulation of the condition number becomes much easier. Diao and Sun [6] proposed a power method based on the RQIPCG procedure to estimate the upper bounds of the mixed and componentwise condition numbers of the TLS problem, which can also be modified to estimate the normwise condition number of the STLS problem. Here, we will not focus on devising algorithms to compute or estimate the normwise condition numbers. But, for completeness and to show the condition number estimation should be solver based, we present a RQIPCG procedure based power method to compute the normwise condition number of the STLS problem in the Appendix part.

As we have shown, our new compact forms require less storage space and are easy to calculate. This should be convenient for the practitioners to directly compute the normwise condition number of the STLS problem via software, like Matlab. Here, we only focus on the “naive” method to compute the normwise condition number of the STLS problem, which means that we first formulate the explicit expression of the matrix and then compute its spectral norm as the value of condition number. We use two built-in commands norm(\(\cdot \),2) and normest(\(\cdot \),tol) in Matlab R2010b. All the computations are performed on a PC with Intel i5-6600M CPU 3.30 GHz and 4.00 GB RAM.

Example 1

Investigating the influence of different forms on the computation of the normwise condition number for the STLS problem is our main purpose. Similar to [1], we construct the following random STLS problem. Let \([A,\; \lambda b]\) be defined by

$$\begin{aligned} \begin{bmatrix} A,&\lambda b \\ \end{bmatrix}=Y\begin{bmatrix} D \\ 0 \\ \end{bmatrix}Z^T\in {\mathbb {R}}^{m\times (n+1)}, \quad Y=I_m-2yy^T,\quad Z=I_{n+1}-2zz^T, \end{aligned}$$

where \(y\in {\mathbb {R}}^{m}\), \(z\in {\mathbb {R}}^{n+1}\) are random unit vectors, and \(D=\mathrm {diag}(n,n-1,\ldots ,1, 1-e_p)\) for given parameter \(e_p\). Due to the interlacing property [3, p. 178], we get

$$\begin{aligned} {\hat{\sigma }}_n-\sigma _{n+1}\le \sigma _n-\sigma _{n+1}=e_p. \end{aligned}$$

Thus \(e_p\) gives a measure of the distance of the problem to nongenericity, and the solution \(x_S\) is given by (5). By varying \(\lambda \), \(e_p\) and the size of the matrix, we report the CPU time in seconds of computing the condition number of the STLS problem with its different forms.

We repeat the computation 200 times for one group of settings. Since these two commands give same values of the condition numbers, we only report the mean values of CPU time in Table 1. From Table 1, we can see that when \(m=500,n=300\), computing (11) becomes very time consuming due to the large size of the matrix, but (13) still works well. When we increase (mn) to (1000, 700), the computation of \(\kappa _F(A,\lambda b)\) with norm(\(\cdot \),2) breaks down due to the lack of memory. The numerical results also show that the normest(\(\cdot \),tol) can largely reduce the CPU time in computing the spectral norm of large and sparse matrix. Of course, the matrix K in (11) is sparse due to the Kronecker product. Moreover, we can also find that the CPU time for computing (13) is always the smallest in our numerical experiment. Therefore, we may say that the new compact form can greatly improve the efficiency of computing the exact value of the normwise condition number for the STLS problem with its explicit expression, and should be of much interest to the practical applications especially from applied disciplines.

Table 1 Average CPU time in seconds of two “naive” methods

Example 2

From the definition of condition number (10), the relative forward error is bounded by the relative condition number multiplied by the relative backward error. We consider the following example adopted from [15] and arising from the application in signal restoration. Let \(\alpha =1.25\), \(n=500\) and \(\omega =80\). The convolution matrix \({\bar{A}}\) is an \(n\times (n-2\omega )\) Toeplitz matrix, and its first column is given by

$$\begin{aligned} a_{i,1}=\frac{1}{\sqrt{2\pi \alpha ^2}}\exp \left[ \frac{-(\omega -i+1)^2}{2\alpha ^2}\right] ,\quad i=1,2,\ldots ,2\omega +1 \end{aligned}$$

and \(t_{i,1}=0\) otherwise. The elements in the first row are all zeros except \(a_{11}=1\). A Toeplitz matrix A and a right-hand side vector b are constructed as \(A={\bar{A}}+E\) and \(b={\bar{g}}+e\), where \({\bar{g}}=[1,\ldots ,1]^T\), E is a random Toeplitz matrix with the same structure as \({\bar{A}}\) and e is a random vector. The entries of E and e are generated from the standard normal distribution \({\mathscr {N}}(0,1)\), and scaled such that

$$\begin{aligned} \Vert E\Vert _{F}=\gamma \Vert {\bar{A}}\Vert _{F},\quad \Vert e\Vert _2=\gamma \Vert {\bar{g}}\Vert _2,\quad \gamma =10^{-3}. \end{aligned}$$

The elements of the perturbations \({\varDelta } A\) and \({\varDelta } b\) to A and b are randomly generated from the open interval \((-1,1)\), and scaled so that

$$\begin{aligned} \Vert {\varDelta } A\Vert _F=\epsilon \Vert A\Vert _F,\quad \Vert {\varDelta } b\Vert _{2}=\epsilon \Vert b\Vert _{2},\quad \epsilon =10^{-8}. \end{aligned}$$

Under the above settings, the perturbations to the coefficient matrices are not structured perturbation and can be interpreted as the additive white noise to the model [25].

Let \(x+{\varDelta } x\) be the solution to the perturbed STLS problem. The relative forward error is defined as \({\Vert {\varDelta } x\Vert _2}/{\Vert x\Vert _2}\), and its approximate upper bounds with respect to different expressions of the normwise condition number are given by \(\epsilon \kappa _{rF}(A,\lambda b)\) and \(\epsilon \kappa _{rF2}(A,\lambda b)\). To compare the running time for computing the approximate upper forward error bound, we use \(\mathrm {CPUi}_{\kappa _{*}}\) with \(i=1,2\) to denote the CPU time of computing the corresponding approximate upper forward error bounds with respect to norm(\(\cdot \),2) and normest(\(\cdot ,10^{-4}\)), respectively.

Table 2 Comparison of performances of different expressions in estimating the forward error

The numerical results are reported in Table 2, from which we can find that although \(\epsilon \kappa _{rF}(A,\lambda b)\) and \(\epsilon \kappa _{rF2}(A,\lambda b)\) give same upper bounds, \(\epsilon \kappa _{rF2}(A,\lambda b)\) requires much less CPU time compared with \(\epsilon \kappa _{rF}(A,\lambda b)\). This shows the great efficiency of the new compact form of the normwise condition number in estimating the forward error of the STLS problem.

4 Concluding remark

In this paper, new and compact forms of the normwise condition number for the STLS problem are presented. Compared with the original expression, the new forms enjoy great computational efficiency in calculating the exact value of the normwise condition number, which may extend the applications of the condition number theory of the STLS problem to other areas. Since the estimation of the condition numbers for the TLS problem has been extensively studied and these methods can be directly applied to estimate the normwise condition number of the STLS problem, to avoid repetitive work we only outline a RQIPCG procedure based power method to estimate the normwise condition number of the STLS problem in the Appendix part. This is mainly to show that using the intermediate results produced in solving the STLS problem can largely reduce the computational burden in the condition number estimation.

In addition, Li and Jia [16] also considered the linear structured condition number for the STLS problem. Here, we need to point out that the structured normwise condition number of the STLS problem can also be transformed into some compact form with the same method given in [17, Remark 3.3]. But as Li and Wang [17] claimed, the compact form becomes very complicated due to the structure matrices. Thus, in this paper we will not consider the simplification of the structured normwise condition number of the STLS problem, and more researches on the structured condition number theory should be referred to [12, 22, 23].