1 Introduction and preliminaries

The equality constrained linear least squares problem can be stated as follows:

$$\begin{aligned} \mathrm{LSE:}\quad \mathop {\min }\limits _{Bx=d } \left\| b - Ax\right\| _2, \end{aligned}$$
(1.1)

where \(A \in {\mathbb {R}^{m \times n}}\) and \(B \in {\mathbb {R}^{s \times n}}\) with \(m+s\ge n\ge s\ge 0, b \in \mathbb {R}^{m}\) and \(d \in \mathbb {R}^{s}\). Hereafter, the symbols \(\mathbb {R}^{m \times n}\) and \(\mathbb {R}^{n}\) stand for the set of \(m\times n\) real matrices and the real vector space of dimension n, respectively. To ensure that the LSE problem (1.1) has a unique solution, we need to assume that [6]

$$\begin{aligned} \mathrm{rank}(B)=s,\quad \mathrm{null}(A)\cap \mathrm{null}(B)=\{0\}. \end{aligned}$$
(1.2)

The first condition in (1.2) is equivalent to the linear system \(Bx=d\) being consistent and hence to the LSE problem (1.1) having a solution; the second one, which says that the matrix \([A^{T}, B^{T}]^{T}\) has full column rank, is equivalent to the solution of (1.1) being unique. Here, for a matrix \(C\), \(C^{T}\) denotes its transpose. Throughout this paper, we assume that the conditions in (1.2) always hold. In this case, the unique solution to the LSE problem (1.1) can be written as [6, 8]

$$\begin{aligned} x(A,B,b,d) = (AP)^ \dag b +B_A^ \dag d, \end{aligned}$$
(1.3)

where

$$\begin{aligned} P = I_n - B^ \dag B,\quad B_A^ \dag = (I_n - (AP)^ \dag A)B^ \dag \end{aligned}$$

with \(I_n\) being the identity matrix of order n and \( B^ \dag \) being the Moore-Penrose inverse of B. When \(s=0\), i.e., \(B=0\) and \(d=0\), the LSE problem (1.1) reduces to the classic linear least squares (LLS) problem

$$\begin{aligned} \mathrm{LLS:}\quad \mathop {\min }\limits _{x\in \mathbb {R}^{n} } \left\| b - Ax\right\| _2, \end{aligned}$$
(1.4)

the conditions in (1.2) reduce to A having full column rank, which ensures that the solution to (1.4) is unique, and the solution (1.3) reduces to \(x(A,b) = A^ \dag b\).
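To make the solution formula (1.3) concrete, here is a minimal numerical sketch (Python with NumPy/SciPy; the sizes and random data are our own illustrative choices) that evaluates (1.3) and checks it against the classical null-space method for the LSE problem:

```python
import numpy as np
from scipy.linalg import null_space, lstsq

rng = np.random.default_rng(0)
m, n, s = 8, 5, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((s, n))
b = rng.standard_normal(m)
d = rng.standard_normal(s)

# Solution via (1.3): x = (AP)^+ b + B_A^+ d with P = I - B^+ B.
P = np.eye(n) - np.linalg.pinv(B) @ B
AP_pinv = np.linalg.pinv(A @ P)
BA_pinv = (np.eye(n) - AP_pinv @ A) @ np.linalg.pinv(B)
x_formula = AP_pinv @ b + BA_pinv @ d

# Reference: null-space method. Write x = B^+ d + Z y with Z an orthonormal
# basis of null(B) and solve the resulting unconstrained LLS problem for y.
Z = null_space(B)
x0 = np.linalg.pinv(B) @ d
y, *_ = lstsq(A @ Z, b - A @ x0)
x_nullspace = x0 + Z @ y

print(np.allclose(x_formula, x_nullspace))   # True: both give the LSE solution
print(np.allclose(B @ x_formula, d))         # the constraint Bx = d is satisfied
```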

The LSE problem arises in many important applications, for example, in the analysis of large-scale structures, in signal processing, and in solving the inequality constrained least squares problem [4, 6, 19]. Hence, several authors have considered its algorithms and perturbation analysis (see, e.g., [4, 6, 8, 10, 19, 29]). An upper bound for the normwise condition number of the LSE problem was presented in [8], and the mixed and componentwise condition numbers of this problem, together with their easily computable upper bounds, can be derived from [21] as a special case.

In this paper, we mainly consider the partial condition number of the LSE problem when the data space \(\mathbb {R}^{m \times n}\times \mathbb {R}^{s \times n}\times \mathbb {R}^{m}\times \mathbb {R}^{s}\) and the solution space \(\mathbb {R}^{n}\) are measured by the weighted Frobenius norm

$$\begin{aligned} \left\| (\alpha _A A, \alpha _B B,\alpha _b b,\alpha _d d) \right\| _F=\sqrt{\alpha _A^2\left\| A\right\| _F^2+\alpha _B^2\left\| B\right\| _F^2+\alpha _b^2\left\| b \right\| _2^2+\alpha _d^2\left\| d \right\| _2^2}\nonumber \\ \end{aligned}$$
(1.5)

with \( \alpha _A>0, \alpha _B>0, \alpha _b>0\), and \(\alpha _d>0\), and the Euclidean norm \(\left\| x \right\| _2\), respectively. As mentioned in the Abstract, the partial condition number, also called the subspace condition number [7], refers to the condition number of a linear function of the LSE solution x(A, B, b, d), i.e., \(L^{T}x(A,B,b,d)\) with \(L\in \mathbb {R}^{n \times k}\) (\(k\le n\)). This kind of condition number has some advantages. For example, when L is the identity matrix or a column of the identity matrix, the partial condition number reduces to the condition number of the solution x(A, B, b, d) or of an element of the solution. Cao and Petzold first proposed the partial condition number for linear systems [7]; later, it was extended to the LLS problem and the total least squares problem [1, 2]. In [1, 2, 7], the authors also provided some specific motivations for investigating this kind of condition number.

The idea on the weighted Frobenius norm can be traced back to Gratton [14], who derived the normwise condition number for the LLS problem (1.4) based on the following weighted Frobenius norm

$$\begin{aligned} \left\| (\alpha A, \beta b) \right\| _F=\sqrt{\alpha ^2\left\| A\right\| _F^2+\beta ^2\left\| b \right\| _2^2},\quad \alpha>0, \beta >0. \end{aligned}$$
(1.6)

Subsequently, this kind of norm was used for the partial condition number of the LLS problem [1] and for the normwise condition number of the truncated singular value solution of a linear ill-posed problem [5]. As pointed out in [14], this norm is very flexible, since with it we can monitor the perturbations on A and b. For example, if \(\alpha \rightarrow \infty \), no perturbation on A will be permitted; similarly, if \(\beta \rightarrow \infty \), no perturbation on b will be allowed. Obviously, the norm in (1.5) is a simple generalization of the one in (1.6) and is also very flexible. There is another kind of generalization of the norm in (1.6): \(\left\| (T A, \beta b) \right\| _F\), which was used by Wei et al. in [30] for the normwise condition number of the rank-deficient LLS problem. Here, T is a positive diagonal matrix.

Like structured linear systems and structured LLS problems, structured LSE problems arise in many applications, e.g., in signal processing and in optimization [6, 19]. Rump [25, 26] presented the perturbation theory for structured linear systems with respect to normwise and componentwise distances; the obtained results generalize the corresponding ones in [15]. For structured LLS problems, Xu et al. [31] considered their structured normwise condition numbers, and Cucker and Diao [9] presented their structured mixed and componentwise condition numbers. In addition, the structured condition numbers for the total least squares problem were provided by Li and Jia in [20]. The results in [20, 25, 26, 31] show that the structured condition number can be much tighter than the unstructured one. Hence, building on the study of the partial condition number, we also investigate the structured partial condition number of the structured LSE problem.

The rest of this paper is organized as follows. Section 2 presents the expression and closed formulae of the partial condition number for the LSE problem. The expression of the corresponding structured partial condition number is given in Sect. 3. On the basis of the probabilistic spectral norm estimator by Hochstenbach [16] and the small-sample statistical condition estimation (SSCE) method by Kenney and Laub [18], Sect. 4 is devoted to the statistical estimates and algorithms for the results derived in Sects. 2 and 3. Numerical experiments illustrating the obtained results are provided in Sect. 5.

Before moving to the following sections, we first introduce some results on the operator ‘vec’ and Kronecker product, and the generalized singular value decomposition (GSVD) of a matrix pair. They will be necessary later in this paper.

For a matrix \(A=[a_1,\ldots , a_n]\in \mathbb {R}^{m\times n}\) with \(a_i\in \mathbb {R}^{m}\), the operator ‘vec’ is defined as follows

$$\begin{aligned} \mathrm{vec}(A)=[a_1^{T},\ldots ,a_n^{T}]^{T}\in \mathbb {R}^{mn}. \end{aligned}$$

Let \(A =(a_{ij})\in {\mathbb {R}^{m \times n}}\) and \(B \in {\mathbb {R}^{p \times q}}\). The Kronecker product between A and B is defined by (see, e.g., [17, Chapter 4])

$$\begin{aligned} A \otimes B = \left[ {\begin{array}{cccc} {{a_{11}}B}&{}\quad {{a_{12}}B}&{} \quad \cdots &{}\quad {{a_{1n}}B}\\ {{a_{21}}B}&{}\quad {{a_{22}}B}&{} \quad \cdots &{}\quad {{a_{2n}}B}\\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ {{a_{m1}}B}&{}\quad {{a_{m2}}B}&{} \quad \cdots &{}\quad {{a_{mn}}B} \end{array}} \right] \in {\mathbb {R}^{mp \times nq}}. \end{aligned}$$

This definition implies that when \(m=1\) and \(q=1\), i.e., when A is a row vector and B is a column vector,

$$\begin{aligned} A \otimes B=BA. \end{aligned}$$
(1.7)

From [17, Chapter 4], we have

$$\begin{aligned} (A \otimes B)^{T}= & {} (A^{T} \otimes B^{T}),\end{aligned}$$
(1.8)
$$\begin{aligned} \mathrm{vec}(AXB)= & {} \left( {B^{T}} \otimes A\right) \mathrm{vec}(X),\end{aligned}$$
(1.9)
$$\begin{aligned} \varPi _{mn} \mathrm{vec}(A)= & {} \mathrm{vec}({A^{T}}),\nonumber \\ \varPi _{pm} (A \otimes B) \varPi _{nq}= & {} (B \otimes A), \end{aligned}$$
(1.10)

where \(X\in {\mathbb {R}^{n \times p}}\), and \(\varPi _{mn}\) is the vec-permutation matrix, which depends only on the orders m and n. In particular, when \(n=1\), i.e., when A is a column vector, \(\varPi _{nq}=I_q\) and hence

$$\begin{aligned} \varPi _{pm} (A \otimes B)= (B \otimes A). \end{aligned}$$
(1.11)

In addition, the following result is also from [17, Chapter 4]

$$\begin{aligned} (A \otimes B)(C \otimes D)=(AC) \otimes (BD), \end{aligned}$$
(1.12)

where the matrices C and D are of suitable orders.
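The identities (1.9)-(1.12) are easy to check numerically. The following short sketch (Python with NumPy; the explicit construction of the vec-permutation matrix is our own illustrative helper) verifies (1.9) and (1.10):

```python
import numpy as np

def vec(M):
    # Column-stacking operator 'vec', matching the definition above.
    return M.flatten(order="F")

def vec_perm(m, n):
    # Vec-permutation matrix Pi_mn with Pi_mn @ vec(A) = vec(A.T) for A of size m x n.
    P = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            P[i * n + j, j * m + i] = 1.0
    return P

rng = np.random.default_rng(1)
m, n, p = 3, 4, 2
A = rng.standard_normal((m, n))
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, 5))

print(np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X)))   # identity (1.9)
print(np.allclose(vec_perm(m, n) @ vec(A), vec(A.T)))          # identity (1.10)
```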

For the matrix pair A, B in (1.1) and (1.2), there exist orthogonal matrices \(U\in {\mathbb {R}^{m \times m}}\) and \(V\in {\mathbb {R}^{s \times s}}\), and a nonsingular matrix \(X\in {\mathbb {R}^{n \times n}}\) such that

$$\begin{aligned} A = U\varSigma X^{ - 1} ,\quad B = V\varLambda X^{ - 1}, \end{aligned}$$
(1.13)

where

$$\begin{aligned} \varSigma = \left[ {\begin{array}{cc} {\varSigma _1 } &{}\quad 0 \\ 0 &{}\quad 0 \\ \end{array}} \right] = \left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {S_A } &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] ,\quad \varLambda = \left[ {\begin{array}{cc} 0 &{} {\varLambda _1 } \\ \end{array}} \right] = \left[ {\begin{array}{ccc} 0 &{} {S_B } &{} {} \\ 0 &{} {} &{} {I_{s - t} } \\ \end{array}} \right] \end{aligned}$$

with

$$\begin{aligned}&S_A = \mathrm{diag}(\alpha _1 , \ldots ,\alpha _t ),\ S_B = \mathrm{diag}(\beta _1 , \ldots ,\beta _t ),\\&{\alpha _1\geqslant \alpha _2\geqslant \cdots \geqslant \alpha _t>0} ,{0<\beta _1\leqslant \beta _2\leqslant \cdots \leqslant \beta _t},\ \alpha _i^2 + \beta _i^2 = 1, {i=1,\ldots , t}, \end{aligned}$$

and \(t=\mathrm{rank}(A)+s-n\). This decomposition is called the GSVD of the matrix pair (see, e.g., [13, p. 309], [27]). When \(B=0\), the GSVD reduces to the SVD of A:

$$\begin{aligned} A = U\varSigma X^{ T} , \end{aligned}$$
(1.14)

where X is orthogonal and \(\varSigma _1 =\mathrm{diag}(\sigma _1 , \ldots ,\sigma _{\mathrm{rank}(A)})\) with \(\sigma _i\) being the i-th singular value of A.

2 The partial condition number

Let \(L\in {\mathbb {R}^{n \times k}}\) with \(k\le n\). We consider the following function

$$\begin{aligned}&g:\mathbb {R}^{m \times n} \times \mathbb {R}^{s \times n} \times \mathbb {R}^m \times \mathbb {R}^s \rightarrow \mathbb {R}^k\\&(A,B,b,d)\rightarrow g(A,B,b,d) = L^{T} x(A,B,b,d) = L^{T} (AP)^{\dag } b+L^{T} B_A^{\dag } d. \end{aligned}$$

From [8, 21], it can be seen that the function g is continuously Fréchet differentiable in a neighborhood of (A, B, b, d). Thus, denoting by \(g^{\prime }\) the Fréchet derivative of g, and using the chain rule for compositions of derivatives or the results in [8, 21], we have

$$\begin{aligned}&g^{\prime }(A,B,b,d):\mathbb {R}^{m \times n} \times \mathbb {R}^{s \times n} \times \mathbb {R}^m \times \mathbb {R}^s \rightarrow \mathbb {R}^k \\&\quad (\varDelta A,\varDelta B,\varDelta b,\varDelta d)\rightarrow g^{\prime } (A,B,b,d){\circ }{(\varDelta A,\varDelta B,\varDelta b,\varDelta d)} \\&\qquad = L^{T} ((AP)^{T} AP)^{\dag } (\varDelta A)^{T} r - L^{T} (AP)^{\dag } (\varDelta A)x + L^{T} (AP)^{\dag } (\varDelta b)\\&\quad \qquad - L^{T} ((AP)^{T} AP)^{\dag } (\varDelta B)^{T} (AB_A^{\dag } )^{T} r - L^{T} B_A^{\dag } (\varDelta B)x + L^{T} B_A^{\dag } (\varDelta d). \end{aligned}$$

Here, \( g^{\prime } (A,B,b,d){\circ }{(\varDelta A,\varDelta B,\varDelta b,\varDelta d)}\) denotes the application of the linear map \(g^{\prime }(A,B,b,d)\) to the perturbation \((\varDelta A,\varDelta B,\varDelta b,\varDelta d)\) at the point (A, B, b, d), and \(r=b-Ax\) is the residual vector. Thus, according to [11, 24], we have the absolute normwise condition number of g at the point (A, B, b, d) based on the weighted Frobenius norm (1.5):

$$\begin{aligned} \kappa _{LSE} (A,B,b,d) = \mathop {\max }\limits _{(\alpha _A\varDelta A,\alpha _B\varDelta B,\alpha _b\varDelta b,\alpha _d\varDelta d) \ne 0} \frac{{\left\| {g^{\prime } (A,B,b,d){\circ } (\varDelta A,\varDelta B,\varDelta b,\varDelta d)} \right\| _2 }}{{\left\| {(\alpha _A\varDelta A,\alpha _B\varDelta B,\alpha _b\varDelta b,\alpha _d\varDelta d)} \right\| _F}}.\nonumber \\ \end{aligned}$$
(2.1)

As mentioned in Sect. 1, the condition number \(\kappa _{LSE} (A,B,b,d)\) is called the partial condition number of the LSE problem (1.1) with respect to L.

In the following, we provide an expression of \(\kappa _{LSE} (A,B,b,d)\).

Theorem 2.1

The partial condition number of the LSE problem (1.1) with respect to L is

$$\begin{aligned} \kappa _{LSE} (A,B,b,d) = \left\| {M_{g^{\prime } } } \right\| _2 , \end{aligned}$$
(2.2)

where

$$\begin{aligned} M_{g^{\prime } } =\left[ {M_1 ,M_2 ,M_3 ,M_4 } \right] \end{aligned}$$
(2.3)

with

$$\begin{aligned} M_1= & {} \frac{{\left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{mn} - {x^{T} \otimes (L^{T} (AP)^{\dag } )} }}{{\alpha _A }}, \\ M_2= & {} - \frac{{\left( {(r^{T} AB_A^{\dag } ) \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{sn} + {x^{T} \otimes (L^{T} B_A^{\dag } )} }}{{\alpha _B }}, \\ M_3= & {} \frac{{L^{T} (AP)^{\dag } }}{{\alpha _b }},\quad M_4 = \frac{{L^{T} B_A^{\dag } }}{{\alpha _d }}. \end{aligned}$$

Proof

Applying the operator vec to \(g^{\prime } (A,B,b,d){\circ }{(\varDelta A,\varDelta B,\varDelta b,\varDelta d)}\) and using (1.9) and (1.10), we have

$$\begin{aligned}&g^{\prime } (A,B,b,d){\circ }{(\varDelta A,\varDelta B,\varDelta b,\varDelta d)} =\mathrm{vec}(g^{\prime } (A,B,b,d){\circ }{(\varDelta A,\varDelta B,\varDelta b,\varDelta d)}) \nonumber \\&\quad = \left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{mn}\mathrm{vec}(\varDelta A) - \left( {x^{T} \otimes (L^{T} (AP)^{\dag } )} \right) \mathrm{vec}(\varDelta A) \\&\qquad - \left( {(r^{T} AB_A^{\dag } ) \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{sn}\mathrm{vec}(\varDelta B)- \left( {x^{T} \otimes (L^{T} B_A^{\dag } )} \right) \mathrm{vec}(\varDelta B)\\&\qquad + L^{T} (AP)^{\dag } (\varDelta b) + L^{T} B_A^{\dag } (\varDelta d)\nonumber \\&\quad = M_{g^{\prime } } \left[ {\begin{array}{c} {\alpha _A \mathrm{vec}(\varDelta A)} \\ {\alpha _B \mathrm{vec}(\varDelta B)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] . \end{aligned}$$

Thus, considering (1.5) and the fact that for any matrix \(C, \left\| C\right\| _F=\left\| \mathrm{vec}(C)\right\| _2\),

$$\begin{aligned} \kappa _{LSE} (A,B,b,d) = \mathop {\max }\limits _{(\alpha _A\varDelta A,\alpha _B\varDelta B,\alpha _b\varDelta b,\alpha _d\varDelta d) \ne 0} \frac{{\left\| {M_{g^{\prime } } \left[ {\begin{array}{c} {\alpha _A \mathrm{vec}(\varDelta A)} \\ {\alpha _B \mathrm{vec}(\varDelta B)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}{{\left\| {\left[ {\begin{array}{c} {\alpha _A \mathrm{vec}(\varDelta A)} \\ {\alpha _B \mathrm{vec}(\varDelta B)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }} = \left\| {M_{g^{\prime } } } \right\| _2.\nonumber \\ \end{aligned}$$
(2.4)

\(\square \)
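For illustration, the following sketch (Python with NumPy; the random data, the weights \(\alpha _A=\alpha _B=\alpha _b=\alpha _d=1\), and \(L=I_n\) are our own illustrative choices) assembles \(M_{g^{\prime }}\) from (2.3) and evaluates the spectral norm in (2.2). The vec-permutation matrix acting on \(\mathrm{vec}(\varDelta B)\) is built with order sn, matching the dimensions of \(\varDelta B\):

```python
import numpy as np

def vec_perm(m, n):
    # Pi_mn with Pi_mn @ vec(A) = vec(A.T) for A of size m x n (column-major vec).
    idx = [(r % n) * m + (r // n) for r in range(m * n)]
    return np.eye(m * n)[idx]

rng = np.random.default_rng(2)
m, n, s = 10, 6, 3
A, B = rng.standard_normal((m, n)), rng.standard_normal((s, n))
b, d = rng.standard_normal(m), rng.standard_normal(s)
L = np.eye(n)
aA = aB = ab = ad = 1.0

P = np.eye(n) - np.linalg.pinv(B) @ B
AP_pinv = np.linalg.pinv(A @ P)
BA_pinv = (np.eye(n) - AP_pinv @ A) @ np.linalg.pinv(B)
x = AP_pinv @ b + BA_pinv @ d                    # solution (1.3)
r = b - A @ x                                    # residual vector
K = np.linalg.pinv((A @ P).T @ (A @ P))          # ((AP)^T AP)^+

M1 = (np.kron(r[None, :], L.T @ K) @ vec_perm(m, n)
      - np.kron(x[None, :], L.T @ AP_pinv)) / aA
M2 = -(np.kron((r @ A @ BA_pinv)[None, :], L.T @ K) @ vec_perm(s, n)
       + np.kron(x[None, :], L.T @ BA_pinv)) / aB
M3 = L.T @ AP_pinv / ab
M4 = L.T @ BA_pinv / ad
M = np.hstack([M1, M2, M3, M4])                  # M_{g'} in (2.3)

kappa_LSE = np.linalg.norm(M, 2)                 # partial condition number (2.2)
print(kappa_LSE)
```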

Remark 2.1

Setting \(L=I_n\) and \(\alpha _A=\alpha _B=\alpha _b=\alpha _d=1\) in (2.2), and using the property on the spectral norm that for the matrices C and D of suitable orders, \(\left\| [C,D]\right\| _2\le \left\| C\right\| _2+\left\| D\right\| _2\), we have

$$\begin{aligned} \kappa _{LSE} (A,B,b,d)\le & {} \left\| {\left( {r^{T} \otimes ((AP)^{T} AP)^{\dag } } \right) \varPi _{mn} - {x^{T} \otimes (AP)^{\dag } } }\right\| _2\\&+ \left\| {\left( {(r^{T} AB_A^ \dag ) \otimes ((AP)^{T} AP)^{\dag } } \right) \varPi _{sm} + {x^{T} \otimes B_A^{\dag } } }\right\| _2\\&+\left\| {(AP)^{\dag } }\right\| _2+ \left\| { B_A^{\dag } }\right\| _2\\= & {} {\kappa _{LSEup} (A,B,b,d),} \end{aligned}$$

where \(\kappa _{LSEup} (A,B,b,d)\) is essentially the same as the upper bound for the normwise condition number of the LSE problem (1.1) obtained in [8]. Note that \(\kappa _{LSEup} (A,B,b,d)\le 4 \kappa _{LSE} (A,B,b,d)\), which follows from the simple fact that \(\left\| C\right\| _2+\left\| D\right\| _2+\left\| E\right\| _2+\left\| F\right\| _2\le 4\left\| [C,D,E,F]\right\| _2\). Hence, this upper bound exceeds the normwise condition number \(\kappa _{LSE} (A,B,b,d)\) by at most a factor of four and is a good approximation of it. The numerical results given in Sect. 5 confirm this claim.

Note that the expression of the partial condition number \(\kappa _{LSE} (A,B,b,d)\) given in Theorem 2.1 contains Kronecker products, which introduce some large sparse matrices. The following theorem provides a closed formula of \(\kappa _{LSE} (A,B,b,d)\) free of Kronecker products.

Theorem 2.2

A closed formula of the partial condition number \(\kappa _{LSE} (A,B,b,d)\) is given by

$$\begin{aligned} \kappa _{LSE} (A,B,b,d) = \left\| C\right\| _2^{1/2}, \end{aligned}$$
(2.5)

where

$$\begin{aligned} C= & {} \left( {\frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }} + \frac{{\left\| {r^{T} AB_A^{\dag } } \right\| _2^2 }}{{\alpha _B^2 }}} \right) L^{T} \left( ((AP)^{T} AP)^{\dag }\right) ^2 L\nonumber \\&+ \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) L^{T} ((AP)^{T} AP)^{\dag } L \nonumber \\&+ \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _B^2 }} + \frac{1}{{\alpha _d^2 }}} \right) L^{T} B_A^{\dag } (B_A^{\dag } )^{T} L + \frac{1}{{\alpha _B^2 }}L^{T} ((AP)^{T} AP)^{\dag } xr^{T} AB_A^{\dag } (B_A^{\dag } )^{T} L \nonumber \\&+ \frac{1}{{\alpha _B^2 }}L^{T} B_A^{\dag } (B_A^{\dag } )^{T}A^{T} rx^{T} ((AP)^{T} AP)^{\dag } L . \end{aligned}$$
(2.6)

Proof

Noting

$$\begin{aligned} \left\| {M_{g^{\prime } } } \right\| _2 = \left\| {M_{g^{\prime } } M_{g^{\prime } }^{T} } \right\| _2^{1/2}=\left\| M_1 M_1^{T} + M_2 M_2^{T} + M_3 M_3^{T} + M_4 M_4^{T}\right\| _2^{1/2}, \end{aligned}$$

and

$$\begin{aligned} M_3 M_3^{T}= \frac{{L^{T} ((AP)^{T} AP)^{\dag } L}}{{\alpha _b^2 }} , \quad M_4 M_4^{T}=\frac{{L^{T} B_A^{\dag } (B_A^{\dag } )^{T} L}}{{\alpha _d^2 }}, \end{aligned}$$
(2.7)

it suffices to obtain the expressions of \( M_1 M_1^{T}\) and \( M_2 M_2^{T}\).

Let

$$\begin{aligned} M_{11} = \left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{mn} ,\quad M_{12} = {x^{T} \otimes (L^{T} (AP)^{\dag } )} . \end{aligned}$$

Then

$$\begin{aligned} M_1 M_1^{T} = \frac{1}{{\alpha _A^2 }}\left( {M_{11} M_{11}^{T} + M_{12} M_{12}^{T} - M_{11} M_{12}^{T} - M_{12} M_{11}^{T} } \right) . \end{aligned}$$

By (1.8) and (1.12), we have

$$\begin{aligned} M_{11} M_{11}^{T}= & {} \left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \left( {r \otimes (((AP)^{T} AP)^{\dag } L)} \right) \\= & {} \left\| r \right\| _2^2 L^{T} \left( ((AP)^{T} AP)^{\dag }\right) ^2 L,\\ M_{12} M_{12}^{T}= & {} \left( {x^{T} \otimes (L^{T} (AP)^{\dag } )} \right) \left( {x \otimes (((AP)^{\dag } )^{T} L)} \right) = \left\| x \right\| _2^2 L^{T} ((AP)^{T} AP)^{\dag } L. \end{aligned}$$

Note that

$$\begin{aligned} (AP)^{\dag } r= & {} (AP)^{\dag } (b - Ax) = x - B_A^{\dag } d - (AP)^{\dag } Ax \quad \text {by (1.3)}\\= & {} x - B_A^{\dag } Bx - (AP)^{\dag } Ax=0 . \end{aligned}$$

The last equality in the above equation follows from the GSVD of the matrix pair A, B in (1.13) and the expressions for \((AP)^{\dag }\) and \(B_A^{\dag }\) given in Remark 2.3 below. In fact,

$$\begin{aligned} B_A^{\dag } B= & {} X\varLambda ^{\dag }\varLambda X^{-1}=X\left[ {\begin{array}{cc} 0 &{} \quad 0 \\ 0 &{}\quad I_s \\ \end{array}} \right] X^{-1},\\ (AP)^{\dag } A= & {} X\left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] \varSigma X^{-1} =X\left[ {\begin{array}{cc} I_{n-s} &{}\quad 0 \\ 0 &{}\quad 0 \\ \end{array}} \right] X^{-1} , \end{aligned}$$

which imply that \(B_A^{\dag } B+(AP)^{\dag } A=I_n\) and hence \( x - B_A^{\dag } Bx - (AP)^{\dag } Ax=0\). Thus, by (1.8), (1.11), and (1.12),

$$\begin{aligned} M_{11} M_{12}^{T}= & {} \left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \left( {(((AP)^{\dag } )^{T} L) \otimes x} \right) \quad \text {by (1.8) and (1.11)}\\= & {} (r^{T} ((AP)^{\dag } )^{T} L) \otimes (L^{T} ((AP)^{T} AP)^{\dag } x) \quad \text {by (1.12)}\\= & {} 0=( M_{12} M_{11}^{T})^{T}. \end{aligned}$$

As a result,

$$\begin{aligned} M_1 M_1^{T} = \frac{1}{{\alpha _A^2 }}\left( {\left\| r \right\| _2^2 L^{T} \left( ((AP)^{T} AP)^{\dag }\right) ^2 L + \left\| x \right\| _2^2 L^{T} ((AP)^{T} AP)^{\dag } L} \right) . \end{aligned}$$
(2.8)

Now, let

$$\begin{aligned} M_{21} = \left( {(r^{T} AB_A^{\dag } ) \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{sn} ,\quad M_{22} = {x^{T} \otimes \left( L^{T} B_A^{\dag } \right) } . \end{aligned}$$

Then

$$\begin{aligned} M_2 M_2^{T} = \frac{1}{{\alpha _B^2 }}\left( {M_{21} M_{21}^{T} + M_{22} M_{22}^{T} + M_{21} M_{22}^{T} + M_{22} M_{21}^{T} } \right) . \end{aligned}$$
(2.9)

By (1.8) and (1.12), we get

$$\begin{aligned} M_{21} M_{21}^{T}= & {} \left( {(r^{T} AB_A^{\dag } ) \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \left( {(r^{T} AB_A^{\dag } )^{T} \otimes (((AP)^{T} AP)^{\dag } L)} \right) \nonumber \\= & {} \left\| {r^{T} AB_A^{\dag } } \right\| _2^2 L^{T} \left( ((AP)^{T} AP)^{\dag }\right) ^2 L, \end{aligned}$$
(2.10)
$$\begin{aligned} M_{22} M_{22}^{T}= & {} \left( {x^{T} \otimes (L^{T} B_A^{\dag })} \right) \left( {x \otimes ((B_A^{\dag } )^{T} L)} \right) = \left\| x \right\| _2^2 L^{T} B_A^{\dag } (B_A^{\dag } )^{T} L , \end{aligned}$$
(2.11)

and by (1.8), (1.11), (1.12), and (1.7), we get

$$\begin{aligned} M_{21} M_{22}^{T}= & {} \left( {(r^{T} AB_A^{\dag } ) \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \left( {((B_A^{\dag } )^{T} L) \otimes x} \right) \quad \text {by (1.8) and (1.11)} \nonumber \\= & {} (r^{T} AB_A^{\dag } (B_A^{\dag } )^{T} L) \otimes (L^{T} ((AP)^{T} AP)^{\dag } x) \quad \text {by (1.12)}\nonumber \\= & {} L^{T} ((AP)^{T} AP)^{\dag } xr^{T} AB_A^{\dag } (B_A^{\dag } )^{T} L \quad \text {by (1.7)}\nonumber \\= & {} \left( M_{22} M_{21}^{T}\right) ^{T}. \end{aligned}$$
(2.12)

Substituting (2.10)–(2.12) into (2.9) gives

$$\begin{aligned} M_2 M_2^{T}= & {} \frac{1}{{\alpha _B^2 }}\left( \left\| {r^{T} AB_A^{\dag } } \right\| _2^2 L^{T} \left( ((AP)^{T} AP)^{\dag }\right) ^2 L + \left\| x \right\| _2^2 L^{T} B_A^{\dag } (B_A^{\dag } )^{T} L \right) \nonumber \\&+ \frac{1}{{\alpha _B^2 }}L^{T} ((AP)^{T} AP)^{\dag } xr^{T} AB_A^{\dag } (B_A^{\dag } )^{T} L\nonumber \\&+ \frac{1}{{\alpha _B^2 }}L^{T} B_A^{\dag } (B_A^{\dag } )^{T} A^{T}rx^{T} ((AP)^{T} AP)^{\dag } L. \end{aligned}$$
(2.13)

From (2.7), (2.8), and (2.13), we have the desired result (2.5). \(\square \)
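As a numerical check, the closed formula (2.5)-(2.6) can be compared with \(\Vert M_{g^{\prime }}\Vert _2\). The following lines continue the sketch given after the proof of Theorem 2.1 (so A, x, r, AP_pinv, BA_pinv, K, L, and the weights are as defined there); up to rounding errors, the two values coincide:

```python
# Closed formula (2.6); note that C below is k x k instead of k x (mn + sn + m + s).
C = ((r @ r / aA**2 + (r @ A @ BA_pinv) @ (r @ A @ BA_pinv) / aB**2) * L.T @ K @ K @ L
     + (x @ x / aA**2 + 1.0 / ab**2) * L.T @ K @ L
     + (x @ x / aB**2 + 1.0 / ad**2) * L.T @ BA_pinv @ BA_pinv.T @ L
     + (L.T @ K @ np.outer(x, r @ A @ BA_pinv) @ BA_pinv.T @ L) / aB**2
     + (L.T @ BA_pinv @ np.outer(BA_pinv.T @ A.T @ r, x) @ K @ L) / aB**2)

print(np.sqrt(np.linalg.norm(C, 2)), kappa_LSE)   # (2.5) versus (2.2): equal up to roundoff
```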

Remark 2.2

When \(B=0\) and \(d=0\), that is, when the LSE problem reduces to the LLS problem (1.4), \(P=I_n\) and \(\mathrm{rank}(A)=n\). Thus,

$$\begin{aligned} ((AP)^{T}(AP))^{\dag } = (A^{T}A)^{-1},\quad B_A^{\dag } = 0, \end{aligned}$$

and hence

$$\begin{aligned} \kappa _{LLS} (A,b)= & {} \left\| \frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }} L^{T} (A^{T}A)^{-2} L + \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) L^{T} (A^{T}A)^{-1}L\right\| _2^{1/2},\nonumber \\ \end{aligned}$$
(2.14)

which is the closed formula of the partial condition number of the LLS problem.

Furthermore, if L is a column vector, i.e., \(k=1\), then

$$\begin{aligned} \kappa _{LLS} (A,b)= & {} \left( \frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }} L^{T} (A^{T}A)^{-2} L + \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) L^{T} (A^{T}A)^{-1}L\right) ^{1/2} \nonumber \\= & {} \left( \frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }} \left\| L^{T} (A^{T}A)^{-1} \right\| _2^2 + \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) \left\| L^{T} A^{\dag }\right\| _2^2\right) ^{1/2},\nonumber \\ \end{aligned}$$
(2.15)

which is just the result given in Corollary 1 in [1].

Remark 2.3

Using the GSVD of the matrix pair A, B in (1.13), together with (3.3), (3.4), and (3.15) in [28], we have

$$\begin{aligned} (AP)^{\dag }= & {} X(\varSigma (I_n - \varLambda ^{\dag } \varLambda ))^{\dag } U^{T}\\= & {} X\left( {\left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {S_A } &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] \left( {I_n - \left[ {\begin{array}{cc} 0 &{}\quad 0 \\ {S_B^{ - 1} } &{}\quad 0 \\ 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] \left[ {\begin{array}{ccc} 0 &{}\quad {S_B } &{} \quad 0 \\ 0 &{}\quad 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] } \right) } \right) ^{\dag } U^{T} \\= & {} X\left( {\left[ {\begin{array}{cccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {S_A } &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] \left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] } \right) ^{\dag } U^{T} = X\left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] U^{T} \end{aligned}$$

and

$$\begin{aligned} B_A^{\dag }= & {} (I_n - (AP)^{\dag } A)X\varLambda ^{\dag } V^{T}\\= & {} \left( {I_n - X\left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{} \quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{} \quad 0 \\ \end{array}} \right] U^{T} U\left[ {\begin{array}{ccc} {I_{n - s} } &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {S_A } &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] X^{ - 1} } \right) X\varLambda ^{\dag } V^{T} \\= & {} X\left[ {\begin{array}{ccc} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {I_t } &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] \left[ {\begin{array}{cc} 0 &{}\quad 0 \\ {S_B^{ - 1} } &{}\quad 0 \\ 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] V^{T} = X\left[ {\begin{array}{cc} 0 &{}\quad 0 \\ {S_B^{ - 1} } &{}\quad 0 \\ 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] V^{T} = X\varLambda ^{\dag } V^{T}. \end{aligned}$$

Then

$$\begin{aligned} (AP)^{\dag } \left( {(AP)^{\dag } } \right) ^{T}= & {} X_1X_1^{T} , \quad AB_A^{\dag }=U\left[ {\begin{array}{cc} 0 &{} 0 \\ {S_AS_B^{ - 1} } &{} 0 \\ 0 &{} 0 \\ \end{array}} \right] V^{T}=U_2\left[ {\begin{array}{cc} {S_AS_B^{ - 1} } &{} 0 \\ 0 &{} 0 \\ \end{array}} \right] V^{T},\end{aligned}$$
(2.16)
$$\begin{aligned} B_A^{\dag } (B_A^{\dag })^{T}= & {} X\left[ {\begin{array}{ccc} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {S_B^{ - 2} } &{} \quad 0 \\ 0 &{}\quad 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] X^{T}=X_2\left[ {\begin{array}{cc} {S_B^{ - 2} } &{}\quad 0 \\ 0 &{}\quad {I_{s - t} } \\ \end{array}} \right] X^{T}_2 , \end{aligned}$$
(2.17)
$$\begin{aligned} AB_A^{\dag } (B_A^{\dag } )^{T}= & {} U\left[ {\begin{array}{ccc} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad {S_A S_B^{ - 2} } &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ \end{array}} \right] X^{T}= U_2\left[ {\begin{array}{cc} {S_A S_B^{ - 2} } &{}\quad 0 \\ 0 &{}\quad 0 \\ \end{array}} \right] X^{T}_2, \end{aligned}$$
(2.18)

where \(X=[X_1,X_2]\) with \( X_1\in \mathbb {R}^{n \times (n-s)}\) and \( X_2\in \mathbb {R}^{n \times s}\), and \(U=[U_1,U_2]\) with \( U_1\in \mathbb {R}^{m \times (n-s)}\) and \(U_2\in \mathbb {R}^{m \times (m-n+s)}\). Substituting (2.16)–(2.18) into (2.6) yields

$$\begin{aligned} C&= \left( {\frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }} + \frac{{\left\| {r^{T} U_2\left[ {\begin{array}{cc} {S_A S_B^{-1}} &{} 0 \\ 0 &{} 0 \\ \end{array}} \right] } \right\| _2^2 }}{{\alpha _B^2 }}} \right) L^{T} (X_1X_1^{T})^2 L + L^{T} (X_1S_1X_1^{T}+X_2S_2X_2^{T}) L \nonumber \\&\quad + \frac{1}{{\alpha _B^2 }}L^{T} X_1X^{T}_1 xr^{T} U_2\left[ {\begin{array}{cc} {S_A S_B^{ - 2} } &{} 0 \\ 0 &{} 0 \\ \end{array}} \right] X^{T}_2 L +\frac{1}{{\alpha _B^2 }}L^{T} X_2\left[ {\begin{array}{cc} {S_A S_B^{ - 2} } &{} 0 \\ 0 &{} 0 \\ \end{array}} \right] U^{T}_2 rx^{T} X_1X^{T}_1 L \end{aligned}$$
(2.19)

with

$$\begin{aligned} S_1=\left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) I_{n-s},\quad S_2=\left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _B^2 }} + \frac{1}{{\alpha _d^2 }}} \right) {\varLambda _1^{ - 2} }. \end{aligned}$$
(2.20)

In particular, when \(B=0\), the GSVD (1.13) reduces to the SVD of A (1.14). In this case, \(P=I_n\) and \(\mathrm{rank}(A)=n\). Hence, we have

$$\begin{aligned} (AP)^{\dag } = X\varSigma ^{\dag } U^{T} =X[\varSigma ^ {-1}_1,0] U^{T},\quad B_A^{\dag } = 0, \end{aligned}$$

and

$$\begin{aligned} (AP)^{\dag } \left( {(AP)^{\dag } } \right) ^{T} = X\varSigma ^ {-2}_1 X^{T}. \end{aligned}$$

Thus,

$$\begin{aligned} C = \frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }}L^{T}X \varSigma ^ {-4}_1 X^{T}L + \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) L^{T} X\varSigma ^ {-2}_1X^{T} L. \end{aligned}$$

As a result, we get a closed formula of the partial condition number of the LLS problem based on the SVD of A:

$$\begin{aligned} \kappa _{LLS} (A,b) = \left\| {SX^{T} L} \right\| _2, \end{aligned}$$
(2.21)

where S is a diagonal matrix with the i-th diagonal element being

$$\begin{aligned} S_{ii} =\frac{1}{\sigma _i} \sqrt{\frac{{\sigma _i^{-2}\left\| r \right\| _2^2+\left\| x \right\| _2^2 }}{{\alpha _A ^2 }} + \frac{1}{{\alpha _b ^2 }}} . \end{aligned}$$

The closed formula (2.21) is just the one given in Theorem 1 in [1], where it was derived by a different approach.
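The SVD-based formula (2.21) is straightforward to evaluate. The following sketch (Python with NumPy; the data, \(L=I_n\), and \(\alpha _A=\alpha _b=1\) are illustrative choices) computes it and cross-checks it against (2.14):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 12, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
aA = ab = 1.0
L = np.eye(n)

U, sig, Xt = np.linalg.svd(A, full_matrices=False)    # A = U diag(sig) X^T with X = Xt.T
x = np.linalg.lstsq(A, b, rcond=None)[0]
r = b - A @ x

S = np.diag(np.sqrt((sig**-2 * (r @ r) + x @ x) / aA**2 + 1.0 / ab**2) / sig)
kappa_LLS = np.linalg.norm(S @ Xt @ L, 2)             # formula (2.21)

# Cross-check against (2.14).
AtA_inv = np.linalg.inv(A.T @ A)
C = ((r @ r / aA**2) * L.T @ AtA_inv @ AtA_inv @ L
     + (x @ x / aA**2 + 1.0 / ab**2) * L.T @ AtA_inv @ L)
print(np.isclose(kappa_LLS, np.sqrt(np.linalg.norm(C, 2))))   # True
```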

Remark 2.4

Setting \(L=I_n\) and \(\alpha _A=\alpha _B=\alpha _b=\alpha _d=1\) in (2.19) and (2.5), and using (1.13), we have the following upper bound for the normwise condition number of the LSE problem:

$$\begin{aligned} \kappa _{LSE}(A,B,b,d) \le (a_1+a_2+a_3)^{1/2}, \end{aligned}$$

where

$$\begin{aligned}&a_1=\Vert r\Vert _2^2\left( 1+\frac{\alpha _1^2}{\beta _1^2}\right) \Vert X\Vert ^4_2, \quad a_2=\left( 1+\Vert x\Vert _2^2\right) \frac{1}{\beta _1^2}\Vert X\Vert ^2_2,\quad a_3=2\frac{\alpha _1}{\beta _1^2}\Vert r\Vert _2\Vert x\Vert _2\Vert X\Vert _2^3. \end{aligned}$$

Moreover, using (2.8) in [29], we obtain \(\Vert X\Vert _2=\frac{1}{\sigma _n}\), where \(\sigma _n\) is the smallest singular value of \(\left[ \begin{array}{c}A \\ B\end{array}\right] \). Hence, the main factors causing the ill-conditioning of the LSE problem are \(\Vert r\Vert _2, \beta _1\), and \(\sigma _n\).

3 The structured partial condition number

Suppose that \(\mathbb {S}_1\subseteq \mathbb {R}^{m\times n}\) and \(\mathbb {S}_2\subseteq \mathbb {R}^{s\times n}\) are two linear subspaces, which consist of two classes of structured matrices, respectively. From [15, 20, 25], we have that if \(A\in \mathbb {S}_1\) and \(B \in \mathbb {S}_2\), then

$$\begin{aligned} \mathrm{vec}(A)=\varPhi _{\mathbb {S}_1}s_1,\quad \mathrm{vec}(B)=\varPhi _{\mathbb {S}_2}s_2, \end{aligned}$$
(3.1)

where \(\varPhi _{\mathbb {S}_1}\in \mathbb {R}^{mn\times k_1}\) and \(\varPhi _{\mathbb {S}_2}\in \mathbb {R}^{sn\times k_2}\) are the fixed structure matrices reflecting the structures of \(\mathbb {S}_1\) and \(\mathbb {S}_2\), respectively, and \(s_1\in \mathbb {R}^{k_1}\) and \(s_2\in \mathbb {R}^{k_2}\) are the vectors of the independent parameters in the structured matrices, respectively. Based on the above explanation, the structured perturbations \(\varDelta A\in \mathbb {S}_1\) and \(\varDelta B \in \mathbb {S}_2\) can be written as

$$\begin{aligned} \mathrm{vec}(\varDelta A)=\varPhi _{\mathbb {S}_1}(\varDelta s_1),\quad \mathrm{vec}(\varDelta B)=\varPhi _{\mathbb {S}_2}(\varDelta s_2), \end{aligned}$$
(3.2)

where \(\varDelta s_1\in \mathbb {R}^{k_1}\) and \(\varDelta s_2\in \mathbb {R}^{k_2}\) can be regarded as the perturbations of \(s_1\) and \(s_2\), respectively.
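To illustrate (3.1) and (3.2), the following sketch (Python with NumPy) builds \(\varPhi _{\mathbb {S}}\) for the linear subspace of \(n\times n\) symmetric matrices, an example structure of our own choosing; it also checks that the columns of \(\varPhi _{\mathbb {S}}\) are mutually orthogonal, so that dividing each column by its norm (the scaling used in (3.5) and (3.6) below) yields a column orthonormal matrix:

```python
import numpy as np

def vec(M):
    return M.flatten(order="F")

def symmetric_basis(n):
    # Basis matrices Z of the n x n symmetric matrices; k = n(n+1)/2 parameters.
    Zs = []
    for i in range(n):
        for j in range(i, n):
            Z = np.zeros((n, n))
            Z[i, j] = Z[j, i] = 1.0
            Zs.append(Z)
    return Zs

n = 4
Zs = symmetric_basis(n)
Phi = np.column_stack([vec(Z) for Z in Zs])          # Phi_S in (3.1)

rng = np.random.default_rng(4)
s_par = rng.standard_normal(len(Zs))                  # independent parameters s
A = sum(si * Zi for si, Zi in zip(s_par, Zs))         # a structured (symmetric) matrix
print(np.allclose(vec(A), Phi @ s_par))               # vec(A) = Phi_S s, as in (3.1)

# The columns of Phi_S are mutually orthogonal; dividing by the column norms
# (the scaling D^{-1} used in (3.5) and (3.6)) gives a column orthonormal matrix.
D = np.diag(np.linalg.norm(Phi, axis=0))
Q = Phi @ np.linalg.inv(D)
print(np.allclose(Q.T @ Q, np.eye(Phi.shape[1])))     # True
```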

Now we present the definition of the structured partial condition number of the LSE problem (1.1):

$$\begin{aligned} \kappa _{LSE}^S (A,B,b,d) = \mathop {\mathop {\max }\limits _{(\alpha _A \varDelta A,\alpha _B \varDelta B,\alpha _b \varDelta b,\alpha _d \varDelta d) \ne 0}}\limits _ {\varDelta A\in \mathbb {S}_1, \varDelta B\in \mathbb {S}_2} \frac{{\left\| {g^{\prime } (A,B,b,d){\circ } (\varDelta A,\varDelta B,\varDelta b,\varDelta d)} \right\| _2 }}{{\left\| {(\alpha _A\varDelta A,\alpha _B\varDelta B,\alpha _b\varDelta b,\alpha _d\varDelta d)} \right\| _F }}, \end{aligned}$$

which is a natural variant of the partial condition number in (2.1). From (2.4), it follows that

$$\begin{aligned} \kappa _{LSE}^S (A,B,b,d) = \mathop {\mathop {\max }\limits _{(\alpha _A \varDelta A,\alpha _B \varDelta B,\alpha _b \varDelta b,\alpha _d \varDelta d) \ne 0}}\limits _ {\varDelta A\in \mathbb {S}_1, \varDelta B\in \mathbb {S}_2}\frac{{\left\| {M_{g^{\prime } } \left[ {\begin{array}{c} {\alpha _A \mathrm{vec}(\varDelta A)} \\ {\alpha _B \mathrm{vec}(\varDelta B)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}{{\left\| {\left[ {\begin{array}{c} {\alpha _A \mathrm{vec}(\varDelta A)} \\ {\alpha _B \mathrm{vec}(\varDelta B)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }} . \end{aligned}$$
(3.3)

Considering (3.2), we have

$$\begin{aligned} \left[ {\begin{array}{c} { \mathrm{vec}(\varDelta A)} \\ { \mathrm{vec}(\varDelta B)} \\ { \varDelta b} \\ { \varDelta d} \\ \end{array}} \right] =\left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} \varDelta s _1 \\ \varDelta s_2 \\ { \varDelta b} \\ { \varDelta d} \\ \end{array}} \right] . \end{aligned}$$

Substituting the above equation into (3.3) yields

$$\begin{aligned}&\kappa _{LSE}^S (A,B,b,d)\nonumber \\&\quad = \mathop {\max }\limits _{(\alpha _A (\varDelta s_1),{\alpha _B (\varDelta s_2)},{\alpha _b (\varDelta b)},{\alpha _d (\varDelta d)} ) \ne 0}\frac{{\left\| {M_{g^{\prime } } \left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}{{\left\| {\left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}.\nonumber \\ \end{aligned}$$
(3.4)

Note that

$$\begin{aligned}&{{\left\| {\left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}\\&\quad =\left\| \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] ^{T} {\left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1}^{T}\varPhi _{\mathbb {S}_1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}^{T}\varPhi _{\mathbb {S}_2} &{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2^{1/2} \end{aligned}$$

and the structured matrices \(\varPhi _{\mathbb {S}_1}\) and \(\varPhi _{\mathbb {S}_2}\) are column orthogonal [20]. Then

$$\begin{aligned} {{\left\| {\left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}=\left\| {\left[ {\begin{array}{cccc} D_1 &{}0 &{}0&{}0 \\ 0&{}D_2&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2,\nonumber \\ \end{aligned}$$
(3.5)

where \(D_1=\mathrm{diag}(w_1)\) and \(D_2=\mathrm{diag}(w_2)\) with

$$\begin{aligned} w_1=\left[ \left\| \varPhi _{\mathbb {S}_1}(:,1)\right\| _2,\ldots ,\left\| \varPhi _{\mathbb {S}_1}(:,k_1)\right\| _2\right] ,\quad w_2=\left[ \left\| \varPhi _{\mathbb {S}_2}(:,1)\right\| _2,\ldots ,\left\| \varPhi _{\mathbb {S}_2}(:,k_2)\right\| _2\right] . \end{aligned}$$

Here, the Matlab notation is used. Combining (3.4) and (3.5) implies

$$\begin{aligned}&\kappa _{LSE}^S (A,B,b,d)\\&= \mathop {\max }\limits _{(\alpha _A (\varDelta s_1),{\alpha _B (\varDelta s_2)},{\alpha _b (\varDelta b)},{\alpha _d (\varDelta d)} ) \ne 0}\frac{{\left\| {M_{g^{\prime } } \left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1}D_1^{-1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}D_2^{-1}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{cccc} D_1 &{}0 &{}0&{}0 \\ 0&{}D_2&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2 }}{{\left\| {\left[ {\begin{array}{cccc} D_1 &{}0 &{}0&{}0 \\ 0&{}D_2&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \left[ {\begin{array}{c} {\alpha _A (\varDelta s_1)} \\ {\alpha _B (\varDelta s_2)} \\ {\alpha _b (\varDelta b)} \\ {\alpha _d (\varDelta d)} \\ \end{array}} \right] } \right\| _2}}. \end{aligned}$$

Then we can derive the expression of the structured partial condition number of the LSE problem, which is presented in the following theorem.

Theorem 3.1

The structured partial condition number of the LSE problem (1.1) with respect to L and the structures \(\mathbb {S}_1\) and \(\mathbb {S}_2\) is

$$\begin{aligned} \kappa _{LSE}^S (A,B,b,d) = \left\| {M_{g^{\prime } } } \left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1}D_1^{-1} &{}0 &{}0&{}0 \\ 0&{}\varPhi _{\mathbb {S}_2}D_2^{-1}&{}0&{}0 \\ 0&{}0&{} I_m&{}0\\ 0&{}0&{}0&{}I_s\\ \end{array}} \right] \right\| _2 , \end{aligned}$$
(3.6)

where \(M_{g^{\prime } }\) is given in (2.3).
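For illustration, the following sketch evaluates (3.6) for (rectangular) Toeplitz structures, the structure used later in Example 5.2. It continues the sketch given after the proof of Theorem 2.1 (so m, n, s and the matrix M holding \(M_{g^{\prime }}\) are as defined there, in Python with NumPy/SciPy); in a genuinely structured problem A and B would themselves be Toeplitz, so the lines below only show how the structured scaling in (3.6) is assembled:

```python
import numpy as np
from scipy.linalg import block_diag

def toeplitz_basis(p, q):
    # Basis matrices of p x q Toeplitz matrices: one matrix per diagonal offset.
    Zs = []
    for off in range(-(p - 1), q):
        Z = np.zeros((p, q))
        i = np.arange(p)
        j = i + off
        mask = (j >= 0) & (j < q)
        Z[i[mask], j[mask]] = 1.0
        Zs.append(Z)
    return Zs

def structure_matrix(Zs):
    # Phi_S = [vec(Z_1), ..., vec(Z_k)] with column-major vec.
    return np.column_stack([Z.flatten(order="F") for Z in Zs])

Phi1 = structure_matrix(toeplitz_basis(m, n))     # structure matrix for A (m x n Toeplitz)
Phi2 = structure_matrix(toeplitz_basis(s, n))     # structure matrix for B (s x n Toeplitz)
D1 = np.diag(np.linalg.norm(Phi1, axis=0))
D2 = np.diag(np.linalg.norm(Phi2, axis=0))
W = block_diag(Phi1 @ np.linalg.inv(D1), Phi2 @ np.linalg.inv(D2), np.eye(m), np.eye(s))

kappa_S = np.linalg.norm(M @ W, 2)                # structured condition number (3.6)
print(kappa_S <= np.linalg.norm(M, 2) + 1e-12)    # Remark 3.1: never larger than (2.2)
```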

Remark 3.1

It is easy to verify that

$$\begin{aligned} \left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1}D_1^{-1} &{}\quad 0 &{}\quad 0&{}\quad 0 \\ 0&{}\quad \varPhi _{\mathbb {S}_2}D_2^{-1}&{}\quad 0&{}\quad 0 \\ 0&{}\quad 0&{}\quad I_m &{}\quad 0\\ 0&{}\quad 0&{}\quad 0&{}\quad I_s\\ \end{array}} \right] \end{aligned}$$

is column orthonormal. Thus,

$$\begin{aligned} \left\| {M_{g^{\prime } } } \left[ {\begin{array}{cccc} \varPhi _{\mathbb {S}_1}D_1^{-1} &{}\quad 0 &{}\quad 0&{}\quad 0 \\ 0&{}\quad \varPhi _{\mathbb {S}_2}D_2^{-1}&{}\quad 0&{}\quad 0 \\ 0&{}\quad 0&{}\quad I_m&{}\quad 0\\ 0&{}\quad 0&{}\quad 0&{}\quad I_s\\ \end{array}} \right] \right\| _2\le \left\| {M_{g^{\prime } } }\right\| _2. \end{aligned}$$

That is, the structured partial condition number never exceeds the unstructured one. This fact can also be seen directly from the definitions of these two condition numbers. As done in [25, 31], it would be valuable to discuss the ratio between the structured and unstructured partial condition numbers of the LSE problem in detail. We do not pursue this here, and only provide a numerical example in Sect. 5 showing that the structured partial condition number can indeed be noticeably smaller than the unstructured one.

Remark 3.2

When \(B=0\) and \(d=0\), we have the structured partial condition number of the LLS problem and its upper bound:

$$\begin{aligned} \kappa _{LLS}^S (A,b)= & {} \left\| \left[ {\frac{{\left( {r^{T} \otimes (L^{T} (A^{T}A)^{ - 1} )} \right) \varPi _{mn} - x^{T} \otimes (L^{T} A^{ \dag })}}{\alpha _A }\varPhi _{\mathbb {S}_1}D_1^{-1} ,\frac{{L^{T} A^{ \dag } }}{\alpha _b }} \right] \right\| _2 \end{aligned}$$
(3.7)
$$\begin{aligned}\le & {} \left\| \left[ {\frac{{\left( {r^{T} \otimes (L^{T} (A^{T}A)^{ - 1} )} \right) \varPi _{mn} - x^{T} \otimes (L^{T} A^{ \dag })}}{\alpha _A } ,\frac{{L^{T} A^{ \dag } }}{\alpha _b }} \right] \right\| _2 , \end{aligned}$$
(3.8)

where the upper bound (3.8) is just the unstructured partial condition number of the LLS problem. Here, it should be pointed out that the structured condition number of the LLS problem obtained from (3.7) by setting \(L=I_n\) and \(\alpha _A=\alpha _b=1\) differs slightly from the ones in [31], because in [31] two additional conditions are imposed besides the structure requirement.

Remark 3.3

Similar to Theorem 2.2, one may try to give a compact expression for the structured condition number (3.6). Unfortunately, the resulting compact expression is very complicated. For example, the terms \(M_{11}M_{11}^T\) and \(M_{12}M_{12}^T\) appearing in the proof of Theorem 2.2 are replaced by the following two:

$$\begin{aligned} M_{11}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{11}^T=&\left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \varPi _{mn}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^T\varPi _{mn}^T\\&\left( {r \otimes (((AP)^{T} AP)^{\dag } L)} \right) .\\ M_{12}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{12}^T=&\left( {x^{T} \otimes (L^{T} (AP)^{\dag })}\right) \varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^T\left( {x \otimes (((AP)^{\dag })^T L )} \right) . \end{aligned}$$

Considering that \(\varPhi _{\mathbb {S}_1}\) can be written as

$$\begin{aligned} \varPhi _{\mathbb {S}_1}=\left[ \mathrm{vec}(Z_1),\ldots ,\mathrm{vec}(Z_{k_1})\right] , \end{aligned}$$

where \(Z_1,\ldots ,Z_{k_1}\) are the basis matrices of the linear subspace \(\mathbb {S}_1\), and using (1.9) and (1.10), we have

$$\begin{aligned}&\left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag })} \right) \varPi _{mn}\varPhi _{\mathbb {S}_1}\\&\quad =\left( {r^{T} \otimes (L^{T} ((AP)^{T} AP)^{\dag } )} \right) \left[ \mathrm{vec}(Z_1^T),\ldots ,\mathrm{vec}(Z^T_{k_1})\right] \\&\quad =\left[ L^{T} ((AP)^{T} AP)^{\dag } Z_1^Tr,\ldots ,L^{T} ((AP)^{T} AP)^{\dag } Z^T_{k_1}r\right] . \end{aligned}$$

Thus, writing \(D_1=\mathrm{diag}(d_1,\ldots , d_{k_1})\), we obtain

$$\begin{aligned}&M_{11}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{11}^T=\sum _{i=1}^{k_1}\frac{1}{d_i^2}L^{T} ((AP)^{T} AP)^{\dag } Z_i^Trr^TZ_i((AP)^{T} AP)^{\dag }L. \end{aligned}$$

Similarly, we have

$$\begin{aligned}&M_{12}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{12}^T=\sum _{i=1}^{k_1}\frac{1}{d_i^2}L^{T} (AP)^{\dag } Z_ixx^TZ_i^T(( AP)^{\dag })^TL. \end{aligned}$$

Unlike the case in the proof of Theorem 2.2, we cannot combine the two terms

$$\begin{aligned} M_{11}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{11}^T \text { and } M_{12}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{12}^T. \end{aligned}$$

Worse, we cannot show that the cross term

$$\begin{aligned} M_{11}\varPhi _{\mathbb {S}_1}D_1^{-2}\varPhi _{\mathbb {S}_1}^TM_{12}^T=\sum _{i=1}^{k_1}\frac{1}{d_i^2}L^{T} ((AP)^{T} AP)^{\dag } Z_i^Trx^TZ_i^T(( AP)^{\dag })^TL \end{aligned}$$

is zero. These facts make the compact expression for the structured condition number (3.6) very complicated, so we omit it here. In addition, we only consider the linear structures of the matrices A and B in this section; the linear structures of the vectors b and d can be incorporated into the partial condition number in the same way. Furthermore, inspired by [9, 20, 26], exploring the structured mixed and componentwise condition numbers of the LSE problem would be interesting. We will investigate this problem in future research.

4 Statistical condition estimates

We first provide a statistical estimate of the partial condition number by using the probabilistic spectral norm estimator. This estimator was proposed by Hochstenbach [16] and can estimate the spectral norm of a matrix reliably. In more detail, the analysis of the estimator in [16] shows that the spectral norm of a matrix is contained in a small interval \([\alpha _1 , \alpha _2]\) with high probability, where \(\alpha _1\) is a guaranteed lower bound on the spectral norm of the matrix, obtained by the Lanczos bidiagonalization method [12], and \(\alpha _2\) is a probabilistic upper bound, holding with probability at least \(1-\varepsilon \) with \(\varepsilon \ll 1\), obtained by finding the largest zero of a polynomial. Meanwhile, we can require \(\alpha _2/ \alpha _1\leqslant 1+\delta \) with \(\delta \) a user-chosen parameter. Based on the above estimator, we can devise Algorithm 1.

Algorithm 1 (pseudocode shown as a figure)

Remark 4.1

Step 2 of Algorithm 1 is taken directly from [16], and a detailed explanation can be found there. In the practical implementation of Algorithm 1, d is the dimension of the Krylov space and can be determined automatically by the algorithm. Moreover, as suggested in [16], explicitly forming the matrix C is not necessary, because what we really need are the products of the matrix C or \(C^T\) with random vectors. Hence, techniques from the solution of linear systems can be employed to reduce the computational burden, especially for large-scale problems. Furthermore, it is worth pointing out that Algorithm 1 is also applicable to estimating the structured partial condition number (3.6), since the latter is also the spectral norm of a matrix.
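To convey the flavor of such a matrix-free estimator, here is a simplified sketch (Python with NumPy; it is not Hochstenbach's algorithm) that produces only a guaranteed lower bound via power iteration on \(C^{T}C\) using products with C and \(C^{T}\); the probabilistic upper bound of [16], obtained from the largest zero of a polynomial, is not reproduced:

```python
import numpy as np

def spectral_norm_lower_bound(matvec, rmatvec, ncols, iters=20, seed=0):
    # Power iteration on C^T C, using only products with C and C^T.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(ncols)
    v /= np.linalg.norm(v)
    est = 0.0
    for _ in range(iters):
        w = matvec(v)                # w = C v, so ||w||_2 <= ||C||_2 since ||v||_2 = 1
        est = np.linalg.norm(w)      # guaranteed lower bound on ||C||_2
        v = rmatvec(w)               # v = C^T w
        v /= np.linalg.norm(v)
    return est

# Example use with an explicit matrix; in practice matvec/rmatvec would exploit the
# Kronecker structure of M_{g'} instead of forming the matrix.
C = np.random.default_rng(5).standard_normal((50, 80))
lb = spectral_norm_lower_bound(lambda v: C @ v, lambda w: C.T @ w, ncols=80)
print(lb, np.linalg.norm(C, 2))      # lb <= ||C||_2 and typically close after a few steps
```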

Now we introduce an alternative approach, based on the SSCE method [3, 18], for estimating the normwise condition number of the solution x(A, B, b, d). Denote by \(\kappa _{LSEi}(A,B,b,d)\) the normwise condition number of the function \(z_i^{T}x(A,B,b,d)\), where the vectors \(z_i\) are chosen from \(\mathcal {U}(S_{n-1})\), the uniform distribution on the unit sphere, and are orthogonal. Then, from (2.6), we have

$$\begin{aligned} \kappa _{LSEi}^2 (A,B,b,d)= & {} \left( {\frac{{\left\| r \right\| _2^2 }}{{\alpha _A^2 }} + \frac{{\left\| {r^{T} AB_A^{\dag }} \right\| _2^2 }}{{\alpha _B^2 }}} \right) z_i^{T} \left( ((AP)^{T} AP)^{\dag }\right) ^2 z_i\nonumber \\&+ \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _A^2 }} + \frac{1}{{\alpha _b^2 }}} \right) z_i^{T} ((AP)^{T} AP)^{\dag } z_i \nonumber \\&+ \left( {\frac{{\left\| x \right\| _2^2 }}{{\alpha _B^2 }} + \frac{1}{{\alpha _d^2 }}} \right) z_i^{T} B_A^{\dag } (B_A^{\dag } )^{T} z_i \nonumber \\&+ \frac{2}{{\alpha _B^2 }}z_i^{T} ((AP)^{T} AP)^{\dag } xr^{T} AB_A^{\dag } (B_A^{\dag } )^{T} z_i. \end{aligned}$$
(4.1)

The analysis based on the SSCE method in [3] shows that

$$\begin{aligned} {\kappa } _{SLSE}(A,B,b,d) = \frac{\omega _q}{\omega _n}\sqrt{\sum _{i=1}^q \kappa _{LSEi}^2 (A,B,b,d)} \end{aligned}$$
(4.2)

is a good estimate of the normwise condition number of the LSE problem (1.1). In the above expression, \(\omega _q\) is the Wallis factor with \(\omega _1=1, \omega _2={2}/{\pi }\), and

$$\begin{aligned} \omega _q =\left\{ \begin{array}{ll} \frac{1\cdot 3 \cdot 5 \cdots (q-2)}{2\cdot 4\cdot 6\cdots (q-1)}, &{}\quad \hbox {for } q \hbox { odd,} \\ \frac{2}{\pi }\frac{2\cdot 4\cdot 6\cdots (q-2)}{3\cdot 5\cdot 7\cdots (q-1)}, &{}\quad \hbox {for } q\hbox { even,} \end{array} \right. \text { when } q>2. \end{aligned}$$

It can be approximated by

$$\begin{aligned} \omega _q\approx \sqrt{\frac{2}{\pi (q-\frac{1}{2})}} \end{aligned}$$
(4.3)

with high accuracy. In summary, we can propose Algorithm 2.

Algorithm 2 (pseudocode shown as a figure)

Remark 4.2

In Algorithm 2, \(\kappa _{LSEi}^2 (A,B,b,d)\) is computed by Eq. (4.1). In practice, the computation of \(\kappa _{LSEi}^2 (A,B,b,d)\) should rely on the intermediate results of the process for solving the LSE problem in order to reduce the computational burden. Just as in [3], where the estimate is computed by using the R factor of the QR decomposition, it is better to compute \(\kappa _{LSEi}^2 (A,B,b,d)\) through a formula descended from (2.19) instead of (2.6) if we solve the LSE problem by the GSVD.
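A minimal sketch of the SSCE estimate (4.1)-(4.3) is given below (Python with NumPy), continuing the sketch after the proof of Theorem 2.1 (so A, x, r, K, BA_pinv, n, and the weights are as defined there); the sample size q and the use of the approximation (4.3) for both Wallis factors are our own illustrative choices:

```python
import numpy as np

def wallis(q):
    # Approximation (4.3) of the Wallis factor omega_q.
    return np.sqrt(2.0 / (np.pi * (q - 0.5)))

q = 2
rng = np.random.default_rng(6)
# q orthonormal directions z_i on the unit sphere, obtained via QR of a Gaussian block.
Zq, _ = np.linalg.qr(rng.standard_normal((n, q)))

kappas_sq = []
for i in range(q):
    z = Zq[:, i]
    val = ((r @ r / aA**2 + (r @ A @ BA_pinv) @ (r @ A @ BA_pinv) / aB**2) * (z @ K @ K @ z)
           + (x @ x / aA**2 + 1.0 / ab**2) * (z @ K @ z)
           + (x @ x / aB**2 + 1.0 / ad**2) * (z @ BA_pinv @ BA_pinv.T @ z)
           + (2.0 / aB**2) * (z @ K @ x) * (r @ A @ BA_pinv @ BA_pinv.T @ z))
    kappas_sq.append(val)                          # kappa_{LSE,i}^2 from (4.1)

kappa_SLSE = wallis(q) / wallis(n) * np.sqrt(sum(kappas_sq))   # estimate (4.2)
print(kappa_SLSE)
```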

4.1 Complexity analysis

Since the main tasks in both Algorithms 1 and 2 are to compute the Moore-Penrose inverses of B, AP, and \((AP)^TAP\), and the computation of the Moore-Penrose inverse in MATLAB relies on the SVD, the computational complexities of these two algorithms are dominated by the computation of the SVD. When the Golub-Reinsch SVD is used [13], the computational complexities of forming \(B^{\dag }, (AP)^{\dag }\), and \(((AP)^TAP)^{\dag }\) are \(O(5s^2n+9sn^2+9n^3), O(n^2s+6m^2n+9mn^2+9n^3)\), and \(O(m^2n+mn^2+23n^3)\), respectively. Based on these results, we now present the complexity analysis of Algorithms 1 and 2. We first consider Algorithm 2, whose computational complexity mainly consists of two parts. The first part comes from forming C with \(L=I_n\), which follows from the results mentioned above; the other comes from the orthonormalization procedure, which requires \(O(4n^2q-2nq^2+2/3q^3)\) operations if Householder transformations are used [13]. Summing up the matrix operations, we obtain the computational complexity of Algorithm 2: \(O(5s^2n+13sn^2+13mn^2+5m^2n+41n^3+4n^2q-2nq^2+2/3q^3)\). Note that in the practical implementation of Algorithm 2, p is usually chosen to be 2 or 3, which is enough for practical applications and is much smaller than n, so the computational complexity of the QR decomposition can be omitted. For Algorithm 1, the first part of the computational complexity is also that of forming C. However, considering that d is the dimension of the Krylov space and is determined automatically by the stopping criterion \(\epsilon \) of the iteration, it is not an easy task to give a detailed computational complexity analysis of the rest of the algorithm in general. In a special case, for example, when k is given, the computational complexity of the inner loop is about \(O(dk^2)\). Under this circumstance, the computational complexity of Algorithm 1 is composed of \(O(5s^2n+13sn^2+13mn^2+5m^2n+41n^3+dk^2)\) plus the process of finding \(\delta \) and the probabilistic lower bound.

5 Numerical experiments

In this section, we present two numerical examples to illustrate the reliability of the statistical condition estimates proposed in Sect. 4 and to compare the structured condition number with the unstructured one, respectively. In both examples, we set \(\alpha _A=\alpha _B=\alpha _b=\alpha _d=1\) and take the matrix L to be the identity matrix. In addition, the claim in Remark 2.1 is also verified in Example 5.1.

Example 5.1

Similar to [23], we construct the random LSE problem through its augmented system (see Equation (3.7) in [10]), and generate the example as follows. Let \(u_1\in \mathbb {R}^{m}, u_2\in \mathbb {R}^{s}\), and \(v_1, v_2 \in \mathbb {R}^{n}\) be unit random vectors, and set

$$\begin{aligned} A= & {} U_1\begin{bmatrix} D_1 \\ 0 \\ \end{bmatrix}V_1, \ B=U_2\begin{bmatrix} D_2&0 \\ \end{bmatrix}V_2 ,\ U_i=I_{m(s)}-2u_iu_i^T,\ \mathrm {and} \ V_i=I_n-2v_iv_i^T, \end{aligned}$$

where \(D_1=n^{-l_1}\mathrm {diag}(n^{l_1},(n-1)^{l_1},\ldots ,1)\) and \(D_2=s^{-l_2}\mathrm {diag}(s^{l_2},(s-1)^{l_2},\ldots ,1)\). Let the solution x be \(x=(1, 2^2,\ldots , n^2)^T\) and the residual vector \(r=b-Ax\) be a random vector of specified norm. Thus, letting the Lagrange multiplier \(\lambda =-(BB^T)^{-1}BA^Tr\), \(b=Ax+r\), and \(d=Bx\) gives the desired LSE problem, and it is easy to check that the condition numbers of A and B are \(\kappa (A)=n^{l_1}\) and \(\kappa (B)=s^{l_2}\), respectively. Recall that for any matrix C, its condition number \(\kappa (C)\) is defined by \(\kappa (C)=\left\| C\right\| _2\left\| C^\dag \right\| _2\).
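A sketch of this construction (Python with NumPy; the sizes, the exponents \(l_1\), \(l_2\), and \(\Vert r\Vert _2\) chosen below are illustrative) reads:

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, s = 100, 80, 50
l1, l2 = 2, 1
norm_r = 1e-1

u1, u2 = rng.standard_normal(m), rng.standard_normal(s)
v1, v2 = rng.standard_normal(n), rng.standard_normal(n)
u1, u2, v1, v2 = (w / np.linalg.norm(w) for w in (u1, u2, v1, v2))

U1, V1 = np.eye(m) - 2 * np.outer(u1, u1), np.eye(n) - 2 * np.outer(v1, v1)
U2, V2 = np.eye(s) - 2 * np.outer(u2, u2), np.eye(n) - 2 * np.outer(v2, v2)
D1 = np.diag(np.arange(n, 0, -1.0) ** l1) / n**l1
D2 = np.diag(np.arange(s, 0, -1.0) ** l2) / s**l2

A = U1 @ np.vstack([D1, np.zeros((m - n, n))]) @ V1      # kappa(A) = n**l1
B = U2 @ np.hstack([D2, np.zeros((s, n - s))]) @ V2      # kappa(B) = s**l2

x = np.arange(1, n + 1, dtype=float) ** 2                # prescribed solution
r = rng.standard_normal(m)
r *= norm_r / np.linalg.norm(r)                          # residual of specified norm
lam = -np.linalg.solve(B @ B.T, B @ (A.T @ r))           # Lagrange multiplier, as in the text
b, d = A @ x + r, B @ x                                  # data of the random LSE problem
```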

In our numerical experiments, we set \(m=100, n=80\), and \(s=50\), and choose the parameters \(\epsilon =0.001, \delta = 0.01\) in Algorithm 1 and \(q=2\) in Algorithm 2. By varying the condition numbers of A and B and the residual norm \(\Vert r\Vert _2\), we test the performance of Algorithms 1 and 2. More precisely, for each pair of \(\kappa (A)\) and \(\Vert r\Vert _2\) with a fixed \(\kappa (B)\), 500 random LSE problems are generated and used for the test. The numerical results on the mean and variance of the ratios between the statistical condition estimates and the exact condition number, defined as

$$\begin{aligned} r_{ssce}= & {} {\kappa } _{SLSE}(A,B,b,d)/{\kappa } _{LSE}(A,B,b,d),\\ r_{pce}= & {} {\kappa } _{PLSE}(A,B,b,d)/{\kappa } _{LSE}(A,B,b,d) \end{aligned}$$

are reported in Table 1. In addition, we also give a numerical verification of the claim given in Remark 2.1. The numerical results of the ratio

$$\begin{aligned} \kappa _{LSEup} (A,B,b,d)/\kappa _{LSE} (A,B,b,d) \end{aligned}$$

are presented in Table 2. In these experiments, we solve the constructed random LSE problems by the GSVD, i.e., using the results from Remark 2.3, and compute the exact condition number via (2.19) and (2.5). Here, it should be pointed out that the predetermined exact solution is generally not the same as the computed one. This fact is illustrated in Fig. 1, where, for 50 random LSE problems, we plot the relative error \(\Vert \hat{x}-x\Vert _2/\Vert x\Vert _2\) with \(\Vert r\Vert _2=10^{-1}\). Moreover, from Fig. 1, we can see that when the coefficient matrices become ill-conditioned, the solution computed by the GSVD method becomes inaccurate, so we may resort to other direct methods (see [13, Chapter 12]) or iterative techniques to obtain a more accurate solution for problems with ill-conditioned coefficient matrices. The main reasons why we use the GSVD method in this paper are that, with the intermediate results of solving the LSE problem by this method, we can compute the condition number via (2.19) and (2.5) conveniently, and that this method is very effective when the coefficient matrices have small condition numbers. More specifically, from [22], the computational complexity of the GSVD is about \(O(8m^2n+8s^2n+6mn^2+6sn^2+8msn+62/3n^3)\), which constitutes the major computational burden of solving the LSE problem. With the intermediate results, we can compute (2.19) with about \(O(4n^3+ns^2-2n^2s)\) flops, which saves much computational effort compared with calculating (2.6) directly. On the other hand, from the complexity analysis given in Sect. 4.1, we need about \(O(4n^3+ns^2-2n^2s+4n^2q-2nq^2+2/3q^3)\) flops for Algorithm 2, and \(O(4n^3+ns^2-2n^2s+dk^2)\) flops plus the process of finding \(\delta \) and the probabilistic lower bound for Algorithm 1, to estimate the condition number. It is easy to see that the computational burden of estimating the condition number is much smaller than that of computing the exact condition number via the GSVD method.

Table 1 The efficiency of statistical condition estimates with \(\kappa (A)=n^{l_1}\) and \(\kappa (B)=s^{l_2}\)

From Table 1, one can easily find that, in general, both Algorithms 1 and 2 give reliable estimates of the normwise condition number. In comparison, Algorithm 1 performs more stably, since the variances of this algorithm are smaller in most cases. Meanwhile, it should be pointed out that when \(l_1=l_2=0\), Algorithm 2 may give an inaccurate estimate, i.e., the ratio may be larger than 10. This phenomenon also occurs in estimating the normwise condition number of the LLS problem [3]. Although the expression of \(\kappa _{LSE} (A,B,b,d)\) is more complicated than that of the normwise condition number of the LLS problem and the circumstances of these two problems are different, we believe that the underlying reason is the same; the reader can refer to [3] for a detailed explanation.

Table 2 Ratios of \(\kappa _{LSEup} (A,B,b,d)\) and \(\kappa _{LSE} (A,B,b,d)\) with \(\kappa (A)=n^{l_1}, \kappa (B)=s^{l_2}\) and \(\Vert r\Vert _2=10^{-1}\)

Table 2 indicates that \(\kappa _{LSEup} (A,B,b,d)\) is indeed larger than \(\kappa _{LSE} (A,B,b,d)\). However, the difference between them is not significant, especially when the condition numbers of the involved matrices are very large: in the latter case, the two quantities agree to four decimal places in our numerical experiments. These numerical results confirm the analysis in Remark 2.1.

Fig. 1

The relative error \(\Vert \hat{x}-x\Vert _2/\Vert x\Vert _2\) with \(\Vert r\Vert _2=10^{-1}\)

Example 5.2

Let A and B be nonsymmetric Gaussian random Toeplitz matrices of order \(n=100\). This means that the entries of these two matrices are generated from the standard normal distribution, and the basis matrices generating the linear subspace \({\mathbb {S}_1}\) or \({\mathbb {S}_2}\) are

$$\begin{aligned} Z_1&=\left[ \begin{array}{ccccc}0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ 1 &{} 0 &{} \cdots &{} 0 &{} 0\end{array}\right] ,\;Z_2=\left[ \begin{array}{ccccc}0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 1 &{} 0 &{} \cdots &{} 0 &{} 0 \\ 0 &{} 1 &{} \cdots &{} 0 &{} 0\end{array}\right] ,\cdots ,\; Z_{n-1}=\left[ \begin{array}{ccccc}0 &{} 0 &{} \cdots &{} 1 &{} 0 \\ 0 &{} 0 &{} \cdots &{} 0 &{} 1 \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0\end{array}\right] ,\\ Z_n&=\left[ \begin{array}{ccccc}0 &{} 0 &{} \cdots &{} 0 &{} 1 \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0 \\ 0 &{} 0 &{} \cdots &{} 0 &{} 0\end{array}\right] , \end{aligned}$$

and hence \( \varPhi _{\mathbb {S}_1}= \varPhi _{\mathbb {S}_2}=\left[ \mathrm{vec}(Z_1),\cdots ,\mathrm{vec}(Z_n)\right] \). Analogous to Example 5.1, we let the solution x be \(x=(1, 2^2,\cdots , n^2)^T\) and the residual vector r be a random vector of specified norm, and we obtain the computed solution by the GSVD method. However, unlike Example 5.1, it seems impossible to prescribe a specific condition number for Gaussian random Toeplitz matrices.

In our numerical experiment, for each r, we test 200 pairs of random Toeplitz matrices A and B. The numerical results on the ratio between \(\kappa _{LSE} (A,B,b,d) \) and \(\kappa _{LSE}^S (A,B,b,d)\) defined by

$$\begin{aligned} ratio= \frac{\kappa _{LSE} (A,B,b,d)}{\kappa _{LSE}^S (A,B,b,d)} \end{aligned}$$

are presented in Fig. 2, which confirms the theoretical analysis in Remark 3.1.

Fig. 2

Comparison of \(\kappa _{LSE} (A,B,b,d) \) and \(\kappa _{LSE}^S (A,B,b,d)\)

Fig. 3

The influence of dimension

From Fig. 2, we also find that some points lie near 10, which means that the unstructured condition number can be about 10 times larger than the structured one. Thus, using the unstructured condition number to derive error bounds for a structured LSE problem may lead to an overestimate. Moreover, we also note that, for different values of \(\Vert r\Vert _2\), the ratios follow the same trend and gather in the interval [5, 10]. In contrast, numerical experiments show that the ratio tends to become larger as n increases. In these experiments, we set \(\Vert r\Vert _2=1\) and \(n=20*i-10,\; i=1:11\), and, for every n, we test 50 LSE problems with random Toeplitz coefficient matrices A and B. The numerical results are presented in Fig. 3, where the line with circles denotes the mean value of the ratios and the solid line denotes the corresponding variances. The figure shows that the structured condition number becomes more advantageous over the unstructured one as the dimensions of the coefficient matrices increase.