1 Introduction

For a positive integer N, let \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_N\) be positive integers. An order N tensor \({\mathcal {A}}=( {\mathcal {A}}_{ {i_{1}i_{2}\ldots i_{{N}}} })_{1\le i_{j}\le {\mathbf {I}}_{j}}\), \(j = 1,\ldots , {N}\), is a multidimensional array with \({\mathfrak {I}}={\mathbf {I}}_{1}{\mathbf {I}}_{2}\cdots {\mathbf {I}}_{N}\) entries. Let \({\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{N}}\) (resp. \({\mathbb {R}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{N}}\)) denote the set of order N tensors of dimension \({\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{N}\) over the complex numbers \({\mathbb {C}}\) (resp. the real numbers \({\mathbb {R}}\)).

The conjugate transpose of a tensor \({\mathcal {A}} = ({\mathcal {A}}_{ i_{1}\ldots i_{M}j_{1} \ldots j_{N}}) \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{M} \times {\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{N}}\) is denoted by \({\mathcal {A}}^{*} \in {\mathbb {C}}^{{\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{N} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{M}}\) and defined elementwise by \(({\mathcal {A}}^{*})_{ j_{1} \ldots j_{N} i_{1} \ldots i_M}=(\overline{{\mathcal {A}}})_{ i_{1} \ldots i_{M} j_{1} \ldots j_N}\), where the overline denotes complex conjugation. When the tensors are defined over \({\mathbb {R}}\), the tensor \({\mathcal {A}}^{\mathrm T} \in {\mathbb {R}}^{{\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{N} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{M}}\) satisfying \(({\mathcal {A}}^{\mathrm T})_{ j_{1} \ldots j_{N} i_{1} \ldots i_M}=({\mathcal {A}})_{ i_{1}\ldots i_{M} j_{1} \ldots j_N}\) is called the transpose of \({\mathcal {A}}\).

The Einstein product of tensors is defined in Einstein (2007) by the operation \(*_{N}\) via

$$\begin{aligned} ({\mathcal {A}}*_{N}{\mathcal {B}})_{i_{1}\ldots i_{{N}}j_{1}\ldots j_{{M}}} = \sum _{k_{1},\ldots ,k_{{N}}}{\mathcal {A}}_{i_{1} \ldots i_{{N}}k_{1} \ldots k_{{N}}}{\mathcal {B}}_{k_{1} \ldots k_{{N}}j_{1} \ldots j_{{M}}}, \end{aligned}$$
(1.1)

where \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\), \({\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {K}}_{1}\times \cdots \times {\mathbf {K}}_{{N}}\times {\mathbf {J}}_{1}\times \cdots \times {\mathbf {J}}_{{M}}}\) and \({\mathcal {A}}*_{N}{\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{{M}}}\). The associative law of this tensor product holds. In the above formula, when \({\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\), then

$$\begin{aligned} ({\mathcal {A}}*_{N}{\mathcal {B}})_{i_{1}i_{2}\ldots i_{{N}}} = \sum _{k_{1},\ldots ,k_{{N}}}{\mathcal {A}}_{i_{1} \ldots i_{{N}}k_{1} \ldots k_{{N}}}{\mathcal {B}}_{k_{1} \ldots k_{{N}}}, \end{aligned}$$

where \({\mathcal {A}}*_{N}{\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\). When \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) and \({\mathcal {B}}\) is a vector \({\mathbf {b}} = (b_{i})\in {\mathbb {C}}^{{\mathbf {I}}_{{N}}}\), the product is defined by the operation \(\times _{{N}}\) via

$$\begin{aligned} ({\mathcal {A}}\times _{{N}}{\mathbf {b}})_{i_{1}i_{2}\ldots i_{{N}-1}} = \sum _{i_{{N}}}{\mathcal {A}}_{i_{1} \ldots i_{{N}}} b_{i_{{N}}}, \end{aligned}$$

where \({\mathcal {A}}\times _{{N}}{\mathbf {b}} \in {{\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}-1}}}\).
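The Einstein product can be realized directly through matricization. The following MATLAB sketch is our illustration (not part of the cited definitions); it relies on the reshaping identity stated later in Lemma 2.2 and assumes the tensors are stored as multidimensional arrays:

```matlab
function C = einstein_prod(A, B, N)
% Sketch of the Einstein product A *_N B via matricization, assuming
% size(A) = [I_1,...,I_N, K_1,...,K_N] and size(B) = [K_1,...,K_N, J_1,...,J_M].
sa = size(A);  sb = size(B);
I  = sa(1:N);        K = sa(N+1:end);
J  = sb(N+1:end);    % remaining modes of B (empty when B is of order N)
C  = reshape(reshape(A, prod(I), prod(K)) * reshape(B, prod(K), prod(J)), [I, J]);
end
```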

Definition 1.1

(Sun et al. 2016) Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\). The tensor \({\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) which satisfies

$$\begin{aligned}&(1^T)\quad {\mathcal {A}}*_{{N}}{\mathcal {X}}*_{{N}}{\mathcal {A}} = {\mathcal {A}};\qquad (2^T)\quad {\mathcal {X}}*_{{N}}{\mathcal {A}}*_{{N}}{\mathcal {X}} = {\mathcal {X}};\\&(3^T)\quad ({\mathcal {A}}*_{{N}}{\mathcal {X}})^{*} = {\mathcal {A}}*_{{N}}{\mathcal {X}};\qquad (4^T)\quad ({\mathcal {X}}*_{{N}}{\mathcal {A}})^{*} = {\mathcal {X}}*_{{N}}{\mathcal {A}} \end{aligned}$$

is called the Moore–Penrose inverse of \({\mathcal {A}}\), abbreviated as the M-P inverse and denoted by \({\mathcal {A}}^\dagger \). If equation \((i^T)\) among \((1^T)\)–\((4^T)\) holds, then \({\mathcal {X}}\) is called an \(\{i\}\)-inverse of \({\mathcal {A}}\), denoted by \({\mathcal {A}}^{(i)}\).

For a tensor \({\mathcal {A}} \in {\mathbb {C}}^{{I}_{1} \times \cdots \times {I}_{{N}} \times {I}_{1} \times \cdots \times {I}_{{N}}}\), if there exists a tensor \({\mathcal {X}}\), such that \({\mathcal {A}}*_{{N}}{\mathcal {X}} = {\mathcal {X}}*_{{N}}{\mathcal {A}} = {\mathcal {I}}\), then \({\mathcal {X}}\) is called the inverse of \({\mathcal {A}}\), denoted by \({\mathcal {A}}^{-1}\). Clearly, if \({\mathcal {A}}\) is invertible, then \({\mathcal {A}}^\dagger = {\mathcal {A}}^{-1}\).

The Moore–Penrose inverse of matrices and linear operators plays an important role in theoretical studies and numerical analysis in many areas, such as singular matrix problems, ill-posed problems, optimization problems, total least-squares problems (Xie et al. 2019; Zheng et al. 2017), and statistical problems (Ben-Israel and Greville 2003; Cvetkovic-Illic and Wei 2017; Wang et al. 2018; Wei 2014). As a continuation of these results, operations with tensors have become increasingly prevalent in recent years (Che and Wei 2019; Che et al. 2019; Ding and Wei 2016; Harrison and Joseph 2016; Medellin et al. 2016; Qi and Luo 2017; Wei and Ding 2016). Brazell et al. (2013) introduced the notion of the ordinary tensor inverse. Sun et al. (2014, 2016) proved the existence and uniqueness of the Moore–Penrose inverse and \(\{i,j,k\}\)-inverses of even-order tensors with the Einstein product; the Moore–Penrose inverse and \(\{i\}\)-inverses \((i = 1, 2, 3, 4)\) of even-order tensors with the Einstein product were also studied in Panigrahy and Mishra (2018) and Sun et al. (2014, 2016). In addition, the general solutions of some multilinear systems were given in terms of the defined generalized inverses. A few further characterizations of different generalized inverses of tensors, in conjunction with a new method to compute the Moore–Penrose inverse of tensors, were considered in Behera and Mishra (2017). The weighted Moore–Penrose inverse in tensor spaces was introduced in Ji and Wei (2017). In addition, a characterization of the least-squares solutions to a multilinear system, as well as the relationship between the weighted minimum-norm least-squares solution of a multilinear system and the weighted Moore–Penrose inverse of its coefficient tensor, were considered in Ji and Wei (2017). Sun et al. (2018) defined \(\{i\}\)-inverses for \(i = 1, 2, 5\) and the group inverse of tensors, assuming a general tensor product. Panigrahy et al. (2018) proved some additional properties of the Moore–Penrose inverse of tensors via the Einstein product and also derived a few necessary and sufficient conditions for the reverse-order law for the Moore–Penrose inverse of tensors. Several new sufficient conditions which ensure the reverse-order law of the weighted Moore–Penrose inverse for even-order square tensors were presented in Panigrahy and Mishra (2019). Recently, Ji and Wei (2018) investigated the Drazin inverse of even-order tensors with the Einstein product. Liang and Zheng (2018) defined an iterative algorithm for solving the Sylvester tensor equation based on the Einstein product.

Using another definition of the tensor product, some basic properties of the order-2 left (right) inverse and of the product of tensors were given in Bu et al. (2014). The generalized inverse of tensors was established in Jin et al. (2017) using tensor equations and the t-product of tensors. The definition of a generalized tensor function via the tensor singular value decomposition based on the t-product was introduced in Miao et al. (2019). In addition, the least-squares solutions of tensor equations, as well as an algorithm for generating the Moore–Penrose inverse of a tensor, were proposed in Jin et al. (2017) and Shi et al. (2013).

On the other hand, the additive and multiplicative perturbation models have been investigated frequently during the past decades. For more details, the reader is referred to the references (Cai et al. 2011; Liu et al. 2008; Meng and Zheng 2010; Stewart 1977; Wedin 1973; Wei and Ling 2010; Wei 1999). The classical results derived by Stewart (1977) and Wedin (1973) have been improved in Li et al. (2013), Xu et al. (2010b). The acute perturbation of the group inverse was investigated in Wei (2017). Some results related to the perturbation of oblique projectors, which include the weighted pseudoinverse, were presented in Xu et al. (2008, 2010a). Some optimal perturbation bounds of the weighted Moore–Penrose inverse under the weighted unitarily invariant norms, the weighted Q-norms, and the weighted F-norms were obtained in Xu et al. (2011). A sharp estimation of the perturbation bounds of the weighted Moore–Penrose inverse was considered in Ma (2018). Meyer (1980) presented a perturbation formula with application to Markov chains, and the formula was extended to the Drazin inverse \(A^{\mathrm D}\). Two finite-time convergent Zhang neural network models for the time-varying complex matrix Drazin inverse were presented in Qiao et al. (2018). An explicit formula for perturbations of an outer inverse under certain conditions was given in Zhang and Wei (2008). The perturbation analysis for the nearest \(\{1\}\), \(\{1, i\}\), and \(\{1, 2, i\}\)-inverses with respect to the multiplicative perturbation model was considered in Meng et al. (2017).

In addition, the perturbation theory for tensor eigenvalue and singular value problems has been investigated recently. The perturbation bounds for the eigenvalue and singular value problems of even-order tensors were considered in Che et al. (2016). The explicit estimation of the backward error for the largest eigenvalue of an irreducible nonnegative tensor was given in Li and Ng (2014).

Our intention in the present paper is to extend the results concerning the perturbation of the Moore–Penrose inverse from the complex matrix space to the tensor space. In particular, we extend the classical results derived in Stewart (1977) and Wedin (1973) for the matrix case to even-order tensors.

The null spaces and the ranges of tensors were introduced in Ji and Wei (2017).

Definition 1.2

(Ji and Wei 2017) For \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\), the range \({\mathcal {R}}({\mathcal {A}})\) and the null space \({\mathcal {N}}({\mathcal {A}})\) of \({\mathcal {A}}\) are defined by

$$\begin{aligned}&{\mathcal {R}}({\mathcal {A}}) = \{{\mathcal {Y}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}} : \ {\mathcal {Y}} = {\mathcal {A}}*_N{\mathcal {X}}, \ {\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\}\\&{\mathcal {N}}({\mathcal {A}}) = \{{\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}} : {\mathcal {A}}*_N {\mathcal {X}} = {\mathcal {O}}\}, \end{aligned}$$

where \({\mathcal {O}}\) is an appropriate zero tensor.

Definition 1.3

(Orthogonal Projection) The orthogonal projection onto a subspace \({\mathcal {R}}({\mathcal {A}})\) is denoted by \({P}_{{\mathcal {A}}}\) and defined as

$$\begin{aligned} {P}_{{\mathcal {A}}} = {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger . \end{aligned}$$

Clearly, \({P}_{{\mathcal {A}}}\) is Hermitian and idempotent, and \({\mathcal {R}}({P}_{{\mathcal {A}}}) = {\mathcal {R}}({\mathcal {A}})\). Similarly,

$$\begin{aligned} {R}_{{\mathcal {A}}} = {\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}} \end{aligned}$$

is the projection onto \({\mathcal {R}}({\mathcal {A}}^*)\).

Definition 1.4

(Complement of projection) The projection onto \({\mathcal {R}}({\mathcal {A}})^{\bot }\) will be denoted by

$$\begin{aligned} {P}^{\bot }_{{\mathcal {A}}} \equiv {\mathcal {I}} - {P}_{{\mathcal {A}}}. \end{aligned}$$

Likewise

$$\begin{aligned} {R}^{\bot }_{{\mathcal {A}}} \equiv {\mathcal {I}} - {R}_{{\mathcal {A}}} \end{aligned}$$

will denote the projection onto \({\mathcal {R}}({\mathcal {A}}^*)^{\bot }\).

Main contributions of the manuscript can be summarized as follows.

  (1) The spectral norm of a tensor is defined and investigated.

  (2) Useful representations of \({{\mathcal {A}}}*_N{\mathcal A}^\dagger \) and \({{\mathcal {I}}}-{{\mathcal {A}}}*_N{\mathcal A}^\dagger \) are derived.

  (3) The perturbation theory for the Moore–Penrose inverse of even-order tensors via the Einstein product is developed using the derived representations of tensor expressions involving the Moore–Penrose inverse. The derived results therefore represent the first contribution to the perturbation theory of the Moore–Penrose inverse of tensors.

The rest of this paper is organized as follows. The spectral tensor norm is defined and investigated in Sect. 2. Useful representations of \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) and \({{\mathcal {I}}}-{{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) are derived in Sect. 3. Section 4 generalizes some results from matrix theory to the perturbation theory for the Moore–Penrose inverse of even-order tensors via the Einstein product. Numerical examples are presented in Sect. 5.

2 Spectral norm of tensors

To simplify presentation, we use the additional notation

$$\begin{aligned} {\mathbf {I}}(N)={\mathbf {I}}_1\times \cdots \times {\mathbf {I}}_N,\ \ {\mathbb {I}}=\{I_1,\ldots ,I_N\}, \end{aligned}$$

where \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_N\) are positive integers. Then, the tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{M}} \times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) is denoted shortly by \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}(M) \times {\mathbf {K}}(N)}\). The identity tensor \({\mathcal {I}}\) of order \({\mathbf {I}}(N)\times {\mathbf {I}}(N)\) is defined as in Brazell et al. (2013) by

$$\begin{aligned} {\mathcal {I}}_{i_1\ldots i_N\, j_1\ldots j_N}=\prod \limits _{k=1}^N \delta _{i_kj_k}, \end{aligned}$$

where

$$\begin{aligned} \delta _{ij}=\left\{ \begin{array}{ll} 1, &{}\quad i=j,\\ 0, &{}\quad i\ne j \end{array}\right. \end{aligned}$$

denotes the Kronecker delta operator.

The Frobenius inner product of two tensors \({\mathcal {A}}, {\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {I}}({N})}\) is defined as

$$\begin{aligned} ({\mathcal {A}}, {\mathcal {B}}) =\sum \limits _{i_1=1}^{{\mathbf {I}}_{1}}\sum \limits _{i_2=1}^{{\mathbf {I}}_{2}}\cdots \sum \limits _{i_N=1}^{{\mathbf {I}}_{N}} \overline{{\mathcal {A}}_{ i_1 i_2 \ldots i_N}}\, {\mathcal {B}}_{ i_1 i_2\ldots i_N}. \end{aligned}$$

If \(({\mathcal {A}}, {\mathcal {B}}) = 0\), then \({\mathcal {A}}\) is orthogonal to \({\mathcal {B}}\). The Frobenius norm of \({\mathcal {A}}\) is defined by \(\Vert {\mathcal {A}}\Vert _F=\sqrt{({\mathcal {A}}, {\mathcal {A}})}.\)
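In MATLAB-like notation, these quantities admit the following one-line realizations (an illustrative sketch, not taken from the cited sources):

```matlab
% Frobenius inner product and norm of (complex) tensors stored as arrays A and B.
ip   = sum(conj(A(:)) .* B(:));   % (A, B)
nrmF = norm(A(:));                % ||A||_F = sqrt((A, A))
```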

A complex (real) tensor of order m and dimension n is defined by \({\mathcal {A}}=({\mathcal {A}}_{ i_1 \ldots i_m})\), \({\mathcal {A}}_{ i_1\ldots i_m}\in {\mathbb {C}}\ ({\mathbb {R}})\), where \(i_j=1,\ldots ,n\) for each \(j= 1, \ldots ,m\). If \({\mathbf {x}}=(x_1,\ldots ,x_n)^{\mathrm T}\) is an n-dimensional vector, then \({\mathbf {x}}^m={\mathbf {x}} \otimes {\mathbf {x}} \otimes \cdots \otimes {\mathbf {x}}\) is the mth order n-dimensional rank-one tensor with entries \({ ({\mathbf {x}}^m)_{i_{1} \ldots i_{m}}= x_{i_1}\cdots x_{i_m}}\), where “\(\otimes \)” denotes the outer product of vectors. Then

$$\begin{aligned} {\mathcal {A}}{\mathbf {x}}^m=\sum \limits _{i_1,i_2,\ldots ,i_m=1}^n {\mathcal {A}}_{ i_1 i_2\ldots i_m} x_{ i_1} x_{i_2}\cdots x_{i_m} \end{aligned}$$

is the tensor product of \({\mathcal {A}}\) and \({\mathbf {x}}^m\). A tensor–vector multiplication of a tensor \({\mathcal {A}}=(a_{i_1\ldots i_m})\in {\mathbb {C}}^{n\times \cdots \times n}\) of order m and dimension n and an n-dimensional vector \({\mathbf {x}}=(x_1, x_2,\ldots ,x_n)^{\mathrm T}\) is the n-dimensional vector \({\mathcal {A}}{\mathbf {x}}^{m-1}\), whose ith component is equal to

$$\begin{aligned} ({\mathcal {A}}{\mathbf {x}}^{m-1})_i=\sum \limits _{i_2,\ldots ,i_m=1}^n a_{ i i_2\ldots i_m}\, x_{ i_2} \cdots x_{i_m}. \end{aligned}$$

The eigenvalue of a tensor was introduced in Qi (2005). A complex number \(\lambda \) is called an eigenvalue of \({\mathcal {A}}\) and \({\mathbf {x}}\) an eigenvector of \({\mathcal {A}}\) associated with \(\lambda \) if the equation

$$\begin{aligned} {\mathcal {A}}{\mathbf {x}}^{m-1}=\lambda {\mathbf {x}}^{[m-1]},\ \ {\mathbf {x}}^{[m-1]}=\left( x_1^{m-1}, x_2^{m-1},\ldots ,x_n^{m-1}\right)^{\mathrm T}, \end{aligned}$$

is satisfied.

Recently, Liang et al. (2019) proposed a new definition of the eigenvalue of an even-order square tensor. Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) be given. If a nonzero tensor \({\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) and a complex number \(\lambda \) satisfy

$$\begin{aligned} {\mathcal {A}}*_{N}{\mathcal {X}}=\lambda {\mathcal {X}}, \end{aligned}$$
(2.1)

then \(\lambda \) is an eigenvalue of the tensor \( {\mathcal {A}}\) and \( {\mathcal {X}}\) the eigentensor with respect to \(\lambda \).

In addition, we recall the following definition of the tensor spectral norm from Li (2016).

Definition 2.1

(Li 2016) For a given tensor \({\mathcal {T}}\in {\mathbb {R}}^{{\mathbf {I}}_{1}\times {\mathbf {I}}_2\times \cdots \times {\mathbf {I}}_N}\), the spectral norm of \({\mathcal {T}}\), denoted by \(\Vert {\mathcal {T}}\Vert _{\sigma }\), is defined as

$$\begin{aligned} \Vert {\mathcal {T}}\Vert _{\sigma }:= \max \left\{ \langle {\mathcal {T}} , {\mathbf {x}}_1\otimes {\mathbf {x}}_2\otimes \cdots \otimes {\mathbf {x}}_N\rangle : {\mathbf {x}}_k\in {\mathbb {R}}^{{\mathbf {I}}_{k}},\ \Vert {\mathbf {x}}_k\Vert _F = 1,\ k = 1,\ldots ,N\right\} , \end{aligned}$$

where \(\Vert {\mathbf {x}}\Vert _F\) denotes the Frobenius norm of the vector \({\mathbf {x}}\) and \({\mathbf {x}}_1\otimes \cdots \otimes {\mathbf {x}}_N\) means the outer product of vectors: \(({\mathbf {x}}_1\otimes \cdots \otimes {{\mathbf {x}}_N)}_{i_1,\ldots ,i_N}=({\mathbf {x}}_1)_{i_1}\cdots ({\mathbf {x}}_N)_{i_N}\).

Essentially, \(\Vert {\mathcal {T}}\Vert _{\sigma }\) is the maximal value of the Frobenius inner product between \({\mathcal {T}}\) and the rank-one tensor \({\mathbf {x}}_1\otimes \cdots \otimes {\mathbf {x}}_N\) whose Frobenius norm is one.

Let the eigenvalues of a complex even-order square tensor be defined as in (2.1). By \(\lambda _{\min }({\mathcal {K}})\) and \(\lambda _{\max }({\mathcal {K}})\), we denote the smallest and the largest eigenvalue of a tensor \({\mathcal {K}}\), respectively. Similarly, \({\mu _{1}({\mathcal {K}})}\) stands for the largest singular value of a tensor \({\mathcal {K}}\).

Lemma 2.1

Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\). Then, the spectral norm of \({\mathcal {A}}\) can be defined as

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _{2} = \sqrt{\lambda _{\max }\left( {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\right) } = \mu _{1}({\mathcal {A}}), \end{aligned}$$
(2.2)

where \(\lambda _{\max }\left( {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\right) \) denotes the largest eigenvalue of \({\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\) and \(\mu _{1}({\mathcal {A}})\) is the largest singular value of \({\mathcal {A}}\).

Proof

It is necessary to verify that the definition (2.2) satisfies properties of a norm function.

(1) Clearly, \(\Vert {\mathcal {A}}\Vert _{2} \ge 0 \), and \(\Vert {\mathcal {A}}\Vert _{2} = 0\) if and only if \({\mathcal {A}}={\mathcal {O}}\).

(2) The second property of \(\Vert {\mathcal {A}}\Vert _{2}\) can be verified using

$$\begin{aligned} \begin{aligned} \Vert k{\mathcal {A}}\Vert _{2}&= \sqrt{\lambda _{\max }\left( (k{\mathcal {A}})^{*}*_{{N}}(k{\mathcal {A}})\right) }\\&= \sqrt{|k|^{2}\, \lambda _{\max }\left( {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\right) }\\&= |k|\, \mu _{1}({\mathcal {A}})\\&= |k|\, \Vert {\mathcal {A}}\Vert _{2}. \end{aligned} \end{aligned}$$

(3) Since

$$\begin{aligned} \mu _{1}({\mathcal {A}}+{\mathcal {B}}) \le \mu _{1}({\mathcal {A}}) + \mu _{1}({\mathcal {B}}), \end{aligned}$$

immediately from the definition of the spectral norm it follows that

$$\begin{aligned} \Vert {\mathcal {A}}+{\mathcal {B}}\Vert _{2} \le \Vert {\mathcal {A}}\Vert _{2} +\Vert {\mathcal {B}}\Vert _{2}. \end{aligned}$$

Therefore, (2.2) is a valid definition of a tensor norm. \(\square \)

Our intention is to determine the spectral norm of a tensor explicitly using an approach based on matricization (unfolding). Matricization is the operation that rearranges a tensor into a matrix. Let \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_M,{\mathbf {K}}_1,\ldots ,{\mathbf {K}}_N\) be positive integers. Assume that \({\mathfrak {I}}, {\mathfrak {K}}\) are positive integers defined by

$$\begin{aligned} {\mathfrak {I}}={\mathbf {I}}_{1}{\mathbf {I}}_{2}\cdots {\mathbf {I}}_{M}, \ \ {\mathfrak {K}}={\mathbf {K}}_{1}{\mathbf {K}}_{2}\cdots {\mathbf {K}}_{N}. \end{aligned}$$
(2.3)

Denote by \(\mathrm {Mat}({\mathcal {A}})\) the matrix obtained after the matricization

$$\begin{aligned} \mathrm {Mat}: {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\mapsto {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}, \end{aligned}$$

which transforms a tensor \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\) into the matrix \(A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\). An arbitrary tensor \({\mathcal {A}}\) can be unfolded into an appropriate matrix A in different ways.

It is known that the spectral norm of the tensor is bounded by the spectral norm of the matricized tensor, i.e., \(\Vert \mathrm {Mat}({\mathcal {A}})\Vert _{\sigma }\ge \Vert {\mathcal {A}}\Vert _{\sigma }\) (see, for example, Li 2016).

One approach in the matricization, denoted by \(\psi :{\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\mapsto {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\) was proposed in Liang and Zheng (2018, Definition 2.4) (see also Brazell et al. 2013). The matricization \(\psi \) is defined by

$$\begin{aligned} \psi ({\mathcal {A}}_{i_1,\ldots ,i_M,k_1,\ldots ,k_N})=A_{\mathrm {ivec}(i,{\mathbb {I}}),\mathrm {ivec}(k,{\mathbb {K}})}, \end{aligned}$$

where \({i}=(i_1,\ldots ,i_M)^{\mathrm T}\), \({k}=(k_1,\ldots ,k_N)^{\mathrm T}\), \({\mathbb {K}}=\{K_1,\ldots ,K_N\}\), and

$$\begin{aligned} \mathrm {ivec}({i},{\mathbb {I}})=i_1+\sum \limits _{j=2}^M (i_j-1)\prod \limits _{s=1}^{j-1}I_s. \end{aligned}$$
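A hedged MATLAB sketch of this index map (for subscripts and mode sizes stored as row vectors) is given below; for column-major storage it coincides with the built-in sub2ind applied to the subscripts \(i_1,\ldots ,i_M\).

```matlab
% ivec(i, I) = i_1 + sum_{j>=2} (i_j - 1) * I_1 * ... * I_{j-1}
ivec = @(i, I) i(1) + sum((i(2:end) - 1) .* cumprod(I(1:end-1)));
```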

To define an effective procedure for the tensor matricization, we use the reshaping operation denoted as rsh, which was introduced in Panigrahy et al. (2018). Later, we characterize the spectral norm of a tensor by means of the spectral norm of the reshaped matrix. This operation can be implemented by means of the standard Matlab function reshape.

Definition 2.2

(Panigrahy et al. 2018) Let \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_M,{\mathbf {K}}_1,\ldots ,{\mathbf {K}}_N\) be given positive integers. Assume that \({\mathfrak {I}}, {\mathfrak {K}}\) are the integers defined in (2.3). The reshaping operation

$$\begin{aligned} \mathrm {rsh}: {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\mapsto {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}, \end{aligned}$$

transforms a tensor \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\) into the matrix \(A\in \mathbb C^{{\mathfrak {I}}\times {\mathfrak {K}}}\) using the Matlab function reshape as follows:

$$\begin{aligned} \mathrm {rsh}\left( {\mathcal {A}}\right) =A=\mathrm {reshape}({\mathcal {A}},{\mathfrak {I}},{\mathfrak {K}}),\ \ {\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})},\ A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}. \end{aligned}$$

The inverse reshaping is the mapping defined by

$$\begin{aligned} \begin{aligned}&\mathrm {rsh}^{-1}\,:{\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}} \mapsto {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})},\\&\mathrm {rsh}^{-1}(A)={\mathcal {A}}=\mathrm {reshape}(A,{\mathbf {I}}_1,\ldots ,{\mathbf {I}}_M,{\mathbf {K}}_1,\ldots ,{\mathbf {K}}_N),\ \ A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}},\ {\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}. \end{aligned} \end{aligned}$$

The following result from Panigrahy et al. (2018) will be useful.

Lemma 2.2

(Panigrahy et al. 2018) Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) and \({\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {K}}_{1}\times \cdots \times {\mathbf {K}}_{{N}}\times {\mathbf {L}}_{1}\times \cdots \times {\mathbf {L}}_{{N}}}\) be given tensors, let the integers \({\mathfrak {I}},{\mathfrak {K}}\) be computed as in (2.3), and let \({\mathfrak {L}}={\mathbf {L}}_{1}{\mathbf {L}}_{2}\cdots {\mathbf {L}}_{N}\). Then

$$\begin{aligned} \mathrm {rsh}\left( {\mathcal {A}}*_{N}{\mathcal {B}}\right) =\mathrm {rsh}\left( {\mathcal {A}}\right) \mathrm {rsh}\left( {\mathcal {B}}\right) =AB, \end{aligned}$$
(2.4)

where \(A=\mathrm {rsh}\left( {\mathcal {A}}\right) \in \mathbb C^{{\mathfrak {I}}\times {\mathfrak {K}}},B=\mathrm {rsh}\left( {\mathcal {B}}\right) \in {\mathbb {C}}^{{\mathfrak {K}}\times {\mathfrak {L}}}\).

Applying the inverse reshaping operator \(\mathrm {rsh}^{-1}()\) on both sides in (2.4), it can be concluded that \(\mathrm {rsh}^{-1}(AB) = \mathrm {rsh}^{-1}(A)*_{N} \mathrm {rsh}^{-1}(B) = {\mathcal {A}}*_{N}{\mathcal {B}}\).

Now, our intention is to approximate the tensor norm \(\Vert {\mathcal {A}}\Vert _2\) by an effective computational procedure. For this purpose, we propose Algorithm 1 for computing \(\mathrm {rsh}^{-1}(A)\) in terms of the Singular Value Decomposition (SVD) of A. Since eigenvalues in Liang et al. (2019) are defined for even-order square tensors, our further investigation will be restricted to even-order tensors.

Algorithm 1 (construction of the tensor \({\mathcal {A}}=\mathrm {rsh}^{-1}(A)\) and of the tensors \({{\mathcal {U}}},{{\mathcal {D}}},{{\mathcal {V}}}\) with \({\mathcal {A}}={{\mathcal {U}}}*_N {{\mathcal {D}}} *_N{{\mathcal {V}}}^*\), obtained by reshaping the factors of the matrix SVD \(A=UDV^*\)).
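A minimal MATLAB sketch of Algorithm 1 (the function name tsvd and the calling convention are our assumptions) could read:

```matlab
function [U, D, V] = tsvd(A, I, K)
% Tensor SVD factors of A (of size [I, K]) obtained by reshaping the factors
% of the matrix SVD of rsh(A); that A = U *_N D *_N V^* then holds is
% verified in Lemma 2.3 below.
[Um, Dm, Vm] = svd(reshape(A, prod(I), prod(K)));
U = reshape(Um, [I, I]);   % rsh^{-1}(U_A)
D = reshape(Dm, [I, K]);   % rsh^{-1}(D_A)
V = reshape(Vm, [K, K]);   % rsh^{-1}(V_A)
end
```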

In Lemma 2.3, we show that the tensor \({\mathcal {A}}\) in Algorithm 1 is well defined.

Lemma 2.3

The tensor \({\mathcal {A}}\) in Algorithm 1 is well defined.

Proof

Under the assumptions of Algorithm 1, an application of Lemma 2.2 gives

$$\begin{aligned} \begin{aligned} \mathrm {rsh}\left( {{\mathcal {A}}}\right)&=\mathrm {rsh}\left( {\mathcal U}*_N {{\mathcal {D}}} *_N{{\mathcal {V}}}^*\right) \\&=\mathrm {rsh}\left( {{\mathcal {U}}}\right) \mathrm {rsh}\left( {\mathcal D}\right) \mathrm {rsh}\left( {{\mathcal {V}}}^*\right) \\&=UDV^*\\&=A, \end{aligned} \end{aligned}$$

which confirms \({\mathcal {A}}=\mathrm {rsh}^{-1}(A)\). \(\square \)

As a consequence of Algorithm 1, Lemma 2.4 shows that the spectral norm is invariant with respect to the function \(\mathrm {rsh}\).

Lemma 2.4

Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) be a given tensor and integers \({\mathfrak {I}},{\mathfrak {K}}\) are computed as in (2.3). Then

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _2=\Vert \mathrm {rsh}\left( {\mathcal {A}}\right) \Vert _2=\Vert A\Vert _2, \end{aligned}$$
(2.5)

where \(A=\mathrm {rsh}({\mathcal {A}})\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\).

Proof

According to Algorithm 1, the tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) possesses the same singular values as the matrix \(A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\). \(\square \)

Example 2.1

Let \({\mathcal {A}} = \mathrm {rand}(2,2,2,2)\) be defined by

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&=\begin{bmatrix} 0.8147&\quad 0.1270\\ 0.9058&\quad 0.9134 \end{bmatrix},\ {\mathcal {A}}(:,:,2,1) =\begin{bmatrix} 0.6324&\quad 0.2785\\ 0.0975&\quad 0.5469 \end{bmatrix},\\ {\mathcal {A}}(:,:,1,2)&=\begin{bmatrix} 0.9575&\quad 0.1576\\ 0.9649&\quad 0.9706\end{bmatrix},\ {\mathcal {A}}(:,:,2,2) =\begin{bmatrix} 0.9572&\quad 0.8003\\ 0.4854&\quad 0.1419\end{bmatrix}, \end{aligned} \end{aligned}$$

then

$$\begin{aligned} A = \mathrm {rsh}({\mathcal {A}})=\mathrm {reshape}({\mathcal {A}},4,4)=\begin{bmatrix} 0.8147&\quad 0.6324&\quad 0.9575&\quad 0.9572\\ 0.9058&\quad 0.0975&\quad 0.9649&\quad 0.4854\\ 0.1270&\quad 0.2785&\quad 0.1576&\quad 0.8003\\ 0.9134&\quad 0.5469&\quad 0.9706&\quad 0.1419\end{bmatrix}. \end{aligned}$$

Simple verification shows that \(\Vert {\mathcal {A}}\Vert _2=\Vert A\Vert _2=2.6201.\)
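The example can be reproduced with the following sketch, assuming MATLAB's default random number state (which generates exactly the entries listed above):

```matlab
rng('default');
A  = rand(2,2,2,2);
A4 = reshape(A, 4, 4);     % rsh(A), the matrix displayed above
nrm = norm(A4, 2);         % approx. 2.6201 = ||A||_2, in accordance with Lemma 2.4
```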

Various definitions of the tensor rank can be found in the relevant literature. For more details, see Brazell et al. (2013) and Comon et al. (2009). An alternative definition of the tensor rank was introduced in Panigrahy et al. (2018).

Definition 2.3

(Panigrahy et al. 2018) Let \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) and \(A=\mathrm {reshape}\left( {\mathcal {A}},{\mathfrak {I}}, {\mathfrak {K}}\right) =\mathrm {rsh}({\mathcal {A}})\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\) be defined as in Algorithm 1. Then, the tensor rank of \({\mathcal {A}}\) is denoted by \(\mathrm {rshrank}({\mathcal {A}})\) and defined by \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rank}({A})\).
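In MATLAB notation, with I and K denoting the vectors of mode sizes (our convention, not from the cited source), the tensor rank of Definition 2.3 can be evaluated as in the following sketch:

```matlab
rshrankA = rank(reshape(A, prod(I), prod(K)));   % rshrank(A) = rank(rsh(A))
```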

3 Preliminary results

For \(a \in {\mathbb {C}}\), let \(a^\dagger = a^{-1}\) if \(a \ne 0\) and \(a^\dagger = 0\) if \(a = 0\). Following this notation, the tensor \({\mathcal {D}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) is called diagonal if all its entries are zero except possibly \({\mathcal {D}}_{i_{1} \ldots i_{{N}}i_{1} \ldots i_{{N}}}\), that is,

$$\begin{aligned} {\mathcal {D}}_{ i_{1}\cdots i_{{N}}j_{1}\ldots j_{{N}}} = {\left\{ \begin{array}{ll} 0, &{}\quad (i_{1},\ldots ,i_{{N}}) \ne (j_{1},\ldots ,j_{{N}}),\\ {\mathcal {D}}_{ i_{1}\cdots i_{{N}}i_{1}\ldots i_{{N}}} , &{}\quad (i_{1}, \ldots ,i_{{N}}) = (j_{1}, \ldots ,j_{{N}}), \end{array}\right. } \end{aligned}$$
(3.1)

where \({\mathcal {D}}_{i_{1} \ldots i_{{N}}i_{1}\ldots i_{{N}}}\) is a complex number. In particular, if \({\mathcal {D}}_{ i_{1}\ldots i_{{N}}j_{1}\ldots j_{{N}}}= \delta _{i_1j_1}\cdots \delta _{i_Nj_N}\), where

$$\begin{aligned}\delta _{lk}= {\left\{ \begin{array}{ll} 1, &{}\quad l=k, \\ 0, &{}\quad l\ne k \end{array}\right. } \end{aligned}$$

is the Kronecker delta, then \({\mathcal {D}}\) is the unit tensor, denoted by \({\mathcal {I}}\). It follows from Definition 1.1 that the Moore–Penrose inverse \({\mathcal {D}}^\dagger \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) of the diagonal tensor defined in (3.1) is equal to

$$\begin{aligned} \begin{aligned}&({\mathcal {D}}^\dagger )_{j_{1} \ldots j_{{N}}i_{1} \ldots i_{{N}}}= {\left\{ \begin{array}{ll} \frac{1}{{\mathcal {D}}_{i_{1} \ldots i_{{N}}j_{1} \ldots j_{{N}}}},&{}\quad {\mathcal {D}}_{i_{1} \ldots i_{{N}}j_{1} \ldots j_{{N}}}\ne 0,\\ 0, &{}\quad {\mathcal {D}}_{i_{1} \ldots i_{{N}}j_{1} \ldots j_{{N}}}=0. \end{array}\right. }. \end{aligned} \end{aligned}$$

It is easy to see that if \({\mathcal {D}}\) is a diagonal tensor, then \({\mathcal {D}}*_{{N}}{\mathcal {D}}^\dagger \) and \({\mathcal {D}}^\dagger *_{{N}}{\mathcal {D}}\) are diagonal tensors, whose diagonal entries are 1 or 0.

The tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) is orthogonal if \({\mathcal {A}}*_{{N}}{\mathcal {A}}^{*} = {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}} = {\mathcal {I}}\).

A method for computing the Moore–Penrose inverse of a tensor was proposed in Brazell et al. (2013) and Sun et al. (2016). This method is restated in Lemma 3.1.

Lemma 3.1

(Brazell et al. 2013; Sun et al. 2016) For a tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\), the singular value decomposition (SVD) of \({\mathcal {A}}\) has the form:

$$\begin{aligned} {\mathcal {A}} = {\mathcal {U}}*_{{N}}{\mathcal {D}}*_{{N}}{\mathcal {V}}^{*}, \end{aligned}$$
(3.2)

where \({\mathcal {U}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})}\) and \({\mathcal {V}} \in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})}\) are orthogonal tensors, \({\mathcal {D}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) is a diagonal tensor satisfying

$$\begin{aligned} {\mathcal {D}}_{ i_{1}\cdots i_{{N}}k_{1}\ldots k_{{N}}} = {\left\{ \begin{array}{ll} 0, &{}\quad (i_{1},\ldots ,i_{{N}}) \ne (k_{1},\ldots ,k_{{N}}),\\ \mu _{i_{1} \ldots i_{{N}}}, &{}\quad (i_{1}, \ldots ,i_{{N}}) = (k_{1}, \ldots ,k_{{N}}), \end{array}\right. } \end{aligned}$$

wherein \(\mu _{i_{1} \ldots i_{{N}}}\) are the singular values of \({\mathcal {A}}\). Then

$$\begin{aligned} {\mathcal {A}}^\dagger = {\mathcal {V}}*_{{N}}{\mathcal {D}}^\dagger *_{{N}}{\mathcal {U}}^{*}, \end{aligned}$$
(3.3)

where

$$\begin{aligned} ({\mathcal {D}}^\dagger )_{ k_{1}\ldots k_{{N}} i_{1} \ldots i_{{N}}} = {\left\{ \begin{array}{ll} 0, &{}\quad (i_{1} \ldots i_{{N}}) \ne (k_{1} \ldots k_{{N}}),\\ \left( \mu _{i_{1} \ldots i_{{N}}}\right) ^\dagger , &{}\quad (i_{1} \ldots i_{{N}}) = (k_{1} \ldots k_{{N}}), \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \left( \mu _{i_{1} \ldots i_{{N}}}\right) ^\dagger = {\left\{ \begin{array}{ll} 0, &{}\quad \mu _{i_{1} \ldots i_{{N}}}=0,\\ \left( \mu _{i_{1} \ldots i_{{N}}}\right) ^{-1}, &{}\quad \mu _{i_{1} \ldots i_{{N}}}\ne 0. \end{array}\right. } \end{aligned}$$
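Since \(\mathrm {rsh}\) preserves the Einstein product (Lemma 2.2) and the conjugate transpose, the representation (3.3) may equivalently be evaluated by applying the matrix pseudoinverse to the reshaped tensor. The following MATLAB sketch (the function name tpinv and its interface are our assumptions) illustrates this; one can check numerically that its output satisfies the four equations of Definition 1.1 up to rounding errors.

```matlab
function Ap = tpinv(A, I, K)
% Moore-Penrose inverse of the tensor A of size [I, K]:
% A^dagger = rsh^{-1}( pinv( rsh(A) ) ), which agrees with (3.3).
Ap = reshape(pinv(reshape(A, prod(I), prod(K))), [K, I]);
end
```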

An effective algorithm for computing the Moore–Penrose inverse of a tensor in the form (3.3) was presented in Algorithm 1 from Huang et al. (2018). To compute the Moore–Penrose inverse by means of (3.3), it is necessary to compute the transpose of a tensor. For this purpose, we developed the following Algorithm 2.

Algorithm 2 (computation of the tensor transpose \({\mathcal {A}}^{\mathrm T}=\mathrm {rsh}^{-1}\left( \mathrm {rsh}({\mathcal {A}})^{\mathrm T}\right) \)).
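A minimal MATLAB sketch of Algorithm 2 (the function name is our assumption) is:

```matlab
function At = ttranspose(A, I, K)
% Tensor transpose A^T = rsh^{-1}( rsh(A).' ) for A of size [I, K];
% replacing .' by ' yields the conjugate transpose A^*.
At = reshape(reshape(A, prod(I), prod(K)).', [K, I]);
end
```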

Example 3.1

This example is aimed at the verification of Algorithm 2. Consider \({\mathcal {A}}=\mathrm {rand}(2,2,2,2)\) equal to

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&= \begin{bmatrix} 0.8147&\quad 0.1270\\ 0.9058&\quad 0.9134\end{bmatrix} ;\ \ {\mathcal {A}}(:,:,2,1) = \begin{bmatrix} 0.6324&\quad 0.2785\\ 0.0975&\quad 0.5469\end{bmatrix} ;\\ {\mathcal {A}}(:,:,1,2)&= \begin{bmatrix} 0.9575&\quad 0.1576\\ 0.9649&\quad 0.9706\end{bmatrix} ;\ \ {\mathcal {A}}(:,:,2,2) = \begin{bmatrix} 0.9572&\quad 0.8003\\ 0.4854&\quad 0.1419\end{bmatrix} . \end{aligned} \end{aligned}$$

Then, \(A = \mathrm {rsh}\left( {\mathcal {A}}\right) =\mathrm {reshape}({\mathcal {A}},4,4)\) is equal to

$$\begin{aligned} A=\begin{bmatrix} 0.8147&\quad 0.6324&\quad 0.9575&\quad 0.9572\\ 0.9058&\quad 0.0975&\quad 0.9649&\quad 0.4854\\ 0.1270&\quad 0.2785&\quad 0.1576&\quad 0.8003\\ 0.9134&\quad 0.5469&\quad 0.9706&\quad 0.1419 \end{bmatrix}. \end{aligned}$$

Furthermore, the result of Algorithm 2 is equal to \({\mathcal {A}}^{\mathrm T}=\mathrm {reshape}(A^{\mathrm T},2,2,2,2)\), which gives

$$\begin{aligned} \begin{aligned} {\mathcal {A}}^{\mathrm T}(:,:,1,1)&= \begin{bmatrix} 0.8147&\quad 0.9575\\ 0.6324&\quad 0.9572\end{bmatrix} ;\ \ {\mathcal {A}}^{\mathrm T}(:,:,2,1) = \begin{bmatrix} 0.9058&\quad 0.9649\\ 0.0975&\quad 0.4854\end{bmatrix} ;\\ {\mathcal {A}}^{\mathrm T}(:,:,1,2)&= \begin{bmatrix} 0.1270&\quad 0.1576\\ 0.2785&\quad 0.8003\end{bmatrix} ;\ \ {\mathcal {A}}^{\mathrm T}(:,:,2,2) = \begin{bmatrix} 0.9134&\quad 0.9706\\ 0.5469&\quad 0.1419\end{bmatrix} . \end{aligned} \end{aligned}$$
(3.4)

On the other hand, a direct calculation gives

$$\begin{aligned} \begin{aligned} a_{1111}&=0.8147={\mathcal {A}}^{\mathrm T}_{1111};\ a_{2111}=0.9058={\mathcal {A}}^{\mathrm T}_{1121};\\ a_{1211}&=0.1270={\mathcal {A}}^{\mathrm T}_{1112};\ a_{2211}=0.9134={\mathcal {A}}^{\mathrm T}_{1122};\\ a_{1121}&=0.6324={\mathcal {A}}^{\mathrm T}_{2111};\ a_{2121}=0.0975={\mathcal {A}}^{\mathrm T}_{2121};\\ a_{1221}&=0.2785={\mathcal {A}}^{\mathrm T}_{2112};\ a_{2221}=0.5469={\mathcal {A}}^{\mathrm T}_{2122};\\ a_{1112}&=0.9575={\mathcal {A}}^{\mathrm T}_{1211};\ a_{2112}=0.9649={\mathcal {A}}^{\mathrm T}_{1221};\\ a_{1212}&=0.1576={\mathcal {A}}^{\mathrm T}_{1212};\ a_{2212}=0.9706={\mathcal {A}}^{\mathrm T}_{1222};\\ a_{1122}&=0.9572={\mathcal {A}}^{\mathrm T}_{2211};\ a_{2122}=0.4854={\mathcal {A}}^{\mathrm T}_{2221};\\ a_{1222}&=0.8003={\mathcal {A}}^{\mathrm T}_{2212};\ a_{2222}=0.1419={\mathcal {A}}^{\mathrm T}_{2222}. \end{aligned} \end{aligned}$$

Therefore, the result of direct calculation coincides with (3.4).

Lemma 3.2

Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) be a given tensor, let \(\mu _{i_{1},\cdots ,i_{{N}}}\) be the singular values of \({\mathcal {A}}\) and \(\nu _{i_{1},\cdots ,i_{{L}}}\) be the nonzero singular values of \({\mathcal {A}}\). Then

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _{2} = \mu _{1}({\mathcal {A}}); \qquad \Vert {\mathcal {A}}^\dagger \Vert _{2} = \frac{1}{\nu _{\min }({\mathcal {A}})}, \end{aligned}$$

where \(\nu _{\min }({\mathcal {A}})\) denotes the smallest nonzero singular value of \({\mathcal {A}}\).

Proof

The identity \(\Vert {\mathcal {A}}\Vert _{2} = \mu _{1}({\mathcal {A}})\) follows from Lemma 2.1. Since \(\nu _{i_{1},\ldots ,i_{{L}}}\) are the nonzero singular values of \({\mathcal {A}}\) defined in (3.2), it follows from (3.3) that \(\left( \nu _{i_{1},\ldots ,i_{{L}}}\right) ^{-1} > 0\) are the nonzero singular values of \({\mathcal {A}}^\dagger \). Accordingly, \(\Vert {\mathcal {A}}^\dagger \Vert _{2} =\mu _{1}({\mathcal {A}}^\dagger )= \frac{1}{\nu _{\min }({\mathcal {A}})}\). \(\square \)

A useful representation for \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) is derived in Lemma 3.3.

Lemma 3.3

Let \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) be an arbitrary tensor and let the positive integers \({\mathfrak {I}}, {\mathfrak {K}}\) be defined as in (2.3). Then

$$\begin{aligned} {{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger ={{\mathcal {U}}}_A*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_A^*. \end{aligned}$$
(3.5)

Proof

We follow Algorithm 1 from Huang et al. (2018) to define \({\mathcal {A}}^\dagger \). According to Step 1, it is necessary to reshape \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) into a matrix \(A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\), where \({\mathfrak {I}}, {\mathfrak {K}}\) are defined in (2.3). This transformation is denoted by \(\mathrm {rsh}({\mathcal {A}})=A\). Step 2 assumes the Singular Value Decomposition (SVD) of A of the form \([U_A,D_A,V_A]=SVD(A)\), which implies \(A=U_A D_A V_A^*\), where \(U_A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {I}}}\) and \(V_A\in {\mathbb {C}}^{{\mathfrak {K}}\times {\mathfrak {K}}}\) are unitary and the matrix \(D_A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\) is of the diagonal form:

$$\begin{aligned} D_A=\begin{bmatrix} \Sigma _A&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)} \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} \Sigma _A\in {\mathbb {C}}^{{\mathfrak {I}}_R\times {\mathfrak {I}}_R},\ {\mathfrak {I}}_R=\mathrm {rshrank}({\mathcal {A}})=\mathrm {rank}(A) \end{aligned}$$

is diagonal with singular values of A on the main diagonal and

$$\begin{aligned} O_{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\in \mathbb C^{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}, O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}\in \mathbb C^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}, O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\in \mathbb C^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)} \end{aligned}$$

are appropriate zero blocks. According to Step 3, we perform the reshaping operations:

$$\begin{aligned} \mathrm {rsh}^{-1}(U_A)={{\mathcal {U}}}_A\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})},\ \ \mathrm {rsh}^{-1}(V_A^*)={{\mathcal {V}}}_A^*\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})},\ \ \mathrm {rsh}^{-1}(D_A)={{\mathcal {D}}}_A\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}. \end{aligned}$$

Then, compute

$$\begin{aligned} D_A^\dagger =\begin{bmatrix} \Sigma _A^{-1}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {K}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {K}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \in {\mathbb {C}}^{{\mathfrak {K}}\times {\mathfrak {I}}} \end{aligned}$$

and

$$\begin{aligned} {{\mathcal {D}}}_A^\dagger =\mathrm {rsh}{^{-1}}(D_A^\dagger )\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {I}}({N})}. \end{aligned}$$

According to Step 4 of Algorithm 1 from Huang et al. (2018)

$$\begin{aligned} {{\mathcal {A}}}^\dagger ={{\mathcal {V}}}_A*_N {{\mathcal {D}}}_A^\dagger *_N{{\mathcal {U}}}_A^*. \end{aligned}$$

Now, the tensor \({{\mathcal {A}}}\) possesses the representation:

$$\begin{aligned} {{\mathcal {A}}}={{\mathcal {U}}}_A*_N {{\mathcal {D}}}_A *_N{\mathcal V}_A^*. \end{aligned}$$

Further, one can verify

$$\begin{aligned} {{\mathcal {D}}}_A*_N{{\mathcal {D}}}_A^\dagger = \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})}, \end{aligned}$$

where \(I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}\) is the identity \({\mathfrak {I}}_R\times {\mathfrak {I}}_R\) matrix. Consequently, \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) possesses the representation (3.5). \(\square \)
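Since all the factors in (3.5) are images of matrices under \(\mathrm {rsh}^{-1}\), the representation can be checked at the matrix level. The following sketch (the sizes and the seed are arbitrary choices for illustration; the corresponding tensor would be \(\mathrm {rsh}^{-1}(\texttt {Am})\)) verifies that \(\mathrm {rsh}({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger )\) coincides with the projector built from the identity and zero blocks in (3.5):

```matlab
rng(2);  I = [2 2];  K = [2 3];
Am = rand(prod(I), 2) * rand(2, prod(K));          % rsh(A), built with rank r = 2
[Ua, ~, ~] = svd(Am);  r = rank(Am);
P = Ua * blkdiag(eye(r), zeros(prod(I) - r)) * Ua';
norm(Am * pinv(Am) - P, 'fro')                     % approx. 0 up to rounding
```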

The result of Proposition 3.1 will be useful.

Proposition 3.1

(Meng and Zheng 2010) Let \(W\in {\mathbb {C}}^{n\times n}\) be a unitary matrix with the block form:

$$\begin{aligned} W =\begin{bmatrix} W_{11}&\quad W_{12} \\ W_{21}&\quad W_{22}\end{bmatrix}, \ W_{11} \in {\mathbb {C}}^{r\times r}, W_{22}\in {\mathbb {C}}^{(n-r)\times (n-r)}, 1\le r < n. \end{aligned}$$

Then, \(\Vert W_{12}\Vert = \Vert W_{21}\Vert \) for any unitarily invariant norm.

4 Main results

For the sake of convenience, we assume that the following condition holds

$$\begin{aligned} \begin{aligned} {\mathcal {A}}, {\mathcal {E}}&\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}, \ {\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}, \ \mathrm {rshrank}({{{\mathcal {A}}}})=\mathrm {rshrank}({{{\mathcal {B}}}}) = r\\ \bigtriangleup&= \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2} < 1. \end{aligned} \end{aligned}$$
(4.1)

Lemma 4.1

If condition (4.1) is satisfied, then

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger \Vert _{2} \le \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup } \end{aligned}$$
(4.2)

Proof

According to Lemma 3.2, we get

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger \Vert _{2} = \frac{1}{\nu _{\min }({\mathcal {A}} + {\mathcal {E}})}, \end{aligned}$$

so that, by Weyl's inequality for singular values and the rank condition in (4.1),

$$\begin{aligned} \begin{aligned} \frac{1}{\Vert {\mathcal {B}}^\dagger \Vert _{2}}&= \nu _{\min }({\mathcal {A}} + {\mathcal {E}}) \ge \nu _{\min }({\mathcal {A}}) - \mu _{1}({\mathcal {E}})\\&= \nu _{\min }({\mathcal {A}}) - \Vert {\mathcal {E}}\Vert _{2} \\&= \frac{1}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} - \Vert {\mathcal {E}}\Vert _{2}. \end{aligned} \end{aligned}$$

Then

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger \Vert _{2} \le \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}} = \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup }, \end{aligned}$$

which completes the proof. \(\square \)

Next, we give the decomposition of \({\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \).

Theorem 4.1

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) and \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\). Then

$$\begin{aligned} \begin{aligned} {\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger&= -{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger + {\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}*_{{N}}{P}^{\bot }_{{\mathcal {A}}} -{R}^{\bot }_{{\mathcal {B}}} *_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {A}}^\dagger . \end{aligned} \end{aligned}$$

Proof

After some verifications, one can obtain

$$\begin{aligned} \begin{aligned} {\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger&= -{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger +({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger ) +{\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}-{\mathcal {A}})*_{{N}}{\mathcal {A}}^\dagger \\&= -{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger +{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ) -({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger . \end{aligned} \end{aligned}$$

According to the properties \((3^T)\) and \((1^T)\) from Definition 1.1, it follows that

$$\begin{aligned} {\mathcal {A}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )= {\mathcal {A}}^{*}-{\mathcal {A}}^{*}*_{{N}}({\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{*} = {\mathcal {A}}^{*}-\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}}\right) ^{*}={\mathcal {O}}, \end{aligned}$$

where \({\mathcal {O}}\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {I}}({N})}\) is an appropriate zero tensor. Consequently

$$\begin{aligned}&{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )= {\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\\&\quad ={\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\\&\quad ={\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}^\dagger )^{*} {{*_{{N}}}}({\mathcal {A}} +{\mathcal {E}})^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\\&\quad ={\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ). \end{aligned}$$

Analogously, we arrive at

$$\begin{aligned} ({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {B}}^{*}={\mathcal {O}}, \end{aligned}$$

which, in view of \({\mathcal {A}}^\dagger ={\mathcal {A}}^{*}*_{{N}}({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {A}}^\dagger \), further implies

$$\begin{aligned} ({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger =-({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {A}}^\dagger . \end{aligned}$$

Combining the above identities completes the proof. \(\square \)

Lemma 4.2

If \({\mathcal {O}} \ne {\mathcal {P}} \in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})}\), and \({\mathcal {P}}^{2}={\mathcal {P}}={\mathcal {P}}^{*}\), then

$$\begin{aligned} \Vert {\mathcal {P}}\Vert _{2}=1. \end{aligned}$$

Proof

Since

$$\begin{aligned} \Vert {\mathcal {P}}\Vert _{2}^{2}=\Vert {\mathcal {P}}^{*}*_{{N}}{\mathcal {P}}\Vert _{2}=\Vert {\mathcal {P}}^{2}\Vert _{2}=\Vert {\mathcal {P}}\Vert _{2}, \end{aligned}$$

it follows that

$$\begin{aligned} \Vert {\mathcal {P}}\Vert _{2}(\Vert {\mathcal {P}}\Vert _{2}-1)=0. \end{aligned}$$

Therefore, \(\Vert {\mathcal {P}}\Vert _{2}=1\) in the case \({\mathcal {P}} \ne {\mathcal {O}}\). \(\square \)

Theorem 4.2

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) and \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\). If condition (4.1) is satisfied, then

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} \le \left( 1+\frac{1}{1-\bigtriangleup }+\frac{1}{(1-\bigtriangleup )^{2}}\right) \bigtriangleup . \end{aligned}$$
(4.3)

Proof

Since \(({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{2}=({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )=({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{*}\) and \(({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})^{2}=({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})=({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})^{*}\), by Lemma 4.2

$$\begin{aligned} \Vert {\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger \Vert _{2}=1,\quad \Vert {\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}\Vert _{2}=1, \end{aligned}$$

and from Theorem 4.1

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2} \le (\Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {B}}^\dagger \Vert _{2}+\Vert {\mathcal {B}}^\dagger \Vert _{2}^{2}+\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2})\Vert {\mathcal {E}}\Vert _{2}. \end{aligned}$$

An application of Lemma 4.1 gives

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2} \le \left( \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}}{1 - \bigtriangleup }+ \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}}{(1-\bigtriangleup )^{2}}+\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}\right) \Vert {\mathcal {E}}\Vert _{2}. \end{aligned}$$

Furthermore, the inequality (4.3) can be verified taking into account \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}\). \(\square \)
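The bound (4.3) can be illustrated numerically; by Lemma 2.2 and Lemma 2.4, all the norms involved may be evaluated after reshaping. A hedged sketch (the sizes, the perturbation level, and the seed are arbitrary choices, not taken from the paper) is:

```matlab
rng(1);  I = [2 2];  K = [3 2];
A  = rand([I, K]);   E = 1e-3 * rand([I, K]);   B = A + E;
Am = reshape(A, prod(I), prod(K));  Bm = reshape(B, prod(I), prod(K));
Em = reshape(E, prod(I), prod(K));
d   = norm(pinv(Am), 2) * norm(Em, 2);                   % the quantity in (4.1)
lhs = norm(pinv(Bm) - pinv(Am), 2) / norm(pinv(Am), 2);  % left-hand side of (4.3)
rhs = (1 + 1/(1 - d) + 1/(1 - d)^2) * d;                 % right-hand side of (4.3)
assert(lhs <= rhs)
```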

Theorem 4.3

If \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}, {\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\), and \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})\), then

$$\begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2} =\Vert {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}, \end{aligned}$$
(4.4)

where \({\mathcal {I}}\) is the identity \({\mathbf {I}}(N)\times {\mathbf {I}}(N)\) tensor.

Proof

According to Lemma 3.3, it follows that

$$\begin{aligned} {{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger ={{\mathcal {U}}}_A*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_A^*. \end{aligned}$$

Furthermore, using Lemma 2.2 and

$$\begin{aligned} {\mathcal {I}}=\mathrm {rsh}{^{-1}}\left( I_{{\mathfrak {I}}\times {\mathfrak {I}}}\right) \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})}, \end{aligned}$$

it follows that

$$\begin{aligned} \begin{aligned} {\mathcal {I}}-{{\mathcal {A}}}{{*_{{N}}}}{\mathcal A}^\dagger&={{\mathcal {U}}}_A*_N \left( {\mathcal {I}}-\mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \right) *_N{{\mathcal {U}}}_A^*\\&= {{\mathcal {U}}}_A*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_A^*. \end{aligned} \end{aligned}$$

Similarly, in view of \(\mathrm {rshrank}({{\mathcal {B}}})=\mathrm {rshrank}({{\mathcal {A}}})\), it follows that \(\mathrm {rank}({B})=\mathrm {rank}({A})\), where \(\mathrm {rsh}({\mathcal {A}})=A\) and \(\mathrm {rsh}({\mathcal {B}})=B\). The SVD of \(B\) is given by \([U_B,D_B,V_B]=SVD(B)\). Now, consider the reshaping operations

$$\begin{aligned} \begin{aligned} \mathrm {rsh}^{-1}(U_B)&={{\mathcal {U}}}_B\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})},\ \ \mathrm {rsh}^{-1}(V_B^*)={{\mathcal {V}}}_B^*\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})},\\ \mathrm {rsh}^{-1}(D_B)&= \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} \Sigma _B&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) ={{\mathcal {D}}}_B\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \Sigma _B\in {\mathbb {C}}^{{\mathfrak {I}}_R\times {\mathfrak {I}}_R},\ {\mathfrak {I}}_R=\mathrm {rank}({A}) \end{aligned}$$

is diagonal with the singular values of B on the main diagonal. This yields

$$\begin{aligned}&{{\mathcal {B}}}={{\mathcal {U}}}_B*_N {{\mathcal {D}}}_B *_N{\mathcal V}_B^*,\ \ {{\mathcal {B}}}{{*_{{N}}}}{{\mathcal {B}}}^\dagger \\&\quad ={{\mathcal {U}}}_B*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{\mathcal U}_B^*, \end{aligned}$$

and further

$$\begin{aligned} \begin{aligned} {\mathcal {I}}-{{\mathcal {B}}}{{*_{{N}}}}{\mathcal B}^\dagger&={{\mathcal {U}}}_B*_N \left( {\mathcal {I}}- \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \right) *_N{{\mathcal {U}}}_B^*\\&={{\mathcal {U}}}_B*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_B^*. \end{aligned} \end{aligned}$$

Now, observe the tensor products \({{\mathcal {U}}}_A^**_N {\mathcal U}_B\) and \({{\mathcal {U}}}_B^**_N {{\mathcal {U}}}_A\). They are also unitary and equal to

$$\begin{aligned} \begin{aligned} {{\mathcal {U}}}_A^**_N {{\mathcal {U}}}_B&=\mathrm {rsh}^{-1}\left( U_A^* U_B\right) =\mathrm {rsh}^{-1}\left( \begin{bmatrix} W_{11}&\quad W_{12}\\ W_{21}&\quad W_{22}\end{bmatrix} \right) \\ {{\mathcal {U}}}_B^**_N {{\mathcal {U}}}_A&=\mathrm {rsh}^{-1}\left( U_B^* U_A\right) =\mathrm {rsh}^{-1}\left( \begin{bmatrix} W_{11}^*&\quad W_{21}^*\\ W_{12}^*&\quad W_{22}^*\end{bmatrix} \right) , \end{aligned} \end{aligned}$$

where

$$\begin{aligned} W_{11}\in {\mathbb {C}}^{{\mathfrak {I}}_R\times {\mathfrak {I}}_R},\quad W_{12}\in {\mathbb {C}}^{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)},\quad W_{21}\in {\mathbb {C}}^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R},\quad W_{22}\in {\mathbb {C}}^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}. \end{aligned}$$

In addition, it can be verified that \(\Vert {\cdot }\Vert _2\) is a unitarily invariant tensor norm (Govaerts and Pryce 1989), which, in conjunction with Lemma 2.2, implies

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}\\&\quad = \left\| \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \begin{bmatrix} W_{11}&\quad W_{12}\\ W_{21}&\quad W_{22}\end{bmatrix} \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \right\| _2. \end{aligned} \end{aligned}$$

An application of Lemma 2.4 further implies

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}\\&\quad = \left\| \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \begin{bmatrix} W_{11}&\quad W_{12}\\ W_{21}&\quad W_{22}\end{bmatrix} \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right\| _2\\&\quad =\left\| \begin{bmatrix} O&\quad W_{12}\\ O&\quad O \end{bmatrix} \right\| _2. \end{aligned} \end{aligned}$$

Finally, using the result from Govaerts and Pryce (1989), it follows that

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}= \left\| W_{12}\right\| _2. \end{aligned} \end{aligned}$$

On the other hand, in the dual case, it follows that

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}\\&\quad = \left\| \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \begin{bmatrix} W_{11}^*&\quad W_{21}^*\\ W_{12}^*&\quad W_{22}^*\end{bmatrix} \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right\| _2\\&\quad =\left\| \begin{bmatrix} O&\quad W_{21}^*\\ O&\quad O \end{bmatrix} \right\| _2\\&\quad =\left\| W_{21}^*\right\| _2. \end{aligned} \end{aligned}$$

It remains to verify \(\Vert W_{12}\Vert _2=\Vert W_{21}^*\Vert _2\). Indeed, according to Proposition 3.1, it follows that \(\Vert W_{12}\Vert _2=\Vert W_{21}\Vert _2\), and \(\Vert W_{21}\Vert _2=\Vert W_{21}^*\Vert _2\) holds by the result from Govaerts and Pryce (1989), which completes the proof. \(\square \)
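Since the spectral norm of an even-order tensor coincides with the spectral norm of its unfolding, the key steps of the above proof can be illustrated at the matrix level. The sketch below is illustrative only: the data are random, and \(B\) is built from a perturbed factorization merely to enforce \(\mathrm {rank}(B)=\mathrm {rank}(A)\); it checks that the two projector norms agree with the norms of the off-diagonal blocks of \(W=U_A^*U_B\) and hence with each other.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 6, 8, 3                                   # sizes of the unfoldings and the common rank

# Unfoldings A and B = A + E with rank(A) = rank(B) = r
X, Y = rng.standard_normal((m, r)), rng.standard_normal((r, n))
A = X @ Y
B = (X + 1e-3 * rng.standard_normal((m, r))) @ (Y + 1e-3 * rng.standard_normal((r, n)))

# Orthogonal projectors A A^dagger and B B^dagger
PA = A @ np.linalg.pinv(A, rcond=1e-12)
PB = B @ np.linalg.pinv(B, rcond=1e-12)
I = np.eye(m)
lhs = np.linalg.norm(PB @ (I - PA), 2)
rhs = np.linalg.norm(PA @ (I - PB), 2)

# The same quantities via the off-diagonal blocks of W = U_A^* U_B
UA = np.linalg.svd(A)[0]
UB = np.linalg.svd(B)[0]
W = UA.conj().T @ UB
W12, W21 = W[:r, r:], W[r:, :r]

print(np.isclose(lhs, np.linalg.norm(W21, 2)))      # expected: True
print(np.isclose(rhs, np.linalg.norm(W12, 2)))      # expected: True
print(np.isclose(lhs, rhs))                         # equality (4.4); expected: True
```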

Corollary 4.1

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\), \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\), and assume \(\mathrm {rshrank}({{\mathcal {A}}}) = \mathrm {rshrank}({{\mathcal {B}}})\). If

$$\begin{aligned} {\mathcal {G}}={\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ), \end{aligned}$$

then

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _{2} \le \Vert {\mathcal {E}}\Vert _{2}\Vert {\mathcal {A}}^\dagger \Vert _{2} \Vert {\mathcal {B}}^\dagger \Vert _{2}. \end{aligned}$$
(4.5)

Proof

Clearly, \({\mathcal {G}}={\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\). Then

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _{2} \le \Vert {\mathcal {B}}^\dagger \Vert _{2}\Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}, \end{aligned}$$

since

$$\begin{aligned}&{\mathcal {B}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )={{\mathcal {O}}},\\&({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )^{2} =({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger ) =({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )^{*}. \end{aligned}$$

Therefore, \(\Vert {\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger \Vert _{2}=1\). Applying Theorem 4.3, one can obtain

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}&= \Vert {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2} \\&= \Vert ({\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{*}*_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}\\&= \Vert ({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}\\&\le \Vert ({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}\Vert _{2}=\Vert {\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger \Vert _{2}\\&\le \Vert {\mathcal {E}}\Vert _{2} \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned} \end{aligned}$$

Thus, the statement follows. \(\square \)
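A quick matrix-level sanity check of inequality (4.5), again through the unfoldings and with illustrative random data of equal rank, may look as follows.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 6, 8, 3

# A and B = A + E with equal ranks
X, Y = rng.standard_normal((m, r)), rng.standard_normal((r, n))
A = X @ Y
B = (X + 1e-4 * rng.standard_normal((m, r))) @ (Y + 1e-4 * rng.standard_normal((r, n)))
E = B - A

A_p = np.linalg.pinv(A, rcond=1e-12)
B_p = np.linalg.pinv(B, rcond=1e-12)
G = B_p @ (np.eye(m) - A @ A_p)                    # G = B^dagger (I - A A^dagger)

lhs = np.linalg.norm(G, 2)
rhs = np.linalg.norm(E, 2) * np.linalg.norm(A_p, 2) * np.linalg.norm(B_p, 2)
print(lhs <= rhs)                                  # inequality (4.5); expected: True
```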

Theorem 4.4

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\), \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\), and let \({\mathfrak {I}},{\mathfrak {K}}\) be defined as in (2.3). If the Condition (4.1) is satisfied, then

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} \le k\frac{\triangle }{1-\triangle }, \end{aligned}$$
(4.6)

where \(\triangle = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}\) and the parameter k is defined as follows:

  (1) if \(\mathrm {rshrank}({{\mathcal {A}}}) < \min ({\mathfrak {I}}, {\mathfrak {K}})\), then \(k = \frac{1+\sqrt{5}}{2}\);

  (2) if \(\mathrm {rshrank}({{\mathcal {A}}}) = \min ({\mathfrak {I}}, {\mathfrak {K}})\), then \(k = \sqrt{2}\);

  (3) if \(\mathrm {rshrank}({{\mathcal {A}}}) = {\mathfrak {I}} ={\mathfrak {K}}\), then \(k = 1\).

Proof

Let

$$\begin{aligned} {\mathcal {F}}=-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger ,\ {\mathcal {G}}={\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ),\ {\mathcal {H}}=-({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger . \end{aligned}$$

By Lemma 4.1, we get

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {F}}\Vert _{2}&\le \frac{\triangle }{1-\triangle }\Vert {\mathcal {A}}^\dagger \Vert _{2},\\ \Vert {\mathcal {G}}\Vert _{2}&\le \frac{\triangle }{1-\triangle }\Vert {\mathcal {A}}^\dagger \Vert _{2},\\ \Vert {\mathcal {H}}\Vert _{2}&\le \triangle \Vert {\mathcal {A}}^\dagger \Vert _{2}, \end{aligned} \end{aligned}$$

where \(\triangle = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2} = \Vert ({\mathcal {A}}^\dagger )^{*}\Vert _{2}\Vert {\mathcal {E}}^{*}\Vert _{2}\). Let

$$\begin{aligned} \alpha = \frac{\triangle }{1-\triangle }, \end{aligned}$$

then

$$\begin{aligned} \Vert {\mathcal {F}}\Vert _{2}, \Vert {\mathcal {G}}\Vert _{2}, \Vert {\mathcal {H}}\Vert _{2} \le \alpha \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$

(1) Let \({\mathcal {X}} \in {\mathbb {C}}^{{I}_{1} \times \cdots \times {I}_{{N}} \times {K}_{1} \times \cdots \times {K}_{{N}}}, \Vert {\mathcal {X}}\Vert _{2} = 1\), and \({\mathcal {X}} = {\mathcal {X}}_{1}+{\mathcal {X}}_{2}\), where

$$\begin{aligned} {\mathcal {X}}_{1}={\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {X}},\quad {\mathcal {X}}_{2}=({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}. \end{aligned}$$

Clearly, \({\mathcal {X}}_{1}\) and \({\mathcal {X}}_{2}\) are orthogonal; hence,

$$\begin{aligned} 1 = \Vert {\mathcal {X}}\Vert _{2}^{2}= \Vert {\mathcal {X}}_{1}\Vert _{2}^{2}+\Vert {\mathcal {X}}_{2}\Vert _{2}^{2}. \end{aligned}$$

Therefore, there exists an angle \(\varphi \) such that

$$\begin{aligned} \cos \varphi = \Vert {\mathcal {X}}_{1}\Vert _{2},\quad \sin \varphi = \Vert {\mathcal {X}}_{2}\Vert _{2}. \end{aligned}$$

Then

$$\begin{aligned} \begin{aligned}&({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\\&\quad =-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {X}} +{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\\&\qquad -({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {X}}=\\&{\mathcal {F}}*_{{N}}{\mathcal {X}}_{1}+{\mathcal {G}}*_{{N}}{\mathcal {X}}_{2}+{\mathcal {H}}*_{{N}}{\mathcal {X}}_{1} \equiv {\mathcal {Y}}_{1}+{\mathcal {Y}}_{2}+{\mathcal {Y}}_{3}, \end{aligned} \end{aligned}$$

where \({\mathcal {Y}}_{1}={\mathcal {F}}*_{{N}}{\mathcal {X}}_{1},\ {\mathcal {Y}}_{2}={\mathcal {G}}*_{{N}}{\mathcal {X}}_{2},\ {\mathcal {Y}}_{3}={\mathcal {H}}*_{{N}}{\mathcal {X}}_{1} \).

Since

$$\begin{aligned} ({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {B}}^{*}={\mathcal {O}}, \end{aligned}$$

it is easy to verify that \({\mathcal {Y}}_{3}\) is orthogonal to \({\mathcal {Y}}_{1}\) and \({\mathcal {Y}}_{2}\); therefore

$$\begin{aligned} \Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2}^{2}\le & {} \Vert {\mathcal {Y}}_{1}+{\mathcal {Y}}_{2}\Vert _{2}^{2}+\Vert {\mathcal {Y}}_{3}\Vert _{2}^{2} \\\le & {} \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}[(\Vert {\mathcal {X}}_{1}\Vert _{2}+\Vert {\mathcal {X}}_{2}\Vert _{2})^{2}+\Vert {\mathcal {X}}_{1}\Vert _{2}^{2}] \\= & {} \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}[(\cos \varphi +\sin \varphi )^{2}+\cos ^{2}\varphi ]\\= & {} \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}(3+2\sin 2\varphi +\cos 2\varphi )/2\\\le & {} \left( \frac{3+\sqrt{5}}{2}\right) \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}. \end{aligned}$$

Therefore

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}= \max _{\Vert {\mathcal {X}}\Vert _{2}=1}\Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2} \le \frac{1+\sqrt{5}}{2}\alpha \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$

(2) If \(\mathrm {rshrank}({{\mathcal {A}}})= \mathrm {rshrank}({{\mathcal {B}}}) = {\mathfrak {K}} < {\mathfrak {I}}\), then, owing to

$$\begin{aligned} {\mathcal {B}}^\dagger = ({\mathcal {B}}^{*}*_{{N}}{\mathcal {B}})^{-1}*_{{N}}{\mathcal {B}}^{*}, \end{aligned}$$

we have \(({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}) = {\mathcal {O}}\); therefore, \({\mathcal {H}}={\mathcal {O}}\) and \({\mathcal {Y}}_{3}={\mathcal {O}}\). If \(\mathrm {rshrank}({{\mathcal {A}}}) = \mathrm {rshrank}({{\mathcal {B}}}) = {\mathfrak {I}} < {\mathfrak {K}}\), then, owing to

$$\begin{aligned} {\mathcal {A}}^\dagger = {\mathcal {A}}^{*}*_{{N}}({\mathcal {A}}*_{{N}}{\mathcal {A}}^{*})^{-1}, \end{aligned}$$

we have \(({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )={\mathcal {O}}\), so \({\mathcal {G}}={\mathcal {O}}\) and \({\mathcal {Y}}_{2}={\mathcal {O}}\). When either \({\mathcal {Y}}_{2}\) or \({\mathcal {Y}}_{3}\) is the zero tensor,

$$\begin{aligned} \Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2}^{2} \le 2\alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}. \end{aligned}$$

Hence

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}= \max _{\Vert {\mathcal {X}}\Vert _{2}=1}\Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2} \le \sqrt{2}\alpha \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$

(3) In the case \(\mathrm {rshrank}({{\mathcal {A}}})= \mathrm {rshrank}({{\mathcal {B}}}) = {\mathfrak {I}} = {\mathfrak {K}}\), the arguments in (2) give \({\mathcal {G}}={\mathcal {H}}={\mathcal {O}}\), so the conclusion follows. \(\square \)
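The decomposition \({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger ={\mathcal {F}}+{\mathcal {G}}+{\mathcal {H}}\) underlying the proof, together with the resulting bound (4.6), can be checked numerically at the level of unfoldings. The sketch below is illustrative only: the data are random and of deficient, equal rank, so that case (1) with \(k=\frac{1+\sqrt{5}}{2}\) applies.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 6, 8, 3                                   # rank-deficient case: k = (1 + sqrt(5)) / 2

X, Y = rng.standard_normal((m, r)), rng.standard_normal((r, n))
A = X @ Y
B = (X + 1e-4 * rng.standard_normal((m, r))) @ (Y + 1e-4 * rng.standard_normal((r, n)))
E = B - A

A_p = np.linalg.pinv(A, rcond=1e-12)
B_p = np.linalg.pinv(B, rcond=1e-12)

# Decomposition B^dagger - A^dagger = F + G + H used in the proof
F = -B_p @ E @ A_p
G = B_p @ (np.eye(m) - A @ A_p)
H = -(np.eye(n) - B_p @ B) @ A_p
print(np.allclose(B_p - A_p, F + G + H))            # expected: True

# Perturbation bound (4.6)
delta = np.linalg.norm(A_p, 2) * np.linalg.norm(E, 2)
k = (1 + np.sqrt(5)) / 2                            # rshrank(A) < min(I, K)
lhs = np.linalg.norm(B_p - A_p, 2) / np.linalg.norm(A_p, 2)
print(lhs <= k * delta / (1 - delta))               # expected: True
```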

Next, we introduce the condition number of the Moore–Penrose inverse of the tensor \({\mathcal {A}}\):

$$\begin{aligned} {\mathbb {K}}_{2}({\mathcal {A}}) = \Vert {\mathcal {A}}\Vert _{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$
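Since \(\Vert {\mathcal {A}}\Vert _{2}\) equals the largest singular value of the unfolding \(\mathrm {rsh}({\mathcal {A}})\) and \(\Vert {\mathcal {A}}^\dagger \Vert _{2}\) equals the reciprocal of its smallest positive singular value, \({\mathbb {K}}_{2}({\mathcal {A}})\) can be evaluated from a single SVD. A minimal sketch with an illustrative random tensor:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2, 2, 2))               # order-4 tensor
Amat = A.reshape(4, 4)                              # its unfolding rsh(A)

# K_2(A) = ||A||_2 * ||A^dagger||_2
K2 = np.linalg.norm(Amat, 2) * np.linalg.norm(np.linalg.pinv(Amat), 2)

# equivalently, the ratio of the largest to the smallest positive singular value of rsh(A)
s = np.linalg.svd(Amat, compute_uv=False)
s_pos = s[s > 1e-12 * s[0]]
print(np.isclose(K2, s_pos[0] / s_pos[-1]))         # expected: True
```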

Theorem 4.5

If the Condition (4.1) is satisfied, then

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} \le k{\mathbb {K}}_{2}({\mathcal {A}})\frac{\frac{\Vert {\mathcal {E}}\Vert _{2}}{\Vert {\mathcal {A}}\Vert _{2}}}{1-{\mathbb {K}}_{2}({\mathcal {A}})\frac{\Vert {\mathcal {E}}\Vert _{2}}{\Vert {\mathcal {A}}\Vert _{2}}}. \end{aligned}$$

Proof

The statement can be verified using Theorem 4.4 and the definition of \({\mathbb {K}}_{2}({\mathcal {A}})\). \(\square \)

Theorem 4.5 shows that a perturbation \({\mathcal {E}}\) of \({\mathcal {A}}\) has little influence on \({\mathcal {A}}^\dagger \) when the condition number \({\mathbb {K}}_{2}({\mathcal {A}})\) is small, whereas for a large condition number \({\mathbb {K}}_{2}({\mathcal {A}})\) the influence of \({\mathcal {E}}\) on \({\mathcal {A}}^\dagger \) may be considerably larger.

5 Examples

Example 5.1

This example is aimed at verifying inequality (4.2). Let the tensor \({\mathcal {A}} = 10^3*\mathrm {rand}(2,2,2,2)\) be defined by

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&=\begin{bmatrix} 950.9152&\quad 400.0797\\ 722.3485&\quad 831.8713\end{bmatrix},\ {\mathcal {A}}(:,:,2,1) =\begin{bmatrix} 134.3383&\quad 84.2471\\ 60.4668&\quad 163.8983 \end{bmatrix},\\ {\mathcal {A}}(:,:,1,2)&=\begin{bmatrix} 324.2199&\quad 11.6810\\ 301.7268&\quad 539.9051\end{bmatrix},\ {\mathcal {A}}(:,:,2,2) =\begin{bmatrix} 95.3727&\quad 631.1412\\ 146.5149&\quad 859.3204\end{bmatrix}, \end{aligned} \end{aligned}$$

and let \({\mathcal {E}} = 10^{-1}*\mathrm {rand}(2,2,2,2)\) be defined by

$$\begin{aligned} \begin{aligned} {\mathcal {E}}(:,:,1,1)&=\begin{bmatrix} 0.0974&\quad 0.0997\\ 0.0571&\quad 0.0554\end{bmatrix},\ {\mathcal {E}}(:,:,2,1)&=\begin{bmatrix} 0.0515&\quad 0.0430\\ 0.0331&\quad 0.0492 \end{bmatrix},\\ {\mathcal {E}}(:,:,1,2)&=\begin{bmatrix} 0.0071&\quad 0.0065\\ 0.0888&\quad 0.0436\end{bmatrix},\ {\mathcal {E}}(:,:,2,2)&=\begin{bmatrix} 0.0827&\quad 0.0613\\ 0.0395&\quad 0.0819\end{bmatrix}. \end{aligned} \end{aligned}$$

Then, \({\mathcal {B}} = {\mathcal {A}}+{\mathcal {E}}\) is defined by

$$\begin{aligned} \begin{aligned} {\mathcal {B}}(:,:,1,1)&=\begin{bmatrix} 951.0126&\quad 400.1794\\ 722.4056&\quad 831.9267 \end{bmatrix},\ {\mathcal {B}}(:,:,2,1)&=\begin{bmatrix} 134.3899&\quad 84.2901\\ 60.4998&\quad 163.9475 \end{bmatrix},\\ {\mathcal {B}}(:,:,1,2)&=\begin{bmatrix} 324.2270&\quad 11.6875\\ 301.8156&\quad 539.9487 \end{bmatrix},\ {\mathcal {B}}(:,:,2,2)&=\begin{bmatrix} 95.4554&\quad 631.2026\\ 146.5543&\quad 859.4023\end{bmatrix}. \end{aligned} \end{aligned}$$

It holds that \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})=4\). In addition, an application of Huang et al. (2018, Algorithm 1) gives the Moore–Penrose inverse of \({\mathcal {A}}\):

$$\begin{aligned} \begin{aligned} {\mathcal {A}}^{\dagger }(:,:,1,1)&=\begin{bmatrix} -0.000384784660649&\quad -0.001099286197235\\ 0.013955577455799&\quad -0.001598581983579\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,1)&=\begin{bmatrix} 0.002737547517780&\quad 0.000508206301570\\ -0.021392949244820&\quad 0.001110875409922 \end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001227173049100&\quad -0.003076391592687\\ -0.002071095392062&\quad 0.001139922248389\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} -0.001325364668035&\quad 0.002294859505387\\ 0.003619787791101&\quad 0.000314492021375\end{bmatrix} \end{aligned} \end{aligned}$$

and the following Moore–Penrose inverse of \({\mathcal {B}}\):

$$\begin{aligned}\begin{aligned} {\mathcal {B}}^{\dagger }(:,:,1,1)&=\begin{bmatrix} -0.000385255308227&\quad -0.001098543826394\\ 0.013953167583606&\quad -0.001598698848909\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,1)&=\begin{bmatrix} 0.002738190383336&\quad 0.000507663916870\\ -0.021390844989888&\quad 0.001111108748350 \end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001227624787307&\quad -0.003075755111089\\ -0.002076686203945&\quad 0.001140238650466\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} -0.001325803821792&\quad 0.002294485476614\\ 0.003623245688721&\quad 0.000314224257971\end{bmatrix}. \end{aligned} \end{aligned}$$

In view of (2.5), it is easy to check that the tensor norms are equal to \(\Vert {\mathcal {A}}^\dagger \Vert _{2} =0.026095036211067\), \(\Vert {\mathcal {E}}\Vert _{2}=0.235145716909881\) and \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}=0.006136135997641 < 1\). Therefore, Condition (4.1) is satisfied. Then, \(\Vert {\mathcal {B}}^\dagger \Vert _{2} =0.026093083833995\) and \(\frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup }= 0.026256147502919.\) Hence, the inequality (4.2) in Lemma 4.1 is verified.
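The quantities reported in this example can be reproduced with a short NumPy sketch. This is only an illustration: the entries below are the truncated values displayed above, the unfolding groups the first two and the last two modes, and the printed norms therefore agree with the reported 15-digit values only up to the precision of the displayed data.

```python
import numpy as np

A = np.zeros((2, 2, 2, 2))
A[:, :, 0, 0] = [[950.9152, 400.0797], [722.3485, 831.8713]]
A[:, :, 1, 0] = [[134.3383,  84.2471], [ 60.4668, 163.8983]]
A[:, :, 0, 1] = [[324.2199,  11.6810], [301.7268, 539.9051]]
A[:, :, 1, 1] = [[ 95.3727, 631.1412], [146.5149, 859.3204]]

E = np.zeros((2, 2, 2, 2))
E[:, :, 0, 0] = [[0.0974, 0.0997], [0.0571, 0.0554]]
E[:, :, 1, 0] = [[0.0515, 0.0430], [0.0331, 0.0492]]
E[:, :, 0, 1] = [[0.0071, 0.0065], [0.0888, 0.0436]]
E[:, :, 1, 1] = [[0.0827, 0.0613], [0.0395, 0.0819]]

B = A + E
rsh = lambda T: T.reshape(4, 4)                     # unfolding: rows (i1, i2), columns (k1, k2)

norm_A_pinv = np.linalg.norm(np.linalg.pinv(rsh(A)), 2)
norm_E      = np.linalg.norm(rsh(E), 2)
norm_B_pinv = np.linalg.norm(np.linalg.pinv(rsh(B)), 2)
delta = norm_A_pinv * norm_E

print(norm_A_pinv, norm_E, delta)                   # approx. 0.0261, 0.2351, 0.0061
print(norm_B_pinv <= norm_A_pinv / (1 - delta))     # inequality (4.2); expected: True
```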

Example 5.2

The tensors in Example 5.1 are invertible. Example 5.2 is aimed at verifying inequality (4.2) in the singular tensor case. To this end, let \({\mathcal {A}}, {\mathcal {E}}\in {\mathbb {R}}^{(2\times 2)\times (2\times 2)}\) with

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&=10^2\cdot \begin{bmatrix} 0.985940927109977&\quad 1.682512984915278\\ 1.420272484319284&\quad 1.962489222569553\end{bmatrix},\ {\mathcal {A}}(:,:,2,1) =\begin{bmatrix} 0&\quad 0\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {A}}(:,:,1,2)&=10^2\cdot \begin{bmatrix} 8.929224052859770&\quad 5.557379427193866\\ 7.032232245562910&\quad 1.844336677576532\end{bmatrix},\ {\mathcal {A}}(:,:,2,2) =\begin{bmatrix} 0&\quad 0\\ 0&\quad 0\end{bmatrix}, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} {\mathcal {E}}(:,:,1,1)&=\begin{bmatrix} 0.055778896675488&\quad 0.016620356290215\\ 0.031342898993659&\quad 0.062249725927990\end{bmatrix},\ {\mathcal {E}}(:,:,2,1)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {E}}(:,:,1,2)&=\begin{bmatrix} 0.007399476957694&\quad 0.040238833269616\\ 0.068409606696201&\quad 0.098283520139395\end{bmatrix},\ {\mathcal {E}}(:,:,2,2)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0\end{bmatrix}. \end{aligned} \end{aligned}$$

Then, the tensor \({\mathcal {B}} = {\mathcal {A}}+{\mathcal {E}}\) is defined by

$$\begin{aligned} \begin{aligned} {\mathcal {B}}(:,:,1,1)&=10^2\cdot \begin{bmatrix} 0.986498716076732&\quad 1.682679188478180\\ 1.420585913309220&\quad 1.963111719828833 \end{bmatrix},\ {\mathcal {B}}(:,:,2,1)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {B}}(:,:,1,2)&=10^2\cdot \begin{bmatrix} 8.929298047629347&\quad 5.557781815526563\\ 7.032916341629872&\quad 1.845319512777926 \end{bmatrix},\ {\mathcal {B}}(:,:,2,2)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0\end{bmatrix}. \end{aligned} \end{aligned}$$

It holds that \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})=2\). In addition, an application of Huang et al. (2018, Algorithm 1) gives the following Moore–Penrose inverse of \({\mathcal {A}}\):

$$\begin{aligned}\begin{aligned} {\mathcal {A}}^{\dagger }(:,:,1,1)&=\begin{bmatrix} -0.002139621590719&\quad 0.000961949275511\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,1)&=10^{-3}\cdot \begin{bmatrix} 0.154116174683625&\quad 0.400242571625946\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001721913469812&\quad 0.000005405961529\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} 0.004582706318709&\quad -0.000777570778470\\ 0&\quad 0 \end{bmatrix}. \end{aligned} \end{aligned}$$

Similarly

$$\begin{aligned}\begin{aligned} {\mathcal {B}}^{\dagger }(:,:,1,1)&=\begin{bmatrix}-0.002139080520305&\quad 0.000961905529252\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,1)&=10^{-3}\cdot \begin{bmatrix} 0.153469746601847&\quad 0.400351263240931\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001720928456951&\quad 0.000005485433595\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} 0.004582730894265&\quad -0.000777786686340\\ 0&\quad 0\end{bmatrix}. \end{aligned} \end{aligned}$$

Following (2.5), it is easy to check \(\Vert {\mathcal {A}}^\dagger \Vert _{2} =0.005446932213520\), \(\Vert {\mathcal {E}}\Vert _{2}=0.149158220173799\) and \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}=8.124547143759799\cdot 10^{-4} < 1\). Therefore, Condition (4.1) is satisfied. Then, \(\Vert {\mathcal {B}}^\dagger \Vert _{2} = 0.005446449437497\) and \(\frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup }= 0.005451361197625,\) which confirms the inequality (4.2) in Lemma 4.1.

Example 5.3

This example is a continuation of Example 5.1, intended to verify inequality (4.3) proved in Theorem 4.2. For the tensors \({\mathcal {A}}\) and \({\mathcal {E}}\) defined in Example 5.1, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.834195478378557\cdot 10^{-4} \end{aligned}$$

and

$$\begin{aligned} \left( 1+\frac{1}{1-\bigtriangleup }+\frac{1}{(1-\bigtriangleup )^{2}}\right) \bigtriangleup = 0.018295682536782. \end{aligned}$$

Hence, inequality (4.3) of Theorem 4.2 is valid.

Example 5.4

This example is a continuation of Example 5.2, intended to verify inequality (4.3) proved in Theorem 4.2. For the tensors \({\mathcal {A}}\) and \({\mathcal {E}}\) defined in Example 5.2, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.394253885112045\cdot 10^{-4} \end{aligned}$$

and

$$\begin{aligned} \left( 1+\frac{1}{1-\bigtriangleup }+\frac{1}{(1-\bigtriangleup )^{2}}\right) \bigtriangleup = 0.002435384431426. \end{aligned}$$

Hence, inequality (4.3) of Theorem 4.2 is confirmed.

Example 5.5

We shall again use the setting of Example 5.2 to verify the validity of equality (4.4).

After appropriate calculations, one can verify

$$\begin{aligned} \begin{aligned}&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,1,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} -0.489468003874380&\quad 0.346424530042466\\ 0.006594102139740&\quad 0.970661633334091\end{bmatrix},\\&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,2,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} 0.441733836613403&\quad -0.163043151723344\\ 0.084143275517548&\quad -0.629731697791430\end{bmatrix},\\&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,1,2) =10^{-3}\\&\quad \cdot \begin{bmatrix} 0.035216688972592&\quad -0.046360736953049\\ -0.013384105875716&\quad -0.105126013940471\end{bmatrix},\\&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,2,2) =10^{-4}\\&\quad \begin{bmatrix} -0.375707154131807&\quad 0.341422001841479\\ 0.050538644491456&\quad 0.869372625158238 \end{bmatrix}. \end{aligned} \end{aligned}$$

In addition

$$\begin{aligned} \begin{aligned}&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,1,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} 0.489668212727348&\quad -0.346613307126986\\ -0.006685674852000&\quad -0.970692034327758\end{bmatrix},\\&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,2,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} 0.441825409325317&\quad 0.163138141635252\\ -0.084077978807235&\quad 0.629669045010689\end{bmatrix},\\&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,1,2) =10^{-3}\\&\quad \cdot \begin{bmatrix} -0.035235566680988&\quad 0.046380141446686\\ 0.013393604866949&\quad 0.105126135280298\end{bmatrix},\\&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,2,2) =10^{-4}\\&\quad \begin{bmatrix} 0.375676753138765&\quad -0.341420788443209\\ -0.050601297272405&\quad -0.868951121121841 \end{bmatrix}. \end{aligned} \end{aligned}$$

Hence

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _2&= \Vert \left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) \Vert _2\\&= 2.0813844590544\cdot 10^{-4}. \end{aligned} \end{aligned}$$

Example 5.6

This example is a continuation of Example 5.2, with the aim of verifying the validity of inequality (4.5). It is possible to compute

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _2=\Vert {\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _2=1.132716947645736\cdot 10^{-6}. \end{aligned}$$

In addition, \(\Vert {\mathcal {E}}\Vert _{2}\Vert {\mathcal {A}}^\dagger \Vert _{2} \Vert {\mathcal {B}}^\dagger \Vert _{2}=4.424993522104842\cdot 10^{-6}.\) Therefore, the inequality (4.5) is verified.

Example 5.7

In this example, we verify cases (1)–(3) of Theorem 4.4.

Case (1) The tensors \({\mathcal {A}}, {\mathcal {E}}\) and \({\mathcal {B}}\) are reused from Example 5.2. It can be verified that \(\mathrm {rshrank}({{\mathcal {A}}})=2 < 4=\min ({\mathfrak {I}}, {\mathfrak {K}})\). From Example 5.4, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.394253885112045\cdot 10^{-4} \end{aligned}$$

and since \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}=8.124547143759799\cdot 10^{-4}\) (see Example 5.2), it follows

$$\begin{aligned} \frac{1+\sqrt{5}}{2}\cdot \frac{\bigtriangleup }{1-\bigtriangleup } = 0.001315648246801. \end{aligned}$$

Hence, inequality (4.6) of Theorem 4.4 is valid.

Case (2) We consider the tensors \({\mathcal {A}}, {\mathcal {E}}\) and \({\mathcal {B}}\) from Example 5.1. It is clear that \(\mathrm {rshrank}({{\mathcal {A}}})= 4=\min ({\mathfrak {I}}, {\mathfrak {K}})\). From Example 5.3, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.834195478378557\cdot 10^{-4}, \end{aligned}$$

and since \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}= 0.006136135997641\) (see Example 5.1), we have

$$\begin{aligned} \sqrt{2}\cdot \frac{\bigtriangleup }{1-\bigtriangleup } = 0.008731383706299. \end{aligned}$$

Hence, inequality (4.6) of Theorem 4.4 is valid.

Case (3) We consider the tensors \({\mathcal {A}}, {\mathcal {E}}\) and \({\mathcal {B}}\) from Example 5.1. Then, \(\mathrm {rshrank}({{\mathcal {A}}})= 4={\mathfrak {I}}={\mathfrak {K}}\). As in the previous case [Case (2)], we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.834195478378557\cdot 10^{-4}, \end{aligned}$$

and

$$\begin{aligned} 1\cdot \frac{\bigtriangleup }{1-\bigtriangleup } = 0.006174020627866. \end{aligned}$$

Hence, inequality (4.6) of Theorem 4.4 is valid.
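The three cases of Theorem 4.4 can be wrapped in a small helper that selects the constant \(k\) from the rank of the unfolding and evaluates the right-hand side of (4.6). The function names below are hypothetical and the routine is only a sketch; it assumes, as in Theorem 4.4, that \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})\) and \(\triangle <1\).

```python
import numpy as np

def wedin_k(rank, m, n):
    """Constant k of Theorem 4.4, selected from the rank of the unfolding rsh(A) of size m x n."""
    if rank == m == n:
        return 1.0
    if rank == min(m, n):
        return np.sqrt(2.0)
    return (1.0 + np.sqrt(5.0)) / 2.0

def bound_46(A_unf, E_unf):
    """Right-hand side of (4.6) for the unfoldings of A and E = B - A."""
    A_p = np.linalg.pinv(A_unf)
    delta = np.linalg.norm(A_p, 2) * np.linalg.norm(E_unf, 2)
    k = wedin_k(np.linalg.matrix_rank(A_unf), *A_unf.shape)
    return k * delta / (1.0 - delta)

# illustrative check with random full-rank square data (case (3): k = 1)
rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
E = 1e-3 * rng.standard_normal((4, 4))
B = A + E
lhs = np.linalg.norm(np.linalg.pinv(B) - np.linalg.pinv(A), 2) / np.linalg.norm(np.linalg.pinv(A), 2)
print(lhs <= bound_46(A, E))                        # expected: True
```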

6 Concluding remarks

The aim of this paper is to generalize some results of the perturbation theory for the matrix pseudoinverse to tensors. For this purpose, we derive several useful representations and introduce the required notions. The spectral norm of even-order tensors is given a computationally effective definition and investigated. In addition, useful representations of \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) and \({{\mathcal {I}}}-{{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) are derived. As a result, we establish perturbation bounds for the Moore–Penrose inverse of a tensor via the Einstein product. Unlike previously exploited approaches, which were developed either in the tensor or in the matrix setting alone, our approach relies on an exact transition between the two spaces. In this way, it is possible to extend many known results from the matrix case to the multiarray case. The results derived in the current research extend the classical matrix results of Stewart (1977) and Wedin (1973). It is shown that the influence of a perturbation of the tensor depends on a precisely defined condition number. Illustrative numerical examples confirm the derived theoretical results.

Recently, Ji and Wei (2017, 2018) investigated the weighted Moore–Penrose inverse and the Drazin inverse of even-order tensors with the Einstein product. It is natural to investigate possible extensions of the derived results to perturbation bounds for the weighted Moore–Penrose inverse and the Drazin inverse of tensors via the Einstein product.