1 Introduction

For a positive integer N, let \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_N\) be positive integers. An order N tensor \({\mathcal {A}}=( {\mathcal {A}}_{ {i_{1}i_{2}\ldots i_{{N}}} })_{1\le i_{j}\le {\mathbf {I}}_{j}}\), \(j = 1,\ldots , {N}\), is a multidimensional array with \({\mathfrak {I}}={\mathbf {I}}_{1}{\mathbf {I}}_{2}\cdots {\mathbf {I}}_{N}\) entries. Let \({\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{N}}\) (resp. \({\mathbb {R}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{N}}\)) denote the set of order N tensors of dimension \({\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{N}\) over the complex numbers \({\mathbb {C}}\) (resp. the real numbers \({\mathbb {R}}\)).

The conjugate transpose of a tensor \({\mathcal {A}} = ({\mathcal {A}}_{ i_{1}\ldots i_{M}j_{1} \ldots j_{N}}) \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{M} \times {\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{N}}\) is denoted by \({\mathcal {A}}^{*} \in {\mathbb {C}}^{{\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{N} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{M}}\) and defined elementwise by \(({\mathcal {A}}^{*})_{ j_{1} \ldots j_{N} i_{1} \ldots i_M}=(\overline{{\mathcal {A}}})_{ i_{1} \ldots i_{M} j_{1} \ldots j_N}\), where the overline denotes complex conjugation. When the tensors are defined over \({\mathbb {R}}\), the tensor \({\mathcal {A}}^{\mathrm T} \in {\mathbb {R}}^{{\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{N} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{M}}\) satisfying \(({\mathcal {A}}^{\mathrm T})_{ j_{1} \ldots j_{N} i_{1} \ldots i_M}=({\mathcal {A}})_{ i_{1}\ldots i_{M} j_{1} \ldots j_N}\) is called the transpose of \({\mathcal {A}}\).

The Einstein product of tensors is defined in Einstein (2007) by the operation \(*_{N}\) via

$$\begin{aligned} ({\mathcal {A}}*_{N}{\mathcal {B}})_{i_{1}\ldots i_{{N}}j_{1}\ldots j_{{M}}} = \sum _{k_{1},\ldots ,k_{{N}}}{\mathcal {A}}_{i_{1} \ldots i_{{N}}k_{1} \ldots k_{{N}}}{\mathcal {B}}_{k_{1} \ldots k_{{N}}j_{1} \ldots j_{{M}}}, \end{aligned}$$
(1.1)

where \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\), \({\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {K}}_{1}\times \cdots \times {\mathbf {K}}_{{N}}\times {\mathbf {J}}_{1}\times \cdots \times {\mathbf {J}}_{{M}}}\) and \({\mathcal {A}}*_{N}{\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {J}}_{1} \times \cdots \times {\mathbf {J}}_{{M}}}\). The associative law of this tensor product holds. In the above formula, when \({\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\), then

$$\begin{aligned} ({\mathcal {A}}*_{N}{\mathcal {B}})_{i_{1}i_{2}\ldots i_{{N}}} = \sum _{k_{1},\ldots ,k_{{N}}}{\mathcal {A}}_{i_{1} \ldots i_{{N}}k_{1} \ldots k_{{N}}}{\mathcal {B}}_{k_{1} \ldots k_{{N}}}, \end{aligned}$$

where \({\mathcal {A}}*_{N}{\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\). When \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) and \({\mathcal {B}}\) is a vector \({\mathbf {b}} = (b_{i})\in {\mathbb {C}}^{{\mathbf {I}}_{{N}}}\), the product is defined by the operation \(\times _{{N}}\) via

$$\begin{aligned} ({\mathcal {A}}\times _{{N}}{\mathbf {b}})_{i_{1}i_{2}\ldots i_{{N}-1}} = \sum _{i_{{N}}}{\mathcal {A}}_{i_{1} \ldots i_{{N}}} b_{i_{{N}}}, \end{aligned}$$

where \({\mathcal {A}}\times _{{N}}{\mathbf {b}} \in {{\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}-1}}}\).
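The Einstein product can be realized directly through matricization. The following MATLAB sketch is our illustration (not part of the cited definitions); it relies on the reshaping identity stated later in Lemma 2.2 and assumes the tensors are stored as multidimensional arrays:

```matlab
function C = einstein_prod(A, B, N)
% Sketch of the Einstein product A *_N B via matricization, assuming
% size(A) = [I_1,...,I_N, K_1,...,K_N] and size(B) = [K_1,...,K_N, J_1,...,J_M].
sa = size(A);  sb = size(B);
I  = sa(1:N);        K = sa(N+1:end);
J  = sb(N+1:end);    % remaining modes of B (empty when B is of order N)
C  = reshape(reshape(A, prod(I), prod(K)) * reshape(B, prod(K), prod(J)), [I, J]);
end
```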

Definition 1.1

(Sun et al. 2016) Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\). The tensor \({\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) which satisfies

$$\begin{aligned}&(1^T)\quad {\mathcal {A}}*_{{N}}{\mathcal {X}}*_{{N}}{\mathcal {A}} = {\mathcal {A}};\qquad (2^T)\quad {\mathcal {X}}*_{{N}}{\mathcal {A}}*_{{N}}{\mathcal {X}} = {\mathcal {X}};\\&(3^T)\quad ({\mathcal {A}}*_{{N}}{\mathcal {X}})^{*} = {\mathcal {A}}*_{{N}}{\mathcal {X}};\qquad (4^T)\quad ({\mathcal {X}}*_{{N}}{\mathcal {A}})^{*} = {\mathcal {X}}*_{{N}}{\mathcal {A}} \end{aligned}$$

is called the Moore–Penrose inverse of \({\mathcal {A}}\), abbreviated as the M-P inverse and denoted by \({\mathcal {A}}^\dagger \). If equation \((i^T)\) among \((1^T)\)–\((4^T)\) holds, then \({\mathcal {X}}\) is called an \(\{i\}\)-inverse of \({\mathcal {A}}\), denoted by \({\mathcal {A}}^{(i)}\).

For a tensor \({\mathcal {A}} \in {\mathbb {C}}^{{I}_{1} \times \cdots \times {I}_{{N}} \times {I}_{1} \times \cdots \times {I}_{{N}}}\), if there exists a tensor \({\mathcal {X}}\), such that \({\mathcal {A}}*_{{N}}{\mathcal {X}} = {\mathcal {X}}*_{{N}}{\mathcal {A}} = {\mathcal {I}}\), then \({\mathcal {X}}\) is called the inverse of \({\mathcal {A}}\), denoted by \({\mathcal {A}}^{-1}\). Clearly, if \({\mathcal {A}}\) is invertible, then \({\mathcal {A}}^\dagger = {\mathcal {A}}^{-1}\).

The Moore–Penrose inverse of matrices and linear operators plays an important role in theoretical studies and numerical analysis in many areas, such as singular matrix problems, ill-posed problems, optimization problems, total least-squares problems (Xie et al. 2019; Zheng et al. 2017), and statistical problems (Ben-Israel and Greville 2003; Cvetkovic-Illic and Wei 2017; Wang et al. 2018; Wei 2014). As a continuation of these results, operations with tensors have become increasingly prevalent in recent years (Che and Wei 2019; Che et al. 2019; Ding and Wei 2016; Harrison and Joseph 2016; Medellin et al. 2016; Qi and Luo 2017; Wei and Ding 2016). Brazell et al. (2013) introduced the notion of the ordinary tensor inverse. Sun et al. (2014, 2016) proved the existence and uniqueness of the Moore–Penrose inverse and \(\{i,j,k\}\)-inverses of even-order tensors with the Einstein product; the Moore–Penrose inverse and \(\{i\}\)-inverses \((i = 1, 2, 3, 4)\) of even-order tensors with the Einstein product were also studied in Panigrahy and Mishra (2018) and Sun et al. (2014, 2016). In addition, the general solutions of some multilinear systems were given in terms of the defined generalized inverses. A few further characterizations of different generalized inverses of tensors, in conjunction with a new method to compute the Moore–Penrose inverse of tensors, were considered in Behera and Mishra (2017). The weighted Moore–Penrose inverse in tensor spaces was introduced in Ji and Wei (2017). In addition, a characterization of the least-squares solutions to a multilinear system, as well as the relationship between the weighted minimum-norm least-squares solution of a multilinear system and the weighted Moore–Penrose inverse of its coefficient tensor, were considered in Ji and Wei (2017). Sun et al. (2018) defined \(\{i\}\)-inverses for \(i = 1, 2, 5\) and the group inverse of tensors, assuming a general tensor product. Panigrahy et al. (2018) proved some additional properties of the Moore–Penrose inverse of tensors via the Einstein product and also derived a few necessary and sufficient conditions for the reverse-order law for the Moore–Penrose inverse of tensors. Several new sufficient conditions which ensure the reverse-order law of the weighted Moore–Penrose inverse for even-order square tensors were presented in Panigrahy and Mishra (2019). Recently, Ji and Wei (2018) investigated the Drazin inverse of even-order tensors with the Einstein product. Liang and Zheng (2018) defined an iterative algorithm for solving the Sylvester tensor equation based on the Einstein product.

Using another definition of the tensor product, some basic properties of the order-2 left (right) inverse and of the product of tensors were given in Bu et al. (2014). The generalized inverse of tensors was established in Jin et al. (2017) using tensor equations and the t-product of tensors. The definition of a generalized tensor function via the tensor singular value decomposition based on the t-product was introduced in Miao et al. (2019). In addition, the least-squares solutions of tensor equations, as well as an algorithm for generating the Moore–Penrose inverse of a tensor, were proposed in Jin et al. (2017) and Shi et al. (2013).

On the other hand, the additive and multiplicative perturbation models have been investigated frequently during the past decades. For more details, the reader is referred to the references (Cai et al. 2011; Liu et al. 2008; Meng and Zheng 2010; Stewart 1977; Wedin 1973; Wei and Ling 2010; Wei 1999). The classical results derived by Stewart (1977) and Wedin (1973) have been improved in Li et al. (2013), Xu et al. (2010b). The acute perturbation of the group inverse was investigated in Wei (2017). Some results related to the perturbation of oblique projectors, which include the weighted pseudoinverse, were presented in Xu et al. (2008, 2010a). Some optimal perturbation bounds of the weighted Moore–Penrose inverse under the weighted unitarily invariant norms, the weighted Q-norms, and the weighted F-norms were obtained in Xu et al. (2011). A sharp estimation of the perturbation bounds of the weighted Moore–Penrose inverse was considered in Ma (2018). Meyer (1980) presented a perturbation formula with application to Markov chains, and the formula was extended to the Drazin inverse \(A^{\mathrm D}\). Two finite-time convergent Zhang neural network models for the time-varying complex matrix Drazin inverse were presented in Qiao et al. (2018). An explicit formula for perturbations of an outer inverse under certain conditions was given in Zhang and Wei (2008). The perturbation analysis for the nearest \(\{1\}\), \(\{1, i\}\), and \(\{1, 2, i\}\)-inverses with respect to the multiplicative perturbation model was considered in Meng et al. (2017).

In addition, the perturbation theory for tensor eigenvalue and singular value problems has been investigated recently. The perturbation bounds for the eigenvalue and singular value problems of even-order tensors were considered in Che et al. (2016). The explicit estimation of the backward error for the largest eigenvalue of an irreducible nonnegative tensor was given in Li and Ng (2014).

Our intention in the present paper is to extend the results concerning the perturbation of the Moore–Penrose inverse from the complex matrix space to the tensor space. In particular, we extend the classical results derived in Stewart (1977) and Wedin (1973) for the matrix case to even-order tensors.

The null spaces and the ranges of tensors were introduced in Ji and Wei (2017).

Definition 1.2

(Ji and Wei 2017) For \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\), the range \({\mathcal {R}}({\mathcal {A}})\) and the null space \({\mathcal {N}}({\mathcal {A}})\) of \({\mathcal {A}}\) are defined by

$$\begin{aligned}&{\mathcal {R}}({\mathcal {A}}) = \{{\mathcal {Y}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}} : \ {\mathcal {Y}} = {\mathcal {A}}*_N{\mathcal {X}}, \ {\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\}\\&{\mathcal {N}}({\mathcal {A}}) = \{{\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}} : {\mathcal {A}}*_N {\mathcal {X}} = {\mathcal {O}}\}, \end{aligned}$$

where \({\mathcal {O}}\) is an appropriate zero tensor.

Definition 1.3

(Orthogonal Projection) The orthogonal projection onto a subspace \({\mathcal {R}}({\mathcal {A}})\) is denoted by \({P}_{{\mathcal {A}}}\) and defined as

$$\begin{aligned} {P}_{{\mathcal {A}}} = {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger . \end{aligned}$$

Clearly, \({P}_{{\mathcal {A}}}\) is Hermitian and idempotent, and \({\mathcal {R}}({P}_{{\mathcal {A}}}) = {\mathcal {R}}({\mathcal {A}})\). Similarly,

$$\begin{aligned} {R}_{{\mathcal {A}}} = {\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}} \end{aligned}$$

is the projection onto \({\mathcal {R}}({\mathcal {A}}^*)\).

Definition 1.4

(Complement of projection) The projection onto \({\mathcal {R}}({\mathcal {A}})^{\bot }\) will be denoted by

$$\begin{aligned} {P}^{\bot }_{{\mathcal {A}}} \equiv {\mathcal {I}} - {P}_{{\mathcal {A}}}. \end{aligned}$$

Likewise

$$\begin{aligned} {R}^{\bot }_{{\mathcal {A}}} \equiv {\mathcal {I}} - {R}_{{\mathcal {A}}} \end{aligned}$$

will denote the projection onto \({\mathcal {R}}({\mathcal {A}}^*)^{\bot }\).

Main contributions of the manuscript can be summarized as follows.

  (1) The spectral norm of a tensor is defined and investigated.

  (2) Useful representations of \({{\mathcal {A}}}*_N{\mathcal A}^\dagger \) and \({{\mathcal {I}}}-{{\mathcal {A}}}*_N{\mathcal A}^\dagger \) are derived.

  (3) The perturbation theory for the Moore–Penrose inverse of even-order tensors via the Einstein product is developed using the derived representations of tensor expressions involving the Moore–Penrose inverse. The derived results therefore represent the first contribution to the perturbation theory of the Moore–Penrose inverse of tensors.

The rest of this paper is organized as follows. The spectral tensor norm is defined and investigated in Sect. 2. Useful representations of \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) and \({{\mathcal {I}}}-{{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) are derived in Sect. 3. Section 4 generalizes some results from matrix theory to the perturbation theory for the Moore–Penrose inverse of even-order tensors via the Einstein product. Numerical examples are presented in Sect. 5.

2 Spectral norm of tensors

To simplify presentation, we use the additional notation

$$\begin{aligned} {\mathbf {I}}(N)={\mathbf {I}}_1\times \cdots \times {\mathbf {I}}_N,\ \ {\mathbb {I}}=\{I_1,\ldots ,I_N\}, \end{aligned}$$

where \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_N\) are positive integers. Then, the tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{M}} \times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) is denoted shortly by \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}(M) \times {\mathbf {K}}(N)}\). The identity tensor \({\mathcal {I}}\) of order \({\mathbf {I}}(N)\times {\mathbf {I}}(N)\) is defined as in Brazell et al. (2013) by

$$\begin{aligned} {\mathcal {I}}_{i_1\ldots i_N\, j_1\ldots j_N}=\prod \limits _{k=1}^N \delta _{i_kj_k}, \end{aligned}$$

where

$$\begin{aligned} \delta _{ij}=\left\{ \begin{array}{ll} 1, &{}\quad i=j,\\ 0, &{}\quad i\ne j \end{array}\right. \end{aligned}$$

denotes the Kronecker delta operator.

The Frobenius inner product of two tensors \({\mathcal {A}}, {\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {I}}({N})}\) is defined as

$$\begin{aligned} ({\mathcal {A}}, {\mathcal {B}}) =\sum \limits _{i_1=1}^{{\mathbf {I}}_{1}}\sum \limits _{i_2=1}^{{\mathbf {I}}_{2}}\cdots \sum \limits _{i_N=1}^{{\mathbf {I}}_{N}} \overline{{\mathcal {A}}_{ i_1 i_2 \ldots i_N}}\, {\mathcal {B}}_{ i_1 i_2\ldots i_N}. \end{aligned}$$

If \(({\mathcal {A}}, {\mathcal {B}}) = 0\), then \({\mathcal {A}}\) is orthogonal to \({\mathcal {B}}\). The Frobenius norm of \({\mathcal {A}}\) is defined by \(\Vert {\mathcal {A}}\Vert _F=\sqrt{({\mathcal {A}}, {\mathcal {A}})}.\)
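In MATLAB-like notation, these quantities admit the following one-line realizations (an illustrative sketch, not taken from the cited sources):

```matlab
% Frobenius inner product and norm of (complex) tensors stored as arrays A and B.
ip   = sum(conj(A(:)) .* B(:));   % (A, B)
nrmF = norm(A(:));                % ||A||_F = sqrt((A, A))
```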

A complex (real) tensor of order m and dimension n is defined by \({\mathcal {A}}=({\mathcal {A}}_{ i_1 \ldots i_m})\), \({\mathcal {A}}_{ i_1\ldots i_m}\in {\mathbb {C}}\ ({\mathbb {R}})\), where \(i_j=1,\ldots ,n\) for each \(j= 1, \ldots ,m\). If \({\mathbf {x}}=(x_1,\ldots ,x_n)^{\mathrm T}\) is an n-dimensional vector, then \({\mathbf {x}}^m={\mathbf {x}} \otimes {\mathbf {x}} \otimes \cdots \otimes {\mathbf {x}}\) is the mth order n-dimensional rank-one tensor with entries \({ ({\mathbf {x}}^m)_{i_{1} \ldots i_{m}}= x_{i_1}\cdots x_{i_m}}\), where “\(\otimes \)” denotes the outer product of vectors. Then

$$\begin{aligned} {\mathcal {A}}{\mathbf {x}}^m=\sum \limits _{i_1,i_2,\ldots ,i_m=1}^n {\mathcal {A}}_{ i_1 i_2\ldots i_m} x_{ i_1} x_{i_2}\cdots x_{i_m} \end{aligned}$$

is the tensor product of \({\mathcal {A}}\) and \({\mathbf {x}}^m\). A tensor–vector multiplication of a tensor \({\mathcal {A}}=(a_{i_1\ldots i_m})\in {\mathbb {C}}^{n\times \cdots \times n}\) of order m and dimension n and an n-dimensional vector \({\mathbf {x}}=(x_1, x_2,\ldots ,x_n)^{\mathrm T}\) is the n-dimensional vector \({\mathcal {A}}{\mathbf {x}}^{m-1}\), whose ith component is equal to

$$\begin{aligned} ({\mathcal {A}}{\mathbf {x}}^{m-1})_i=\sum \limits _{i_2,\ldots ,i_m=1}^n a_{ i i_2\ldots i_m}\, x_{ i_2} \cdots x_{i_m}. \end{aligned}$$

The eigenvalue of a tensor was introduced in Qi (2005). A complex number \(\lambda \) is called an eigenvalue of \({\mathcal {A}}\) and \({\mathbf {x}}\) an eigenvector of \({\mathcal {A}}\) associated with \(\lambda \) if the equation

$$\begin{aligned} {\mathcal {A}}{\mathbf {x}}^{m-1}=\lambda {\mathbf {x}}^{[m-1]},\ \ {\mathbf {x}}^{[m-1]}=\left( x_1^{m-1}, x_2^{m-1},\ldots ,x_n^{m-1}\right)^{\mathrm T}, \end{aligned}$$

is satisfied.

Recently, Liang et al. (2019) proposed a new definition of the eigenvalue of an even-order square tensor. Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) be given. If a nonzero tensor \({\mathcal {X}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) and a complex number \(\lambda \) satisfy

$$\begin{aligned} {\mathcal {A}}*_{N}{\mathcal {X}}=\lambda {\mathcal {X}}, \end{aligned}$$
(2.1)

then \(\lambda \) is an eigenvalue of the tensor \( {\mathcal {A}}\) and \( {\mathcal {X}}\) the eigentensor with respect to \(\lambda \).

In addition, we recall the following definition of the tensor spectral norm from Li (2016).

Definition 2.1

(Li 2016) For a given tensor \({\mathcal {T}}\in {\mathbb {R}}^{{\mathbf {I}}_{1}\times {\mathbf {I}}_2\times \cdots \times {\mathbf {I}}_N}\), the spectral norm of \({\mathcal {T}}\), denoted by \(\Vert {\mathcal {T}}\Vert _{\sigma }\), is defined as

$$\begin{aligned} \Vert {\mathcal {T}}\Vert _{\sigma }:= \max \left\{ \langle {\mathcal {T}} , {\mathbf {x}}_1\otimes {\mathbf {x}}_2\otimes \cdots \otimes {\mathbf {x}}_N\rangle : {\mathbf {x}}_k\in {\mathbb {R}}^{{\mathbf {I}}_{k}},\ \Vert {\mathbf {x}}_k\Vert _F = 1,\ k = 1,\ldots ,N\right\} , \end{aligned}$$

where \(\Vert {\mathbf {x}}\Vert _F\) denotes the Frobenius norm of the vector \({\mathbf {x}}\) and \({\mathbf {x}}_1\otimes \cdots \otimes {\mathbf {x}}_N\) means the outer product of vectors: \(({\mathbf {x}}_1\otimes \cdots \otimes {{\mathbf {x}}_N)}_{i_1,\ldots ,i_N}=({\mathbf {x}}_1)_{i_1}\cdots ({\mathbf {x}}_N)_{i_N}\).

Essentially, \(\Vert {\mathcal {T}}\Vert _{\sigma }\) is the maximal value of the Frobenius inner product between \({\mathcal {T}}\) and the rank-one tensor \({\mathbf {x}}_1\otimes \cdots \otimes {\mathbf {x}}_N\) whose Frobenius norm is one.

Let the eigenvalues of a complex even-order square tensor be defined as in (2.1). By \(\lambda _{\min }({\mathcal {K}})\) and \(\lambda _{\max }({\mathcal {K}})\), we denote the smallest and the largest eigenvalue of a tensor \({\mathcal {K}}\), respectively. Similarly, \({\mu _{1}({\mathcal {K}})}\) stands for the largest singular value of a tensor \({\mathcal {K}}\).

Lemma 2.1

Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\). Then, the spectral norm of \({\mathcal {A}}\) can be defined as

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _{2} = \sqrt{\lambda _{\max }\left( {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\right) } = \mu _{1}({\mathcal {A}}), \end{aligned}$$
(2.2)

where \(\lambda _{\max }\left( {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\right) \) denotes the largest eigenvalue of \({\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\) and \(\mu _{1}({\mathcal {A}})\) is the largest singular value of \({\mathcal {A}}\).

Proof

It is necessary to verify that the definition (2.2) satisfies properties of a norm function.

(1) Clearly, \(\Vert {\mathcal {A}}\Vert _{2} \ge 0 \), and \(\Vert {\mathcal {A}}\Vert _{2} = 0\) if and only if \({\mathcal {A}}={\mathcal {O}}\).

(2) The second property of \(\Vert {\mathcal {A}}\Vert _{2}\) can be verified using

$$\begin{aligned} \begin{aligned} \Vert k{\mathcal {A}}\Vert _{2}&= \sqrt{\lambda _{\max }\left( (k{\mathcal {A}})^{*}*_{{N}}(k{\mathcal {A}})\right) }\\&= \sqrt{|k|^{2}\, \lambda _{\max }\left( {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}}\right) }\\&= |k|\, \mu _{1}({\mathcal {A}})\\&= |k|\, \Vert {\mathcal {A}}\Vert _{2}. \end{aligned} \end{aligned}$$

(3) Since

$$\begin{aligned} \mu _{1}({\mathcal {A}}+{\mathcal {B}}) \le \mu _{1}({\mathcal {A}}) + \mu _{1}({\mathcal {B}}), \end{aligned}$$

immediately from the definition of the spectral norm it follows that

$$\begin{aligned} \Vert {\mathcal {A}}+{\mathcal {B}}\Vert _{2} \le \Vert {\mathcal {A}}\Vert _{2} +\Vert {\mathcal {B}}\Vert _{2}. \end{aligned}$$

Therefore, (2.2) is a valid definition of a tensor norm. \(\square \)

Our intention is to determine the spectral norm of a tensor explicitly using an approach based on matricization (unfolding). Matricization is the operation that rearranges a tensor into a matrix. Let \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_M,{\mathbf {K}}_1,\ldots ,{\mathbf {K}}_N\) be positive integers. Assume that \({\mathfrak {I}}, {\mathfrak {K}}\) are positive integers defined by

$$\begin{aligned} {\mathfrak {I}}={\mathbf {I}}_{1}{\mathbf {I}}_{2}\cdots {\mathbf {I}}_{M}, \ \ {\mathfrak {K}}={\mathbf {K}}_{1}{\mathbf {K}}_{2}\cdots {\mathbf {K}}_{N}. \end{aligned}$$
(2.3)

Denote by \(\mathrm {Mat}({\mathcal {A}})\) the matrix obtained after the matricization

$$\begin{aligned} \mathrm {Mat}: {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\mapsto {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}, \end{aligned}$$

which transforms a tensor \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\) into the matrix \(A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\). An arbitrary tensor \({\mathcal {A}}\) can be unfolded into an appropriate matrix A in different ways.

It is known that the spectral norm of the tensor is bounded by the spectral norm of the matricized tensor, i.e., \(\Vert \mathrm {Mat}({\mathcal {A}})\Vert _{\sigma }\ge \Vert {\mathcal {A}}\Vert _{\sigma }\) (see, for example, Li 2016).

One approach in the matricization, denoted by \(\psi :{\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\mapsto {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\) was proposed in Liang and Zheng (2018, Definition 2.4) (see also Brazell et al. 2013). The matricization \(\psi \) is defined by

$$\begin{aligned} \psi ({\mathcal {A}}_{i_1,\ldots ,i_M,k_1,\ldots ,k_N})=A_{\mathrm {ivec}(i,{\mathbb {I}}),\mathrm {ivec}(k,{\mathbb {K}})}, \end{aligned}$$

where \({i}=(i_1,\ldots ,i_M)^{\mathrm T}\), \({k}=(k_1,\ldots ,k_N)^{\mathrm T}\), \({\mathbb {K}}=\{K_1,\ldots ,K_N\}\), and

$$\begin{aligned} \mathrm {ivec}({i},{\mathbb {I}})=i_1+\sum \limits _{j=2}^M (i_j-1)\prod \limits _{s=1}^{j-1}I_s. \end{aligned}$$
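A hedged MATLAB sketch of this index map (for subscripts and mode sizes stored as row vectors) is given below; for column-major storage it coincides with the built-in sub2ind applied to the subscripts \(i_1,\ldots ,i_M\).

```matlab
% ivec(i, I) = i_1 + sum_{j>=2} (i_j - 1) * I_1 * ... * I_{j-1}
ivec = @(i, I) i(1) + sum((i(2:end) - 1) .* cumprod(I(1:end-1)));
```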

To define an effective procedure for the tensor matricization, we use the reshaping operation denoted as rsh, which was introduced in Panigrahy et al. (2018). Later, we characterize the spectral norm of a tensor by means of the spectral norm of the reshaped matrix. This operation can be implemented by means of the standard Matlab function reshape.

Definition 2.2

(Panigrahy et al. 2018) Let \({\mathbf {I}}_1,\ldots ,{\mathbf {I}}_M,{\mathbf {K}}_1,\ldots ,{\mathbf {K}}_N\) be given positive integers. Assume that \({\mathfrak {I}}, {\mathfrak {K}}\) are the integers defined in (2.3). The reshaping operation

$$\begin{aligned} \mathrm {rsh}: {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\mapsto {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}, \end{aligned}$$

transforms a tensor \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}\) into the matrix \(A\in \mathbb C^{{\mathfrak {I}}\times {\mathfrak {K}}}\) using the Matlab function reshape as follows:

$$\begin{aligned} \mathrm {rsh}\left( {\mathcal {A}}\right) =A=\mathrm {reshape}({\mathcal {A}},{\mathfrak {I}},{\mathfrak {K}}),\ \ {\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})},\ A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}. \end{aligned}$$

The inverse reshaping is the mapping defined by

$$\begin{aligned} \begin{aligned}&\mathrm {rsh}^{-1}\,:{\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}} \mapsto {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})},\\&\mathrm {rsh}^{-1}(A)={\mathcal {A}}=\mathrm {reshape}(A,{\mathbf {I}}_1,\ldots ,{\mathbf {I}}_M,{\mathbf {K}}_1,\ldots ,{\mathbf {K}}_N),\ \ A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}},\ {\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({M}) \times {\mathbf {K}}({N})}. \end{aligned} \end{aligned}$$

The following result from Panigrahy et al. (2018) will be useful.

Lemma 2.2

(Panigrahy et al. 2018) Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) and \({\mathcal {B}} \in {\mathbb {C}}^{{\mathbf {K}}_{1}\times \cdots \times {\mathbf {K}}_{{N}}\times {\mathbf {L}}_{1}\times \cdots \times {\mathbf {L}}_{{N}}}\) be given tensors, let the integers \({\mathfrak {I}},{\mathfrak {K}}\) be computed as in (2.3), and let \({\mathfrak {L}}={\mathbf {L}}_{1}{\mathbf {L}}_{2}\cdots {\mathbf {L}}_{N}\). Then

$$\begin{aligned} \mathrm {rsh}\left( {\mathcal {A}}*_{N}{\mathcal {B}}\right) =\mathrm {rsh}\left( {\mathcal {A}}\right) \mathrm {rsh}\left( {\mathcal {B}}\right) =AB, \end{aligned}$$
(2.4)

where \(A=\mathrm {rsh}\left( {\mathcal {A}}\right) \in \mathbb C^{{\mathfrak {I}}\times {\mathfrak {K}}},B=\mathrm {rsh}\left( {\mathcal {B}}\right) \in {\mathbb {C}}^{{\mathfrak {K}}\times {\mathfrak {L}}}\).

Applying the inverse reshaping operator \(\mathrm {rsh}^{-1}()\) on both sides in (2.4), it can be concluded that \(\mathrm {rsh}^{-1}(AB) = \mathrm {rsh}^{-1}(A)*_{N} \mathrm {rsh}^{-1}(B) = {\mathcal {A}}*_{N}{\mathcal {B}}\).

Now, our intention is to approximate the tensor norm \(\Vert {\mathcal {A}}\Vert _2\) by an effective computational procedure. For this purpose, we propose Algorithm 1 for computing \(\mathrm {rsh}^{-1}(A)\) in terms of the Singular Value Decomposition (SVD) of A. Since eigenvalues in Liang et al. (2019) are defined for even-order square tensors, our further investigation will be restricted to even-order tensors.

Algorithm 1 (construction of the tensor \({\mathcal {A}}=\mathrm {rsh}^{-1}(A)\) and of the tensors \({{\mathcal {U}}},{{\mathcal {D}}},{{\mathcal {V}}}\) with \({\mathcal {A}}={{\mathcal {U}}}*_N {{\mathcal {D}}} *_N{{\mathcal {V}}}^*\), obtained by reshaping the factors of the matrix SVD \(A=UDV^*\)).
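A minimal MATLAB sketch of Algorithm 1 (the function name tsvd and the calling convention are our assumptions) could read:

```matlab
function [U, D, V] = tsvd(A, I, K)
% Tensor SVD factors of A (of size [I, K]) obtained by reshaping the factors
% of the matrix SVD of rsh(A); that A = U *_N D *_N V^* then holds is
% verified in Lemma 2.3 below.
[Um, Dm, Vm] = svd(reshape(A, prod(I), prod(K)));
U = reshape(Um, [I, I]);   % rsh^{-1}(U_A)
D = reshape(Dm, [I, K]);   % rsh^{-1}(D_A)
V = reshape(Vm, [K, K]);   % rsh^{-1}(V_A)
end
```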

In Lemma 2.3, we show that the tensor \({\mathcal {A}}\) in Algorithm 1 is well defined.

Lemma 2.3

The tensor \({\mathcal {A}}\) in Algorithm 1 is well defined.

Proof

Under the assumptions of Algorithm 1, an application of Lemma 2.2 gives

$$\begin{aligned} \begin{aligned} \mathrm {rsh}\left( {{\mathcal {A}}}\right)&=\mathrm {rsh}\left( {\mathcal U}*_N {{\mathcal {D}}} *_N{{\mathcal {V}}}^*\right) \\&=\mathrm {rsh}\left( {{\mathcal {U}}}\right) \mathrm {rsh}\left( {\mathcal D}\right) \mathrm {rsh}\left( {{\mathcal {V}}}^*\right) \\&=UDV^*\\&=A, \end{aligned} \end{aligned}$$

which confirms \({\mathcal {A}}=\mathrm {rsh}^{-1}(A)\). \(\square \)

As a consequence of Algorithm 1, Lemma 2.4 shows that the spectral norm is invariant with respect to the function \(\mathrm {rsh}\).

Lemma 2.4

Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) be a given tensor and integers \({\mathfrak {I}},{\mathfrak {K}}\) are computed as in (2.3). Then

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _2=\Vert \mathrm {rsh}\left( {\mathcal {A}}\right) \Vert _2=\Vert A\Vert _2, \end{aligned}$$
(2.5)

where \(A=\mathrm {rsh}({\mathcal {A}})\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\).

Proof

According to Algorithm 1, the tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}\times {\mathbf {K}}_{1} \times \cdots \times {\mathbf {K}}_{{N}}}\) possesses the same singular values as the matrix \(A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\). \(\square \)

Example 2.1

Let \({\mathcal {A}} = \mathrm {rand}(2,2,2,2)\) be defined by

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&=\begin{bmatrix} 0.8147&\quad 0.1270\\ 0.9058&\quad 0.9134 \end{bmatrix},\ {\mathcal {A}}(:,:,2,1) =\begin{bmatrix} 0.6324&\quad 0.2785\\ 0.0975&\quad 0.5469 \end{bmatrix},\\ {\mathcal {A}}(:,:,1,2)&=\begin{bmatrix} 0.9575&\quad 0.1576\\ 0.9649&\quad 0.9706\end{bmatrix},\ {\mathcal {A}}(:,:,2,2) =\begin{bmatrix} 0.9572&\quad 0.8003\\ 0.4854&\quad 0.1419\end{bmatrix}, \end{aligned} \end{aligned}$$

then

$$\begin{aligned} A = \mathrm {rsh}({\mathcal {A}})=\mathrm {reshape}({\mathcal {A}},4,4)=\begin{bmatrix} 0.8147&\quad 0.6324&\quad 0.9575&\quad 0.9572\\ 0.9058&\quad 0.0975&\quad 0.9649&\quad 0.4854\\ 0.1270&\quad 0.2785&\quad 0.1576&\quad 0.8003\\ 0.9134&\quad 0.5469&\quad 0.9706&\quad 0.1419\end{bmatrix}. \end{aligned}$$

Simple verification shows that \(\Vert {\mathcal {A}}\Vert _2=\Vert A\Vert _2=2.6201.\)
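The example can be reproduced with the following sketch, assuming MATLAB's default random number state (which generates exactly the entries listed above):

```matlab
rng('default');
A  = rand(2,2,2,2);
A4 = reshape(A, 4, 4);     % rsh(A), the matrix displayed above
nrm = norm(A4, 2);         % approx. 2.6201 = ||A||_2, in accordance with Lemma 2.4
```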

Various definitions of the tensor rank can be found in the relevant literature. For more details, see Brazell et al. (2013) and Comon et al. (2009). An alternative definition of the tensor rank was introduced in Panigrahy et al. (2018).

Definition 2.3

(Panigrahy et al. 2018) Let \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) and \(A=\mathrm {reshape}\left( {\mathcal {A}},{\mathfrak {I}}, {\mathfrak {K}}\right) =\mathrm {rsh}({\mathcal {A}})\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\) be defined as in Algorithm 1. Then, the tensor rank of \({\mathcal {A}}\) is denoted by \(\mathrm {rshrank}({\mathcal {A}})\) and defined by \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rank}({A})\).
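In MATLAB notation, with I and K denoting the vectors of mode sizes (our convention, not from the cited source), the tensor rank of Definition 2.3 can be evaluated as in the following sketch:

```matlab
rshrankA = rank(reshape(A, prod(I), prod(K)));   % rshrank(A) = rank(rsh(A))
```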

3 Preliminary results

For \(a \in {\mathbb {C}}\), let \(a^\dagger = a^{-1}\) if \(a \ne 0\) and \(a^\dagger = 0\) if \(a = 0\). Following this notation, the tensor \({\mathcal {D}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) is called diagonal if all its entries are zero except possibly \({\mathcal {D}}_{i_{1} \ldots i_{{N}}i_{1} \ldots i_{{N}}}\), that is,

$$\begin{aligned} {\mathcal {D}}_{ i_{1}\cdots i_{{N}}j_{1}\ldots j_{{N}}} = {\left\{ \begin{array}{ll} 0, &{}\quad (i_{1},\ldots ,i_{{N}}) \ne (j_{1},\ldots ,j_{{N}}),\\ {\mathcal {D}}_{ i_{1}\cdots i_{{N}}i_{1}\ldots i_{{N}}} , &{}\quad (i_{1}, \ldots ,i_{{N}}) = (j_{1}, \ldots ,j_{{N}}), \end{array}\right. } \end{aligned}$$
(3.1)

where \({\mathcal {D}}_{i_{1} \ldots i_{{N}}i_{1}\ldots i_{{N}}}\) is a complex number. In particular, if \({\mathcal {D}}_{ i_{1}\ldots i_{{N}}j_{1}\ldots j_{{N}}}= \delta _{i_1j_1}\cdots \delta _{i_Nj_N}\), where

$$\begin{aligned}\delta _{lk}= {\left\{ \begin{array}{ll} 1, &{}\quad l=k, \\ 0, &{}\quad l\ne k \end{array}\right. } \end{aligned}$$

is the Kronecker delta, then \({\mathcal {D}}\) is the unit tensor, denoted by \({\mathcal {I}}\). It follows from Definition 1.1 that the Moore–Penrose inverse \({\mathcal {D}}^\dagger \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) of the diagonal tensor defined in (3.1) is equal to

$$\begin{aligned} \begin{aligned}&({\mathcal {D}}^\dagger )_{j_{1} \ldots j_{{N}}i_{1} \ldots i_{{N}}}= {\left\{ \begin{array}{ll} \frac{1}{{\mathcal {D}}_{i_{1} \ldots i_{{N}}j_{1} \ldots j_{{N}}}},&{}\quad {\mathcal {D}}_{i_{1} \ldots i_{{N}}j_{1} \ldots j_{{N}}}\ne 0,\\ 0, &{}\quad {\mathcal {D}}_{i_{1} \ldots i_{{N}}j_{1} \ldots j_{{N}}}=0. \end{array}\right. }. \end{aligned} \end{aligned}$$

It is easy to see that if \({\mathcal {D}}\) is a diagonal tensor, then \({\mathcal {D}}*_{{N}}{\mathcal {D}}^\dagger \) and \({\mathcal {D}}^\dagger *_{{N}}{\mathcal {D}}\) are diagonal tensors, whose diagonal entries are 1 or 0.

The tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}} \times {\mathbf {I}}_{1} \times \cdots \times {\mathbf {I}}_{{N}}}\) is orthogonal if \({\mathcal {A}}*_{{N}}{\mathcal {A}}^{*} = {\mathcal {A}}^{*}*_{{N}}{\mathcal {A}} = {\mathcal {I}}\).

A method for computing the Moore–Penrose inverse of a tensor was proposed in Brazell et al. (2013) and Sun et al. (2016). This method is restated in Lemma 3.1.

Lemma 3.1

(Brazell et al. 2013; Sun et al. 2016) For a tensor \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\), the singular value decomposition (SVD) of \({\mathcal {A}}\) has the form:

$$\begin{aligned} {\mathcal {A}} = {\mathcal {U}}*_{{N}}{\mathcal {D}}*_{{N}}{\mathcal {V}}^{*}, \end{aligned}$$
(3.2)

where \({\mathcal {U}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})}\) and \({\mathcal {V}} \in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})}\) are orthogonal tensors, \({\mathcal {D}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) is a diagonal tensor satisfying

$$\begin{aligned} {\mathcal {D}}_{ i_{1}\cdots i_{{N}}k_{1}\ldots k_{{N}}} = {\left\{ \begin{array}{ll} 0, &{}\quad (i_{1},\ldots ,i_{{N}}) \ne (k_{1},\ldots ,k_{{N}}),\\ \mu _{i_{1} \ldots i_{{N}}}, &{}\quad (i_{1}, \ldots ,i_{{N}}) = (k_{1}, \ldots ,k_{{N}}), \end{array}\right. } \end{aligned}$$

wherein \(\mu _{i_{1} \ldots i_{{N}}}\) are the singular values of \({\mathcal {A}}\). Then

$$\begin{aligned} {\mathcal {A}}^\dagger = {\mathcal {V}}*_{{N}}{\mathcal {D}}^\dagger *_{{N}}{\mathcal {U}}^{*}, \end{aligned}$$
(3.3)

where

$$\begin{aligned} ({\mathcal {D}}^\dagger )_{ k_{1}\ldots k_{{N}} i_{1} \ldots i_{{N}}} = {\left\{ \begin{array}{ll} 0, &{}\quad (i_{1} \ldots i_{{N}}) \ne (k_{1} \ldots k_{{N}}),\\ \left( \mu _{i_{1} \ldots i_{{N}}}\right) ^\dagger , &{}\quad (i_{1} \ldots i_{{N}}) = (k_{1} \ldots k_{{N}}), \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \left( \mu _{i_{1} \ldots i_{{N}}}\right) ^\dagger = {\left\{ \begin{array}{ll} 0, &{}\quad \mu _{i_{1} \ldots i_{{N}}}=0,\\ \left( \mu _{i_{1} \ldots i_{{N}}}\right) ^{-1}, &{}\quad \mu _{i_{1} \ldots i_{{N}}}\ne 0. \end{array}\right. } \end{aligned}$$
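Since \(\mathrm {rsh}\) preserves the Einstein product (Lemma 2.2) and the conjugate transpose, the representation (3.3) may equivalently be evaluated by applying the matrix pseudoinverse to the reshaped tensor. The following MATLAB sketch (the function name tpinv and its interface are our assumptions) illustrates this; one can check numerically that its output satisfies the four equations of Definition 1.1 up to rounding errors.

```matlab
function Ap = tpinv(A, I, K)
% Moore-Penrose inverse of the tensor A of size [I, K]:
% A^dagger = rsh^{-1}( pinv( rsh(A) ) ), which agrees with (3.3).
Ap = reshape(pinv(reshape(A, prod(I), prod(K))), [K, I]);
end
```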

An effective algorithm for computing the Moore–Penrose inverse of a tensor in the form (3.3) was presented in Algorithm 1 from Huang et al. (2018). To compute the Moore–Penrose inverse by means of (3.3), it is necessary to compute the transpose of a tensor. For this purpose, we developed the following Algorithm 2.

Algorithm 2 (computation of the tensor transpose \({\mathcal {A}}^{\mathrm T}=\mathrm {rsh}^{-1}\left( \mathrm {rsh}({\mathcal {A}})^{\mathrm T}\right) \)).
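A minimal MATLAB sketch of Algorithm 2 (the function name is our assumption) is:

```matlab
function At = ttranspose(A, I, K)
% Tensor transpose A^T = rsh^{-1}( rsh(A).' ) for A of size [I, K];
% replacing .' by ' yields the conjugate transpose A^*.
At = reshape(reshape(A, prod(I), prod(K)).', [K, I]);
end
```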

Example 3.1

This example is aimed at the verification of Algorithm 2. Consider \({\mathcal {A}}=\mathrm {rand}(2,2,2,2)\) equal to

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&= \begin{bmatrix} 0.8147&\quad 0.1270\\ 0.9058&\quad 0.9134\end{bmatrix} ;\ \ {\mathcal {A}}(:,:,2,1) = \begin{bmatrix} 0.6324&\quad 0.2785\\ 0.0975&\quad 0.5469\end{bmatrix} ;\\ {\mathcal {A}}(:,:,1,2)&= \begin{bmatrix} 0.9575&\quad 0.1576\\ 0.9649&\quad 0.9706\end{bmatrix} ;\ \ {\mathcal {A}}(:,:,2,2) = \begin{bmatrix} 0.9572&\quad 0.8003\\ 0.4854&\quad 0.1419\end{bmatrix} . \end{aligned} \end{aligned}$$

Then, \(A = \mathrm {rsh}\left( {\mathcal {A}}\right) =\mathrm {reshape}({\mathcal {A}},4,4)\) is equal to

$$\begin{aligned} A=\begin{bmatrix} 0.8147&\quad 0.6324&\quad 0.9575&\quad 0.9572\\ 0.9058&\quad 0.0975&\quad 0.9649&\quad 0.4854\\ 0.1270&\quad 0.2785&\quad 0.1576&\quad 0.8003\\ 0.9134&\quad 0.5469&\quad 0.9706&\quad 0.1419 \end{bmatrix}. \end{aligned}$$

Furthermore, the result of Algorithm 2 is equal to \({\mathcal {A}}^{\mathrm T}=\mathrm {reshape}(A^{\mathrm T},2,2,2,2)\), which gives

$$\begin{aligned} \begin{aligned} {\mathcal {A}}^{\mathrm T}(:,:,1,1)&= \begin{bmatrix} 0.8147&\quad 0.9575\\ 0.6324&\quad 0.9572\end{bmatrix} ;\ \ {\mathcal {A}}^{\mathrm T}(:,:,2,1) = \begin{bmatrix} 0.9058&\quad 0.9649\\ 0.0975&\quad 0.4854\end{bmatrix} ;\\ {\mathcal {A}}^{\mathrm T}(:,:,1,2)&= \begin{bmatrix} 0.1270&\quad 0.1576\\ 0.2785&\quad 0.8003\end{bmatrix} ;\ \ {\mathcal {A}}^{\mathrm T}(:,:,2,2) = \begin{bmatrix} 0.9134&\quad 0.9706\\ 0.5469&\quad 0.1419\end{bmatrix} . \end{aligned} \end{aligned}$$
(3.4)

On the other hand, a direct calculation gives

$$\begin{aligned} \begin{aligned} a_{1111}&=0.8147={\mathcal {A}}^{\mathrm T}_{1111};\ a_{2111}=0.9058={\mathcal {A}}^{\mathrm T}_{1121};\\ a_{1211}&=0.1270={\mathcal {A}}^{\mathrm T}_{1112};\ a_{2211}=0.9134={\mathcal {A}}^{\mathrm T}_{1122};\\ a_{1121}&=0.6324={\mathcal {A}}^{\mathrm T}_{2111};\ a_{2121}=0.0975={\mathcal {A}}^{\mathrm T}_{2121};\\ a_{1221}&=0.2785={\mathcal {A}}^{\mathrm T}_{2112};\ a_{2221}=0.5469={\mathcal {A}}^{\mathrm T}_{2122};\\ a_{1112}&=0.9575={\mathcal {A}}^{\mathrm T}_{1211};\ a_{2112}=0.9649={\mathcal {A}}^{\mathrm T}_{1221};\\ a_{1212}&=0.1576={\mathcal {A}}^{\mathrm T}_{1212};\ a_{2212}=0.9706={\mathcal {A}}^{\mathrm T}_{1222};\\ a_{1122}&=0.9572={\mathcal {A}}^{\mathrm T}_{2211};\ a_{2122}=0.4854={\mathcal {A}}^{\mathrm T}_{2221};\\ a_{1222}&=0.8003={\mathcal {A}}^{\mathrm T}_{2212};\ a_{2222}=0.1419={\mathcal {A}}^{\mathrm T}_{2222}. \end{aligned} \end{aligned}$$

Therefore, the result of direct calculation coincides with (3.4).

Lemma 3.2

Let \({\mathcal {A}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) be a given tensor, let \(\mu _{i_{1},\cdots ,i_{{N}}}\) be the singular values of \({\mathcal {A}}\) and \(\nu _{i_{1},\cdots ,i_{{L}}}\) be the nonzero singular values of \({\mathcal {A}}\). Then

$$\begin{aligned} \Vert {\mathcal {A}}\Vert _{2} = \mu _{1}({\mathcal {A}}); \qquad \Vert {\mathcal {A}}^\dagger \Vert _{2} = \frac{1}{\nu _{\min }({\mathcal {A}})}, \end{aligned}$$

where \(\nu _{\min }({\mathcal {A}})\) denotes the smallest nonzero singular value of \({\mathcal {A}}\).

Proof

The identity \(\Vert {\mathcal {A}}\Vert _{2} = \mu _{1}({\mathcal {A}})\) follows from Lemma 2.1. Since \(\nu _{i_{1},\ldots ,i_{{L}}}\) are the nonzero singular values of \({\mathcal {A}}\) defined in (3.2), it follows from (3.3) that \(\left( \nu _{i_{1},\ldots ,i_{{L}}}\right) ^{-1} > 0\) are the nonzero singular values of \({\mathcal {A}}^\dagger \). Accordingly, \(\Vert {\mathcal {A}}^\dagger \Vert _{2} =\mu _{1}({\mathcal {A}}^\dagger )= \frac{1}{\nu _{\min }({\mathcal {A}})}\). \(\square \)

A useful representation for \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) is derived in Lemma 3.3.

Lemma 3.3

Let \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) be an arbitrary tensor and let the positive integers \({\mathfrak {I}}, {\mathfrak {K}}\) be defined as in (2.3). Then

$$\begin{aligned} {{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger ={{\mathcal {U}}}_A*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_A^*. \end{aligned}$$
(3.5)

Proof

We follow Algorithm 1 from Huang et al. (2018) to define \({\mathcal {A}}^\dagger \). According to Step 1, it is necessary to reshape \({\mathcal {A}}\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) into a matrix \(A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\), where \({\mathfrak {I}}, {\mathfrak {K}}\) are defined in (2.3). This transformation is denoted by \(\mathrm {rsh}({\mathcal {A}})=A\). Step 2 assumes the Singular Value Decomposition (SVD) of A of the form \([U_A,D_A,V_A]=SVD(A)\), which implies \(A=U_A D_A V_A^*\), where \(U_A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {I}}}\) and \(V_A\in {\mathbb {C}}^{{\mathfrak {K}}\times {\mathfrak {K}}}\) are unitary and the matrix \(D_A\in {\mathbb {C}}^{{\mathfrak {I}}\times {\mathfrak {K}}}\) is of the diagonal form:

$$\begin{aligned} D_A=\begin{bmatrix} \Sigma _A&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)} \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} \Sigma _A\in {\mathbb {C}}^{{\mathfrak {I}}_R\times {\mathfrak {I}}_R},\ {\mathfrak {I}}_R=\mathrm {rshrank}({\mathcal {A}})=\mathrm {rank}(A) \end{aligned}$$

is diagonal with singular values of A on the main diagonal and

$$\begin{aligned} O_{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\in \mathbb C^{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}, O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}\in \mathbb C^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}, O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\in \mathbb C^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)} \end{aligned}$$

are appropriate zero blocks. According to Step 3, we perform the reshaping operations:

$$\begin{aligned} \mathrm {rsh}^{-1}(U_A)={{\mathcal {U}}}_A\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})},\ \ \mathrm {rsh}^{-1}(V_A^*)={{\mathcal {V}}}_A^*\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})},\ \ \mathrm {rsh}^{-1}(D_A)={{\mathcal {D}}}_A\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}. \end{aligned}$$

Then, compute

$$\begin{aligned} D_A^\dagger =\begin{bmatrix} \Sigma _A^{-1}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {K}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {K}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \in {\mathbb {C}}^{{\mathfrak {K}}\times {\mathfrak {I}}} \end{aligned}$$

and

$$\begin{aligned} {{\mathcal {D}}}_A^\dagger =\mathrm {rsh}{^{-1}}(D_A^\dagger )\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {I}}({N})}. \end{aligned}$$

According to Step 4 of Algorithm 1 from Huang et al. (2018)

$$\begin{aligned} {{\mathcal {A}}}^\dagger ={{\mathcal {V}}}_A*_N {{\mathcal {D}}}_A^\dagger *_N{{\mathcal {U}}}_A^*. \end{aligned}$$

Now, the tensor \({{\mathcal {A}}}\) possesses the representation:

$$\begin{aligned} {{\mathcal {A}}}={{\mathcal {U}}}_A*_N {{\mathcal {D}}}_A *_N{\mathcal V}_A^*. \end{aligned}$$

Further, one can verify

$$\begin{aligned} {{\mathcal {D}}}_A*_N{{\mathcal {D}}}_A^\dagger = \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})}, \end{aligned}$$

where \(I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}\) is the identity \({\mathfrak {I}}_R\times {\mathfrak {I}}_R\) matrix. Consequently, \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) possesses the representation (3.5). \(\square \)
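Since all the factors in (3.5) are images of matrices under \(\mathrm {rsh}^{-1}\), the representation can be checked at the matrix level. The following sketch (the sizes and the seed are arbitrary choices for illustration; the corresponding tensor would be \(\mathrm {rsh}^{-1}(\texttt {Am})\)) verifies that \(\mathrm {rsh}({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger )\) coincides with the projector built from the identity and zero blocks in (3.5):

```matlab
rng(2);  I = [2 2];  K = [2 3];
Am = rand(prod(I), 2) * rand(2, prod(K));          % rsh(A), built with rank r = 2
[Ua, ~, ~] = svd(Am);  r = rank(Am);
P = Ua * blkdiag(eye(r), zeros(prod(I) - r)) * Ua';
norm(Am * pinv(Am) - P, 'fro')                     % approx. 0 up to rounding
```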

The result of Proposition 3.1 will be useful.

Proposition 3.1

(Meng and Zheng 2010) Let \(W\in {\mathbb {C}}^{n\times n}\) be a unitary matrix with the block form:

$$\begin{aligned} W =\begin{bmatrix} W_{11}&\quad W_{12} \\ W_{21}&\quad W_{22}\end{bmatrix}, \ W_{11} \in {\mathbb {C}}^{r\times r}, W_{22}\in {\mathbb {C}}^{(n-r)\times (n-r)}, 1\le r < n. \end{aligned}$$

Then, \(\Vert W_{12}\Vert = \Vert W_{21}\Vert \) for any unitarily invariant norm.

4 Main results

For the sake of convenience, we assume that the following condition holds

$$\begin{aligned} \begin{aligned} {\mathcal {A}}, {\mathcal {E}}&\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}, \ {\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}, \ \mathrm {rshrank}({{{\mathcal {A}}}})=\mathrm {rshrank}({{{\mathcal {B}}}}) = r\\ \bigtriangleup&= \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2} < 1. \end{aligned} \end{aligned}$$
(4.1)

Lemma 4.1

If condition (4.1) is satisfied, then

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger \Vert _{2} \le \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup } \end{aligned}$$
(4.2)

Proof

According to Lemma 3.2, we get

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger \Vert _{2} = \frac{1}{\nu _{\min }({\mathcal {A}} + {\mathcal {E}})}, \end{aligned}$$

so that, by Weyl's inequality for singular values and the rank condition in (4.1),

$$\begin{aligned} \begin{aligned} \frac{1}{\Vert {\mathcal {B}}^\dagger \Vert _{2}}&= \nu _{\min }({\mathcal {A}} + {\mathcal {E}}) \ge \nu _{\min }({\mathcal {A}}) - \mu _{1}({\mathcal {E}})\\&= \nu _{\min }({\mathcal {A}}) - \Vert {\mathcal {E}}\Vert _{2} \\&= \frac{1}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} - \Vert {\mathcal {E}}\Vert _{2}. \end{aligned} \end{aligned}$$

Then

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger \Vert _{2} \le \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}} = \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup }, \end{aligned}$$

which completes the proof. \(\square \)

Next, we give the decomposition of \({\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \).

Theorem 4.1

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) and \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\). Then

$$\begin{aligned} \begin{aligned} {\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger&= -{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger + {\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}*_{{N}}{P}^{\bot }_{{\mathcal {A}}} -{R}^{\bot }_{{\mathcal {B}}} *_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {A}}^\dagger . \end{aligned} \end{aligned}$$

Proof

After some verifications, one can obtain

$$\begin{aligned} \begin{aligned} {\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger&= -{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger +({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger ) +{\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}-{\mathcal {A}})*_{{N}}{\mathcal {A}}^\dagger \\&= -{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger +{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ) -({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger . \end{aligned} \end{aligned}$$

According to the properties \((3^T)\) and \((1^T)\) from Definition 1.1, it follows that

$$\begin{aligned} {\mathcal {A}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )= {\mathcal {A}}^{*}-{\mathcal {A}}^{*}*_{{N}}({\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{*} = {\mathcal {A}}^{*}-\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}}\right) ^{*}={\mathcal {O}}, \end{aligned}$$

where \({\mathcal {O}}\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {I}}({N})}\) is an appropriate zero tensor. Consequently

$$\begin{aligned}&{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )= {\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\\&\quad ={\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\\&\quad ={\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}^\dagger )^{*} {{*_{{N}}}}({\mathcal {A}} +{\mathcal {E}})^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\\&\quad ={\mathcal {B}}^\dagger *_{{N}}({\mathcal {B}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ). \end{aligned}$$

Analogously, we arrive at

$$\begin{aligned} ({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {B}}^{*}={\mathcal {O}}, \end{aligned}$$

which, in view of \({\mathcal {A}}^\dagger ={\mathcal {A}}^{*}*_{{N}}({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {A}}^\dagger \), further implies

$$\begin{aligned} ({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger =-({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {A}}^\dagger . \end{aligned}$$

Combining the above identities completes the proof. \(\square \)

Lemma 4.2

If \({\mathcal {O}} \ne {\mathcal {P}} \in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})}\), and \({\mathcal {P}}^{2}={\mathcal {P}}={\mathcal {P}}^{*}\), then

$$\begin{aligned} \Vert {\mathcal {P}}\Vert _{2}=1. \end{aligned}$$

Proof

Since

$$\begin{aligned} \Vert {\mathcal {P}}\Vert _{2}^{2}=\Vert {\mathcal {P}}^{*}*_{{N}}{\mathcal {P}}\Vert _{2}=\Vert {\mathcal {P}}^{2}\Vert _{2}=\Vert {\mathcal {P}}\Vert _{2}, \end{aligned}$$

it follows that

$$\begin{aligned} \Vert {\mathcal {P}}\Vert _{2}(\Vert {\mathcal {P}}\Vert _{2}-1)=0. \end{aligned}$$

Therefore, \(\Vert {\mathcal {P}}\Vert _{2}=1\) in the case \({\mathcal {P}} \ne {\mathcal {O}}\). \(\square \)

Theorem 4.2

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\) and \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\). If condition (4.1) is satisfied, then

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} \le \left( 1+\frac{1}{1-\bigtriangleup }+\frac{1}{(1-\bigtriangleup )^{2}}\right) \bigtriangleup . \end{aligned}$$
(4.3)

Proof

Since \(({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{2}=({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )=({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{*}\) and \(({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})^{2}=({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})=({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})^{*}\), by Lemma 4.2

$$\begin{aligned} \Vert {\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger \Vert _{2}=1,\quad \Vert {\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}\Vert _{2}=1, \end{aligned}$$

and from Theorem 4.1

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2} \le (\Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {B}}^\dagger \Vert _{2}+\Vert {\mathcal {B}}^\dagger \Vert _{2}^{2}+\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2})\Vert {\mathcal {E}}\Vert _{2}. \end{aligned}$$

An application of Lemma 4.1 gives

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2} \le \left( \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}}{1 - \bigtriangleup }+ \frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}}{(1-\bigtriangleup )^{2}}+\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}\right) \Vert {\mathcal {E}}\Vert _{2}. \end{aligned}$$

Furthermore, the inequality (4.3) can be verified taking into account \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}\). \(\square \)
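The bound (4.3) can be illustrated numerically; by Lemma 2.2 and Lemma 2.4, all the norms involved may be evaluated after reshaping. A hedged sketch (the sizes, the perturbation level, and the seed are arbitrary choices, not taken from the paper) is:

```matlab
rng(1);  I = [2 2];  K = [3 2];
A  = rand([I, K]);   E = 1e-3 * rand([I, K]);   B = A + E;
Am = reshape(A, prod(I), prod(K));  Bm = reshape(B, prod(I), prod(K));
Em = reshape(E, prod(I), prod(K));
d   = norm(pinv(Am), 2) * norm(Em, 2);                   % the quantity in (4.1)
lhs = norm(pinv(Bm) - pinv(Am), 2) / norm(pinv(Am), 2);  % left-hand side of (4.3)
rhs = (1 + 1/(1 - d) + 1/(1 - d)^2) * d;                 % right-hand side of (4.3)
assert(lhs <= rhs)
```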

Theorem 4.3

If \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}, {\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\), and \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})\), then

$$\begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2} =\Vert {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}, \end{aligned}$$
(4.4)

where \({\mathcal {I}}\) is the identity \({\mathbf {I}}(N)\times {\mathbf {I}}(N)\) tensor.

Proof

According to Lemma 3.3, it follows that

$$\begin{aligned} {{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger ={{\mathcal {U}}}_A*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_A^*. \end{aligned}$$

Furthermore, using Lemma 2.2 and

$$\begin{aligned} {\mathcal {I}}=\mathrm {rsh}{^{-1}}\left( I_{{\mathfrak {I}}\times {\mathfrak {I}}}\right) \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})}, \end{aligned}$$

it follows that

$$\begin{aligned} \begin{aligned} {\mathcal {I}}-{{\mathcal {A}}}{{*_{{N}}}}{\mathcal A}^\dagger&={{\mathcal {U}}}_A*_N \left( {\mathcal {I}}-\mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \right) *_N{{\mathcal {U}}}_A^*\\&= {{\mathcal {U}}}_A*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_A^*. \end{aligned} \end{aligned}$$

Similarly, in view of \(\mathrm {rshrank}({{\mathcal {B}}})=\mathrm {rshrank}({{\mathcal {A}}})\), it follows that \(\mathrm {rank}({B})=\mathrm {rank}({A})\), where \(\mathrm {rsh}({\mathcal {A}})=A\) and \(\mathrm {rsh}({\mathcal {B}})=B\). The SVD of \(B\) is given by \([U_B,D_B,V_B]=SVD(B)\). Now, consider the reshaping operations

$$\begin{aligned} \begin{aligned} \mathrm {rsh}^{-1}(U_B)&={{\mathcal {U}}}_B\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {I}}({N})},\ \ \mathrm {rsh}^{-1}(V_B^*)={{\mathcal {V}}}_B^*\in {\mathbb {C}}^{{\mathbf {K}}({N}) \times {\mathbf {K}}({N})},\\ \mathrm {rsh}^{-1}(D_B)&= \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} \Sigma _B&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {K}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {K}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) ={{\mathcal {D}}}_B\in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \Sigma _B\in {\mathbb {C}}^{{\mathfrak {I}}_R\times {\mathfrak {I}}_R},\ {\mathfrak {I}}_R=\mathrm {rank}({A}) \end{aligned}$$

is diagonal with the singular values of B on the main diagonal. This yields

$$\begin{aligned}&{{\mathcal {B}}}={{\mathcal {U}}}_B*_N {{\mathcal {D}}}_B *_N{\mathcal V}_B^*,\ \ {{\mathcal {B}}}{{*_{{N}}}}{{\mathcal {B}}}^\dagger \\&\quad ={{\mathcal {U}}}_B*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{\mathcal U}_B^*, \end{aligned}$$

and further

$$\begin{aligned} \begin{aligned} {\mathcal {I}}-{{\mathcal {B}}}{{*_{{N}}}}{\mathcal B}^\dagger&={{\mathcal {U}}}_B*_N \left( {\mathcal {I}}- \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \right) *_N{{\mathcal {U}}}_B^*\\&={{\mathcal {U}}}_B*_N \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) *_N{{\mathcal {U}}}_B^*. \end{aligned} \end{aligned}$$

Now, observe the tensor products \({{\mathcal {U}}}_A^**_N {\mathcal U}_B\) and \({{\mathcal {U}}}_B^**_N {{\mathcal {U}}}_A\). They are also unitary and equal to

$$\begin{aligned} \begin{aligned} {{\mathcal {U}}}_A^**_N {{\mathcal {U}}}_B&=\mathrm {rsh}^{-1}\left( U_A^* U_B\right) =\mathrm {rsh}^{-1}\left( \begin{bmatrix} W_{11}&\quad W_{12}\\ W_{21}&\quad W_{22}\end{bmatrix} \right) \\ {{\mathcal {U}}}_B^**_N {{\mathcal {U}}}_A&=\mathrm {rsh}^{-1}\left( U_B^* U_A\right) =\mathrm {rsh}^{-1}\left( \begin{bmatrix} W_{11}^*&\quad W_{21}^*\\ W_{12}^*&\quad W_{22}^*\end{bmatrix} \right) , \end{aligned} \end{aligned}$$

where

$$\begin{aligned} W_{11}\in {\mathbb {C}}^{{\mathfrak {I}}_R\times {\mathfrak {I}}_R},\quad W_{12}\in {\mathbb {C}}^{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)},\quad W_{21}\in {\mathbb {C}}^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R},\quad W_{22}\in {\mathbb {C}}^{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}. \end{aligned}$$

In addition, it can be verified that \(\Vert {\cdot }\Vert _2\) is a unitarily invariant tensor norm (Govaerts and Pryce 1989), which, in conjunction with Lemma 2.2, implies

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}\\&\quad = \left\| \mathrm {rsh}{^{-1}}\left( \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \begin{bmatrix} W_{11}&\quad W_{12}\\ W_{21}&\quad W_{22}\end{bmatrix} \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right) \right\| _2. \end{aligned} \end{aligned}$$

An application of Lemma 2.4 further implies

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}\\&\quad = \left\| \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \begin{bmatrix} W_{11}&\quad W_{12}\\ W_{21}&\quad W_{22}\end{bmatrix} \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right\| _2\\&\quad =\left\| \begin{bmatrix} O&\quad W_{12}\\ O&\quad O \end{bmatrix} \right\| _2. \end{aligned} \end{aligned}$$

Finally, using the result from Govaerts and Pryce (1989), it follows that

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}= \left\| W_{12}\right\| _2. \end{aligned} \end{aligned}$$

On the other hand, in the dual case, it follows that

$$\begin{aligned} \begin{aligned}&\Vert {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}\\&\quad = \left\| \begin{bmatrix} I_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix} \begin{bmatrix} W_{11}^*&\quad W_{21}^*\\ W_{12}^*&\quad W_{22}^*\end{bmatrix} \begin{bmatrix} O_{{\mathfrak {I}}_R\times {\mathfrak {I}}_R}&\quad O_{{\mathfrak {I}}_R\times ({\mathfrak {I}}-{\mathfrak {I}}_R)}\\ O_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times {\mathfrak {I}}_R}&\quad I_{({\mathfrak {I}}-{\mathfrak {I}}_R)\times ({\mathfrak {I}}-{\mathfrak {I}}_R)} \end{bmatrix}\right\| _2\\&\quad =\left\| \begin{bmatrix} O&\quad W_{21}^*\\ O&\quad O \end{bmatrix} \right\| _2\\&\quad =\left\| W_{21}^*\right\| _2. \end{aligned} \end{aligned}$$

It remains to verify \(\Vert W_{12}\Vert _2=\Vert W_{21}^*\Vert _2\). Indeed, according to Proposition 3.1, it follows that \(\Vert W_{12}\Vert _2=\Vert W_{21}\Vert _2\), and \(\Vert W_{21}\Vert _2=\Vert W_{21}^*\Vert _2\) holds by the result from Govaerts and Pryce (1989), which completes the proof. \(\square \)
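Since the spectral norm of an even-order tensor coincides with the spectral norm of its unfolding, the key steps of the above proof can be illustrated at the matrix level. The sketch below is illustrative only: the data are random, and \(B\) is built from a perturbed factorization merely to enforce \(\mathrm {rank}(B)=\mathrm {rank}(A)\); it checks that the two projector norms agree with the norms of the off-diagonal blocks of \(W=U_A^*U_B\) and hence with each other.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 6, 8, 3                                   # sizes of the unfoldings and the common rank

# Unfoldings A and B = A + E with rank(A) = rank(B) = r
X, Y = rng.standard_normal((m, r)), rng.standard_normal((r, n))
A = X @ Y
B = (X + 1e-3 * rng.standard_normal((m, r))) @ (Y + 1e-3 * rng.standard_normal((r, n)))

# Orthogonal projectors A A^dagger and B B^dagger
PA = A @ np.linalg.pinv(A, rcond=1e-12)
PB = B @ np.linalg.pinv(B, rcond=1e-12)
I = np.eye(m)
lhs = np.linalg.norm(PB @ (I - PA), 2)
rhs = np.linalg.norm(PA @ (I - PB), 2)

# The same quantities via the off-diagonal blocks of W = U_A^* U_B
UA = np.linalg.svd(A)[0]
UB = np.linalg.svd(B)[0]
W = UA.conj().T @ UB
W12, W21 = W[:r, r:], W[r:, :r]

print(np.isclose(lhs, np.linalg.norm(W21, 2)))      # expected: True
print(np.isclose(rhs, np.linalg.norm(W12, 2)))      # expected: True
print(np.isclose(lhs, rhs))                         # equality (4.4); expected: True
```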

Corollary 4.1

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\), \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\), and assume \(\mathrm {rshrank}({{\mathcal {A}}}) = \mathrm {rshrank}({{\mathcal {B}}})\). If

$$\begin{aligned} {\mathcal {G}}={\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ), \end{aligned}$$

then

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _{2} \le \Vert {\mathcal {E}}\Vert _{2}\Vert {\mathcal {A}}^\dagger \Vert _{2} \Vert {\mathcal {B}}^\dagger \Vert _{2}. \end{aligned}$$
(4.5)

Proof

Clearly, \({\mathcal {G}}={\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\). Then

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _{2} \le \Vert {\mathcal {B}}^\dagger \Vert _{2}\Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}, \end{aligned}$$

since

$$\begin{aligned}&{\mathcal {B}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )={{\mathcal {O}}},\\&({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )^{2} =({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger ) =({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )^{*}. \end{aligned}$$

Therefore, \(\Vert {\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger \Vert _{2}=1\). Applying Theorem 4.3, one can obtain

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _{2}&= \Vert {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2} \\&= \Vert ({\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )^{*}*_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}\\&= \Vert ({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}*_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\Vert _{2}\\&\le \Vert ({\mathcal {A}}^\dagger )^{*}*_{{N}}{\mathcal {E}}^{*}\Vert _{2}=\Vert {\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger \Vert _{2}\\&\le \Vert {\mathcal {E}}\Vert _{2} \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned} \end{aligned}$$

Thus, the statement follows. \(\square \)
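A quick matrix-level sanity check of inequality (4.5), again through the unfoldings and with illustrative random data of equal rank, may look as follows.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 6, 8, 3

# A and B = A + E with equal ranks
X, Y = rng.standard_normal((m, r)), rng.standard_normal((r, n))
A = X @ Y
B = (X + 1e-4 * rng.standard_normal((m, r))) @ (Y + 1e-4 * rng.standard_normal((r, n)))
E = B - A

A_p = np.linalg.pinv(A, rcond=1e-12)
B_p = np.linalg.pinv(B, rcond=1e-12)
G = B_p @ (np.eye(m) - A @ A_p)                    # G = B^dagger (I - A A^dagger)

lhs = np.linalg.norm(G, 2)
rhs = np.linalg.norm(E, 2) * np.linalg.norm(A_p, 2) * np.linalg.norm(B_p, 2)
print(lhs <= rhs)                                  # inequality (4.5); expected: True
```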

Theorem 4.4

Let \({\mathcal {A}}, {\mathcal {E}} \in {\mathbb {C}}^{{\mathbf {I}}({N}) \times {\mathbf {K}}({N})}\), \({\mathcal {B}} = {\mathcal {A}} + {\mathcal {E}}\), and let \({\mathfrak {I}},{\mathfrak {K}}\) be defined as in (2.3). If the Condition (4.1) is satisfied, then

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} \le k\frac{\triangle }{1-\triangle }, \end{aligned}$$
(4.6)

where \(\triangle = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}\) and the parameter k is defined as follows:

  (1) if \(\mathrm {rshrank}({{\mathcal {A}}}) < \min ({\mathfrak {I}}, {\mathfrak {K}})\), then \(k = \frac{1+\sqrt{5}}{2}\);

  (2) if \(\mathrm {rshrank}({{\mathcal {A}}}) = \min ({\mathfrak {I}}, {\mathfrak {K}})\), then \(k = \sqrt{2}\);

  (3) if \(\mathrm {rshrank}({{\mathcal {A}}}) = {\mathfrak {I}} ={\mathfrak {K}}\), then \(k = 1\).

Proof

Let

$$\begin{aligned} {\mathcal {F}}=-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger ,\ {\mathcal {G}}={\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger ),\ {\mathcal {H}}=-({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger . \end{aligned}$$

By Lemma 4.1, we get

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {F}}\Vert _{2}&\le \frac{\triangle }{1-\triangle }\Vert {\mathcal {A}}^\dagger \Vert _{2},\\ \Vert {\mathcal {G}}\Vert _{2}&\le \frac{\triangle }{1-\triangle }\Vert {\mathcal {A}}^\dagger \Vert _{2},\\ \Vert {\mathcal {H}}\Vert _{2}&\le \triangle \Vert {\mathcal {A}}^\dagger \Vert _{2}, \end{aligned} \end{aligned}$$

where \(\triangle = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2} = \Vert ({\mathcal {A}}^\dagger )^{*}\Vert _{2}\Vert {\mathcal {E}}^{*}\Vert _{2}\). Let

$$\begin{aligned} \alpha = \frac{\triangle }{1-\triangle }, \end{aligned}$$

then

$$\begin{aligned} \Vert {\mathcal {F}}\Vert _{2}, \Vert {\mathcal {G}}\Vert _{2}, \Vert {\mathcal {H}}\Vert _{2} \le \alpha \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$

(1) Let \({\mathcal {X}} \in {\mathbb {C}}^{{I}_{1} \times \cdots \times {I}_{{N}} \times {K}_{1} \times \cdots \times {K}_{{N}}}, \Vert {\mathcal {X}}\Vert _{2} = 1\), and \({\mathcal {X}} = {\mathcal {X}}_{1}+{\mathcal {X}}_{2}\), where

$$\begin{aligned} {\mathcal {X}}_{1}={\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {X}},\quad {\mathcal {X}}_{2}=({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}. \end{aligned}$$

Clearly, \({\mathcal {X}}_{1}\) and \({\mathcal {X}}_{2}\) are orthogonal; hence,

$$\begin{aligned} 1 = \Vert {\mathcal {X}}\Vert _{2}^{2}= \Vert {\mathcal {X}}_{1}\Vert _{2}^{2}+\Vert {\mathcal {X}}_{2}\Vert _{2}^{2}. \end{aligned}$$

Therefore, there exists an angle \(\varphi \) such that

$$\begin{aligned} \cos \varphi = \Vert {\mathcal {X}}_{1}\Vert _{2},\quad \sin \varphi = \Vert {\mathcal {X}}_{2}\Vert _{2}. \end{aligned}$$

Then

$$\begin{aligned} \begin{aligned}&({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\\&\quad =-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {E}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {X}} +{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )*_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\\&\qquad -({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}{\mathcal {X}}=\\&{\mathcal {F}}*_{{N}}{\mathcal {X}}_{1}+{\mathcal {G}}*_{{N}}{\mathcal {X}}_{2}+{\mathcal {H}}*_{{N}}{\mathcal {X}}_{1} \equiv {\mathcal {Y}}_{1}+{\mathcal {Y}}_{2}+{\mathcal {Y}}_{3}, \end{aligned} \end{aligned}$$

where \({\mathcal {Y}}_{1}={\mathcal {F}}*_{{N}}{\mathcal {X}}_{1},\ {\mathcal {Y}}_{2}={\mathcal {G}}*_{{N}}{\mathcal {X}}_{2},\ {\mathcal {Y}}_{3}={\mathcal {H}}*_{{N}}{\mathcal {X}}_{1} \).

Since

$$\begin{aligned} ({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}})*_{{N}}{\mathcal {B}}^{*}={\mathcal {O}}, \end{aligned}$$

it is easy to verify that \({\mathcal {Y}}_{3}\) is orthogonal to \({\mathcal {Y}}_{1}\) and \({\mathcal {Y}}_{2}\); therefore

$$\begin{aligned} \Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2}^{2}\le & {} \Vert {\mathcal {Y}}_{1}+{\mathcal {Y}}_{2}\Vert _{2}^{2}+\Vert {\mathcal {Y}}_{3}\Vert _{2}^{2} \\\le & {} \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}[(\Vert {\mathcal {X}}_{1}\Vert _{2}+\Vert {\mathcal {X}}_{2}\Vert _{2})^{2}+\Vert {\mathcal {X}}_{1}\Vert _{2}^{2}] \\= & {} \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}[(\cos \varphi +\sin \varphi )^{2}+\cos ^{2}\varphi ]\\= & {} \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}(3+2\sin 2\varphi +\cos 2\varphi )/2\\\le & {} \left( \frac{3+\sqrt{5}}{2}\right) \alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}. \end{aligned}$$

Therefore

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}= \max _{\Vert {\mathcal {X}}\Vert _{2}=1}\Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2} \le \frac{1+\sqrt{5}}{2}\alpha \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$

(2) If \(\mathrm {rshrank}({{\mathcal {A}}})= \mathrm {rshrank}({{\mathcal {B}}}) = {\mathfrak {K}} < {\mathfrak {I}}\), then, owing to

$$\begin{aligned} {\mathcal {B}}^\dagger = ({\mathcal {B}}^{*}*_{{N}}{\mathcal {B}})^{-1}*_{{N}}{\mathcal {B}}^{*}, \end{aligned}$$

we have \(({\mathcal {I}}-{\mathcal {B}}^\dagger *_{{N}}{\mathcal {B}}) = {\mathcal {O}}\); therefore, \({\mathcal {H}}={\mathcal {O}}\) and \({\mathcal {Y}}_{3}={\mathcal {O}}\). If \(\mathrm {rshrank}({{\mathcal {A}}}) = \mathrm {rshrank}({{\mathcal {B}}}) = {\mathfrak {I}} < {\mathfrak {K}}\), then, owing to

$$\begin{aligned} {\mathcal {A}}^\dagger = {\mathcal {A}}^{*}*_{{N}}({\mathcal {A}}*_{{N}}{\mathcal {A}}^{*})^{-1}, \end{aligned}$$

we have \(({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )={\mathcal {O}}\), so \({\mathcal {G}}={\mathcal {O}}\) and \({\mathcal {Y}}_{2}={\mathcal {O}}\). When either \({\mathcal {Y}}_{2}\) or \({\mathcal {Y}}_{3}\) is the zero tensor,

$$\begin{aligned} \Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2}^{2} \le 2\alpha ^{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}^{2}. \end{aligned}$$

Hence

$$\begin{aligned} \Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}= \max _{\Vert {\mathcal {X}}\Vert _{2}=1}\Vert ({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger )*_{{N}}{\mathcal {X}}\Vert _{2} \le \sqrt{2}\alpha \Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$

(3) In the case \(\mathrm {rshrank}({{\mathcal {A}}})= \mathrm {rshrank}({{\mathcal {B}}}) = {\mathfrak {I}} = {\mathfrak {K}}\), the arguments in (2) give \({\mathcal {G}}={\mathcal {H}}={\mathcal {O}}\), so the conclusion follows. \(\square \)
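The decomposition \({\mathcal {B}}^\dagger -{\mathcal {A}}^\dagger ={\mathcal {F}}+{\mathcal {G}}+{\mathcal {H}}\) underlying the proof, together with the resulting bound (4.6), can be checked numerically at the level of unfoldings. The sketch below is illustrative only: the data are random and of deficient, equal rank, so that case (1) with \(k=\frac{1+\sqrt{5}}{2}\) applies.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 6, 8, 3                                   # rank-deficient case: k = (1 + sqrt(5)) / 2

X, Y = rng.standard_normal((m, r)), rng.standard_normal((r, n))
A = X @ Y
B = (X + 1e-4 * rng.standard_normal((m, r))) @ (Y + 1e-4 * rng.standard_normal((r, n)))
E = B - A

A_p = np.linalg.pinv(A, rcond=1e-12)
B_p = np.linalg.pinv(B, rcond=1e-12)

# Decomposition B^dagger - A^dagger = F + G + H used in the proof
F = -B_p @ E @ A_p
G = B_p @ (np.eye(m) - A @ A_p)
H = -(np.eye(n) - B_p @ B) @ A_p
print(np.allclose(B_p - A_p, F + G + H))            # expected: True

# Perturbation bound (4.6)
delta = np.linalg.norm(A_p, 2) * np.linalg.norm(E, 2)
k = (1 + np.sqrt(5)) / 2                            # rshrank(A) < min(I, K)
lhs = np.linalg.norm(B_p - A_p, 2) / np.linalg.norm(A_p, 2)
print(lhs <= k * delta / (1 - delta))               # expected: True
```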

Next, we introduce the condition number of the Moore–Penrose inverse of the tensor \({\mathcal {A}}\):

$$\begin{aligned} {\mathbb {K}}_{2}({\mathcal {A}}) = \Vert {\mathcal {A}}\Vert _{2}\Vert {\mathcal {A}}^\dagger \Vert _{2}. \end{aligned}$$
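Since \(\Vert {\mathcal {A}}\Vert _{2}\) equals the largest singular value of the unfolding \(\mathrm {rsh}({\mathcal {A}})\) and \(\Vert {\mathcal {A}}^\dagger \Vert _{2}\) equals the reciprocal of its smallest positive singular value, \({\mathbb {K}}_{2}({\mathcal {A}})\) can be evaluated from a single SVD. A minimal sketch with an illustrative random tensor:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2, 2, 2))               # order-4 tensor
Amat = A.reshape(4, 4)                              # its unfolding rsh(A)

# K_2(A) = ||A||_2 * ||A^dagger||_2
K2 = np.linalg.norm(Amat, 2) * np.linalg.norm(np.linalg.pinv(Amat), 2)

# equivalently, the ratio of the largest to the smallest positive singular value of rsh(A)
s = np.linalg.svd(Amat, compute_uv=False)
s_pos = s[s > 1e-12 * s[0]]
print(np.isclose(K2, s_pos[0] / s_pos[-1]))         # expected: True
```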

Theorem 4.5

If the Condition (4.1) is satisfied, then

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}} \le k{\mathbb {K}}_{2}({\mathcal {A}})\frac{\frac{\Vert {\mathcal {E}}\Vert _{2}}{\Vert {\mathcal {A}}\Vert _{2}}}{1-{\mathbb {K}}_{2}({\mathcal {A}})\frac{\Vert {\mathcal {E}}\Vert _{2}}{\Vert {\mathcal {A}}\Vert _{2}}}. \end{aligned}$$

Proof

The statement can be verified using Theorem 4.4 and the definition of \({\mathbb {K}}_{2}({\mathcal {A}})\). \(\square \)

Theorem 4.5 shows that a perturbation \({\mathcal {E}}\) of \({\mathcal {A}}\) has little influence on \({\mathcal {A}}^\dagger \) when the condition number \({\mathbb {K}}_{2}({\mathcal {A}})\) is small, whereas for a large condition number \({\mathbb {K}}_{2}({\mathcal {A}})\) the influence of \({\mathcal {E}}\) on \({\mathcal {A}}^\dagger \) may be considerably larger.

5 Examples

Example 5.1

This example is aimed at verifying inequality (4.2). Let the tensor \({\mathcal {A}} = 10^3*\mathrm {rand}(2,2,2,2)\) be defined by

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&=\begin{bmatrix} 950.9152&\quad 400.0797\\ 722.3485&\quad 831.8713\end{bmatrix},\ {\mathcal {A}}(:,:,2,1) =\begin{bmatrix} 134.3383&\quad 84.2471\\ 60.4668&\quad 163.8983 \end{bmatrix},\\ {\mathcal {A}}(:,:,1,2)&=\begin{bmatrix} 324.2199&\quad 11.6810\\ 301.7268&\quad 539.9051\end{bmatrix},\ {\mathcal {A}}(:,:,2,2) =\begin{bmatrix} 95.3727&\quad 631.1412\\ 146.5149&\quad 859.3204\end{bmatrix}, \end{aligned} \end{aligned}$$

and let \({\mathcal {E}} = 10^{-1}*\mathrm {rand}(2,2,2,2)\) be defined by

$$\begin{aligned} \begin{aligned} {\mathcal {E}}(:,:,1,1)&=\begin{bmatrix} 0.0974&\quad 0.0997\\ 0.0571&\quad 0.0554\end{bmatrix},\ {\mathcal {E}}(:,:,2,1)&=\begin{bmatrix} 0.0515&\quad 0.0430\\ 0.0331&\quad 0.0492 \end{bmatrix},\\ {\mathcal {E}}(:,:,1,2)&=\begin{bmatrix} 0.0071&\quad 0.0065\\ 0.0888&\quad 0.0436\end{bmatrix},\ {\mathcal {E}}(:,:,2,2)&=\begin{bmatrix} 0.0827&\quad 0.0613\\ 0.0395&\quad 0.0819\end{bmatrix}. \end{aligned} \end{aligned}$$

Then, \({\mathcal {B}} = {\mathcal {A}}+{\mathcal {E}}\) is defined by

$$\begin{aligned} \begin{aligned} {\mathcal {B}}(:,:,1,1)&=\begin{bmatrix} 951.0126&\quad 400.1794\\ 722.4056&\quad 831.9267 \end{bmatrix},\ {\mathcal {B}}(:,:,2,1)&=\begin{bmatrix} 134.3899&\quad 84.2901\\ 60.4998&\quad 163.9475 \end{bmatrix},\\ {\mathcal {B}}(:,:,1,2)&=\begin{bmatrix} 324.2270&\quad 11.6875\\ 301.8156&\quad 539.9487 \end{bmatrix},\ {\mathcal {B}}(:,:,2,2)&=\begin{bmatrix} 95.4554&\quad 631.2026\\ 146.5543&\quad 859.4023\end{bmatrix}. \end{aligned} \end{aligned}$$

It holds that \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})=4\). In addition, an application of Huang et al. (2018, Algorithm 1) gives the Moore–Penrose inverse of \({\mathcal {A}}\):

$$\begin{aligned} \begin{aligned} {\mathcal {A}}^{\dagger }(:,:,1,1)&=\begin{bmatrix} -0.000384784660649&\quad -0.001099286197235\\ 0.013955577455799&\quad -0.001598581983579\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,1)&=\begin{bmatrix} 0.002737547517780&\quad 0.000508206301570\\ -0.021392949244820&\quad 0.001110875409922 \end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001227173049100&\quad -0.003076391592687\\ -0.002071095392062&\quad 0.001139922248389\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} -0.001325364668035&\quad 0.002294859505387\\ 0.003619787791101&\quad 0.000314492021375\end{bmatrix} \end{aligned} \end{aligned}$$

and the following Moore–Penrose inverse of \({\mathcal {B}}\):

$$\begin{aligned}\begin{aligned} {\mathcal {B}}^{\dagger }(:,:,1,1)&=\begin{bmatrix} -0.000385255308227&\quad -0.001098543826394\\ 0.013953167583606&\quad -0.001598698848909\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,1)&=\begin{bmatrix} 0.002738190383336&\quad 0.000507663916870\\ -0.021390844989888&\quad 0.001111108748350 \end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001227624787307&\quad -0.003075755111089\\ -0.002076686203945&\quad 0.001140238650466\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} -0.001325803821792&\quad 0.002294485476614\\ 0.003623245688721&\quad 0.000314224257971\end{bmatrix}. \end{aligned} \end{aligned}$$

In view of (2.5), it is easy to check that the tensor norms are equal to \(\Vert {\mathcal {A}}^\dagger \Vert _{2} =0.026095036211067\), \(\Vert {\mathcal {E}}\Vert _{2}=0.235145716909881\) and \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}=0.006136135997641 < 1\). Therefore, Condition (4.1) is satisfied. Then, \(\Vert {\mathcal {B}}^\dagger \Vert _{2} =0.026093083833995\) and \(\frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup }= 0.026256147502919.\) Hence, the inequality (4.2) in Lemma 4.1 is verified.
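The quantities reported in this example can be reproduced with a short NumPy sketch. This is only an illustration: the entries below are the truncated values displayed above, the unfolding groups the first two and the last two modes, and the printed norms therefore agree with the reported 15-digit values only up to the precision of the displayed data.

```python
import numpy as np

A = np.zeros((2, 2, 2, 2))
A[:, :, 0, 0] = [[950.9152, 400.0797], [722.3485, 831.8713]]
A[:, :, 1, 0] = [[134.3383,  84.2471], [ 60.4668, 163.8983]]
A[:, :, 0, 1] = [[324.2199,  11.6810], [301.7268, 539.9051]]
A[:, :, 1, 1] = [[ 95.3727, 631.1412], [146.5149, 859.3204]]

E = np.zeros((2, 2, 2, 2))
E[:, :, 0, 0] = [[0.0974, 0.0997], [0.0571, 0.0554]]
E[:, :, 1, 0] = [[0.0515, 0.0430], [0.0331, 0.0492]]
E[:, :, 0, 1] = [[0.0071, 0.0065], [0.0888, 0.0436]]
E[:, :, 1, 1] = [[0.0827, 0.0613], [0.0395, 0.0819]]

B = A + E
rsh = lambda T: T.reshape(4, 4)                     # unfolding: rows (i1, i2), columns (k1, k2)

norm_A_pinv = np.linalg.norm(np.linalg.pinv(rsh(A)), 2)
norm_E      = np.linalg.norm(rsh(E), 2)
norm_B_pinv = np.linalg.norm(np.linalg.pinv(rsh(B)), 2)
delta = norm_A_pinv * norm_E

print(norm_A_pinv, norm_E, delta)                   # approx. 0.0261, 0.2351, 0.0061
print(norm_B_pinv <= norm_A_pinv / (1 - delta))     # inequality (4.2); expected: True
```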

Example 5.2

The tensors in Example 5.1 are invertible. Example 5.2 is aimed at verifying inequality (4.2) in the singular tensor case. To this end, let \({\mathcal {A}}, {\mathcal {E}}\in {\mathbb {R}}^{(2\times 2)\times (2\times 2)}\) with

$$\begin{aligned} \begin{aligned} {\mathcal {A}}(:,:,1,1)&=10^2\cdot \begin{bmatrix} 0.985940927109977&\quad 1.682512984915278\\ 1.420272484319284&\quad 1.962489222569553\end{bmatrix},\ {\mathcal {A}}(:,:,2,1) =\begin{bmatrix} 0&\quad 0\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {A}}(:,:,1,2)&=10^2\cdot \begin{bmatrix} 8.929224052859770&\quad 5.557379427193866\\ 7.032232245562910&\quad 1.844336677576532\end{bmatrix},\ {\mathcal {A}}(:,:,2,2) =\begin{bmatrix} 0&\quad 0\\ 0&\quad 0\end{bmatrix}, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} {\mathcal {E}}(:,:,1,1)&=\begin{bmatrix} 0.055778896675488&\quad 0.016620356290215\\ 0.031342898993659&\quad 0.062249725927990\end{bmatrix},\ {\mathcal {E}}(:,:,2,1)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {E}}(:,:,1,2)&=\begin{bmatrix} 0.007399476957694&\quad 0.040238833269616\\ 0.068409606696201&\quad 0.098283520139395\end{bmatrix},\ {\mathcal {E}}(:,:,2,2)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0\end{bmatrix}. \end{aligned} \end{aligned}$$

Then, the tensor \({\mathcal {B}} = {\mathcal {A}}+{\mathcal {E}}\) is defined by

$$\begin{aligned} \begin{aligned} {\mathcal {B}}(:,:,1,1)&=10^2\cdot \begin{bmatrix} 0.986498716076732&\quad 1.682679188478180\\ 1.420585913309220&\quad 1.963111719828833 \end{bmatrix},\ {\mathcal {B}}(:,:,2,1)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {B}}(:,:,1,2)&=10^2\cdot \begin{bmatrix} 8.929298047629347&\quad 5.557781815526563\\ 7.032916341629872&\quad 1.845319512777926 \end{bmatrix},\ {\mathcal {B}}(:,:,2,2)&=\begin{bmatrix} 0&\quad 0\\ 0&\quad 0\end{bmatrix}. \end{aligned} \end{aligned}$$

It holds that \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})=2\). In addition, an application of Huang et al. (2018, Algorithm 1) gives the following Moore–Penrose inverse of \({\mathcal {A}}\):

$$\begin{aligned}\begin{aligned} {\mathcal {A}}^{\dagger }(:,:,1,1)&=\begin{bmatrix} -0.002139621590719&\quad 0.000961949275511\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,1)&=10^{-3}\cdot \begin{bmatrix} 0.154116174683625&\quad 0.400242571625946\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001721913469812&\quad 0.000005405961529\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {A}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} 0.004582706318709&\quad -0.000777570778470\\ 0&\quad 0 \end{bmatrix}. \end{aligned} \end{aligned}$$

Similarly

$$\begin{aligned}\begin{aligned} {\mathcal {B}}^{\dagger }(:,:,1,1)&=\begin{bmatrix}-0.002139080520305&\quad 0.000961905529252\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,1)&=10^{-3}\cdot \begin{bmatrix} 0.153469746601847&\quad 0.400351263240931\\ 0&\quad 0 \end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,1,2)&=\begin{bmatrix} 0.001720928456951&\quad 0.000005485433595\\ 0&\quad 0\end{bmatrix},\\ {\mathcal {B}}^{\dagger }(:,:,2,2)&=\begin{bmatrix} 0.004582730894265&\quad -0.000777786686340\\ 0&\quad 0\end{bmatrix}. \end{aligned} \end{aligned}$$

Following (2.5), it is easy to check \(\Vert {\mathcal {A}}^\dagger \Vert _{2} =0.005446932213520\), \(\Vert {\mathcal {E}}\Vert _{2}=0.149158220173799\) and \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}=8.124547143759799\cdot 10^{-4} < 1\). Therefore, Condition (4.1) is satisfied. Then, \(\Vert {\mathcal {B}}^\dagger \Vert _{2} = 0.005446449437497\) and \(\frac{\Vert {\mathcal {A}}^\dagger \Vert _{2}}{1 - \bigtriangleup }= 0.005451361197625,\) which confirms the inequality (4.2) in Lemma 4.1.

Example 5.3

This example is a continuation of Example 5.1, intended to verify inequality (4.3) proved in Theorem 4.2. For the tensors \({\mathcal {A}}\) and \({\mathcal {E}}\) defined in Example 5.1, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.834195478378557\cdot 10^{-4} \end{aligned}$$

and

$$\begin{aligned} \left( 1+\frac{1}{1-\bigtriangleup }+\frac{1}{(1-\bigtriangleup )^{2}}\right) \bigtriangleup = 0.018295682536782. \end{aligned}$$

Hence, inequality (4.3) of Theorem 4.2 is valid.

Example 5.4

This example is a continuation of Example 5.2, intended to verify inequality (4.3) proved in Theorem 4.2. For the tensors \({\mathcal {A}}\) and \({\mathcal {E}}\) defined in Example 5.2, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.394253885112045\cdot 10^{-4} \end{aligned}$$

and

$$\begin{aligned} \left( 1+\frac{1}{1-\bigtriangleup }+\frac{1}{(1-\bigtriangleup )^{2}}\right) \bigtriangleup = 0.002435384431426. \end{aligned}$$

Hence, inequality (4.3) of Theorem 4.2 is confirmed.

Example 5.5

We shall again use the setting of Example 5.2 to verify the validity of equality (4.4).

After appropriate calculations, one can verify

$$\begin{aligned} \begin{aligned}&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,1,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} -0.489468003874380&\quad 0.346424530042466\\ 0.006594102139740&\quad 0.970661633334091\end{bmatrix},\\&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,2,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} 0.441733836613403&\quad -0.163043151723344\\ 0.084143275517548&\quad -0.629731697791430\end{bmatrix},\\&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,1,2) =10^{-3}\\&\quad \cdot \begin{bmatrix} 0.035216688972592&\quad -0.046360736953049\\ -0.013384105875716&\quad -0.105126013940471\end{bmatrix},\\&\left( {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\right) (:,:,2,2) =10^{-4}\\&\quad \begin{bmatrix} -0.375707154131807&\quad 0.341422001841479\\ 0.050538644491456&\quad 0.869372625158238 \end{bmatrix}. \end{aligned} \end{aligned}$$

In addition

$$\begin{aligned} \begin{aligned}&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,1,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} 0.489668212727348&\quad -0.346613307126986\\ -0.006685674852000&\quad -0.970692034327758\end{bmatrix},\\&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,2,1) =10^{-4}\\&\quad \cdot \begin{bmatrix} 0.441825409325317&\quad 0.163138141635252\\ -0.084077978807235&\quad 0.629669045010689\end{bmatrix},\\&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,1,2) =10^{-3}\\&\quad \cdot \begin{bmatrix} -0.035235566680988&\quad 0.046380141446686\\ 0.013393604866949&\quad 0.105126135280298\end{bmatrix},\\&\left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) (:,:,2,2) =10^{-4}\\&\quad \begin{bmatrix} 0.375676753138765&\quad -0.341420788443209\\ -0.050601297272405&\quad -0.868951121121841 \end{bmatrix}. \end{aligned} \end{aligned}$$

Hence

$$\begin{aligned} \begin{aligned} \Vert {\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _2&= \Vert \left( {\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {B}}*_{{N}}{\mathcal {B}}^\dagger )\right) \Vert _2\\&= 2.0813844590544\cdot 10^{-4}. \end{aligned} \end{aligned}$$

Example 5.6

This example is a continuation of Example 5.2, with the aim of verifying the validity of inequality (4.5). It is possible to compute

$$\begin{aligned} \Vert {\mathcal {G}}\Vert _2=\Vert {\mathcal {B}}^\dagger *_{{N}}({\mathcal {I}}-{\mathcal {A}}*_{{N}}{\mathcal {A}}^\dagger )\Vert _2=1.132716947645736\cdot 10^{-6}. \end{aligned}$$

In addition, \(\Vert {\mathcal {E}}\Vert _{2}\Vert {\mathcal {A}}^\dagger \Vert _{2} \Vert {\mathcal {B}}^\dagger \Vert _{2}=4.424993522104842\cdot 10^{-6}.\) Therefore, the inequality (4.5) is verified.

Example 5.7

In this example, we verify cases (1)–(3) of Theorem 4.4.

Case (1) The tensors \({\mathcal {A}}, {\mathcal {E}}\) and \({\mathcal {B}}\) are reused from Example 5.2. It can be verified that \(\mathrm {rshrank}({{\mathcal {A}}})=2 < 4=\min ({\mathfrak {I}}, {\mathfrak {K}})\). From Example 5.4, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.394253885112045\cdot 10^{-4} \end{aligned}$$

and since \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}=8.124547143759799\cdot 10^{-4}\) (see Example 5.2), it follows

$$\begin{aligned} \frac{1+\sqrt{5}}{2}\cdot \frac{\bigtriangleup }{1-\bigtriangleup } = 0.001315648246801. \end{aligned}$$

Hence, inequality (4.6) of Theorem 4.4 is valid.

Case (2) We consider the tensors \({\mathcal {A}}, {\mathcal {E}}\) and \({\mathcal {B}}\) from Example 5.1. It is clear that \(\mathrm {rshrank}({{\mathcal {A}}})= 4=\min ({\mathfrak {I}}, {\mathfrak {K}})\). From Example 5.3, we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.834195478378557\cdot 10^{-4}, \end{aligned}$$

and since \(\bigtriangleup = \Vert {\mathcal {A}}^\dagger \Vert _{2}\Vert {\mathcal {E}}\Vert _{2}= 0.006136135997641\) (see Example 5.1), we have

$$\begin{aligned} \sqrt{2}\cdot \frac{\bigtriangleup }{1-\bigtriangleup } = 0.008731383706299. \end{aligned}$$

Hence, inequality (4.6) of Theorem 4.4 is valid.

Case (3) We consider the tensors \({\mathcal {A}}, {\mathcal {E}}\) and \({\mathcal {B}}\) from Example 5.1. Then, \(\mathrm {rshrank}({{\mathcal {A}}})= 4={\mathfrak {I}}={\mathfrak {K}}\). As in the previous case [Case (2)], we have

$$\begin{aligned} \frac{\Vert {\mathcal {B}}^\dagger - {\mathcal {A}}^\dagger \Vert _{2}}{\Vert {\mathcal {A}}^\dagger \Vert _{2}}=2.834195478378557\cdot 10^{-4}, \end{aligned}$$

and

$$\begin{aligned} 1\cdot \frac{\bigtriangleup }{1-\bigtriangleup } = 0.006174020627866. \end{aligned}$$

Hence, inequality (4.6) of Theorem 4.4 is valid.
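The three cases of Theorem 4.4 can be wrapped in a small helper that selects the constant \(k\) from the rank of the unfolding and evaluates the right-hand side of (4.6). The function names below are hypothetical and the routine is only a sketch; it assumes, as in Theorem 4.4, that \(\mathrm {rshrank}({{\mathcal {A}}})=\mathrm {rshrank}({{\mathcal {B}}})\) and \(\triangle <1\).

```python
import numpy as np

def wedin_k(rank, m, n):
    """Constant k of Theorem 4.4, selected from the rank of the unfolding rsh(A) of size m x n."""
    if rank == m == n:
        return 1.0
    if rank == min(m, n):
        return np.sqrt(2.0)
    return (1.0 + np.sqrt(5.0)) / 2.0

def bound_46(A_unf, E_unf):
    """Right-hand side of (4.6) for the unfoldings of A and E = B - A."""
    A_p = np.linalg.pinv(A_unf)
    delta = np.linalg.norm(A_p, 2) * np.linalg.norm(E_unf, 2)
    k = wedin_k(np.linalg.matrix_rank(A_unf), *A_unf.shape)
    return k * delta / (1.0 - delta)

# illustrative check with random full-rank square data (case (3): k = 1)
rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
E = 1e-3 * rng.standard_normal((4, 4))
B = A + E
lhs = np.linalg.norm(np.linalg.pinv(B) - np.linalg.pinv(A), 2) / np.linalg.norm(np.linalg.pinv(A), 2)
print(lhs <= bound_46(A, E))                        # expected: True
```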

6 Concluding remarks

The aim of this paper is to generalize some results of the perturbation theory for the matrix pseudoinverse to tensors. For this purpose, we derive several useful representations and introduce the required notions. The spectral norm of even-order tensors is given a computationally effective definition and investigated. In addition, useful representations of \({{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) and \({{\mathcal {I}}}-{{\mathcal {A}}}*_N{{\mathcal {A}}}^\dagger \) are derived. As a result, we establish perturbation bounds for the Moore–Penrose inverse of a tensor via the Einstein product. Unlike previously exploited approaches, which were developed either in the tensor or in the matrix setting alone, our approach relies on an exact transition between the two spaces. In this way, it is possible to extend many known results from the matrix case to the multiarray case. The results derived in the current research extend the classical matrix results of Stewart (1977) and Wedin (1973). It is shown that the influence of a perturbation of the tensor depends on a precisely defined condition number. Illustrative numerical examples confirm the derived theoretical results.

Recently, Ji and Wei (2017, 2018) investigated the weighted Moore–Penrose inverse and the Drazin inverse of even-order tensors with the Einstein product. It is natural to investigate possible extensions of the derived results to perturbation bounds for the weighted Moore–Penrose inverse and the Drazin inverse of tensors via the Einstein product.