1 Introduction

Tensors, as high-dimensional generalizations of matrices and vectors, provide a powerful way to express the complex interactions present in higher-order data. In recent years, tensor analysis has played an important role in a wide range of real-world applications [3, 10,11,12, 14, 34]. Among them, one important problem is the tensor completion problem, which aims to estimate the missing values of an observed tensor, e.g., in color image or video inpainting [2, 4, 8, 19], hyperspectral image recovery [5, 15, 21], magnetic resonance image (MRI) recovery [6, 32], and higher-order web link analysis [18].

The low-rank constraint has become a powerful tool for recovering higher-order tensors with missing entries. Mathematically, the low-rank tensor completion (LRTC) problem can be formulated as

$$\begin{aligned} \begin{aligned}&\min _{\mathcal {M}} \quad \text {rank}(\mathcal {M})\\&s.t. \quad \mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T}, \end{aligned} \end{aligned}$$
(1)

where \(\mathcal {M}\in \mathbb {R}^{n_{1}\times \cdots \times n_{j}}\) is the underlying tensor, \(\mathcal {T}\in \mathbb {R}^{n_{1}\times \cdots \times n_{j}}\) is the observed tensor, \(\Omega \) is the index set of observed entries, and \(\mathcal {P}_{\Omega }(\cdot )\) is the projection operator that keeps the entries in \(\Omega \) and zeros out the others. However, the rank of a tensor is not uniquely defined; common choices include the CANDECOMP/PARAFAC rank and the Tucker rank [17], and the corresponding minimization problems are generally NP-hard [9, 26]. To tackle this difficulty, Liu et al. [22] first defined the nuclear norm of a tensor based on its mode-k unfoldings [17]

$$\begin{aligned} \begin{aligned}&\min _{\mathcal {M}} \quad \Vert \mathcal {M}\Vert _{*}=\sum _{k=1}^{j}\alpha _{k}\Vert \mathbf M _{(k)}\Vert _{*}\\&s.t. \quad \mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T}, \end{aligned} \end{aligned}$$
(2)

where \(\mathbf M _{(k)}\in \mathbb {R}^{n_{k}\times \prod _{d\ne k} n_{d}}\) is the mode-k unfolding of \(\mathcal {M}\) and the \(\alpha _{k}\)s are positive constants satisfying \(\sum _{k=1}^{j}\alpha _{k}=1\). Nuclear norm minimization methods for solving problem (2) have to calculate the singular value decomposition (SVD) of all unfolding matrices in every iteration, and therefore suffer from high computational cost. To handle large-scale problems, Xu et al. [35] applied low-rank matrix factorization to every mode unfolding, i.e., they factorized each mode-k matricization \(\mathbf M _{(k)}\) into the product of two smaller matrices \(\mathbf X _{k}\) and \(\mathbf Y _{k}\), and proposed the following model:

$$\begin{aligned} \begin{aligned} \min _{\mathbf{X _{k}}, \mathbf{Y _{k}}, \mathcal {M}} \quad&\sum _{k=1}^{j}\alpha _{k}\Vert \mathbf X _{k}{} \mathbf Y _{k}-\mathbf M _{(k)}\Vert _{F}^{2}\\ s.t. \quad&\mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T}, \end{aligned} \end{aligned}$$
(3)

where \(\mathbf X _{k}\in \mathbb {R}^{n_{k}\times r_{k}}\), \(\mathbf Y _{k}\in \mathbb {R}^{r_{k}\times \prod _{d\ne k} n_{d}}\), and \(r_{k}\) is the estimated rank of the matrix \(\mathbf M _{(k)}\).

As the matrix \(\mathbf M _{(k)}\) is constructed by an unbalanced matricization scheme (one mode versus the rest) [7, 22], \(rank(\mathbf M _{(k)})\) only captures the correlation between a single mode and the remaining modes of the tensor. Therefore, existing methods based on the Tucker rank may not be suitable for completing higher-order tensors (\(j>3\)) [16, 28].

Recently, the tensor train (TT) rank, which captures the global correlation of a tensor thanks to a well-balanced matricization scheme, was proposed [27, 29] as

$$\begin{aligned} \text {rank}_{tt}(\mathcal {M})=\big (rank(\mathbf M _{[1]}),\ldots ,rank(\mathbf M _{[j-1]})\big ), \end{aligned}$$
(4)

where \(\mathbf M _{[k]}\in \mathbb {R}^{(\prod _{d=1}^{k}n_{d}) \times (\prod _{d=k+1}^{j}n_{d})}\) is the mode-k canonical matricization of the tensor \(\mathcal {M}\). Bengua et al. [1] applied the TT rank to color image and video completion and proposed two TT-rank-based methods for LRTC. The first one, simple low-rank tensor completion via tensor train (SiLRTC-TT), minimizes the TT nuclear norm, i.e.,

$$\begin{aligned} \begin{aligned} \min _{\mathcal {M}} \quad&\sum _{k=1}^{j-1}\alpha _{k}\Vert \mathbf M _{[k]}\Vert _{*}\\ s.t. \quad&\mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T}, \end{aligned} \end{aligned}$$
(5)

where the constants \(\alpha _{k}\) satisfy \(\alpha _{k} \ge 0\) and \(\sum _{k=1}^{j-1}\alpha _{k}=1\). The second one, tensor completion by parallel matrix factorization via tensor train (TMac-TT), uses matrix factorization to approximate the TT rank of a tensor, i.e.,

$$\begin{aligned} \begin{aligned} \min _{\mathbf{X _{k}}, \mathbf{Y _{k}}, \mathcal {M}} \quad&\sum _{k=1}^{j-1}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}{} \mathbf Y _{k}-\mathbf M _{[k]}\Vert _{F}^{2}\\ s.t. \quad&\mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T}, \end{aligned} \end{aligned}$$
(6)

where \(\mathbf X _{k}\in \mathbb {R}^{(\prod _{d=1}^{k}n_{d})\times r_{k}}\), \(\mathbf Y _{k}\in \mathbb {R}^{r_{k}\times (\prod _{d=k+1}^{j}n_{d})}\), and \(r_{k}\) is the rank of the matrix \(\mathbf M _{[k]}\). The TT rank is more suitable for higher-order tensors owing to its ability to capture the global correlation of a tensor. To handle third-order tensor data, SiLRTC-TT and TMac-TT use a tensor augmentation scheme known as \(ket \ augmentation\) (KA), which, however, causes block-artifacts [24] in the restored images.

KA uses an appropriate block structured addressing scheme to cast a lower-order tensor into a higher-order tensor. However, this scheme does not take the local smoothness between adjacent blocks into account, so block-artifacts appear in the restored images. These block-artifacts can be seen in the results for the completion of the Lena image with \(90\%\) missing entries in Fig. 1b, c. To reduce them, we need to smooth the edges between blocks. Total variation (TV), one of the most popular regularizers for characterizing the piecewise smoothness prior, has been shown to preserve edges well in image processing [13, 25, 31, 33, 36, 37]. Motivated by these works, we introduce TV into the LRTC model (6),

$$\begin{aligned} \begin{aligned} \min _{\mathbf{X _{k}}, \mathbf{Y _{k}}, \mathcal {M}} \quad&\sum _{k=1}^{j-1}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}{} \mathbf Y _{k}-\mathbf M _{[k]}\Vert _{F}^{2}+\lambda \text {TV}(\mathcal {M})\\ s.t. \quad&\mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T}, \end{aligned} \end{aligned}$$
(7)

where \(\alpha _{k}\) satisfies \(\alpha _{k} \ge 0\) and \(\sum _{k=1}^{j-1}\alpha _{k}=1\), \(\lambda \) is a regularization parameter, and \(\text {TV}(\mathcal {M})\) is the total variation of \(\mathcal {M}\) in spatial dimensions (see details in Sect. 3.1). The proposed model is named tensor completion via matrix factorization based on tensor train rank and total variation (MF-TTTV). From Fig. 1, it is clear that our method can effectively alleviate block-artifacts compared with SiLRTC-TT and TMac-TT.

Fig. 1

The results of Lena with \(90\%\) missing entries by different methods. From left to right: the original image and the results by SiLRTC-TT, TMac-TT, and MF-TTTV, respectively

The contributions of this paper are mainly twofold: (1) we propose a new tensor completion model combining low-rank matrix factorization based on the TT rank with TV regularization, which simultaneously exploits the low-rankness and the piecewise smoothness prior of the underlying tensor; (2) we develop a block successive upper-bound minimization (BSUM) algorithm to solve the proposed model. Experiments demonstrate that our method performs better than the compared methods and effectively reduces the block-artifacts caused by using KA.

The outline of this paper is as follows. Section 2 reviews preliminary knowledge about tensors, the tensor train rank, the proximal operator, and KA. Section 3 describes the model formulation and an efficient BSUM-based algorithm with convergence analysis. Section 4 demonstrates the effectiveness of the proposed method through extensive numerical experiments. Section 5 summarizes this paper.

2 Preliminary

2.1 Tensor Basics

A tensor is a high-dimensional array and its order (or number of modes) is the number of its dimensions. We denote scalars by lowercase letters, e.g., z, vectors by boldface lowercase letters, e.g., \(\mathbf z \), matrices by boldface capital letters, e.g., \(\mathbf Z \), and tensors by calligraphic letters, e.g., \(\mathcal {Z}\). A jth-order tensor is defined as \(\mathcal {Z}\in \mathbb {R}^{n_{1}\times \cdots \times n_{j}}\), whose \((i_{1},\ldots ,i_{j})\)-th component is denoted as \(z_{i_{1},\ldots ,i_{j}}\).

The inner product of two tensors \(\mathcal {X}\) and \(\mathcal {Y}\) with same size is defined as

$$\begin{aligned} \langle \mathcal {X}, \mathcal {Y}\rangle = \sum _{i_{1},\ldots ,i_{j}} x_{i_{1},\ldots ,i_{j}}\cdot y_{i_{1},\ldots ,i_{j}}. \end{aligned}$$

The Frobenius norm of a jth-order tensor \(\mathcal {Z}\) is \(\Vert \mathcal {Z}\Vert _{F} = \sqrt{\langle \mathcal {Z}, \mathcal {Z}\rangle }\).

The mode-k unfolding of a tensor \(\mathcal {Z}\) is denoted as \(\mathbf Z _{(k)}\in \mathbb {R}^{n_{k}\times \prod _{d\ne k} n_{d}}\), where the tensor element \((i_{1},\ldots ,i_{j})\) maps to the element \((i_{k}, b)\) of matrix \(\mathbf Z _{(k)}\) satisfying

$$\begin{aligned} b = 1+\sum _{d=1,d\ne k}^{j}\ (i_{d} - 1) j_{d} \quad \text {with} \quad j_{d} = \prod _{t = 1, t\ne k}^{d-1} n_{t}. \end{aligned}$$
(8)

We denote the mode-k unfolding of a tensor \(\mathcal {Z}\) as \(\mathbf Z _{(k)}=\text {unfold}_{(k)}(\mathcal {Z})\). The inverse operator of unfolding is denoted as “fold”, i.e., \(\mathcal {Z} = \text {fold}_{(k)} (\mathbf Z _{(k)})\).
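As a concrete illustration, a minimal MATLAB sketch of these two operators is given below; the helper names unfold_k and fold_k are ours, and the index ordering follows (8).

    function Zk = unfold_k(Z, k)
    % Mode-k unfolding following (8): bring mode k to the front, then flatten.
    n  = size(Z);
    Zk = reshape(permute(Z, [k, 1:k-1, k+1:numel(n)]), n(k), []);
    end

    function Z = fold_k(Zk, k, n)
    % Inverse of unfold_k; n is the size vector of the original tensor.
    Z = ipermute(reshape(Zk, n([k, 1:k-1, k+1:numel(n)])), [k, 1:k-1, k+1:numel(n)]);
    end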

The mode-k canonical matricization of a tensor \(\mathcal {Z}\) is defined as \(\mathbf Z _{[k]}\in \mathbb {R}^{(\prod _{d=1}^{k}n_{d}) \times (\prod _{d=k+1}^{j}n_{d})}\), where the tensor element \((i_{1},\ldots ,i_{j})\) maps to the element (a, b) of matrix \(\mathbf Z _{[k]}\) satisfying

$$\begin{aligned} a = 1+\sum _{d=1}^{k}\left( (i_{d}-1)\prod _{t=1}^{d-1}n_{t}\right) \quad \text {and} \quad b = 1+\sum _{d=k+1}^{j}\left( (i_{d}-1)\prod _{t=k+1}^{d-1}n_{t}\right) . \end{aligned}$$
(9)

In MATLAB, it can be implemented by the reshape function

$$\begin{aligned} \mathbf Z _{[k]}=\text {reshape}_{[k]}\left( \mathcal {Z}, \Pi _{d=1}^{k}n_{d}, \Pi _{d=k+1}^{j}n_{d}\right) . \end{aligned}$$

We denote the mode-k canonical matricization of a tensor \(\mathcal {Z}\) as \(\mathbf Z _{[k]}=\text {reshape}_{[k]}(\mathcal {Z})\). The inverse operator of reshape is denoted as “unreshape”, i.e., \(\mathcal {Z}=\text {unreshape}_{[k]}(\mathbf Z _{[k]})\).

The TT rank is defined as the vector

$$\begin{aligned} \text {rank}_{tt}(\mathcal {Z})=\big (rank(\mathbf Z _{[1]}),\ldots ,rank(\mathbf Z _{[j-1]})\big ). \end{aligned}$$

The tensor \(\mathcal {Z}\) is said to be low-rank if \(\mathbf Z _{[k]}\) is low-rank for all k.

The notations are listed in Table 1.

Table 1 Tensor notations

2.2 Operators

Let \(\Omega \) be an index set; then the projection operator \(\mathcal {P}_{\Omega }(\mathcal {Z})\) denotes the tensor that copies the entries of \(\mathcal {Z}\) in the set \(\Omega \) and sets the remaining entries to zero, i.e.,

$$\begin{aligned} (\mathcal {P}_{\Omega }(\mathcal {Z}))_{i_{1},\ldots ,i_{j}} = \left\{ \begin{array}{ll} z_{i_{1},\ldots ,i_{j}}, &{}\quad (i_{1},\ldots ,i_{j})\in \Omega , \\ 0, &{}\quad \text {otherwise}.\\ \end{array} \right. \end{aligned}$$
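A minimal MATLAB sketch of this operator, assuming \(\Omega \) is stored as a vector of linear indices (an implementation detail not specified above; the helper name proj_omega is ours), is:

    function PZ = proj_omega(Z, Omega)
    % Keep the entries of Z indexed by Omega and set the remaining entries to zero.
    PZ = zeros(size(Z));
    PZ(Omega) = Z(Omega);
    end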

The proximal operator of a given convex function f(x) is defined as

$$\begin{aligned} \text {prox}_{f}(y) = \mathop {\mathrm{arg}\,\mathrm{min}}_{x} f(x)+\frac{\rho }{2}\Vert x-y\Vert ^{2}, \end{aligned}$$
(10)

where \(\rho \) is a positive proximal constant. The problem \(\min _{x}\{{f(x)}\}\) is equivalent to \(\min _{x,y}\{{f(x)}+\frac{\rho }{2}\Vert x-y\Vert ^{2}\}\). Thus, one can obtain a minimizer of f(x) by iteratively solving \(\text {prox}_{f}(x^{l})\), where \(x^{l}\) is the latest update of x. The proximal operator is attractive because the objective function in (10) is strongly convex with respect to x as long as f(x) is convex.
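For instance, for the scalar function \(f(x)=\lambda |x|\), the proximal operator has the closed form of soft thresholding,

$$\begin{aligned} \text {prox}_{f}(y) = \mathop {\mathrm{arg}\,\mathrm{min}}_{x}\ \lambda |x|+\frac{\rho }{2}(x-y)^{2} = \text {sign}(y)\max \left\{ |y|-\frac{\lambda }{\rho },0\right\} , \end{aligned}$$

which is the one-dimensional counterpart of the 2-D shrinkage formula (32) used in Sect. 3.2.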

Fig. 2

Graphical illustration of a structured block addressing procedure. a Example addressing for a \(2\times 2\times 3\) color image. b Example addressing for a \(2^{2}\times 2^{2}\times 3\) color image (Color figure online)

2.3 Ket Augmentation

In this subsection, we introduce ket augmentation (KA), which represents a lower-order tensor by a higher-order one without changing the number of entries in the tensor. The TT decomposition can then effectively exploit the local structure of the data by considering the low-rankness of the augmented tensor: even when the original tensor is only weakly correlated, its augmented tensor can have low TT rank [20].

KA was first introduced in [20] to cast a grayscale image into a \(real \ ket \ state\) of a Hilbert space by using an appropriate block structured addressing. In [1], the authors extended KA to third-order tensors. Next, we describe how KA works on a color image. Let the third-order tensor \(\mathcal {M}\in \mathbb {R}^{n_{1}\times n_{2}\times n_{3}}\) represent the color image, where \(n_{1}\times n_{2} = 2^{n}\times 2^{n}\) is the number of pixels in the image and \(n_{3}=3\) is the number of color channels. Let us start with an initial block, labeled \(i_{1}\), containing \(2 \times 2\) pixels corresponding to a single color j (\(j = 1, 2, 3\) denotes the red, green, and blue channels, respectively). This block can be represented as

$$\begin{aligned} {\mathcal {{\tilde{M}}}}=\sum ^{4}_{i_{1}=1}c_{i_{1}j}{} \mathbf e _{i_{1}}, \end{aligned}$$

where \(c_{i_{1}j}\) is the pixel value corresponding to color j and \(\mathbf e _{i_{1}}\) is taken from the orthonormal basis \(\mathbf e _{1} = (1, 0, 0, 0)\), \(\mathbf e _{2} = (0, 1, 0, 0)\), \(\mathbf e _{3} = (0, 0, 1, 0)\), \(\mathbf e _{4} = (0, 0, 0, 1)\). Taking all three color channels together, the block is represented as

$$\begin{aligned} {\mathcal {\tilde{M}}}=\sum ^{4}_{i_{1}=1}\sum ^{3}_{j=1}c_{i_{1}j}{} \mathbf e _{i_{1}}\otimes \mathbf u _{j}, \end{aligned}$$

where \(\mathbf u _{j}\) is taken from the orthonormal basis \(\mathbf u _{1} = (1, 0, 0)\), \(\mathbf u _{2} = (0, 1, 0)\), \(\mathbf u _{3} = (0, 0, 1)\), and \(\otimes \) denotes the tensor product [17]. Let us now consider a larger block (labeled \(i_{2}\)) that consists of four inner sub-blocks for each color j, as shown in Fig. 2b. This yields the new block

$$\begin{aligned} {\mathcal {\tilde{Z}}}=\sum ^{4}_{i_{2}=1}\sum ^{4}_{i_{1}=1}\sum ^{3}_{j=1}c_{i_{2}i_{1}j}{} \mathbf e _{i_{2}}\otimes \mathbf e _{i_{1}}\otimes \mathbf u _{j}. \end{aligned}$$

The color image can be cast into an \((n+1)\)th-order tensor as follows:

$$\begin{aligned} {\mathcal {\tilde{M}}}=\sum ^{4}_{i_{1},\ldots ,i_{n}=1}\sum ^{3}_{j=1}c_{i_{n},\ldots ,i_{1}j}{} \mathbf e _{i_{n}}\otimes \cdots \otimes \mathbf e _{i_{1}}\otimes \mathbf u _{j}. \end{aligned}$$
(11)

Based on KA, TT-based methods can effectively exploit the local structure of visual data and obtain a better low-rank representation. After the augmented data have been processed by the completion method, the reverse operation of KA is performed to recover the original image form.
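As a simplified MATLAB sketch of KA for a \(2^{n}\times 2^{n}\times 3\) image, one can split each spatial mode into n binary factors, interleave them so that each row/column pair addresses one \(2\times 2\) block, and merge each pair into a mode of size 4. The helper name ka_augment is ours, and the exact ordering of indices inside each block may differ from the implementation in [1].

    function T = ka_augment(img, n)
    % Cast a 2^n x 2^n x 3 image into an (n+1)th-order tensor of size 4 x ... x 4 x 3.
    T = reshape(img, [2*ones(1, n), 2*ones(1, n), 3]);   % split rows and columns into binary factors
    perm = [reshape([1:n; n+1:2*n], 1, []), 2*n+1];      % interleave row and column factors
    T = permute(T, perm);
    T = reshape(T, [4*ones(1, n), 3]);                   % each row/column pair becomes one mode of size 4
    end

The inverse of KA simply applies these steps in reverse order after completion.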

3 The Proposed Model and Algorithm

3.1 The Proposed Model

The objective function of our model is:

$$\begin{aligned} f(\mathbf X ,\mathbf Y ,\mathcal {M}) = \sum _{k=1}^{j-1}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}{} \mathbf Y _{k}-\mathbf M _{[k]}\Vert _{F}^{2}+\lambda \text {TV}(\mathcal {M}), \end{aligned}$$
(12)

where \(\mathbf X = (\mathbf X _{1},\ldots ,\mathbf X _{j-1})\), \(\mathbf Y = (\mathbf Y _{1},\ldots ,\mathbf Y _{j-1})\), and \(\lambda \) is the regularization parameter. In the proposed model, the first term is the fidelity term and the second term is the regularization term.

In the fidelity term, matrices \(\mathbf M _{[k]}\), obtained by a well-balanced matricization scheme, capture the global information of the underlying tensor \(\mathcal {M}\). This fidelity term enhances the low-rankness of the underlying tensor \(\mathcal {M}\).

In the regularization term, \(\text {TV}(\mathcal {M})\) denotes the isotropic TV in spatial dimensions. This term aims at enhancing the piecewise smoothness in spatial dimensions. According to the rules of unfolding a tensor (8), we find that the \(\text {mode}\)-1 unfolding matrix \(\mathbf M _{(1)}\) has a good structure:

$$\begin{aligned} \mathbf M _{(1)}=[\mathbf M ^{(1)}, \ldots , \mathbf M ^{(\hat{s})}]\in \mathbb {R}^{n_{1}\times s}, \end{aligned}$$
(13)

where \(\hat{s}=\prod _{k=3}^{j}n_{k}\), \(s=n_{2}\times \hat{s}\), and \(\mathbf M ^{(i)}\in \mathbb {R}^{n_{1}\times n_{2}}\) is the i-th spatial slice of the tensor \(\mathcal {M}\).

In our work, the isotropic TV is defined as follows:

$$\begin{aligned} \text {TV}(\mathcal {M})=\sum _{i=1}^{\hat{s}}\sum _{m=1}^{n_{1}}\sum _{n=1}^{n_{2}}\sqrt{[D_{m,n}^{1}{} \mathbf M ^{(i)}]^{2}+[D_{m,n}^{2}{} \mathbf M ^{(i)}]^{2}}, \end{aligned}$$
(14)

where \(D_{m,n}^{1}{} \mathbf M ^{(i)}\) and \(D_{m,n}^{2}{} \mathbf M ^{(i)}\) are the horizontal and vertical gradient values at the (m, n)-th pixel of the matrix \(\mathbf M ^{(i)}\), respectively, and \(D_{m,n}^{1}\) and \(D_{m,n}^{2}\) are the discrete horizontal and vertical gradient operators, respectively.
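A short MATLAB sketch of (14), assuming periodic boundary conditions for the difference operators (consistent with the BCCB structure used in Sect. 3.2) and using our own helper name isotropic_tv, is:

    function tv = isotropic_tv(M)
    % Isotropic TV (14) over the spatial slices M^(i) of the tensor M.
    sz = size(M);
    Mr = reshape(M, sz(1), sz(2), []);       % stack the spatial slices
    tv = 0;
    for i = 1:size(Mr, 3)
        Mi = Mr(:, :, i);
        D1 = Mi(:, [2:end, 1]) - Mi;         % horizontal differences (circular)
        D2 = Mi([2:end, 1], :) - Mi;         % vertical differences (circular)
        tv = tv + sum(sum(sqrt(D1.^2 + D2.^2)));
    end
    end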

3.2 The Proposed Algorithm

It is clear that the objective function (12) is not jointly convex in \((\mathbf X ,\mathbf Y ,\mathcal {M})\), but it is convex with respect to each of \(\mathbf X \), \(\mathbf Y \), and \(\mathcal {M}\) individually. To solve this nonconvex problem effectively, we develop a BSUM-based algorithm.

Let \(\mathcal {Z}=(\mathbf X ,\mathbf Y ,\mathcal {M})\) and \(\mathcal {Z}^{l}=(\mathbf X ^{l},\mathbf Y ^{l},\mathcal {M}^{l})\). By utilizing the proximal operator, (7) can be updated through the following:

$$\begin{aligned} \mathcal {Z}^{l+1}=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {Z}}h(\mathcal {Z},\mathcal {Z}^{l})=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {Z}}f(\mathcal {Z})+\frac{\rho }{2}\Vert \mathcal {Z}-\mathcal {Z}^{l}\Vert _{F}^{2}, \end{aligned}$$
(15)

where \(\rho \) is the positive proximal parameter. Note that (15) can be written as

$$\begin{aligned} \left\{ \begin{array}{ll} \begin{aligned} \mathbf X ^{l+1} &{}= \mathop {\mathrm{arg}\,\mathrm{min}}_\mathbf{X } h_{1}(\mathbf X ,\mathcal {Z}^{l}_{1})= \mathop {\mathrm{arg}\,\mathrm{min}}_\mathbf{X } f(\mathbf X ,\mathbf Y ^{l},\mathcal {M}^{l})+\frac{\rho }{2}\Vert \mathbf X -\mathbf X ^{l}\Vert _{F}^{2}, \\ \mathbf Y ^{l+1} &{}= \mathop {\mathrm{arg}\,\mathrm{min}}_\mathbf{Y }h_{2}(\mathbf Y ,\mathcal {Z}^{l}_{2})= \mathop {\mathrm{arg}\,\mathrm{min}}_\mathbf{Y } f(\mathbf X ^{l+1},\mathbf Y ,\mathcal {M}^{l})+\frac{\rho }{2}\Vert \mathbf Y -\mathbf Y ^{l}\Vert _{F}^{2}, \\ \mathcal {M}^{l+1} &{}= \mathop {\mathrm{arg}\,\mathrm{min}}_{ \mathcal {P}_{\Omega }(\mathcal {M}) = \mathcal {T} }h_{3}(\mathcal {M},\mathcal {Z}^{l}_{3})= \mathop {\mathrm{arg}\,\mathrm{min}}_{ \mathcal {P}_{\Omega }(\mathcal {M}) = \mathcal {T} } f(\mathbf X ^{l+1},\mathbf Y ^{l+1},\mathcal {M})+\frac{\rho }{2}\Vert \mathcal {M}-\mathcal {M}^{l}\Vert _{F}^{2},\\ \end{aligned} \end{array} \right. \end{aligned}$$
(16)

where \(\mathcal {Z}^{l}_{1}=(\mathbf X ^{l},\mathbf Y ^{l},\mathcal {M}^{l})\), \(\mathcal {Z}^{l}_{2}=(\mathbf X ^{l+1},\mathbf Y ^{l},\mathcal {M}^{l})\), and \(\mathcal {Z}^{l}_{3}=(\mathbf X ^{l+1},\mathbf Y ^{l+1},\mathcal {M}^{l})\). It is easy to see that the \(\mathbf X \)- and \(\mathbf Y \)-subproblems can each be decomposed into \(j-1\) independent problems. Then, (16) can be written as

$$\begin{aligned} \left\{ \begin{array}{l} \begin{aligned} \mathbf X ^{l+1}_{k} &{}= \mathop {\mathrm{arg}\,\mathrm{min}}_\mathbf{X _{k}}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}{} \mathbf Y _{k}^{l}-\mathbf M _{[k]}^{l}\Vert _{F}^{2}+\frac{\rho }{2}\Vert \mathbf X _{k}-\mathbf X _{k}^{l}\Vert _{F}^{2}\\ &{}= (\alpha _{k}{} \mathbf M _{[k]}^{l}(\mathbf Y _{k}^{l})^{T}+\rho \mathbf X _{k}^{l})(\alpha _{k}{} \mathbf Y _{k}^{l}(\mathbf Y _{k}^{l})^{T}+\rho \mathbf I )^{-1},\\ \mathbf Y ^{l+1}_{k} &{}= \mathop {\mathrm{arg}\,\mathrm{min}}_\mathbf{Y _{k}}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}^{l+1}{} \mathbf Y _{k}-\mathbf M _{[k]}^{l}\Vert _{F}^{2}+\frac{\rho }{2}\Vert \mathbf Y _{k}-\mathbf Y _{k}^{l}\Vert _{F}^{2}\\ &{}= (\alpha _{k}(\mathbf X _{k}^{l+1})^{T}{} \mathbf X _{k}^{l+1}+\rho \mathbf I )^{-1}(\alpha _{k}(\mathbf X _{k}^{l+1})^{T}{} \mathbf M _{[k]}^{l}+\rho \mathbf Y _{k}^{l}),\\ \mathcal {M}^{l+1} &{}= \mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {P}_{\Omega }(\mathcal {M}) = \mathcal {T} } \sum _{k=1}^{j-1}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1}-\mathbf M _{[k]}\Vert _{F}^{2}+\lambda \text {TV}(\mathcal {M})+ \frac{\rho }{2}\Vert \mathcal {M}-\mathcal {M}^{l}\Vert _{F}^{2},\\ \end{aligned} \end{array} \right. \end{aligned}$$
(17)

where \(\mathbf I \in \mathbb {R}^{r_{k}\times r_{k}}\) is an identity matrix.

The cost of computing \(\mathbf X _{k}\) is \(O(r_{k}\prod _{d=1}^{j}n_{d})\) and the cost of computing \(\mathbf Y _{k}\) is \(O(r_{k}\prod _{d=1}^{j}n_{d})\).
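A MATLAB sketch of the closed-form \(\mathbf X _{k}\) and \(\mathbf Y _{k}\) updates in (17) is given below; Mk denotes the current matricization \(\mathbf M _{[k]}^{l}\), and the helper name update_factors is ours.

    function [Xk, Yk] = update_factors(Xk, Yk, Mk, alpha_k, rho)
    % Closed-form solutions of the X_k- and Y_k-subproblems in (17);
    % the Y_k update uses the freshly updated X_k, as in the algorithm.
    rk = size(Xk, 2);
    Xk = (alpha_k * Mk * Yk' + rho * Xk) / (alpha_k * (Yk * Yk') + rho * eye(rk));
    Yk = (alpha_k * (Xk' * Xk) + rho * eye(rk)) \ (alpha_k * Xk' * Mk + rho * Yk);
    end

Both updates only require solving \(r_{k}\times r_{k}\) linear systems, which is consistent with the \(O(r_{k}\prod _{d=1}^{j}n_{d})\) cost stated above.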

Since the \(\mathcal {M}\)-subproblem does not admit a closed-form solution, we solve it using the alternating direction method of multipliers (ADMM). By introducing auxiliary variables, we rewrite it as the following equivalent constrained problem:

$$\begin{aligned} \begin{aligned} \min \quad&\lambda \sum _{m=1}^{n_{1}}\sum _{n=1}^{s} \Vert \mathbf E _{m,n}\Vert _{2}+ \sum _{k=1}^{j-1}\frac{\alpha _{k}}{2} \Vert \mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1}-\mathcal {A}_{k[k]}\Vert _{F}^{2}+\frac{\rho }{2}\Vert \mathcal {M}-\mathcal {M}^{l}\Vert _{F}^{2},\\ s.t. \quad&\mathcal {P}_{\Omega }(\mathcal {M})=\mathcal {T},\ \mathcal {M}=\mathcal {A}_{k},\ \mathcal {M}=\mathcal {Z},\ \mathbf D _{1}{} \mathbf Z _{(1)}=\mathbf E _{1},\ \mathbf D _{2}{} \mathbf Z _{(1)}=\mathbf E _{2}, \end{aligned} \end{aligned}$$
(18)

where \(\mathbf E _{m,n}=[(\mathbf E _{1})_{m,n}, (\mathbf E _{2})_{m,n}]\in \mathbb {R}^{1 \times 2}\), \((\mathbf E _{1})_{m,n}\) and \((\mathbf E _{2})_{m,n}\) are the \(\text {(m,n)-th}\) entries of \(\mathbf E _{1}\) and \(\mathbf E _{2}\), respectively, and \(\mathbf D _{1}\) and \(\mathbf D _{2}\) are the first-order difference matrices in the horizontal and vertical directions based on \(D_{m,n}^{1}\) and \(D_{m,n}^{2}\), respectively.

The augmented Lagrangian function of (18) is defined as

$$\begin{aligned} \begin{aligned} L(\mathcal {M}, \mathcal {A}_{k}, \mathcal {Z}, \mathbf E _{i}, \mathcal {B}_{k}, \mathcal {Q}_{i}, \mathbf F _{i})&=\lambda \sum _{m=1}^{n_{1}}\sum _{n=1}^{s} \Vert \mathbf E _{m,n}\Vert _{2}+\sum _{k=1}^{j-1}\frac{\alpha _{k}}{2}\Vert \mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1}-\mathcal {A}_{k[k]}\Vert _{F}^{2}\\&\quad +\,\frac{\rho }{2}\Vert \mathcal {M}-\mathcal {M}^{l}\Vert _{F}^{2}\\&\quad +\,\sum _{k=1}^{j-1}\left( \frac{\beta _{1}}{2}\Vert \mathcal {M}-\mathcal {A}_{k}\Vert _{F}^{2}+\langle \mathcal {M}-\mathcal {A}_{k}, \mathcal {B}_{k}\rangle \right) \\&\quad +\,\frac{\beta _{2}}{2}\Vert \mathcal {M}-\mathcal {Z}\Vert _{F}^{2}+\langle \mathcal {M}-\mathcal {Z}, \mathcal {Q}\rangle \\&\quad +\,\frac{\beta _{3}}{2}\Vert \mathbf D _{1}{} \mathbf Z _{(1)}-\mathbf E _{1}\Vert _{F}^{2}+\langle \mathbf D _{1}{} \mathbf Z _{(1)}-\mathbf E _{1}, \mathbf F _{1} \rangle \\&\quad +\,\frac{\beta _{3}}{2}\Vert \mathbf D _{2}{} \mathbf Z _{(1)}-\mathbf E _{2}\Vert _{F}^{2} + \langle \mathbf D _{2}{} \mathbf Z _{(1)}-\mathbf E _{2}, \mathbf F _{2} \rangle , \end{aligned} \end{aligned}$$
(19)

where \(\mathcal {Q}\), \(\{\mathcal {B}_{k}\}_{k=1}^{j-1}\), and \(\{\mathbf{F }_{i}\}_{i=1}^{2}\) are the Lagrange multipliers of the linear constraints and \(\beta _{1}, \beta _{2}, \beta _{3}>0\) are penalty parameters. The iterative scheme for solving (18) is as follows:

$$\begin{aligned} \left\{ \begin{array}{l} \begin{aligned} &{}\mathcal {M}^{l+1,p+1}=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {P}_{\Omega }(\mathcal {M}) = \mathcal {T} }L(\mathcal {M}, \mathcal {A}_{k}^{p}, \mathcal {Z}^{p}, \mathcal {B}_{k}^{p}, \mathcal {Q}^{p}),\\ &{}\mathcal {A}_{k}^{p+1} =\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {A}_{k}}L(\mathcal {M}^{l+1,p+1}, \mathcal {A}_{k}, \mathcal {B}_{k}^{p}),\\ &{}\mathcal {Z}^{p+1} =\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {Z}}L(\mathcal {M}^{l+1,p+1}, \mathcal {Z}, \mathbf E _{i}^{p}, \mathcal {Q}^{p}, \mathbf F _{i}^{p}),\\ &{}\mathbf E _{i}^{p+1}=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathbf E _{i}}L(\mathcal {Z}^{p+1}, \mathbf E _{i}, \mathbf F _{i}^{p}),\\ &{}\mathcal {B}_{k}^{p+1}=\mathcal {B}_{k}^{p}+\beta _{1}(\mathcal {M}^{l+1,p+1}-\mathcal {A}_{k}^{p+1}),\\ &{}\mathcal {Q}^{p+1}=\mathcal {Q}^{p}+\beta _{2}(\mathcal {M}^{l+1,p+1}-\mathcal {Z}^{p+1}),\\ &{}\mathbf F _{i}^{p+1}=\mathbf F _{i}^{p}+\beta _{3}(\mathbf D _{i}{} \mathbf Z _{(1)}^{p+1}-\mathbf E _{i}^{p+1}).\\ \end{aligned} \end{array} \right. \end{aligned}$$
(20)

We give the details for solving the first four subproblems in (20).

  1.

    \(\mathcal {M}\)-subproblem    Fixing all variables but \(\mathcal {M}\), the optimal \(\mathcal {M}\) is obtained as

    $$\begin{aligned} \begin{aligned} \mathcal {M}^{l+1,p+1}&= \mathop {\mathrm{arg}\,\mathrm{min}}_{ \mathcal {P}_{\Omega }(\mathcal {M}) = \mathcal {T} } \sum _{k=1}^{j-1}\left( \frac{\beta _{1}}{2}\Vert \mathcal {M}-\mathcal {A}_{k}^{p}\Vert _{F}^{2}+\langle \mathcal {M}-\mathcal {A}_{k}^{p}, \mathcal {B}_{k}^{p}\rangle \right) \\&\quad +\,\frac{\beta _{2}}{2}\Vert \mathcal {M}-\mathcal {Z}^{p}\Vert _{F}^{2} + \langle \mathcal {M}-\mathcal {Z}^{p}, \mathcal {Q}^{p}\rangle +\frac{\rho }{2}\Vert \mathcal {M}-\mathcal {M}^{l}\Vert _{F}^{2}\\&=\mathop {\mathrm{arg}\,\mathrm{min}}_{ \mathcal {P}_{\Omega }(\mathcal {M}) = \mathcal {T} } \sum _{k=1}^{j-1}\frac{\beta _{1}}{2}\left\| \mathcal {M}-\mathcal {A}_{k}^{p}+\frac{\mathcal {B}_{k}^{p}}{\beta _{1}}\right\| _{F}^{2}+ \frac{\beta _{2}}{2}\left\| \mathcal {M}-\mathcal {Z}^{p}+\frac{\mathcal {Q}^{p}}{\beta _{2}}\right\| _{F}^{2}\\&\quad +\,\frac{\rho }{2}\Vert \mathcal {M}-\mathcal {M}^{l}\Vert _{F}^{2}. \end{aligned} \end{aligned}$$
    (21)

    The function is quadratic in terms of \(\mathcal {M}\). The optimal \(\mathcal {M}\) is

    $$\begin{aligned} \mathcal {M}^{l+1,p+1} = \mathcal {P}_{\Omega ^{c}} \left( \frac{\sum _{k=1}^{j-1}(\beta _{1}\mathcal {A}_{k}^{p}-\mathcal {B}_{k}^{p})+\beta _{2}\mathcal {Z}^{p}-\mathcal {Q}^{p} +\rho \mathcal {M}^{l}}{(j-1)\beta _{1}+\beta _{2}+\rho } \right) +\mathcal {T}. \end{aligned}$$
    (22)
  2.

    \(\{\mathcal {A}_{k}\}\)-subproblem    We give the optimal \(\mathcal {A}_{k}\) by

    $$\begin{aligned} \begin{aligned} \mathcal {A}_{k}^{p+1}&=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {A}_{k}} \frac{\alpha _{k}}{2}\left\| \mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1}-\mathcal {A}_{k[k]}\right\| _{F}^{2}\\&\quad +\,\frac{\beta _{1}}{2}\Vert \mathcal {M}^{l+1,p+1}-\mathcal {A}_{k}\Vert _{F}^{2}+\langle \mathcal {M}^{l+1,p+1}-\mathcal {A}_{k}, \mathcal {B}_{k}^{p}\rangle \\&= \mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {A}_{k}} \frac{\alpha _{k}}{2}\left\| \mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1}-\mathcal {A}_{k[k]}\right\| _{F}^{2}+\frac{\beta _{1}}{2}\left\| \mathcal {M}^{l+1,p+1}-\mathcal {A}_{k}+\frac{\mathcal {B}_{k}^{p}}{\beta _{1}}\right\| _{F}^{2}. \end{aligned} \end{aligned}$$
    (23)

    Utilizing the equation \(\Vert \mathbf M _{[k]}\Vert _{F} = \Vert \mathcal {M}\Vert _{F}\) and the definition of \(\text {unreshape}_{[k]}(\cdot )\) in Sect. 2.1, we rewrite the optimization problem (23) as the following problem:

    $$\begin{aligned} \mathcal {A}_{k}^{p+1}= \mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {A}_{k}} \frac{\alpha _{k}}{2}\left\| \text {unreshape}_{[k]}(\mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1}){-}\mathcal {A}_{k}\right\| _{F}^{2}+\frac{\beta _{1}}{2}\left\| \mathcal {M}^{l+1,p+1}{-}\mathcal {A}_{k}+\frac{\mathcal {B}_{k}^{p}}{\beta _{1}}\right\| _{F}^{2}.\nonumber \\ \end{aligned}$$
    (24)

    The function is quadratic in terms of \(\mathcal {A}_{k}\). The optimal \(\mathcal {A}_{k}\) is

    $$\begin{aligned} \mathcal {A}_{k}^{p+1} = \frac{\alpha _{k}\text {unreshape}_{[k]}(\mathbf X _{k}^{l+1}{} \mathbf Y _{k}^{l+1})+\beta _{1}\mathcal {M}^{l+1,p+1}+\mathcal {B}_{k}^{p}}{\alpha _{k}+\beta _{1}}. \end{aligned}$$
    (25)
  3.

    \(\mathcal {Z}\)-subproblem    The \(\mathcal {Z}\)-subproblem is

    $$\begin{aligned} \begin{aligned} \mathcal {Z}^{p+1}&=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {Z}} \frac{\beta _{2}}{2}\left\| \mathcal {M}^{l+1,p+1}{-}\mathcal {Z}\right\| _{F}^{2}{+}\langle \mathcal {M}^{l+1,p+1}-\mathcal {Z}, \mathcal {Q}^{p}\rangle {+}\frac{\beta _{3}}{2}\left\| \mathbf{D }_{1}\mathbf{Z }_{(1)}-\mathbf{E }_{1}^{p}\right\| _{F}^{2}\\&\quad +\,\langle \mathbf{D }_{1}\mathbf{Z }_{(1)}-\mathbf{E }_{1}^{p}, \mathbf{F }_{1}^{p} \rangle +\frac{\beta _{3}}{2}\left\| \mathbf{D }_{2}\mathbf{Z }_{(1)}-\mathbf{E }_{2}^{p}\right\| _{F}^{2}+\langle \mathbf{D }_{2}\mathbf{Z }_{(1)}-\mathbf{E }_{2}^{p}, \mathbf{F }_{2}^{p} \rangle \\&=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathcal {Z}} \frac{\beta _{2}}{2}\left\| \mathcal {M}^{l+1,p+1}-\mathcal {Z}+\frac{\mathcal {Q}^{p}}{\beta _{2}}\right\| _{F}^{2} +\frac{\beta _{3}}{2}\left\| \mathbf{D }_{1}\mathbf{Z }_{(1)}-\mathbf{E }_{1}^{p}+\frac{\mathbf{F }_{1}^{p}}{\beta _{3}}\right\| _{F}^{2}\\&\quad +\,\frac{\beta _{3}}{2}\Vert \mathbf{D }_{2}\mathbf{Z }_{(1)}-\mathbf{E }_{2}^{p}+\frac{\mathbf{F }_{2}^{p}}{\beta _{3}}\Vert _{F}^{2}. \end{aligned} \end{aligned}$$
    (26)

    Using the equation \(\Vert \mathbf{Z }_{(1)}\Vert _{F} = \Vert \mathcal {Z}\Vert _{F}\), we solve its equivalent problem as

    $$\begin{aligned} \begin{aligned} \mathbf{Z }^{p+1}_{(1)}&=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathbf{Z }_{(1)}} \frac{\beta _{2}}{2}\left\| \mathbf{M }^{l+1,p+1}_{(1)}-\mathbf{Z }_{(1)}+\frac{\mathbf{Q }^{p}_{(1)}}{\beta _{2}}\right\| _{F}^{2} +\frac{\beta _{3}}{2}\left\| \mathbf{D }_{1}\mathbf{Z }_{(1)}-\mathbf{E }_{1}^{p}+\frac{\mathbf{F }_{1}^{p}}{\beta _{3}}\right\| _{F}^{2}\\&\quad +\,\frac{\beta _{3}}{2}\left\| \mathbf{D }_{2}\mathbf{Z }_{(1)}-\mathbf{E }_{2}^{p}+\frac{\mathbf{F }_{2}^{p}}{\beta _{3}}\right\| _{F}^{2}. \end{aligned} \end{aligned}$$
    (27)

    Optimizing this problem can be treated as solving the following linear system:

    $$\begin{aligned} \mathbf A \mathbf{Z }_{(1)}=\mathbf{B }, \end{aligned}$$
    (28)

    where \(\mathbf A = \beta _{2}\mathbf I +\beta _{3}{} \mathbf D _{1}^{T}{} \mathbf D _{1} + \beta _{3}{} \mathbf D _{2}^{T}{} \mathbf D _{2}\), \(\mathbf I \) is the identity matrix, and \(\mathbf B = \beta _{2}{} \mathbf M _{(1)}^{l+1,p+1}+\mathbf Q _{(1)}^{p}+\beta _{3}{} \mathbf D _{1}^{T}{} \mathbf E _{1}^{p}-\mathbf D _{1}^{T}{} \mathbf F _{1}^{p}+\beta _{3}{} \mathbf D _{2}^{T}{} \mathbf E _{2}^{p} -\mathbf D _{2}^{T}{} \mathbf F _{2}^{p}\).

    Assuming periodic boundary conditions, \(\mathbf D _{1}\) and \(\mathbf D _{2}\) have block circulant with circulant blocks (BCCB) structure, so the system is diagonalized by 2D FFTs and the optimal \(\mathcal {Z}\) is obtained directly as

    $$\begin{aligned} \mathcal {Z} = \text {fold}_{(1)}\Bigg (\mathcal {F}^{-1}\Bigg (\frac{\mathcal {F}(\mathbf B )}{\mathcal {F}(\mathbf A )}\Bigg ) \Bigg ). \end{aligned}$$
    (29)
  4.

    \(\mathbf E \)-subproblem    For the \(\mathbf E \)-subproblem, we solve the following optimization problem:

    $$\begin{aligned} \begin{aligned} \mathbf{E }^{p+1}&=\mathop {\mathrm{arg}\,\mathrm{min}}_{\mathbf{E }} \lambda \sum _{m=1}^{n_{1}}\sum _{n=1}^{s}\left\| \mathbf{E }_{m,n}\right\| _{2} +\frac{\beta _{3}}{2}\left\| \mathbf{D }_{1}{} \mathbf Z _{(1)}^{p+1}-\mathbf{E }_{1}\right\| _{F}^{2}+\langle \mathbf{D }_{1}{} \mathbf Z _{(1)}^{p+1}-\mathbf{E }_{1}, \mathbf{F }_{1}^{p} \rangle \\&\quad +\,\frac{\beta _{3}}{2}\left\| \mathbf{D }_{2}{} \mathbf Z _{(1)}^{p+1}-\mathbf{E }_{2}\right\| _{F}^{2}+\langle \mathbf{D }_{2}{} \mathbf Z _{(1)}^{p+1}-\mathbf{E }_{2}, \mathbf{F }_{2}^{p} \rangle \\&=\mathop {\mathrm{arg}\,\mathrm{min}}_{E} \lambda \sum _{m=1}^{n_{1}}\sum _{n=1}^{s}\left\| \mathbf{E }_{m,n}\right\| _{2} +\frac{\beta _{3}}{2}\left\| \mathbf{D }_{1}{} \mathbf Z _{(1)}^{p+1}-\mathbf{E }_{1}+\frac{\mathbf{F }_{1}^{p}}{\beta _{3}}\right\| _{F}^{2}\\&\quad +\,\frac{\beta _{3}}{2}\left\| \mathbf{D }_{2}{} \mathbf Z _{(1)}^{p+1}-\mathbf{E }_{2}+\frac{\mathbf{F }_{2}^{p}}{\beta _{3}}\right\| _{F}^{2}. \end{aligned} \end{aligned}$$
    (30)

    Given fixed \(\mathbf Z _{(1)}\) and \(\mathbf F _{i}\), computing the optimal \(\mathbf E \) amounts to solving \(n_{1}s\) independent two-variable minimization problems

    $$\begin{aligned} \begin{aligned} \mathop {\mathrm{arg}\,\mathrm{min}}_{(\mathbf{E }_{1},\mathbf{E }_{2})_{m,n}}&\quad \lambda \sqrt{\left[ (\mathbf{E }_{1})_{m,n}\right] ^{2}{+}\left[ (\mathbf{E }_{2})_{m,n}\right] ^{2}} {+}\frac{\beta _{3}}{2}\left[ (\mathbf{E }_{1})_{m,n} - (\mathbf{D }_{1}{} \mathbf Z _{(1)}^{p+1})_{m,n}-\frac{1}{\beta _{3}}(\mathbf{F }_{1}^{p})_{m,n}\right] ^{2}\\&+\frac{\beta _{3}}{2}\left[ (\mathbf{E }_{2})_{m,n} - (\mathbf{D }_{2}{} \mathbf Z _{(1)}^{p+1})_{m,n}-\frac{1}{\beta _{3}}(\mathbf{F }_{2}^{p})_{m,n}\right] ^{2}. \end{aligned} \end{aligned}$$
    (31)

    Problem (31) can be solved by the 2-D shrinkage formula as follows:

    $$\begin{aligned} {[}(\mathbf{E }_{1})_{m,n},(\mathbf{E }_{2})_{m,n}]=\text {max}\left\{ \left\| \mathbf{W }_{m,n}\right\| _{2}-\frac{\lambda }{\beta _{3}},0\right\} \frac{\mathbf{W }_{m,n}}{\left\| \mathbf{W }_{m,n}\right\| _{2}}, 1\le m\le n_{1}, 1\le n\le s, \end{aligned}$$
    (32)

    where \(\mathbf W _{m,n}=[(\mathbf D _{1}{} \mathbf Z _{(1)}^{p+1})_{m,n}+\frac{1}{\beta _{3}}(\mathbf F _{1}^{p})_{m,n}, (\mathbf D _{2}{} \mathbf Z _{(1)}^{p+1})_{m,n}+\frac{1}{\beta _{3}}(\mathbf F _{2}^{p})_{m,n}]\) for \(1\le m\le n_{1}, 1\le n\le s\), and we adopt the convention \(0\cdot (0/0)=0\).

Now we discuss the computational complexity of the \(\mathcal {M}\)-subproblem. The \(\mathcal {A}_{k}\)-subproblem (25) involves the product of \(\mathbf X _{k}\) and \(\mathbf Y _{k}\) with sizes \( (\prod _{d=1}^{k}n_{d})\times r_{k}\) and \( r_{k}\times (\prod _{d=k+1}^{j}n_{d})\), whose complexity is \(O(r_{k}\prod _{d=1}^{j}n_{d})\). The main computation of the \(\mathcal {Z}\)-subproblem (29) is the fast Fourier transforms on a matrix with \(\prod _{d=1}^{j}n_{d}\) entries, whose complexity is \(O(\prod _{d=1}^{j}n_{d} \log \prod _{d=1}^{j}n_{d})\). The complexity of computing \([\mathbf E _{1},\mathbf E _{2}]\) in (32) is \(O(\prod _{d=1}^{j}n_{d})\). Solving the \(\mathcal {M}\)-subproblem requires a few inner iterations, and its complexity is \(O(\sum _{k=1}^{j-1}r_{k}\prod _{d=1}^{j}n_{d} + \prod _{d=1}^{j}n_{d} \log \prod _{d=1}^{j}n_{d})\).
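To make the inner steps concrete, the following MATLAB sketch gives one possible implementation of the FFT-based solve (29) and the 2-D shrinkage (32). It assumes periodic difference operators acting slice-wise on the spatial dimensions, and the function and variable names (solve_z_fft, shrink2d, G1, G2) are ours rather than from the original implementation.

    function Z = solve_z_fft(B, beta2, beta3)
    % One reading of (29): solve (beta2*I + beta3*D1'*D1 + beta3*D2'*D2) z = b
    % for each n1 x n2 spatial slice of B, with periodic difference operators.
    [n1, n2, ~] = size(B);
    d1 = zeros(n1, n2); d1(1, 1) = -1; d1(1, 2) = 1;    % horizontal difference kernel
    d2 = zeros(n1, n2); d2(1, 1) = -1; d2(2, 1) = 1;    % vertical difference kernel
    denom = beta2 + beta3*(abs(fft2(d1)).^2 + abs(fft2(d2)).^2);
    Z = real(ifft2(bsxfun(@rdivide, fft2(B), denom)));  % fft2/ifft2 act on each slice
    end

    function [E1, E2] = shrink2d(G1, G2, lambda, beta3)
    % 2-D shrinkage (32), with G1 = D1*Z + F1/beta3 and G2 = D2*Z + F2/beta3.
    W = sqrt(G1.^2 + G2.^2);
    scale = max(W - lambda/beta3, 0) ./ max(W, eps);    % convention 0*(0/0) = 0
    E1 = scale .* G1;
    E2 = scale .* G2;
    end

Because the system matrix is diagonalized by the 2D FFT, each inner iteration avoids forming or inverting \(\mathbf A \) explicitly.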

We summarize the proposed algorithm in Algorithm 1. In each iteration, the total cost of computing \(\mathbf X _{k}\), \(\mathbf Y _{k}\), and \(\mathcal {M}\) in Algorithm 1 is

$$\begin{aligned} O\left( \left( \sum _{k=1}^{j-1}r_{k} + \log \prod _{d=1}^{j}n_{d}\right) \prod _{d=1}^{j}n_{d}\right) . \end{aligned}$$

3.3 Convergence Analysis

In this subsection, we study the convergence of the proposed algorithm. Razaviyayn et al. [30] proposed the BSUM algorithm for non-smooth optimization problems; it can be viewed as an inexact block coordinate descent method. In the following, we review the convergence result in [30] for convenience.

Lemma 1

Consider the problem \(\min f(x)\) subject to \(x\in \mathcal {X}\), where \(\mathcal {X}\) is the feasible set. Assume \(u(x,x^{l-1})\) is an approximation of f(x) at the \((l-1)\)-th iteration that satisfies the following conditions:

$$\begin{aligned} \left\{ \begin{array}{l} \begin{aligned} &{} u_{i}(y_{i},y)=f(y), \forall y\in \mathcal {X}, \forall i;\\ &{} u_{i}(x_{i},y)\ge f(y_{1}, \ldots , y_{i-1}, x_{i}, y_{i+1}, \ldots , y_{n}), \forall x_{i}\in \mathcal {X}_{i}, \forall y\in \mathcal {X}, \forall i;\\ &{} u_{i}'(x_{i}, y; d_{i})|_{x_{i}=y_{i}}=f'(y; d), \forall d = (0, \ldots , d_{i}, \ldots , 0), s.t. y_{i}+d_{i} \in \mathcal {X}_{i}, \forall i;\\ &{} u_{i}(x_{i},y) \ \text {is continuous in} \ (x_{i}, y), \forall i; \end{aligned} \end{array} \right. \end{aligned}$$
(33)

where \(u_{i}(x_{i},y)\) is the sub-problem with respect to the i-th block and \(f'(y; d)\) is the directional derivative of f at the point y in the direction d. Suppose \(u_{i}(x_{i},y)\) is quasi-convex in \(x_{i}\) for \(i=1,\ldots ,n\). Furthermore, assume that each sub-problem \(\mathop {\mathrm{arg}\,\mathrm{min}}u_{i}(x_{i}, x^{l-1})\), \(s.t. \ x_{i}\in \mathcal {X}_{i}\), has a unique solution for any point \(x^{l-1}\in \mathcal {X}\). Then, the iterates generated by the BSUM algorithm converge to the set of coordinatewise minima of f. In addition, if \(f(\cdot )\) is regular at z, then z is a stationary point.

Next, we will show that the convergence of the proposed algorithm for the model (7) is guaranteed, as it fits the framework of the BSUM method.

Theorem 1

The iterates generated by (16) converge to the set of coordinatewise minimizers.

Proof

It is easy to verify that \(h(\mathcal {Z}, \mathcal {Z}^{l})\) is an approximation and a global upper bound of \(f(\mathcal {Z})\) at the l-th iteration, which satisfies the following conditions:

$$\begin{aligned} \left\{ \begin{array}{l} \begin{aligned} &{} h_{i}(\mathcal {Z}_{i}, \mathcal {Z})=f(\mathcal {Z}),\ \forall \mathcal {Z},\ i=1, 2, 3;\\ &{} h_{i}(\mathcal {\bar{Z}}_{i}, \mathcal {Z})\ge f(\mathcal {Z}_{1}, \ldots , \mathcal {\bar{Z}}_{i}, \ldots , \mathcal {Z}_{3}), \forall \mathcal {\bar{Z}}_{i}, \forall \mathcal {Z}, i=1, 2, 3;\\ &{} h_{i}'(\mathcal {\bar{Z}}_{i}, \mathcal {Z}; d_{i})|_{\mathcal {\bar{Z}}_{i}=\mathcal {Z}_{i}}=f'(\mathcal {Z}; d), \forall d = (0, \ldots , d_{i}, \ldots , 0), i=1, 2, 3;\\ &{} h_{i}(\mathcal {\bar{Z}}_{i}, \mathcal {Z}) \ \text {is continuous in} \ (\mathcal {\bar{Z}}_{i}, \mathcal {Z}), i=1, 2, 3; \end{aligned} \end{array} \right. \end{aligned}$$
(34)

where \(\mathcal {Z}=(\mathbf X ,\ \mathbf Y ,\ \mathcal {M})\), and \(\mathcal {Z}_{i}\) is equal to \(\mathbf X \), \(\mathbf Y \), \(\mathcal {M}\), for \(i = 1 , 2 , 3\) , respectively. In addition, the sub-problem \(h_{i}(i = 1 , 2 , 3)\) is strictly convex with respect to \(\mathbf X \), \(\mathbf Y \) and \(\mathcal {M}\) respectively and thus each sub-problem has a unique solution. Therefore, all assumptions in Lemma 1 are satisfied. \(\square \)

4 Numerical Experiments

In this section, we evaluate the performance of the proposed method on the reconstruction of color images, multispectral images (MSIs), MRI data, and color videos. We compare the results with five well-known tensor completion methods: SiLRTC [22], TMac [35], the method based on the tensor nuclear norm in [23] ("TNN" for short), SiLRTC-TT, and TMac-TT [1]. All test tensors are rescaled to [0, 255] to allow a fair quantitative evaluation. In the color video tests, since TNN is only applicable to third-order tensors, we apply it to each frame separately.

The quality of recovered images is measured by the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM), which are defined as

$$\begin{aligned} \text {PSNR} = 10\log _{10}\frac{\text {Max}_\mathbf{M , \mathbf M ^{*}}^{2}}{\Vert \mathbf M -\mathbf M ^{*}\Vert _{F}^{2}} \end{aligned}$$

and

$$\begin{aligned} \text {SSIM}=\frac{ (2\mu _\mathbf{M }\mu _\mathbf{M ^{*}}+c_{1})(2\sigma _\mathbf{M \mathbf M ^{*}}+c_{2})}{(\mu _\mathbf{M }^{2}+\mu _\mathbf{M ^{*}}^{2}+c_{1})(\sigma _\mathbf{M }^{2}+\sigma _\mathbf{M ^{*}}^{2}+c_{2})}, \end{aligned}$$

where \(\mathbf M ^{*}\) is the true image, \(\mathbf M \) is the recovered image, \(\text {Max}_\mathbf{M , \mathbf M ^{*}}\) is the maximum pixel value of the images \(\mathbf M \) and \(\mathbf M ^{*}\), \(\mu _\mathbf{M }\) and \(\mu _\mathbf{M ^{*}}\) are the mean values of \(\mathbf M \) and \(\mathbf M ^{*}\), \(\sigma _\mathbf{M }\) and \(\sigma _\mathbf{M ^{*}}\) are the standard deviations of \(\mathbf M \) and \(\mathbf M ^{*}\), respectively, \(\sigma _\mathbf{M \mathbf M ^{*}}\) is the covariance of \(\mathbf M \) and \(\mathbf M ^{*}\), and \(c_{1}, c_{2}>0\) are constants. The PSNR and SSIM values of a higher-order tensor are obtained by averaging the PSNR and SSIM values over all bands. Higher PSNR and SSIM values imply better image quality.

The convergence criterion of our proposed algorithm is defined by computing the relative error of the tensor \(\mathcal {M}\) between two successive iterations as follows:

$$\begin{aligned} \frac{\Vert \mathcal {M}^{l+1}-\mathcal {M}^{l}\Vert _{F}}{\Vert \mathcal {M}^{l}\Vert _{F}}\le 10^{-4}. \end{aligned}$$
(35)

In the experiments, two groups of parameters should be initialized: the weights \(\alpha _{k}\) and the initial ranks for TMac, TMac-TT, and MF-TTTV. Firstly, the weights \(\alpha _{k}\) for TMac-TT and MF-TTTV are defined as

$$\begin{aligned} \alpha _{k}=\frac{\delta _{k}}{\sum _{k=1}^{j-1}\delta _{k}} \quad \text {with} \quad \delta _{k}=\min (\Pi _{d=1}^{k}n_{d}, \Pi _{d=k+1}^{j}n_{d}), \end{aligned}$$
(36)

where \(k=1, \ldots , j-1\). In this way, larger weights are assigned to the more balanced matricizations. Similarly, for TMac, the weights \(\alpha _{k}\) are defined as

$$\begin{aligned} \alpha _{k}=\frac{n_{k}}{\sum _{k=1}^{j}n_{k}}, \end{aligned}$$
(37)

where \(k=1, \ldots , j\). Secondly, to obtain the initial ranks for TMac, TMac-TT, and MF-TTTV, each rank \(r_{k}\) is approximated by the number of singular values that satisfy the following inequality:

$$\begin{aligned} \frac{\sigma _{q}^{[k]}}{\sigma _{1}^{[k]}}> th, \end{aligned}$$
(38)

where \(q = 1,\ldots ,r_{k}\), th is the threshold, and the singular values \(\sigma _{q}^{[k]}\) of \(\mathbf M _{[k]}\) are sorted in descending order. This condition is chosen such that matricizations with lower rank have more of their singular values truncated. The parameters of the competing methods SiLRTC, TNN, and SiLRTC-TT are set automatically according to the authors' suggestions.
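A MATLAB sketch of this initialization, assuming the ranks are estimated from the zero-filled observed tensor (a detail not stated above) and using the hypothetical helper name init_tt_params, is:

    function [alpha, r] = init_tt_params(M, th)
    % Weights (36) and initial TT ranks (38) for a tensor M with mode sizes n_1, ..., n_j.
    n = size(M); j = numel(n);
    delta = zeros(1, j-1); r = zeros(1, j-1);
    for k = 1:j-1
        delta(k) = min(prod(n(1:k)), prod(n(k+1:j)));
        sv = svd(reshape(M, prod(n(1:k)), prod(n(k+1:j))));  % mode-k canonical matricization
        r(k) = sum(sv / sv(1) > th);                         % truncation rule (38)
    end
    alpha = delta / sum(delta);                              % weights (36)
    end

The full SVD is shown only for clarity; in practice a truncated SVD of each \(\mathbf M _{[k]}\) would be used for large tensors.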

In our method, we set the proximal parameter \(\rho = 10^{-3}\) and the penalty parameters \(\beta _{1} = 5\times 10^{-3}\) and \(\beta _{2} = 0.1\) for all experiments. We empirically select the regularization parameter \(\lambda \) and the penalty parameter \(\beta _{3}\) from the candidate set \(\{0.01, 0.03, 0.05, 0.1, 0.3\}\) to attain the highest PSNR value.

All numerical experiments are performed on Windows 10 64-bit and MATLAB R2012a running on a desktop equipped with an Intel(R) Core(TM) i7-6700M CPU with 3.40 GHz and 8 GB of RAM.

Fig. 3

The results of testing color images with \(SR=0.1\) recovered by different methods. From left to right: the original data, the observed data, the reconstructed results by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively (Color figure online)

4.1 Color Images

In this subsection, we evaluate the performance of the proposed method on color images. The testing images are of size \(256\times 256 \times 3\). By applying KA, we cast a third-order tensor \(\mathcal {M} \in \mathbb {R}^{256\times 256 \times 3}\) into a ninth-order tensor \(\mathcal {\tilde{M}} \in \mathbb {R}^{4\times 4 \times 4\times 4\times 4\times 4\times 4\times 4\times 3}\) and then apply the proposed method to restore the color images. The sampling rate (SR) is varied from 0.05 to 0.5.

Table 2 The PSNR and SSIM values obtained by compared methods and the proposed method for color images with different sampling rates (SRs)
Fig. 4

The PSNR and SSIM values of the reconstructed color image results with different sampling rates by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively (Color figure online)

Figure 3 shows the visual results of color images with \(SR=0.1\) recovered by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively. There are several vertical thorn-like artifacts in the images recovered by SiLRTC and TMac, the results by TNN appear dark and have horizontal thorn-like artifacts, and block-artifacts can be easily observed in the images restored by SiLRTC-TT and TMac-TT. In contrast, the results recovered by the proposed method are visually better than those of the compared methods and preserve smoothness best. From the zoom-in regions of the recovered images, the proposed method effectively reduces block-artifacts compared with SiLRTC-TT and TMac-TT. Table 2 lists the numerical results of the different methods, where the best PSNR and SSIM values are emphasized in bold font. Figure 4 shows the PSNR and SSIM curves of the different methods applied to the images Lena, Peppers, and House. For all tested SRs, the proposed method performs better than the compared methods in terms of PSNR and SSIM.

Fig. 5

The results of images House and Lena with structural missing entries described by the black letters by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively

Table 3 The PSNR (dB), SSIM, and Time (in seconds) values obtained by compared methods and the proposed method for images Lena and Barbara with different SRs

In Fig. 5, we show the results by applying different methods on images House and Lena with structural missing entries. The missing entries are selected as the black text. The outlines of the text can still be clearly seen on the recovered images by the compared methods, while the text is almost removed using MF-TTTV.

In Table 3, we show the PSNR, SSIM, and time (in seconds) values of the restored images by different methods. We test the color images Lena and Barbara with \(SR = 0.1\) and 0.3. It can be seen that our method outperforms the compared ones in terms of both PSNR and SSIM, although it requires more time than the compared methods. The reason is that, for better recovery results, we introduce the TV regularization and several auxiliary variables when solving the proposed model, which leads to a more complex numerical algorithm.

Fig. 6

The results of one band of testing MSIs recovered by different methods. The first (third) and second (fourth) rows: the results of observed MSIs Washington DC (Flowers) with \(SR=0.05\) and 0.1, respectively. From left to right: the original data, the observed data, the reconstructed results by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively

4.2 MSIs Data

In this subsection, we compare the performance of different methods on MSIs Faces, Flowers, and Washington DC data. The testing images of size \(256\times 256 \times 11\) are selected from the original MSIs. The third-order tensor is converted to the ninth-order tensor of size \(4\times 4 \times 4\times 4\times 4\times 4\times 4\times 4\times 11\) by using KA. The SRs are set to be 0.05, 0.1, and 0.15, respectively. In Fig. 6, we display the recovered results by different compared methods. From the zoom-in regions of recovered images, it is clear that the proposed method performs best in preserving the edges and details of recovered images. The PSNR and SSIM values of each band of the reconstructed MSIs Washington DC and Faces with \(SR = 0.1\) are shown in Fig. 7. From this figure, we can see that the PSNR and SSIM values in all bands obtained by the proposed method are better than those obtained by the compared methods. In addition, Table 4 shows the average PSNR and SSIM values of the reconstructed tensors by different methods. We can note that our proposed method obtains the highest quality results for different MSIs with different sampling rates.

Fig. 7

The PSNR and SSIM values of all bands of the reconstructed MSIs with \(SR = 0.1\) recovered by different methods for Washington DC and Faces

Table 4 The average PSNR and SSIM values obtained by compared methods and the proposed method for MSIs with different SRs
Table 5 The average PSNR and SSIM values obtained by compared methods and the proposed method for MRI data with different SRs

4.3 MRI Data

In this subsection, we use MRI data of size \(256\times 256 \times 11\) to compare the performance of SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV. The SRs are set to 0.05, 0.1, and 0.2, respectively. Table 5 lists the average PSNR and SSIM of the recovered MRI data using different methods. Figure 8 shows the PSNR and SSIM values of every band with \(SR=0.1\) and 0.2. From Table 5, we see that the proposed method achieves higher quality results for the different sampling rates. The better visual performance of the images recovered by our method is shown in Fig. 9. Compared with the results recovered by TMac-TT, the block-artifacts produced by using KA are clearly reduced in the images restored by MF-TTTV.

Fig. 8

The PSNR and SSIM values of all bands of the reconstructed MRI recovered by different methods

Fig. 9

The results of one band of testing MRI recovered by different methods. From top to bottom: the results of observed MRI with \(SR=0.05\), 0.1, and 0.2, respectively. From left to right: the original data, the observed data, the reconstructed results by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively

4.4 Color Videos

In this subsection, we test the proposed method on two color videos, Bus and Mobile. The testing videos are of size \(243\times 256 \times 3 \times 27\). We reshape each video into a ninth-order tensor of size \(6\times 6 \times 6\times 6\times 6\times 6\times 6\times 6\times 3\), which is then used by the proposed tensor completion algorithm. The SR is set to 0.1 in this task.

Fig. 10

The results of two frames of testing color videos recovered by different methods. The first (third) and second (fourth) rows: the results of color videos Bus (Mobile), respectively. From left to right: the original data, the observed data, the reconstructed results by SiLRTC, TMac, TNN, SiLRTC-TT, TMac-TT, and MF-TTTV, respectively (Color figure online)

Results of the different methods are shown in Fig. 10. It is obvious that our method visually outperforms SiLRTC, TMac, TNN, SiLRTC-TT, and TMac-TT in keeping the smoothness and details of the recovered images. The PSNR and SSIM values of each frame of the two reconstructed color videos are plotted in Fig. 11. The PSNR and SSIM values of each frame recovered by the proposed method are higher than those of all compared methods.

Fig. 11

The PSNR and SSIM values of all frames of color videos recovered by different methods for Bus and Mobile (Color figure online)

5 Conclusion

In this paper, we propose a new model combining low-rank matrix factorization based on the TT rank with TV regularization for LRTC. An effective BSUM-based algorithm is developed to solve the proposed model with guaranteed convergence. Moreover, a tensor augmentation tool is used to improve the performance of our method. Numerical experiments demonstrate that the proposed method effectively preserves the piecewise smoothness of tensor data in the spatial dimensions and distinctly reduces the block-artifacts compared with TMac-TT using KA. In future work, we will try to speed up the proposed method or develop a numerical algorithm with a faster convergence rate.