1 Introduction

In today’s era of information explosion, data volumes across various industries are experiencing exponential growth, posing significant challenges in storage, processing, and analysis. High-dimensional data, such as images [1], videos [2], sensor readings [3] and genomic sequences [4], are becoming increasingly prevalent in scientific research [5], industrial applications [6] and daily life. However, high-dimensional data often come with substantial computational and storage costs and may contain a considerable amount of redundant information. To address this challenge, researchers have turned to dimensionality reduction techniques, which aim to represent complex high-dimensional data in lower-dimensional spaces while preserving their fundamental structure and information [7, 8].

A prominent dimensionality reduction technique is the sparse representation model [9,10,11], which reduces data dimensionality by finding a sparse set of linear combinations to express the data [12]. By emphasizing the sparsity of the representation matrix, redundant information in the data can be eliminated, resulting in a more concise and interpretable data representation. Applications of such sparse and low-rank representations are widespread. For example, in the field of computer vision, sparse representation has been used for tasks such as image denoising [13], image restoration [14], and image classification [15]. In signal processing, sparse representation models have been applied to audio and speech processing for tasks like source separation [16], speech enhancement [17], and audio compression [18]. Additionally, sparse representation models find wide applications in neuroscience [19], bioinformatics [20], genomics [21], and other fields.

In prior research, the nuclear norm of a matrix has been demonstrated to be an effective surrogate for the traditional rank function. However, the nuclear norm treats each singular value equally [22], whereas larger singular values often carry richer and more useful information than smaller ones [22, 23]. To address this limitation, several non-convex alternatives have been proposed, such as the truncated nuclear norm, the weighted nuclear norm, and the weighted Schatten-p norm [22,23,24]. Additionally, the log-determinant surrogate has been shown to approximate the rank more accurately than the nuclear norm [25], as demonstrated, for example, in neural network applications. These strategies all rely on approximate optimization of the rank, in which the singular value decomposition (SVD) plays an indispensable role. However, the high complexity of SVD computations becomes a limitation when dealing with high-dimensional or large datasets [26]. To overcome this challenge, low-rank matrix factorization is applied to the sparse representation of matrices [26]: the target matrix is decomposed into two smaller factor matrices, which encode its low-rank property. This approach not only satisfies the low-rank requirement but also benefits from rapid numerical optimization.

Currently, sparse representation methods for color images primarily focus on processing two-dimensional data. Consequently, when dealing with color images, it is common to process the RGB channels separately. However, this approach may overlook the intrinsic relationships among the three channels and thus lose their potential connections.

Based on the analysis above, there are two main issues with low-rank structured sparse representation of color images.

The first challenge in sparse representation of color images is that the correlation between RGB channels cannot be adequately maintained

Consequently, researchers have turned their attention to color image processing methods based on quaternion frameworks. Owing to the unique structure of quaternions, each pixel in a color image can be represented by a pure quaternion, so that the whole image forms a quaternion matrix. This approach has been widely applied in areas such as face recognition [27, 28], image edge detection [29], and image denoising [30, 31]. Other applications to color image processing can be found in references [32,33,34].

The second challenge is how to accurately describe the underlying low-rank structure

In the sparse representation of color images in quaternion space, many studies have relied on non-convex functions to approximate the rank of quaternion matrices, including the weighted Schatten-p norm and the Laplacian approximation. These functions highlight the advantages of quaternion matrices, which have been validated both experimentally and theoretically. However, these methods require the full quaternion singular value decomposition (QSVD) of the quaternion matrices involved, which incurs high computational costs. To address this challenge, researchers have extended low-rank matrix factorization to quaternion algebra: in [35], the target quaternion matrix is decomposed into two factor quaternion matrices for low-rank quaternion matrix completion. Such factorization-based methods only require the optimization of two smaller quaternion matrices, thus significantly reducing the associated computational cost.

To address the aforementioned challenges, two quaternion sparse representation models have been proposed: Quaternion Logarithmic Norm Factorization Sparse Representation (QLNFSR) and Truncated Quaternion Logarithmic Norm Sparse Representation (TQLNSR). Both models are designed to approximate the rank in quaternion algebra more accurately and efficiently, thereby better utilizing the structure of color images. This paper utilizes the quaternion logarithmic norm as a non-convex substitute for the rank, which provides a more reliable description of low-rank matrices than traditional approximation methods such as the quaternion nuclear norm. As a singular value increases, the penalty imposed by the logarithmic function grows more slowly, so larger singular values are treated more leniently while smaller ones receive relatively heavier penalties. In Fig. 1, an intuitive explanation of rank approximation using the logarithmic function for the scalar case can be observed. When \(\textbf{X}=x\in \mathbb {R}\), \(\mathrm{{rank}}(x)=0\) if \(x=0\) and \(\mathrm{{rank}}(x)=1\) otherwise. Additionally, for x bounded by a positive constant M, denoted as \(|x|\le M\), \(\frac{\Vert x \Vert _{*}}{M}=\frac{\mid x \mid }{M}\) represents the convex envelope of \(\mathrm{{rank}}(x)\) on the interval \(\{x\mid \,|x|\le M\}\) [36]. Consequently, the slope of \(\mathrm{{rank}}(x)\) at the origin is infinite, while the convex envelope has a constant slope of 1/M on \(|x|\le M\). In contrast, the slope of the logarithmic function at the origin is \(1/\delta \), where \(\delta \) is a small positive constant, so the logarithmic norm approximates \(\mathrm{{rank}}(x)\) more closely than the convex envelope [37].

Subsequently, the quaternion logarithmic norm is applied to two smaller quaternion matrices, which are the factor quaternion matrices of the target quaternion matrix based on the quaternion log-norm factorization algorithm. Therefore, the expensive QSVD only needs to act on the smaller factor quaternion matrices, thereby improving algorithm efficiency. In the truncated quaternion logarithmic norm approximation algorithm, the quaternion logarithmic norm is first truncated, and then the shrinkage operator of the quaternion logarithmic norm is directly applied to optimize the target quaternion matrix. Thus, the main contributions of this paper can be summarized as follows.

  • The rank of a quaternion matrix is not affected by its largest singular values. Therefore, the truncated quaternion logarithmic norm first discards the largest singular values and then applies the quaternion logarithmic norm to the remaining ones.

  • In order to solve the QLNFSR and TQLNSR models, we adopt the alternating direction method of multipliers (ADMM) and establish two main algorithms, the QLNFSR algorithm (Algorithm 1) and the TQLNSR algorithm (Algorithm 2). Experimental results confirm the effectiveness of both algorithms for color image sparse representation.

This paper is organized as follows. In Section 2, we review some notations, definitions and lemmas with regard to quaternion matrices. In Section 3, we review sparse representation methods based on low-rank matrices and introduce two strategies for low-rank sparse representation based on quaternions. In Section 4, we provide a convergence analysis of the algorithms. In Section 5, the proposed algorithms are applied to color image reconstruction, and their feasibility and effectiveness are verified. Finally, we give some conclusions in Section 6.

Fig. 1  The rank, nuclear norm, and the logarithmic norm for scalar x

Notation

In this article, \(\mathbb {R}\) and \(\mathbb {Q}\) denote the real space and quaternion space, respectively. A scalar and a matrix are written as a and A, respectively. \(\varvec{a}\) and \(\varvec{A}\) represent a quaternion number and a quaternion matrix, respectively. The symbols \((\cdot )^{*}\), \((\cdot )^{-1}\), \((\cdot )^{T}\) and \((\cdot )^{H}\) denote the conjugation, inverse, transpose and conjugate transpose, respectively. The symbols \(|\cdot |\), \(\Vert \cdot \Vert _{F}\) and \(\Vert \cdot \Vert _{*}\) are the absolute value or modulus, the Frobenius norm and the nuclear norm, respectively. The symbols \(\langle \cdot ,\cdot \rangle \), \(\textrm{tr}\{\cdot \}\) and \(\textrm{rank}(\cdot )\) denote the inner product operation, the trace and rank operators, respectively. The real part of a quaternion (scalar, vector, or matrix) is denoted by \(\mathfrak {R}(\cdot )\). The symbol I represents the identity matrix of appropriate size.

2 Preliminaries

In this section, we recall some preliminary results that will be used in the following discussion. Firstly, we introduce the definition of a quaternion.

Definition 2.1

[38] A quaternion \(\varvec{q} \in \mathbb {Q}\) is expressed as

$$\varvec{q}=q_0+q_1\varvec{i}+q_2\varvec{j}+q_3\varvec{k}, $$

where \(q_0, q_1, q_2, q_3\in \mathbb {R}\), and three imaginary units \(\varvec{i}, \varvec{j}, \varvec{k}\) satisfy

$$\varvec{i}^2=\varvec{j}^2=\varvec{k}^2=\varvec{ijk}=-1,\quad \varvec{ij}=-\varvec{ji}=\varvec{k},\quad \varvec{jk}=-\varvec{kj}=\varvec{i},\quad \varvec{ki}=-\varvec{ik}=\varvec{j.} $$

One of the most important properties of quaternions is that their multiplication is noncommutative under these rules, that is, \(\varvec{pq} \ne \varvec{qp}\) in general for \(\varvec{p}, \varvec{q} \in \mathbb {Q}\). For example, \(\varvec{pq} \ne \varvec{qp}\) when \(\varvec{p}=\varvec{i}\) and \(\varvec{q}=\varvec{j}\), since \(\varvec{ij}=\varvec{k}\) while \(\varvec{ji}=-\varvec{k}\).
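To make these rules concrete, the following minimal Python sketch (an illustration only, not part of any toolbox used later; the function name qmul is ours) multiplies two quaternions stored as tuples \((q_0,q_1,q_2,q_3)\) and verifies that \(\varvec{ij}=\varvec{k}\) while \(\varvec{ji}=-\varvec{k}\).

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions given as (q0, q1, q2, q3)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return np.array([
        p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,   # real part
        p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,   # i part
        p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,   # j part
        p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0,   # k part
    ])

i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
print(qmul(i, j))   # [0. 0. 0.  1.] ->  ij =  k
print(qmul(j, i))   # [0. 0. 0. -1.] ->  ji = -k
```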

Definition 2.2

[38] A quaternion matrix \(\varvec{A} \in \mathbb {Q}^{m\times n}\) is expressed as

$$\varvec{A}=A_0+A_1\varvec{i}+A_2\varvec{j}+A_3\varvec{k}, $$

where \(A_0, A_1, A_2, A_3\in \mathbb {R}^{m\times n}\). The conjugate transpose matrix of \(\varvec{A}\) is defined as

$$\varvec{A}^{H}=A_0^{T}-A_1^{T}\varvec{i}-A_2^{T}\varvec{j}-A_3^{T}\varvec{k}. $$

We give the following definitions of the norm of a quaternion and the Frobenius norm of a quaternion matrix.

Definition 2.3

[38] Let \(\varvec{a} \in \mathbb {Q}\) and \(\varvec{A} \in \mathbb {Q}^{m\times n}\). The norm of a quaternion \(\varvec{a}=a_0+a_1\varvec{i}+a_2\varvec{j}+a_3\varvec{k}\) and the Frobenius norm of the quaternion matrix \(\varvec{A}=A_0+A_1\varvec{i}+A_2\varvec{j}+A_3\varvec{k}=(\varvec{a}_{ij})\) are defined as

$$\left\| \varvec{a} \right\| =\sqrt{a_{0}^{2}+a_{1}^{2}+a_{2}^{2}+a_{3}^{2}}$$

and

$$ \left\| \varvec{A} \right\| _{F}=\sqrt{\left\| A_{0}\right\| ^{2}_{F} +\left\| A_{1}\right\| ^{2}_{F} +\left\| A_{2}\right\| ^{2}_{F} +\left\| A_{3}\right\| ^{2}_{F} }=\sqrt{\textrm{tr}(\varvec{A}^{H}\varvec{A})}. $$

Below we give the definition of the generalized inverse of a quaternion matrix.

Definition 2.4

[39] For given \(\varvec{A}\in \mathbb {Q}^{m \times n}\), the generalized inverse of the quaternion matrix \(\varvec{A}\) is defined as \(\varvec{X}\), which satisfies the following conditions

$$\begin{aligned} (1)~\varvec{A}\varvec{X}\varvec{A}=\varvec{A},~ (2)~\varvec{X}\varvec{A}\varvec{X}=\varvec{X},~ (3)~(\varvec{A}\varvec{X})^{H}=\varvec{A}\varvec{X},~(4)~(\varvec{X}\varvec{A})^{H}=\varvec{X}\varvec{A}. \end{aligned}$$
(2.1)

We denote \(\varvec{X}\) by \(\varvec{A}^{\dagger }\).

In particular, if \(\varvec{A}\) is invertible, it is clear that \(\varvec{X}=\varvec{A}^{-1}\) trivially satisfies (2.1).
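For real and complex matrices, the Moore-Penrose inverse returned by standard numerical libraries satisfies exactly the four conditions in (2.1); the quaternion case is analogous. A small numpy check of the real-matrix case (a sketch for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
X = np.linalg.pinv(A)                     # Moore-Penrose generalized inverse

print(np.allclose(A @ X @ A, A))          # (1) A X A = A
print(np.allclose(X @ A @ X, X))          # (2) X A X = X
print(np.allclose((A @ X).T, A @ X))      # (3) (A X)^H = A X
print(np.allclose((X @ A).T, X @ A))      # (4) (X A)^H = X A
```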

Definition 2.5

(QNN, [40]) Given \(\varvec{X}\in \mathbb {Q}^{m \times n}\), the nuclear norm of the quaternion matrix is

$$\begin{aligned} \Vert {\varvec{X}}\Vert _{*}=\sum _{i=1}^{\mathrm{{min}}(m,n)} \sigma _{i}(\varvec{X}), \end{aligned}$$
(2.2)

where \(\sigma _{i}(\varvec{X})\) denotes the i-th singular value of \(\varvec{X}\).

Lemma 2.1

(Binary factorization framework, [35]) Let the quaternion matrix \(\varvec{A}\in \mathbb {Q}^{m \times n}\) with \(\textrm{rank}(\varvec{A})=r\le d\). Then the binary factorization framework is devised as : 

$$\begin{aligned} \varvec{A}=\varvec{U}\varvec{V}^{H}, \end{aligned}$$
(2.3)

where \(\varvec{U}\in \mathbb {Q}^{m \times d}\) and \(\varvec{V}\in \mathbb {Q}^{n \times d}\) such that \(\textrm{rank}(\varvec{U})=\textrm{rank}(\varvec{V})=r\).

Lemma 2.2

(Quaternion Singular Value Decomposition (QSVD), [19]) Let \(\varvec{A} \in \mathbb {Q}^{m \times n}\), then there exist two unitary quaternion matrices \(\varvec{U} \in \mathbb {Q}^{m \times m}\) and \(\varvec{V} \in \mathbb {Q}^{n \times n}\) such that

$$\begin{aligned} \varvec{A}=\varvec{U}\varvec{\Sigma }\varvec{V}^{H}, \end{aligned}$$

where \(\varvec{\Sigma }=\textrm{diag}(\sigma _{1}, \sigma _{2},\cdots ,\sigma _{l})\), \(\sigma _{1}\ge \sigma _{2}\ge \cdots \ge \sigma _{l}\ge 0\), and the diagonal elements of \(\varvec{\Sigma }\) are all the nonnegative singular values of the matrix \(\varvec{A}\) and \(l=\textrm{min}(m,n)\).
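In practice, the QSVD can be computed through the complex adjoint (equivalent complex matrix) of \(\varvec{A}\): the singular values of the \(2m\times 2n\) adjoint are exactly those of \(\varvec{A}\), each repeated twice. The following numpy sketch recovers the quaternion singular values this way; it is an illustration only (the experiments in Section 5 instead use the structure-preserving toolbox routine \(\textbf{svdQ}\)).

```python
import numpy as np

def quat_singular_values(A0, A1, A2, A3):
    """Singular values of A = A0 + A1 i + A2 j + A3 k via its complex adjoint.

    With Ac = A0 + A1*1j and Bc = A2 + A3*1j, the adjoint
    chi(A) = [[Ac, Bc], [-conj(Bc), conj(Ac)]] has the singular values of A,
    each with multiplicity two; we keep one copy of each pair.
    """
    Ac = A0 + 1j * A1
    Bc = A2 + 1j * A3
    chi = np.block([[Ac, Bc],
                    [-np.conj(Bc), np.conj(Ac)]])
    s = np.linalg.svd(chi, compute_uv=False)   # sorted in descending order
    return s[::2]

# toy example: a random 4 x 3 quaternion matrix
rng = np.random.default_rng(0)
A0, A1, A2, A3 = (rng.standard_normal((4, 3)) for _ in range(4))
print(quat_singular_values(A0, A1, A2, A3))
```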

Definition 2.6

(QLN, [41]) Let \(\varvec{X}\in \mathbb {Q}^{m \times n}\). The logarithmic norm of the quaternion matrix with \(0\le p\le 1\) and \(\epsilon > 0\) is defined as

$$\begin{aligned} \Vert {\varvec{X}}\Vert ^{p}_{L}=\sum _{i=1}^{\mathrm{{min}}(m,n)} \mathrm{{log}}(\sigma _{i}^{p}(\varvec{X})+\epsilon ), \end{aligned}$$
(2.4)

where \(\sigma _{i}(\varvec{X})\) denotes the i-th singular value of \(\varvec{X}\).

Lemma 2.3

[41] Let the quaternion matrix \(\varvec{X}\in \mathbb {Q}^{m \times n}\) with \(\textrm{rank}(\varvec{X})=r\le d \le \textrm{min}\{m,n\}\). There exist \(\varvec{U}\in \mathbb {Q}^{m \times d}\) and \(\varvec{V}\in \mathbb {Q}^{n \times d}\) such that \(\varvec{X}=\varvec{U}\varvec{V}^{H}\). Then we have:

$$\begin{aligned} \Vert {\varvec{X}}\Vert ^{1/2}_{L}=\underset{\underset{\tiny {\varvec{X}=\varvec{U}\varvec{V}^{H}}}{\tiny {\varvec{U}},\tiny {\varvec{V}}}}{\text {min}}\quad \frac{1}{2}\Vert {\varvec{U}}\Vert _{L}^{1}+\frac{1}{2}\Vert {\varvec{V}}\Vert _{L}^{1}. \end{aligned}$$
(2.5)

Lemma 2.4

(Quaternion Logarithmic Singular Value Thresholding (QLSVT), [41]) Let the quaternion matrix \(\varvec{A}\in \mathbb {Q}^{m \times n}\) and \(\lambda > 0\). If the QSVD of \(\varvec{A}\) is \(\varvec{A}=\varvec{U}_{\tiny {\varvec{A}}}\varvec{\Sigma }_{\tiny {\varvec{A}}}\varvec{V}^{H}_{\tiny {\varvec{A}}}\), then the closed-form solution of the problem

$$\begin{aligned} \underset{\tiny {\varvec{X}}}{\mathrm{{arg\, min}}}\quad \frac{1}{2}\Vert \varvec{A}-\varvec{X}\Vert _{F}^{2}+\lambda \Vert \varvec{X}\Vert _{L}^{1} \end{aligned}$$
(2.6)

is provided by \(\varvec{X}=\varvec{U}_{\tiny {\varvec{A}}} \varvec{\Delta }_{\lambda ,\epsilon ,{\tiny {\varvec{A}}}}\varvec{V}^{H}_{\tiny {\varvec{A}}}\). Here, the soft thresholding operator \(\varvec{\Delta }_{\lambda ,\epsilon ,\tiny {\varvec{A}}}\) is defined as : 

$$\varvec{\Delta }_{\lambda ,\epsilon ,\varvec{A}}=\textrm{diag}(l_{\lambda ,\epsilon }(\sigma _1),\,l_{\lambda ,\epsilon }(\sigma _2),\,\cdots , l_{\lambda ,\epsilon }(\sigma _r),\,0,\,\cdots ,\,0)\in \mathbb {R}^{m\times n}$$

with

$$\begin{aligned} l_{\lambda ,\epsilon }(x)=\left\{ \begin{array}{ll} 0,& \delta \le 0,\\ \underset{a\in \{0,\frac{1}{2}(x-\epsilon +\sqrt{\delta })\}}{\mathrm{{arg\, min}}}\ h(a), & \delta > 0,\\ \end{array} \right. \end{aligned}$$
(2.7)

where \(\delta =(x-\epsilon )^{2}-4(\lambda -x\epsilon )\) and the function \(h: \mathbb {R}^{+}\rightarrow \mathbb {R}^{+}\) is defined as \(h(a):=\frac{1}{2}(a-x)^{2}+\lambda \textrm{log}(a+\epsilon )\).
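The shrinkage rule (2.7) acts on each singular value separately, so the operator is straightforward to implement once an SVD is available. The sketch below is a minimal numpy illustration that applies \(l_{\lambda ,\epsilon }\) after an ordinary SVD of a real (or complex) matrix; for quaternion matrices the same rule is applied to the singular values produced by the QSVD.

```python
import numpy as np

def log_thresh(x, lam, eps):
    """Scalar shrinkage l_{lam,eps}(x) in (2.7): minimize
    h(a) = 0.5*(a - x)^2 + lam*log(a + eps) over the candidate set."""
    delta = (x - eps) ** 2 - 4.0 * (lam - x * eps)
    if delta <= 0:
        return 0.0
    a = 0.5 * (x - eps + np.sqrt(delta))
    def h(t):
        return 0.5 * (t - x) ** 2 + lam * np.log(t + eps)
    return 0.0 if h(0.0) <= h(a) else a

def log_svt(A, lam, eps=1e-2):
    """Logarithmic singular value thresholding: U diag(l(sigma_i)) V^H."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    s_shr = np.array([log_thresh(si, lam, eps) for si in s])
    return U @ np.diag(s_shr) @ Vh

# usage: shrink a noisy low-rank matrix towards a low-rank estimate
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5)) @ rng.standard_normal((5, 30))
X = log_svt(A + 0.1 * rng.standard_normal(A.shape), lam=1.0)
```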

3 Quaternion matrix sparse representation model

Due to the pronounced non-local self-similarity of visual data, which often manifests as a low-rank structure, matrix sparse representation models address image representation problems through the following formulation:

$$\begin{aligned} \underset{\tiny {X}}{\mathrm{{min}}}\ \mathrm{{rank}}(X),\ \text {s.t.}\ B\approx AX, \end{aligned}$$
(3.1)

where \(\mathrm{{rank}}({X})\) is the rank function, B is the known data matrix, A is the constraint matrix, and X is the matrix to be found. Problem (3.1) constitutes a combinatorial optimization challenge, typically addressable by optimizing convex surrogates for the rank function [42].

Problem (3.1) briefly introduces the classic matrix sparse representation model, which is fundamentally designed for the sparse representation of grayscale images and other two-dimensional data. When processing color images, the model in (3.1) must handle the RGB channels separately, whereas the quaternion matrix sparse representation model treats the RGB channels jointly. Therefore, it can be expressed as:

$$\begin{aligned} \underset{\tiny {\varvec{X}}}{\mathrm{{min}}}\ \mathrm{{rank}}(\varvec{X}),\ \text {s.t.}\ \varvec{B}\approx \varvec{AX}, \end{aligned}$$
(3.2)

where \(\mathrm{{rank}}(\varvec{X})\) is the quaternion matrix rank function, \(\varvec{B}\) is the known quaternion matrix, \(\varvec{A}\) is the constraint quaternion matrix, and \(\varvec{X}\) is the quaternion matrix to be found.

The main sparse representation model in quaternion algebra primarily focuses on low-rank minimization. Similar to the classical matrix sparse representation model, the rank function in model (3.2) is challenging to optimize. Therefore, according to Definition 2.5, the low-rank minimization model can be expressed as:

$$\begin{aligned} \underset{\tiny {\varvec{X}}}{\mathrm{{min}}}\ \Vert {\varvec{X}}\Vert _{*},\ \text {s.t.}\ \varvec{B}\approx \varvec{AX}. \end{aligned}$$
(3.3)

The rank of a quaternion matrix, as in the real case, is a real number. As depicted in Fig. 1, QLN provides a finer approximation than QNN. Furthermore, drawing on both the bi-factor surrogate theorem for the matrix logarithmic norm mentioned in [37] and Lemma 2.1, it becomes feasible to formulate the bi-factor surrogate theorem for QLN. Therefore, combining Definition 2.6 and Lemma 2.3, this paper proposes two quaternion sparse representation models: Quaternion Logarithmic Norm Factorization Sparse Representation (QLNFSR) and Truncated Quaternion Logarithmic Norm Sparse Representation (TQLNSR), as follows.

3.1 Quaternion logarithmic norm factorization sparse representation

According to Definition 2.6 and Lemma 2.3, on the basis of model (3.3), the following sparse representation model based on quaternion framework is proposed:

$$\begin{aligned} \underset{\tiny {\varvec{X}}}{\mathrm{{min}}}\ \Vert {\varvec{X}}\Vert ^{1/2}_{L} ,~\text {s.t.}\ \varvec{B}\approx \varvec{A}\varvec{X}, \end{aligned}$$
(3.4)

where \(\varvec{B}\) is the known quaternion matrix, \(\varvec{A}\) is the constraint quaternion matrix, and \(\varvec{X}\) is the quaternion matrix to be found. Our aim is to minimize the disparity between \(\varvec{B}\) and \(\varvec{AX}\), while simultaneously reducing the rank of \(\varvec{X}\) to its lowest possible value. Combining the aforementioned objectives into a single equation allows us to represent (3.4) as:

$$\begin{aligned} \underset{\tiny {\varvec{X}},\tiny {\varvec{Y}},\tiny {\varvec{Z}},\tiny {\varvec{M}},\tiny {\varvec{N}}}{\mathrm{{min}}}\ \frac{\lambda }{2}(\Vert \varvec{Y}\Vert _{L}^{1}+\Vert \varvec{Z}\Vert _{L}^{1})+\frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{X}\Vert _{F}^{2},~\text {s.t.}\ \varvec{Y}=\varvec{M}, \varvec{Z}=\varvec{N}, \varvec{X}=\varvec{M}\varvec{N}^{H}. \end{aligned}$$
(3.5)

It’s worth noting that (3.5) decomposes the interconnected terms, enabling them to be tackled separately. Subsequently, the challenges posed by (3.5) can be addressed using the ADMM framework.

Initially, we address the task described in (3.5) by minimizing the augmented Lagrangian function given by:

$$\begin{aligned} {\begin{matrix} & \mathcal {L}(\varvec{X},\varvec{Y},\varvec{Z},\varvec{M},\varvec{N},\varvec{F}_{1},\varvec{F}_{2},\varvec{F}_{3},\alpha ) \\ & =\frac{\lambda }{2}( \Vert \varvec{Y}\Vert _{L}^{1}+\Vert {\varvec{Z}}\Vert _{L}^{1})+\frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{X}\Vert _{F}^{2}+\mathfrak {R}(\langle \varvec{F}_{1}, \varvec{M}-\varvec{Y}\rangle )+\mathfrak {R}(\langle \varvec{F}_{2}, \varvec{N}-\varvec{Z}\rangle ) \\ & ~~+\mathfrak {R}(\langle \varvec{F}_{3}, \varvec{X}-\varvec{M}\varvec{N}^{H}\rangle ) +\frac{\alpha }{2}(\Vert \varvec{Y}-\varvec{M}\Vert _{F}^{2}+\Vert \varvec{Z}-\varvec{N}\Vert _{F}^{2}+\Vert \varvec{X}-\varvec{M}\varvec{N}^{H}\Vert _{F}^{2}), \\ \end{matrix}} \end{aligned}$$
(3.6)

where \(\varvec{F}_{1}\), \(\varvec{F}_{2}\) and \(\varvec{F}_{3}\) are Lagrange multipliers, \(\alpha >0\) is the penalty parameter.

Updating \(\varvec{M}\) and \(\varvec{N}\)

In the \((k+1)\)-th iteration, while keeping the other variables at their most recent values, \(\varvec{M}\) and \(\varvec{N}\) are determined as the optimal solutions of the subsequent problems:

$$\begin{aligned} \begin{aligned} \varvec{M}^{k +1} =&\underset{\tiny {\varvec{M}}}{\mathrm{{arg\, min}}}\ \frac{1}{2}\Vert \varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}-\varvec{M}(\varvec{N}^{k})^{H} \Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{M}-\varvec{Y}^{k } +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}\Vert _{F}^{2}. \\ \varvec{N}^{k +1} =&\underset{\tiny {\varvec{N}}}{\mathrm{{arg\, min}}}\ \frac{1}{2}\Vert \varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}-\varvec{M}^{k+1}\varvec{N}^{H}\Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{N}-\varvec{Z}^{k } +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\Vert _{F}^{2}. \end{aligned} \end{aligned}$$
(3.7)

Let

$$\mathcal {Q}(\varvec{M}):=\frac{1}{2}\Vert \varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}-\varvec{M}(\varvec{N}^{k})^{H}\Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{M}-\varvec{Y}^{k} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}\Vert _{F}^{2}$$

and

$$\mathcal {P}(\varvec{N}):=\frac{1}{2}\Vert \varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}-\varvec{M}^{k+1}\varvec{N}^{H}\Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{N}-\varvec{Z}^{k} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\Vert _{F}^{2}.$$

Applying the relevant principles regarding quaternion matrix derivatives as outlined in [43], the gradient of \(\mathcal {Q}(\varvec{M})\) can be calculated as

$$\begin{aligned} \frac{\partial \mathcal {Q}(\varvec{M})}{\partial \varvec{M}}=\varvec{M}(\varvec{N}^{k})^{H}\varvec{N}^{k}-(\varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})\varvec{N}^{k}+\varvec{M}-\varvec{Y}^{k} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}. \end{aligned}$$
(3.8)

By setting (3.8) equal to zero, the solution can be derived as

$$\begin{aligned} \varvec{M}^{k +1} =\big [(\varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})\varvec{N}^{k}+\varvec{Y}^{k} -\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}\big ]\big [{I}+(\varvec{N}^{k})^{H}\varvec{N}^{k}\big ]^{-1}. \end{aligned}$$
(3.9)

Utilizing a comparable approach, we can derive the optimal solution for \(\varvec{N}^{k +1}\) as follows:

$$\begin{aligned} \varvec{N}^{k +1} =\big [(\varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})^{H}\varvec{M}^{k+1}+\varvec{Z}^{k} -\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\big ]\big [{I}+(\varvec{M}^{k+1})^{H}\varvec{M}^{k+1}\big ]^{-1}. \end{aligned}$$
(3.10)
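Both updates reduce to a right-hand linear solve with a small \(d\times d\) coefficient matrix. The following numpy sketch illustrates (3.9) and (3.10) with real matrices (a simplification for readability; in the quaternion case the transposes become conjugate transposes and the arithmetic is quaternionic, e.g. via a structure-preserving or adjoint representation).

```python
import numpy as np

def update_M(X, F3, Y, F1, N, alpha):
    """(3.9): M = [(X + F3/alpha) N + Y - F1/alpha] (I + N^T N)^{-1}."""
    G = np.eye(N.shape[1]) + N.T @ N            # d x d, symmetric positive definite
    rhs = (X + F3 / alpha) @ N + Y - F1 / alpha
    return np.linalg.solve(G, rhs.T).T          # right-multiplication by G^{-1}

def update_N(X, F3, Z, F2, M, alpha):
    """(3.10): N = [(X + F3/alpha)^T M + Z - F2/alpha] (I + M^T M)^{-1}."""
    G = np.eye(M.shape[1]) + M.T @ M
    rhs = (X + F3 / alpha).T @ M + Z - F2 / alpha
    return np.linalg.solve(G, rhs.T).T
```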

Updating \(\varvec{Y}\) and \(\varvec{Z}\)

In the \((k+1)\)-th iteration, while maintaining the remaining variables at their most recent values, \(\varvec{Y}^{k+1}\) and \(\varvec{Z}^{k+1}\) represent the optimal solutions of the subsequent problems:

$$\begin{aligned} \left\{ \begin{array}{l} \varvec{Y}^{k +1} =\underset{\tiny {\varvec{Y}}}{\mathrm{{arg\, min}}}\ \frac{1}{2}\Vert \varvec{M}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}-\varvec{Y}\Vert _{F}^{2}+\frac{\lambda }{2\alpha ^{k}}\Vert \varvec{Y}\Vert _{L}^{1},\\ \varvec{Z}^{k +1} =\underset{\tiny {\varvec{Z}}}{\mathrm{{arg\, min}}}\ \frac{1}{2}\Vert \varvec{N}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}-\varvec{Z}\Vert _{F}^{2}+\frac{\lambda }{2\alpha ^{k}}\Vert \varvec{Z}\Vert _{L}^{1}. \end{array} \right. \end{aligned}$$
(3.11)

According to Lemma 2.4, we can apply the QLSVT technique to (3.11) to update \(\varvec{Y}^{k +1}\) and \(\varvec{Z}^{k+1}\), namely

$$\begin{aligned} \left\{ \begin{array}{l} \varvec{Y}^{k+1}=\varvec{U}_{\tiny {\varvec{S}_1}}\varvec{\Delta }_{\frac{\lambda }{2\alpha ^{k}},\epsilon ,\tiny {\varvec{\Sigma }_{\tiny {\varvec{S}_1}}}}\varvec{V}^{H}_{\tiny {\varvec{S}_1}}, \\ \varvec{Z}^{k+1}=\varvec{U}_{\tiny {\varvec{S}_2}}\varvec{\Delta }_{\frac{\lambda }{2\alpha ^{k}},\epsilon ,\tiny {\varvec{\Sigma }_{\tiny {\varvec{S}_2}}}}\varvec{V}^{H}_{\tiny {\varvec{S}_2}}, \end{array} \right. \end{aligned}$$
(3.12)

where \(\varvec{S}_{1}=\varvec{M}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}\) and \(\varvec{S}_{2}=\varvec{N}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\). Let \(\varvec{S}_{1}=\varvec{U}_{\tiny {\varvec{S}_1}}\varvec{\Sigma }_{\tiny {\varvec{S}_1}}\varvec{V}^{H}_{\tiny {\varvec{S}_1}}\) and \(\varvec{S}_{2}=\varvec{U}_{\tiny {\varvec{S}_2}}\varvec{\Sigma }_{\tiny {\varvec{S}_2}}\varvec{V}^{H}_{\tiny {\varvec{S}_2}}\) be the QSVD of quaternion matrices \(\varvec{S}_{1}\) and \(\varvec{S}_{2}\), respectively.

Updating \(\varvec{X}\)

In the \((k+1)\)-th iteration, fixing the other variables at their latest values, \(\varvec{X}\) is the optimal solution of the following problem:

$$\begin{aligned} \varvec{X}^{k +1} =\underset{\tiny {\varvec{X}}}{\mathrm{{arg\, min}}}\ \frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{X} \Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{X}-\varvec{M}^{k+1}(\varvec{N}^{k+1})^{H} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}\Vert _{F}^{2}. \end{aligned}$$
(3.13)

Let

$$\mathcal {H}(\varvec{X}):=\frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{X} \Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{X}-\varvec{M}^{k+1}(\varvec{N}^{k+1})^{H} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}\Vert _{F}^{2}.$$

The gradient of \(\mathcal {H}(\varvec{X})\) can be calculated as

$$\begin{aligned} \frac{\partial \mathcal {H}(\varvec{X})}{\partial \varvec{X}}=\rho \varvec{A}^{H}(\varvec{A}\varvec{X}-\varvec{B})+(\varvec{X}+\frac{\varvec{F}^{k}_{3}}{\alpha ^{k}}-\varvec{M}^{k+1}(\varvec{N}^{k+1})^{H}). \end{aligned}$$
(3.14)

Setting (3.14) to zero, we can obtain a unique solution

$$\begin{aligned} \varvec{X}^{k +1} =\big [{I}+\rho \varvec{A}^{H}\varvec{A}\big ]^{-1} \big [\rho \varvec{A}^{H}\varvec{B}+ \varvec{M}^{k+1}(\varvec{N}^{k+1})^{H}-\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}\big ]. \end{aligned}$$
(3.15)
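Similarly, (3.15) is a single linear solve whose coefficient matrix \(I+\rho \varvec{A}^{H}\varvec{A}\) does not change across iterations and can therefore be prefactorized. A minimal numpy sketch with real matrices (a hedged simplification, as above):

```python
import numpy as np

def update_X(A, B, M, N, F3, alpha, rho):
    """(3.15): X = (I + rho A^T A)^{-1} (rho A^T B + M N^T - F3/alpha)."""
    lhs = np.eye(A.shape[1]) + rho * (A.T @ A)   # constant across iterations
    rhs = rho * (A.T @ B) + M @ N.T - F3 / alpha
    return np.linalg.solve(lhs, rhs)
```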

Updating \(\varvec{F}_{1}\), \(\varvec{F}_{2}\), \(\varvec{F}_{3}\) and \(\alpha \)

The update formats are as follows:

$$\begin{aligned} \left\{ \begin{array}{l} \varvec{F}_{1}^{k+1}=\varvec{F}_{1}^{k}+\alpha ^{k}(\varvec{Y}^{k +1}-\varvec{M}^{k +1}), \\ \varvec{F}_{2}^{k+1}=\varvec{F}_{2}^{k}+\alpha ^{k}(\varvec{Z}^{k +1}-\varvec{N}^{k +1}), \\ \varvec{F}_{3}^{k+1}=\varvec{F}_{3}^{k}+\alpha ^{k}(\varvec{X}^{k +1}-\varvec{M}^{k +1}(\varvec{N}^{k+1})^{H}),\\ \alpha ^{k+1}=\textrm{min}(\beta \alpha ^{k},\alpha _{\max }). \end{array} \right. \end{aligned}$$
(3.16)

Algorithm 1 outlines the complete process, detailing each step sequentially.

Algorithm 1  QLNFSR algorithm.

3.2 Truncated quaternion logarithmic norm sparse representation

The truncated nuclear norm achieves a better approximation of the rank function than the nuclear norm. Motivated by this property, the method adopted in this paper combines the truncation technique with the QLN; the definition of the truncated logarithmic norm for quaternion matrices is introduced below.

Definition 3.1

(TQLN, [10]) Given \(\varvec{X}\in \mathbb {Q}^{m \times n}\), the truncated logarithmic norm of the quaternion matrix with \(0\le p\le 1\) and \(\epsilon > 0\) is defined as the sum of the logarithmic functions of the \(\textrm{min}(m,n)-r\) smallest singular values:

$$\begin{aligned} \Vert {\varvec{X}}\Vert ^{p}_{L,r}=\sum _{i=r+1}^{\mathrm{{min}}(m,n)} \mathrm{{log}}(\sigma _{i}^{p}(\varvec{X})+\epsilon ), \end{aligned}$$
(3.17)

where \(\sigma _{i}(\varvec{X})\) denotes the i-th singular value of \(\varvec{X}\).

Given that the largest singular values do not affect the rank, they are disregarded in TQLN. Subsequently, the focus shifts towards optimizing the smallest \(\textrm{min}(m,n)-r\) singular values to achieve a more precise low-rank estimation. According to the TQLN principle, the sparse representation process based on the low-rank minimization model (3.2) can be expressed as follows:

$$\begin{aligned} \underset{\tiny {\varvec{X}}}{\mathrm{{min}}}\ \Vert {\varvec{X}}\Vert ^{p}_{L,r} ,~\text {s.t.}\ \varvec{B}\approx \varvec{A}\varvec{X}. \end{aligned}$$
(3.18)

Lemma 3.1

[41] Let \(\varvec{X} \in \mathbb {Q}^{m \times n}\), and let the matrices \(\varvec{U} \in \mathbb {Q}^{r \times m}\) and \(\varvec{V} \in \mathbb {Q}^{r \times n}\) satisfy \(\varvec{U}\varvec{U}^{H}=I_{r}\), \(\varvec{V}\varvec{V}^{H}=I_{r}\). Here, r is any integer \((r\le \textrm{min}(m,n))\). Then we have

$$ \left\| \textrm{tr}(\varvec{U}\varvec{X}\varvec{V}^{H})\right\| \le \sum _{i=1}^{r}\sigma _{i}(\varvec{X}),~~\textrm{max}\left\| \textrm{tr}(\varvec{U}\varvec{X}\varvec{V}^{H})\right\| =\sum _{i=1}^{r}\sigma _{i}(\varvec{X}). $$
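Lemma 3.1 is the quaternion analogue of a von Neumann-type trace bound, and the maximum is attained by the leading singular vectors. The following numpy check illustrates the real-matrix analogue (an illustration of the real case only, not of the quaternion statement itself):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 8, 6, 3
X = rng.standard_normal((m, n))

U_full, s, Vh = np.linalg.svd(X)
U = U_full[:, :r].T                     # r x m, orthonormal rows
V = Vh[:r, :]                           # r x n, orthonormal rows

# the maximum of |tr(U X V^T)| is attained by the leading singular vectors
print(np.trace(U @ X @ V.T), s[:r].sum())        # the two values coincide

# any other row-orthonormal pair stays below the same bound
Q1, _ = np.linalg.qr(rng.standard_normal((m, r)))
Q2, _ = np.linalg.qr(rng.standard_normal((n, r)))
print(abs(np.trace(Q1.T @ X @ Q2)), "<=", s[:r].sum())
```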

Based on Definition 3.1 and Lemma 3.1, we introduce a sparse representation model utilizing the quaternion-based framework:

$$\begin{aligned} \underset{\tiny {\varvec{X}}}{\mathrm{{min}}}\ \lambda \Vert {\varvec{X}}\Vert ^{p}_{L}-\underset{\tiny {\varvec{C}\varvec{C}^{H}=I_{r}},\tiny {\varvec{D}\varvec{D}^{H}=I_{r}}}{\mathrm{{max}}}\ \left\| \textrm{tr}(\varvec{C}\varvec{X}\varvec{D}^{H})\right\| ,~\text {s.t.}\ \varvec{B}\approx \varvec{A}\varvec{X}. \end{aligned}$$
(3.19)

The procedure is outlined in Algorithm 2.

Algorithm 2  TQLNSR algorithm.

In Algorithm 2, \(\varvec{C}^{k}\) and \(\varvec{D}^{k}\) are first obtained by QSVD of \(\varvec{X}^{k}\). Next, we will focus on \(\textbf{Step}~5\) of Algorithm 2, which can be expressed as the following formula:

$$\begin{aligned} \underset{\tiny {\varvec{X}},\tiny {\varvec{K}}}{\mathrm{{min}}}\ \lambda \Vert {\varvec{X}}\Vert ^{p}_{L}- \left\| \textrm{tr}(\varvec{C}^{k}\varvec{K}(\varvec{D}^{k})^{H})\right\| +\frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{K}\Vert _{F}^{2} ,~\text {s.t.}\ \varvec{X}=\varvec{K}. \end{aligned}$$
(3.20)

It’s worth noting that problem (3.20) can be addressed using the ADMM framework. Initially, we tackle problem (3.20) by minimizing the augmented Lagrangian function provided below:

$$\begin{aligned} \mathcal {L}(\varvec{K},\varvec{X},\varvec{F},\alpha ) = \lambda \Vert {\varvec{X}}\Vert ^{p}_{L}- \left\| \textrm{tr}(\varvec{C}^{k}\varvec{K}(\varvec{D}^{k})^{H})\right\| +\frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{K}\Vert _{F}^{2} +\mathfrak {R}(\langle \varvec{F}, \varvec{K}-\varvec{X}\rangle )+\frac{\alpha }{2}\Vert \varvec{K}-\varvec{X}\Vert _{F}^{2} , \end{aligned}$$
(3.21)

where \(\alpha \) and \(\varvec{F}\) are a positive penalty parameter and a Lagrange multiplier, respectively.

Updating \(\varvec{K}\)

In the \((k+1)\)-th iteration, fixing the other variables at their latest values, \(\varvec{K}\) is the optimal solution of the following problem:

$$\begin{aligned} \begin{aligned} \varvec{K}^{k +1}&= \underset{\tiny {\varvec{K}}}{\mathrm{{arg\, min}}}\ \frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{K}\Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{K}-\varvec{X}^{k } +\frac{1}{\alpha ^{k}}\varvec{F}^{k}\Vert _{F}^{2}- \left\| \textrm{tr}(\varvec{C}^{k}\varvec{K}(\varvec{D}^{k})^{H}) \right\| \\&=\underset{\tiny {\varvec{K}}}{\mathrm{{arg\, min}}}\ \frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{K}\Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{K}-\varvec{X}^{k } +\frac{1}{\alpha ^{k}}(\varvec{F}^{k}-(\varvec{C}^{k})^{H}\varvec{D}^{k})\Vert _{F}^{2}. \end{aligned} \end{aligned}$$
(3.22)

Let

$$\mathcal {W}(\varvec{K}):=\frac{\rho }{2}\Vert \varvec{B}-\varvec{A}\varvec{K}\Vert _{F}^{2}+\frac{1}{2}\Vert \varvec{K}-\varvec{X}^{k } +\frac{1}{\alpha ^{k}}(\varvec{F}^{k}-(\varvec{C}^{k})^{H}\varvec{D}^{k})\Vert _{F}^{2}.$$

Applying the relevant principles regarding quaternion matrix derivatives as outlined in [43], the gradient of \(\mathcal {W}(\varvec{K})\) can be calculated as

$$\begin{aligned} \frac{\partial \mathcal {W}(\varvec{K})}{\partial \varvec{K}}=-\rho \varvec{A}^{H}(\varvec{B}-\varvec{A}\varvec{K})+\varvec{K}-\varvec{X}^{k}+\frac{1}{\alpha ^{k}}(\varvec{F}^{k}-(\varvec{C}^{k})^{H}\varvec{D}^{k}). \end{aligned}$$
(3.23)

Setting (3.23) to zero, we can obtain a unique solution

$$\begin{aligned} \varvec{K}^{k +1} =\big [{I}+\rho \varvec{A}^{H}\varvec{A}\big ]^{-1}\big [\rho \varvec{A}^{H}\varvec{B}+\varvec{X}^{k}-\frac{1}{\alpha ^{k}}(\varvec{F}^{k}-(\varvec{C}^{k})^{H}\varvec{D}^{k})\big ]. \end{aligned}$$
(3.24)

Updating \(\varvec{X}\)

In the \((k+1)\)-th iteration, while keeping the other variables constant at their most recent values, \(\varvec{X}^{k+1}\) represents the optimal solution of the subsequent problem:

$$\begin{aligned} \varvec{X}^{k +1} =\underset{\tiny {\varvec{X}}}{\mathrm{{arg\, min}}}\ \frac{1}{2}\Vert \varvec{K}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}-\varvec{X}\Vert _{F}^{2}+\frac{\lambda }{\alpha ^{k}}\Vert \varvec{X}\Vert _{L}^{p}. \end{aligned}$$
(3.25)

By Lemma 2.4, the QLSVT can be applied to (3.25) for updating \(\varvec{X}^{k +1}\) when \(p=1\), that is

$$\begin{aligned} \begin{aligned} \varvec{X}^{k+1}&=\varvec{U}_{\tiny {\varvec{S}_3}}\varvec{\Delta }_{\frac{\lambda }{\alpha ^{k}},\epsilon ,\tiny {\varvec{\Sigma }_{\tiny {\varvec{S}_3}}}}\varvec{V}^{H}_{\tiny {\varvec{S}_3}}, \end{aligned} \end{aligned}$$
(3.26)

where \(\varvec{S}_{3}=\varvec{K}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}\). Let \(\varvec{S}_{3}=\varvec{U}_{\tiny {\varvec{S}_3}}\varvec{\Sigma }_{\tiny {\varvec{S}_3}}\varvec{V}^{H}_{\tiny {\varvec{S}_3}}\) be the QSVD of quaternion matrices \(\varvec{S}_{3}\).

Updating \(\varvec{F}\) and \(\alpha \)

The update formats are as follows:

$$\begin{aligned} \left\{ \begin{array}{l} \varvec{F}^{k+1}=\varvec{F}^{k}+\alpha ^{k}(\varvec{X}^{k +1}-\varvec{K}^{k +1}), \\ \alpha ^{k+1}=\textrm{min}(\beta \alpha ^{k},\alpha _{\max }). \end{array} \right. \end{aligned}$$
(3.27)

Algorithm 3 outlines the complete process, detailing each step sequentially.

Algorithm 3  The ADMM optimization process used in Step 5 of the TQLNSR algorithm.

The termination condition for Algorithms 1, 2 and 3 is defined as the following relative error:

$$ \text {RE}:=\frac{\Vert \varvec{A}\varvec{X}^{k}-\varvec{B}\Vert _{F}^{2}}{\Vert \varvec{X}^{k}\Vert _{F}^{2}}\le \textrm{tol}, $$

where \(\textrm{tol}>0\) is the stopping tolerance. In the experiments, we set \(\textrm{tol}\)=1e-4.
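In code, the stopping rule amounts to a single relative-error evaluation per iteration; a minimal real-matrix numpy sketch reads:

```python
import numpy as np

def converged(A, X, B, tol=1e-4):
    """RE = ||A X - B||_F^2 / ||X||_F^2 <= tol (real-matrix illustration)."""
    re = np.linalg.norm(A @ X - B, 'fro') ** 2 / np.linalg.norm(X, 'fro') ** 2
    return re <= tol
```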

4 Convergence analysis

No definite convergence property of the ADMM has been established for non-convex problems (or convex problems involving more than two blocks of variables), even within the real number field. Therefore, concerning the formidable problems (3.5) and (3.20), we empirically demonstrate their convergence behavior. Additionally, alongside the empirical observations, we provide a weak convergence property for problem (3.5) (similarly for problem (3.20)), subject to mild conditions, as outlined in the following theorems.

Theorem 4.1

Let \(\Theta _{1}:=(\varvec{X},\varvec{Y},\varvec{Z},\varvec{M},\varvec{N},\varvec{F}_{1},\varvec{F}_{2},\varvec{F}_{3})\) and \(\{\Theta _{1}^{k}\}^{\infty }_{k=1}\) be generated by Algorithm 1. Assume that \(\{\Theta _{1}^{k}\}^{\infty }_{k=1}\) is bounded, and \(\{\alpha ^{k}\}^{\infty }_{k=1}\) is non-decreasing and bounded. Then,

(1)  \(\{\varvec{X}^{k}\},\{\varvec{Y}^{k}\},\{\varvec{Z}^{k}\},\{\varvec{M}^{k}\}\) and \(\{\varvec{N}^{k}\}\) are Cauchy sequences;

(2)  any accumulation point of \(\{\Theta _{1}^{k}\}^{\infty }_{k=1}\) satisfies the Karush-Kuhn-Tucker (KKT) conditions for the problem (3.5).

Proof

(1)  According to (3.16), we have

$$\begin{aligned}&\varvec{M}^{k +1}-\varvec{Y}^{k +1}=\frac{1}{\alpha ^{k}}(\varvec{F}_{1}^{k+1}-\varvec{F}_{1}^{k}), \\&\varvec{N}^{k +1}-\varvec{Z}^{k +1}=\frac{1}{\alpha ^{k}}(\varvec{F}_{2}^{k+1}-\varvec{F}_{2}^{k}), \\&\varvec{X}^{k +1}-\varvec{M}^{k +1}(\varvec{N}^{k+1})^{H}=\frac{1}{\alpha ^{k}}(\varvec{F}_{3}^{k+1}-\varvec{F}_{3}^{k}). \end{aligned}$$

Due to the assumptions that the quaternion matrix sequences \(\{\varvec{F}_{1}^{k}\},\{\varvec{F}_{2}^{k}\}\) and \(\{\varvec{F}_{3}^{k}\}\) are bounded, we have

$$\begin{aligned}&\sum _{k =0}^{\infty } \Vert \varvec{M}^{k +1}-\varvec{Y}^{k +1}\Vert _{F}\le \sum _{k =0}^{\infty }\frac{\alpha ^{k+1}}{(\alpha ^{k})^2}\Vert \varvec{F}_{1}^{k+1}-\varvec{F}_{1}^{k}\Vert _{F}< \infty , \\&\sum _{k =0}^{\infty } \Vert \varvec{N}^{k +1}-\varvec{Z}^{k +1}\Vert _{F}\le \sum _{k =0}^{\infty }\frac{\alpha ^{k+1}}{(\alpha ^{k})^2}\Vert \varvec{F}_{2}^{k+1}-\varvec{F}_{2}^{k}\Vert _{F}< \infty , \\&\sum _{k =0}^{\infty } \Vert \varvec{X}^{k +1}-\varvec{M}^{k +1}(\varvec{N}^{k+1})^{H}\Vert _{F}\le \sum _{k =0}^{\infty }\frac{\alpha ^{k+1}}{(\alpha ^{k})^2}\Vert \varvec{F}_{3}^{k+1}-\varvec{F}_{3}^{k}\Vert _{F}< \infty , \end{aligned}$$

which imply that

$$\begin{aligned}&\lim \limits _{k \rightarrow \infty }\Vert \varvec{M}^{k +1}-\varvec{Y}^{k +1}\Vert _{F}=0, \\&\lim \limits _ {k \rightarrow \infty }\Vert \varvec{N}^{k +1}-\varvec{Z}^{k +1}\Vert _{F}=0, \\&\lim \limits _{k \rightarrow \infty }\Vert \varvec{X}^{k +1}-\varvec{M}^{k +1}(\varvec{N}^{k+1})^{H}\Vert _{F}=0. \end{aligned}$$

Hence, \(\{(\varvec{X}^{k},\varvec{Y}^{k},\varvec{Z}^{k},\varvec{M}^{k},\varvec{N}^{k})\}\) indeed approaches a feasible solution.

Next, we show that \(\{\varvec{M}^{k}\}\) and \(\{\varvec{N}^{k}\}\) are Cauchy sequences. Note that

$$\begin{aligned} \left\{ \begin{array}{l} \varvec{F}_{1}^{k}=\varvec{F}_{1}^{k-1}+\alpha ^{k-1}(\varvec{Y}^{k }-\varvec{M}^{k }), \\ \varvec{F}_{2}^{k}=\varvec{F}_{2}^{k-1}+\alpha ^{k-1}(\varvec{Z}^{k}-\varvec{N}^{k}), \\ \varvec{F}_{3}^{k}=\varvec{F}_{3}^{k-1}+\alpha ^{k-1}(\varvec{X}^{k}-\varvec{M}^{k}(\varvec{N}^{k})^{H}). \end{array} \right. \end{aligned}$$

Then, from (3.12) and (3.13), we have

$$\begin{aligned}&\frac{1}{2}\big ( \varvec{M}^{k +1}-\varvec{Y}^{k} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}\big ) +\frac{1}{2}\big (\varvec{M}^{k +1}(\varvec{N}^{k})^{H}-\varvec{X}^{k}-\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}\big )\varvec{N}^{k} \nonumber \\ =&\frac{1}{2}\big ( \varvec{M}^{k +1}-\varvec{M}^{k}+\varvec{M}^{k}-\varvec{Y}^{k} +\frac{\varvec{F}^{k}_{1}}{\alpha ^{k}}\big ) +\frac{1}{2}\big (\varvec{M}^{k +1}(\varvec{N}^{k})^{H}-\varvec{M}^{k}(\varvec{N}^{k})^{H}-\frac{1}{\alpha ^{k-1}}\varvec{F}^{k}_{3}\nonumber \\&+\frac{1}{\alpha ^{k-1}}\varvec{F}^{k-1}_{3}-\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}\big )\varvec{N}^{k} \nonumber \\ =&\frac{1}{2}[(\varvec{M}^{k +1}-\varvec{M}^{k})(I+(\varvec{N}^{k})^{H}(\varvec{N}^{k}))+\frac{1}{\alpha ^{k-1}}(\varvec{F}_{1}^{k}-\varvec{F}_{1}^{k-1})+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1} -(\frac{1}{\alpha ^{k-1}}(\varvec{F}^{k}_{3}-\varvec{F}^{k-1}_{3})\nonumber \\&+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})\varvec{N}^{k}]=O, \end{aligned}$$
(4.1)

and

$$\begin{aligned}&\frac{1}{2}\big ( \varvec{N}^{k +1}-\varvec{Z}^{k} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\big ) +\frac{1}{2}\big (\varvec{N}^{k+1}(\varvec{M}^{k+1})^{H}-(\varvec{X}^{k})^{H}-\frac{1}{\alpha ^{k}}(\varvec{F}^{k}_{3})^{H}\big )\varvec{M}^{k+1} \nonumber \\ =&\frac{1}{2}\big ( \varvec{N}^{k +1}-\varvec{N}^{k}+\varvec{N}^{k}-\varvec{Z}^{k} +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\big ) +\frac{1}{2}\big (\varvec{N}^{k+1}(\varvec{M}^{k +1})^{H}-\varvec{N}^{k}(\varvec{M}^{k})^{H}-\frac{1}{\alpha ^{k-1}}\varvec{F}^{k}_{3}\nonumber \\&+\frac{1}{\alpha ^{k-1}}\varvec{F}^{k-1}_{3}-\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3}\big )\varvec{M}^{k+1}\nonumber \\ =&\frac{1}{2}[(\varvec{N}^{k +1}-\varvec{N}^{k})(I+(\varvec{M}^{k+1})^{H}(\varvec{M}^{k+1}))+\frac{1}{\alpha ^{k-1}}(\varvec{F}_{2}^{k}-\varvec{F}_{2}^{k-1})+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}\nonumber \\&+\varvec{N}^{k}( \varvec{M}^{k +1}-\varvec{M}^{k})^{H}\varvec{M}^{k +1}-(\frac{1}{\alpha ^{k-1}}(\varvec{F}^{k}_{3}-\varvec{F}^{k-1}_{3})+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})\varvec{M}^{k+1}]=O. \end{aligned}$$
(4.2)

Based on (4.1) and (4.2), we can respectively obtain

$$\begin{aligned}&\varvec{M}^{k +1}-\varvec{M}^{k}\nonumber \\&=2\big [\frac{1}{\alpha ^{k-1}}(\varvec{F}_{1}^{k-1}-\varvec{F}_{1}^{k})-\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1} +(\frac{1}{\alpha ^{k-1}}(\varvec{F}^{k}_{3}-\varvec{F}^{k-1}_{3})+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})\varvec{N}^{k}\big ][I+(\varvec{N}^{k})^{H}(\varvec{N}^{k})]^{-1} \end{aligned}$$
(4.3)

and

$$\begin{aligned}&\varvec{N}^{k +1}-\varvec{N}^{k} \nonumber \\&= 2\big [\frac{1}{\alpha ^{k-1}}(\varvec{F}_{2}^{k-1}-\varvec{F}_{2}^{k})-\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{2}+\varvec{N}^{k}( \varvec{M}^{k +1}-\varvec{M}^{k})^{H} \varvec{M}^{k +1}+\varvec{N}^{k}( \varvec{M}^{k +1}-\varvec{M}^{k})^{H} \nonumber \\&~~~~+(\frac{1}{\alpha ^{k-1}}(\varvec{F}^{k}_{3}-\varvec{F}^{k-1}_{3}) +\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{3})\varvec{M}^{k+1}\big ] \big [I+(\varvec{M}^{k+1})^{H}(\varvec{M}^{k+1})\big ]^{-1}. \end{aligned}$$
(4.4)

Recall that \(\alpha ^{k}=\textrm{min}(\beta \alpha ^{k-1},\alpha _{\max })\); since \(\{\alpha ^{k}\}^{\infty }_{k=1}\) is non-decreasing and bounded, we have

$$\begin{aligned} \sum _{k =0}^{\infty } \Vert \varvec{M}^{k +1}-\varvec{M}^{k}\Vert _{F}\le \sum _{k =0}^{\infty }\frac{\eta _{1}}{\alpha ^{k}}\le \sum _{k =0}^{\infty }\frac{\alpha ^{k+1}\eta _{1}}{(\alpha ^{k})^2} < \infty , \end{aligned}$$
(4.5)

where the constant \(\eta _{1}\) is defined as

$$\begin{aligned} \eta _{1}=&\textrm{max}\{2(\beta \Vert \varvec{F}_{1}^{k -1}-\varvec{F}_{1}^{k}\Vert _{F}+\Vert \varvec{F}_{1}^{k}\Vert _{F}+(\beta \Vert \varvec{F}_{3}^{k}-\varvec{F}_{3}^{k-1}\Vert _{F}+\Vert \varvec{F}_{3}^{k}\Vert _{F})\Vert \varvec{N}^{k}\Vert _{F})\\&\Vert \big (I+(\varvec{N}^{k})^{H}(\varvec{N}^{k})\big )^{-1}\Vert _{F}, k=1,2,\cdots \}. \end{aligned}$$

And it has

$$\begin{aligned}&\sum _{k =0}^{\infty } \Vert \varvec{N}^{k +1}-\varvec{N}^{k}\Vert _{F}\le \sum _{k =0}^{\infty }\frac{\eta _{2}}{\alpha ^{k}}+\sum _{k =0}^{\infty }\eta _{3}\Vert \varvec{M}^{k}-\varvec{M}^{k-1}\Vert _{F}\nonumber \\&\le \sum _{k =0}^{\infty }\frac{\alpha ^{k+1}\eta _{2}}{(\alpha ^{k})^2}+\sum _{k =0}^{\infty }\eta _{3}\Vert \varvec{M}^{k}-\varvec{M}^{k-1}\Vert _{F} < \infty , \end{aligned}$$
(4.6)

where the constants \(\eta _{2}\) and \(\eta _{3}\) are defined as

$$\begin{aligned} {\begin{matrix} \eta _{2}= & \textrm{max}\{2(\beta \Vert \varvec{F}_{2}^{k -1}-\varvec{F}_{2}^{k}\Vert _{F}+\Vert \varvec{F}_{2}^{k}\Vert _{F}+(\beta \Vert \varvec{F}_{3}^{k}-\varvec{F}_{3}^{k-1}\Vert _{F}+\Vert \varvec{F}_{3}^{k}\Vert _{F})\Vert \varvec{M}^{k+1}\Vert _{F})\\ & \Vert \big (I+(\varvec{M}^{k+1})^{H}(\varvec{M}^{k+1})\big )^{-1}\Vert _{F}, k=1,2,\cdots \} \end{matrix}} \end{aligned}$$

and

$$\eta _{3}=\textrm{max}\{2\Vert \varvec{N}^{k}\Vert _{F}\Vert \varvec{M}^{k+1}\Vert _{F}\Vert \big (I+(\varvec{M}^{k+1})^{H}(\varvec{M}^{k+1})\big )^{-1}\Vert _{F}, k=1,2,\cdots \}.$$

From (4.5) and (4.6), we know that \(\textrm{lim} _{k \rightarrow \infty }\Vert \varvec{M}^{k +1}-\varvec{M}^{k}\Vert _{F}=0\) and \(\textrm{lim} _{k \rightarrow \infty }\Vert \varvec{N}^{k +1}-\varvec{N}^{k}\Vert _{F}=0\). Hence, \(\{\varvec{M}^{k}\}\) and \(\{\varvec{N}^{k}\}\) are Cauchy sequences.

Next, we establish that the sequences \(\{\varvec{Y}^{k}\}\), \(\{\varvec{Z}^{k}\}\) and \(\{\varvec{X}^{k}\}\) are Cauchy sequences. Let \(\varvec{U}_{y}^{k}\varvec{\Sigma }_{y}^{k}(\varvec{V}_{y}^{H})^{k}\) be the QSVD of the matrix \(\varvec{M}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}_{1}\) in the \((k+1)\)-th iteration. Then, utilizing the QLSVT, we can get:

$$\begin{aligned} \varvec{Y}^{k+1}= \varvec{U}_{y}^{k}\varvec{\Delta }_{\frac{\lambda }{2\alpha ^{k}},\epsilon ,{\tiny {\varvec{\Sigma }_{y}^{k}}}}(\varvec{V}_{y}^{H})^{k}. \end{aligned}$$
(4.7)

Recall from Lemma 2.4 that the soft thresholding operator \(\varvec{\Delta }_{\lambda ,\epsilon ,\tiny {\varvec{A}}}\) is defined as

$$\varvec{\Delta }_{\lambda ,\epsilon ,\varvec{A}}=\textrm{diag}(l_{\lambda ,\epsilon }(\sigma _1),\,l_{\lambda ,\epsilon }(\sigma _2),\,\cdots , l_{\lambda ,\epsilon }(\sigma _r),\,0,\,\cdots ,\,0)\in \mathbb {R}^{m\times n}$$

with

$$\begin{aligned} l_{\lambda ,\epsilon }(x)=\left\{ \begin{array}{ll} 0,& \delta \le 0,\\ \underset{a\in \{0,\frac{1}{2}(x-\epsilon +\sqrt{\delta })\}}{\mathrm{{arg\, min}}}\ h(a), & \delta > 0,\\ \end{array} \right. \end{aligned}$$
(4.8)

where \(\delta =(x-\epsilon )^{2}-4(\lambda -x\epsilon )\) and the function \(h: \mathbb {R}^{+}\rightarrow \mathbb {R}^{+}\) is defined as \(h(a):=\frac{1}{2}(a-x)^{2}+\lambda \textrm{log}(a+\epsilon )\). Let \(x_{0}\) denote an arbitrary singular value \((x_{0}\ge 0)\) and \(g(\lambda ,x_{0})=x_{0}-l_{\lambda ,\epsilon }(x_{0})\); then we can obtain \(\textrm{max}\big (g(\lambda ,x_{0})\big )=2\sqrt{\lambda }\) as follows.

Case 1: \(\delta \le 0\). From the definition of \(\delta \), we have \(0\le x_{0}\le 2\sqrt{\lambda }-\epsilon \). Moreover, \({l}_{\lambda ,\epsilon }(x_{0})=0\) in this case, so we have \(\textrm{max}\big (g(\lambda ,x_{0})\big )=2\sqrt{\lambda }\).

Case 2: \(\delta > 0\). In this case, \(x_{0}> 2\sqrt{\lambda }-\epsilon \) and we have

$$\textrm{max}\big (g(\lambda ,x_{0})\big )=x_{0}-\underset{a\in \{0,\frac{1}{2}(x-\epsilon +\sqrt{\delta })\}}{\mathrm{{arg\, min}}}\ h(a)=x_{0}-\frac{1}{2}(x_{0}-\epsilon +\sqrt{\delta }).$$

Then it holds

$$\dfrac{\partial \textrm{max}\big (g(\lambda ,x_{0})\big )}{\partial x_{0}}=\frac{1}{2}(1-\frac{x_{0}+\epsilon }{\sqrt{(x_{0}+\epsilon )^{2}-4\lambda }})< 0.$$

It follows that \(g(\lambda ,x_{0})<g(\lambda ,2\sqrt{\lambda }-\epsilon )=\sqrt{\lambda }\). For \(\alpha ^{k}>0\), from (3.16), we have

$$\begin{aligned}&\lim _{k \rightarrow \infty }\Vert \varvec{Y}^{k +1}-\varvec{Y}^{k}\Vert _{F} \\&=\lim _{k \rightarrow \infty } \Vert \varvec{M}^{k+1}+\frac{1}{\alpha ^{k}}(\varvec{F}_{1}^{k+1}-\varvec{F}_{1}^{k})-\varvec{Y}^{k}\Vert _{F}\\&=\lim _{k \rightarrow \infty } \Vert \varvec{M}^{k+1}-\varvec{M}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}_{1}^{k+1}+\varvec{M}^{k}-\varvec{Y}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}_{1}^{k}\Vert _{F} \\&\le \lim _{k \rightarrow \infty }(\Vert \varvec{M}^{k+1}-\varvec{M}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}_{1}^{k+1}\Vert _{F}+\Vert \varvec{M}^{k}-\varvec{Y}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}_{1}^{k}\Vert _{F}) \\&=\lim _{k \rightarrow \infty }(\Vert \varvec{M}^{k+1}-\varvec{M}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}_{1}^{k+1}\Vert _{F}+\Vert \varvec{M}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}_{1}^{k}-\varvec{Y}^{k}\Vert _{F}) \\&=\lim _{k \rightarrow \infty } (\Vert \varvec{M}^{k+1}-\varvec{M}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}_{1}^{k+1}\Vert _{F}+\Vert \varvec{U}_{y}^{k}\varvec{\Sigma }_{y}^{k}(\varvec{V}_{y}^{H})^{k}-\varvec{U}_{y}^{k}\varvec{\Delta }_{\frac{\lambda }{2\alpha ^{k}},\epsilon ,{\tiny {\varvec{\Sigma }_{y}^{k}}}}(\varvec{V}_{y}^{H})^{k}\Vert _{F}) \\&= \lim _{k \rightarrow \infty }(\Vert \varvec{M}^{k+1}-\varvec{M}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}_{1}^{k+1}\Vert _{F}+\Vert \varvec{U}_{y}^{k}\big (\varvec{\Sigma }_{y}^{k}-\varvec{\Delta }_{\frac{\lambda }{2\alpha ^{k}},\epsilon , {\tiny {\varvec{\Sigma }_{y}^{k}}}}\big )(\varvec{V}_{y}^{H})^{k}\Vert _{F}) \\&\le \lim _{k \rightarrow \infty }\big (\frac{1}{\alpha ^{k}}\Vert \varvec{F}_{1}^{k+1}\Vert _{F}+\textrm{max}(m,n)\sqrt{\frac{\lambda }{2\alpha ^{k}}}\big )\\&=0. \end{aligned}$$

Similarly, one can also verify that \(\{\varvec{Z}^{k}\}\) and \(\{\varvec{X}^{k}\}\) are Cauchy sequences.

(2)  Let \((\varvec{Y}_{*},\varvec{Z}_{*},\varvec{M}_{*},\varvec{N}_{*},\varvec{X}_{*})\) be a stationary point of (3.5). Then, it satisfies the following KKT conditions

$$\begin{aligned} \left\{ \begin{array}{lll} O\in \frac{\lambda }{2}\partial \Vert \varvec{Y}_{*}\Vert _{L}^{1}+{\varvec{F}_{1}}_{*}, & O\in \frac{\lambda }{2}\partial \Vert \varvec{Z}_{*}\Vert _{L}^{1}+{\varvec{F}_{2}}_{*}, & O={\rho }\varvec{A}^{H}(\varvec{B}-\varvec{A}\varvec{X}_{*} )+{\varvec{F}_{3}}_{*},\\ \varvec{X}_{*}=\varvec{M}_{*}(\varvec{N}_{*})^{H},& \varvec{Y}_{*}=\varvec{M}_{*}, & \varvec{Z}_{*}=\varvec{N}_{*} .\\ \end{array} \right. \end{aligned}$$
(4.9)

Subsequently, we will confirm that every limit point of \((\varvec{Y}^{k},\varvec{Z}^{k},\varvec{M}^{k},\varvec{N}^{k},\varvec{X}^{k})\) satisfies the aforementioned KKT conditions.

From (3.11) and (3.13), we can respectively obtain

$$\begin{aligned} \begin{aligned}&O\in \frac{\lambda }{2\alpha ^{k}}\partial \Vert \varvec{Y}^{k+1}\Vert _{L}^{1}+(\varvec{M}^{k+1}-\varvec{Y}^{k+1}+\frac{\varvec{F}^{k}_{1}}{\alpha ^{k}}), \\&O\in \frac{\lambda }{2\alpha ^{k}}\partial \Vert \varvec{Z}^{k+1}\Vert _{L}^{1}+(\varvec{N}^{k+1}-\varvec{Z}^{k+1}+\frac{\varvec{F}^{k}_{2}}{\alpha ^{k}}), \\&O={\rho }\varvec{A}^{H}( \varvec{B}-\varvec{A}\varvec{X}^{k+1} )+( \varvec{X}^{k+1}-\varvec{M}^{k+1}(\varvec{N}^{k+1})^{H} +\frac{\varvec{F}^{k}_{3}}{\alpha ^{k}}). \end{aligned} \end{aligned}$$
(4.10)

Since \(\{\varvec{Y}^{k}\}, \{\varvec{Z}^{k}\}, \{\varvec{M}^{k}\}\), \(\{\varvec{N}^{k}\}\) and \(\{\varvec{X}^{k}\}\) are Cauchy sequences, we can let \(\varvec{Y}^{\infty },\varvec{Z}^{\infty },\varvec{M}^{\infty },\varvec{N}^{\infty }\) and \(\varvec{X}^{\infty }\) be their accumulation points, respectively. Then, together with the results in Algorithm 1, we have that \(\varvec{Y}^{\infty }=\varvec{M}^{\infty }\), \(\varvec{Z}^{\infty }=\varvec{N}^{\infty }\), \(\varvec{X}^{\infty }=\varvec{Y}^{\infty }(\varvec{Z}^{\infty })^{H}\). Thus, when \(k\rightarrow \infty \), (4.10) becomes

$$\begin{aligned} \begin{aligned}&O\in \frac{\lambda }{2}\partial \Vert \varvec{Y}^{\infty }\Vert _{L}^{1}+\varvec{F}^{\infty }_{1}, \\&O\in \frac{\lambda }{2}\partial \Vert \varvec{Z}^{\infty }\Vert _{L}^{1}+\varvec{F}^{\infty }_{2},\\&O= {\rho }\varvec{A}^{H}(\varvec{B}-\varvec{A}\varvec{X}^{\infty } )+\varvec{F}^{\infty }_{3}. \end{aligned} \end{aligned}$$
(4.11)

Consequently, any accumulation point \(\{\varvec{Y}^{\infty },\varvec{Z}^{\infty },\varvec{M}^{\infty },\varvec{N}^{\infty },\varvec{X}^{\infty }\}\) of the sequence \(\{(\varvec{Y}^{k},\varvec{Z}^{k},\) \(\varvec{M}^{k},\varvec{N}^{k},\varvec{X}^{k})\}\) generated by Algorithm 1 indeed satisfies the KKT conditions for the problem (3.5). This proof is complete. \(\square \)

Below we give a weak convergence property for problem (3.20) under mild conditions, as described in the following theorem.

Theorem 4.2

Let \(\Theta _{2}:=(\varvec{K},\varvec{X},\varvec{F})\) and \(\{\Theta _{2}^{k}\}^{\infty }_{k=1}\) be generated by Algorithm 3. Assume that \(\{\Theta _{2}^{k}\}^{\infty }_{k=1}\) is bounded, and \(\{\alpha ^{k}\}^{\infty }_{k=1}\) is non-decreasing and bounded. Then,

(1)  \(\{\varvec{K}^{k}\}\) and \(\{\varvec{X}^{k}\}\) are Cauchy sequences;

(2)  any accumulation point of \(\{\Theta _{2}^{k}\}^{\infty }_{k=1}\) satisfies the Karush-Kuhn-Tucker (KKT) conditions for the problem (3.20).

Proof

(1)  According to (3.27), we have

$$ \varvec{K}^{k +1}-\varvec{X}^{k +1}=\frac{1}{\alpha ^{k}}(\varvec{F}^{k+1}-\varvec{F}^{k}). $$

Due to the assumptions that \(\{\varvec{F}^{k}\}\) is bounded and \(\{\alpha ^{k}\}^{\infty }_{k=1}\) is non-decreasing and bounded, we have

$$ \sum _{k =0}^{\infty } \Vert \varvec{K}^{k +1}-\varvec{X}^{k +1}\Vert _{F}\le \sum _{k =0}^{\infty }\frac{\alpha ^{k+1}}{(\alpha ^{k})^2}\Vert \varvec{F}^{k+1}-\varvec{F}^{k}\Vert _{F}< \infty , $$

which implies that

$$\lim \limits _{k \rightarrow \infty }\Vert \varvec{K}^{k +1}-\varvec{X}^{k +1}\Vert _{F}=0.$$

Hence, \(\{(\varvec{K}^{k},\varvec{X}^{k})\}\) indeed approaches a feasible solution.

Next, we show that \(\{\varvec{K}^{k}\}\) and \(\{\varvec{X}^{k}\}\) are Cauchy sequences. Let \(\varvec{U}_{x}^{k}\varvec{\Sigma }_{x}^{k}(\varvec{V}_{x}^{H})^{k}\) be the QSVD of \(\varvec{K}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}\) in the \((k+1)\)-th iteration.

For \(\alpha ^{k}>0\), from (3.27) and the proof of Theorem 4.1, we know that

$$\begin{aligned}&\lim _{k \rightarrow \infty }\Vert \varvec{X}^{k +1}-\varvec{X}^{k}\Vert _{F} \\&=\lim _{k \rightarrow \infty } \Vert \varvec{K}^{k+1}+\frac{1}{\alpha ^{k}}(\varvec{F}^{k+1}-\varvec{F}^{k})-\varvec{X}^{k}\Vert _{F}\\&=\lim _{k \rightarrow \infty } \Vert \varvec{K}^{k+1}-\varvec{K}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k+1}+\varvec{K}^{k}-\varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}\Vert _{F} \\&\le \lim _{k \rightarrow \infty }(\Vert \varvec{K}^{k+1}-\varvec{K}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}^{k+1}\Vert _{F}+\Vert \varvec{K}^{k}-\varvec{X}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}\Vert _{F}) \\&=\lim _{k \rightarrow \infty }(\Vert \varvec{K}^{k+1}-\varvec{K}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}^{k+1}\Vert _{F}+\Vert \varvec{K}^{k}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}-\varvec{X}^{k}\Vert _{F}) \\&=\lim _{k \rightarrow \infty } (\Vert \varvec{K}^{k+1}-\varvec{K}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}^{k+1}\Vert _{F}+\Vert \varvec{U}_{x}^{k}\varvec{\Sigma }_{x}^{k}(\varvec{V}_{x}^{H})^{k}-\varvec{U}_{x}^{k}\varvec{\Delta }_{\frac{\lambda }{\alpha ^{k}},\epsilon ,{\tiny {\varvec{\Sigma }_{x}^{k}}}}(\varvec{V}_{x}^{H})^{k}\Vert _{F}) \\&= \lim _{k \rightarrow \infty }(\Vert \varvec{K}^{k+1}-\varvec{K}^{k}\Vert _{F}+\frac{1}{\alpha ^{k}}\Vert \varvec{F}^{k+1}\Vert _{F}+\Vert \varvec{U}_{x}^{k}\big (\varvec{\Sigma }_{x}^{k}-\varvec{\Delta }_{\frac{\lambda }{\alpha ^{k}},\epsilon ,{\tiny {\varvec{\Sigma }_{x}^{k}}}}\big )(\varvec{V}_{x}^{H})^{k}\Vert _{F}) \\&\le \lim _{k \rightarrow \infty }\big (\frac{1}{\alpha ^{k}}\Vert \varvec{F}^{k+1}\Vert _{F}+\textrm{max}(m,n)\sqrt{\frac{\lambda }{\alpha ^{k}}}\big )\\&=0. \end{aligned}$$

Similarly, one can also verify that \(\{\varvec{K}^{k}\}\) is a Cauchy sequence.

(2)  Let \((\varvec{K}_{*},\varvec{X}_{*})\) be a stationary point of (3.20). Then, it satisfies the following KKT conditions

$$\begin{aligned} \left\{ \begin{array}{l} O\in \lambda \partial \Vert \varvec{X}_{*}\Vert _{L}^{1}+\varvec{F}_{*},\\ O= {\rho }\varvec{A}^{H}( \varvec{B}-\varvec{A}\varvec{K}_{*} )+\varvec{F}_{*},\\ \varvec{X}_{*}=\varvec{K}_{*}.\\ \end{array} \right. \end{aligned}$$
(4.12)

Subsequently, we will confirm that every limit point of \((\varvec{K}^{k},\varvec{X}^{k})\) satisfies the aforementioned KKT conditions.

Fig. 2  Comparison of target values between quaternion matrix reconstructed by QLNFSR algorithm and random quaternion matrix under different parameters

From (3.22) and (3.25), we can respectively obtain

$$\begin{aligned} \begin{aligned}&O\in \frac{\lambda }{\alpha ^{k}}\partial \Vert \varvec{X}^{k+1}\Vert _{L}^{1}+(\varvec{K}^{k+1}-\varvec{X}^{k+1}+\frac{1}{\alpha ^{k}}\varvec{F}^{k}), \\&O= {\rho }\varvec{A}^{H}( \varvec{B}-\varvec{A}\varvec{K}^{k+1} )+\big ( \varvec{K}^{k+1}-\varvec{X}^{k+1} +\frac{1}{\alpha ^{k}}(\varvec{F}^{k}-\varvec{C}^{H}\varvec{D})\big ). \end{aligned} \end{aligned}$$
(4.13)

Since \(\{\varvec{K}^{k}\}\) and \(\{\varvec{X}^{k}\}\) are Cauchy sequences, let \(\varvec{K}^{\infty }\) and \(\varvec{X}^{\infty }\) be their accumulation points, respectively. Then, together with the results in Algorithm 3, we have that \(\varvec{K}^{\infty }=\varvec{X}^{\infty }\). Thus, when \(k\rightarrow \infty \), (4.13) becomes

$$\begin{aligned} \begin{aligned}&O\in \lambda \partial \Vert \varvec{X}^{\infty }\Vert _{L}^{1}+\varvec{F}^{\infty }, \\&O= {\rho }\varvec{A}^{H}( \varvec{B}-\varvec{A}\varvec{K}^{\infty } )+\varvec{F}^{\infty }. \end{aligned} \end{aligned}$$
(4.14)

Consequently, any accumulation point \(\{\varvec{K}^{\infty },\varvec{X}^{\infty }\}\) of the sequence \(\{(\varvec{K}^{k},\varvec{X}^{k})\}\) generated by Algorithm 3 indeed satisfies the KKT conditions for the problem (3.20). \(\square \)

5 Numerical experiments

In this section, building on the discussions in Sections 3 and 4, we present numerical examples of color image sparse representation to demonstrate the feasibility and effectiveness of Algorithms 1-3. All algorithms were implemented in MATLAB R2020a on a personal computer with an Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz and 8 GB of memory.

Let \(\varvec{A}=\textrm{rand}\left( n,n\right) +\textrm{rand}\left( n,n\right) \varvec{i}+\textrm{rand}\left( n,n\right) \varvec{j}+\textrm{rand}\left( n,n\right) \varvec{k} \in \mathbb {Q}^{n\times n}\) with \(n=20\). We employ Algorithms 1-3 to compute the reconstructed quaternion matrix and compare its target value with that of a random quaternion matrix under the models (3.5) and (3.19). The dependence of the target value on the parameters \(\alpha \) and \(\rho \) is illustrated as three-dimensional surfaces in Figs. 2 and 3.

From Figs. 2 and 3, it is evident that the matrices reconstructed by the proposed Algorithms 1-3 consistently yield lower target values within the models than the random matrices, which demonstrates the feasibility of the proposed algorithms. We represent a color image as \(\varvec{A}=R\varvec{i}+G\varvec{j}+B\varvec{k} \in \mathbb {Q}^{m\times n}\), where R, G and B are the real matrices corresponding to the red, green and blue channels of the color image. The original color images selected in the experiments are \(64\times 64 \times 3\) pixels. Clearly, every color image matrix is a pure imaginary quaternion matrix. With this representation, we can better exploit the relationship between the three channels and process a color image as a whole. We compare the proposed algorithms with the truncated QSVD algorithm [44] and the quaternion nuclear norm algorithm [45].
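For illustration, the following MATLAB sketch (a minimal example, not the authors' released code) forms the pure quaternion representation \(\varvec{A}=R\varvec{i}+G\varvec{j}+B\varvec{k}\) of an RGB image by storing the four quaternion components in an \(m\times n\times 4\) array with zero real part; the file name is only a placeholder, and the actual experiments instead rely on the Structure-preserving Quaternion Toolbox [46].

% Minimal sketch: pure quaternion representation of a color image.
% The quaternion matrix is stored here as an m-by-n-by-4 array [real, i, j, k].
img = im2double(imread('peppers.png'));   % placeholder file name; any RGB image
img = imresize(img, [64 64]);             % the experiments use 64 x 64 x 3 images
R = img(:,:,1);  G = img(:,:,2);  B = img(:,:,3);
[m, n] = size(R);
A = zeros(m, n, 4);                       % component order: [w, i, j, k]
A(:,:,2) = R;  A(:,:,3) = G;  A(:,:,4) = B;   % zero real part: pure quaternion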

Fig. 3  Comparison of target values between the quaternion matrix reconstructed by the TQLNSR algorithm and a random quaternion matrix under different parameters

Parameters and initialization setting

We set \(\varvec{B}=\varvec{A}\), \(\lambda =0.05\sqrt{\textrm{max}(m,n)}\), \(\rho =0.05\), \(d=10\), \(\alpha _{\max }=10^{7}\) and \(\beta =1.03\). Let \(\alpha ^{0}=0.01\), \(\varvec{X}_{0}=\textbf{randQ}(m,n)\), \([\varvec{U}^{x},\sim ,\varvec{V}^{x}]=\textbf{svdQ}(\varvec{X}_{0})\), \(\varvec{M}^{0}=\varvec{Y}^{0}=\varvec{U}^{x}(1:d,:)\), \(\varvec{N}^{0}=\varvec{Z}^{0}=\varvec{V}^{x}(1:d,:)\) and \(\varvec{F}_1^{0}=\varvec{F}_2^{0}=\varvec{F}_3^{0}=\varvec{0}\). In TQLNSR, since the exact number of truncated singular values is unknown beforehand, we start with \(r=1\); adjusting r appropriately can further improve the results. The random quaternion matrix generator \(\textbf{randQ}\) and the quaternion singular value decomposition routine \(\textbf{svdQ}\) are taken from the Structure-preserving Quaternion Toolbox [46].
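The MATLAB lines below are a schematic restatement of this setup (not the released implementation); randQ and svdQ are the toolbox routines named above, while A, m and n are assumed to come from the quaternion image representation.

% Parameter and initialization setting (schematic restatement of the text).
lambda = 0.05 * sqrt(max(m, n));      % regularization parameter
rho    = 0.05;   d = 10;              % penalty parameter and factor rank
alpha  = 0.01;   alpha_max = 1e7;  beta = 1.03;   % alpha^0, its cap and growth factor
B      = A;                           % the observation equals the original image here
X0     = randQ(m, n);                 % random quaternion initialization
[Ux, ~, Vx] = svdQ(X0);               % quaternion SVD of X0
M0 = Ux(1:d, :);   Y0 = M0;           % M^0 = Y^0 = U^x(1:d,:)
N0 = Vx(1:d, :);   Z0 = N0;           % N^0 = Z^0 = V^x(1:d,:)
% F1^0, F2^0, F3^0 are zero quaternion matrices of the appropriate sizes.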

Quantitative assessment

To assess the effectiveness of the proposed methods, in addition to evaluating visual quality we utilize three commonly employed quantitative quality metrics, namely the peak signal-to-noise ratio (PSNR), the mean structural similarity index (MSSIM) and the sparsity (the proportion of zero elements). The calculation formulas are as follows.

(1) PSNR

$$\text {PSNR}=10 \text {log}_{10}(\frac{L^{2}}{\text {MSE}}), \text {MSE}=\frac{1}{3mn}\sum _{h=1}^{3}\sum _{w=1}^{m}\sum _{u=1}^{n}[{X}_{h}(w,u)-{Y}_{h}(w,u)]^{2},$$

where L is the maximum value of the data type of the color image and MSE is the mean square error. Here \(\varvec{X}=X_{1}\varvec{i}+X_{2}\varvec{j}+X_{3}\varvec{k}\in \mathbb {Q}^{m\times n}\) and \(\varvec{Y}=Y_{1}\varvec{i}+Y_{2}\varvec{j}+Y_{3}\varvec{k} \in \mathbb {Q}^{m\times n}\) denote the original and reconstructed images, respectively, \((w,u)\) denotes the pixel coordinates, and m and n are the numbers of rows and columns of the image matrix, respectively.

Generally speaking, a PSNR in the range of 30 dB to 40 dB indicates good image quality; that is, the distortion is perceptible but acceptable. When the PSNR exceeds 40 dB, the image quality is excellent and the reconstructed image is very close to the original image.
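A small MATLAB helper implementing the PSNR formula above might look as follows (an assumed sketch, not the paper's code), with the original and reconstructed images passed as \(m\times n\times 3\) channel arrays.

% Sketch: PSNR between a color image X and its reconstruction Y.
function p = color_psnr(X, Y, L)
    % L: peak value of the image data type (255 for uint8, 1 for double in [0,1])
    err = double(X) - double(Y);
    mse = sum(err(:).^2) / numel(err);    % mean square error over all 3*m*n entries
    p   = 10 * log10(L^2 / mse);          % returns Inf when X and Y coincide
end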

Table 1  Sparse representation results (PSNR, MSSIM, Sparsity and CPU time) of the QSVD, QSVT, QLNFSR and TQLNSR algorithms on the color images Male, Blossom, Baboon, Peppers, Female, Tree, Splash and Plane
Fig. 4  Sparse representation reconstruction results for the color images Male, Blossom, Baboon, Peppers, Female, Tree, Splash and Plane. From left to right: the original image and the results of the QSVD, QSVT, QLNFSR and TQLNSR algorithms

(2) MSSIM

$$\begin{aligned}&\textrm{SSIM}(X_{h},Y_{h})=\dfrac{\big (2\mu ({X_{h}})\mu ({Y_{h}})+c_{1}\big )\big (2\sigma ({X_{h},Y_{h}})+c_{2}\big )}{\big (\mu ^{2}({X_{h}})+\mu ^{2}({Y_{h}})+c_{1}\big )\big (\sigma ^{2}({X_{h}})+\sigma ^{2}({Y_{h}})+c_{2}\big )},\\&\textrm{MSSIM}(\varvec{X},\varvec{Y})=\frac{1}{3}\sum _{h=1}^{3}\textrm{SSIM}(X_{h},Y_{h}), \end{aligned}$$

where \(\mu ({X_{h}})/ \mu ({Y_{h}})\) and \(\sigma ({X_{h}})/\sigma ({Y_{h}})\) denote the mean value and standard deviation of \(X_{h}/Y_{h}\) with \(h=1,2,3\), and \(\sigma ({X_{h},Y_{h}})\) denotes the covariance of \(X_{h}\) and \(Y_{h}\). \(c_1\) and \(c_2\) are small constants introduced to avoid a zero denominator and maintain numerical stability. The value of MSSIM ranges from 0 to 1; the closer it is to 1, the more similar the two images are.
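The sketch below evaluates this MSSIM using global channel statistics, exactly as the formula is stated (standard SSIM implementations typically use local windows instead); the constants \(c_1=(0.01L)^2\) and \(c_2=(0.03L)^2\) are common choices and are assumed here rather than taken from the paper.

% Sketch: MSSIM with global per-channel statistics.
function ms = color_mssim(X, Y, L)
    c1 = (0.01*L)^2;  c2 = (0.03*L)^2;        % assumed stabilizing constants
    s = zeros(1, 3);
    for h = 1:3
        x = double(X(:,:,h));  y = double(Y(:,:,h));
        mx = mean(x(:));  my = mean(y(:));
        vx = var(x(:), 1);  vy = var(y(:), 1);        % population variances
        cxy = mean((x(:) - mx) .* (y(:) - my));       % covariance of the two channels
        s(h) = ((2*mx*my + c1) * (2*cxy + c2)) / ...
               ((mx^2 + my^2 + c1) * (vx + vy + c2));
    end
    ms = mean(s);                             % average SSIM over the three channels
end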

(3) Sparsity

$$\textrm{Sparsity}=\frac{m\times n-T_z}{m\times n},$$

where m and n are the numbers of rows and columns of the quaternion matrix, respectively, and \(T_z\) is the number of non-zero elements of the quaternion matrix. The sparsity measure ranges between 0 and 1: a value closer to 0 indicates a denser matrix, while a value closer to 1 indicates a sparser matrix.
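Finally, the sparsity measure can be evaluated as in the following sketch, where the quaternion matrix is again assumed to be stored as an \(m\times n\times 4\) component array and an entry counts as zero only if all four of its components vanish.

% Sketch: proportion of zero entries of a quaternion matrix.
function s = quat_sparsity(Xq)
    [m, n, ~] = size(Xq);                 % Xq: m-by-n-by-4 component array
    zero_mask = all(Xq == 0, 3);          % true where all four components are zero
    Tz = m*n - nnz(zero_mask);            % number of non-zero quaternion entries
    s  = (m*n - Tz) / (m*n);              % closer to 1 means a sparser matrix
end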

The reconstruction results in terms of PSNR, MSSIM, Sparsity and CPU time (s) for the four test methods on the eight test images are shown in Table 1. A PSNR value of inf indicates that the MSE is zero, i.e., the reconstructed image coincides with the original. Additionally, Fig. 4 shows the visual comparison of the four test methods on the eight test color images. As can be seen from Fig. 4 and Table 1, Algorithms 1-3 proposed in this paper are well suited to reconstructing color images from their sparse representations.

6 Conclusions

In conclusion, this paper introduces a novel approach to image restoration based on a quaternion matrix framework and the logarithmic norm. By representing color images as pure quaternion matrices, the proposed method preserves the image structure while approximating the rank efficiently. Furthermore, factorization and truncation techniques based on the logarithmic norm ensure effective image recovery. An alternating minimization framework is used to optimize the resulting models, and a rigorous mathematical analysis validates its convergence. The numerical examples provided in the paper demonstrate the efficacy of the proposed algorithms in practice, highlighting their potential for various applications in image processing and computer vision.