1 Introduction

In the last decade a new paradigm has taken hold in signal and image processing: compressed sensing. The possibility of reconstructing a signal from only a few measurements, under the condition that its representation in a given basis or frame is sparse, has opened the door to new methods and algorithms. Although sparsity constraints lead directly only to non-convex optimization problems, the uniqueness property shown by Candes et al. [10] allows the application of simple convex algorithms, such as linear programming. In parallel, during the last 15 years quaternion-valued functions have been used to represent color images, in particular RGB images [11]. Here, representations based on the discrete and continuous quaternionic Fourier transforms play a particularly important role [4].

In this paper we show that it is possible to combine both approaches, i.e., to use sparse sampling methods in the quaternionic representation of color images. This is a priori not evident due to the non-commutative structure of the quaternions. For instance, it is not clear that quaternionic sampling matrices fulfil the restricted isometry property (RIP), the traditional condition in compressed sensing. Therefore, we go back to the origins of compressed sensing and follow the original approach by Rauhut [28] to show that quaternionic color images allow sparse reconstruction by means of an \(l_1\)-minimization with high probability. As the principal example of a quaternionic Fourier sampling matrix we use the discrete version of the classic quaternionic Fourier transform introduced by Sommen [31]. The reason is twofold. First of all, it exhibits all the essential difficulties coming from the quaternionic structure, so that the arguments can easily be adapted to other cases. Second, it is also the quaternionic Fourier transform most used in applications, this being the main reason why it has been re-invented so many times.

We will show that it is possible to follow Rauhut’s ideas despite the fact that in the quaternionic case the non-commutativity of the multiplication does not allow a direct application. Furthermore, we will give some examples of sparse sampling of images given as quaternionic signals. This shows that a practical application of these ideas in the field of hypercomplex analysis is possible. As a final remark we would like to point out that the ideas in the present paper can also be applied to the wider area of interpolation of quaternion-valued functions.

2 Preliminaries

In this section we recall some basic notions about quaternions and quaternionic exponentials which will be needed in the sequel.

The algebra of quaternions \({\mathbb {H}}\) is a four-dimensional real associative division algebra with unit 1, generated by the elements \(\{ {\mathbf {I}}, {\mathbf {J}}, {\mathbf {K}}\}\) subject to the relations

$$\begin{aligned} {\mathbf {I}}^2 = {\mathbf {J}}^2 = {\mathbf {K}}^2 = {\mathbf {I}}{\mathbf {J}}{\mathbf {K}}= -1, \quad {\mathbf {I}}{\mathbf {J}}= -{\mathbf {J}}{\mathbf {I}}= {\mathbf {K}}. \end{aligned}$$

This algebra is non-commutative. The real and imaginary parts of a given quaternion

$$\begin{aligned} q= x_0 1 + x_1 {\mathbf {I}}+ x_2 {\mathbf {J}}+ x_3 {\mathbf {K}}\end{aligned}$$

are defined as \({{\mathrm{Re}}}(q) = q_0 := x_0,\) and \({{\mathrm{Im}}}(q) := x_1 {\mathbf {I}}+ x_2 {\mathbf {J}}+ x_3 {\mathbf {K}}.\) Therefore, we have the natural embeddings of the real numbers and of \({\mathbb {R}}^3\) into quaternions given by

$$\begin{aligned} x_0 \in {\mathbb {R}} \rightarrow x_0 1 \in {\mathbb {H}} \quad \mathrm{and} \quad ( x_1, x_2, x_3) \in {\mathbb {R}}^3 \rightarrow x_1 {\mathbf {I}}+ x_2 {\mathbf {J}}+ x_3 {\mathbf {K}}\in {\mathbb {H}}. \end{aligned}$$

This leads to the identifications

$$\begin{aligned} {\mathbb {H}} \equiv {\mathbb {R}}^4, \quad {{\mathrm{Im}}}{\mathbb {H}} \equiv {\mathbb {R}}^3, \quad {{\mathrm{Re}}}{\mathbb {H}} \equiv {\mathbb {R}}, \end{aligned}$$

where \({{\mathrm{Im}}}{\mathbb {H}}\) is the three dimensional space of pure imaginary quaternions, and hence, \({\mathbb {H}} = {\mathbb {R}} \oplus {\mathbb {R}}^3.\)

The conjugation on \({\mathbb {H}}\) is an anti-automorphism of \({\mathbb {H}}\) onto itself given by

$$\begin{aligned} q = x_0 + {{\mathrm{Im}}}q \quad \rightarrow \quad {{\overline{q}}} = x_0 - {{\mathrm{Im}}}q, \end{aligned}$$

together with the involution property

$$\begin{aligned} \overline{q p} = {{\overline{p}}} ~{{\overline{q}}}, \quad \forall p, q \in {\mathbb {H}}. \end{aligned}$$

A purely imaginary quaternion with absolute value 1 is called an imaginary unit. We denote the set of all imaginary units by \({\mathbb {S}}^2\), that is,

$$\begin{aligned} {\mathbb {S}}^2 = \left\{ x_1{\mathbf {I}}+ x_2{\mathbf {J}}+ x_3{\mathbf {K}}\in {\mathbb {H}}\,:\,\sum _{i=1}^{3}x_i^2=1\right\} . \end{aligned}$$

2.1 Inner and Outer Products

Given two quaternions \(q, p \in {\mathbb {H}}\), the product \(q {\overline{p}}\) can be expressed in terms of the usual scalar and vector products on \({{\mathrm{Im}}}{\mathbb {H}} \sim {\mathbb {R}}^3\) by

$$\begin{aligned} q {\overline{p}}= & {} (q_0 + {{\mathrm{Im}}}q)(p_0 - {{\mathrm{Im}}}p) = \underbrace{q_0 p_0 + {{\mathrm{Im}}}q \cdot {{\mathrm{Im}}}p}_{\in {\mathbb {R}}}\\&+\, (\underbrace{ -q_0 {{\mathrm{Im}}}p + p_0 {{\mathrm{Im}}}q - {{\mathrm{Im}}}q \times {{\mathrm{Im}}}p}_{\in {{\mathrm{Im}}}{\mathbb {H}}}). \end{aligned}$$

Moreover, we have

$$\begin{aligned} \langle q, p \rangle = {{\mathrm{Re}}}(q {{\overline{p}}}), \end{aligned}$$

where \(\langle \cdot , \cdot \rangle \) corresponds to the Euclidean scalar product defined on \({\mathbb {H}} \sim {\mathbb {R}}^4,\) and we have the induced norm

$$\begin{aligned} \Vert q\Vert ^2 = {{\mathrm{Re}}}(q {{\overline{q}}}) =\langle q, q \rangle \end{aligned}$$

which satisfies the product rule \(\Vert q p\Vert = \Vert q\Vert ~\Vert p\Vert .\) Also, for every non-zero quaternion q it holds

$$\begin{aligned} q^{-1}= \frac{{{\overline{q}}}}{\Vert q \Vert ^2}. \end{aligned}$$

Furthermore, we can introduce the signum of a quaternion q defined via

$$\begin{aligned} \mathrm {sgn}\,q={\left\{ \begin{array}{ll}\frac{q}{\Vert q \Vert }, &{}\quad q\ne 0\\ 0, &{}\quad q=0. \end{array}\right. } \end{aligned}$$

Note that \(\mathrm {sgn}\,q\in {\mathbb {H}}.\) For \(\mathrm {sgn}\,q\) we have the following properties.

Lemma 2.1

Let \(q=q_0+q_1\mathbf{I}+q_2\mathbf{J}+q_3\mathbf{IJ}\) and \(p=p_0+p_1\mathbf{I}+p_2\mathbf{J}+p_3\mathbf{IJ}\) be two quaternions. It holds

  1. \(\Vert \mathrm {sgn}\,p\Vert =1\), \(p\ne 0\);

  2. \(\overline{\mathrm {sgn}\,p}\,\mathrm {sgn}\,p=1\), \(p\ne 0\);

  3. \(p\,\overline{\mathrm {sgn}\,p}=\Vert p\Vert \);

  4. \(\Vert q\Vert \,\Vert \mathrm {sgn}\,p\Vert =\Vert q\,\mathrm {sgn}\,p\Vert .\)

The proof is rather straightforward. For details we refer to [19].
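
For readers who want to experiment numerically, the following short Python sketch (purely illustrative; the quaternion helpers are ours and not part of the theory) checks the properties of Lemma 2.1 for randomly chosen quaternions.

```python
import numpy as np

def quat_mul(a, b):
    # Hamilton product of quaternions stored as (q0, q1, q2, q3) ~ q0 + q1*I + q2*J + q3*K
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def conj(q):
    # quaternionic conjugation
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def sgn(q):
    n = np.linalg.norm(q)
    return q / n if n > 0 else np.zeros(4)

rng = np.random.default_rng(0)
p, q = rng.standard_normal(4), rng.standard_normal(4)

print(np.isclose(np.linalg.norm(sgn(p)), 1.0))                                   # property 1
print(np.allclose(quat_mul(conj(sgn(p)), sgn(p)), [1, 0, 0, 0]))                 # property 2
print(np.allclose(quat_mul(p, conj(sgn(p))), [np.linalg.norm(p), 0, 0, 0]))      # property 3
print(np.isclose(np.linalg.norm(q) * np.linalg.norm(sgn(p)),
                 np.linalg.norm(quat_mul(q, sgn(p)))))                           # property 4
```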

For an invertible quaternion we have the following representation.

Theorem 2.2

Let q be an invertible quaternion. Then,

$$\begin{aligned} q = \Vert q\Vert \left[ \cos (\theta ) + \omega (q) \sin (\theta ) \right] , \end{aligned}$$

where \(\omega (q) = \frac{{{\mathrm{Im}}}q}{\Vert {{\mathrm{Im}}}q \Vert } \in {\mathbb {S}}^2 \subset {{\mathrm{Im}}}{\mathbb {H}}\) and \(\theta \in [0, \pi ]\) is such that \(\cos \theta = \frac{q_0}{\Vert q \Vert }\) and \(\sin \theta = \frac{\Vert {{\mathrm{Im}}}q\Vert }{\Vert q \Vert }.\)

This leads to the following definition of a quaternionic exponential (see, e.g. [20]):

Definition 2.3

Given a purely imaginary quaternion \(\mathbf {\omega } \in {\mathbb {S}}^2\subset {{\mathrm{Im}}}{\mathbb {H}},\) we define the quaternionic exponential \(e^{\mathbf {\omega } \theta }\) as

$$\begin{aligned} \theta \in {\mathbb {R}} \mapsto e^{\mathbf {\omega } \theta } := \cos (\theta ) + \mathbf {\omega } \sin (\theta ) \in {\mathbb {H}}. \end{aligned}$$
(2.1)

We list some of its properties.

  • \(\Vert e^{\omega \theta } \Vert =1;\)

  • \(\Vert e^{\omega \theta } \Vert \le e^{|\theta |};\)

  • given \(\omega _1, \omega _2 \in {\mathbb {S}}^2,\) we have

    $$\begin{aligned} e^{\omega _1 \theta _1} e^{\omega _2 \theta _2} = e^{\omega _1 \theta _1+\omega _2 \theta _2}, \quad \forall \theta _1, \theta _2\in [0,2\pi ],\end{aligned}$$

    only if \(\omega _1\omega _2 = \omega _2 \omega _1.\)

The last property shows the non-commutative character of quaternionic exponentials and presents one of the major problems in the application of the classic approach by Rauhut [28].
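
The failure of the addition rule for exponents is easy to observe numerically. The following sketch (again purely illustrative; it uses the same ad hoc quaternion helper as above) compares \(e^{{\mathbf {I}}\theta _1}e^{{\mathbf {J}}\theta _2}\) with the exponential of \({\mathbf {I}}\theta _1+{\mathbf {J}}\theta _2\), the latter understood via Definition 2.3 applied to the normalized imaginary part.

```python
import numpy as np

def quat_mul(a, b):
    # Hamilton product of quaternions stored as (q0, q1, q2, q3) ~ q0 + q1*I + q2*J + q3*K
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def quat_exp(omega, theta):
    # e^{omega * theta} = cos(theta) + omega * sin(theta) for a unit imaginary quaternion omega, cf. (2.1)
    return np.concatenate(([np.cos(theta)], np.sin(theta) * np.asarray(omega)))

I_unit, J_unit = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
t1, t2 = 0.7, 1.3

lhs = quat_mul(quat_exp(I_unit, t1), quat_exp(J_unit, t2))
v = I_unit * t1 + J_unit * t2                        # imaginary quaternion I*t1 + J*t2
rhs = quat_exp(v / np.linalg.norm(v), np.linalg.norm(v))

print(lhs)   # e^{I t1} e^{J t2} has a non-zero K-component
print(rhs)   # e^{I t1 + J t2} has none, so the two differ: I and J do not commute
```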

3 Sparse Sampling of Quaternionic Signals

3.1 The Setting

While there is quite a variety of concrete quaternionic Fourier transforms to choose from [29], we consider here decompositions with respect to the atoms \(\Psi _{k,l}(x,y)=e^{{\mathbf {I}}k x} e^{{\mathbf {J}} l y}\), \((k,l)\in {\mathbb {Z}}^2\). On the one hand, this corresponds to the original definition of a quaternionic Fourier transform by Sommen in 1982 [31], which has been reinvented several times since then. On the other hand, this quaternionic Fourier transform is also the principal Fourier transform used in applications in image processing (cf. [6]). Furthermore, this example can easily be adapted to the other cases. Let us denote by \(\prod _\rho \) the space of all quaternionic trigonometric polynomials of maximal order \(\rho \in {\mathbb {N}}_0\) in two real variables. The dimension of \(\prod _\rho \) is \(d:=(2\rho +1)^2\). Therefore, an element \(q\in \prod _\rho \) is of the form

$$\begin{aligned} q(x,y)=\displaystyle \sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}e^{{\mathbf {I}}k x} e^{{\mathbf {J}}l y}c_{k,l}, \quad (x,y)\in [0,2\pi ]^2,~ c_{k,l}\in {\mathbb {H}}. \end{aligned}$$
(3.1)

We want to determine the coefficients \(c_{k,l}\) from a few given samples. To this end we consider the sequence of coefficients \(c:=(c_{k,l})_{k,l}\) as a vector (strictly speaking a tensor) such that c is supported on a set T which has a much smaller cardinality than the dimension of \(\prod _\rho \). That is to say, the finite combination in (3.1) is “sparse”. However, we do not require any information on T except its maximum size. Therefore, we introduce the set (not a linear space) \(\prod _\rho (M)\subset \prod _\rho \) of all polynomials of type (3.1) such that the sequence of their coefficients c has support on a set \(T\subset [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2\) with \(|T|\le M\), i.e., \(q\in \prod _\rho (M)\) is of the form

$$\begin{aligned} q(x,y)= \displaystyle \sum _{(k,l)\in T\subset [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2 }\displaystyle e^{{\mathbf {I}}k x} e^{{\mathbf {J}}l y}c_{k,l}. \end{aligned}$$

Furthermore, we consider a given sampling set \(X:=\{(x_{1},y_{1}), (x_{2},y_{2}), \ldots , (x_{N},y_{N})\}\) as a set of independent random variables having uniform distribution on \([0,2\pi ]^2\). Thus, the main objective is to reconstruct \(q\in \prod _\rho (M)\) from the samples \(q(x_i,y_i)\) at those N randomly chosen points.

The standard way of determining the coefficients \(c_{k,l}\) would be \(\ell _0\)-minimization, i.e., searching for the sparsest coefficient vector consistent with the samples, or a greedy algorithm such as matching pursuit, which in each step selects the atom with the largest coefficient [36]. Unfortunately, \(\ell _0\)-minimization is NP-hard, with exponentially growing computational costs. Therefore, we would like to use a different algorithm, namely basis pursuit. This consists in the following non-linear method of reconstructing \(q\in \prod _\rho (M)\) from its sampled values \((q(x_{1},y_{1}), \ldots , q(x_{N},y_{N}))\). We minimize the \(\ell _1\)-norm of the Fourier coefficients \(c_{k,l}\),

$$\begin{aligned} ||(c_{k,l})||_1:=\sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}|c_{k,l}|, \end{aligned}$$

under the constraint that the corresponding trigonometric polynomial matches q on the sampling points. In other words we solve the problem

$$\begin{aligned} ||(c_{k,l})||_1\rightarrow \min \qquad \mathrm {s.t.} \quad \sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}e^{{\mathbf {I}}k x_i} e^{{\mathbf {J}}l y_i}c_{k,l}=q(x_i,y_i),\quad i=1,\ldots ,N.\nonumber \\ \end{aligned}$$
(3.2)

We point out right away that basis pursuit [12] can be performed with efficient convex optimization techniques [5], e.g. via linear programming. This makes it a much faster algorithm than plain \(\ell _0\)-minimization. But, of course, the principal question is whether we can indeed replace the \(\ell _0\)-minimization by an \(\ell _1\)-minimization.
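
Although the analysis below is carried out for the exact problem (3.2), it may help to see how such a reconstruction can be set up in practice. The following sketch is one possible numerical realization (not the authors' implementation): quaternions are encoded as real 4-vectors, so that the interpolation constraints become real linear equations, and the sum of the quaternion moduli is minimized with the generic convex solver CVXPY. All parameter values are illustrative.

```python
import numpy as np
import cvxpy as cp

def quat_mul(a, b):
    # Hamilton product of quaternions stored as (q0, q1, q2, q3) ~ q0 + q1*I + q2*J + q3*K
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def left_mul_matrix(a):
    # real 4x4 matrix M with M @ vec(c) = vec(a * c)
    w, x, y, z = a
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def atom(k, l, x, y):
    # Psi_{k,l}(x, y) = e^{I k x} e^{J l y} as a quaternion
    return quat_mul([np.cos(k*x), np.sin(k*x), 0.0, 0.0],
                    [np.cos(l*y), 0.0, np.sin(l*y), 0.0])

rho, M, N = 3, 2, 30                 # illustrative degree, sparsity and number of samples
freqs = [(k, l) for k in range(-rho, rho + 1) for l in range(-rho, rho + 1)]
d = len(freqs)                       # (2*rho + 1)**2

rng = np.random.default_rng(0)
c_true = np.zeros((d, 4))            # sparse quaternionic coefficient vector
support = rng.choice(d, size=M, replace=False)
c_true[support] = rng.standard_normal((M, 4))

X = rng.uniform(0.0, 2 * np.pi, size=(N, 2))   # random sampling points

# samples q(x_i, y_i) and the real-encoded constraint matrix
b = np.zeros((N, 4))
A = np.zeros((4 * N, 4 * d))
for i, (x, y) in enumerate(X):
    for j, (k, l) in enumerate(freqs):
        Mij = left_mul_matrix(atom(k, l, x, y))
        A[4 * i:4 * i + 4, 4 * j:4 * j + 4] = Mij
        b[i] += Mij @ c_true[j]

# basis pursuit (3.2): minimize the sum of the quaternion moduli of the coefficients
cvar = cp.Variable(4 * d)
objective = cp.Minimize(sum(cp.norm(cvar[4 * j:4 * j + 4]) for j in range(d)))
problem = cp.Problem(objective, [A @ cvar == b.reshape(-1)])
problem.solve()

c_rec = cvar.value.reshape(d, 4)
print("max coefficient error:", np.abs(c_rec - c_true).max())
```

In this encoding the quaternionic \(\ell _1\)-norm becomes a sum of Euclidean norms, i.e., a second-order cone program, which generic convex solvers handle efficiently.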

3.2 Equivalence Between \(\ell _0\)-minimization and Basis Pursuit

We would like to use basis pursuit [12] to reconstruct f completely by determining all coefficients \(c_{k,l}\), \((k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2\). Unfortunately, a condition like the restricted isometry property (RIP [1]), which is valid in all cases (including the worst one), is too restrictive in our setting. Therefore, we are going to give a probabilistic answer. The theorems below are analogues of the theorems given in Candes et al. [10] and in Rauhut [28].

Theorem 3.1

Assume \(f\in \prod _\rho (M)\) with some sparsity \(M\in {\mathbb {N}}\). Let \((x_{1},y_{1}), \ldots , (x_{N},y_{N}) \in [0,2\pi ]^2\) be independent random variables having uniform distribution on \([0,2\pi ]^2\). Choose \(n\in {\mathbb {N}},\) \(\beta >0,\) \(\kappa >0\) and \(K_1,\ldots ,K_n\in {\mathbb {N}}\) such that

$$\begin{aligned} a:= \sum _{m=1}^{n}\beta ^{n/K_m}<1 \quad \mathrm {and} \quad \frac{\kappa }{1-\kappa }\le \frac{1-a}{1+a}M^{-3/2}. \end{aligned}$$
(3.3)

Set \(\theta :=N/M\). Then with probability at least

$$\begin{aligned} 1-\left( d\beta ^{-2n}\sum _{m=1}^{n}G_{2mK_m}(\theta ) +\kappa ^{-2}\,M\,G_{2n}(\theta )\right) , \end{aligned}$$
(3.4)

where \(G_n(\theta )=\theta ^{-n}\sum _{k=1}^{\lfloor \frac{n}{2}\rfloor }S_2(n,k)\theta ^k\) and \(S_2(n,k)\) denote the Stirling numbers of the second kind, f can be reconstructed exactly from its sampled values \(f(x_{1},y_{1}),\ldots ,f(x_{N},y_{N})\) by solving the \(\ell _1\)-minimization problem

$$\begin{aligned} \min \left||(c_{k,l})\right||_1:= & {} \min \displaystyle \sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}|c_{k,l}|,\nonumber \\ s.t. \quad f(x_{i},y_{i}):= & {} \displaystyle \sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}e^{{\mathbf {I}}k x_i} e^{{\mathbf {J}}l y_i}c_{k,l}, \quad i=1,\dots ,N. \end{aligned}$$
(3.5)
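
Since the bound (3.4) is fully explicit, it can be evaluated numerically. A possible sketch is given below; it assumes SymPy's stirling function for the Stirling numbers of the second kind, and the chosen parameters are supposed to satisfy (3.3). Note that the bound only becomes non-trivial for fairly large ratios \(\theta =N/M\).

```python
from sympy.functions.combinatorial.numbers import stirling

def G(n, theta):
    # G_n(theta) = theta^{-n} * sum_{k=1}^{floor(n/2)} S_2(n, k) * theta^k
    return sum(int(stirling(n, k, kind=2)) * theta**k
               for k in range(1, n // 2 + 1)) / theta**n

def recovery_probability_bound(d, M, N, n, beta, kappa, Ks):
    # lower bound (3.4); n, beta, kappa and Ks = (K_1, ..., K_n) are assumed to satisfy (3.3)
    theta = N / M
    failure = (d * beta**(-2 * n) * sum(G(2 * m * Ks[m - 1], theta) for m in range(1, n + 1))
               + kappa**(-2) * M * G(2 * n, theta))
    return 1.0 - failure
```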

While the above theorem provides exact constants, we can give a version of the theorem which is somewhat easier to apply.

Theorem 3.2

There exists an absolute constant \(C>0\) such that the following is true. Assume \(f\in \prod _\rho (M)\) for some sparsity \(M\in {\mathbb {N}}\). Let \((x_{1},y_{1}),\ldots ,(x_{N},y_{N}) \in [0,2\pi ]^2\) be independent random variables having the uniform distribution on \([0,2\pi ]^2\). If for some \(\epsilon >0\) it holds

$$\begin{aligned} N\ge CM\log (d/\epsilon ) \end{aligned}$$
(3.6)

then with probability at least \(1-\epsilon \) the function f can be recovered from its sampled values \(f(x_{i},y_{i})\), \(\; i=1,\ldots , N\), by solving the minimization problem (3.5).

For our third theorem about reconstructing a sparse trigonometric polynomial from random samples we need some additional explanations. For convenience, we denote by \(\prod _T\) the set of all trigonometric polynomials whose coefficients are supported on T, and we model the set \(T\subset [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2\) of non-vanishing Fourier coefficients as random. Here we treat the “generic” case rather than arbitrary sparse polynomials, and we expect to get better estimates for the probability of exact reconstruction.

Consider \(0<\tau <1\). The probability that an index \((k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2\) belongs to T is assumed to be

$$\begin{aligned} {\mathbb {P}}((k,l)\in T) = \tau \end{aligned}$$
(3.7)

independently of each \((k,l)\). The choice of the sampling set X is stochastically independent of T, and vice versa. So the size of T, i.e., |T|, has a binomial distribution, and its expected value is, obviously, \({\mathbb {E}}|T|=\tau d = \tau (2\rho +1)^2.\)
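
For illustration, the random support model (3.7) can be simulated as follows (the values of ρ and τ are arbitrary).

```python
import numpy as np

rho, tau = 10, 0.05                  # illustrative values
d = (2 * rho + 1) ** 2

rng = np.random.default_rng(0)
indices = [(k, l) for k in range(-rho, rho + 1) for l in range(-rho, rho + 1)]
# every index (k, l) belongs to T independently with probability tau, cf. (3.7)
T = [kl for kl in indices if rng.random() < tau]

print(len(T), tau * d)               # |T| is Binomial(d, tau) with mean E|T| = tau * d
```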

We now define two auxiliary quantities needed to state the third theorem. For \(n\in {\mathbb {N}}\) we define

$$\begin{aligned} W(n,N,{\mathbb {E}}|T|,d) := N^{-2n}\sum _{t=1}^{\min \{n,N\}}\frac{N!}{(N-t)!}\sum _{s=2}^{2n}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}-1} Q(2n,t,s,R)d^{-R}\nonumber \\ \end{aligned}$$
(3.8)

and for \(K,m\in {\mathbb {N}}\) we define

$$\begin{aligned}&Z(K,m,N,{\mathbb {E}}|T|,d) \nonumber \\&\quad := N^{-2Km} \sum _{t=1}^{\min \{Km,N\}}\frac{N!}{(N-t)!}\sum _{s=1}^{2Km}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}} Q^*(2K,m,t,s,R)d^{-R}.\qquad \quad \end{aligned}$$
(3.9)

Using basis pursuit, our theorem reads as follows.

Theorem 3.3

Let \((x_{1},y_{1}), \ldots , (x_{N},y_{N}) \in [0,2\pi ]^2\) be independent random variables having the uniform distribution on \([0,2\pi ]^2\). Further assume that \(T:=T_1\times T_2\) (which is chosen independently of \((x_{1},y_{1}), \ldots , (x_{N},y_{N})\)) is a random subset of \([-\rho ,\rho ]^2\cap {\mathbb {Z}}^2\) modeled by \({\mathbb {P}}((k,l)\in T)=\tau \) such that \({\mathbb {E}}|T|=\tau d\ge 1.\) Choose \(n\in {\mathbb {N}},\) \(\alpha ,\beta ,\kappa >0\) and \(K_1,\ldots ,K_n\in {\mathbb {N}}\) such that

$$\begin{aligned} a:= \displaystyle \sum _{m=1}^{n}\beta ^{n/K_m}<1 \qquad \mathrm {and} \qquad \frac{\kappa }{1-\kappa }\le \frac{1-a}{1+a}((\alpha +1){\mathbb {E}}|T|)^{-3/2}. \end{aligned}$$
(3.10)

Then with probability at least

$$\begin{aligned} 1-\left( \kappa ^{-2}W(n,N,{\mathbb {E}}|T|,d)+\beta ^{-2n}d\sum _{m=1}^{n} Z(K_m,m,N,{\mathbb {E}}|T|,d)+\exp \left( -\frac{3\alpha ^{2}}{6+2\alpha }{\mathbb {E}}|T|\right) \right) \nonumber \\ \end{aligned}$$
(3.11)

any \(f\in \prod _T\subset \prod _\rho (|T|)\) can be reconstructed exactly from its sample values \(f(x_{1},y_{1}), \ldots , f(x_{N},y_{N})\) by solving the minimization problem (3.5).

In the next section we give the proofs of these theorems.

4 Proof of the Main Theorems

First of all, we need to introduce some auxiliary notations. We abbreviate \({\mathbb {Z}}_\rho ^2=[-\rho ,\rho ]^2\cap {\mathbb {Z}}^2\) and denote by \(\ell _2({\mathbb {Z}}_\rho ^2)\), \(\ell _2(T)\), \(\ell _2(X)\) the \(\ell _2\)-spaces of sequences indexed by our discrete sets \({\mathbb {Z}}_\rho ^2\), T, and X, respectively. We also need the sampling operator \({\mathcal {F}}_{X}:\ell _2({\mathbb {Z}}_\rho ^2)\rightarrow \ell _2(X)\) given by

$$\begin{aligned} \left( {\mathcal {F}}_{X}c\right) (x,y):= \displaystyle \sum _{(k,l)\in {\mathbb {Z}}_\rho ^2}\left( e^{{\mathbf {I}}k x} e^{{\mathbf {J}}l y}c_{k,l}\right) ,\quad (x,y)\in X. \end{aligned}$$

By \({\mathcal {F}}_{TX}\) we denote its restriction to sequences with support only on T. Hence, \({\mathcal {F}}_{TX}\) is an operator acting from \(\ell _2(T)\) into \(\ell _2(X).\) Also, we have to consider their adjoint operators, \({\mathcal {F}}^*_{X}:\ell _2(X)\rightarrow \ell _2({\mathbb {Z}}_\rho ^2)\) and \({\mathcal {F}}^*_{TX}:\ell _2(X)\rightarrow \ell _2(T).\)

The next lemma is the fundamental lemma on which our subsequent investigations will be based.

Lemma 4.1

Let \(c_{}\in \ell _2({\mathbb {Z}}_\rho ^2)\) and \(T:=\mathrm {supp}\,c_{}\). Furthermore, let us assume that \({\mathcal {F}}_{TX}:\ell _2(T)\rightarrow \ell _2(X)\) is injective and suppose that there exists a \(P\in \ell _2({\mathbb {Z}}_\rho ^2)\) with the following properties:

  (i) \(P_{k,l}=\mathrm {sgn}\,\left( c_{k,l}\right) _{k,l}\) for all \((k,l)\in T,\)

  (ii) \(\left| P_{k,l}\right| <1\) for all \((k,l)\notin T\),

  (iii) there exists a \(\lambda \in \ell _2(X)\) such that \(P={\mathcal {F}}^*_{X}\lambda \).

Then \(c_{}\) is the unique minimizer to the problem (3.5).

This is the fundamental lemma in [10, 28]. Since, unlike these cases, we are working here with quaternion-valued vectors, i.e., with a non-commutative structure, we are going to present a proof for our case.

Proof

Let us assume \(X\ne \emptyset \) and \(c\ne 0\) to exclude the trivial cases. Furthermore, let us suppose that the vector P exists. Let r be any vector different from c with \({\mathcal {F}}_X r={\mathcal {F}}_X c.\) Consider \(q:=r-c\); then \({\mathcal {F}}_X q\) vanishes on X. This means that for \(r_{k,l}\), \((k,l)\in T\), we have the following estimate

$$\begin{aligned} |r_{k,l}|= & {} |c_{k,l}+q_{k,l}|=|(c_{k,l}+q_{k,l})\overline{\mathrm {sgn}\,c_{k,l}}\,\mathrm {sgn}\,c_{k,l}|\\= & {} \left| \left( c_{k,l}\,\overline{\mathrm {sgn}\,c_{k,l}}+q_{k,l}\,\overline{\mathrm {sgn}\,c_{k,l}}\right) \mathrm {sgn}c_{k,l}\right| \\= & {} |\,|c_{k,l}|+q_{k,l}\,\overline{\mathrm {sgn}\,c_{k,l}}||\overline{\mathrm {sgn}c_{k,l}}|\\= & {} |\,|c_{k,l}|+q_{k,l}\,\overline{\mathrm {sgn}\,c_{k,l}}|\ge |c_{k,l}|+\mathrm {Re}\left( q_{k,l}\,\overline{\mathrm {sgn}\,c_{k,l}}\right) \\= & {} |c_{k,l}|+\mathrm {Re}\left( q_{k,l}\,\overline{P_{k,l}}\right) . \end{aligned}$$

Thus, for any \((k,l)\in T\) we have \(|c_{k,l}|+\mathrm {Re}\left( q_{k,l}\,\overline{P_{k,l}}\right) \le |r_{k,l}|.\) On the other hand, for \((k,l)\not \in T\) we have \(\mathrm {Re}\left( q_{k,l}\,\overline{P_{k,l}}\right) \le |q_{k,l}| = |r_{k,l}|\) since \(|P_{k,l}|<1\) and \(c_{k,l}=0.\) Thus

$$\begin{aligned} ||r||_{1}\ge ||c||_{1}+\sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}\mathrm {Re}\left( q_{k,l}\,\overline{P_{k,l}}\right) . \end{aligned}$$

Now, from condition (iii) we get

$$\begin{aligned} \sum _{(k,l)\in {\mathbb {Z}}^2_\rho }\mathrm {Re}\left( q_{k,l}\,\overline{P_{k,l}}\right)= & {} \mathrm {Re}\left( \sum _{(k,l)\in {\mathbb {Z}}^2_\rho }q_{k,l}\,\overline{\left( {\mathcal {F}}^*_{X}\lambda \right) _{k,l}}\right) \nonumber \\= & {} \mathrm {Re}\left( \sum _{i=1}^{N}\,\left( {\mathcal {F}}_{X}\,q\right) (x_{i},y_{i})\overline{\lambda (x_{i},y_{i})}\right) =0 \end{aligned}$$
(4.1)

since \({\mathcal {F}}_{X}\,q\) vanishes on X. Therefore, \(||r||_{\ell _1}\ge ||c||_{\ell _1}\). Equality can only hold if \(|q_{k,l}|=\mathrm {Re}\left( q_{k,l}\,\overline{P_{k,l}}\right) \) for all \((k,l)\not \in T.\) Since \(|P_{k,l}|<1\) for \((k,l)\notin T\), this forces q to vanish outside of T. Taking into account the injectivity of \({\mathcal {F}}_{TX}\) and the fact that \({\mathcal {F}}_X q\) vanishes on X, q vanishes identically and we get \(r=c.\) Thus, this shows that c is the unique minimizer \(c^{\sharp }\) of problem (3.5). \(\square \)

For the invertibility we can state the following obvious lemma.

Lemma 4.2

If \(N\ge |T|\) then \({\mathcal {F}}_{TX}\) is injective almost surely.
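
Under the real encoding of quaternions used in the sketch after (3.2), Lemma 4.2 can be checked numerically: injectivity of \({\mathcal {F}}_{TX}\) amounts to full column rank of a real \(4N\times 4|T|\) matrix. The support T and the parameters below are illustrative.

```python
import numpy as np

def quat_mul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def left_mul_matrix(a):
    # real 4x4 matrix of left multiplication by the quaternion a
    w, x, y, z = a
    return np.array([[w, -x, -y, -z], [x, w, -z, y], [y, z, w, -x], [z, -y, x, w]])

def atom(k, l, x, y):
    # Psi_{k,l}(x, y) = e^{I k x} e^{J l y}
    return quat_mul([np.cos(k*x), np.sin(k*x), 0.0, 0.0],
                    [np.cos(l*y), 0.0, np.sin(l*y), 0.0])

rng = np.random.default_rng(1)
T = [(1, -2), (0, 3), (2, 2), (-1, 0)]          # illustrative support, |T| = 4
N = 6                                            # N >= |T|
X = rng.uniform(0, 2 * np.pi, size=(N, 2))

# real embedding of F_TX: a (4N x 4|T|) block matrix of left-multiplication blocks
F_TX = np.block([[left_mul_matrix(atom(k, l, x, y)) for (k, l) in T] for (x, y) in X])
print(np.linalg.matrix_rank(F_TX) == 4 * len(T))   # full column rank, i.e. injective
```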

We need to show now that with high probability there exists a P with the required properties so that we can apply Lemma 4.1. To show this we proceed as in [28] but with the necessary modifications for the quaternionic case.

We introduce the restriction operator \(R_T:\ell _2({\mathbb {Z}}^2_\rho )\rightarrow \ell _2(T)\), \(R_Tc_{k,l}=c_{k,l}\) for \((k,l)\in T\). Its adjoint \(R^*_T=E_T:\ell _2(T)\rightarrow \ell _2({\mathbb {Z}}^2_\rho )\) is the operator that extends a vector outside T by zero, i.e., \((E_Td)_{k,l}=d_{k,l}\) for \((k,l)\in T\) and \((E_Td)_{k,l}=0\) otherwise.

Now, for the moment let us assume that \({\mathcal {F}}^*_{TX}{\mathcal {F}}_{TX}:\ell _2(T)\rightarrow \ell _2(T)\) is invertible. This is true almost surely under the condition of Lemma 4.2 since \({\mathcal {F}}_{TX}\) is then injective. In this case we can represent P explicitly in the form

$$\begin{aligned} P:={\mathcal {F}}^*_{X}{\mathcal {F}}_{TX}({\mathcal {F}}^*_{TX}{\mathcal {F}}_{TX})^{-1}R_T\mathrm {sgn}(c), \end{aligned}$$

It is easy to check that P fulfills property (i) and property (iii) in Lemma 4.1 with

$$\begin{aligned} \lambda :={\mathcal {F}}_{TX}({\mathcal {F}}^*_{TX}{\mathcal {F}}_{TX})^{-1}R_T\mathrm {sgn}\left( c_{k,l}\right) _{k,l} \in \ell _2(X). \end{aligned}$$

This means that we have to show that P fulfills (ii) of Lemma 4.1 with high probability.

To this end we represent P in terms of the following operators

$$\begin{aligned} H:\ell _2(T)\rightarrow \ell _2({\mathbb {Z}}^2_\rho ),\quad H_0:\ell _2(T)\rightarrow \ell _2(T), \end{aligned}$$

where

$$\begin{aligned} H:=N E_{T}-{\mathcal {F}}_{TX}^*{\mathcal {F}}_{TX}\quad \mathrm {and}\quad H_0:=N I_{T}-{\mathcal {F}}_{TX}^*{\mathcal {F}}_{TX}, \end{aligned}$$

and \(I_T\) denotes the identity operator on \(\ell _2(T)\). Furthermore, the matrix \(H_0\) is self-adjoint, and H acts on c as

$$\begin{aligned} \left( H_{}c\right) _{k,l}=-\sum _{i=1}^{N}\sum _{\begin{array}{c} ({\tilde{k}},{\tilde{l}})\in T\\ ({\tilde{k}},{\tilde{l}})\ne (k,l) \end{array}}e^{-{\mathbf {J}}l y_{i}}e^{{\mathbf {I}}({\tilde{k}}-k) x_{i}} e^{{\mathbf {J}}{\tilde{l}} y_{i}}\,c_{{\tilde{k}},{\tilde{l}}}, \end{aligned}$$
(4.2)

By using H and \(H_0\) we can rewrite P as

$$\begin{aligned} P = \left[ \left( NE_T-H\right) \left( NI_T-H_0\right) ^{-1}\right] R_T\mathrm {sgn}(c). \end{aligned}$$
(4.3)

After combining the two terms in the last representation of P, the result (4.3) looks very similar to the one in Rauhut [28], so we just adapt his approach below. Taking into account that we are interested in property (ii) from Lemma 4.1, we only need the values of P on \(T^c={\mathbb {Z}}^2_\rho {\setminus } T.\) Since \(R_{T^c}E_T=0\) we can write these components of P as

$$\begin{aligned} P_{k,\ell }=\left( -N^{-1}R_{T^c} H\bigg (I_T-N^{-1}H_0\bigg )^{-1}R_T\mathrm {sgn}(c)\right) _{k,\ell }\quad \mathrm {for\,all} \quad (k,\ell )\, \in T^{c}.\nonumber \\ \end{aligned}$$
(4.4)

Let us take a closer look at the operator \(\left( I_T-N^{-1}H_0\right) ^{-1}\). We would like to apply the Neumann series, but that would require \(\Vert N^{-1}H_0\Vert <1\), which we cannot ensure. Therefore, let us assume that we can ensure \(\Vert (N^{-1}H_0)^n\Vert <1\) for some \(n\in {\mathbb {N}}\). Then we get

$$\begin{aligned} \left( I_T-\left( N^{-1}H_0\right) ^n\right) ^{-1}=I_T+A_n \end{aligned}$$

with

$$\begin{aligned} A_n:=\sum _{r=1}^{\infty }\left( N^{-1}H_0\right) ^{rn} \end{aligned}$$
(4.5)

and, consequently,

$$\begin{aligned} \left( I_T-N^{-1}H_0\right) ^{-1} =(I_T+A_n)\displaystyle \sum _{m=0}^{n-1}\left( N^{-1}H_0\right) ^m. \end{aligned}$$

This means that on the complement of T we have

$$\begin{aligned} R_{T^c}P=-N^{-1}H(I_T+A_n) \left( \displaystyle \sum _{m=0}^{n-1}\left( N^{-1}H_0\right) ^m\right) R_T\mathrm {sgn}(c)=-\left( P^{(1)}+P^{(2)}\right) , \end{aligned}$$

where

$$\begin{aligned} P^{(1)}=S_n\mathrm {sgn}(c) \quad \mathrm {and}\quad P^{(2)}=\frac{1}{N}HA_nR_T(I+S_{n-1})\mathrm {sgn}(c), \end{aligned}$$

with \(S_n:=\displaystyle \sum \nolimits _{m=0}^{n-1}\big (N^{-1}HR_T\big )^m\).
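
The above manipulation only uses that \(N^{-1}H_0\) commutes with its own powers. The following sketch verifies the resulting identity numerically, with a generic random contraction playing the role of \(N^{-1}H_0\).

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((6, 6))
B *= 0.9 / np.linalg.norm(B, 2)      # spectral norm < 1, so in particular ||B^n|| < 1

n = 3
Bn = np.linalg.matrix_power(B, n)
A_n = Bn @ np.linalg.inv(np.eye(6) - Bn)    # A_n = sum_{r >= 1} (B^n)^r, cf. (4.5)

lhs = np.linalg.inv(np.eye(6) - B)
rhs = (np.eye(6) + A_n) @ sum(np.linalg.matrix_power(B, m) for m in range(n))
print(np.allclose(lhs, rhs))         # True
```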

Since we would like to know when property (ii) fails we need to estimate the probability \({\mathbb {P}}(\displaystyle \mathrm {sup}_{(k,l)\in T^c}|P_{k,l}|\ge 1)\) by estimating \(P^{(1)}\) and \(P^{(2)}\) individually. To make this splitting we consider two arbitrary numbers \(a_1, a_2>0\) satisfying \(a_1+a_2=1\). Then we have

$$\begin{aligned} {\mathbb {P}}\big (\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}|\ge 1\big )\le {\mathbb {P}}\left( \big \{\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}^{(1)}|\ge a_1\big \} \cup \big \{\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}^{(2)}|\ge a_2\big \}\right) .\nonumber \\ \end{aligned}$$
(4.6)

Thus, the first term can be estimated by

$$\begin{aligned} {\mathbb {P}}\big (|P_{k,l}^{(1)}|\ge a_1\big )= & {} {\mathbb {P}}\left( \left| \left( S_n\mathrm {sgn}(c)\right) _{k,l}\right| \ge a_1\right) \nonumber \\\le & {} {\mathbb {P}}\left( \sum _{m=1}^{n}|((N^{-1}HR_T)^m\mathrm {sgn}(c))_{k,l}|\ge a_1 \right) =:{\mathbb {P}}(E_{k,l}). \end{aligned}$$
(4.7)

while for the second term we have

$$\begin{aligned} \mathrm {sup}_{(k,\ell )\in T^c}|P_{k,l}^{(2)}|\le & {} \left||P^{(2)}\right||_{\infty }\nonumber \\\le & {} \left||N^{-1}HA_n\right||_{\ell ^{\infty }(T)\rightarrow \ell ^{\infty }(\Lambda )}( 1+\left||R_TS_{n-1}\mathrm {sgn}(c)\right||_{\ell ^{\infty }(T)}),\qquad \end{aligned}$$
(4.8)

where \(\Lambda :={\mathbb {Z}}^2_\rho \) and \(\ell ^{\infty }(\Lambda )\) denotes the space of sequences indexed by \({\mathbb {Z}}^2_\rho \) equipped with the supremum norm.

Let us first analyze the term \(\left||R_TS_{n-1}\mathrm {sgn}(c)\right||_{\ell ^{\infty }(T)}\). This term can be treated in the same way as the one in (4.7), and we obtain

$$\begin{aligned} {\mathbb {P}}\left( |(S_{n-1}\mathrm {sgn}(c))_{k,\ell }|\ge a_1\right) \le {\mathbb {P}}\bigg (\sum _{m=1}^{n}|((N^{-1}HR_T)^m\mathrm {sgn}(c))_{k,\ell }|\ge a_1 \bigg )={\mathbb {P}}(E_{k,\ell }) \end{aligned}$$

Now, let us analyze the norm of the operator in (4.8). By definition we know that \(||A_n||_{\infty }=\mathrm {sup}_r\sum _s|(A_n)_{r,s}|\) seen as a matrix operator. We also know that

$$\begin{aligned} \left||N^{-1}HA_n\right||_{\infty }\le \left||N^{-1}H\right||_{\infty }\left||A_n\right||_{{\infty }}. \end{aligned}$$
(4.9)

with \(A_n=\displaystyle \sum \nolimits _{r=1}^{\infty }\left( N^{-1}H_0\right) ^{rn}.\)

The term \(\left||A_n\right||_{\infty }\) can be estimated using the Frobenius norm. For the Frobenius norm we have \(\left||A\right||_F^2:=Tr(AA^{*})=\sum _{r,s}|A_{r,s}|^2\), where \(Tr(AA^{*})\) denotes the trace of \(AA^{*}\). Let us assume that

$$\begin{aligned} \left||\left( N^{-1}H_0\right) ^n\right||_F\le \kappa <1. \end{aligned}$$
(4.10)

From the definition (4.5) of \(A_n\), it follows that

$$\begin{aligned} \left||A_n\right||_F=\left||\sum _{r=1}^{\infty }\bigg (N^{-1}H_0\bigg )^{rn}\right||_F \le \sum _{r=1}^{\infty }\left||(N^{-1}H_0)^n\right||_F^r\le \sum _{r=1}^{\infty }\kappa ^r=\frac{\kappa }{1-\kappa }. \end{aligned}$$

Furthermore, since \(A_n\) has |T| columns, the Cauchy–Schwarz inequality gives

$$\begin{aligned} \left||A_n\right||_{\infty }^2\le |T|\,\sup _i \sum _j|(A_n)_{i,j}|^2\le |T|\left||A_n\right||_F^2. \end{aligned}$$
(4.11)

Thus, from (4.8)–(4.11), together with the bound \(\left||N^{-1}H\right||_{\infty }\le |T|\) (every entry of \(N^{-1}H\) has modulus at most 1 and each row has at most |T| non-zero entries) and \(\left||S_{n-1}\mathrm {sgn}(c)\right||_{\infty }<a_1,\) it follows

$$\begin{aligned} \mathrm {sup}_{(k,\ell )\in T^c}|P_{k,l}^{(2)}|\le (1+a_1)\frac{\kappa }{1-\kappa }\left| T\right| ^{\frac{3}{2}}. \end{aligned}$$
(4.12)

This leads to the condition

$$\begin{aligned} \frac{\kappa }{1-\kappa }\le \frac{a_2}{1+a_1}\left| T\right| ^{-\frac{3}{2}} \end{aligned}$$
(4.13)

to ensure \(\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}^{(2)}|\le a_2\) as intended.

Also it follows from (4.13) that \(\kappa <1\) and \(\left| T\right| \ge 1\) (we can exclude the case \(T=\emptyset \) since it corresponds to the trivial case of \(f=0\) where \(\ell _1\)-minimization will obviously recover f).

Now, since in Theorem 3.1 |T| is deterministic and in Theorem 3.3 \(\left| T\right| \) is a random variable we have to proceed differently in each case. In the case where |T| is deterministic we have

$$\begin{aligned} {\mathbb {P}}\big (\mathrm {sup}_{(k,\ell )\in T^c}|P_{k,\ell }|\ge 1\big ) \le \sum _{(k,\ell )\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}{\mathbb {P}}(E_{k,\ell })+{\mathbb {P}} \left( \left||(N^{-1}H_0)^n\right||_F \ge \kappa \right) .\nonumber \\ \end{aligned}$$
(4.14)

where we use the condition in (4.13).

Consider now the second case, where \(\left| T\right| \) is random as in Theorem 3.3. If we assume

$$\begin{aligned} |T|\le (\alpha +1){\mathbb {E}}|T| \end{aligned}$$

with \(\alpha >0\) and also assume

$$\begin{aligned} \frac{\kappa }{1-\kappa }\le \frac{a_2}{1+a_1}((\alpha +1){\mathbb {E}}|T|)^{-3/2} \end{aligned}$$
(4.15)

then clearly (4.13) is satisfied and consequently

$$\begin{aligned} \sup _{(k,l)\in T^c}\left| P_{k,l}^{(2)}\right| \le a_2. \end{aligned}$$

This means that we get from (4.6) the following estimate

$$\begin{aligned} {\mathbb {P}}\big (\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}|\ge 1\big )\le & {} {\mathbb {P}}\left( \bigcup _{(k,l)\in T^c}\bigg \{\left| P_{k,l}^{(1)}\right| \ge a_1\bigg \} \cup \big \{\left||R_T\mathrm {sgn}(c)\right||_{{\infty }}\ge a_1\big \}\right. \nonumber \\&\left. \cup \left\{ \left||(N^{-1}H_0)^n\right||_F\ge \kappa \right\} \cup \big \{\left| T\right| \ge (\alpha +1) {\mathbb {E}}\left| T\right| \big \}\right) \nonumber \\\le & {} {\mathbb {P}}\left( \bigcup _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2}E_{k,l}\cup \big \{\left||(N^{-1}H_0)^n\right||_F\ge \kappa \big \}\right. \nonumber \\&\left. \cup \left\{ \left| T\right| \ge (\alpha +1){\mathbb {E}}\left| T\right| \right\} \right) \nonumber \\\le & {} \sum _{(k,l)\in [-\rho ,\rho ]^2\cap {\mathbb {Z}}^2} {\mathbb {P}}(E_{k,l})+{\mathbb {P}}\left( \left||(N^{-1}H_0)^n\right||_F\ge \kappa \right) \nonumber \\&+\,{\mathbb {P}}\left( \left| T\right| \ge (\alpha +1){\mathbb {E}}\left| T\right| \right) . \end{aligned}$$
(4.16)

Note that |T| is the sum of independent random variables. For the third term of (4.16), we obtain

$$\begin{aligned} {\mathbb {P}}\big (\left| T\right|\ge & {} \alpha {\mathbb {E}}\left| T\right| +{\mathbb {E}}\left| T\right| \big )\le \exp \left( -3(\alpha {\mathbb {E}}\left| T\right| )^2/(6{\mathbb {E}}\left| T\right| +2(\alpha {\mathbb {E}}\left| T\right| ))\right) \nonumber \\= & {} \exp \left( -\frac{3\alpha ^2}{6+2\alpha }{\mathbb {E}}\left| T\right| \right) . \end{aligned}$$
(4.17)

Therefore, in the next two subsections we estimate the quantities \({\mathbb {P}}(\Vert {(N^{-1}H_0)^n}\Vert _F\ge \kappa )\) and \({\mathbb {P}}(E_{k,\ell }),\) respectively.

4.1 Analysis of the Powers of \(H_0\)

The objective of this section is to estimate the second term in (4.16) and (4.14). To this end we need to estimate the Frobenius norm of our random matrix \(H_0\). For this purpose we will estimate \(\left||H_0^n\right||_F^2\) by means of Markov's inequality. We will start with some auxiliary lemmas which will be needed for the proof of Lemma 4.5, in which we determine the expectation value with respect to the random sampling set \(X=\{(x_{1},y_{1}),\ldots ,(x_{N},y_{N})\}.\)

Lemma 4.3

Consider \(\theta _r\in {\mathbb {R}}\), \(r=1,\ldots ,n\), and \({\underline{a}}=(a_1,\ldots ,a_n)\in \{-1,1\}^n.\) For

$$\begin{aligned} \lambda (a_r,\theta _r)={\left\{ \begin{array}{ll}\cos \theta _r, &{}\quad a_r=+1\\ \sin \theta _r, &{}\quad a_r=-1, \end{array}\right. } \end{aligned}$$

it holds

$$\begin{aligned} 2^{n-1}\prod _{r=1}^{n}\lambda (a_r,\theta _r)= \sum _{\alpha _2,\ldots ,\alpha _n\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _n^{\delta _n}\lambda \left( \prod _{j=1}^{n}a_j,\theta _1+\alpha _2\theta _2+\cdots +\alpha _n\theta _n\right) \end{aligned}$$

such that \(\delta _1:=0\) and

$$\begin{aligned} \delta _n:=\delta \left( \prod _{j=1}^{n-1}a_j,a_n\right) \end{aligned}$$
(4.18)

with

$$\begin{aligned} \delta (a_m,a_{m+1})={\left\{ \begin{array}{ll}\frac{|a_m|-a_m}{2}, &{}\quad a_m\cdot a_{m+1}=+1\\ \frac{|a_m|+a_m}{2}, &{}\quad a_m\cdot a_{m+1}=-1. \end{array}\right. } \end{aligned}$$
(4.19)

Proof

The proof is given by mathematical induction. In the case \(n=1\) both sides reduce to \(\lambda (a_1,\theta _1)\) (the sum over \(\alpha _2,\ldots ,\alpha _n\) is empty), so the identity holds trivially. Now, let us assume that our equality holds for some natural number n. We have to show that it holds for \(n+1\). Here, we can state

$$\begin{aligned} 2^{n}\prod _{r=1}^{n+1}\lambda (a_r,\theta _r)= & {} 2\cdot 2^{n-1}\prod _{r=1}^{n}\lambda (a_r,\theta _r)\cdot \lambda (a_{n+1},\theta _{n+1})\\= & {} 2\sum _{\alpha _2,\ldots ,\alpha _n\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _n^{\delta _n}\lambda \nonumber \\&\times \left( \prod _{j=1}^{n}a_j,\theta _1 +\alpha _2\theta _2+\cdots +\alpha _n\theta _n\right) \cdot \lambda (a_{n+1},\theta _{n+1})\\= & {} \sum _{\alpha _2,\ldots ,\alpha _n,\alpha _{n+1}\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _n^{\delta _n}\cdot \alpha _{n+1}^{\delta _{n+1}}\lambda \nonumber \\&\times \left( \prod _{j=1}^{n+1}a_j,\theta _1 +\alpha _2\theta _2+\cdots +\alpha _n\theta _n+\alpha _{n+1}\theta _{n+1}\right) . \end{aligned}$$

\(\square \)

Lemma 4.4

Let \(n\in {\mathbb {N}}.\) It holds

$$\begin{aligned} \prod _{r=1}^{n}\bigg (e^{{\mathbf {J}}A_r}\cos \theta _r+e^{{\mathbf {J}}C_r}\sin \theta _r{\mathbf {I}}\bigg )=\sum _{{\underline{a}}\in \{-1,1\}^n}e^{{\mathbf {J}}\sum _{r=1}^{n}{\tilde{\sigma }}(a_r)}{\mathbf {I}}^{\frac{1-\prod _{j=1}^{n}a_j}{2}}\prod _{r=1}^{n}\lambda (a_r,\theta _r),\qquad \quad \quad \end{aligned}$$
(4.20)

where

$$\begin{aligned} {\tilde{\sigma }}({\underline{a}},r)={\left\{ \begin{array}{ll}(-1)^{\pi (r)}A_r, &{}\quad a_r=+1\\ (-1)^{\pi (r)}C_r, &{}\quad a_r=-1, \end{array}\right. }\quad \lambda (a_r,\theta _r)={\left\{ \begin{array}{ll}\cos \theta _r, &{}\quad a_r=+1\\ \sin \theta _r, &{} \quad a_r=-1, \end{array}\right. } \end{aligned}$$

such that \(\pi (1)=0\) and \(\pi (n)=\sum _{j=1}^{n-1}\frac{|a_j|-a_j}{2}\) \((n>0)\) which counts the number of \(\sin \)’s in the vector \({\underline{a}}=({\underline{a}}_1,\ldots ,{\underline{a}}_n)\in \{-1,1\}^n\).

Proof

Again, the proof will be given by mathematical induction.

For the case \(n=1\) we have

$$\begin{aligned} e^{{\mathbf {J}}A_1}\cos \theta _1+e^{{\mathbf {J}}C_1}\sin \theta _1{\mathbf {I}}=\sum _{a_1\in \{-1,1\}}e^{{\mathbf {J}}{\tilde{\sigma }}(a_1)}{\mathbf {I}}^{\frac{1-a_1}{2}}\lambda (a_1,\theta _1)=e^{{\mathbf {J}}{\tilde{\sigma }}(+1)}\cos \theta _1+e^{{\mathbf {J}}{\tilde{\sigma }}(-1)}\sin \theta _1\,{\mathbf {I}}, \end{aligned}$$

where \({\tilde{\sigma }}(+1)=(-1)^{\pi (1)}A_1=A_1\) and \({\tilde{\sigma }}(-1)=(-1)^{\pi (1)}C_1=C_1.\) Note that we have \(\pi (1)=0\) as initial condition. For the induction step, let \(n\in {\mathbb {N}}\) be given and suppose (4.20) is true for n. Then

$$\begin{aligned}&\prod _{r=1}^{n+1}\bigg (e^{{\mathbf {J}}A_r}\cos \theta _r+e^{{\mathbf {J}}C_r}\sin \theta _r{\mathbf {I}}\bigg )\\&\quad =\prod _{r=1}^{n}\bigg (e^{{\mathbf {J}}A_r}\cos \theta _r+e^{{\mathbf {J}}C_r}\sin \theta _r{\mathbf {I}}\bigg )\left( e^{{\mathbf {J}}A_{n+1}}\cos \theta _{n+1}+e^{{\mathbf {J}}C_{n+1}}\sin \theta _{n+1}{\mathbf {I}}\right) \\&\quad =\sum _{{\underline{a}}\in \{-1,1\}^n}e^{{\mathbf {J}}\sum _{r=1}^{n}{\tilde{\sigma }}(a_r)}{\mathbf {I}}^{\frac{1-\prod _{j=1}^{n}a_j}{2}}\prod _{r=1}^{n}\lambda (a_r,\theta _r)\left( e^{{\mathbf {J}}A_{n+1}}\cos \theta _{n+1}+e^{{\mathbf {J}}C_{n+1}}\sin \theta _{n+1}{\mathbf {I}}\right) \\&\quad =\sum _{{\underline{a}}\in \{-1,1\}^n}e^{{\mathbf {J}}\sum _{r=1}^{n}{\tilde{\sigma }}(a_r)+(-1)^{\pi (n+1)}A_{n+1}}{\mathbf {I}}^{\frac{1-\prod _{j=1}^{n}a_j}{2}}\prod _{r=1}^{n}\lambda (a_r,\theta _r)\cos \theta _{n+1}\\&\qquad +\sum _{{\underline{a}}\in \{-1,1\}^n}e^{{\mathbf {J}}\sum _{r=1}^{n}{\tilde{\sigma }}(a_r)+(-1)^{\pi (n+1)}C_{n+1}}{\mathbf {I}}^{\frac{1-\prod _{j=1}^{n}a_j}{2}}\prod _{r=1}^{n}\lambda (a_r,\theta _r)\sin \theta _{n+1}{\mathbf {I}}\\&\quad =\sum _{\underline{\underline{a}}\in \{-1,1\}^{n+1}}e^{{\mathbf {J}}\sum _{r=1}^{n+1}{\tilde{\sigma }}(a_r)}{\mathbf {I}}^{\frac{1-\prod _{j=1}^{n+1}a_j}{2}}\prod _{r=1}^{n+1}\lambda (a_r,\theta _r), \end{aligned}$$

such that \({\underline{\underline{a}}}=({\underline{a}},a_{n+1})\in \{-1,1\}^{n+1}\) and \(a_{n+1}=\pm 1.\) \(\square \)

Using the above lemmas we can now estimate the expectation value of the Frobenius norm of our random matrix \(H_0^n\). The need for these lemmas stems from the fact that we cannot use a direct approach via exponentials due to the non-commutativity of quaternions.

Lemma 4.5

It holds

$$\begin{aligned}&{\mathbb {E}}_X\left[ \left||H_0^n\right||_F^{2}\right] =\sum _{t=1}^{\min \{n,N\}}\displaystyle \frac{N!}{(N-t)!}\sum _{{\mathcal {A}}\in P(2n,t)}\sum _{\begin{array}{c} (k_1,l_1),\ldots ,(k_{2n},l_{2n})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}) \end{array}}\\&\quad \times \prod _{A\in {\mathcal {A}}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=+1 \end{array}} \delta \left( \sum _{r\in A}{\tilde{\varrho }}(a_r)\right) 2^{1-|A|}\sum _{\alpha _2,\ldots ,\alpha _n\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _{|A|}^{\delta _{|A|}}\delta \left( \theta _1+\sum _{r\in A}\alpha _r\theta _r\right) , \end{aligned}$$

where \(\delta (n)\) denotes the Kronecker delta \(\delta _{0n}\), \((k_{2n+1},l_{2n+1})=(k_1,l_1)\), and

$$\begin{aligned} {\tilde{\varrho }}({\underline{a}},l_r)={\left\{ \begin{array}{ll}(-1)^{\pi (r)}(l_{r+1}-l_{r}), &{}\quad a_r=+1\\ -(-1)^{\pi (r)}(l_{r+1}+l_{r}), &{}\quad a_r=-1,\end{array}\right. }\quad \theta _s=(k_{s+1}-k_{s}) \end{aligned}$$

such that \(\pi (|A|)=\sum _{j=1}^{|A|-1}\frac{|a_j|-a_j}{2}\).

Proof

Since \(H_0^n\) is self-adjoint we have from the definition \({\mathbb {E}}_X[\left||H_0^n\right||_F^{2}]={\mathbb {E}}_X[\mathrm {tr}{H_0^{2n}}]\). Consider

$$\begin{aligned} H_0\left[ (k_1,l_1),(k_2,l_2)\right]= & {} (1-\delta _{k_1,k_2}\delta _{l_1,l_2})\sum _{i_1=1}^{N}\left[ e^{-{\mathbf {J}}l_1 y_{i_1}} e^{{\mathbf {I}}(k_2-k_1) x_{i_1}} e^{{\mathbf {J}}l_2 y_{i_1}} \right] ,\nonumber \\&(k_1,l_1),(k_2,l_2)\in T. \end{aligned}$$

Algebraic operations with quaternions are a major obstacle because quaternions do not commute, so we have to find a different strategy. What plays to our advantage is the anti-commutativity of the basis elements, that is, \({\mathbf {I}}{\mathbf {J}}=-{\mathbf {J}}{\mathbf {I}}\), which in particular implies \({\mathbf {I}}e^{{\mathbf {J}}y}=e^{-{\mathbf {J}}y}{\mathbf {I}}\). Thus, we can simplify our expression as

$$\begin{aligned} e^{-{\mathbf {J}}l_r y_{i_r}} e^{{\mathbf {I}}(k_{r+1}-k_r) x_{i_r}} e^{{\mathbf {J}}l_{r+1} y_{i_{r}}}= & {} e^{{\mathbf {J}}(l_{r+1}-l_r) y_{i_r}}\cos \left( [k_{r+1}-k_r]x_{i_r}\right) \nonumber \\&+\,\,e^{-{\mathbf {J}}(l_{r+1}+l_r) y_{i_r}}\,\sin \left( [k_{r+1}-k_r]x_{i_r}\right) {\mathbf {I}}.\quad \quad \end{aligned}$$
(4.21)
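
Identity (4.21) is easily verified numerically; the following sketch does this for arbitrarily chosen integer frequencies and sample points.

```python
import numpy as np

def quat_mul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def expI(t):
    # e^{I t}
    return np.array([np.cos(t), np.sin(t), 0.0, 0.0])

def expJ(t):
    # e^{J t}
    return np.array([np.cos(t), 0.0, np.sin(t), 0.0])

I_quat = np.array([0.0, 1.0, 0.0, 0.0])
l_r, l_r1, k_diff, x, y = 2, -3, 4, 0.8, 1.9     # illustrative frequencies and sample point

lhs = quat_mul(quat_mul(expJ(-l_r * y), expI(k_diff * x)), expJ(l_r1 * y))
rhs = expJ((l_r1 - l_r) * y) * np.cos(k_diff * x) \
      + quat_mul(expJ(-(l_r1 + l_r) * y), I_quat) * np.sin(k_diff * x)
print(np.allclose(lhs, rhs))   # True
```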

Furthermore, we can write the 2n-th power of the matrix \(H_0\) as follows

$$\begin{aligned}&H_0^{2n}\left[ (k_1,l_1),(k_{2n+1},l_{2n+1})\right] \\&\quad = \sum _{\begin{array}{l} (k_2,l_2),\ldots ,(k_{2n},l_{2n})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}) \end{array}}H_0\left[ (k_1,l_1),(k_2,l_2)\right] \ldots H_0\left[ (k_{2n},l_{2n}),(k_{2n+1},l_{2n+1})\right] \\&\quad = \sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{2n}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{2n},l_{2n})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}), j=1,\ldots ,2n \end{array}} \left[ \prod _{r=1}^{2n}e^{-{\mathbf {J}}l_r y_{i_r}} e^{{\mathbf {I}}(k_{r+1}-k_r) x_{i_r}} e^{{\mathbf {J}}l_{r+1} y_{i_{r}}} \right] . \end{aligned}$$

Since we need the diagonal elements of the 2n-th power of our matrix to calculate the norm \(\left||H_0^n\right||_F^{2}=\mathrm {Tr}H_0^{2n}\), we require \((k_{2n+1},l_{2n+1})=(k_1,l_1).\) Therefore, using the linearity of the expectation value it follows

$$\begin{aligned}&{\mathbb {E}}_X\left[ \mathrm {Tr}H_0^{2n}\right] \\&\quad =\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{2n}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_1,\ell _1),\ldots ,(k_{2n},l_{2n})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}) \end{array}} {\mathbb {E}}_X\left[ \prod _{r=1}^{2n}e^{-{\mathbf {J}}l_r y_{i_r}} e^{{\mathbf {I}}(k_{r+1}-k_r) x_{i_r}} e^{{\mathbf {J}}l_{r+1} y_{i_{r}}} \right] . \end{aligned}$$

Furthermore, some indices \({i_r}\) might coincide, which means that we cannot directly use the product rule for the expectation value, since it is only valid for independent random variables. This is where we consider a set partition. We associate to a given vector \(i_1,\ldots ,i_{2n}\) a partition \({\mathcal {A}}=\left( A_1, A_2, \ldots , A_t\right) \) of \(\{1,\ldots ,2n\}\) such that \(i_{r}=i_{r'}\) if and only if r and \(r'\) are contained in the same set \(A_i\in {\mathcal {A}}\). This allows us to unambiguously write \(i_A\) instead of \(i_r\) if \(r\in A\). Now, the independence of the variables \((x_{A}, y_A)\) yields

$$\begin{aligned}&{\mathbb {E}}_X\left[ \prod _{r=1}^{2n} \exp \left( -{\mathbf {J}}l_r y_{i_r}\right) \exp \left( {\mathbf {I}}(k_{r+1}-k_r) x_{i_r}\right) \exp \left( {\mathbf {J}}l_{r+1} y_{{i_r}}\right) \right] \nonumber \\&\quad ={\mathbb {E}}_X\left[ \prod _{A\in {\mathcal {A}}}\prod _{r\in A} \exp \left( -{\mathbf {J}}l_r y_{A}\right) \exp \left( {\mathbf {I}}(k_{r+1}-k_r) x_{A}\right) \exp \left( {\mathbf {J}}l_{r+1} y_{{A}}\right) \right] \nonumber \\&\quad =\prod _{A\in {\mathcal {A}}}{\mathbb {E}}_X\left[ \prod _{r\in A} \exp \left( -{\mathbf {J}}l_r y_{A}\right) \exp \left( {\mathbf {I}}(k_{r+1}-k_r) x_{A}\right) \exp \left( {\mathbf {J}}l_{r+1} y_{{A}}\right) \right] .\qquad \end{aligned}$$
(4.22)

Let us introduce now

$$\begin{aligned} {\tilde{\sigma }}({\underline{a}},l_r)={\left\{ \begin{array}{ll}(-1)^{\pi (r)}(l_{r+1}-l_{r})y_{A}, &{}\quad a_r=+1\\ (-1)^{\pi (r)}(-l_{r+1}-l_{r})y_{A}, &{}\quad a_r=-1, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \lambda (a_r,\theta _r)={\left\{ \begin{array}{ll}\cos (k_{r+1}-k_r)x_{A}, &{}\quad a_r=+1\\ \sin (k_{r+1}-k_r)x_{A}, &{}\quad a_r=-1 \end{array}\right. }, \end{aligned}$$

and by applying Lemma 4.4 we get for the expectation value in (4.22)

$$\begin{aligned}&{\mathbb {E}}_X\left[ \prod _{r\in A} \exp \left( -{\mathbf {J}}l_r y_{A}\right) \exp \left( {\mathbf {I}}(k_{r+1}-k_r) x_{A}\right) \exp \left( {\mathbf {J}}l_{r+1} y_{{A}}\right) \right] \nonumber \\&\quad ={\mathbb {E}}_X\left[ \sum _{{\underline{a}}\in \{-1,1\}^r}e^{{\mathbf {J}}\sum _{r\in A}{\tilde{\sigma }}(a_r)}{\mathbf {I}}^{\frac{1-\prod _{j}a_j}{2}}\prod _{r\in A}\lambda (a_r,\theta _r)\right] \nonumber \\&\quad ={\mathbb {E}}_X\left[ \sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=+1 \end{array}}e^{{\mathbf {J}}\sum _{r\in A}{\tilde{\sigma }}(a_r)}\prod _{r\in A}\lambda (a_r,\theta _r)\right] +{\mathbb {E}}_X\left[ \sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=-1 \end{array}}e^{{\mathbf {J}}\sum _{r\in A}{\tilde{\sigma }}(a_r)}{\mathbf {I}}\prod _{r\in A}\lambda (a_r,\theta _r)\right] . \end{aligned}$$
(4.23)

The last line in (4.23) can be analyzed as follows: when \(a_1\ldots a_r=+1\) the term does not contain \({\mathbf {I}}\), and when \(a_1\ldots a_r=-1\) the term contains \({\mathbf {I}}\). This is easily seen: in the first case the number of \(a_j=-1\), which corresponds to the number of factors containing \({\mathbf {I}}\), is even, so the basis element \({\mathbf {I}}\) disappears because \({\mathbf {I}}^{2r}=\pm 1\), with \(r\in {\mathbb {N}}\). For the second term in (4.23) the number of factors with \(a_j=-1\) is odd, matching the number of \(\sin \)-functions in the product, so the basis element \({\mathbf {I}}\) does not vanish, since \({\mathbf {I}}^{2r-1}=\pm {\mathbf {I}}\), with \(r\in {\mathbb {N}}\).

We emphasize here that the expectation value of the second term vanishes because it is given in terms of \(\sin \)-functions, which integrate to zero when \(x_{A}\) is integrated over \([0,2\pi ]\). Therefore, only the first term survives. In turn, the first term has \(2^{n-1}\) sub-terms. Thus, we can state

$$\begin{aligned}&{\mathbb {E}}_X\left[ \sum _{{ \begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=+1 \end{array}}}e^{{\mathbf {J}}\sum _{r\in A}{\tilde{\sigma }}(a_r)}\prod _{r\in A}\lambda (a_r,\theta _r)\right] \\&\quad =\sum _{{ \begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=+1 \end{array}}}\frac{1}{(2\pi )^2}\int _{[0,2\pi ]}e^{{\mathbf {J}}\sum _{r\in A}{\tilde{\sigma }}(a_r)}dy\int _{[0,2\pi ]}\prod _{r\in A}\lambda (a_r,\theta _r)dx \end{aligned}$$

where

$$\begin{aligned} {\tilde{\sigma }}(a_r)={\left\{ \begin{array}{ll}(-1)^{\pi (r)}(l_{r+1}-l_{r})y_{r}, &{}\quad a_r=+1\\ -(-1)^{\pi (r)}(l_{r+1}+l_{r})y_{r}, &{}\quad a_r=-1, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \lambda (a_r,\theta _r)={\left\{ \begin{array}{ll}\cos [(k_{r+1}-k_{r})x_{r}], &{}\quad a_r=+1\\ \sin [(k_{r+1}-k_{r})x_{r}], &{}\quad a_r=-1, \end{array}\right. } \end{aligned}$$

such that \(\pi (n)=\sum _{j=1}^{n-1}\frac{1-a_j}{2}\). The quantity \(\pi (n)\) does nothing else than count the number of \(\sin \)-functions. By invoking Lemma 4.3 for the whole product we obtain

$$\begin{aligned}&{\mathbb {E}}_X\left[ \exp \left( {\mathbf {J}}\sum _{r\in A}{\tilde{\sigma }}({\underline{a}},l_r)\right) \prod _{s\in A}\lambda (a_r,\theta _r)\right] \\&\quad =\sum _{{ \begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=+1 \end{array}}}\delta \left( \sum _{r\in A}{\tilde{\varrho }}(a_r)\right) 2^{1-|A|}\sum _{\alpha _2,\ldots ,\alpha _n\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _{|A|}^{\delta _{|A|}}\delta \left( \theta _1 +\sum _{r\in A}^{}\alpha _r\theta _r\right) , \end{aligned}$$

such that

$$\begin{aligned} {\tilde{\varrho }}({\underline{a}},l_r)={\left\{ \begin{array}{ll}(-1)^{\pi (r)}(l_{r+1}-l_{r}), &{}\quad a_r=+1\\ -(-1)^{\pi (r)}(l_{r+1}+l_{r}), &{}\quad a_r=-1. \end{array}\right. } \end{aligned}$$
(4.24)

Furthermore, the condition \(|A|\ge 2\) for all \(A\in {\mathcal {A}}\) should be satisfied, i.e., we only have to consider partitions in P(2n,t) whose blocks contain at least two elements, which forces \(t\le n\). Moreover, the number of ways of assigning pairwise different values \(i_{A}\in \{1,\ldots ,N\}\) to the blocks \(A_1,A_2,\ldots ,A_t\) is exactly \(N\cdots (N-t+1)=N!/(N-t)!\) if \(N\ge t\) and 0 if \(N< t.\) \(\square \)

For later reference let us introduce the following notation

$$\begin{aligned} C_{{\mathbb {H}}}({\mathcal {A}},T):= & {} \sum _{\begin{array}{c} (k_1,l_1),\ldots ,(k_{2n},l_{2n})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}) \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^r\\ a_1\ldots a_r=+1 \end{array}}\prod _{A\in {\mathcal {A}}}\nonumber \\&\times \, \delta \left( \sum _{s\in A}{\tilde{\varrho }}(a_s)\right) 2^{1-|A|}\sum _{\alpha _2,\ldots ,\alpha _n\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _{|A|}^{\delta _{|A|}}\delta \left( \theta _1 +\sum _{r\in A}^{}\alpha _r\theta _r\right) ,\nonumber \\ \end{aligned}$$
(4.25)

with

$$\begin{aligned} {\tilde{\varrho }}({\underline{a}},l_s)={\left\{ \begin{array}{ll}\beta _s(l_{s+1}-l_{s}), &{}\quad a_s=+1\\ -\beta _s(l_{s+1}+l_{s}), &{}\quad a_s=-1. \end{array}\right. } \end{aligned}$$

and \(\beta _s=(-1)^{\pi (s)}\) where \(\pi (s)=\sum _{j=1}^{s-1}\frac{|a_j|-a_j}{2}\).

4.2 Analysis of \({\mathbb {P}}(E_{k,l})\)

While we have obtained the estimate for the Frobenius norm of \(H_0^n\) in the previous section, we still need to study the probability \({\mathbb {P}}(E_{k,l})\), i.e., the first term in (4.16) and (4.14). Similarly to [28], consider numbers \(\beta _m>0\), \(m=1,\ldots ,n\), such that

$$\begin{aligned} \sum _{m=1}^{n}\beta _m=a_1 \end{aligned}$$

and \(K_m\in {\mathbb {N}},\) \(m=1,\ldots ,n,\) some natural numbers. Let \((k,l)\in {\mathbb {Z}}^2_\rho .\) By applying Markov's inequality it follows

$$\begin{aligned} {\mathbb {P}}(E_{k,l})= & {} {\mathbb {P}}\left( \sum _{m=1}^{n}|((N^{-1}HR_T)^m\mathrm {sgn}(c))_{k,l}|\ge a_1\right) \nonumber \\\le & {} \sum _{m=1}^{n}{\mathbb {P}}\left( N^{-m}|((HR_T)^m\mathrm {sgn}(c))_{k,l}|\ge \beta _m\right) \nonumber \\= & {} \sum _{m=1}^{n}{\mathbb {P}}\left( N^{-2mK_m}|((HR_T)^m\mathrm {sgn}(c))_{k,l}|^{2K_m}\ge \beta _m^{2K_m}\right) \nonumber \\\le & {} \sum _{m=1}^{n}{\mathbb {E}}_X\left( |((HR_T)^m\mathrm {sgn}(c))_{k,l}|^{2K_m}\right) N^{-2mK_m}\beta _m^{-2K_m}. \end{aligned}$$
(4.26)

Let us consider \(\beta _m=\beta ^{n/K_m}\), i.e., \(\beta _m^{-2K_m}=\beta ^{-2n}\). Now, from (4.26) it follows

$$\begin{aligned} {\mathbb {P}}(E_{k,l})\le \beta ^{-2n}\sum _{m=1}^{n}{\mathbb {E}}_X\left( |((HR_T)^m\mathrm {sgn}(c))_{k,l}|^{2K_m}\right) N^{-2mK_m} \end{aligned}$$
(4.27)

while the condition \(a_1=\sum _{m=1}^{n}\beta _m\) can be written as

$$\begin{aligned} a_1=a=\sum _{m=1}^{n}\beta ^{n/K_m}<1. \end{aligned}$$

The following lemma addresses the expectation value appearing in (4.27). While Lemma 4.6 is similar to Lemma 4.5, it requires much more work and, unfortunately, lengthy calculations. For these lengthy calculations, which form part of the proof, we refer to Appendix B of the Ph.D. thesis [19].

Lemma 4.6

For a sequence \(c=(c_{k,l})\in \ell _2({\mathbb {Z}}^2_\rho )\) with \(\mathrm {supp}\,c=T\), it holds

$$\begin{aligned}&{\mathbb {E}}_X\left[ \left| \big ((HR_T)^m\mathrm {sgn}(c)\big )_{k,l}\right| ^{2K}\right] \\&\quad \le \sum _{t=1}^{\min \{Km,N\}}\frac{N!}{(N-t)!}\sum _{{\mathcal {A}}\in P(2Km,t)} \sum _{\begin{array}{c} (k_1^{(1)},l_1^{(1)}),\ldots ,(k_{m}^{(1)},l_{m}^{(1)})\in T\\ \vdots \\ (k_1^{(K)},l_1^{(K)}),\ldots ,(k_{m}^{(K)},l_{m}^{(K)})\in T\\ (k_j^{(p)},l_j^{(p)})\ne (k_{j+1}^{(p)},l_{j+1}^{(p)}) \end{array}} 2^{3-m}\\&\qquad \times \sum _{(r_1,\ldots ,r_K)\subset V(\{1,\ldots ,16\},K)} \sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1^{(1)}\ldots a_m^{(1)}=\pm 1\\ \vdots \\ a_1^{(K)}\ldots a_m^{(K)}=\pm 1 \end{array}} \sum _{\begin{array}{c} \alpha _2^{(1)},\ldots ,\alpha _m^{(1)}\in \{-1,1\}\\ \vdots \\ \alpha _2^{(K)},\ldots ,\alpha _m^{(K)}\in \{-1,1\} \end{array}}\\&\qquad \times \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{(r,p)\in A}\alpha _r^{(p)}\theta _r^{(p)}\right) \delta \left( \sum _{(s,p)\in A}\beta _s^{(p)}\phi _s^{(p)}\right) , \end{aligned}$$

where \(\theta _r^{(p)} = (k_{r+1}^{(p)}-k_r^{(p)})\) and \(\phi _s^{(p)} = {\left\{ \begin{array}{ll}(l_{s+1}^{(p)}-l_{s}^{(p)}), &{} a_s=+1\\ -(l_{s+1}^{(p)}+l_{s}^{(p)}), &{} a_s=-1,\end{array}\right. }\) with \(\alpha _r^{(p)}\in \{-1,1\}\) and \(\beta _s^{(p)}=(-1)^{\pi (s)}\) such that \(\pi (s)=\sum _{j=1}^{s-1}\frac{|a_j|-a_j}{2}\) counts the number of \(\sin \)’s.

Proof

We want to compute the expectation value \({\mathbb {E}}_X\) of the Kth power of the squared modulus, i.e., the expectation value of the quantity \(\left| ((H_{}R_T)^m\mathrm {sgn}(c_{k,l}))_{k,l}\right| ^{2K}\). But there exists a major problem due to the non-commutativity in the quaternionic setting. While in the previous proof we could still use the anti-commutativity of the basis elements \({\mathbf {I}}\) and \({\mathbf {J}}\), we now also have to take into account quaternion-valued coefficients. That means we first have to compute \((HR_T)^m\) and then multiply it by the coefficient vector of the signal. We do this by writing the above quantity as \((|((HR_T)^{m}\sigma )_{k,l}|^2)^K= (Q_0^2 + Q_1^2 + Q_2^2 + Q_3^2)^K.\) Of course, breaking the power of a quaternion into several small pieces and joining them afterwards usually involves lengthy calculations. For the resulting large formulae we refer to Appendix B of the Ph.D. thesis [19]. Thus, we get

$$\begin{aligned}&\left( (HR_T)\left[ (k_1,l_1),(k_{m+1},l_{m+1})\right] \right) ^{m} \\&\quad = \sum _{ \begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}HR_T\left[ (k_1,l_1),(k_2,l_2)\right] \ldots HR_T\left[ (k_{m},l_{m})\,(k_{m+1},l_{m+1})\right] \\&\quad = \sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}} \left[ \prod _{r=1}^{m}e^{-{\mathbf {J}}l_r y_{i_r}} e^{{\mathbf {I}}(k_{r+1}-k_r) x_{i_r}} e^{{\mathbf {J}}l_{r+1} y_{i_{r}}} \right] \\&\quad =\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ \end{array}}e^{{\mathbf {J}}\sum _{r=1}^m\beta _r\phi _r} \prod _{s=1}^{m}\lambda (a_s,\theta _s)\\&\quad =\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=+1 \end{array}}2^{1-m}\\&\qquad \times e^{{\mathbf {J}}\sum _{r=1}^m\beta _r\phi _r}\sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \lambda \left( \prod _{j=1}^{m}a_j,\theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) \\&\quad +\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=-1 \end{array}}2^{1-m}\\&\qquad \times \, e^{{\mathbf {J}}\sum _{r=1}^m\beta _r\phi _r}\sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \lambda \left( \prod _{j=1}^{m}a_j,\theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) . \end{aligned}$$

In accordance with the proof of our previous Lemma 4.3, we have

$$\begin{aligned}&\left( (HR_T)\left[ (k_1,l_1),(k_{m+1},l_{m+1})\right] \right) ^{m}\\&\quad =\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=+1 \end{array}}2^{1-m}\\&\qquad \times \, e^{{\mathbf {J}}\sum _{r=1}^m\beta _r\phi _r}\sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \cos \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) \\&\qquad +\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=-1 \end{array}}2^{1-m}\\&\qquad \times \, e^{{\mathbf {J}}\sum _{r=1}^m\beta _r\phi _r}\sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \sin \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) {\mathbf {I}}\\&\quad =\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=+1 \end{array}}2^{1-m}\\&\qquad \times \left( \cos \left( \sum _{r=1}^m\beta _r\phi _r\right) +{\mathbf {J}}\sin \left( \sum _{r=1}^m\beta _r\phi _r\right) \right) \sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \cos \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) \\&\qquad +\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=-1 \end{array}}2^{1-m} \end{aligned}$$
$$\begin{aligned}&\left( \cos \left( \sum _{r=1}^m\beta _r\phi _r\right) +{\mathbf {J}}\sin \left( \sum _{r=1}^m\beta _r\phi _r\right) \right) \sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _n^{\delta _m} \sin \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) {\mathbf {I}}\\&\quad =\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=+1 \end{array}}2^{1-m}\\&\qquad \times \cos \left( \sum _{r=1}^m\beta _r\phi _r\right) \sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \cos \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) \\&\qquad +\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=+1 \end{array}}2^{1-m}\\&\qquad \times \sin \left( \sum _{r=1}^m\beta _r\phi _r\right) \sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \cos \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) {\mathbf {J}}\\&\qquad +\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=-1 \end{array}}2^{1-m}\\&\qquad \times \cos \left( \sum _{r=1}^m\beta _r\phi _r\right) \sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \sin \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) {\mathbf {I}}\\&\qquad +\sum _{\begin{array}{c} i_1=1\\ \vdots \\ i_{m}=1 \end{array}}^{N}\sum _{\begin{array}{c} (k_2,l_2),\ldots ,(k_{m},l_{m})\in T\\ (k_j,l_j)\ne (k_{j+1},l_{j+1}),\,j=1,\ldots ,m \end{array}}\sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1\ldots a_m=-1 \end{array}}2^{1-m}\\&\qquad \times \sin \left( \sum _{r=1}^m\beta _r\phi _r\right) \sum _{\alpha _2,\ldots ,\alpha _m\in \{-1,1\}}\alpha _2^{\delta _2}\ldots \alpha _m^{\delta _m} \sin \left( \theta _1+\sum _{r=2}^{m}\alpha _r\theta _r\right) {\mathbf {J}}{\mathbf {I}}\\&\quad =P_{0}+{\mathbf {I}}P_{1}+{\mathbf {J}}P_{2}-{\mathbf {K}}P_{3}. \end{aligned}$$

Let us consider \(\sigma _m=\sigma _m(k_1,l_1)=\sigma _{0m}(k_1,l_1)+{\mathbf {I}}\sigma _{1m}(k_1,l_1) +{\mathbf {J}}\sigma _{2m}(k_1,l_1)+{\mathbf {K}}\sigma _{3m}(k_1,l_1).\) We are interested in computing

$$\begin{aligned}&\sum _{(k_1,l_1)\in T}\left[ P_{0}+{\mathbf {I}}P_{1}+{\mathbf {J}}P_{2}-{\mathbf {K}}P_{3}\right] \\&\;\qquad \times \left[ \sigma _{0m}(k_1,l_1)+{\mathbf {I}}\sigma _{1m}(k_1,l_1)+{\mathbf {J}}\sigma _{2m}(k_1,l_1)+{\mathbf {K}}\sigma _{3m}(k_1,l_1)\right] \\&\;\quad =\sum _{(k_1,l_1)\in T}\bigg [\left( P_{0}\sigma _{0m}(k_1,l_1)-P_{1}\sigma _{1m}(k_1,l_1)-P_{2}\sigma _{2m}(k_1,l_1)+P_{3}\sigma _{3m}(k_1,l_1)\right) \\&\;\qquad +\left( P_{0}\sigma _{1m}(k_1,l_1)+P_{1}\sigma _{0m}(k_1,l_1)+P_{2}\sigma _{3m}(k_1,l_1)+P_{3}\sigma _{2m}(k_1,l_1)\right) {\mathbf {I}} \\&\;\qquad +\left( P_{0}\sigma _{2m}(k_1,l_1)-P_{1}\sigma _{3m}(k_1,l_1)+P_{2}\sigma _{0m}(k_1,l_1)-P_{3}\sigma _{1m}(k_1,l_1)\right) {\mathbf {J}}\\&\;\qquad +\left( P_{0}\sigma _{3m}(k_1,l_1)+P_{1}\sigma _{2m}(k_1,l_1)-P_{2}\sigma _{1m}(k_1,l_1)-P_{3}\sigma _{0m}(k_1,l_1)\right) {\mathbf {K}}\bigg ]\\&\;\quad =\sum _{(k_1,l_1)\in T}\left( P_{0}\sigma _{0m}(k_1,l_1)-P_{1}\sigma _{1m}(k_1,l_1)-P_{2}\sigma _{2m}(k_1,l_1)+P_{3}\sigma _{3m}(k_1,l_1)\right) \\&\;\qquad +\sum _{(k_1,l_1)\in T}\left( P_{0}\sigma _{1m}(k_1,l_1)+P_{1}\sigma _{0m}(k_1,l_1)+P_{2}\sigma _{3m}(k_1,l_1)+P_{3}\sigma _{2m}(k_1,l_1)\right) {\mathbf {I}} \\&\;\qquad +\sum _{(k_1,l_1)\in T}\left( P_{0}\sigma _{2m}(k_1,l_1)-P_{1}\sigma _{3m}(k_1,l_1)+P_{2}\sigma _{0m}(k_1,l_1)-P_{3}\sigma _{1m}(k_1,l_1)\right) {\mathbf {J}}\\&\;\qquad +\sum _{(k_1,l_1)\in T}\left( P_{0}\sigma _{3m}(k_1,l_1)+P_{1}\sigma _{2m}(k_1,l_1)-P_{2}\sigma _{1m}(k_1,l_1)-P_{3}\sigma _{0m}(k_1,l_1)\right) {\mathbf {K}}\\&\;\quad = Q_0 + Q_1{\mathbf {I}} + Q_2{\mathbf {J}} + Q_3{\mathbf {K}}. \end{aligned}$$

Let us now consider the expectation value applied to this sum. As in the proof of Lemma 4.5 we have to take into account that some of the indices \(i_r^{(p)}\) might coincide. This means that we have to use set partitions again. Let \((i_r^{(p)})_{r=1,\ldots ,m}^{p=1,\ldots ,2K}\subset \{1,\ldots ,N\}^{2Km}\) be a vector of indices and let \({\mathcal {A}}=\left( A_1,A_2,\ldots ,A_t\right) ,\;A_i\subset \{1,\ldots ,m\}\times \{1,\ldots ,2K\}\) be a corresponding partition such that \((r,p)\) and \((r',p')\) are contained in the same block if and only if \(i_r^{(p)}=i_{r'}^{(p')}\). For \(A\in {\mathcal {A}}\) we may unambiguously write \(i_A\) instead of \(i_r^{(p)}\) if \((r,p)\in A\). Once again, if a block \(A\in {\mathcal {A}}\) contains only one element then the last expression vanishes due to the condition \( (k_r^{(p)},l_r^{(p)}) \ne (k_{r-1}^{(p)},l_{r-1}^{(p)}) \). Thus, we only need to consider partitions \({\mathcal {A}}\in P(2Km,t)\) with \(t>1\). Now we are able to rewrite the inequality as

$$\begin{aligned}&{\mathbb {E}}_X\left[ \left| \left( \left( HR_T\right) ^m\mathrm {sgn}(c_{k,l})\right) _{k,l}\right| ^{2K}\right] \\&\quad \le \sum _{t=1}^{\min \{Km,N\}}\frac{N!}{(N-t)!}\sum _{{\mathcal {A}}\in P(2Km,t)} \sum _{\begin{array}{c} (k_1^{(1)},l_1^{(1)}),\ldots ,(k_{m}^{(1)},l_{m}^{(1)})\in T\\ \vdots \\ (k_1^{(K)},l_1^{(K)}),\ldots ,(k_{m}^{(K)},l_{m}^{(K)})\in T\\ (k_j^{(p)},l_j^{(p)})\ne (k_{j+1}^{(p)},l_{j+1}^{(p)}) \end{array}} 2^{3-m}\\&\qquad \times \sum _{(r_1,\ldots ,r_K)\subset V(\{1,\ldots ,16\},K)} \sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1^{(1)}\ldots a_m^{(1)}=\pm 1\\ \vdots \\ a_1^{(K)}\ldots a_m^{(K)}=\pm 1 \end{array}} \sum _{\begin{array}{c} \alpha _2^{(1)},\ldots ,\alpha _m^{(1)}\in \{-1,1\}\\ \vdots \\ \alpha _2^{(K)},\ldots ,\alpha _m^{(K)}\in \{-1,1\} \end{array}}\\&\qquad \times \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{(r,p)\in A}\alpha _r^{(p)}\theta _r^{(p)}\right) \delta \left( \sum _{(s,p)\in A}\beta _s^{(p)}\phi _s^{(p)}\right) , \end{aligned}$$

where \(\theta _r^{(p)} = (k_{r+1}^{(p)}-k_r^{(p)})\) and \(\phi _s^{(p)} = {\left\{ \begin{array}{ll}(l_{s+1}^{(p)}-l_{s}^{(p)}), &{}\quad a_s=+1\\ -(l_{s+1}^{(p)}+l_{s}^{(p)}), &{}\quad a_s=-1,\end{array}\right. }\) with \(\alpha _r^{(p)}\in \{-1,1\}\) and \(\beta _s^{(p)}=(-1)^{\pi (s)}\), where \(\pi (s)=\sum _{j=1}^{s-1}\frac{|a_j|-a_j}{2}\). \(\square \)

To simplify our work in the next section we abbreviate

$$\begin{aligned} B_{{\mathbb {H}}}({\mathcal {A}},T)= & {} \sum _{\begin{array}{c} (k_1^{(1)},l_1^{(1)}),\ldots ,(k_{m}^{(1)},l_{m}^{(1)})\in T\\ \vdots \\ (k_1^{(K)},l_1^{(K)}),\ldots ,(k_{m}^{(K)},l_{m}^{(K)})\in T\\ (k_j^{(p)},l_j^{(p)})\ne (k_{j+1}^{(p)},l_{j+1}^{(p)}) \end{array}} 2^{3-m}\nonumber \\&\times \, \sum _{(r_1,\ldots ,r_K)\subset V(\{1,\ldots ,16\},K)} \sum _{\begin{array}{c} {\underline{a}}\in \{-1,1\}^m\\ a_1^{(1)}\ldots a_m^{(1)}=\pm 1\\ \vdots \\ a_1^{(K)}\ldots a_m^{(K)}=\pm 1 \end{array}} \sum _{\begin{array}{c} \alpha _2^{(1)},\ldots ,\alpha _m^{(1)}\in \{-1,1\}\\ \vdots \\ \alpha _2^{(K)},\ldots ,\alpha _m^{(K)}\in \{-1,1\} \end{array}}\nonumber \\&\times \, \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{(r,p)\in A}\alpha _r^{(p)}\theta _r^{(p)}\right) \delta \left( \sum _{(s,p)\in A}\beta _s^{(p)}\phi _s^{(p)}\right) . \end{aligned}$$
(4.28)

4.3 Proof of Theorem 3.1

Using our lemmas from the previous section, in particular Lemma 4.5, we will now give the proof of Theorem 3.1. The value of the quantity \(C_{{\mathbb {H}}}({\mathcal {A}},T)\) as defined in (4.25) depends on T and \({\mathcal {A}}\in P(2n,t)\). Here, the indices \(\left( (k_1,l_1),\ldots ,(k_{2n},l_{2n})\right) \in T^{2n}\) are conditioned by the \(|{\mathcal {A}}|=t\) linear constraints \(\sum _{r\in A}\left( k_{r+1}-k_r\right) =0\) and \(\sum _{s\in A} \left( l_{s+1}\pm l_s\right) =0\) for all \(A\in {\mathcal {A}}\). These constraints are independent except for \(\sum _{r=1}^{2n}\left( k_{r+1}-k_r\right) =0\) and \(\sum _{s=1}^{2n}\left( l_{s+1}\pm l_s\right) =0\). Thus, from (4.25) we can estimate

$$\begin{aligned} C_{{\mathbb {H}}}({\mathcal {A}},T)\le \left| T\right| ^{2n-t+1}\le M^{2n-t+1}. \end{aligned}$$
(4.29)

By Lemma 4.5 we obtain (since in Theorem 3.1 the set T is not random, we have \({\mathbb {E}}={\mathbb {E}}_X\))

$$\begin{aligned} {\mathbb {E}}\left[ \left||H_{0}^n\right||_F^2\right] \le \sum _{t=1}^{\mathrm {min}\{n,N\}}\frac{N!}{({N}-t)!}\sum _{{\mathcal {A}}\in P(2n,t)}\left| T\right| ^{2n-t+1}\le M^{2n+1}\sum _{t=1}^n\left( \frac{{N}}{M}\right) ^t S_2(2n,t), \end{aligned}$$

where \(S_2(2n,t)=|P(2n,t)|\) are the associated Stirling numbers of the second kind. Let us put \(\theta =\frac{{N}}{M}.\) By Markov's inequality it follows that

$$\begin{aligned} {\mathbb {P}}\left( \left||\left( {N}^{-1}H_{0}\right) ^n\right||_F\ge \kappa \right)= & {} {\mathbb {P}}\left( \left||H_{0}^n\right||_F^2\ge {N}^{2n}\kappa ^2\right) \le {N}^{-2n}\kappa ^{-2}\,{\mathbb {E}}\left[ \left||H_0^n\right||_F^2\right] \\\le & {} \kappa ^{-2}\,M\,\theta ^{-2n}F_{2n}(\theta )=\kappa ^{-2}\,\,M\,G_{2n}(\theta ). \end{aligned}$$
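For clarity, the last two steps use the abbreviations \(F_{2n}\) and \(G_{2n}\) introduced earlier in the paper; assuming, as there, \(F_{2n}(\theta )=\sum _{t=1}^{n}\theta ^t S_2(2n,t)\) and \(G_{2n}(\theta )=\theta ^{-2n}F_{2n}(\theta )\), the substitution reads

$$\begin{aligned} {N}^{-2n}M^{2n+1}\sum _{t=1}^{n}\theta ^{t}S_2(2n,t)=M\left( \frac{M}{{N}}\right) ^{2n}\sum _{t=1}^{n}\theta ^{t}S_2(2n,t)=M\,\theta ^{-2n}F_{2n}(\theta )=M\,G_{2n}(\theta ). \end{aligned}$$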

Let us point out that \(\kappa <1.\) Hence, if \(\left||({N}^{-1}H_{0})^n\right||_F\le \kappa \), then \((I_T-(N^{-1}H_0)^n)\) and, therefore, also \([{\mathcal {F}}_{TX}^*{\mathcal {F}}_{TX}]={N}(I_T-{N}^{-1}H_{0})\) are invertible. In particular, \({\mathcal {F}}_{TX}\) is injective.

Let us now take a look at \({\mathbb {P}}(E_{k,l})\). By Lemma 4.6 we need to estimate \(B_{{\mathbb {H}}}({\mathcal {A}},T)\), which represents the number of elements \((k_{j}^{(p)},l_{j}^{(p)})\in T^{2Km}\) that satisfy \(\sum _{(r,p)\in A}(k_r^{(p)}-k_{r-1}^{(p)})=0\) and \(\sum _{(s,p)\in A}(l_{s+1}^{(p)}\pm l_s^{(p)})=0\) for all \(A\in {\mathcal {A}}\) with \({\mathcal {A}}\in P(2Km,t)\). Since we have t independent linear constraints, \(B_{{\mathbb {H}}}({\mathcal {A}},T)\) is bounded from above by \(|T|^{2Km-t}\le M^{2Km-t}\). Thus, by putting again \(\theta =\frac{{N}}{M}\), we get

$$\begin{aligned} {\mathbb {E}}_X\left[ \left| \left( \left( HR_T\right) ^m\mathrm {sgn}\,c\right) _{k,l}\right| ^{2K}\right] \le \sum _{t=1}^{Km}{N}^{t}S_2(2Km,t)M^{2Km-t}=\,M^{2Km}F_{2Km}(\theta ), \end{aligned}$$

and, therefore, we get

$$\begin{aligned} {\mathbb {P}}(E_{k,l})\le \beta ^{-2n}\sum _{m=1}^n \theta ^{-2mK_m}F_{2mK_m}(\theta )=\beta ^{-2n}\sum _{m=1}^nG_{2mK_m}(\theta ). \end{aligned}$$

Let us denote by \({\mathbb {P}}(\mathrm {failure})\) the probability that exact reconstruction of f by means of \(\ell _1\)-minimization fails.

By Lemma 4.1 and the above estimates it follows that

$$\begin{aligned} {\mathbb {P}}(\mathrm {failure})\le & {} {\mathbb {P}}\left( \{{\mathcal {F}}_{TX}\;\mathrm {is\;not\;injective}\}\cup \{\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}|\ge 1\}\right) \\\le & {} \sum _{(k,l)\in {\mathbb {Z}}_\rho ^2}{\mathbb {P}}(E_{k,l})+{\mathbb {P}}\left( \left||\left( {N}^{-1}H_0\right) ^n\right||\ge \kappa \right) \\\le & {} \,(2\rho +1)^2\beta ^{-2n}\sum _{m=1}^{n}G_{2mK_m}(\theta )+\kappa ^{-2}\,\,M\,G_{2n}(\theta ) \end{aligned}$$

under the conditions

$$\begin{aligned} a_1= & {} a=\sum _{m=1}^n\beta ^{n/K_m}<1,\quad a_2+a_1=1,\quad \mathrm {i.e.,}\;a_2=1-a,\\ \frac{\kappa }{1-\kappa }\le & {} \frac{a_2}{1+a_1}M^{-3/2}=\frac{1-a}{1+a}M^{-3/2}. \end{aligned}$$

\(\square \)

Let us remark that, given n, a possible choice for \(K_m\) is \(K_m\approx n/m,~m=1,\ldots ,n\), obtained by rounding n/m to the nearest integer. Then \(\beta \) is chosen quite close to the maximal value such that \(a=\sum _{m=1}^n\beta ^{n/K_m}<1.\) By our choice of \(K_m\) we approximately have

$$\begin{aligned} \sum _{m=1}^{n}\beta ^{n/K_m} \approx \sum _{m=1}^{n}\beta ^{m}\approx \frac{\beta }{1-\beta }. \end{aligned}$$
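Requiring \(a<1\) under this approximation therefore amounts to

$$\begin{aligned} \frac{\beta }{1-\beta }<1 \quad \Longleftrightarrow \quad \beta <\frac{1}{2},\qquad 0<\beta <1. \end{aligned}$$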

Thus, the optimal \(\beta \) will always be close to \(1/2\).

4.4 Proof of Theorem 3.2

The proof of Theorem 3.2 is practically the same as the proof of Theorem 2.1 in [28], because it depends only on set partitions, without the algebraic structure of the quaternions entering the picture. For more details we refer to [19].

4.5 Proof of Theorem 3.3

To prove our theorem we will need some modifications of Lemmas 4.5 and 4.6, in particular for the quantities \(C_{{\mathbb {H}}}({\mathcal {A}},T)\) and \(B_{{\mathbb {H}}}({\mathcal {A}},T)\) defined in (4.25) and (4.28), respectively. Hereby, we have to consider T as a random set modeled by (3.7), similarly to [28]. We start with (4.25).

Lemma 4.7

For \({\mathcal {A}}\in P(2n,t)\) we have

$$\begin{aligned}&{\mathbb {E}}\left[ C_{{\mathbb {H}}}({\mathcal {A}},T)\right] \\&\quad \le \sum _{s=2}^{n}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}-1}(2\rho +1)^{-2R} \#\{{\mathcal {B}}\in U(2n,s),\;\mathrm {rank}\;M({\mathcal {A}},{\mathcal {B}})=R \}, \end{aligned}$$

where \(M = M({\mathcal {A}},{\mathcal {B}})\) denotes the \(t\times s\) matrix whose entries are given by

$$\begin{aligned} M_{i,j}:=|A_i\cap B_j|-|(A_i+1)\cap B_j|, \quad 1\le i\le t,\quad 1\le j\le s. \end{aligned}$$
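To illustrate this definition (reading \(A_i+1\) as the cyclically shifted block \(\{a+1:a\in A_i\}\), with \(2n+1\) identified with 1 as below), take \(n=2\), \({\mathcal {A}}=(\{1,3\},\{2,4\})\) and \({\mathcal {B}}=(\{1,3\},\{2\},\{4\})\); then

$$\begin{aligned} M({\mathcal {A}},{\mathcal {B}})=\begin{pmatrix} 2 &{}\quad -1 &{}\quad -1\\ -2 &{}\quad 1 &{}\quad 1 \end{pmatrix}, \end{aligned}$$

which has rank \(R=1\).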

Proof

Using the linearity of the expectation value we get

$$\begin{aligned} {\mathbb {E}}\left[ C_{{\mathbb {H}}}({\mathcal {A}},T)\right]= & {} {\mathbb {E}}\left[ \sum _{ \begin{array}{c} (k_1,l_1),\ldots ,(k_{2n},l_{2n})\in {\mathbb {Z}}^2_\rho \\ (k_{j},l_{j})\ne (k_{j+1},l_{j+1}) \end{array} }\prod _{j=1}^{2n}\mathbbm {1}_{\{(k_{j},l_{j})\in T\}}\right. \\&\left. \times \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{r\in A}(k_{r+1}-k_r)\right) \delta \left( \sum _{r\in A}(l_{r+1}\pm l_{r})\right) \right] \\= & {} \sum _{ \begin{array}{c} (k_1,l_1),\ldots ,(k_{2n},l_{2n})\in {\mathbb {Z}}_\rho ^2\\ (k_{j},l_{j})\ne (k_{j+1},l_{j+1}) \end{array} }{\mathbb {E}}\left[ \prod _{j=1}^{2n}\mathbbm {1}_{\{(k_{j},l_{j})\in T\}}\right] \\&\times \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{r\in A}(k_{r+1}-k_r)\right) \delta \left( \sum _{r\in A}(l_{r+1}\pm l_{r})\right) . \end{aligned}$$

Hereby, \(\mathbbm {1}_{\{(k,l)\in T\}}\) denotes the characteristic function of the set T, i.e., it is 1 if and only if \((k,l)\in T\). The expression \({\mathbb {E}}\left[ \prod _{j=1}^{2n}\mathbbm {1}_{\{(k_j,l_j)\in T\}}\right] \) depends only on the number of different \((k_j,l_j)\) belonging to T. In this way, we can again work with partitions. If \( ((k_1, l_1),\ldots ,(k_{2n},l_{2n}))\in ({\mathbb {Z}}_\rho ^2)^{2n}\) is a vector satisfying \((k_{j},l_{j})\ne (k_{j+1},l_{j+1})\) then we can associate to it a partition \({\mathcal {B}}=(B_1,\ldots ,B_s)\) of \(\{1,\ldots ,2n\}\) such that j and \(j'\) are in the same set \(B_i\) if and only if \((k_j,l_j)= (k_{j'}, l_{j'})\). Obviously, j and \(j+1\) have to be contained in different blocks for all j due to the condition \((k_{j},l_{j})\ne (k_{j+1},l_{j+1})\) (once again we use the convention that \(2n + 1\) is identified with 1). In other words, \({\mathcal {B}}\) has no adjacencies, i.e., \({\mathcal {B}}\in U(2n,s)\). Now if \({\mathcal {B}}\) has \(|{\mathcal {B}}|=s\) blocks we use the probability model for T given by \({\mathbb {P}}((k,l)\in T)=\tau \), \(0<\tau <1\), (see (3.7)) and stochastic independence to get

$$\begin{aligned} {\mathbb {E}}\left[ \prod _{i=1}^{2n}\mathbbm {1}_{\{(k_i,l_i)\in T\}}\right] = {\mathbb {E}}\left[ \prod _{j=1}^{s}\mathbbm {1}_{\{(k,l)_{B_j}\in T\}}\right] = \prod _{j=1}^{s}{\mathbb {E}}\left[ \mathbbm {1}_{\{(k,l)_{B_j}\in T\}}\right] = \tau ^s,\qquad \end{aligned}$$
(4.30)

where (unambiguously) \((k,l)_{B_j}=(k,l)_{i}\) if \(i\in B_j\). By introducing the notation \(\sigma _{{\mathcal {B}}}(r)=j\) if and only if \(r\in B_j\in {\mathcal {B}}\) we can write

$$\begin{aligned}&{\mathbb {E}}\left[ C_{{\mathbb {H}}}({\mathcal {A}},T)\right] \\&\quad =\sum _{s=2}^{n}\tau ^s\sum _{{\mathcal {B}}\in U(2n,s)}\sum _{\begin{array}{c} (k_1,l_1),\ldots ,(k_{s},l_{s})\in {\mathbb {Z}}^2_\rho \\ (k_i, l_i) \,\mathrm {p.w.\,different} \end{array}} \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{r\in A}(k_{\sigma _{{\mathcal {B}}}(r+1)}-k_{\sigma _{{\mathcal {B}}}(r)})\right) \delta \left( \sum _{r\in A}(l_{\sigma _{{\mathcal {B}}}(r+1)}\pm l_{\sigma _{{\mathcal {B}}}(r)})\right) . \end{aligned}$$

Clearly, we have \(\prod _{A\in {\mathcal {A}}}\delta \left( \sum _{r\in A}(k_{\sigma _{{\mathcal {B}}}(r+1)}-k_{\sigma _{{\mathcal {B}}}(r)})\right) =1\) and \(\prod _{A\in {\mathcal {A}}}\delta \left( \sum _{r\in A}( l_{\sigma _{{\mathcal {B}}}(r+1)}\pm l_{\sigma _{{\mathcal {B}}}(r)})\right) =1\) if and only if, respectively,

$$\begin{aligned} \sum _{r\in A}(k_{\sigma _{{\mathcal {B}}}(r+1)}-k_{\sigma _{{\mathcal {B}}}(r)}) = 0, \quad \mathrm {for\;all}\; A\in {\mathcal {A}} \end{aligned}$$
(4.31)

and

$$\begin{aligned} \sum _{r\in A}(l_{\sigma _{{\mathcal {B}}}(r+1)}\pm l_{\sigma _{{\mathcal {B}}}(r)}) = 0, \quad \mathrm {for\;all}\; A\in {\mathcal {A}}. \end{aligned}$$
(4.32)

Otherwise, the terms are 0. For \(j\in \{1,\ldots ,s\}\) the term \((k_j, l_j)\) appears \(|A_i\cap B_j|\) times in the terms \((k_{\sigma _{{\mathcal {B}}}(r)},l_{\sigma _{{\mathcal {B}}}(r)})\) when r runs through \(A_i\in {\mathcal {A}}\). Let \(M = M({\mathcal {A}},{\mathcal {B}})\) denote the \(t\times s\) matrix whose entries are given by

$$\begin{aligned} M_{i,j}:=|A_i\cap B_j|-|(A_i+1)\cap B_j|, \quad 1\le i\le t,1\le j\le s. \end{aligned}$$

Then (4.31) and (4.32) are satisfied if and only if \(((k_1, l_1),\ldots ,(k_{s}, l_{s}))\in ({\mathbb {Z}}^2_\rho )^s\) belongs to the kernel of \(M({\mathcal {A}},{\mathcal {B}})\). Thus, if the rank of \(M({\mathcal {A}},{\mathcal {B}})\) equals R then the number of vectors \((k_1,l_1),\ldots ,(k_{s},l_{s})\in ({\mathbb {Z}}^2_\rho )^s\) for which (4.31) and (4.32) are satisfied can be bounded by \(((2\rho +1)^2)^{s-R}\). Here we even neglected the condition that \((k_1,l_1),\ldots ,(k_{s},l_{s})\) should be pairwise different. So finally we obtain

$$\begin{aligned} {\mathbb {E}}\left[ C_{{\mathbb {H}}}({\mathcal {A}},T)\right]\le & {} \sum _{s=2}^{n}\tau ^s \sum _{R=0}^{\min \{s,t\}-1} (2\rho +1)^{2(s-R)} \#\{{\mathcal {B}}\in U(2n,s),\, \mathrm {rank}\;M({\mathcal {A}},{\mathcal {B}})=R \}\\= & {} \sum _{s=2}^{n}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}-1} (2\rho +1)^{-2R} \#\{{\mathcal {B}}\in U(2n,s), \mathrm {rank}\;M({\mathcal {A}},{\mathcal {B}})=R \}, \end{aligned}$$

where we substituted \({\mathbb {E}}|T|=\tau D\). \(\square \)

Since \({\mathbb {E}} = {\mathbb {E}}_X{\mathbb {E}}_T\) by Fubini's theorem and the stochastic independence of T and X, the previous result together with Lemma 4.5 yields

$$\begin{aligned} {\mathbb {E}}\left[ \left||H_0^n\right||^2_F\right]\le & {} \sum _{t=1}^{\min \{n,N\}}\frac{N!}{(N-t)!} \sum _{{\mathcal {A}}\in P(2n,t)}\sum _{s=2}^{n}({\mathbb {E}}|T|)^s\\&\times \,\sum _{R=0}^{\min \{s,t\}-1} (2\rho +1)^{-2R} \#\{{\mathcal {B}}\in U(2n,s),\mathrm {rank}\;M({\mathcal {A}},{\mathcal {B}})=R \} \\\le & {} \sum _{t=1}^{\min \{n,N\}}\frac{N!}{(N-t)!}\sum _{s=2}^{n}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}-1} (2\rho +1)^{-2R} Q(2n,t,s,R)\\= & {} N^{2n}W(n,N,{\mathbb {E}}|T|,D). \end{aligned}$$

Hereby, \(Q(2n, t, s,R):=\#\left\{ ({\mathcal {A}},{\mathcal {B}}):{\mathcal {A}}\in P(2n,t),{\mathcal {B}}\in U(2n,s), \mathrm {rank}\;M({\mathcal {A}},{\mathcal {B}})=R\right\} \) and

$$\begin{aligned}&W(n,N,{\mathbb {E}}|T|,\rho )\\&\quad :=N^{-2n}\sum _{t=1}^{\min \{n,N\}}\frac{N!}{(N-t)!}\sum _{s=2}^{n}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}-1} (2\rho +1)^{-2R} Q(2n,t,s,R). \end{aligned}$$

Applying Markov’s inequality we obtain

$$\begin{aligned} {\mathbb {P}}\left( \left||(N^{-1}H_0)^n\right||_F\ge \kappa \right) \le N^{-2n}\kappa ^{-2}{\mathbb {E}}\left[ \left||H_0^n\right||_F^2\right] \le \kappa ^{-2}W(n,N,{\mathbb {E}}|T|,\rho ). \end{aligned}$$

Obviously, as in the proof of Theorem 3.1, \({\mathcal {F}}_{TX}\) is injective if \(\left||(N^{-1}H_0)^n\right||_F< 1\).

As in the case where T is not random we have to estimate \({\mathbb {P}}(E_{k,l})\). This has to be done as in Lemma 4.6, i.e., we need to estimate the expectation value of \(B_{{\mathbb {H}}}({\mathcal {A}},T)\) defined in (4.28).

Lemma 4.8

Suppose that \({\mathcal {A}}\in P(2Km,t)\). Then we have

$$\begin{aligned}&{\mathbb {E}}\left[ B_{{\mathbb {H}}}({\mathcal {A}},T)\right] \\&\quad \le \sum _{s=1}^{2Km}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}-1} (2\rho +1)^{-2R} \#\{{\mathcal {B}}\in U^*(2K,m,s),\;\mathrm {rank}\; L({\mathcal {A}},{\mathcal {B}})=R \}, \end{aligned}$$

where, similarly to [28], \(U^*(2K,m,s)\) denotes the set of all partitions of \([2K]\times [m]\) into s blocks such that \((p,j)\) and \((p,j+1)\) are not contained in the same block, and \(L = L({\mathcal {A}},{\mathcal {B}})\) denotes the \(t\times s\) matrix whose entries are given by

$$\begin{aligned} L_{i,j}:=\sum _{(p,u)\in A_i\cap B_j}(-1)^p-\sum _{(p,u)\in (A_i-1)\cap B_j} (-1)^p. \end{aligned}$$

Proof

In the same way as in the proof of Lemma 4.7 we can consider

$$\begin{aligned} {\mathbb {E}}\left[ B_{{\mathbb {H}}}({\mathcal {A}},T)\right]= & {} \sum _{ \begin{array}{c} (k_1^{(1)},l_1^{(1)}),\ldots ,(k_{m}^{(2K)},l_{m}^{(2K)})\in {\mathbb {Z}}^2_\rho \\ (k_{j}^{(p)},l_{j}^{(p)})\ne (k_{j+1}^{(p)},l_{j+1}^{(p)}) \end{array}}{\mathbb {E}}\left[ \prod _{(p,j)\in [2K]\times [m]}\mathbbm {1}_{\{(k_j^{(p)},l_j^{(p)})\in T\}}\right] \\&\times \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{(r,p)\in A}\alpha (k_{r+1}^{(p)}-k_{r}^{(p)})\right) \delta \left( \sum _{(r,p)\in A}\alpha (l_{r+1}^{(p)}\pm l_{r}^{(p)})\right) . \end{aligned}$$

Hereby, \({\mathbb {E}}[\prod _{(p,j)\in [2K]\times [m]}\mathbbm {1}_{\{(k_j^{(p)},l_j^{(p)})\in T\}}]\) depends only on the number of different \((k_r^{(p)},l_{r}^{(p)})\)’s. Therefore, if \(((k_1^{(1)},l_1^{(1)}),\ldots ,(k_m^{(2K)},l_m^{(2K)}))\in ({\mathbb {Z}}^2_\rho )^{(2Km)}\) satisfies

$$\begin{aligned} \big (k_{j}^{(p)},l_{j}^{(p)}\big )\ne \big (k_{j+1}^{(p)},l_{j+1}^{(p)}\big )\quad \mathrm {for\;all}\;j\in [m],\quad p\in [2K] \end{aligned}$$
(4.33)

we can identify it with a partition \({\mathcal {B}}=(B_1,\ldots ,B_s)\) of \([2K]\times [m]\) such that \((p,j)\) and \((p', j')\) belong to the same block if and only if \((k_{j}^{(p)},l_{j}^{(p)}) =(k_{j'}^{(p')},l_{j'}^{(p')})\). This implies that \({\mathcal {B}}\) belongs to \(U^*(2K,m, s)\) since \((p,j)\) and \((p, j+1)\) cannot be in the same block due to the condition (4.33). Now \({\mathcal {B}}\) has s blocks, which means that we have s different values of \((k_j^{(p)},l_j^{(p)})\) and

$$\begin{aligned} {\mathbb {E}}\left[ \prod _{(p,j)\in [2K]\times [m]}\mathbbm {1}_{\{(k_j^{(p)},l_j^{(p)})\in T\}}\right] =\tau ^s \end{aligned}$$

as in (4.30). Using again the notation \(\sigma _{{\mathcal {B}}}(p,j)=i\) if \((p,j)\in B_i\in {\mathcal {B}}\) and \(\sigma (p,0) = 0\) we get

$$\begin{aligned} {\mathbb {E}}\left[ B_{{\mathbb {H}}}({\mathcal {A}},T)\right]= & {} \sum _{s=1}^{2Km}\tau ^s\sum _{{\mathcal {B}}\in U^*(2K,m,s)}\sum _{\begin{array}{c} (k_1,l_1),\ldots ,(k_{s},l_{s})\in {\mathbb {Z}}_\rho ^2\\ (k_i, l_i) \,\mathrm {p.w.\,different} \end{array}}\\&\times \, \prod _{A\in {\mathcal {A}}}\delta \left( \sum _{(p,j)\in A}\alpha (k_{\sigma _{{\mathcal {B}}}(p,j+1)}-k_{\sigma _{{\mathcal {B}}}(p,j)})\right) \delta \left( \sum _{(p,j)\in A}\alpha (l_{\sigma _{{\mathcal {B}}}(p,j+1)}\pm l_{\sigma _{{\mathcal {B}}}(p,j)})\right) . \end{aligned}$$

The term \(\prod _{A\in {\mathcal {A}}}\delta \left( \sum _{(p,j)\in A}\alpha (k_{\sigma _{{\mathcal {B}}}(p,j+1)}-k_{\sigma _{{\mathcal {B}}}(p,j)})\right) \delta \left( \sum _{(p,j)\in A}\alpha (l_{\sigma _{{\mathcal {B}}}(p,j+1)}\right. \left. \pm l_{\sigma _{{\mathcal {B}}}(p,j)})\right) \) is non-zero if and only if

$$\begin{aligned} \sum _{(p,j)\in A}\alpha (k_{\sigma _{{\mathcal {B}}}(p,j+1)}-k_{\sigma _{{\mathcal {B}}}(p,j)}) = 0\quad \mathrm {for \;all} \; A\in {\mathcal {A}} \end{aligned}$$

and

$$\begin{aligned} \sum _{(p,j)\in A}\alpha (l_{\sigma _{{\mathcal {B}}}(p,j+1)}\pm l_{\sigma _{{\mathcal {B}}}(p,j)}) = 0\quad \mathrm {for \;all} \; A\in {\mathcal {A}}. \end{aligned}$$

Similarly to [28], this leads to the estimate

$$\begin{aligned} {\mathbb {E}}\left[ B_{{\mathbb {H}}}({\mathcal {A}},T)\right]\le & {} \sum _{s=1}^{2Km}\tau ^s \sum _{R=0}^{\min \{s,t\}} D^{s-R} \#\{{\mathcal {B}}\in U^*(2K,m,s),\; \mathrm {rank}\;L({\mathcal {A}},{\mathcal {B}})=R \}. \end{aligned}$$

Since \({\mathbb {E}}|T|=\tau D\) this proves the lemma. \(\square \)

Using our Lemma 4.6 the previous result gives us

$$\begin{aligned}&{\mathbb {E}}\left[ |((HR_T)^m\sigma )_{k,l}|^{2K}\right] \\&\quad \le \sum _{t=1}^{\min \{Km,N\}}\frac{N!}{(N-t)!}\sum _{s=1}^{2Km}({\mathbb {E}}|T|)^s \sum _{R=0}^{\min \{s,t\}} Q^*(2K,m,t,s,R)D^{-R}\\&\quad = N^{2Km}Z(K,m,N,{\mathbb {E}}|T|,D) \end{aligned}$$

where \(Q^*(2K,m,t,s,R)\) are numbers defined by

$$\begin{aligned}&Q^*(2K,m,t,s,R)\\&\quad :=\#\big \{({\mathcal {A}},{\mathcal {B}}), {\mathcal {A}}\in P(2Km,t),{\mathcal {B}}\in U^*(2K,m,s), \mathrm {rank}\;L({\mathcal {A}},{\mathcal {B}})=R \big \}. \end{aligned}$$

From (4.27) we get

$$\begin{aligned} {\mathbb {P}}(E_{k,l})\le \beta ^{-2n}\sum _{m=1}^{n}Z(K_m,m,N,{\mathbb {E}}|T|,D). \end{aligned}$$

Finally, let \({\mathbb {P}}(\mathrm {failure})\) again denote the probability that exact reconstruction of f fails. By Lemma 4.1, expressions (4.16) and (4.17), and the fact that \(\{{\mathcal {F}}_{TX}\;\mathrm {is\; not\; injective}\}\subset \{\left||(N^{-1}H_0)^n\right||_F\ge \kappa \}\), we obtain

$$\begin{aligned} {\mathbb {P}}(\mathrm {failure})\le & {} {\mathbb {P}} \left( \{{\mathcal {F}}_{TX}\;\mathrm {is\;not\;injective}\}\cup \{\mathrm {sup}_{(k,l)\in T^c}|P_{k,l}|\ge 1\}\right) \\\le & {} \,\sum _{(k,l)\in {\mathbb {Z}}_\rho ^2}{\mathbb {P}}(E_{k,l})+{\mathbb {P}}\left( \left||\left( N^{-1}H_0\right) ^n\right||\ge \kappa \right) + {\mathbb {P}}(|T|\ge (\alpha +1){\mathbb {E}}|T|)\\\le & {} D\beta ^{-2n}\sum _{m=1}^{n}Z(K_m,m,N,{\mathbb {E}}|T|,D) + \kappa ^{-2}W(n,N,{\mathbb {E}}|T|,D) \\&\quad +\,\exp \left( -\frac{3\alpha ^2}{6+2\alpha }{\mathbb {E}}|T|\right) \end{aligned}$$

under the conditions (see also (4.15))

$$\begin{aligned} a_1= & {} a=\sum _{m=1}^n\beta ^{n/K_m}<1,\;\;\;a_2+a_1=1,\;\mathrm {i.e.,}\;a_2=1-a,\\ \frac{\kappa }{1-\kappa }\le & {} \frac{a_2}{1+a_1}((\alpha +1){\mathbb {E}}|T|)^{-3/2}=\frac{1-a}{1+a}((\alpha +1){\mathbb {E}}|T|)^{-3/2}. \end{aligned}$$

This proves Theorem 3.3.

5 Applications

Although there exists a large variety of applications of quaternionic signals, in this section we will restrict ourselves to the case of color-encoded images represented by quaternions, i.e., the R-, G-, and B-components of the image are represented by the \({\mathbf {I}}\)-, \({\mathbf {J}}\)-, and \({\mathbf {K}}\)-components of a given quaternion, respectively. In what follows we present some numerical experiments. Namely, we reconstruct a given signal by means of the \(\ell _1\)-minimization problem

$$\begin{aligned}&\min \left||x\right||_{1}=\sum _{s=1}^N|x_s|\nonumber \\&\mathrm {s.t.}\quad \Phi x = y, \end{aligned}$$
(5.1)

where \(\Phi \in {\mathbb {H}}^{M\times N}\) is our quaternionic sampling matrix and \(y\in {\mathbb {H}}^{M\times 1}\) is the vector consisting of randomly chosen pixels from the given image, under the assumption that the signal vector \(x\in {\mathbb {H}}^{N\times 1}\) is k-sparse, i.e., at most k entries of x are non-zero. The support of x is the set of indices corresponding to the non-zero entries, \(\mathrm {supp}\,x=\{s\in \{1,\ldots ,N\}:x_s\ne 0\}\). We aim to illustrate that an effective signal reconstruction is possible even when using only a small amount of the signal information.

For the implementation of the \(\ell _1\)-minimization algorithm we use the Matlab toolbox \(\ell \)1-Magic [9].
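As a purely illustrative alternative for readers without Matlab, the real-valued version of problem (5.1) (obtained by the rewriting described below) can also be posed as a linear program. The following Python sketch is our own minimal illustration, not the authors' code; Phi and y are generic placeholders for a real sampling matrix and measurement vector.

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||x||_1 subject to Phi @ x = y via linear programming.
    Standard splitting: x = u - v with u, v >= 0, minimize sum(u + v)."""
    N = Phi.shape[1]
    c = np.ones(2 * N)                # objective: 1^T u + 1^T v
    A_eq = np.hstack([Phi, -Phi])     # Phi u - Phi v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:N] - res.x[N:]

# toy check: recover a 3-sparse vector from 20 random real measurements
rng = np.random.default_rng(0)
x_true = np.zeros(50)
x_true[[3, 17, 42]] = [1.0, -2.0, 0.5]
Phi = rng.standard_normal((20, 50)) / np.sqrt(20)
x_rec = basis_pursuit(Phi, Phi @ x_true)
print(np.max(np.abs(x_rec - x_true)))   # small up to solver tolerance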

Since a quaternion can be written as

$$\begin{aligned} q=R(q) + {\mathbf {I}}I(q) + {\mathbf {J}}J(q) + {\mathbf {K}}K(q) \end{aligned}$$

where R(q) is the real part of the quaternion and I(q), J(q), and K(q) are its three imaginary components, we have for vectors and matrices of quaternions the decomposition

$$\begin{aligned} \mathbf{v}= & {} R(\mathbf{v}) + {\mathbf {I}}I(\mathbf{v}) + {\mathbf {J}}J(\mathbf{v}) + {\mathbf {K}}K(\mathbf{v})\\ \mathbf{M}= & {} R(\mathbf{M}) + {\mathbf {I}}I(\mathbf{M}) + {\mathbf {J}}J(\mathbf{M}) + {\mathbf {K}}K(\mathbf{M}). \end{aligned}$$

Since \(\ell 1\)-Magic works only with real-valued vectors we need to modify our quaternion-valued system. We rewrite the quaternionic multiplication \((\alpha + {\mathbf {I}}\beta + {\mathbf {J}}\gamma + {\mathbf {K}}\delta )(x + {\mathbf {I}}y + {\mathbf {J}}v + {\mathbf {K}}w) = a + {\mathbf {I}}b + {\mathbf {J}}c + {\mathbf {K}}d\) as a matrix-vector multiplication, i.e.,

$$\begin{aligned} \begin{pmatrix} \alpha &{}\quad \beta &{}\quad \gamma &{}\quad \delta \\ - \beta &{}\quad \alpha &{}\quad -\delta &{}\quad \gamma \\ - \gamma &{}\quad \delta &{}\quad \alpha &{}\quad -\beta \\ - \delta &{}\quad -\gamma &{}\quad \beta &{}\quad \alpha \end{pmatrix} \begin{pmatrix} x\\ y \\ v\\ w \\ \end{pmatrix}= \begin{pmatrix} a \\ b \\ c \\ d \\ \end{pmatrix}. \end{aligned}$$
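As a generic illustration of this idea (our own sketch, not part of the paper's implementation; it uses the Hamilton product and left multiplication, so the sign pattern of the resulting matrix depends on the chosen convention and need not coincide with the display above), the real \(4\times 4\) matrix of a quaternionic multiplication can be obtained by applying it to the basis elements:

import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as arrays [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])

def left_mult_matrix(q):
    """4x4 real matrix L(q) with L(q) @ p = q * p for every quaternion p."""
    return np.column_stack([qmul(q, e) for e in np.eye(4)])

q = np.array([1.0, 2.0, -0.5, 3.0])
p = np.array([0.3, -1.0, 4.0, 2.5])
print(np.allclose(left_mult_matrix(q) @ p, qmul(q, p)))   # True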
Fig. 1 Lena—original image

Fig. 2 Lena image—reconstructed from 40,000 pixels

Fig. 3 Galaxy—original image

Fig. 4 Galaxy image—reconstructed from 40,000 pixels

Fig. 5 Saturn—original image

This allows us to rewrite the quaternionic linear system (5.1) in the form \({{\tilde{y}}}={\mathcal {M}}{{\tilde{x}}}\) with

$$\begin{aligned} {\mathcal {M}}= & {} \begin{pmatrix} R(\Phi ) &{}\quad -I(\Phi ) &{}\quad J(\Phi ) &{}\quad -K(\Phi ) \\ -I(\Phi ) &{}\quad R(\Phi ) &{}\quad -K(\Phi ) &{}\quad J(\Phi ) \\ -J(\Phi ) &{}\quad K(\Phi ) &{}\quad R(\Phi ) &{}\quad -I(\Phi ) \\ -K(\Phi ) &{}\quad -J(\Phi ) &{}\quad I(\Phi ) &{}\quad R(\Phi ) \\ \end{pmatrix},\\ {{\tilde{x}}}= & {} \begin{pmatrix} R(x)\\ I(x) \\ J(x)\\ K(x) \end{pmatrix}, \quad \mathrm{and } \quad {{\tilde{y}}}= \begin{pmatrix} R(y)\\ I(y) \\ J(y)\\ K(y) \end{pmatrix}. \end{aligned}$$
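In Python (again only a sketch of ours mirroring the block structure displayed above; the authors work in Matlab), the real system can be assembled componentwise:

import numpy as np

def realify_system(Phi_R, Phi_I, Phi_J, Phi_K, y_R, y_I, y_J, y_K):
    """Assemble the real block matrix M and the stacked vector y~ from the
    four real components of the quaternionic sampling matrix Phi and of the
    measurement vector y, following the block pattern displayed above."""
    M_real = np.block([
        [ Phi_R, -Phi_I,  Phi_J, -Phi_K],
        [-Phi_I,  Phi_R, -Phi_K,  Phi_J],
        [-Phi_J,  Phi_K,  Phi_R, -Phi_I],
        [-Phi_K, -Phi_J,  Phi_I,  Phi_R],
    ])
    y_tilde = np.concatenate([y_R, y_I, y_J, y_K])
    return M_real, y_tilde

# after solving M_real @ x_tilde = y_tilde by l1-minimization, the quaternionic
# solution is recovered as
# x = x_tilde[:N] + I x_tilde[N:2N] + J x_tilde[2N:3N] + K x_tilde[3N:4N]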
Fig. 6 Saturn image—reconstructed from 40,000 pixels

Fig. 7 Chips—original image

Fig. 8 Reconstructed Chips image using information of 10,000 pixels

In our implementation we use the sampling matrix in its explicit form. This leads to large memory requirements since we deal with a matrix of dimension \(M\times N^2\). To be able to work with such large matrices, we use a strategy of dividing the image into an \(8 \times 8\) array of blocks (each block consisting of \(64\times 64\) pixels), followed by an individual reconstruction of each block. The sampling matrices for the blocks themselves are constructed by using the DCT (Discrete Cosine Transform) and the DST (Discrete Sine Transform). Since each block is reconstructed individually, we get the additional problem of reassembling the blocks. To this end we introduce reflexive boundary conditions on each block.
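Schematically, the block strategy looks as follows (a sketch of ours, not the authors' Matlab code; the DCT/DST sampling matrices, the random pixel selection, and the boundary handling are hidden inside the placeholder solve_block):

import numpy as np

def reconstruct_blockwise(image, solve_block, tile=64):
    """Split the image into an (H/tile) x (W/tile) array of tiles and
    reconstruct each tile independently; solve_block is any per-tile
    recovery routine, e.g. the l1-minimization (5.1) applied to randomly
    chosen pixels of that tile."""
    H, W = image.shape[:2]
    out = np.zeros_like(image, dtype=float)
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            out[i:i + tile, j:j + tile] = solve_block(image[i:i + tile, j:j + tile])
    return out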

We performed the calculations on a computer with Intel(R) Core(TM) i7-4790U CPU 3.60 GHz, RAM 16 GB, Windows 8.1, OS 64-bit(win64) and running Matlab R2012b.

For our examples we choose the following images: Lena (Fig. 1), Galaxy (Fig. 3), and the Saturn rings (Fig. 5), each with \(N^2=262{,}144\) pixels (\(512\times 512\)). The Lena image was chosen as one of the standard examples in image processing. Galaxy and Saturn have large black regions, which can affect the reconstruction if the randomly chosen pixels are predominantly black. Furthermore, Saturn is also easily visible as a gray-scale image. For the reconstruction we use 40,000 randomly chosen pixels (\(M=625\) samples in each block), which corresponds to \({\approx }15.26\%\) of the total information. The reconstructed images can be seen in Figs. 2, 4, and 6, respectively.

Another particularly interesting example is the chips image (Fig. 7). This image combines textures and sharp edges and represents a kind of worst-case scenario. In this example, the image has \(N^2=65{,}536\) pixels (\(256\times 256\)). For the reconstruction we use 10,000 pixels (\(M=625\) samples in each block). Since the overall size of the image is smaller, we use a decomposition into a \(4\times 4\) array of blocks. Under the same conditions as above (\({\approx }15.26\%\) of the original pixels were taken as random samples) we obtain our reconstruction (see Fig. 8).

Fig. 9 Comparison between number of random samples and reconstruction error

While the reconstructed image is still blurred (as expected), it contains all relevant details.

To show the error reduction we provide Fig. 9, which shows the relation between the number of used pixels and the error, measured as the \(l_2\)-norm of the matrix containing the pixel-wise difference between the original chips image (Fig. 7) and the reconstructed image. The figure shows an exponential decay of the error. It also shows why we opted for 10,000 pixels in the reconstruction: this is a good compromise between the number of pixels and the quality of the reconstructed image.
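The error measure behind Fig. 9 can be computed as follows (our reading of the text; a sketch, not the authors' script):

import numpy as np

def reconstruction_error(original, reconstructed):
    """l2-norm of the pixel-wise difference between the original and the
    reconstructed image, as used for the error curve in Fig. 9."""
    diff = original.astype(float) - reconstructed.astype(float)
    return np.linalg.norm(diff)

# sweeping the number of randomly chosen pixels and recording this error
# for the chips image yields the decay shown in Fig. 9.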