
3.1 Introduction

Weighted least squares is a generalization of the least-squares (LS) problem, where prior information about parameters and data is incorporated by multiplying both sides of the original LS problem by a typically diagonal weights matrix. Applications of weighted least-squares in Signal Processing include signal restoration [1, 2], source localization in wireless networks [3,4,5,6], adaptive filters [4, 7,8,9], and image smoothing [10]. In Statistics, weighted least-squares regression is often used to reduce bias from non-informative data samples [11, 12]. Also, a best linear unbiased estimator (BLUE) is obtained by using the inverse of the data covariance matrix as the weights matrix [13].

Recently, sparsity has become a commonly desired characteristic of a least-squares solution [14, 15]. Because of its relatively small number of non-zero values, a sparse solution could result in faster processing with lower computer storage requirements [14, 15]. A sparse solution is usually obtained by solving a least-squares problem while minimizing either the L0 norm of the solution (a non-convex optimization problem) or the L1 norm of the solution (a convex optimization problem), where the L0 norm of a vector is its number of non-zero elements and the L1 norm of a vector is the sum of the magnitudes of its elements [14].

Several methods have been proposed to solve sparse least-squares problems, including the Method of Frames [16], Matching Pursuit (MP) [17], Orthogonal Matching Pursuit (OMP) [18], Best Orthogonal Basis [19], the Least Absolute Shrinkage and Selection Operator (LASSO), also known as Basis Pursuit [20, 21], and Least Angle Regression (LARS) [21]. Both MP and OMP solve the L0 constrained least-squares problem [22] using sequential heuristic steps that add solution coefficients in a greedy, i.e., non-globally optimal, way. LASSO relaxes the non-convex L0 constrained least-squares problem and solves the convex L1 constrained least-squares problem instead [20]. Among the above methods, only Least Angle Regression could efficiently solve both the L0 and, with a slight modification, the L1 constrained least-squares problems for all critical values of the regularization parameter, which balances the minimization of the LS residual against the minimization of the norm of the solution [21].

In addition to incorporating a priori information, weights could also be introduced into sparse least-squares problems to improve the results of the \({L}_{1}\) minimization problem [23, 24]. Candès et al. used a reweighted \({L}_{1}\) minimization approach to enhance sparsity in compressed sensing [25]. Also, weighted L1 constrained least-squares regression has been used to extract information from large data sets in statistical applications [26, 27]. We note that sparse weighted least-squares problems could be solved using any of the above optimization methods.

Multilinear least-squares is a multidimensional generalization of least-squares [28,29,30], where the least-squares matrix has a Kronecker structure [31, 32]. Sparse multilinear least-squares could be either an L0 constrained or an L1 constrained multilinear least-squares problem. Caiafa and Cichocki introduced a generalization of OMP, Kronecker-OMP, to solve the L0 constrained sparse multilinear least-squares problem [32]. Elrewainy and Sherif [33] developed Kronecker Least Angle Regression (K-LARS) to efficiently solve both L0 and L1 constrained sparse least-squares problems whose matrix has the specific Kronecker form \({\varvec{A}}\otimes {\varvec{I}}\), for all critical values of the regularization parameter. To overcome this limitation, the authors further developed Tensor Least Angle Regression (T-LARS) [30], a generalization of K-LARS that does not require any special form of the LS matrix beyond being Kronecker. T-LARS solves either large L0 or large L1 constrained sparse multilinear least-squares problems (underdetermined or overdetermined) for all critical values of the regularization parameter λ, with significantly lower computational complexity and memory usage than Kronecker-OMP.

Weighted multilinear least-squares is a generalization of multilinear least-squares that introduces a typically diagonal weight matrix to both sides of the original LS problem. Since an arbitrary diagonal weight matrix would not be Kronecker, the weighted LS matrix would lose its original Kronecker structure, resulting in a potentially very large non-Kronecker LS matrix. Thus solving these weighted sparse multilinear least-squares problems could become highly impractical, as it would require significant memory and computational power.

Therefore, in this paper, we extend T-LARS to Weighted Tensor Least Angle Regression (WT-LARS), which could efficiently solve both \({L}_{0}\) and \({L}_{1}\) constrained sparse weighted multilinear least-squares problems for all critical values of the regularization parameter. This paper is organized as follows: Sect. 3.2 includes a brief introduction to the sparse weighted multilinear least-squares problem. In Sect. 3.3, we describe our new Weighted Tensor Least Angle Regression (WT-LARS) algorithm in detail. Section 3.4 provides results of applying WT-LARS to solve three different image inpainting problems by obtaining sparse representations of binary-weighted images. We present our conclusions in Sect. 3.5.

3.2 Problem Formulation

3.2.1 Sparse Weighted Multilinear Least-Squares Problem

A multilinear transformation of a tensor \(\mathcal{X}\) could be defined as \(\mathcal{Y}=\mathcal{X}{\times }_{1}{\boldsymbol{\Phi }}^{\left(1\right)}{\times }_{2}\cdots {\times }_{N}{\boldsymbol{\Phi }}^{\left(N\right)}\), where \(\mathcal{Y}\in {\mathbb{R}}^{{J}_{1}\times \dots \times {J}_{n}\times \dots \times {J}_{N}}\) and \(\mathcal{X}\in {\mathbb{R}}^{{I}_{1}\times \dots {\times I}_{n}\times \dots \times {I}_{N}}\) are Nth-order tensors, with the equivalent vectorized form

$$\begin{array}{c}\boldsymbol{\Phi }\,{\text{vec}}\left(\mathcal{X}\right)={\text{vec}}\left(\mathcal{Y}\right)\end{array}$$
(3.1)

where \(\boldsymbol{\Phi }\in {\mathbb{R}}^{J\times I}\) with \(J={J}_{1}{J}_{2}\cdots {J}_{N}\) and \(I={I}_{1}{I}_{2}\cdots {I}_{N}\), \(\boldsymbol{\Phi }={\boldsymbol{\Phi }}^{\left(N\right)}\otimes \cdots \otimes {\boldsymbol{\Phi }}^{\left(1\right)}\), and \(\otimes \) is the Kronecker product operator [34].
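As a quick illustration of (3.1), the following NumPy sketch (our own; the second-order case, the sizes, and all variable names are assumptions, not part of the paper) checks that the multilinear transformation and its vectorized Kronecker form agree when \({\text{vec}}\left(\cdot \right)\) is taken column-major, matching the ordering \(\boldsymbol{\Phi }={\boldsymbol{\Phi }}^{\left(2\right)}\otimes {\boldsymbol{\Phi }}^{\left(1\right)}\).

```python
import numpy as np

rng = np.random.default_rng(0)
I1, I2, J1, J2 = 4, 3, 5, 6                    # illustrative mode sizes
X = rng.standard_normal((I1, I2))              # a 2nd-order tensor (matrix) for simplicity
Phi1 = rng.standard_normal((J1, I1))           # mode-1 dictionary factor
Phi2 = rng.standard_normal((J2, I2))           # mode-2 dictionary factor

# Multilinear transformation: Y = X x1 Phi1 x2 Phi2 (for matrices, Phi1 @ X @ Phi2^T)
Y = Phi1 @ X @ Phi2.T

# Vectorized form (3.1): (Phi2 kron Phi1) vec(X) = vec(Y), with column-major vec()
Phi = np.kron(Phi2, Phi1)
assert np.allclose(Phi @ X.reshape(-1, order='F'), Y.reshape(-1, order='F'))
```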

Let \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\) be a diagonal weight matrix. We could obtain a weighted linear transformation [35] of (3.1) as

$$\begin{array}{c}{\varvec{S}}\boldsymbol{\Phi }\,{\text{vec}}\left(\mathcal{X}\right)={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)\end{array}$$
(3.2)

A sparse solution of the weighted linear system in (3.2) could be obtained by solving an \({L}_{p}\) (p = 0 or p = 1) minimization problem,

$$\begin{array}{c}\stackrel{\sim }{\mathcal{X}} = \underset{\mathcal{X}}{\mathrm{arg min}}{\Vert {\varvec{S}}\boldsymbol{\Phi }{\text{vec}}\left(\mathcal{X}\right) - {\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right) \Vert }_{2}^{2}+\lambda {\Vert {\text{vec}}\left(\mathcal{X}\right)\Vert }_{p}\end{array}$$
(3.3)

where \(\uplambda \) is a regularization parameter.

If \({\varvec{S}}\) is a Kronecker matrix, then \({\varvec{S}}\boldsymbol{\Phi }=\left({{\varvec{S}}}^{\left(N\right)}{\boldsymbol{\Phi }}^{\left(N\right)}\otimes \cdots \otimes {{\varvec{S}}}^{\left(1\right)}{\boldsymbol{\Phi }}^{\left(1\right)}\right)\), and we could use T-LARS [30] to efficiently obtain a sparse solution of either the \({L}_{0}\) or the \({L}_{1}\) optimization problem in (3.3). However, \({\varvec{S}}\) is typically not Kronecker, so \({\varvec{S}}\boldsymbol{\Phi }\) would not have a Kronecker structure, and (3.3) would have to be solved as a potentially very large vectorized (one-dimensional) sparse least-squares problem, which could be very challenging in terms of memory and computational power requirements. Therefore, in this paper, we develop Weighted Tensor Least Angle Regression (WT-LARS), a computationally efficient method to solve either the \({L}_{0}\) or the \({L}_{1}\) constrained sparse weighted multilinear least-squares problem in (3.3) for an arbitrary diagonal weights matrix \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\in {\mathbb{R}}^{J\times J}\).

3.3 Weighted Tensor Least Angle Regression

In this section, we develop Weighted Tensor Least Angle Regression (WT-LARS) by extending T-LARS to solve the sparse weighted multilinear least-squares problem in (3.3), for weights \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\) and Kronecker dictionaries \(\boldsymbol{\Phi }\).

Inputs to WT-LARS are the data tensor \(\mathcal{Y}\in {\mathbb{R}}^{{J}_{1}\times \dots \times {J}_{n}\times \dots \times {J}_{N}}\), mode-n dictionary matrices \({\boldsymbol{\Phi }}^{\left(n\right)};n\in \left\{1,\cdots , N\right\}\) where \(\boldsymbol{\Phi }={\boldsymbol{\Phi }}^{\left(N\right)}\otimes \cdots \otimes {\boldsymbol{\Phi }}^{\left(1\right)}\), the diagonal weight matrix \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\), and the stopping criterion as a residual tolerance \(\varepsilon \) or the maximum number of non-zero coefficients \(K\) (K-sparse representation). The output is the solution tensor \(\mathcal{X}\in {\mathbb{R}}^{{I}_{1}\times \dots {\times I}_{n}\times \dots \times {I}_{N}}\).

WT-LARS requires the weighted data \({\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right)\) and the columns of the weighted dictionary \({\varvec{S}}\boldsymbol{\Phi }\) to have unit \({L}_{2}\) norm. The normalized weighted data could be easily calculated as \({\mathcal{Y}}_{W}={\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right)/{\Vert {\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right)\Vert }_{2}\). However, the dictionary matrix \({\varvec{S}}\boldsymbol{\Phi }\) does not have a Kronecker structure; hence, normalizing the mode-n dictionary matrices \({\boldsymbol{\Phi }}^{\left(n\right)}\) does not ensure normalization of the columns of \({\varvec{S}}\boldsymbol{\Phi }\). Therefore, in WT-LARS, we use the normalized weighted dictionary matrix \({\boldsymbol{\Phi }}_{W}={\varvec{S}}\boldsymbol{\Phi }{\varvec{Q}}\) instead of the normalized dictionary matrix \(\boldsymbol{\Phi }\) used in T-LARS, where \({\varvec{Q}}\) is a diagonal matrix,

$$\begin{array}{c}{{\varvec{Q}}}_{i,i}= \frac{1}{{\Vert {\left({\varvec{S}}\boldsymbol{\Phi }\right)}_{i}\Vert }_{2}}\end{array}$$
(3.4)

where \({\left({\varvec{S}}\boldsymbol{\Phi }\right)}_{i}\) is the \({i}^{th}\) column of the weighted dictionary matrix \({\varvec{S}}\boldsymbol{\Phi }\). We can efficiently calculate the diagonal matrix \({\varvec{Q}}\) as,

$$\begin{array}{c}diag\left({\varvec{Q}}\right)=1./\sqrt{{\left({\boldsymbol{\Phi }}^{*2}\right)}^{{\varvec{T}}} diag\left({\varvec{W}}\right)}\end{array}$$
(3.5)

where \({\boldsymbol{\Phi }}^{*2}\) [36] denotes the Hadamard square of \(\boldsymbol{\Phi }\), such that \({\boldsymbol{\Phi }}_{i,j}^{*2}={\left({\boldsymbol{\Phi }}_{i,j}\right)}^{2}\), \("./"\) denotes elementwise division, and \(diag\left({\varvec{Q}}\right)\) and \(diag\left({\varvec{W}}\right)\) are the diagonal vectors of \({\varvec{Q}}\) and \({\varvec{W}}\), respectively. We could efficiently calculate \({\left({\boldsymbol{\Phi }}^{*2}\right)}^{{\varvec{T}}}diag\left({\varvec{W}}\right)\) using the full multilinear product.
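As a hedged sketch of (3.4)–(3.5) (our own illustrative code and names, not the authors' implementation), the column norms of \({\varvec{S}}\boldsymbol{\Phi }\) can be computed from \(diag\left({\varvec{W}}\right)\) and the Hadamard square of \(\boldsymbol{\Phi }\) without ever forming \({\varvec{S}}\boldsymbol{\Phi }\); since the Hadamard square of a Kronecker product is the Kronecker product of the Hadamard squares, this computation also factors mode-wise.

```python
import numpy as np

rng = np.random.default_rng(1)
J1, J2, I1, I2 = 5, 6, 7, 8                    # illustrative sizes
Phi1 = rng.standard_normal((J1, I1))
Phi2 = rng.standard_normal((J2, I2))
Phi = np.kron(Phi2, Phi1)                      # Phi = Phi(2) kron Phi(1)
w = rng.integers(0, 2, J1 * J2).astype(float)  # diag(W) = diag(S^H S), binary weights here
S = np.diag(np.sqrt(w))

# Direct column norms of the weighted dictionary S*Phi
q_direct = 1.0 / np.linalg.norm(S @ Phi, axis=0)

# Eq. (3.5): diag(Q) = 1 ./ sqrt((Phi^{*2})^T diag(W)), using the Kronecker factorization
q_eq35 = 1.0 / np.sqrt(np.kron(Phi2**2, Phi1**2).T @ w)

assert np.allclose(q_direct, q_eq35)
```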

WT-LARS solves the \({L}_{0}\) or \({L}_{1}\) constrained minimization problems in (3.3) for all critical values of the regularization parameter \(\uplambda \). WT-LARS starts with a large value of \(\uplambda \) that results in an empty active set \(I=\{\}\) and a solution \({\stackrel{\sim }{\mathcal{X}}}_{t=0}=0\). The set \(I\) denotes the active set of columns of the dictionary \({\boldsymbol{\Phi }}_{W}\), i.e., the column indices where the optimal solution \({\stackrel{\sim }{\mathcal{X}}}_{t}\) at iteration \(t\) is nonzero, and \({I}^{c}\) denotes its corresponding inactive set. Therefore, \({{\boldsymbol{\Phi }}_{W}}_{I}\) contains only the active columns of the dictionary \({\boldsymbol{\Phi }}_{W}\), and \({{\boldsymbol{\Phi }}_{W}}_{{I}^{c}}\) contains only its inactive columns.

At each iteration \(t\), a new column is added to the active set \(I\) (\({L}_{0}\)), or a column is either added to or removed from the active set \(I\) (\({L}_{1}\)), and \(\uplambda \) is reduced by a calculated value \({\delta }_{t}^{*}\).

As a result of these iterations, new solutions with an increasing number of coefficients are obtained along a piecewise linear path until a predetermined residual error \(\varepsilon \) or a predetermined number of active columns \(K\) is reached.

The regularization parameter \(\uplambda \) is initialized to the maximum magnitude of the correlation \({{\varvec{c}}}_{1}\) between the columns of \({\boldsymbol{\Phi }}_{W}\) and the initial residual \({{\varvec{r}}}_{0}={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)\),

$$\begin{array}{c}{{\varvec{c}}}_{1}= {\boldsymbol{\Phi }}_{W}^{T}{{\varvec{r}}}_{0}\end{array}$$
(3.6)

Since \({\boldsymbol{\Phi }}_{W}^{T}={\varvec{Q}}{\boldsymbol{\Phi }}^{{\varvec{T}}}{\varvec{S}}\), we can easily calculate \({\boldsymbol{\Phi }}^{{\varvec{T}}}{\varvec{S}}{{\varvec{r}}}_{0}\) using the full multilinear product as

$$\begin{array}{c}{\mathcal{C}}_{1}^{\prime}={\mathcal{R}}_{{\varvec{S}}_{0}}{\times }_{1}{{\boldsymbol{\Phi }}^{\left(1\right)}}^{T}{\times }_{2}\cdots {\times }_{N}{{\boldsymbol{\Phi }}^{\left(N\right)}}^{T}\end{array}$$
(3.7)

where \({\text{vec}}\left({\mathcal{R}}_{{{\varvec{S}}}_{0}}\right)={\varvec{S}}{{\varvec{r}}}_{0}\) and \({{\varvec{c}}}_{1}={\varvec{Q}}\,{\text{vec}}\left({\mathcal{C}}_{1}^{\prime}\right)\). The column index corresponding to the maximum-magnitude element of \({{\varvec{c}}}_{1}\) is added to the active set.
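The sketch below (our own 2-D example with illustrative names; column-major \({\text{vec}}\left(\cdot \right)\)) checks that the correlation in (3.6) can indeed be computed through the mode-n products of (3.7) followed by the rescaling with \({\varvec{Q}}\), without ever forming \({\boldsymbol{\Phi }}_{W}\).

```python
import numpy as np

rng = np.random.default_rng(2)
J1, J2, I1, I2 = 5, 6, 7, 8
Phi1, Phi2 = rng.standard_normal((J1, I1)), rng.standard_normal((J2, I2))
Y = rng.standard_normal((J1, J2))
w = rng.random(J1 * J2)                                  # diag(W)
s = np.sqrt(w)                                           # diag(S)

Phi = np.kron(Phi2, Phi1)                                # Phi = Phi(2) kron Phi(1)
q = 1.0 / np.sqrt((Phi**2).T @ w)                        # diag(Q), Eq. (3.5)
Phi_W = (s[:, None] * Phi) * q[None, :]                  # Phi_W = S Phi Q (reference only)

# Reference: c1 = Phi_W^T r0, with r0 = S vec(Y)          (Eq. 3.6)
r0 = s * Y.reshape(-1, order='F')
c1_direct = Phi_W.T @ r0

# Tensor route (Eq. 3.7): vec(R_S0) = S r0, C1' = R_S0 x1 Phi1^T x2 Phi2^T, c1 = Q vec(C1')
R_S0 = (s * r0).reshape((J1, J2), order='F')
C1p = Phi1.T @ R_S0 @ Phi2
c1_tensor = q * C1p.reshape(-1, order='F')

assert np.allclose(c1_direct, c1_tensor)
```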

For a given active set \(I\), the optimal solution \({\stackrel{\sim }{\mathcal{X}}}_{t}\) at any iteration \(t\) could be written as
$$\begin{array}{c}{\text{vec}}\left({\stackrel{\sim }{\mathcal{X}}}_{t}\right)=\left\{\begin{array}{ll}{\left({{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}\right)}^{-1}\left({{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)-{\lambda }_{t}{{\varvec{z}}}_{t}\right), & {\text{on}}\;I\\ 0, & {\text{otherwise}}\end{array}\right.\end{array}$$
(3.8)

where \({{\varvec{z}}}_{t}\) is the sign sequence of \({{\varvec{c}}}_{t}\) on the active set \(I\), and \({{\varvec{c}}}_{t}={\boldsymbol{\Phi }}_{W}^{T}{{\varvec{r}}}_{t-1}\) is the correlation vector of all columns of the dictionary \({\boldsymbol{\Phi }}_{W}\) with the residual \({{\varvec{r}}}_{t-1}\) at iteration \(t\).

The optimal solution at any iteration \(t\) must satisfy the following two optimality conditions,

$$\begin{array}{c}{{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{{\varvec{r}}}_{t} = -{\lambda }_{t}{{\varvec{z}}}_{t}\end{array}$$
(3.9)
$$\begin{array}{c}{\Vert {{\boldsymbol{\Phi }}_{W}}_{{I}_{t}^{c}}^{T}{{\varvec{r}}}_{t} \Vert }_{\infty } \le {\lambda }_{t} \end{array}$$
(3.10)

where \({{\varvec{r}}}_{t}={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)-{\boldsymbol{\Phi }}_{W}{\text{vec}}\left({\stackrel{\sim }{\mathcal{X}}}_{t}\right)\) is the residual at iteration \(t\), and \({{\varvec{z}}}_{t}\) is the sign sequence of the correlation \({{\varvec{c}}}_{t}\) at iteration \(t\) on the active set \(I\). The condition in (3.9) ensures that the magnitude of the correlation between all active columns of \({\boldsymbol{\Phi }}_{W}\) and the residual is equal to \(\left|{\lambda }_{t}\right|\) at each iteration, and the condition in (3.10) ensures that the magnitude of the correlation between the inactive columns of \({\boldsymbol{\Phi }}_{W}\) and the residual is less than or equal to \(\left|{\lambda }_{t}\right|\).

At each iteration \(t\), \({\lambda }_{t}\) is reduced by a small step size \({\delta }_{t}^{*}\) until the condition in either (3.9) or (3.10) is violated. For both the \({L}_{0}\) and the \({L}_{1}\) constrained minimization problems, if an inactive column violates condition (3.10), it is added to the active set, and for the \({L}_{1}\) constrained minimization problem, if an active column violates condition (3.9), it is removed from the active set.

As \(\uplambda \) is reduced by \({\delta }_{t}^{*}\), the solution \({\stackrel{\sim }{\mathcal{X}}}_{t}\) changes by \({\delta }_{t}^{*}{{\varvec{d}}}_{t}\) along a direction \({{\varvec{d}}}_{t}\), where \({{\varvec{d}}}_{{I}_{t}^{c}}=0\) and \({{\varvec{d}}}_{{I}_{t}}={{\varvec{G}}}_{t}^{-1}{{\varvec{z}}}_{t}\), and \({{\varvec{G}}}_{t}^{-1}\) is the inverse of the Gram matrix of the active columns of the dictionary \({{\varvec{G}}}_{t}={{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}\).

The size of this Gram matrix would either increase (dictionary column addition) or decrease (dictionary column removal) with each iteration \(t\). Therefore, for computational efficiency, we use the Schur complement inversion formula to calculate \({{\varvec{G}}}_{t}^{-1}\) from \({{\varvec{G}}}_{t-1}^{-1}\) thereby avoiding its full calculation [30, 37].
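A minimal sketch of this update (our own illustrative code; `gram_inverse_add` and all names are assumptions): when a column is appended to the active dictionary, the new Gram inverse is assembled from the previous one via the Schur complement in \(O\left({k}^{2}\right)\) operations instead of a fresh \(O\left({k}^{3}\right)\) inversion; column removal admits an analogous downdate.

```python
import numpy as np

def gram_inverse_add(G_inv, A_active, a_new):
    """Update (A^T A)^{-1} when column a_new is appended to A_active (Schur complement)."""
    b = A_active.T @ a_new                        # cross-correlations with the active columns
    c = float(a_new @ a_new)                      # squared norm of the new column
    G_inv_b = G_inv @ b
    s = c - b @ G_inv_b                           # Schur complement (a scalar)
    k = G_inv.shape[0]
    G_new = np.empty((k + 1, k + 1))
    G_new[:k, :k] = G_inv + np.outer(G_inv_b, G_inv_b) / s
    G_new[:k, k] = G_new[k, :k] = -G_inv_b / s
    G_new[k, k] = 1.0 / s
    return G_new

# Quick check against a direct inversion on random data
rng = np.random.default_rng(3)
A, a = rng.standard_normal((50, 5)), rng.standard_normal(50)
A_plus = np.column_stack([A, a])
assert np.allclose(gram_inverse_add(np.linalg.inv(A.T @ A), A, a),
                   np.linalg.inv(A_plus.T @ A_plus))
```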

For the \({L}_{1}\) constrained sparse least-squares problem, the smallest step size \({\delta }_{t}^{*}=\mathrm{min}\left\{{\delta }_{t}^{+},{\delta }_{t}^{-}\right\}\) is the minimum of \({\delta }_{t}^{+}\), the smallest step size for adding a column, and \({\delta }_{t}^{-}\), the smallest step size for removing a column. The minimum step size for removing a column from the active set is given by,

$$\begin{array}{c}{\delta }_{t}^{-}=\underset{i\in I}{{\text{min}}}\left\{-\frac{{{\varvec{x}}}_{t-1}\left(i\right)}{{{\varvec{d}}}_{t}\left(i\right)}\right\}\end{array}$$
(3.11)

The minimum step size for adding a new column to the active set is given by,

$$\begin{array}{c}{\delta }_{t}^{+}=\underset{i\in {I}^{c}}{{\text{min}}}\left\{\frac{{\lambda }_{t}-{{\varvec{c}}}_{t}\left(i\right)}{1-{{\varvec{v}}}_{t}\left(i\right)},\frac{{\lambda }_{t}+{{\varvec{c}}}_{t}\left(i\right)}{1+{{\varvec{v}}}_{t}\left(i\right)}\right\}\end{array}$$
(3.12)

where

$$\begin{array}{c}{{\varvec{v}}}_{t}= {\boldsymbol{\Phi }}_{W}^{T}{\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}\end{array}$$
(3.13)

Since \({\boldsymbol{\Phi }}_{W}={\varvec{S}}\boldsymbol{\Phi }{\varvec{Q}}\), we can efficiently calculate \({{\varvec{v}}}_{t}\) using two full multilinear products.

Let \({{\varvec{v}}}_{t}={\varvec{Q}}\,{\text{vec}}\left({\mathcal{V}}_{t}^{\prime}\right)\), where

$$\begin{array}{c}{\mathcal{V}}_{t}^{\prime}={\mathcal{U}}_{{w}_{t}}{\times }_{1}{{\boldsymbol{\Phi }}^{\left(1\right)}}^{T}{\times }_{2}\cdots {\times }_{N}{{\boldsymbol{\Phi }}^{\left(N\right)}}^{T}\end{array}$$
(3.14)

and \({\text{vec}}\left({\mathcal{U}}_{{w}_{t}}\right)={\varvec{W}}\,{\text{vec}}\left({\mathcal{D}}_{t}^{\prime}{\times }_{1}{\boldsymbol{\Phi }}^{\left(1\right)}{\times }_{2}\cdots {\times }_{N}{\boldsymbol{\Phi }}^{\left(N\right)}\right)\), with \({\text{vec}}\left({\mathcal{D}}_{t}^{\prime}\right)={\varvec{Q}}{{\varvec{d}}}_{t}\).
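The following small 2-D sketch (our own illustrative code and names; column-major \({\text{vec}}\left(\cdot \right)\)) verifies (3.13)–(3.14): \({{\varvec{v}}}_{t}\) is obtained by one multilinear product with the mode-n dictionaries, an elementwise weighting by \(diag\left({\varvec{W}}\right)\), a second multilinear product with the transposed dictionaries, and a final rescaling by \({\varvec{Q}}\), without forming \({\boldsymbol{\Phi }}_{W}\).

```python
import numpy as np

rng = np.random.default_rng(4)
J1, J2, I1, I2 = 5, 6, 7, 8
Phi1, Phi2 = rng.standard_normal((J1, I1)), rng.standard_normal((J2, I2))
w = rng.random(J1 * J2)                                  # diag(W)
s = np.sqrt(w)

Phi = np.kron(Phi2, Phi1)
q = 1.0 / np.sqrt((Phi**2).T @ w)                        # diag(Q), Eq. (3.5)
Phi_W = (s[:, None] * Phi) * q[None, :]                  # Phi_W = S Phi Q (reference only)

d = rng.standard_normal(I1 * I2)                         # a direction d_t
v_direct = Phi_W.T @ (Phi_W @ d)                         # Eq. (3.13)

# Tensor route: D' = reshape(Q d); U_w = W .* (D' x1 Phi1 x2 Phi2); V' = U_w x1 Phi1^T x2 Phi2^T
Dp = (q * d).reshape((I1, I2), order='F')
Uw = w.reshape((J1, J2), order='F') * (Phi1 @ Dp @ Phi2.T)
Vp = Phi1.T @ Uw @ Phi2
v_tensor = q * Vp.reshape(-1, order='F')                 # v_t = Q vec(V'), Eq. (3.14)

assert np.allclose(v_direct, v_tensor)
```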

The residual \({{\varvec{r}}}_{t+1}\) is calculated at the end of each iteration using,

$$\begin{array}{c}{{\varvec{r}}}_{t+1}={{\varvec{r}}}_{t}-{\delta }_{t}^{*}\,{\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}\end{array}$$
(3.15)

where we can efficiently calculate \({\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}\) using

$$\begin{array}{c}{\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}={\varvec{S}}\,{\text{vec}}\left({\mathcal{D}}_{t}^{\prime}{\times }_{1}{\boldsymbol{\Phi }}^{\left(1\right)}{\times }_{2}\cdots {\times }_{N}{\boldsymbol{\Phi }}^{\left(N\right)}\right)\end{array}$$
(3.16)

WT-LARS stops when a predetermined residual error \({\Vert {{\varvec{r}}}_{t+1}\Vert }_{2}\le \varepsilon \) is reached or when a predetermined number of active columns \(K\) is obtained.

3.3.1 Weighted Tensor Least Angle Regression Algorithm

Algorithm 1 (WT-LARS) summarizes the above steps: initialize \({\varvec{S}}=\sqrt{{\varvec{W}}}\), the residual \({{\varvec{r}}}_{0}={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)\), the solution \({\mathcal{X}}_{0}=0\), and the active set \(I=\{\,\}\); iterate the column addition/removal steps described above until the stopping criterion is met; then return \(I\) and \(\mathcal{X}\).
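For readability, the following is a naive, purely illustrative Python sketch of the WT-LARS iterations above (column additions only, i.e., the \({L}_{0}\) path, stopping at \(K\) active columns). It is not the authors' MATLAB/GPU implementation: it forms \({\boldsymbol{\Phi }}_{W}\) explicitly and re-inverts the Gram matrix at every iteration, whereas the efficient algorithm replaces every product with \({\boldsymbol{\Phi }}_{W}\) by the multilinear products of (3.7), (3.14) and (3.16) and updates the Gram inverse with the Schur complement. All function and variable names are our own.

```python
import numpy as np

def wt_lars_l0(Phi, y, w, K):
    """Illustrative WT-LARS sketch: explicit Phi_W = S*Phi*Q, column additions only, K-sparse stop."""
    s = np.sqrt(w)
    q = 1.0 / np.sqrt((Phi ** 2).T @ w)             # diag(Q), Eq. (3.5)
    A = (s[:, None] * Phi) * q[None, :]             # Phi_W, formed explicitly here
    b = s * y                                       # weighted data S vec(Y)

    n = Phi.shape[1]
    x = np.zeros(n)                                 # solution in the Q-scaled coordinates
    I = []                                          # active set
    c = A.T @ b                                     # initial correlations, Eq. (3.6)
    lam = np.max(np.abs(c))

    for _ in range(K):
        masked = np.abs(c).copy()
        if I:
            masked[I] = -np.inf
        I.append(int(np.argmax(masked)))            # inactive column whose correlation reached lambda

        z = np.sign(c[I])                           # sign sequence z_t
        G_inv = np.linalg.inv(A[:, I].T @ A[:, I])  # use the Schur-complement update in practice
        d = np.zeros(n)
        d[I] = G_inv @ z                            # direction d_t
        v = A.T @ (A @ d)                           # Eq. (3.13)

        Ic = np.setdiff1d(np.arange(n), I)          # inactive set
        cand = np.concatenate([(lam - c[Ic]) / (1.0 - v[Ic]),
                               (lam + c[Ic]) / (1.0 + v[Ic])]) if Ic.size else np.array([])
        cand = cand[cand > 1e-12]                   # only positive steps are admissible
        delta = min(cand.min() if cand.size else lam, lam)   # step size, Eq. (3.12)

        x += delta * d                              # move along the piecewise linear path
        lam -= delta
        c = A.T @ (b - A @ x)                       # correlations with the new residual, Eq. (3.15)

    return q * x, I                                 # vec(X) = Q x undoes the column normalization
```

Under these assumptions, a call such as `wt_lars_l0(np.kron(Phi2, Phi1), Y.reshape(-1, order='F'), w, K)` returns the vectorized solution and the active set; the \({L}_{1}\) variant additionally removes a column, using the step size in (3.11), whenever an active coefficient would change sign.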

3.4 Experimental Results

In this section, we present experimental results for WT-LARS on a tensor completion problem [38,39,40], using inpainting as an example. Image inpainting has progressed significantly during the last few years, specifically using machine learning methods [41,42,43]. However, to the best of our knowledge, no other tensor-based method is available for solving the image inpainting problem as a weighted tensor least-squares problem.

For the experiments shown in Figs. 3.1 and 3.2, we obtained fenced images from the image datasets for the MSBP deformable lattice detection algorithm [44], and for the experiment shown in Fig. 3.3, we obtained a landscape image from the DIV2K dataset [45].

Fig. 3.1

a Original image with a fence b weights image with zero weights for the fence c WT-LARS reconstructed image (Fence Removed)

Fig. 3.2

a Original image with a fence b weights image with zero weights for the fence c WT-LARS reconstructed image (Fence Removed)

Our experimental results were obtained using a MATLAB (R2017b) implementation of WT-LARS on an MS-Windows machine with two Intel Xeon E5-2637 v4 CPUs (3.5 GHz), 32 GB RAM, and an NVIDIA Tesla P100 GPU with 12 GB memory.

3.4.1 Inpainting

In this experiment, we use WT-LARS for inpainting. We obtained a sparse representation of each incomplete image using WT-LARS after applying zero weights to the missing data.

In our experimental results shown in Figs. 3.1 and 3.2, we obtained fenceless images by considering the pixels behind the fences as missing data. Figures 3.1a and 3.2a show the original images with fences, and Figs. 3.1b and 3.2b show the respective masks applied to each pixel of the original images, where black indicates zero and white indicates one. Figures 3.1c and 3.2c show the sparse representations, obtained using WT-LARS, of the images behind the fences.

We obtained RGB image patches of \(200\times 200\times 3\) pixels from the original images in Figs. 3.1a and 3.2a. For each patch, we obtained a weighted K-sparse representation using WT-LARS, with 10% nonzero coefficients, for three fixed mode-n overcomplete dictionaries, \({\boldsymbol{\Phi }}^{\left(1\right)}\in {\mathbb{R}}^{200\times 400}\), \({\boldsymbol{\Phi }}^{\left(2\right)}\in {\mathbb{R}}^{200\times 400}\) and \({\boldsymbol{\Phi }}^{\left(3\right)}\in {\mathbb{R}}^{3\times 4}\), by solving an \({L}_{1}\) constrained sparse weighted least-squares problem. The weights consist of zeros for the pixels belonging to the fence in the original images and ones everywhere else. The fixed mode-n overcomplete dictionaries used were each a union of a Discrete Cosine Transform (DCT) dictionary and a Symlet wavelet packet dictionary with four vanishing moments. In the experimental results shown in Figs. 3.1 and 3.2, the RGB patches with the fewest nonzero samples had 79,834 and 92,748 nonzero samples, respectively. We collected 60 image patches from the image in Fig. 3.1a and 35 image patches from the image in Fig. 3.2a, and on average WT-LARS took 476 s to obtain 12,000 (10% of \(200\times 200\times 3\)) non-zero coefficients for each image patch.

In the experimental results shown in Fig. 3.3, we use WT-LARS to reconstruct the landscape image in Fig. 3.3a, which is occluded by a person. Figure 3.3b shows the weights, and Fig. 3.3c shows the inpainting result after removing the person from the foreground of the landscape image.

Fig. 3.3

a Original image with a person b weights image with zero weights for the person c WT-LARS reconstructed image (Person Removed)

The RGB image in Fig. 3.3a is a scaled version of the original image, with \(200\times 300\times 3\) pixels. We obtained a weighted K-sparse representation of the scaled image in Fig. 3.3a using WT-LARS, with 20% non-zero coefficients, for three fixed mode-n overcomplete dictionaries, \({\boldsymbol{\Phi }}^{\left(1\right)}\in {\mathbb{R}}^{200\times 400}\), \({\boldsymbol{\Phi }}^{\left(2\right)}\in {\mathbb{R}}^{300\times 604}\) and \({\boldsymbol{\Phi }}^{\left(3\right)}\in {\mathbb{R}}^{3\times 4}\), by solving a weighted \({L}_{1}\) constrained sparse least-squares problem. The weights consist of zeros for the pixels belonging to the person in the original image and ones everywhere else. The fixed mode-n overcomplete dictionaries used were each a union of a Discrete Cosine Transform (DCT) dictionary and a Symlet wavelet packet dictionary with four vanishing moments. In the experimental results shown in Fig. 3.3, a total of 170,829 nonzero samples were used to obtain the sparse signal representation of the landscape image. WT-LARS took 20,625 s to obtain 36,000 non-zero coefficients, which is 20% of the size of the image tensor in Fig. 3.3a. The inpainting results in Figs. 3.1c, 3.2c and 3.3c clearly show that WT-LARS can be successfully used to approximate missing/incomplete data.

3.5 Conclusions

Sparse weighted multilinear least-squares is a generalization of the sparse multilinear least-squares problem, where both sides of the Kronecker LS system are multiplied by an arbitrary diagonal weights matrix. These arbitrary weights would result in a potentially very large non-Kronecker least-squares problem that could be impractical to solve as it would require significant memory and computational power.

This paper extended the T-LARS algorithm, earlier developed by the authors [30], to the Weighted Tensor Least Angle Regression (WT-LARS) algorithm, which could efficiently solve either L0 or L1 constrained weighted multilinear least-squares problems with arbitrary diagonal weights for all critical values of their regularization parameter. To validate our new WT-LARS algorithm, we used it to solve three image inpainting problems. In our experimental results shown in Figs. 3.1 and 3.2, we obtained sparse signal representations of RGB images behind fences after applying zero weights to the pixels representing the fences. In the experimental result shown in Fig. 3.3, we successfully obtained a sparse signal representation of an RGB landscape image occluded by a person by applying zero weights to the pixels representing this person. These results demonstrate the validity and usefulness of our new Weighted Tensor Least Angle Regression (WT-LARS) algorithm.

Possible future applications of WT-LARS include efficiently solving weighted least-squares problems for tensor signals, for example, tensor completion, image/video inpainting, image/video smoothing, and tensor signal restoration.

A MATLAB GPU-based implementation of our Weighted Tensor Least Angle Regression (WT-LARS) algorithm, Algorithm 1, is available at https://github.com/SSSherif/Weighted-Tensor-Least-Angle-Regression.