
3.1 Introduction

Weighted least squares is a generalization of the least-squares (LS) problem, where prior information about parameters and data is incorporated by multiplying both sides of the original LS problem by a typically diagonal weights matrix. Applications of weighted least-squares in Signal Processing include signal restoration [1, 2], source localization in wireless networks [3,4,5,6], adaptive filters [4, 7,8,9], and image smoothing [10]. In Statistics, weighted least-squares regression is often used to reduce bias from non-informative data samples [11, 12]. Also, a best linear unbiased estimator (BLUE) is obtained by using the inverse of the data covariance matrix as the weights matrix [13].

Recently, sparsity has become a commonly desired characteristic of a least-squares solution [14, 15]. Because of its relatively small number of non-zero values, a sparse solution could result in faster processing with lower computer storage requirements [14, 15]. A sparse solution is usually obtained by solving a least-squares problem while minimizing either the L0 norm of the solution (a non-convex optimization problem) or the L1 norm of the solution (a convex optimization problem), where the L0 norm of a vector is its number of non-zero elements and the L1 norm of a vector is the sum of the magnitudes of its elements [14].

Several methods have been proposed to solve sparse least-squares problems, including the Method of Frames [16], Matching Pursuit (MP) [17], Orthogonal Matching Pursuit (OMP) [18], Best Orthogonal Basis [19], the Least Absolute Shrinkage and Selection Operator (LASSO), also known as Basis Pursuit [20, 21], and Least Angle Regression (LARS) [21]. Both MP and OMP solve the L0 constrained least-squares problem [22] using sequential heuristic steps that add solution coefficients in a greedy, i.e., non-globally optimal, way. LASSO relaxes the non-convex L0 constrained least-squares problem and solves the convex L1 constrained least-squares problem instead [20]. Among the above methods, only Least Angle Regression could efficiently solve both the L0 and, with a slight modification, the L1 constrained least-squares problems for all critical values of the regularization parameter, which balances the minimization of the LS residual against the minimization of the norm of the solution [21].

In addition to incorporating a priori information, weights could also be introduced into sparse least-squares problems to improve the results of the \({L}_{1}\) minimization problem [23, 24]. Candès et al. used a reweighted \({L}_{1}\) minimization approach to enhance sparsity in compressed sensing [25]. Also, weighted L1 constrained least-squares regression has been used to extract information from large data sets in statistical applications [26, 27]. We note that sparse weighted least-squares problems could be solved using any of the above optimization methods.

Multilinear least-squares is a multidimensional generalization of least-squares [28,29,30], where the least-squares matrix has a Kronecker structure [31, 32]. Sparse multilinear least-squares could be either an L0 constrained or an L1 constrained multilinear least-squares problem. Caiafa and Cichocki introduced a generalization of OMP, Kronecker-OMP, to solve the L0 constrained sparse multilinear least-squares problem [32]. Elrewainy and Sherif [33] developed Kronecker Least Angle Regression (K-LARS) to efficiently solve both L0 and L1 constrained sparse least-squares problems whose matrix has the specific Kronecker form \({\varvec{A}}\otimes {\varvec{I}}\), for all critical values of the regularization parameter. To overcome this limitation, the authors further developed Tensor Least Angle Regression (T-LARS) [30], a generalization of K-LARS that does not require any special form of the LS matrix beyond being Kronecker. T-LARS solves either large L0 or large L1 constrained sparse multilinear least-squares problems (underdetermined or overdetermined) for all critical values of the regularization parameter λ, with significantly lower computational complexity and memory usage than Kronecker-OMP.

Weighted multilinear least-squares is a generalization of multilinear least-squares that introduces a typically diagonal weight matrix to both sides of the original LS problem. Since an arbitrary diagonal weight matrix would not be Kronecker, the weighted LS matrix would lose its original Kronecker structure, resulting in a potentially very large non-Kronecker LS matrix. Thus solving these weighted sparse multilinear least-squares problems could become highly impractical, as it would require significant memory and computational power.

Therefore, in this paper, we extend T-LARS to Weighted Tensor Least Angle Regression (WT-LARS), which could efficiently solve both \({L}_{0}\) and \({L}_{1}\) constrained sparse weighted multilinear least-squares problems for all critical values of the regularization parameter. This paper is organized as follows: Sect. 3.2 includes a brief introduction to the sparse weighted multilinear least-squares problem. In Sect. 3.3, we describe our new Weighted Tensor Least Angle Regression (WT-LARS) algorithm in detail. Section 3.4 provides results of applying WT-LARS to solve three different image inpainting problems by obtaining sparse representations of binary-weighted images. We present our conclusions in Sect. 3.5.

3.2 Problem Formulation

3.2.1 Sparse Weighted Multilinear Least-Squares Problem

A multilinear transformation of a tensor \(\mathcal{X}\) could be defined as \(\mathcal{Y}=\mathcal{X}{\times }_{1}{\boldsymbol{\Phi }}^{\left(1\right)}{\times }_{2}\cdots {\times }_{N}{\boldsymbol{\Phi }}^{\left(N\right)}\), where \(\mathcal{Y}\in {\mathbb{R}}^{{J}_{1}\times \dots \times {J}_{n}\times \dots \times {J}_{N}}\) and \(\mathcal{X}\in {\mathbb{R}}^{{I}_{1}\times \dots {\times I}_{n}\times \dots \times {I}_{N}}\) are Nth-order tensors, with the equivalent vectorized form

$$\begin{array}{c}\boldsymbol{\Phi }\,{\text{vec}}\left(\mathcal{X}\right)={\text{vec}}\left(\mathcal{Y}\right)\end{array}$$
(3.1)

where \(\boldsymbol{\Phi }\in {\mathbb{R}}^{J\times I}\) with \(J={J}_{1}{J}_{2}\cdots {J}_{N}\) and \(I={I}_{1}{I}_{2}\cdots {I}_{N}\), \(\boldsymbol{\Phi }={\boldsymbol{\Phi }}^{\left(N\right)}\otimes \cdots \otimes {\boldsymbol{\Phi }}^{\left(1\right)}\), and \(\otimes \) is the Kronecker product operator [34].
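As a quick illustration of (3.1), the following NumPy sketch (our own; the second-order case, the sizes, and all variable names are assumptions, not part of the paper) checks that the multilinear transformation and its vectorized Kronecker form agree when \({\text{vec}}\left(\cdot \right)\) is taken column-major, matching the ordering \(\boldsymbol{\Phi }={\boldsymbol{\Phi }}^{\left(2\right)}\otimes {\boldsymbol{\Phi }}^{\left(1\right)}\).

```python
import numpy as np

rng = np.random.default_rng(0)
I1, I2, J1, J2 = 4, 3, 5, 6                    # illustrative mode sizes
X = rng.standard_normal((I1, I2))              # a 2nd-order tensor (matrix) for simplicity
Phi1 = rng.standard_normal((J1, I1))           # mode-1 dictionary factor
Phi2 = rng.standard_normal((J2, I2))           # mode-2 dictionary factor

# Multilinear transformation: Y = X x1 Phi1 x2 Phi2 (for matrices, Phi1 @ X @ Phi2^T)
Y = Phi1 @ X @ Phi2.T

# Vectorized form (3.1): (Phi2 kron Phi1) vec(X) = vec(Y), with column-major vec()
Phi = np.kron(Phi2, Phi1)
assert np.allclose(Phi @ X.reshape(-1, order='F'), Y.reshape(-1, order='F'))
```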

Let \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\) be a diagonal weight matrix. We could obtain a weighted linear transformation [35] of (3.1) as

$$\begin{array}{c}{\varvec{S}}\boldsymbol{\Phi }\,{\text{vec}}\left(\mathcal{X}\right)={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)\end{array}$$
(3.2)

A sparse solution of the weighted linear system in (3.2) could be obtained by solving an \({L}_{p}\) (p = 0 or p = 1) minimization problem,

$$\begin{array}{c}\stackrel{\sim }{\mathcal{X}} = \underset{\mathcal{X}}{\mathrm{arg min}}{\Vert {\varvec{S}}\boldsymbol{\Phi }{\text{vec}}\left(\mathcal{X}\right) - {\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right) \Vert }_{2}^{2}+\lambda {\Vert {\text{vec}}\left(\mathcal{X}\right)\Vert }_{p}\end{array}$$
(3.3)

where \(\uplambda \) is a regularization parameter.

If \({\varvec{S}}\) is a Kronecker matrix, then \({\varvec{S}}\boldsymbol{\Phi }=\left({{\varvec{S}}}^{\left(N\right)}{\boldsymbol{\Phi }}^{\left(N\right)}\otimes \cdots \otimes {{\varvec{S}}}^{\left(1\right)}{\boldsymbol{\Phi }}^{\left(1\right)}\right)\), and we could use T-LARS [30] to efficiently obtain a sparse solution of either the \({L}_{0}\) or the \({L}_{1}\) optimization problem in (3.3). However, \({\varvec{S}}\) is typically not Kronecker, so \({\varvec{S}}\boldsymbol{\Phi }\) would not have a Kronecker structure, and (3.3) would have to be solved as a potentially very large vectorized (one-dimensional) sparse least-squares problem, which could be very challenging in terms of memory and computational power requirements. Therefore, in this paper, we develop Weighted Tensor Least Angle Regression (WT-LARS), a computationally efficient method to solve either the \({L}_{0}\) or the \({L}_{1}\) constrained sparse weighted multilinear least-squares problem in (3.3) for an arbitrary diagonal weights matrix \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\in {\mathbb{R}}^{J\times J}\).

3.3 Weighted Tensor Least Angle Regression

In this section, we develop Weighted Tensor Least Angle Regression (WT-LARS) by extending T-LARS to solve the sparse weighted multilinear least-squares problem in (3.3), for weights \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\) and Kronecker dictionaries \(\boldsymbol{\Phi }\).

Inputs to WT-LARS are the data tensor \(\mathcal{Y}\in {\mathbb{R}}^{{J}_{1}\times \dots \times {J}_{n}\times \dots \times {J}_{N}}\), mode-n dictionary matrices \({\boldsymbol{\Phi }}^{\left(n\right)};n\in \left\{1,\cdots , N\right\}\) where \(\boldsymbol{\Phi }={\boldsymbol{\Phi }}^{\left(N\right)}\otimes \cdots \otimes {\boldsymbol{\Phi }}^{\left(1\right)}\), the diagonal weight matrix \({\varvec{W}}={{\varvec{S}}}^{H}{\varvec{S}}\), and the stopping criterion as a residual tolerance \(\varepsilon \) or the maximum number of non-zero coefficients \(K\) (K-sparse representation). The output is the solution tensor \(\mathcal{X}\in {\mathbb{R}}^{{I}_{1}\times \dots {\times I}_{n}\times \dots \times {I}_{N}}\).

WT-LARS requires the weighted data \({\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right)\) and the columns of the weighted dictionary \({\varvec{S}}\boldsymbol{\Phi }\) to have unit \({L}_{2}\) norm. The normalized weighted data could be easily calculated as \({\mathcal{Y}}_{W}={\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right)/{\Vert {\varvec{S}}{\text{vec}}\left(\mathcal{Y}\right)\Vert }_{2}\). However, the dictionary matrix \({\varvec{S}}\boldsymbol{\Phi }\) does not have a Kronecker structure; hence, normalizing the mode-n dictionary matrices \({\boldsymbol{\Phi }}^{\left(n\right)}\) does not ensure normalization of the columns of \({\varvec{S}}\boldsymbol{\Phi }\). Therefore, in WT-LARS, we use the normalized weighted dictionary matrix \({\boldsymbol{\Phi }}_{W}={\varvec{S}}\boldsymbol{\Phi }{\varvec{Q}}\) instead of the normalized dictionary matrix \(\boldsymbol{\Phi }\) used in T-LARS, where \({\varvec{Q}}\) is a diagonal matrix,

$$\begin{array}{c}{{\varvec{Q}}}_{i,i}= \frac{1}{{\Vert {\left({\varvec{S}}\boldsymbol{\Phi }\right)}_{i}\Vert }_{2}}\end{array}$$
(3.4)

where \({\left({\varvec{S}}\boldsymbol{\Phi }\right)}_{i}\) is the \({i}^{th}\) column of the weighted dictionary matrix \({\varvec{S}}\boldsymbol{\Phi }\). We can efficiently calculate the diagonal matrix \({\varvec{Q}}\) as,

$$\begin{array}{c}diag\left({\varvec{Q}}\right)=1./\sqrt{{\left({\boldsymbol{\Phi }}^{*2}\right)}^{{\varvec{T}}} diag\left({\varvec{W}}\right)}\end{array}$$
(3.5)

where \({\boldsymbol{\Phi }}^{*2}\) [36] denotes the Hadamard square of \(\boldsymbol{\Phi }\), such that \({\boldsymbol{\Phi }}_{i,j}^{*2}={\left({\boldsymbol{\Phi }}_{i,j}\right)}^{2}\), \("./"\) denotes elementwise division, and \(diag\left({\varvec{Q}}\right)\) and \(diag\left({\varvec{W}}\right)\) are the diagonal vectors of \({\varvec{Q}}\) and \({\varvec{W}}\), respectively. We could efficiently calculate \({\left({\boldsymbol{\Phi }}^{*2}\right)}^{{\varvec{T}}}diag\left({\varvec{W}}\right)\) using the full multilinear product.
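As a hedged sketch of (3.4)–(3.5) (our own illustrative code and names, not the authors' implementation), the column norms of \({\varvec{S}}\boldsymbol{\Phi }\) can be computed from \(diag\left({\varvec{W}}\right)\) and the Hadamard square of \(\boldsymbol{\Phi }\) without ever forming \({\varvec{S}}\boldsymbol{\Phi }\); since the Hadamard square of a Kronecker product is the Kronecker product of the Hadamard squares, this computation also factors mode-wise.

```python
import numpy as np

rng = np.random.default_rng(1)
J1, J2, I1, I2 = 5, 6, 7, 8                    # illustrative sizes
Phi1 = rng.standard_normal((J1, I1))
Phi2 = rng.standard_normal((J2, I2))
Phi = np.kron(Phi2, Phi1)                      # Phi = Phi(2) kron Phi(1)
w = rng.integers(0, 2, J1 * J2).astype(float)  # diag(W) = diag(S^H S), binary weights here
S = np.diag(np.sqrt(w))

# Direct column norms of the weighted dictionary S*Phi
q_direct = 1.0 / np.linalg.norm(S @ Phi, axis=0)

# Eq. (3.5): diag(Q) = 1 ./ sqrt((Phi^{*2})^T diag(W)), using the Kronecker factorization
q_eq35 = 1.0 / np.sqrt(np.kron(Phi2**2, Phi1**2).T @ w)

assert np.allclose(q_direct, q_eq35)
```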

WT-LARS solves the \({L}_{0}\) or \({L}_{1}\) constrained minimization problems in (3.3) for all critical values of the regularization parameter \(\uplambda \). WT-LARS starts with a large value of \(\uplambda \) that results in an empty active set \(I=\{\}\) and a solution \({\stackrel{\sim }{\mathcal{X}}}_{t=0}=0\). The set \(I\) denotes the active set of columns of the dictionary \({\boldsymbol{\Phi }}_{W}\), i.e., the column indices where the optimal solution \({\stackrel{\sim }{\mathcal{X}}}_{t}\) at iteration \(t\) is nonzero, and \({I}^{c}\) denotes its corresponding inactive set. Therefore, \({{\boldsymbol{\Phi }}_{W}}_{I}\) contains only the active columns of the dictionary \({\boldsymbol{\Phi }}_{W}\), and \({{\boldsymbol{\Phi }}_{W}}_{{I}^{c}}\) contains only its inactive columns.

At each iteration \(t\), a new column is added to the active set \(I\) (\({L}_{0}\)), or a column is either added to or removed from the active set \(I\) (\({L}_{1}\)), and \(\uplambda \) is reduced by a calculated value \({\delta }_{t}^{*}\).

As a result of these iterations, new solutions with an increasing number of coefficients are obtained along a piecewise linear path until a predetermined residual error \(\varepsilon \) or a predetermined number of active columns \(K\) is reached.

The regularization parameter \(\uplambda \) is initialized to the maximum magnitude of the correlation \({{\varvec{c}}}_{1}\) between the columns of \({\boldsymbol{\Phi }}_{W}\) and the initial residual \({{\varvec{r}}}_{0}={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)\),

$$\begin{array}{c}{{\varvec{c}}}_{1}= {\boldsymbol{\Phi }}_{W}^{T}{{\varvec{r}}}_{0}\end{array}$$
(3.6)

Since \({\boldsymbol{\Phi }}_{W}^{T}={\varvec{Q}}{\boldsymbol{\Phi }}^{{\varvec{T}}}{\varvec{S}}\), we can easily calculate \({\boldsymbol{\Phi }}^{{\varvec{T}}}{\varvec{S}}{{\varvec{r}}}_{0}\) using the full multilinear product as

$$\begin{array}{c}{\mathcal{C}}_{1}^{\prime}={\mathcal{R}}_{{\varvec{S}}_{0}}{\times }_{1}{{\boldsymbol{\Phi }}^{\left(1\right)}}^{T}{\times }_{2}\cdots {\times }_{N}{{\boldsymbol{\Phi }}^{\left(N\right)}}^{T}\end{array}$$
(3.7)

where \({\text{vec}}\left({\mathcal{R}}_{{{\varvec{S}}}_{0}}\right)={\varvec{S}}{{\varvec{r}}}_{0}\) and \({{\varvec{c}}}_{1}={\varvec{Q}}\,{\text{vec}}\left({\mathcal{C}}_{1}^{\prime}\right)\). The column index corresponding to the maximum-magnitude element of \({{\varvec{c}}}_{1}\) is added to the active set.
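The sketch below (our own 2-D example with illustrative names; column-major \({\text{vec}}\left(\cdot \right)\)) checks that the correlation in (3.6) can indeed be computed through the mode-n products of (3.7) followed by the rescaling with \({\varvec{Q}}\), without ever forming \({\boldsymbol{\Phi }}_{W}\).

```python
import numpy as np

rng = np.random.default_rng(2)
J1, J2, I1, I2 = 5, 6, 7, 8
Phi1, Phi2 = rng.standard_normal((J1, I1)), rng.standard_normal((J2, I2))
Y = rng.standard_normal((J1, J2))
w = rng.random(J1 * J2)                                  # diag(W)
s = np.sqrt(w)                                           # diag(S)

Phi = np.kron(Phi2, Phi1)                                # Phi = Phi(2) kron Phi(1)
q = 1.0 / np.sqrt((Phi**2).T @ w)                        # diag(Q), Eq. (3.5)
Phi_W = (s[:, None] * Phi) * q[None, :]                  # Phi_W = S Phi Q (reference only)

# Reference: c1 = Phi_W^T r0, with r0 = S vec(Y)          (Eq. 3.6)
r0 = s * Y.reshape(-1, order='F')
c1_direct = Phi_W.T @ r0

# Tensor route (Eq. 3.7): vec(R_S0) = S r0, C1' = R_S0 x1 Phi1^T x2 Phi2^T, c1 = Q vec(C1')
R_S0 = (s * r0).reshape((J1, J2), order='F')
C1p = Phi1.T @ R_S0 @ Phi2
c1_tensor = q * C1p.reshape(-1, order='F')

assert np.allclose(c1_direct, c1_tensor)
```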

For a given active set \(I\), the optimal solution \({\stackrel{\sim }{\mathcal{X}}}_{t}\) at any iteration \(t\) could be written as
$$\begin{array}{c}{\text{vec}}\left({\stackrel{\sim }{\mathcal{X}}}_{t}\right)=\left\{\begin{array}{ll}{\left({{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}\right)}^{-1}\left({{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)-{\lambda }_{t}{{\varvec{z}}}_{t}\right), & {\text{on}}\;I\\ 0, & {\text{otherwise}}\end{array}\right.\end{array}$$
(3.8)

where \({{\varvec{z}}}_{t}\) is the sign sequence of \({{\varvec{c}}}_{t}\) on the active set \(I\), and \({{\varvec{c}}}_{t}={\boldsymbol{\Phi }}_{W}^{T}{{\varvec{r}}}_{t-1}\) is the correlation vector of all columns of the dictionary \({\boldsymbol{\Phi }}_{W}\) with the residual \({{\varvec{r}}}_{t-1}\) at iteration \(t\).

The optimal solution at any iteration \(t\) must satisfy the following two optimality conditions,

$$\begin{array}{c}{{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{{\varvec{r}}}_{t} = -{\lambda }_{t}{{\varvec{z}}}_{t}\end{array}$$
(3.9)
$$\begin{array}{c}{\Vert {{\boldsymbol{\Phi }}_{W}}_{{I}_{t}^{c}}^{T}{{\varvec{r}}}_{t} \Vert }_{\infty } \le {\lambda }_{t} \end{array}$$
(3.10)

where \({{\varvec{r}}}_{t}={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)-{\boldsymbol{\Phi }}_{W}{\text{vec}}\left({\stackrel{\sim }{\mathcal{X}}}_{t}\right)\) is the residual at iteration \(t\), and \({{\varvec{z}}}_{t}\) is the sign sequence of the correlation \({{\varvec{c}}}_{t}\) at iteration \(t\) on the active set \(I\). The condition in (3.9) ensures that the magnitude of the correlation between all active columns of \({\boldsymbol{\Phi }}_{W}\) and the residual is equal to \(\left|{\lambda }_{t}\right|\) at each iteration, and the condition in (3.10) ensures that the magnitude of the correlation between the inactive columns of \({\boldsymbol{\Phi }}_{W}\) and the residual is less than or equal to \(\left|{\lambda }_{t}\right|\).

At each iteration \(t\), \({\lambda }_{t}\) is reduced by a small step size \({\delta }_{t}^{*}\) until the condition in either (3.9) or (3.10) is violated. For both the \({L}_{0}\) and the \({L}_{1}\) constrained minimization problems, if an inactive column violates condition (3.10), it is added to the active set, and for the \({L}_{1}\) constrained minimization problem, if an active column violates condition (3.9), it is removed from the active set.

As \(\uplambda \) is reduced by \({\delta }_{t}^{*}\), the solution \({\stackrel{\sim }{\mathcal{X}}}_{t}\) changes by \({\delta }_{t}^{*}{{\varvec{d}}}_{t}\) along a direction \({{\varvec{d}}}_{t}\), where \({{\varvec{d}}}_{{I}_{t}^{c}}=0\) and \({{\varvec{d}}}_{{I}_{t}}={{\varvec{G}}}_{t}^{-1}{{\varvec{z}}}_{t}\), and \({{\varvec{G}}}_{t}^{-1}\) is the inverse of the Gram matrix of the active columns of the dictionary \({{\varvec{G}}}_{t}={{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}^{T}{{\boldsymbol{\Phi }}_{W}}_{{I}_{t}}\).

The size of this Gram matrix would either increase (dictionary column addition) or decrease (dictionary column removal) with each iteration \(t\). Therefore, for computational efficiency, we use the Schur complement inversion formula to calculate \({{\varvec{G}}}_{t}^{-1}\) from \({{\varvec{G}}}_{t-1}^{-1}\) thereby avoiding its full calculation [30, 37].
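A minimal sketch of this update (our own illustrative code; `gram_inverse_add` and all names are assumptions): when a column is appended to the active dictionary, the new Gram inverse is assembled from the previous one via the Schur complement in \(O\left({k}^{2}\right)\) operations instead of a fresh \(O\left({k}^{3}\right)\) inversion; column removal admits an analogous downdate.

```python
import numpy as np

def gram_inverse_add(G_inv, A_active, a_new):
    """Update (A^T A)^{-1} when column a_new is appended to A_active (Schur complement)."""
    b = A_active.T @ a_new                        # cross-correlations with the active columns
    c = float(a_new @ a_new)                      # squared norm of the new column
    G_inv_b = G_inv @ b
    s = c - b @ G_inv_b                           # Schur complement (a scalar)
    k = G_inv.shape[0]
    G_new = np.empty((k + 1, k + 1))
    G_new[:k, :k] = G_inv + np.outer(G_inv_b, G_inv_b) / s
    G_new[:k, k] = G_new[k, :k] = -G_inv_b / s
    G_new[k, k] = 1.0 / s
    return G_new

# Quick check against a direct inversion on random data
rng = np.random.default_rng(3)
A, a = rng.standard_normal((50, 5)), rng.standard_normal(50)
A_plus = np.column_stack([A, a])
assert np.allclose(gram_inverse_add(np.linalg.inv(A.T @ A), A, a),
                   np.linalg.inv(A_plus.T @ A_plus))
```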

For the \({L}_{1}\) constrained sparse least-squares problem, the smallest step size \({\delta }_{t}^{*}=\mathrm{min}\left\{{\delta }_{t}^{+},{\delta }_{t}^{-}\right\}\) is the minimum of \({\delta }_{t}^{+}\), the smallest step size for adding a column, and \({\delta }_{t}^{-}\), the smallest step size for removing a column. The minimum step size for removing a column from the active set is given by,

$$\begin{array}{c}{\delta }_{t}^{-}=\underset{i\in I}{{\text{min}}}\left\{-\frac{{{\varvec{x}}}_{t-1}\left(i\right)}{{{\varvec{d}}}_{t}\left(i\right)}\right\}\end{array}$$
(3.11)

The minimum step size for adding a new column to the active set is given by,

$$\begin{array}{c}{\delta }_{t}^{+}=\underset{i\in {I}^{c}}{{\text{min}}}\left\{\frac{{\lambda }_{t}-{{\varvec{c}}}_{t}\left(i\right)}{1-{{\varvec{v}}}_{t}\left(i\right)},\frac{{\lambda }_{t}+{{\varvec{c}}}_{t}\left(i\right)}{1+{{\varvec{v}}}_{t}\left(i\right)}\right\}\end{array}$$
(3.12)

where

$$\begin{array}{c}{{\varvec{v}}}_{t}= {\boldsymbol{\Phi }}_{W}^{T}{\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}\end{array}$$
(3.13)

Since \({\boldsymbol{\Phi }}_{W}={\varvec{S}}\boldsymbol{\Phi }{\varvec{Q}}\), we can efficiently calculate \({{\varvec{v}}}_{t}\) using two full multilinear products.

Let \({{\varvec{v}}}_{t}={\varvec{Q}}\,{\text{vec}}\left({\mathcal{V}}_{t}^{\prime}\right)\), where

$$\begin{array}{c}{\mathcal{V}}_{t}^{\prime}={\mathcal{U}}_{{w}_{t}}{\times }_{1}{{\boldsymbol{\Phi }}^{\left(1\right)}}^{T}{\times }_{2}\cdots {\times }_{N}{{\boldsymbol{\Phi }}^{\left(N\right)}}^{T}\end{array}$$
(3.14)

and \({\text{vec}}\left({\mathcal{U}}_{{w}_{t}}\right)={\varvec{W}}\,{\text{vec}}\left({\mathcal{D}}_{t}^{\prime}{\times }_{1}{\boldsymbol{\Phi }}^{\left(1\right)}{\times }_{2}\cdots {\times }_{N}{\boldsymbol{\Phi }}^{\left(N\right)}\right)\), with \({\text{vec}}\left({\mathcal{D}}_{t}^{\prime}\right)={\varvec{Q}}{{\varvec{d}}}_{t}\).
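The following small 2-D sketch (our own illustrative code and names; column-major \({\text{vec}}\left(\cdot \right)\)) verifies (3.13)–(3.14): \({{\varvec{v}}}_{t}\) is obtained by one multilinear product with the mode-n dictionaries, an elementwise weighting by \(diag\left({\varvec{W}}\right)\), a second multilinear product with the transposed dictionaries, and a final rescaling by \({\varvec{Q}}\), without forming \({\boldsymbol{\Phi }}_{W}\).

```python
import numpy as np

rng = np.random.default_rng(4)
J1, J2, I1, I2 = 5, 6, 7, 8
Phi1, Phi2 = rng.standard_normal((J1, I1)), rng.standard_normal((J2, I2))
w = rng.random(J1 * J2)                                  # diag(W)
s = np.sqrt(w)

Phi = np.kron(Phi2, Phi1)
q = 1.0 / np.sqrt((Phi**2).T @ w)                        # diag(Q), Eq. (3.5)
Phi_W = (s[:, None] * Phi) * q[None, :]                  # Phi_W = S Phi Q (reference only)

d = rng.standard_normal(I1 * I2)                         # a direction d_t
v_direct = Phi_W.T @ (Phi_W @ d)                         # Eq. (3.13)

# Tensor route: D' = reshape(Q d); U_w = W .* (D' x1 Phi1 x2 Phi2); V' = U_w x1 Phi1^T x2 Phi2^T
Dp = (q * d).reshape((I1, I2), order='F')
Uw = w.reshape((J1, J2), order='F') * (Phi1 @ Dp @ Phi2.T)
Vp = Phi1.T @ Uw @ Phi2
v_tensor = q * Vp.reshape(-1, order='F')                 # v_t = Q vec(V'), Eq. (3.14)

assert np.allclose(v_direct, v_tensor)
```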

The residual \({{\varvec{r}}}_{t+1}\) is calculated at the end of each iteration using,

$$\begin{array}{c}{{\varvec{r}}}_{t+1}={{\varvec{r}}}_{t}-{\delta }_{t}^{*}\,{\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}\end{array}$$
(3.15)

where we can efficiently calculate \({\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}\) using

$$\begin{array}{c}{\boldsymbol{\Phi }}_{W}{{\varvec{d}}}_{t}={\varvec{S}}\,{\text{vec}}\left({\mathcal{D}}_{t}^{\prime}{\times }_{1}{\boldsymbol{\Phi }}^{\left(1\right)}{\times }_{2}\cdots {\times }_{N}{\boldsymbol{\Phi }}^{\left(N\right)}\right)\end{array}$$
(3.16)

WT-LARS stops when a predetermined residual error \({\Vert {{\varvec{r}}}_{t+1}\Vert }_{2}\le \varepsilon \) is reached or when a predetermined number of active columns \(K\) is obtained.

3.3.1 Weighted Tensor Least Angle Regression Algorithm

Algorithm 1 (WT-LARS) summarizes the above steps: initialize \({\varvec{S}}=\sqrt{{\varvec{W}}}\), the residual \({{\varvec{r}}}_{0}={\varvec{S}}\,{\text{vec}}\left(\mathcal{Y}\right)\), the solution \({\mathcal{X}}_{0}=0\), and the active set \(I=\{\,\}\); iterate the column addition/removal steps described above until the stopping criterion is met; then return \(I\) and \(\mathcal{X}\).
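For readability, the following is a naive, purely illustrative Python sketch of the WT-LARS iterations above (column additions only, i.e., the \({L}_{0}\) path, stopping at \(K\) active columns). It is not the authors' MATLAB/GPU implementation: it forms \({\boldsymbol{\Phi }}_{W}\) explicitly and re-inverts the Gram matrix at every iteration, whereas the efficient algorithm replaces every product with \({\boldsymbol{\Phi }}_{W}\) by the multilinear products of (3.7), (3.14) and (3.16) and updates the Gram inverse with the Schur complement. All function and variable names are our own.

```python
import numpy as np

def wt_lars_l0(Phi, y, w, K):
    """Illustrative WT-LARS sketch: explicit Phi_W = S*Phi*Q, column additions only, K-sparse stop."""
    s = np.sqrt(w)
    q = 1.0 / np.sqrt((Phi ** 2).T @ w)             # diag(Q), Eq. (3.5)
    A = (s[:, None] * Phi) * q[None, :]             # Phi_W, formed explicitly here
    b = s * y                                       # weighted data S vec(Y)

    n = Phi.shape[1]
    x = np.zeros(n)                                 # solution in the Q-scaled coordinates
    I = []                                          # active set
    c = A.T @ b                                     # initial correlations, Eq. (3.6)
    lam = np.max(np.abs(c))

    for _ in range(K):
        masked = np.abs(c).copy()
        if I:
            masked[I] = -np.inf
        I.append(int(np.argmax(masked)))            # inactive column whose correlation reached lambda

        z = np.sign(c[I])                           # sign sequence z_t
        G_inv = np.linalg.inv(A[:, I].T @ A[:, I])  # use the Schur-complement update in practice
        d = np.zeros(n)
        d[I] = G_inv @ z                            # direction d_t
        v = A.T @ (A @ d)                           # Eq. (3.13)

        Ic = np.setdiff1d(np.arange(n), I)          # inactive set
        cand = np.concatenate([(lam - c[Ic]) / (1.0 - v[Ic]),
                               (lam + c[Ic]) / (1.0 + v[Ic])]) if Ic.size else np.array([])
        cand = cand[cand > 1e-12]                   # only positive steps are admissible
        delta = min(cand.min() if cand.size else lam, lam)   # step size, Eq. (3.12)

        x += delta * d                              # move along the piecewise linear path
        lam -= delta
        c = A.T @ (b - A @ x)                       # correlations with the new residual, Eq. (3.15)

    return q * x, I                                 # vec(X) = Q x undoes the column normalization
```

Under these assumptions, a call such as `wt_lars_l0(np.kron(Phi2, Phi1), Y.reshape(-1, order='F'), w, K)` returns the vectorized solution and the active set; the \({L}_{1}\) variant additionally removes a column, using the step size in (3.11), whenever an active coefficient would change sign.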

3.4 Experimental Results

In this section, we present experimental results for WT-LARS on a tensor completion problem [38,39,40], using inpainting as an example. Image inpainting has progressed significantly during the last few years, specifically using machine learning methods [41,42,43]. However, to the best of our knowledge, no other tensor-based method is available for solving the image inpainting problem as a weighted tensor least-squares problem.

For the experiments shown in Figs. 3.1 and 3.2, we obtained fenced images from the image datasets for the MSBP deformable lattice detection algorithm [44], and for the experiment shown in Fig. 3.3, we obtained a landscape image from the DIV2K dataset [45].

Fig. 3.1

a Original image with a fence b weights image with zero weights for the fence c WT-LARS reconstructed image (Fence Removed)

Fig. 3.2

a Original image with a fence b weights image with zero weights for the fence c WT-LARS reconstructed image (Fence Removed)

Our experimental results were obtained using a MATLAB (R2017b) implementation of WT-LARS on an MS-Windows machine with two Intel Xeon E5-2637 v4 CPUs (3.5 GHz), 32 GB RAM, and an NVIDIA Tesla P100 GPU with 12 GB memory.

3.4.1 Inpainting

In this experiment, we use WT-LARS for inpainting. We obtained a sparse representation of each incomplete image using WT-LARS after applying zero weights to the missing data.

In our experimental results shown in Figs. 3.1 and 3.2, we obtained fenceless images by considering the pixels behind the fences as missing data. Figures 3.1a and 3.2a show the original images with fences, and Figs. 3.1b and 3.2b show the respective masks applied to each pixel of the original images, where black indicates zero and white indicates one. Figures 3.1c and 3.2c show the sparse representations, obtained using WT-LARS, of the images behind the fences.

We obtained RGB image patches of \(200\times 200\times 3\) pixels from the original images in Figs. 3.1a and 3.2a. For each patch, we obtained a weighted K-sparse representation using WT-LARS, with 10% nonzero coefficients, for three fixed mode-n overcomplete dictionaries, \({\boldsymbol{\Phi }}^{\left(1\right)}\in {\mathbb{R}}^{200\times 400}\), \({\boldsymbol{\Phi }}^{\left(2\right)}\in {\mathbb{R}}^{200\times 400}\) and \({\boldsymbol{\Phi }}^{\left(3\right)}\in {\mathbb{R}}^{3\times 4}\), by solving an \({L}_{1}\) constrained sparse weighted least-squares problem. The weights consist of zeros for the pixels belonging to the fence in the original images and ones everywhere else. The fixed mode-n overcomplete dictionaries used were each a union of a Discrete Cosine Transform (DCT) dictionary and a Symlet wavelet packet dictionary with four vanishing moments. In the experimental results shown in Figs. 3.1 and 3.2, the RGB patches with the fewest nonzero samples had 79,834 and 92,748 nonzero samples, respectively. We collected 60 image patches from the image in Fig. 3.1a and 35 image patches from the image in Fig. 3.2a, and on average WT-LARS took 476 s to obtain 12,000 (10% of \(200\times 200\times 3\)) non-zero coefficients for each image patch.

In the experimental results shown in Fig. 3.3, we use WT-LARS to reconstruct the landscape image in Fig. 3.3a, which is occluded by a person. Figure 3.3b shows the weights, and Fig. 3.3c shows the inpainting result after removing the person from the foreground of the landscape image.

Fig. 3.3

a Original image with a person b weights image with zero weights for the person c WT-LARS reconstructed image (Person Removed)

The RGB image in Fig. 3.3a is a scaled version of the original image, with \(200\times 300\times 3\) pixels. We obtained a weighted K-sparse representation of the scaled image in Fig. 3.3a using WT-LARS, with 20% non-zero coefficients, for three fixed mode-n overcomplete dictionaries, \({\boldsymbol{\Phi }}^{\left(1\right)}\in {\mathbb{R}}^{200\times 400}\), \({\boldsymbol{\Phi }}^{\left(2\right)}\in {\mathbb{R}}^{300\times 604}\) and \({\boldsymbol{\Phi }}^{\left(3\right)}\in {\mathbb{R}}^{3\times 4}\), by solving a weighted \({L}_{1}\) constrained sparse least-squares problem. The weights consist of zeros for the pixels belonging to the person in the original image and ones everywhere else. The fixed mode-n overcomplete dictionaries used were each a union of a Discrete Cosine Transform (DCT) dictionary and a Symlet wavelet packet dictionary with four vanishing moments. In the experimental results shown in Fig. 3.3, a total of 170,829 nonzero samples were used to obtain the sparse signal representation of the landscape image. WT-LARS took 20,625 s to obtain 36,000 non-zero coefficients, which is 20% of the size of the image tensor in Fig. 3.3a. The inpainting results in Figs. 3.1c, 3.2c and 3.3c clearly show that WT-LARS can be successfully used to approximate missing/incomplete data.

3.5 Conclusions

Sparse weighted multilinear least-squares is a generalization of the sparse multilinear least-squares problem, where both sides of the Kronecker LS system are multiplied by an arbitrary diagonal weights matrix. These arbitrary weights would result in a potentially very large non-Kronecker least-squares problem that could be impractical to solve as it would require significant memory and computational power.

This paper extended the T-LARS algorithm, earlier developed by the authors [30], to the Weighted Tensor Least Angle Regression (WT-LARS) algorithm, which could efficiently solve either L0 or L1 constrained weighted multilinear least-squares problems with arbitrary diagonal weights for all critical values of their regularization parameter. To validate our new WT-LARS algorithm, we used it to solve three image inpainting problems. In our experimental results shown in Figs. 3.1 and 3.2, we obtained sparse signal representations of RGB images behind fences after applying zero weights to the pixels representing the fences. In the experimental result shown in Fig. 3.3, we successfully obtained a sparse signal representation of an RGB landscape image occluded by a person by applying zero weights to the pixels representing this person. These results demonstrate the validity and usefulness of our new Weighted Tensor Least Angle Regression (WT-LARS) algorithm.

Possible future applications of WT-LARS include efficiently solving weighted least-squares problems for tensor signals, for example, tensor completion, image/video inpainting, image/video smoothing, and tensor signal restoration.

A MATLAB GPU-based implementation of our Weighted Tensor Least Angle Regression (WT-LARS) algorithm, Algorithm 1, is available at https://github.com/SSSherif/Weighted-Tensor-Least-Angle-Regression.