1 Introduction

Image compression systems have traditionally been based on transform coding, such as the Discrete Cosine Transform (DCT) [1] used in JPEG [32] and the Discrete Wavelet Transform (DWT) [25] used in JPEG2000 [16, 26]. The key idea behind these conventional compression methods is to extract, retain, and encode only a small number of high-energy transform coefficients while discarding the remaining ones. The major disadvantage of conventional sampling followed by compression is that the sampling and acquisition stages must handle a huge amount of data, requiring efficient sensors and large storage capacity, which increases measurement cost and acquisition time. Instead of compressing data after sampling, CS samples the data in an already compressed form [9, 17]. CS is a signal sampling paradigm that allows signals to be recovered from only a few samples. This approach goes beyond the Nyquist sampling theory, in which exact signal recovery requires a sampling rate larger than twice the highest frequency present in the signal. In CS, the minimum number of samples required to reconstruct a signal depends on the sparsity of the signal rather than its bandwidth. Figure 1 shows the difference between conventional and compressed sensing based compression frameworks. Previous work has shown that exact reconstruction from a few non-adaptive linear measurements is possible if the signal of interest is inherently sparse (few non-zero samples) in some basis. CS has gained popularity in a number of applications including face recognition [33], medical magnetic resonance imaging (MRI) [24], radar imaging [37], astronomy [6], seismic imaging [21] and video forensic analysis [31], among others.

Fig. 1

Conventional compression versus compressed sensing

In this paper, we present a novel image compression algorithm using CS. Our main focus is to develop a sampling scheme based on an optimal representation of the image in the transform domain. In this work, sparsity for CS is achieved using the wavelet transform: sparse vectors are created by picking the wavelet coefficients across all subbands in a specific structured manner. In conventional CS, the measurement matrix is fixed and does not depend on the input signal. Here, we propose an image-dependent measurement matrix in which more weight is assigned to the low-frequency (high-energy) components. In contrast to the traditional block-based DCT compression algorithms proposed in earlier work [19, 20, 35], the wavelet transform is computed for the whole image at once. The proposed technique also yields a smaller measurement matrix and shorter sparse vectors, whose sizes are controlled by the number of wavelet decomposition levels, and it achieves improved reconstruction quality compared to other CS-based image compression methods. CS-based image compression also results in a simple encoder, as the computational load shifts to the decoder side. In summary, the main contributions of the paper are: an optimal rearrangement of the wavelet coefficients to form sparse vectors for CS, a normalized Gaussian random measurement matrix based on the energies of the wavelet coefficients, and finally a new robust CS-based compression algorithm using this new type of sparse vectors.

The remainder of the paper is organized as follows. Section 2 reviews work related to compression using CS. Section 3 provides a brief introduction to compressed sensing using DCT and DWT. The proposed multi-wavelet based compressed sensing technique is then described in Section 4. Our experimental results are shown in Section 5, and finally, the paper is concluded in Section 6.

2 Related work

The concept of CS in image compression applications has been discussed by several researchers during the past few years. According to compressed sensing theory, for exact recovery from few measurements, the signal of interest should be sparse, either directly or in some transform domain (e.g., DCT, DWT). Moreover, under the compressed sensing framework the input signal is one-dimensional, and the simplest way to apply CS to a two-dimensional image is to reshape it into one long one-dimensional vector. For such a vector, however, a large random measurement matrix is needed, which increases the computational cost and memory requirements. For two-dimensional images, researchers have addressed this problem in a number of ways. In [19], Lu Gan initially proposed a block-based compression algorithm for natural images based on CS. The image was first divided into non-overlapping blocks, and sampling was then performed for each transformed block with the same measurement matrix. The reconstruction was based on a variant of the Projected Landweber (PL) algorithm with hard thresholding and, to reduce the blocking artifacts, iterative Wiener filtering was used. This approach was further extended using directional transforms [27]. The advantage of block-based compressed sensing over conventional CS is the reduced dimension of the measurement matrix as well as the reduced length of the sparse vectors, and hence better computational performance. The main problem with these schemes is that the statistical structure of the image data is not fully exploited.

In [35], Yang et al. proposed a perceptual sampling process based on block DCT for images using a weighted Gaussian sampling matrix. The rationale is that human eyes are more sensitive to low-frequency (high-energy) components than to high-frequency (low-energy) components in an image. The weighting coefficients are obtained from the inverse entries of the JPEG quantization table (scaled to a suitable range). The weighted sampling matrix effectively increased the quality of the reconstructed image, but since the sampling matrix is not the same for all blocks, the scheme was more complex than the algorithms in [19, 27]. Yang et al., in [36], proposed a reweighted sampling scheme using CS. Instead of modifying the sampling matrix, the sampled coefficients were modified by weights calculated from the statistical properties of the image, resulting in a significant increase in performance compared to iterative reweighted reconstruction [14]. In [20], Gao et al. proposed a CS sampling scheme based on random permutations of the coefficients in block-based DCT, together with an adaptive sampling matrix based on the energy contributions of the different frequency components of the DCT coefficients in each block. The main problem with this scheme is that the permutation order differs from vector to vector. All of the above-mentioned approaches rely on sparse representations of images using block-based DCT. The problem with block-DCT-based CS compression is that blocking artifacts appear in the reconstructed image, reducing the overall perceptual quality.

To reduce the unpleasant blocking artifacts, a number of CS-based compression methods were introduced using the wavelet transform. Sevak et al., in [30], used the Haar wavelet transform and applied compressed sensing to medical CT images, with an ℓ1-minimization framework used in the reconstruction stage. The problem with this method is that the input vectors are very large, and the method was not tested on natural images. Similarly, Chen et al., in [15], applied compressed sensing to MRI images using wavelet tree sparsity and produced promising results, but they also did not test their algorithm on natural images. In [23], Lustig et al. used compressed sensing for MRI images without any sparsifying transformation, exploiting the fact that MRI images are inherently sparse.

In [34], Yang et al. performed compressed sensing on natural images using a multilevel wavelet transform with Orthogonal Matching Pursuit (OMP) as the reconstruction algorithm, but the reported results were not promising compared to other methods in the literature. Bi et al., in [5], created small sparse vectors by taking the wavelet transform of each contourlet directional subband, reducing both the length of the sparse vectors and the size of the random measurement matrix. The scheme itself, however, was computationally expensive due to the contourlet transform and the numerous wavelet transforms required for each contourlet subband. In [18], Du et al. proposed an algorithm based on compressed sensing in which, for sparse representation, the Wavelet-Based Contourlet Transform (WBCT) was applied to non-overlapping blocks of different dimensions. The reconstruction was performed using Iterative Hard Thresholding along with Wiener filtering to reduce the blocking artifacts introduced by the block-based approach. The schemes in [5, 18] addressed the problem of large random measurement matrices and long sparse vectors, but they were computationally expensive, due to the wavelet-based contourlet transform, compared to the discrete wavelet transform, without a major quality improvement. Cen et al., in [13], proposed a CS-based image compression algorithm using a single-level wavelet transform. The lowpass wavelet coefficients were preserved, and reduced measurements were taken from the detail coefficients of each subband separately. The reconstruction was performed using the OMP algorithm followed by the inverse wavelet transform. This was further improved by Zhang et al. in [38], in which CS was applied to the sparse vectors extracted from the detail subbands column by column, the same process was repeated for the rows of each subband, and the two reconstructed images were then averaged. In [22], Kalra et al. also proposed CS-based image compression using the wavelet transform. The sparse vectors were created from the wavelet image using the parent-child relationship, and the reduced measurements were obtained using codebooks based on vector quantization. The performance of the algorithm was enhanced by a proper choice of the number of measurements, the wavelet function, and the dictionary matrix. The main problem with this approach is that coefficients belonging to edge regions result in non-sparse vectors (as parent and child coefficients both have high values).

In this paper, we introduce a new framework for image compression based on CS. The sparsity is obtained using a simple multilevel wavelet transform. Similar to the work in [20], we use an image-dependent measurement matrix based on the energies of the wavelet coefficients. The wavelet transform decomposes the image into different subbands, i.e. an approximation band (the low-pass filtered image) and several detail bands. The approximation wavelet coefficients are non-sparse, while the detail coefficients are nearly sparse. To use the CS framework efficiently, we propose to form the sparse vectors using a novel approach based on dividing the transformed image into non-overlapping blocks equal in size to the approximation band, then picking the corresponding coefficients (in order) from each block. The advantage of the proposed scheme is that it avoids the blocking artifacts which are inherent in DCT-based schemes and, more importantly, we show that a better overall reconstruction quality can be achieved. The proposed algorithm is also efficient in terms of the size of the measurement matrix and the length of the sparse vectors, as they are controlled only by the number of decomposition levels, without any extra transformation. In addition, we show that sparsity is guaranteed, as only one coefficient from the lowpass subband is used as the first element (large value) of each sparse vector, while the remaining elements are taken from the detail bands (usually very small values) using the block-based approach. Note that, to be consistent with the existing CS-based image compression techniques, we focus here only on the transformation of the coefficients, without discussing the other stages of a compression system, such as quantization and entropy encoding.

3 Preliminaries

3.1 Overview of compressive sensing

CS has attracted a lot of attention in various fields over the past few years. It was introduced by Candes, Romberg, and Tao [11] and Donoho [17]. CS makes it possible to recover a sparse signal from only a few samples, in contrast to Nyquist sampling theory, where the signal is sampled at a rate larger than twice the highest frequency present in the signal. An overview of compressed sensing theory can be found in [2, 3, 10, 29].

Consider x to be the actual signal of dimension N × 1; \(\boldsymbol{x} \in \mathbb{R}^{N}\). To obtain M non-adaptive linear measurements from x, we multiply x by a fat (wide) matrix Φ. This sampling process is represented as:

$$ \boldsymbol{y} = \mathbf{\Phi} \boldsymbol{x} $$
(1)

Here, Φ is called the sampling (or measurement) matrix, with dimensions M × N, and y is the compressed measurement vector of dimension M × 1 (see Fig. 2), where \(M \ll N\).
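As an illustration, the sampling step in (1) is a single matrix-vector product. A minimal numpy sketch (the signal length N and measurement count M below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 256, 64                       # signal length and number of measurements, M << N
x = rng.standard_normal(N)           # a toy signal x in R^N

# i.i.d. Gaussian measurement matrix; scaling by 1/sqrt(M) keeps column norms near 1.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

y = Phi @ x                          # compressed measurements, Eq. (1)
assert y.shape == (M,)
```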

Fig. 2

Structure of compressive sensing matrices

According to the conventional CS theory [9, 11, 12], for exact recovery from few measurements, x should be sparse, either directly or in a certain transform domain (e.g. DCT, wavelet, curvelet). If we consider s to be the sparse representation of a non-sparse signal x in the transform domain Ψ, then the signal x can be expressed as:

$$\boldsymbol{x} = \boldsymbol{\Psi} \boldsymbol{s} = {\sum\limits_{i=1}^{N}} \Psi_{i}s_{i}$$
(2)

where \(\boldsymbol{s} = \{s_{1}, s_{2}, \cdots, s_{N}\}\) is the N × 1 coefficient vector, such that \(s_{i} = \langle \boldsymbol{x}, \Psi_{i} \rangle\), and \(\boldsymbol{\Psi}_{N \times N} = \{\Psi_{1}, \Psi_{2}, \cdots, \Psi_{N}\}\), where \(\Psi_{i}\) is the i th column vector of the basis matrix Ψ.

The overall sampling process becomes:

$$ \mathit{\boldsymbol{y}} = \mathbf{\Phi} \mathbf{\Psi} \boldsymbol{s} = \mathbf{A} \boldsymbol{s} $$
(3)

where A is the M × N sensing matrix, Ψ is the transformation matrix (also called the sparsifying matrix), and s is K-sparse (i.e. it has K non-zero entries); in this case, a number of measurements on the order of K suffices for the exact reconstruction of x.
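To make the factorization in (3) concrete, the following sketch builds an orthonormal DCT basis as a sparsifying matrix Ψ, draws a K-sparse coefficient vector s, and verifies that sampling x directly with Φ equals sampling s through A = ΦΨ (all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 128, 40, 5

# Orthonormal DCT-II analysis matrix D (rows are cosine basis functions).
n = np.arange(N)
D = np.cos(np.pi * np.outer(n, n + 0.5) / N) * np.sqrt(2.0 / N)
D[0, :] = np.sqrt(1.0 / N)
Psi = D.T                            # synthesis (sparsifying) basis: x = Psi @ s

# Build a K-sparse coefficient vector s and the corresponding signal x, Eq. (2).
s = np.zeros(N)
s[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x = Psi @ s

Phi = rng.standard_normal((M, N)) / np.sqrt(M)
A = Phi @ Psi                        # sensing matrix of Eq. (3)

# Sampling x directly or sampling s through A gives the same measurements.
assert np.allclose(Phi @ x, A @ s)
```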

The recovery of s in (3) involves searching for the sparsest solution \(\boldsymbol {\hat {s}}\) (i.e. a solution with the fewest non-zero entries, or lowest ℓ0-norm) to an under-determined system of linear equations (\(M \ll N\)). The recovery using ℓ0-minimization is formulated as:

$$ \boldsymbol{\hat{s}} = \arg\min\limits_{\boldsymbol{\hat s}} \left\|\boldsymbol{\hat{s}}\right\|_{\ell{_{0}}} \textrm{ subject to:} \boldsymbol{y} =\mathbf{A} \boldsymbol{\hat s} $$
(4)

where \(\|\boldsymbol {s}\|_{\ell_{0}}\) is the number of non-zero entries in the vector s. The above reconstruction problem is non-deterministic polynomial-time hard (NP-hard). Due to the non-convex nature of ℓ0-minimization, the problem is ill-conditioned and difficult to solve, as it requires an exhaustive search for the sparsest solution (i.e. a combinatorial problem) [8]. However, the unique sparse solution can be found exactly using ℓ1-minimization [12]. The reconstruction problem can thus be reformulated as a convex problem and solved using linear programming [9]:

$$ \boldsymbol{\hat{s}} = \arg\min\limits_{\boldsymbol{\hat s}} ||\boldsymbol{\hat{s}}||_{\ell{_{1}}} \text{ subject to:} \boldsymbol{y} =\mathbf{A} \boldsymbol{\hat s} $$
(5)

where

$$ ||\boldsymbol{{s}}||_{\ell{_{1}}} = \sum{|s_{i}|} $$
(6)

In addition to sparsity, the minimum acceptable number of samples for reconstruction also depends on the incoherence between Φ and Ψ, i.e. the rows of Φ should have low correlation with the columns of Ψ. To achieve this, A should satisfy the Restricted Isometry Property (RIP) [7]. The sensing matrix A is (K, δ)-RIP if, for every K-sparse vector s, the following condition is satisfied:

$$ (1-\delta)\|\boldsymbol{s}\|_{2}^{2}\leq\|\mathbf{A} \boldsymbol{s}\|_{2}^{2}\leq(1+\delta)\|\boldsymbol{s}\|_{2}^{2}. $$
(7)

for some δ ∈ (0, 1). The main point here is that the RIP is sufficient to guarantee sparse reconstruction by ℓ1-minimization. The signal information is preserved for \(M=O\left (K\log \frac {N}{K}\right )\) random measurements.

The RIP condition holds, with high probability, if Φ (the measurement matrix) consists of randomly generated independent and identically distributed (i.i.d.) Gaussian entries and Ψ (the sparsifying matrix) is the identity matrix. Several algorithms have been proposed to implement the optimization above [9].
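Besides the linear-programming route of (5), greedy algorithms such as Orthogonal Matching Pursuit (OMP, used by several of the methods reviewed in Section 2) are popular in practice. The following is a compact, illustrative numpy sketch of OMP, not the NESTA solver used in our experiments; the sizes and the synthetic sparse vector are arbitrary:

```python
import numpy as np

def omp(A, y, K, tol=1e-10):
    """Greedy Orthogonal Matching Pursuit: estimate a K-sparse s from y = A s."""
    M, N = A.shape
    residual = y.copy()
    support = []
    for _ in range(K):
        # Select the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit of y on the columns selected so far.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    s_hat = np.zeros(N)
    s_hat[support] = coef
    return s_hat

rng = np.random.default_rng(2)
N, M, K = 128, 80, 4
A = rng.standard_normal((M, N)) / np.sqrt(M)

# K-sparse test vector with well-separated non-zero magnitudes.
s = np.zeros(N)
idx = rng.choice(N, K, replace=False)
s[idx] = rng.uniform(1.0, 2.0, K) * rng.choice([-1.0, 1.0], K)

y = A @ s
s_hat = omp(A, y, K)
```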

Finally, it is worth noting that the efficient implementation of CS algorithms has been a very active area of research. To reduce the complexity of the optimization algorithms, a number of authors argued that the implementation of CS in the transform domain brings with it several advantages. For completeness, and since the paper is about compression, we present in the following a brief discussion of the two most popular transforms used in image compression, mainly the DCT and the DWT.

3.2 Discrete cosine transform

The Discrete Cosine Transform (DCT) is widely used in diverse image processing applications. The transform is able to represent most of the visually significant information in a typical image with a limited number of DCT coefficients, and it is the building block of the JPEG compression standard. It is similar to the Fourier transform, but uses only real-valued cosine basis functions.

The DCT is mainly used for energy compaction and inter-pixel redundancy reduction [28]. It assumes that the low frequency components carry most of the signal energy compared to the high-frequency components. Since the Human Visual System (HVS) is less sensitive to errors in the high-frequency band compared to the low-frequency band, the high-frequency DCT coefficients can be neglected hence achieving good compression without disturbing the overall image perceptual quality.

Due to the non-stationary nature of images, the DCT is computed using small image blocks. The partitioning operation does not eliminate the correlation across the block boundaries and results in perceptible and disturbing blocking artifacts at high compression rates.
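The energy-compaction argument can be demonstrated numerically. The sketch below (with a hypothetical smooth 8 × 8 block) applies an orthonormal 2-D DCT, keeps only the 4 × 4 low-frequency corner, and measures the reconstruction error, which stays small because the block's energy is concentrated at low frequencies:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)
    D = np.cos(np.pi * np.outer(k, k + 0.5) / n) * np.sqrt(2.0 / n)
    D[0, :] = np.sqrt(1.0 / n)
    return D

# Toy 8x8 "image block" with a smooth gradient (energy concentrated at low freq.)
block = np.add.outer(np.arange(8), np.arange(8)).astype(float)

D = dct_matrix(8)
coeffs = D @ block @ D.T             # 2-D DCT of the block

# Keep only the 4x4 low-frequency corner and zero out the rest.
kept = np.zeros_like(coeffs)
kept[:4, :4] = coeffs[:4, :4]

approx = D.T @ kept @ D              # inverse 2-D DCT from the truncated coefficients
err = np.linalg.norm(approx - block) / np.linalg.norm(block)
assert err < 0.1                     # small relative error despite discarding 75% of coeffs
```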

3.3 Discrete wavelet transform

The wavelet transform has been introduced as an excellent tool for the analysis of non-stationary signals as it can be used to localize (across time) changes in frequency content [25]. In contrast to sinusoids (with infinite energy) used in the Fourier Transform (FT), the Wavelet transform is based on small waveforms (mother wavelets) with varying frequency, limited duration, and zero average values.

For an image with dimensions M × N, the single-stage two-dimensional DWT is calculated using the one-dimensional DWT. The image rows are transformed using one-dimensional lowpass and highpass filters, and the filtered output is down-sampled by a factor of two. The down-sampled filtered image is then transformed column-wise using the same filters, followed by another decimation stage, resulting in four subbands: LL (low-pass low-pass), LH (low-pass high-pass), HL (high-pass low-pass) and HH (high-pass high-pass). The process is repeated for the required number of levels by decomposing the LL subband into four further subbands. The wavelet transform can thus be seen as a cascade of lowpass and highpass filters resulting in an approximation signal and several detail signals. The 2-level wavelet decomposition of the Lena image is shown in Fig. 3. Its blurred (approximation) version is displayed in the top left corner; the detail versions to its right, bottom left and bottom right correspond to the horizontal, vertical and diagonal detail information at different scales. The size of the wavelet-transformed image is the same as that of the original image.
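The single-stage 2-D decomposition described above can be sketched for the simplest case, the Haar wavelet, where lowpass and highpass filtering with decimation reduce to pairwise sums and differences (the LH/HL naming convention below is one common choice; it can be swapped depending on which direction is filtered first):

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT: returns the LL, LH, HL, HH subbands."""
    # Rows: lowpass = pairwise sums, highpass = pairwise differences (orthonormal scaling).
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)
    # Columns: repeat the same filtering and decimation on each half.
    LL = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2.0)
    LH = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2.0)
    HL = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2.0)
    HH = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2.0)
    return LL, LH, HL, HH

img = np.arange(64, dtype=float).reshape(8, 8)
LL, LH, HL, HH = haar_dwt2(img)
assert LL.shape == (4, 4)            # each subband is half the size in each dimension
# The orthonormal transform preserves total energy (Parseval).
total = sum(np.sum(b**2) for b in (LL, LH, HL, HH))
assert np.isclose(total, np.sum(img**2))
```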

Fig. 3

Level-2 wavelet decomposition of the Lena image a Original image b The two-dimensional discrete wavelet transform (2-levels) subbands c Wavelet image

Note that there is no need to partition the image into blocks as with the DCT. The DWT has a slightly higher computational complexity than the DCT, but exhibits visually more pleasing artifacts and offers higher compression ratios than DCT-based compression. In DCT-based compression methods, the transform coefficients are calculated for non-overlapping blocks of size 8 × 8 or 16 × 16; since the correlation between successive blocks is not considered, blocking artifacts are visible at high compression ratios. The multilevel discrete wavelet transform (MDWT), in contrast, is calculated for the whole image without block decomposition. With a proper choice of the mother wavelet (Haar, Daubechies, etc.), specific frequency components and features of the signal can be extracted and/or enhanced [25].

The coarse (low-pass) wavelet coefficients carry much of the image energy. Since the energy contribution of the low-frequency components is much larger than that of the high-frequency components, image quality depends greatly on these low-frequency components.

4 The Proposed Technique

Conventional CS does not assume any structure in the data, which can lead to poor performance when used in image compression. Here, to use CS effectively, we propose a novel approach to forming the sparse vectors by rearranging the wavelet coefficients in a given order (detailed below). Compared with conventional CS with a fixed measurement matrix, we also propose an image-dependent measurement matrix in which higher weights are assigned to the low-frequency components with high energy. Figure 4 displays the main steps of the proposed technique, with the details of each step given below:

  1.

    For a given image, of dimension M × N, the multi-level wavelet transform is first obtained (see Fig. 3).

  2.

    The wavelet transform is used to decompose a given image into an approximation image and several detail images. The wavelet coefficients in the approximation image are not sparse while the coefficients in the detail images are nearly sparse.

    To form the sparse vectors, the wavelet transform image is first divided into non-overlapping blocks, α i , of dimension (m × n) or \(\left (\frac {M}{2^{L}} \times \frac {N} {2^{L}}\right )\) where L is the depth of the decomposition, and i = 1, 2, 3, ⋯ , B, where B is the total number of wavelet image “blocks” (see Fig. 5b).

    The coefficients within each block represent similar frequency components; this differs from traditional block-based DCT, in which each block contains all possible frequency components. The total number of blocks, B (which is also the length of each sparse vector), is obtained as follows:

    $$ \mathrm{No.~ of~ Blocks~ (B)} =1+3\sum\limits_{\mathrm{i}=1}^{\mathrm{L}}4^{\mathrm{i}-1} $$
    (8)

    To form the sparse vectors, we propose to group the wavelet coefficients in the following manner:

    $$ \boldsymbol{\beta}_{k} = \left\{{\alpha_{i}^{k}}\right\} \quad \text{for} \, i = 1,2,\cdots,B \quad \text{and}\, \,k=1,2,\cdots,(m\times n) $$
    (9)

    where \({\alpha _{i}^{k}}\) is the k th component of block α i . This step results in (m × n) sparse vectors of dimension B × 1 each (see Fig. 5c). The number of sparse vectors and their lengths for different wavelet decomposition levels are also given in Table 1.

  3.

    The root mean squared energy of the i th component across all sparse vectors, β k , is evaluated using the following expression:

    $$ E_{i} = \sqrt {\sum\limits_{k=1}^{{(m \times n)}}|\boldsymbol{\beta}_{k}^{i}|^{2}} \; \, \text{for}\,\, i=1,2,\cdots,B $$
    (10)
  4.

    We propose to normalize the Gaussian measurement matrix, Φ, using the energy weighting matrix entries:

    $$ \mathbf{\Phi}_{E} = orth((\mathbf{\Phi} .\times \mathbf{\Omega)}^{T})^{T} $$
    (11)

    where the energy weighting matrix is represented by Ω = [ω; ω; ⋯ ; ω], with ω = [E 1, E 2, ⋯ , E B ], and (.×) represents entry-wise multiplication. To satisfy the RIP condition, the rows of the sensing matrix Φ E are normalized.

  5.

    Now, to obtain the compressed measurements, the normalized sensing matrix, Φ E , is multiplied with the sparse vectors, β k , using the following relation:

    $$ \boldsymbol{y}_{k} = \mathbf{{\Phi}_{E}} \boldsymbol{\beta}_{k} \text{ for}\,\, k = 1,2,\cdots,(m \times n) $$
    (12)
  6.

    For reconstruction, the following 1-minimization problem is solved:

    $$ \boldsymbol{\hat{\beta}}_{k} = \arg\min\limits_{\boldsymbol{\hat{\beta}}_{k}} \|\boldsymbol{\hat{\beta}}_{k} \|_{\ell_{1}} \text{ s.t. } \boldsymbol{y}_{k} =\mathbf{{\Phi}_{E}} \boldsymbol{\hat{\beta}}_{k} $$
    (13)
  7.

    After reconstruction, the recovered sparse vectors are rearranged in the following manner:

    $$ \boldsymbol{\hat{\alpha}}_{i} = \left\{\hat{\beta}_{k}^{i}\right\}\,\, \text{for}\,\, k = 1,2,\cdots,(m \times n), \quad i=1,2,\cdots,B $$
    (14)

    where \(\hat {\beta }_{k}^{i}\) is the i th component of the recovered sparse vector \(\boldsymbol {\hat {\beta }}_{k}\).

  8.

    Finally, the recovered vectors are rearranged into non-overlapping blocks and the inverse wavelet transform is performed to obtain the final reconstructed image.
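Assuming the L-level wavelet image W has already been computed (a random array stands in for it below), steps 2-5 above can be sketched in numpy; the exact scaling of the energy weights and the orthogonalization in (11) may differ in detail from our implementation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for the L-level wavelet-transformed image (Step 1).
M, N, L = 32, 32, 2
W = rng.standard_normal((M, N))

m, n = M // 2**L, N // 2**L          # block size = size of the approximation band
B = (M // m) * (N // n)              # number of blocks (equals 4**L), Eq. (8)

# Step 2: cut W into B non-overlapping (m x n) blocks; row i of `blocks`
# holds block alpha_i flattened, so column k is the sparse vector beta_k
# of Eq. (9), of length B.
blocks = (W.reshape(M // m, m, N // n, n)
           .transpose(0, 2, 1, 3)
           .reshape(B, m * n))
beta = blocks                        # beta[:, k] is beta_k

# Step 3: RMS energy of the i-th component across all sparse vectors, Eq. (10).
E = np.sqrt(np.sum(beta**2, axis=1))

# Step 4: energy-weighted Gaussian measurement matrix, Eq. (11).
mr = 0.3                             # measurement ratio (illustrative)
Mm = int(round(mr * B))              # number of measurements per vector
Phi = rng.standard_normal((Mm, B))
Phi_E = Phi * E[np.newaxis, :]       # entry-wise weighting by Omega
Q, _ = np.linalg.qr(Phi_E.T)         # orthogonalize, as orth((Phi .x Omega)^T)^T
Phi_E = Q.T                          # rows are now orthonormal

# Step 5: compressed measurements for all sparse vectors at once, Eq. (12).
Y = Phi_E @ beta
assert Y.shape == (Mm, m * n)
```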

Fig. 4

The flowchart of the proposed algorithm

Fig. 5

Proposed rearrangement of the wavelet coefficients (L = 2, B = 16)

Table 1 Total number of vectors and their lengths for a 256 × 256 image

To assess the performance of the proposed algorithm in comparison with other CS-based compression schemes, we have used the Peak Signal-to-Noise Ratio (PSNR). The PSNR is defined as the ratio between the maximum possible power of a signal and the power of the compression noise. It is usually expressed in decibels (dB) as:

$$ \text{PSNR (dB)} = 10\log_{10} \frac{255^{2}}{\text{MSE}} $$
(15)

where

$$ \text{MSE} = \frac{1}{\text{MN}} \sum\limits_{\mathrm{x}=1}^{\mathrm{M}} \sum\limits_{\mathrm{y}=1}^{\mathrm{N}}[{\mathrm{I}}(\mathrm{x,y})-{\mathrm{I}}_{\mathrm{r}}(\mathrm{x,y})]^{2} $$
(16)

computed for the original image, I, and the reconstructed image, I r .
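Equations (15)-(16) translate directly into a few lines of numpy; the toy images below are hypothetical and chosen so that the expected PSNR can be checked by hand:

```python
import numpy as np

def psnr(original, reconstructed):
    """PSNR in dB for 8-bit images, following Eqs. (15)-(16)."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

I_orig = np.full((4, 4), 100.0)
I_rec = I_orig + 5.0                 # uniform error of 5 grey levels -> MSE = 25
# PSNR = 10*log10(255^2/25) = 20*log10(51), roughly 34.15 dB
assert abs(psnr(I_orig, I_rec) - 20.0 * np.log10(51.0)) < 1e-9
```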

While a number of other metrics have been developed to measure quality degradation due to compression, we have limited our analysis to the PSNR, as results with this index were publicly available for the other algorithms we wanted to benchmark against.

5 Experimental results

To evaluate the performance of the proposed algorithm, we selected ten grayscale test images of different types (.BMP, .PNG, .PGM) commonly used in benchmarking; these are shown in Fig. 6. The multilevel wavelet transform was computed for the 256 × 256 test images with different decomposition levels (L = 2, 3, 4, 5, 6). For our proposed algorithm, the total number of sparse vectors and their dimensions depend on the decomposition level (see Table 1).

Fig. 6

Images used for the experiments

For L < 3, the vector length is too small to use CS efficiently, while for L > 5, the longer vectors increase the computational complexity. We tested the reconstruction performance, in terms of PSNR, for the orthogonal Daubechies and Coiflet wavelets of different orders under the same testing conditions, i.e. wavelet decomposition level L = 4 and measurement ratio MR = 0.3. Across the test images, we observed that "coif5" gave a better PSNR than the other wavelets (see Table 2). Based on this observation, we adopted this wavelet for the rest of the paper.

Table 2 PSNR (in dB) comparison for different wavelets (L = 4, MR = 0.3)

A Gaussian i.i.d. measurement matrix was generated and normalized with the energies of the different frequency components, as described in the proposed algorithm. We used the NESTA algorithm for signal reconstruction, which is fast, accurate and efficient for recovering nearly sparse signals (natural images are nearly sparse in the transform domain) [4].

The sparse vectors were created from the wavelet coefficients as described above. The reduced measurement vectors were then obtained, as described earlier, for different measurement ratios (MR), MR = 0.1, 0.2, ⋯ , 0.6. The MR is defined as the ratio of the number of measurements to the length of the sparse vectors.

Table 3 shows the PSNR results for different measurement ratios and wavelet decomposition levels. For comparison purposes, we also display in Table 4 the results for variable-size block DCT. In compressed sensing, the reduced measurements are obtained using a random Gaussian measurement matrix, and we noted a difference in the PSNR values within a range of ± 0.2 dB between runs. For consistent results, the PSNR values were therefore averaged over 10 independent trials. From Table 3, we observe that the PSNR values increase with increasing MR across all images. Moreover, wavelet decompositions with L ≥ 4 provide a higher PSNR than block DCT for the same measurement ratio. Figure 7 depicts the same phenomenon in graphical form for the Lena image. Our extensive experiments showed that when computational power is not a constraint, the L = 6 decomposition is recommended, while for more limited computational resources, the L = 4 decomposition can be used. The visual results for the reconstructed images at different measurement ratios are shown in Figs. 8 and 9. From the results, it is evident that CS using a normalized sensing matrix with the DWT gives better compression performance than block DCT.

Table 3 Comparison of PSNR with different wavelet levels
Table 4 Comparison of PSNR with different DCT block size
Fig. 7

Comparison of PSNR under different wavelet level decomposition and block DCT scenarios (CS-DWT is our proposed approach)

Fig. 8

Simulation results for Lena image with Wavelet transform a Original image b MR = 0.1 L = 4 c MR = 0.4, L = 4 d MR = 0.5, L = 4 e MR = 0.1, L = 6 f MR = 0.4, L = 6 g MR = 0.5, L = 6

Fig. 9

Simulation results for Lena image with block DCT a MR = 0.1, 8 × 8 block b MR = 0.4, 8 × 8 block c MR = 0.1, 16 × 16 block d MR = 0.4, 16 × 16 block

We also compared our technique to other existing CS-based methods in the literature. Table 5 shows the quality of the reconstructed images in terms of PSNR (in dB) under different measurement ratios. The Lena image is used for the comparison, with an L = 4 wavelet decomposition, and both DCT- and DWT-based schemes are included. Note that some PSNR values are missing in Table 5, as the values were taken directly from the original papers [22, 34]. The test results for reference [5] were omitted, as the authors used higher-resolution images. From the results, it is evident that our method outperforms the other methods in terms of PSNR, even at low measurement ratios.

Table 5 Performance comparison of different CS-based schemes in terms of PSNR (dB)

6 Conclusions

In this article, a new CS-based approach for image compression is proposed. After processing the image of interest with a traditional wavelet transform, we rearrange the resulting wavelet coefficients into a new efficient structure allowing the formation of fully sparse vectors, which are then used as input to the CS algorithm. Compared with conventional CS with a fixed measurement matrix, an image-dependent Gaussian measurement matrix is used in which higher weights are assigned to the low-frequency (high-energy) components. Image reconstruction is performed using CS recovery, followed by the inverse rearrangement of the wavelet coefficients and the inverse wavelet transform. The proposed algorithm deals with the whole image rather than using the conventional block-based approach associated with the DCT; as such, the disturbing blocking artifacts of DCT-based schemes are eliminated. Our experimental results showed that the proper choice of wavelet decomposition level, wavelet type and measurement ratio has a major effect on the image reconstruction quality. The use of a higher-level wavelet decomposition is shown to provide enhanced PSNR at the expense of a small increase in the overall computational load. Our experimental results show that the proposed algorithm achieves better reconstruction quality, at the same measurement ratio, than state-of-the-art CS-based image compression methods.