1 Introduction

Internet usage has become a vital part of daily life, and almost two-thirds of the global population is connected through web applications [1]. Web-based devices such as smartphones and tablets influence individuals connected across the world and raise the demand for designers to develop advanced data processing tools [2]. In general, web content includes high-resolution images, text, audio, and video, which are large in volume and require more transmission time. Image compression is a conventional method to reduce the size of data without disturbing the architecture of the system [3]. Wavelet-based progressive image compression is one of the prominent approaches in current compression technology, available in both lossy and lossless forms [4].

Motivation

Even though wavelet methods are effective for high-quality reconstruction, they face limitations on storage and transmission bandwidth. The ability of JPEG 2000 to extract features from a region of interest while maintaining the compression ratio, even over low-bandwidth transmission, is quite impressive and makes this codec ideal for internet/network applications. Hence, revisiting the old compression algorithm and relaxing its constraints to generate the best possible reconstruction quality at a high transmission rate is a promising step in compression technology.

Problem Statement

Shapiro’s Embedded Zero Tree Wavelet coding (EZTW) provides tunable resolution in an embedded bit stream with a wide dynamic range and low error [5]. However, it consumes considerable processing power to deliver its best results. There is a tradeoff in achieving quality reconstruction over the dynamic range, and the coder offers no tuning options for quality enhancement. Achieving low bit rate transmission, a high compression ratio, and preservation of image quality during reconstruction are the key factors we focus on in this work.

Literature review

The quest for efficient lossless image coding is always critical for low-bandwidth transmission in network applications. Spatial-domain image compression is a primitive method that achieves compression either by eliminating redundant pixels or by predicting neighborhood pixels in adaptive lossless compression [6]. The multiresolution property of the wavelet transform makes it a robust compression tool for identifying redundant sub-bands and removing them [7]. Its multiresolution scaling and adaptive thresholding properties for image reconstruction brought remarkable changes to image compression technology. JPEG and JPEG2000 are prominent compression standards that use the discrete cosine and wavelet transforms, respectively, as their underlying methods [8, 9]. JPEG compression has the substantial disadvantage of blocking artifacts and inter-block information dependence [10]. EZTW is an efficient wavelet-based image compression scheme that produces a fully embedded bitstream with high compression performance [5]. Much subsequent research built on the EZTW algorithm, including the Set Partitioning in Hierarchical Trees (SPIHT), Embedded Block Coding with Optimized Truncation (EBCOT), Wavelet Difference Reduction (WDR), and Adaptively Scanned WDR (ASWDR) coders [11,12,13,14,15]. Many attempts to hybridize the EZTW algorithm to reduce its constraints and enhance compression performance have been suggested in the literature [16,17,18,19,20]. In [21], the author proposed a hybrid image compression algorithm that uses WDR with SVD followed by resolution enhancement using the stationary wavelet transform (SWT) and discrete wavelet transform (DWT); the low-frequency sub-band is decoded by inverse WDR and the high-frequency sub-bands by matrix multiplication. As a result, the quality of the reconstructed image is significantly improved with a reduced embedded bit stream. In [22], a blended compression algorithm using SPIHT with SVD retains image quality through effective edge reconstruction and block-based SVD for low-frequency sub-bands during reconstruction. From this survey, we infer that hybridizing dynamic image compression with SVD can improve compression performance.

Contribution

This work enhances the traditional EZTW algorithm by using segmentation-based adaptive EZTW with SVD for tunable compression rates. In this algorithm, compression speed is increased by increasing the correlation between wavelet coefficients, dividing the entire image into small blocks, and then compressing each block into an embedded bit stream. In SVD, the distribution of singular values and their percentage of ranks are refined by an optimization algorithm. Meanwhile, adaptive EZTW compresses the image with an optimal threshold value targeting edge recovery. The hybrid algorithm shows significant improvement in terms of PSNR; it is tested on standard test images and compared with basic EZTW with Huffman coding, SPIHT, and JPEG.

This paper is organized as follows: the proposed hybrid compression algorithm is explained in Section 2. Section 3 describes the functional blocks of the proposed hybrid compression method. The quality metrics, experimental outcomes, and result analysis are presented in Sections 4 and 5, followed by the conclusion in Section 6.

2 The proposed hybrid compression algorithm

In the proposed algorithm, the image is compressed in two stages, using SVD and a modified EZTW (MEZTW), as follows.

2.1 Compression using SVD

In general, SVD treats the image frame as a two-dimensional matrix of m × n pixels, where each pixel has a grey-scale intensity between 0 and 255. SVD factors the matrix into the product UαVT, where U and V are orthogonal matrices of size m × m and n × n, respectively, and α is a non-negative diagonal matrix of size m × n [23]. Compression is achieved by selecting a minimum number of ranks in the diagonal matrix to approximate the original image during reconstruction.
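As a minimal illustration of this idea (not the full method proposed here), the sketch below reconstructs a grayscale image from its k largest singular values; numpy is assumed to be available, and the function name is ours.

```python
import numpy as np

def rank_k_approximation(image: np.ndarray, k: int) -> np.ndarray:
    """Approximate a grayscale image using only its k largest singular values."""
    U, s, Vt = np.linalg.svd(image.astype(float), full_matrices=False)
    # Keep the first k singular triplets: U[:, :k] @ diag(s[:k]) @ Vt[:k, :]
    return (U[:, :k] * s[:k]) @ Vt[:k, :]
```

Storing only U[:, :k], s[:k], and Vt[:k, :] instead of the full image is what yields the compression.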

Here we refine the SVD in two additional steps:

  • Calculate the mean value of the image matrix, subtract it from the image matrix before SVD, and add it back during reconstruction.

  • Divide the mean-extracted image matrix into sub-blocks to exploit the irregular density of the original image. Select a suitable percentage of the sum of the singular values instead of a predetermined value [24].

For an image J, the segmentation-based rank-one updated SVD process is as follows.

  a. A matrix ‘S’ is obtained by subtracting the mean value of the original image matrix ‘J’ from it:

    $$S\left(m,n\right)=J\left(m,n\right)- mean(J)$$
    (1)
  b. Define the block size (e.g., 256 × 256 or 512 × 512) for segmentation.

  c. Apply the forward SVD to each sub-block of matrix ‘S’ using Eq. 2:

    $$\left[{A}_u,{A}_{\propto },{A}_v\right]= svd\left[S\left(m,n\right)\right]$$
    (2)

    Where Au is the m × m matrix of orthogonal eigenvectors of A × AT, Av is the n × n matrix of orthogonal eigenvectors of AT × A, and A∝ is the m × n diagonal matrix of singular values, the square roots of the eigenvalues of AT × A. Further, the average rank matrix P is calculated from the SVD using some nonzero singular values in A∝.

  d. The average rank matrix is calculated using Eq. 3:

    $$P=\sum\nolimits_{i=1}^{\min \left(m,n\right)}{\propto}_i{u}_i{v}_i^T$$
    (3)

    Where, ∝i is the ith singular value and ui, vi are the corresponding left and right singular vectors.

  e. The specific percentage of ranks is calculated using Eq. 4 and Table 1:

    $$Specified\ percentage\left({S}_p\right)=\frac{\left({p}_1+{p}_2+{p}_3+\dots \dots +{p}_k\right)}{\left({\sigma}_1+{\sigma}_2+{\sigma}_3+\dots \dots +{\sigma}_k\right)}$$
    (4)
Table 1 Average ranks and their percentage for given singular values of boat image

Where Pk is the average rank matrix and σk is the variance of the singular values for each sub-block of the image.

Table 1 provides information on the selection of percentage rank and singular value for block size 8 × 8, with updated singular values from 25% to 85% (with respect to the edge preservation index).

  f. Apply the inverse SVD for reconstruction:

    $$B\left(m,n\right)={svd}^{-1}\left({T}_u,T{\alpha}_{k1},{T}_v\right)$$
    (5)

Where Tu is m × k1, Tv is k1 × n, and Tαk1 =  diag (r1, r2, r3, .…, rk1).

  g. Recombine all sub-blocks to obtain S(m,n), then add the mean value to reconstruct the original image:

    $${J}^{-1}\left(m,n\right)=S\left(m,n\right)+ mean(J)$$
    (6)

Rank-one updated SVD exploits the distribution of singular values in the diagonal matrix; this distribution varies with image complexity.
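The sub-block SVD stage described in steps a–g can be sketched as follows. This is a minimal illustration under the assumption of a global mean, square blocks, and a fixed energy percentage; the helper names (select_rank_by_percentage, block_svd_compress) are ours, not the authors'.

```python
import numpy as np

def select_rank_by_percentage(s: np.ndarray, percentage: float) -> int:
    """Smallest k whose singular values carry the requested share of the total (cf. Eq. 4)."""
    share = np.cumsum(s) / np.sum(s)
    return min(int(np.searchsorted(share, percentage)) + 1, len(s))

def block_svd_compress(image: np.ndarray, block: int = 256, percentage: float = 0.85) -> np.ndarray:
    """Mean-subtract, segment into blocks, and keep a percentage of singular-value energy per block."""
    img = image.astype(float)
    mean_val = img.mean()                        # step a / Eq. (1)
    S = img - mean_val
    out = np.zeros_like(S)
    for i in range(0, S.shape[0], block):        # step b: block segmentation
        for j in range(0, S.shape[1], block):
            sub = S[i:i + block, j:j + block]
            U, s, Vt = np.linalg.svd(sub, full_matrices=False)            # step c / Eq. (2)
            k = select_rank_by_percentage(s, percentage)                  # steps d-e, simplified
            out[i:i + block, j:j + block] = (U[:, :k] * s[:k]) @ Vt[:k, :]  # step f / Eq. (5)
    return out + mean_val                        # step g / Eq. (6)
```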

2.2 Compression using MEZTW

Achieving a low bit rate is a major problem in EZTW coding because scalar quantization generates a sequence of zero and non-zero symbols [25, 26]. The probability of zeros among the symbols is always high, so low bit rate transmission is difficult; it is, however, possible by designing adaptive quantizers and encoders. A modified adaptive-threshold compression algorithm used in [27] shows one way of preserving the significant sub-bands while removing unwanted zeros from the embedded bit stream. It uses parent-child relations among the decomposed wavelet coefficients with an adaptive threshold and creates a new data structure with zero trees to encode the symbols.

The source image is decomposed using pyramid decomposition [28] with Daubechies wavelet filters at level 3, as shown in Fig. 1a. The complete image is scanned in raster order and the coefficients are classified as follows (a short decomposition sketch in code is given after this list):

  a. Parent: coefficients at the coarser decomposition scale (each parent has four offspring).

  b. Child: coefficients at the same spatial location at the next finer scale in the same orientation.

  c. Descendant: the child’s offspring (all coefficients below a given coefficient in the tree).

Fig. 1

a Wavelet pyramid decomposition of sub-bands with parent-child relations; b flow of significance mapping for wavelet coefficients

The initial threshold of EZTW is calculated from the decomposed wavelet coefficients I(m,n) as

$${T}_0={2}^{\left\lfloor {\log}_2\left(\max \left|I\left(m,n\right)\right|\right)\right\rfloor }$$
(7)

Significance mapping

A wavelet coefficient ‘z’ is said to be significant if ∣z ∣  > T0; otherwise, it is considered insignificant. The symbols are classified as shown in Fig. 1b, based on their status, as follows:

  • Zerotree root: the coefficient is insignificant and all of its descendants are insignificant.

  • Isolated zero: the coefficient is insignificant but has at least one significant descendant.

  • Positive significant: the coefficient magnitude exceeds the threshold and its sign is positive.

  • Negative significant: the coefficient magnitude exceeds the threshold and its sign is negative.

2.3 Optimizer for Adaptive threshold

A coarse compressed image is obtained from the first iteration with the initial threshold and average ranks. The optimizer is designed so that image quality can be adapted by using a modified threshold value to reach the required quality at a fixed compression ratio.

  • The adaptive threshold for MEZTW is calculated as the median of the ordered list of reconstructed block coefficients φ using Eq. 8:

    $${T}_{adp}=\left\{\begin{array}{c}\varphi\ \left[\frac{n+1}{2}\right]\ if\ n\ is\ odd\\ {}\frac{\varphi \left\lfloor \frac{n}{2}\right\rfloor +\varphi \left\lfloor \frac{n}{2}+1\right\rfloor }{2}, if\ n\ is\ even\end{array}\right.$$
    (8)

    where n is the number of values in the data set from the reconstructed image. The significance test is then carried out and the process repeated until the required bit rate is achieved. The symbols are encoded by a Huffman encoder, and finally the compressed bit stream is obtained.

  • With the help of the first-level reconstructed image, the specific percentage is normalized using Eq. 9,

    $${S}_P^{+}=\sum \frac{\left({S}_p-\varphi \right)}{S_{pmax}-{S}_{pmin}}$$
    (9)

    then the ranks are sorted left to right by significance \(\left({S}_{p1}^{+}\ge {S}_{p2}^{+}\ge \dots \dots ..,{S}_{pmin}^{+}\ge \dots .0\right)\), the less significant specific ranks are eliminated, and the result is applied to the image blocks in the second iteration. The application of adaptive ranks is effective in minimizing noise and preserving the edge information in each image block (a compact sketch follows).
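A compact sketch of this optimizer update, assuming the reconstructed block coefficients and the per-block specific percentages are available as flat arrays; the helper names are illustrative only, not the authors' implementation.

```python
import numpy as np

def adaptive_threshold(reconstructed_coeffs: np.ndarray) -> float:
    """Eq. (8): median of the ordered list of reconstructed block coefficients."""
    return float(np.median(reconstructed_coeffs))

def select_adaptive_ranks(sp: np.ndarray, phi: float, keep: int) -> np.ndarray:
    """Eq. (9), simplified: normalize the specific percentages, sort by significance,
    and return the indices of the ranks retained for the second iteration."""
    sp_norm = (sp - phi) / (sp.max() - sp.min())
    order = np.argsort(sp_norm)[::-1]     # most significant first
    return order[:keep]
```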

3 Process flow of the proposed compression method

In the proposed method, SVD compression is applied to the mean-extracted source image blocks, followed by the MEZTW encoder to obtain a compressed bit stream. Meanwhile, the optimizer calculates the average ranks and the adaptive threshold value based on the image reconstructed in the first iteration to reduce the size of the bit stream. This process continues until all blocks of the original image are compressed and reconstructed for the defined bit rate (Figs. 2 and 3).

Fig. 2

Pipeline view of proposed MEZTW with SVD algorithm

Fig. 3

Mean-extracted blocks of the Lena image

The method involves the following steps:

  a. Preprocessing: the original image is scaled to a standard size (1024 × 1024) so that compression is performed on a standard-size image. Mean subtraction is the most common normalization process to increase inter-pixel relations and streamline the compressed bit stream. Geometrically, it centres the significant pixel values around the origin along each dimension by removing the mean of each feature of the image using Eq. 1.

  b. Segment the image matrix into blocks of a standard size (256 × 256 or 512 × 512). This step helps increase the coding efficiency of the compressors.

  c. SVD compression: apply SVD compression using Eqs. 2 and 4 with the application of adaptive ranks.

  d. Apply MEZTW compression to the given blocks, using the following notation:

Let B(m, n) = db4 wavelet coefficients of the blocks

  • l = level of wavelet subtrees in a significant Mapping

  • l _ max  = maximum decomposition level of significant mapping

  • Qn(.) = n bit linear quantizer

  • AST _ Enc(.) = Adaptive significant tree symbol encoder

  • ES = Encoded symbol.

  • K(m, n) = spatial orientation tree with root B(m, n)

  • P3 = subtree of K(m, n) at level 3

  • βs(m, n) = Significant Coefficient of subtree P3

  • Z(m, n) = encoded positions of βs

  Step 1: Calculate the minimum number of bits required to encode the decomposed coefficients:

$$n=\left[{\mathit{\log}}_2\left\{\max |\textrm{B}\left(m,n\right)|\right\}\right]$$
(10)

Apply n-bit quantizer:

$${B}_Q\left(m,n\right)={Q}_n\left\{B\left(m,n\right)\right\}$$
(11)
  Step 2: Apply the adaptive threshold from Eq. 8.

  Step 3: Significance mapping (a sketch of this pass is given below):

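The significance-mapping pass is presented as a figure in the original; the sketch below is a hedged reconstruction of a standard EZW-style dominant pass, where the symbol names and the (coefficient, descendants) input format are our assumptions rather than the authors' exact procedure.

```python
import numpy as np

def classify(coeff: float, descendants: np.ndarray, threshold: float) -> str:
    """Assign one dominant-pass symbol to a coefficient."""
    if abs(coeff) > threshold:
        return 'POS' if coeff > 0 else 'NEG'          # positive / negative significant
    if descendants.size and np.any(np.abs(descendants) > threshold):
        return 'IZ'                                   # isolated zero
    return 'ZTR'                                      # zerotree root

def dominant_pass(coeff_trees, threshold: float) -> list:
    """coeff_trees: iterable of (coefficient, descendants_array) in raster-scan order."""
    return [classify(c, d, threshold) for c, d in coeff_trees]
```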
  Step 4: Generate the refinement bit stream based on coefficient significance:

    $${\displaystyle \begin{array}{c} if\left(\left|Q\left[B\left(m,n\right)\right]\right|>\left|B\left(m,n\right)\right|\right)\\ {} Ref\ Bit=1;\\ {}\begin{array}{c} else\\ {} Ref\ Bit=0;\end{array}\end{array}}$$
  Step 5: Update the threshold, \({T}_{adp}=\frac{T_{adp}}{2}\), and repeat from Step 3. (A skeleton combining Steps 1-5 is sketched below.)

  e. The embedded bit stream is decoded by the Huffman decoder, followed by Daubechies-4 wavelet reconstruction filters.

  f. The optimizer algorithm calculates the finest ranks and threshold using Eqs. 9 and 10 for both compression sub-blocks.

  g. Apply the inverse SVD to all sub-blocks using Eq. 5 and add back the extracted mean value.

4 Quality metrics

Several metrics are available to evaluate the quality of a reconstructed image; in this paper we use three common metrics, peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and edge preservation index (EPI), to measure the performance of the proposed algorithm. PSNR is given as,

$$MSE=\frac{1}{MN}{\sum}_{m=1}^{M}{\sum}_{n=1}^{N}{\left({I}_r\left(m,n\right)-{I}_o\left(m,n\right)\right)}^2$$
(12)
$$PSNR=10{\mathit{\log}}_{10}\left({Q}^2/ MSE\right)$$
(13)

Where M, N is the size of the image, Io is the original image, Ir is the reconstructed image, and Q is 255 for a grayscale image. Variation of brightness in the reconstructed image inflates the PSNR value, so PSNR alone is not the best choice for evaluating image quality; therefore, we also use SSIM, which better matches human visual perception. It is defined as [26],

$$SSIM\left(x,y\right)=\frac{\left(2{\mu}_x{\mu}_y+{C}_1\right)\left(2{\sigma}_{xy}+{C}_2\right)}{\left({\mu}_x^2+{\mu}_y^2+{C}_1\right)\left({\sigma}_x^2+{\sigma}_y^2+{C}_2\right)}$$
(14)

x and y correspond to the two image blocks being compared; μx, μy are their means, σx2, σy2 their variances, and σxy the covariance of x and y. C1 = (K1L)2 and C2 = (K2L)2 are stabilization constants, where L is the dynamic range, K1 = 0.01, and K2 = 0.03. SSIM measures the similarity of the reconstructed image with respect to the original image.
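For reference, both metrics are available in scikit-image, assuming that package is installed; a minimal usage sketch for 8-bit grayscale images:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def psnr_ssim(original: np.ndarray, reconstructed: np.ndarray):
    """PSNR (Eqs. 12-13) and SSIM (Eq. 14) for 8-bit grayscale images."""
    psnr = peak_signal_noise_ratio(original, reconstructed, data_range=255)
    ssim = structural_similarity(original, reconstructed, data_range=255)
    return psnr, ssim
```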

Edge preservation is a quantitative measure, calculated for a reference grayscale image I(x,y) as,

$$EPI=\frac{\tau \left(\Delta I-\overline{\Delta I},\hat{\Delta I}-\overline{\hat{\Delta I}}\right)}{\sqrt{\tau \left(\Delta I-\overline{\Delta I},\Delta I-\overline{\Delta I}\right)\ \tau \left(\hat{\Delta I}-\overline{\hat{\Delta I}},\hat{\Delta I}-\overline{\hat{\Delta I}}\right)}}$$
(15)

Where \(\hat{I}\left(m,n\right)\) is the estimated pixel intensity of the reconstructed image, \(\overline{\ I\ }\left(m,n\right)\) and \(\overline{\hat{I}}\left(m,n\right)\) are the mean values of the reference and reconstructed images in the region of interest, ∆I(m, n) is a highpass-filtered version of I(m, n) obtained from the 3 × 3 pixel standard estimate of the Laplacian operator [29], and τ(·,·) denotes the correlation of its arguments.
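A direct implementation of Eq. 15 might look as follows, using scipy's Laplacian filter as the highpass estimate; this is a sketch under that assumption rather than the authors' exact filter.

```python
import numpy as np
from scipy.ndimage import laplace

def edge_preservation_index(reference: np.ndarray, reconstructed: np.ndarray) -> float:
    """Eq. (15): correlation between the Laplacian-filtered reference and reconstructed images."""
    d_ref = laplace(reference.astype(float))       # highpass version of I
    d_rec = laplace(reconstructed.astype(float))   # highpass version of I-hat
    a = d_ref - d_ref.mean()
    b = d_rec - d_rec.mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))
```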

5 Simulation results and discussions

The proposed compression algorithm is evaluated using standard 8-bit grayscale test images from the SIPI database, shown in Fig. 5.

Fig. 4

SVD-compressed block 4 image using different levels of singular values ‘S’ with average ranks ‘r’

Fig. 5

Test images: (a) Artificial; (b) Big_building; (c) Boat; (d) Fireworks; (e) Goldhills; (f) Lena; (g) Bridge; (h) Deer; (i) Pepper

Tables 2, 3, 4, 5, 6, 7, 8, 9 and 10 show a comparative study of the proposed method with EZTW, SPIHT, JPEG, and WDR_SVD in terms of PSNR, SSIM, and EPI at compression ratios of 30:1, 70:1, and 90:1, respectively.

Table 2 PSNR comparison for fixed CR 30:1
Table 3 PSNR comparison for fixed CR 70:1
Table 4 PSNR comparison for fixed CR 90:1
Table 5 SSIM values for fixed CR 30:1
Table 6 SSIM values for fixed CR 70:1
Table 7 SSIM values for fixed CR 90:1
Table 8 EPI values for fixed CR 30:1
Table 9 EPI values for fixed CR 70:1
Table 10 EPI values for fixed CR 90:1

Tables 2, 3 and 4 show that the PSNR values of the proposed method are improved over the basic EZTW method and compete with the other algorithms. For the boat image, the PSNR of the proposed method is 0.8 dB, 0.2 dB, and 2 dB higher than the WDR_SVD and JPEG compression techniques at compression ratios of 30:1, 70:1, and 90:1, respectively.

Tables 5, 6 and 7 show the SSIM values of the proposed method and the other algorithms at compression ratios of 30:1, 70:1, and 90:1. The SSIM value of the proposed method at 30:1 is satisfactory compared to the other algorithms, except for the deer and Lena images. The deer and fireworks images contain a large amount of fine detail, and compression at a high CR coarsely removes those features during reconstruction (Fig. 5).

In Tables 8 and 9, the EPI value of the proposed method is comparatively higher than that of the other methods; however, images such as fireworks and artificial contain a large amount of significant information, and preserving all of it yields a lower EPI. Table 10 shows that the edge-preserving capacity of the proposed method is retained even at a high compression ratio.

Figure 6 shows a graphical comparison of the proposed method with the state of the art for the boat image in terms of PSNR, SSIM, and CR. The comparison shows that the PSNR and SSIM of the proposed method compete with the other algorithms in achieving high reconstruction quality.

Fig. 6

Graphical comparison of the proposed method with the state of the art in terms of PSNR and SSIM versus compression ratio: (a) PSNR vs. compression ratio; (b) SSIM vs. compression ratio

5.1 Computational complexity

Extracting the mean from the image increases the inter-pixel relation, and partitioning the image into sub-blocks for SVD enhances coding efficiency and reduces processing time. The second-level compression steps, adaptive specific ranks in SVD and block encoding in MEZTW with significance mapping at adaptive thresholds, add to the cost of this algorithm. There is a tradeoff between reconstruction quality (PSNR) and bits per pixel (BPP), since high PSNR favors a larger encoded bit stream. Hence, the computational cost is associated with the range of significance mapping in MEZTW and the optimization process. The algorithm attempts to reduce the size of the embedded bit stream with the optimal threshold at a high compression ratio of 90:1, but the smaller bitstream affects the quality of the decoded image and the SSIM.

This is shown in Fig. 7, where the fireworks image is compressed by the proposed method at compression ratios of 30:1, 70:1, and 90:1. In Fig. 8, the boat image is compressed using EZTW, SPIHT, JPEG, and the proposed method at a fixed compression ratio of 50:1. From Fig. 8 we can observe that the perceptual quality of the image reconstructed by the proposed compression algorithm is improved over the basic EZTW algorithm.

Fig. 7

a Uncompressed fireworks image, compressed at ratios of (b) 30:1; (c) 70:1; (d) 90:1 using the segmentation-based rank-one updated SVD_EZTW method

Fig. 8

Original Boat image compressed by (a) EZTW; (b) Basic SPIHT; (c) JPEG; (d) proposed method at compression rate 50:1

6 Conclusion

The scaled SVD and EZTW form a progressive image compression scheme that shows significant improvement over current compression algorithms. The results and discussion above show that this algorithm is an efficient image compression method for multimedia applications. The refined selection of singular values in SVD and low bit rate coding in MEZTW improve the PSNR at a fixed compression ratio. One significant observation is that, at a high compression ratio, the image quality preservation of the proposed algorithm is no worse than that of the other algorithms, even though its SSIM performance is lower than SPIHT and JPEG at high compression ratios; in MEZTW compression at a high ratio, unwanted image information is eliminated, and even a few significant sub-bands are neglected for low bit rate transmission. The efficiency of the algorithm can be enhanced by reducing the time complexity of the segmentation and optimizer stages, for example by using a neural network to compute the average ranks and thresholds. The technique effectively preserves edge information even at a high compression rate. In general, the proposed technique fundamentally improves the traditional EZTW algorithm and is also comparatively better than SPIHT and JPEG apart from a few points.