1 Introduction

The explosive growth of data to be stored and transmitted in real-world applications has driven the development of algorithms and techniques for data compression, and image compression in particular [1]. Neural networks have been an active research area for many years and are used in a wide variety of applications, including image compression. Similarly, the wavelet transform has formed the basis of numerous applications, image compression in particular.

In 1990, Kohonen [2] presented an in-depth study of the SOM and its variants, namely Learning Vector Quantisation 1 (LVQ1) and Learning Vector Quantisation 2 (LVQ2), together with their practical applications. Jiang [3] surveyed the impact of various neural networks on image compression, covering approaches based on back-propagation neural networks, Hebbian learning, vector quantization, wavelet neural networks and fractal neural networks. A family of networks that learn from their input, namely the Kohonen Self Organizing Feature Map (KSOFM), competitive learning, frequency sensitive competitive learning, fuzzy competitive learning and predictive neural networks, is used for vector quantization. Lu and Shin [4] reviewed vector quantization of images for code book design in image compression. They also proposed a compression technique that first classifies blocks into edge blocks and background blocks and then designs separate code books for them using KSOFM. Amerijckx et al. [5] developed a Kohonen Self Organizing Map based compression scheme for still images that maintains visual quality. Park and Woo [6] proposed an edge-preserving image compression algorithm based on an unsupervised competitive neural network called the Weighted Centroid Neural Network. Annadurai and AnnaSaro [7] proposed a method to reduce the convergence time in Self Organizing Feature Map (SOFM) based image compression, in which the cumulative distribution function is first estimated and then used to map the image pixels that act as input to the SOFM. Sharma et al. [8] proposed a six-step algorithm for image compression using Kohonen's Self Organizing Map. Tsai et al. [9] proposed a hierarchical SOM technique for efficient and effective code book design in image compression. Sarlin [10] focused on the use of the SOM neural network for monitoring the millennium development goals.

DeVore et al. [11] presented a framework for analysing various algorithms for image compression based on wavelet approximations. Karayiannis et al. [12] combined wavelet image decomposition, vector quantization using LBG (Linde, Buzo and Gray) and a fuzzy algorithm to perform image compression. Cho and Pearlman [13] used progressive resolution coding for fast and efficient image decompression based on prediction of the dynamic ranges of wavelet sub-bands. Jain and Jain [14] performed a detailed study of image compression based on the wavelet transform, evaluating and comparing seven wavelet families on test images. Venkateswaran and Rao [15] took a different approach: instead of applying the DWT over the whole image, they subjected sub-blocks of the image to wavelet decomposition and clustered the wavelet coefficients using K-Means clustering. Nagendran and Arockia Jansi Rani [16] designed a vector quantization code book using the fuzzy Probabilistic C Means clustering algorithm over wavelet packet tree coefficients. Improvements in clustering based on feature weight learning were proposed by Wang et al. [17] and Yeung and Wang [18]. Image compression should not remove the important features of the image, but unwanted and redundant features need not be retained. A new sensitivity measure has been proposed in the literature to remove redundant features of the network [19].

SOM and wavelets have been combined for different applications such as image segmentation, code book generation and image compression. Nunez and Llacer [20] implemented segmentation of astronomical images using SOM and wavelets. While designing a scheme for transmission of fixed images over wireless communication, Chatellier et al. [21] developed a compression module that first applies the DWT to the image and then applies SOM vector quantization to generate code books. Pandian and Anitha [22] proposed a technique in which a generic code book is implemented using SOFM, the discrete cosine transform (DCT) and the DWT. Similar work has been proposed by Dandawate and Londhe [23]. Rawat and Meher [24] implemented image compression using set partitioning in hierarchical trees (SPIHT) and SOFM vector quantization. Another work, by Dandawate et al. [25], introduced the idea of designing a code book for vector quantization based on SOFM and DWT. Huang et al. [26] surveyed training approaches for neural networks and extreme learning machines. An upper integral network with an extreme learning mechanism has been proposed for classification systems by Wang et al. [27].

The proposed work aims at reducing the size of the image file by hybridizing the SOM and the wavelet transform. The DWT is applied to the code vectors obtained from the SOM neural network after vector quantization, and only the approximation coefficients are stored along with the index values obtained from the SOM. Experiments show that the proposed method achieves better compression while retaining the visual quality of the image. The efficiency of the compression is demonstrated on six images.

2 Design of compression and decompression techniques

The SOM is a type of artificial neural network consisting of an input layer and a Kohonen layer. It is trained using unsupervised learning to produce a two-dimensional, discretized representation of the input space of the training samples, called a map. SOMs operate in two modes, namely training and mapping. Training builds the map from input examples and amounts to vector quantization. After training, mapping automatically classifies a new input vector.

A self-organizing map consists of components called nodes or neurons. The neurons in the Kohonen layer are arranged in a grid. Associated with each node are a weight vector of the same dimension as the input data vectors and a position in the map space. The self-organizing map describes a mapping from a higher-dimensional input space (the input patterns) to a lower-dimensional map space (the grid of neurons, whose weight vectors serve as code vectors). Each input pattern is mapped to the neuron whose weight vector is closest to it. Once the closest neuron is located, its index is assigned to the input pattern.
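The mapping step described above can be sketched as a nearest-neighbour search over the code vectors. The text only says "closest", so the squared Euclidean distance used here is an assumption, and the function name `nearest_code_index` is hypothetical.

```python
def nearest_code_index(pattern, codebook):
    """Return the index of the code vector closest to `pattern`.

    Assumption: closeness is measured by squared Euclidean distance.
    """
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((p - c) ** 2 for p, c in zip(pattern, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


# Toy code book with three 2-element code vectors (the paper uses 256 x 16).
codebook = [[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]]
print(nearest_code_index([9.0, 8.0], codebook))  # 1
```

In the proposed work the same lookup would run over 256 code vectors of 16 elements each, producing one index per 4 × 4 image block.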

In the proposed work, after a number of trials, it was decided to use 16 × 16 neurons in the Kohonen layer of the SOM network with 16 inputs, as shown in Fig. 1.

Fig. 1 Structure of SOM neural network

When applied to data in matrix form, the DWT decomposes it into four components, namely the LL, LH, HL and HH coefficients, and the decomposition can be repeated over multiple levels. Figure 2 shows the single-level Haar DWT decomposition used in the proposed work.

Fig. 2 Single level DWT
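As an illustration of the single-level decomposition, the following is a minimal sketch of a 2-D Haar DWT on a small block. It assumes the orthonormal scaling, in which each 2 × 2 tile [[a, b], [c, d]] gives LL = (a + b + c + d)/2, and it assumes the sub-band naming convention in which LH responds to horizontal edges and HL to vertical edges; the function name `haar_dwt2` is hypothetical and other libraries may scale or label the sub-bands differently.

```python
def haar_dwt2(block):
    """Single-level 2-D Haar DWT of a square block with even side length.

    Returns (LL, LH, HL, HH), each half the size of the input.
    Assumption: orthonormal scaling (sums and differences divided by 2).
    """
    half = len(block) // 2
    ll = [[0.0] * half for _ in range(half)]
    lh = [[0.0] * half for _ in range(half)]
    hl = [[0.0] * half for _ in range(half)]
    hh = [[0.0] * half for _ in range(half)]
    for i in range(half):
        for j in range(half):
            a, b = block[2 * i][2 * j], block[2 * i][2 * j + 1]
            c, d = block[2 * i + 1][2 * j], block[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # approximation
            lh[i][j] = (a + b - c - d) / 2  # horizontal detail
            hl[i][j] = (a - b + c - d) / 2  # vertical detail
            hh[i][j] = (a - b - c + d) / 2  # diagonal detail
    return ll, lh, hl, hh


# A 4 x 4 block with alternating dark and bright rows (horizontal edges).
block = [[0, 0, 0, 0], [4, 4, 4, 4], [0, 0, 0, 0], [4, 4, 4, 4]]
ll, lh, hl, hh = haar_dwt2(block)
print(ll)  # [[4.0, 4.0], [4.0, 4.0]]
print(lh)  # [[-4.0, -4.0], [-4.0, -4.0]]
```

For a 4 × 4 input, each of the four sub-bands is 2 × 2, which is why keeping only LL reduces 16 values to 4.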

In the proposed work, during compression, a gray scale image of 256 × 256 pixels is first divided into 4,096 blocks, each of size 4 × 4. Every 4 × 4 block is converted into a 16-element vector, giving 4,096 vectors corresponding to the 4,096 blocks. These are given as input to the SOM network, which is trained using unsupervised batch weight/bias training with the 4,096 input patterns. After training, a weight matrix of size 16 × 256 is obtained, together with 4,096 indexes corresponding to the 4,096 input patterns. The 256 weight vectors, called code vectors, and the 4,096 indexes output by the trained SOM network are processed separately for encoding. Single-level Haar DWT is applied to the code vectors, and the 4,096 indexes are encoded using arithmetic encoding.
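The block-splitting step can be sketched as follows. The text does not state the order in which pixels are read out of a block or the order in which blocks are visited, so row-by-row (raster) order is assumed for both; `image_to_vectors` is a hypothetical helper name.

```python
def image_to_vectors(image, block=4):
    """Split a 2-D list of pixels into non-overlapping block x block tiles.

    Each tile is flattened row by row into a vector of block * block values.
    Assumption: tiles are visited in raster order (left to right, top to bottom).
    """
    rows, cols = len(image), len(image[0])
    vectors = []
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            vectors.append(
                [image[top + r][left + c] for r in range(block) for c in range(block)]
            )
    return vectors


# Synthetic 256 x 256 gray scale image: each pixel equals its column number.
image = [[c for c in range(256)] for _ in range(256)]
vectors = image_to_vectors(image)
print(len(vectors), len(vectors[0]))  # 4096 16
```

The resulting 4,096 vectors of 16 elements are what the SOM network would receive as training patterns.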

Each code vector is converted into a 4 × 4 block, and applying the DWT to it yields the LL (approximation coefficients), LH (horizontal coefficients), HL (vertical coefficients) and HH (diagonal coefficients) components. For each code vector, only the LL coefficients, of size 2 × 2, are stored, because the detail coefficients carry less important information and contribute little to the visual quality of the image. The floating-point approximation coefficients are quantized and encoded as integers. The arithmetic-encoded values of the 4,096 indexes are stored along with the encoded approximation coefficients in a binary file, which is the compressed form of the 256 × 256 input image. Figure 3 shows the steps involved in compression.

Fig. 3 Block diagram showing compression
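The text does not specify how the floating-point LL coefficients are quantized into integers. The sketch below shows one plausible choice, a uniform quantizer that maps each coefficient to an 8-bit integer; the step size of 2, the 0 to 255 clamp and the function names are all assumptions made for illustration, not the authors' stated scheme.

```python
def quantize_ll(ll, step=2.0):
    """Uniformly quantize LL coefficients to integers in 0..255.

    Assumption: coefficients are non-negative and `step` is chosen so that
    the largest expected coefficient maps into the 8-bit range.
    """
    return [[max(0, min(255, int(v / step + 0.5))) for v in row] for row in ll]


def dequantize_ll(quantized, step=2.0):
    """Map quantized integers back to approximate LL coefficients."""
    return [[q * step for q in row] for row in quantized]


ll = [[16.0, 509.2], [0.4, 255.0]]
q = quantize_ll(ll)
print(q)                 # [[8, 255], [0, 128]]
print(dequantize_ll(q))  # [[16.0, 510.0], [0.0, 256.0]]
```

With this choice, each code vector contributes four one-byte values to the compressed file, and the reconstruction error per coefficient is at most half the step size.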

During decompression, first the index part and LL part are read from the compressed binary file. Arithmetic decoding is performed over the indexes resulting in 4,096 indexes. LL components of the code vectors are dequantized. LH, HL and HH components are assigned zero as they have not been stored during compression. Inverse DWT is applied over LL, LH, HL and HH components of the 256 code vectors. Mapping the retrieved 4,096 indexes onto the code vectors results in 4,096 output vectors each with 16 elements. Each output vector is converted into 4 × 4 sub block which forms the decompressed image. The steps involved in the decompression process are shown in Fig. 4.

Fig. 4 Block diagram showing decompression
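When LH, HL and HH are set to zero, the inverse single-level Haar DWT reduces to spreading each LL coefficient evenly over its 2 × 2 cell. The sketch below assumes the orthonormal Haar scaling, under which LL = (a + b + c + d)/2 for a tile [[a, b], [c, d]], so each reconstructed pixel equals LL/2; the function name is hypothetical.

```python
def inverse_haar_ll_only(ll):
    """Inverse single-level 2-D Haar DWT with all detail sub-bands set to zero.

    Assumption: orthonormal scaling, so every pixel of a 2 x 2 cell
    is reconstructed as LL / 2 (the mean of the original cell).
    """
    size = 2 * len(ll)
    return [[ll[r // 2][c // 2] / 2 for c in range(size)] for r in range(size)]


# Reconstruct a 4 x 4 code vector block from its 2 x 2 LL sub-band.
block = inverse_haar_ll_only([[16.0, 8.0], [4.0, 2.0]])
for row in block:
    print(row)
# [8.0, 8.0, 4.0, 4.0]
# [8.0, 8.0, 4.0, 4.0]
# [2.0, 2.0, 1.0, 1.0]
# [2.0, 2.0, 1.0, 1.0]
```

Each reconstructed code vector is therefore piecewise constant over 2 × 2 cells, which is the detail lost by discarding LH, HL and HH.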

3 Proposed algorithm for compression and decompression

Algorithm for compression

  • Step 1: Split the image into blocks of size 4 × 4. Convert each block into a vector.

  • Step 2: Input the 4,096 vectors to SOM network and train the network. 16 × 256 weight matrix and 4,096 indexes are obtained as a result of training.

  • Step 3: Apply DWT over each 4 × 4 sub block of the weight matrix (code vector).

  • Step 4: Encode the 4,096 indexes using arithmetic encoding.

  • Step 5: Store the encoded indexes.

  • Step 6: Quantize only the LL components of code vector into integers.

  • Step 7: Store the encoded LL components of code vectors.
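For Steps 4 and 5, the size of the arithmetic-coded index stream can be estimated without implementing the coder, because an arithmetic coder driven by a static symbol model approaches the zero-order entropy of the stream. The sketch below computes that estimate; it is not an arithmetic coder, and the function name is hypothetical.

```python
import math
from collections import Counter


def estimated_coded_bytes(indexes):
    """Return (entropy in bits per index, estimated coded size in bytes).

    The estimate is the zero-order Shannon entropy times the stream length,
    which a static-model arithmetic coder approaches but does not beat.
    """
    counts = Counter(indexes)
    total = len(indexes)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return entropy, math.ceil(entropy * total / 8)


# Toy stream: 4,096 indexes that use only four code vectors equally often.
indexes = [0, 1, 2, 3] * 1024
print(estimated_coded_bytes(indexes))  # (2.0, 1024)
```

If all 256 code vectors were used equally often, the entropy would be 8 bits per index and the estimate would be the full 4,096 bytes, so the gain from arithmetic coding depends on how unevenly the indexes are distributed.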

Algorithm for decompression

  • Step 1: Read the 4,096 encoded indexes from the compressed binary file.

  • Step 2: Apply arithmetic decoding over the 4,096 indexes.

  • Step 3: Read the LL components of code vectors.

  • Step 4: Dequantize the LL components of the code vectors.

  • Step 5: Apply inverse DWT over the dequantized LL components by considering LH, HL, HH components as zeros and then reconstruct the 256 code vectors.

  • Step 6: Map the 4,096 indexes with the 256 reconstructed code vectors resulting in 4,096 output vectors each with 16 elements.

  • Step 7: Convert each 16 element output vector into 4 × 4 block to reconstruct the decompressed image.
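Steps 6 and 7 amount to a table lookup followed by tiling. The sketch below assumes that blocks are stored in raster order and that each code vector holds its 4 × 4 block row by row; neither ordering is stated in the text, and `vectors_to_image` is a hypothetical helper name.

```python
def vectors_to_image(indexes, codebook, width=256, block=4):
    """Rebuild an image by placing the code vector for each index as a tile.

    Assumptions: indexes are in raster order of blocks, and each code vector
    stores its block row by row.
    """
    blocks_per_row = width // block
    height = (len(indexes) // blocks_per_row) * block
    image = [[0] * width for _ in range(height)]
    for n, index in enumerate(indexes):
        top = (n // blocks_per_row) * block
        left = (n % blocks_per_row) * block
        code = codebook[index]
        for r in range(block):
            for c in range(block):
                image[top + r][left + c] = code[r * block + c]
    return image


# Toy example: an 8 x 8 checkerboard of 4 x 4 tiles from a 2-entry code book.
codebook = [[0] * 16, [255] * 16]
image = vectors_to_image([0, 1, 1, 0], codebook, width=8)
print(image[0])  # [0, 0, 0, 0, 255, 255, 255, 255]
print(image[4])  # [255, 255, 255, 255, 0, 0, 0, 0]
```

With the defaults, 4,096 indexes and a 256-entry code book yield a 256 × 256 image.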

4 Experimental results and discussion

The proposed work was applied to six images, namely lena, pepper, zelda, cameraman, girlface and goldhill. The Haar wavelet was chosen for its low computation time and memory efficiency. Several strategies were tested, such as changing the number of input neurons, changing the number of neurons in the Kohonen layer, and deciding how many detail coefficients could be skipped without storing them. After many trials, it was decided to apply the proposed method using a SOM with 16 input neurons and 16 × 16 neurons in the Kohonen layer. None of the detail coefficients are stored, as they contain less important information when the DWT is applied to the code vectors.

The input image was represented as a two-dimensional matrix. After applying SOM and DWT, the 4,096 indexes and the LL components of the 256 code vectors were stored in a binary file. The size of the image lena.bmp is 66,616 bytes; after compression, the binary file occupies 4,869 bytes. For any input image of size 256 × 256, only the encoded 4,096 indexes and the LL components of the 256 code vectors are stored. Hence the compressed binary file for any 256 × 256 image will be at most 5 KB. Table 1 shows the file sizes of the original images, the equivalent JPEG images and the compressed images.
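The 5 KB bound can be reproduced with a short calculation. It assumes one byte per index before arithmetic coding (8 bits address 256 code vectors) and one byte per quantized LL coefficient; the second figure is an assumption, since the text does not give the bit depth of the stored coefficients.

```python
# Upper bound on the compressed size for a 256 x 256 image.
# Assumption (not stated in the text): one byte per quantized LL coefficient.
NUM_BLOCKS = (256 // 4) * (256 // 4)  # 4,096 indexes, one per 4 x 4 block
NUM_CODE_VECTORS = 16 * 16            # 256 neurons in the Kohonen layer
LL_PER_CODE_VECTOR = 2 * 2            # LL sub-band of a 4 x 4 block

index_bytes = NUM_BLOCKS * 1          # 8 bits are enough to address 256 code vectors
ll_bytes = NUM_CODE_VECTORS * LL_PER_CODE_VECTOR * 1
upper_bound_bytes = index_bytes + ll_bytes

print(index_bytes, ll_bytes, upper_bound_bytes, upper_bound_bytes / 1024)
# 4096 1024 5120 5.0
```

Under these assumptions the uncoded payload is exactly 5 KB, and arithmetic coding of the indexes brings the reported lena file below that, to 4,869 bytes.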

Table 1 Original file size, JPEG file size and compressed file size

The binary files obtained using the proposed method are smaller than the files compressed using JPEG. The RMSE and PSNR between the original images and the decompressed images are tabulated in Table 2.

Table 2 RMSE and PSNR of decompressed images of the proposed method

Root mean square error is calculated as \( \text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (y_{i} - x_{i})^{2}}{N}} \), where \( y_{i} \) is the intensity of the ith pixel of the original image, \( x_{i} \) is the intensity of the ith pixel of the decompressed image, and \( N \) is the number of pixels in the image.

Peak signal to noise ratio is computed as \( \text{PSNR} = 20 \log_{10} \frac{255}{\text{RMSE}} \).

Compression ratio is calculated as \( \text{CR} = \frac{\text{Number of bits in the original image}}{\text{Number of bits in the compressed image}} \).
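The three measures can be computed directly from their definitions. The sketch below works on flat lists of pixel intensities; the function names are hypothetical, and returning infinity for identical images is a convention chosen here rather than something stated in the text.

```python
import math


def rmse(original, decoded):
    """Root mean square error between two equal-length pixel sequences."""
    n = len(original)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(original, decoded)) / n)


def psnr(original, decoded, peak=255):
    """Peak signal to noise ratio in dB; infinite when the images are identical."""
    error = rmse(original, decoded)
    return math.inf if error == 0 else 20 * math.log10(peak / error)


def compression_ratio(original_size, compressed_size):
    """Ratio of original to compressed size (bits or bytes, same unit for both)."""
    return original_size / compressed_size


# File sizes reported in the text for lena.bmp and its compressed file.
print(round(compression_ratio(66616, 4869), 2))  # 13.68
```

The ratio of the two reported file sizes reproduces the 13.68:1 compression ratio quoted for the Lena image.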

A review of published image compression results on the 256 × 256 Lena image revealed the following. According to Lu and Shin [4], the PSNR obtained using LBG is 29.40 dB, whereas classified vector quantization with KSOFM gives 30.03 dB. They classified the blocks into background, horizontal edge, vertical edge and diagonal edge blocks and applied KSOFM. Applying a pure Kohonen Self Organizing Map, Amerijckx et al. [5] obtained a PSNR of 24.74 dB at a compression rate of 25.22 and 24 dB at a compression rate of 38; DCT was applied over image blocks, followed by KSOM. With the Weighted Centroid Neural Network (WCNN) of Park and Woo [6], which was proposed to reduce edge degradation in reconstructed images, the PSNR obtained is 31.04 dB. Tsai et al. [9] achieved a PSNR of 34.016 dB using a new hierarchical SOM that splits LBG to speed up convergence. Compared with these four methods, the proposed method gives a better PSNR of 34.3802 dB. This is shown in Table 3.

Table 3 Comparison over existing techniques based on PSNR

DeVore et al. [11] stored 20,236 DWT coefficients in 14,604 bytes and 12,068 DWT coefficients in 8,925 bytes, whereas the proposed method requires 4,869 bytes for storing 1,024 coefficients and 4,096 indexes. With hierarchical partition priority wavelet image compression to retain maximum fidelity, Efstratiadis et al. [28] achieved a PSNR of 32 dB at 0.65 bpp (bits per pixel). At a compression ratio of 4:1, using SOFM and applying the cumulative distribution function (cdf), the PSNR of the Lena image according to Annadurai and Saro [7] is 28.91 dB. Laha et al. [29] achieved a PSNR of 28.47 dB at 0.218 bpp by applying Restricted Window Search with L2-SOM (RWS). Applying the Linked Significant Tree (LST) method, a wavelet image coding algorithm based on SPIHT, Muzaffar and Choi [30] obtained PSNRs of 24.58 dB at 0.1 bpp, 27.22 dB at 0.2 bpp, 28.96 dB at 0.3 bpp, 30.81 dB at 0.4 bpp and 31.87 dB at 0.5 bpp. Over the same Lena image, Chopade and Ghatol [31] used wavelet-based SPIHT and achieved 32.52 dB at 0.2 bpp. By SOM vector quantization on DWT coefficients with 256 Quadrature Amplitude Modulation (QAM), Chatellier et al. [21] obtained a PSNR of 31.34 dB. Considering the PSNR values of the different techniques at various bit rates, the proposed method outperforms them by achieving a PSNR of 34.3802 dB at 0.5954 bpp, as shown in Table 4.

Table 4 Comparison over existing techniques based on PSNR and bpp

Sanchez et al. [32] obtained a PSNR of 18.3135 dB at 0.5 bpp using an adaptive filter, whereas the proposed method achieves 34.3802 dB at 0.5954 bpp. With SPIHT and Embedded Block Coding with Optimized Truncation (EBCOT), Sudhakar et al. [33] obtained 26.81 dB at a compression ratio of 13.03 and 31.28 dB at a compression ratio of 6.57. Using curvelet, wedgelet and ridgelet transforms, Joshi et al. [34] achieved PSNRs of 18.3, 17.57 and 27.1 dB respectively over the Lena image at a compression factor of 7.49. The proposed method outperforms these methods by achieving a PSNR of 34.3802 dB at a compression ratio of 13.68:1 and 0.5954 bpp, as shown in Table 5.

Table 5 Comparison over existing techniques based on PSNR and CR

Grgic et al. [35] obtained a PSNR of 32.52 dB at a compression ratio of 10:1, whereas the proposed work achieves 34.3802 dB. Pandian and Anitha [22] used spatial quantization and achieved a PSNR of 36.16 dB at a compression ratio of 10.05:1. By designing a neuro-wavelet based code book with a vector quantizer using SOM and a neural network, Dandawate et al. [25] obtained a PSNR of 35.89 dB with a compressed file size of 15.2258 KB. In the proposed method, the compressed file size is only 4.75 KB and the PSNR achieved is 34.3802 dB at a compression ratio of 13.68:1, so the proposed work attains a higher compression ratio than these two methods at a somewhat lower PSNR. Table 6 shows the bit rate achieved by the proposed work. The results show that the proposed work gives a good compression rate with an acceptable loss in image quality. Table 7 shows the entropy of the decompressed images and Table 8 gives the compression ratio obtained.

Table 6 Bit rate of compressed images
Table 7 Entropy of decompressed images
Table 8 Compression ratio

Figure 5 shows the original Lena image and the Lena image decompressed by the proposed method. The visual quality of the decompressed image is considered acceptable because the PSNR value is greater than 30 dB [36]. The work has similarly been applied to the other five images, and the proposed method performed well on each of them.

Fig. 5 Original image lena256.bmp and decompressed lena image lenasdrecons.bmp

5 Conclusions

A method for compressing images using a hybrid of the SOM neural network and the wavelet transform has been proposed and implemented successfully over 256 × 256 gray scale images. The novelty of this hybrid work lies in applying the DWT to the code vectors obtained from the SOM after vector quantization and storing only the approximation coefficients. The proposed method was tested over images of different sizes, and compression and decompression were observed to perform well. Both vector quantization using SOM and the wavelet transform, as used here, are lossy compression techniques. The size of the image file is reduced to a considerable extent and the decompressed images are visually acceptable. Since the compressed binary file contains only the indexes and the LL components of the code vectors, the size of the file is reduced and reconstruction is faster than compression. The experimental results show that this hybrid method leads to improved performance measures, namely peak signal to noise ratio and bits per pixel, over existing techniques. The PSNR values are greater than 30 dB, showing that the decompressed images are acceptable [36].