Keywords

1 Introduction

In the past few years, development of multimedia products has increased. The need of high storage capacity of storage devices to store the graphics files can be accomplished by encoding of graphics files, which reduces the size of number of bits as much as possible. Image encoding reduces the size of file without hampering the quality of images in order to store and transmit in an efficient manner. Image encoding is broadly classified in two kinds: lossy and lossless encoding. A lossless method gives an identical image to the original image after decoding, whereas lossy will provide an image that somewhat resembles the original one after decoding process. Fractal image encoding is a lossy image compression technique. This approach uses self-similarity characteristic in natural image to perform the encoding. It improves the compression ratio and PSNR value with some acceptable loss of data to maintain the quality of an image. However, original process of coding is very time-consuming. The computation of similarity measure can be reduced using a robust search method that reduces computations and also coding time.

There have been various approaches to overcome the limitation of fractal image encoding in the past few decades. Some of the methods were based on classification of image blocks [13]. Many algorithms have been designed with different search strategy to reduce the region of interest [46]. Techniques developed to determine the best domain block for the range blocks of image for further transformation to encode and decode the image [79]. Antonioni et al. [10] proposed a hybrid approach using wavelet transform and vector quantization to give progressive transmission of an image to receiver to obtain the picture at a faster rate with minimum cost. Fisher [11] described the whole fractal encoding implementation in detail. Saupe [4] gave the nearest neighbor-based fractal encoding algorithm which speed up the encoder. Fu [1] introduced the fractal encoding based on classification in wavelet domain to overcome the limitation of basic fractal algorithm. Truong [12] developed a fast encoding algorithm by reducing MSE computations. Tong [5] made some improvements in Saupe’s [4] nearest neighbor approach to obtain high PSNR value and compression ratio with less encoding time. Iano et al. [2] presented a fast hybrid fractal-wavelet approach which used Fisher’s domain classification to apply at low-pass subband of wavelet-transformed image and modified SPIHT reduces 94 % of encoding and decoding time. Zhou et al. [7] anticipated an improved search scheme using image feature for best matching of domain and range blocks with a bit reduction in PSNR value. Fu and Zhu [3] introduced a discrete cosine transform (DCT) lower coefficient-based classification of range and domain blocks with high compression ratio and good reconstruction fidelity. Yu et al. [13] gave a segmentation-based approach to overcome the drawback of high encoding time. Wu and Lin [14] derived a genetic algorithm-based hybrid select approach to speed up the fractal encoder with little decay in image quality. Lin and Wu [15] improved the speed of fractal encoder using edge property-based neighborhood search in DCT domain with 0.9 dB decay in image quality. Further, he [8] used particle swarm optimization as search method under discrete wavelet transform (DWT) classification and dDihedral transformations to reduce the mean square error (MSE) computations. Wei et al. [16] improved the traditional fractal encoding by using only diagonal data to calculate MSE to speed up the encoder. de Quadros Gomes [17] introduced parallel modeling of fractal image encoding using dual or quadcore machines to get the encoded image in feasible time. Shiping and Lijing [9] used ortho difference sum-based method to restrict the domain for best matching and to reduce the encoding time. Zhang et al. [6] also used neighborhood search-based approach to save plenty of time to encode by using sub-block subtraction to limit the region for search. Nodehi et al. [18] gave an intelligent fuzzy approach to improve the encoding speed. There have been various approaches introduced by researchers to overcome the time consumption of fractal encoder and Du et al. [19] made a contribution using different a approach based on quantum mechanics to reduce the computational complexity.

1.1 Basic Fractal Image Encoding

Fractal image encoding is based on the natural image property of self-similarity. It decomposes the image using various techniques to determine the range blocks and domain blocks and then map these blocks on the basis of self-similarity of image blocks. M. F. Barnsley and Demko introduced the theory of iterated function systems IFS to define the natural objects. IFSs (iterated function systems) are the unified mathematical model to generate the parts of image that are similar to the whole image. Decoding of image is faster when compared to encoding.

The basic theory of fractal encoding based on IFS was introduced by Barnsley and Demko [20]. It was followed by the development of automated algorithm for fractal image encoding based on partitioned IFS (PIFS). Jacquin [21] exploits the restrictions of local self-similarity of image by dividing it into a number of subblocks. He suggested fractal block-based encoding technique for compressing the image using lesser memory and faster encoding. The algorithm for basic fractal encoding is as follows:

  1. 1.

    Partition the given image into mutually disjoint portions \(R_{i} ,\) i.e., \(R_{i} \, \cap \,R_{j} = \emptyset (i \ne j)\).

  2. 2.

    An equal number of other image portions are generated known as domain blocks \(D_{j} ,\) which can overlap each other and double in size than that of range blocks to follow the contractive mapping fixed point theorem.

  3. 3.

    For each range block \(R_{i} ,\) affine transformations are made to find the most similar domain block \(D_{j} ,\) and store affine transformation parameters as the iterative parameter.

  4. 4.

    The image can be reconstructed using iterative parameters. In general, iterate for 5–8 times to produce the encoded image.

2 Image Block Transformation Techniques

In this section, three approaches based on neighborhood search are explained. All the algorithms describe three different domains for the search method. The first method uses spatial domain to find the best matching pair of range and domain blocks using orthogonal projection and pre-quantization, which reduces the problem to nearest neighbor search of transformed domain blocks for transformed blocks. The DCT-based approach provided frequency domain to gather all range and domain blocks with similar edge properties to reduce the area of search. However, DWT reduces the dihedral transformations of blocks by classifying them into different regions of same class to which they belong.

2.1 Neighborhood Search Strategy in Spatial Domain

In traditional fractal image encoding, the encoding scheme is time-consuming due to the search of appropriate domain block for each range block of an image. It uses linear search to determine the best domain blocks. This strategy suffers from redundancy present in the search method. To accelerate the search for domain, Saupe introduced a novel approach to find the best domain–range pair by using Euclidean space of feature vectors of domains and ranges. It reduces the search complexity from linear time to logarithmic time [4].

Saupe [4] derived the theorem stating that least square error E(R, D) is proportional to function g of Euclidean distance between normalized projections \(\emptyset \left( R \right)\) and \(\emptyset (D)\) is defined as:

$$E(R,D) = \left\| {R - \bar{r}I} \right\|^{2} g(\Delta (R,D))$$
(1)

where \(\emptyset (.)\) is the normalized projection operator, \(g(\Delta ) =\Delta \sqrt {1 -\Delta ^{2} /4}\) and \({\Delta (}R,D\text{)} = \hbox{min} \left( {\left\| {{\emptyset }(R) + {\emptyset }(D)} \right\|,\left\| {{\emptyset }(R) - {\emptyset }(D)} \right\|} \right).\)

Since the function g(∆) is monotonically increasing in the domain interest and has the essence of the new form for the distortion, in the domain pool Ω minimisation of E(R, D) is equivalent to minimizing ∆ (R, D). Now nearest neighbor search is done \({\emptyset (}R\text{)}\) in the set \(\left\{ { \pm \emptyset (D):D \in\Omega } \right\}\). In between the transformed domain blocks \({\emptyset }(D)\) and transformed range block \({\emptyset }(R)\) problem is now converted from domain range matching to nearest neighbor problem.

The brute-force linear search has \(O\left( {N_{D} } \right)\), there are many algorithms and data structures of nearest neighbor research. Many variants of the k-d tree algorithm proposed by Arya et al. are adopted by Saupe [4]. \(O(N_{D} \log N_{D} )\) Operations are required to build a tree data structure where domain blocks after transformation is done, before search. There is a degradation in the time complexity from \(O\left( {N_{D} } \right)\) to \({\text{O}}(\log N_{D} )\) [5].

2.2 Neighborhood Search Method Using Discrete Cosine Transform (DCT)

Lin and Wu [15] introduced a novel and efficient approach based on neighborhood region search method using the edge property in DCT frequency domain. In this approach, the two-dimensional coordinate system is prepared using lowest frequency coefficients of DCT to gather all domain and range blocks with same edge property. The two lowest DCT coefficients are used to build coordinate system of frequency domain, i.e., F(1, 0) and F(0, 1). The quantity F(p, q) can be defined as

$$F(p,q) = \frac{2}{N}C_{p} C_{q} \sum\limits_{i = 0}^{N - 1} {\sum\limits_{j = 0}^{N - 1} {f(i,j)} } \cos \left( {\frac{(2i + 1)p\pi }{2N}} \right)\cos \left( {\frac{(2j + 1)q\pi }{2N}} \right)$$
(2)

where f(i, j) is the image block of size N × N and \(p,q = 0,1,2,3, \ldots ,N - 1\) and,

$$C_{m} = \left\{ {\begin{array}{*{20}l} {\frac{1}{\sqrt 2 },} \hfill & { {\text{if}}\,\,m = 0} \hfill \\ {1,} \hfill & { {\text{otherwise}}} \hfill \\ \end{array} } \right.$$
(3)

The algorithm for the neighborhood search-based fractal encoding in DCT domain is explained as follows:

For each image block, both the DCT coefficients must be calculated and their absolute values constitute a pair \(\left( {\left| {F(1,0)} \right|,\left| {F(0,1)} \right|} \right)\) represents an image block. It was assumed that \(\left( {\left| {F^{*} (1,0)} \right|,\left| {F^{*} (0,1)} \right|} \right)\) is the element with maximal norm. In Fig. 1, \(\left( {\left| {F^{*} (1,0)} \right|s,\left| {F^{*} (0,1)} \right|s} \right)\) denote the scaled element with maximum norm. Therefore, all the ranges and domain blocks are further mapped into this unit loop in the first quadrant. All the similar edge-shaped blocks are gathered together in this quadrant and the two nearest blocks will have a highly similar edge shape.

Fig. 1
figure 1

Classification of image blocks in frequency domain (Source Computers and mathematics with applications 62, p. 313)

After completion of the nearest neighborhood search for image blocks, these blocks can be classified into two different regions, \(\Omega _{1 }\) and \(\Omega _{2}\) as shown in Fig. 2. Begin with the initialization of spanned phase angle \(\hat{\theta }\) of \({\emptyset } = \theta_{j}\) and positive constant value c. In Fig. 2, the d j is the distance between the origin and any range block v j , the straight line connecting origin and d j is given as \({\emptyset } = \theta_{j} ,\) where \(\theta_{j} = \tan^{ - 1} \left( {\left| {F(0,1)} \right|s/\left| {F(1,0)} \right|s} \right),\left| {F(1,0)} \right|s > 0\). All these notations must be calculated and should be used further to check whether search space contains any domain block or not. If it does not have any domain block then \(\hat{\theta } = \hat{\theta } +\Delta \theta\) and \(c = c +\Delta c,\) otherwise determine the optimal domain from search space and store it as fractal code for range block v j.

Fig. 2
figure 2

Search space for fractal encoding (Source Computers and mathematics with applications 62, p. 314)

2.3 Neighborhood Search Method Using Discrete Wavelet Transform (DWT)

Lin and Chen [8] introduced further improvements in their previous works to overcome the drawback of high time consumption in the encoding phase. In this approach, the particle swarm optimization has been used as a block search mechanism to collect the domain and range blocks for further classification and transformation. The particle swarm optimization worked as a neighborhood search algorithm as discussed earlier in this section. After completing the search mechanism, the blocks are classified using discrete wavelet transform in different three regions, i.e., S, H, and D regions as shown in Fig. 3, where S denotes the smooth, H denotes the horizontal/vertical, and D denotes diagonal/sub-diagonal regions. On the basis of texture characteristics of image blocks and position of normalized DWT pair, the blocks can be classified in any of these regions.

Fig. 3
figure 3

Partition search space for DWT classification (Source JISE 28, p. 23)

After classification of all the blocks, the dihedral transformations are performed on domain blocks for each range block to find the best matching pair using dihedral property shown in Fig. 3. The MSE computations are completed after the transformations are performed according to the classification. The algorithm used for fractal image encoding using neighborhood search in DWT domain can be summarized as follows:

  1. 1.

    Initialize the swarm size and all the parameters of particle.

  2. 2.

    For each image block, check whether they belong to same class or not. If yes, then go to step 3. Otherwise, go to step 5.

  3. 3.

    For each block of same class, check whether they are in the same region (S, H, or D) or not. If yes, then apply the first four dihedral transformations on them. Otherwise, the last four transformations will be performed.

  4. 4.

    Calculate MSEs after performing the transformations.

  5. 5.

    Update the position and velocity of each particle.

  6. 6.

    If stopping criterion has been reached, then stop. Otherwise, go to step 2.

3 Experiments and Results

In this section, all the three neighborhood search-based approaches discussed in previous section are compared. All these approaches are analyzed on the basis of four parameters, i.e. PSNR value, compression ratio, MSE computations and CPU time for encoding and decoding. The results of comparison are as follows:

3.1 Results of Neighborhood Search in Spatial Domain

In this approach, the PSNR value and the compression ratio varies according to the variation in range sensitivity and block size of range cells which is shown in the Tables 1, 2 and 3. The graphical representation for the factors using miscellaneous image dataset from [22], i.e., PSNR value, computation time, and compression ratio to define the image quality are shown in Figs. 4, 5 and 6.

Table 1 Analysis of four parameters with respect to block size of minimum range cells
Table 2 Analysis of four parameters with respect to block size of maximum range cells
Table 3 Analysis of three different images using evaluation parameters
Fig. 4
figure 4

PSNR values corresponding to image dataset

Fig. 5
figure 5

Compression time values corresponding to image dataset

Fig. 6
figure 6

Compression ratio values corresponding to image dataset

3.2 Results of Neighborhood Search in DCT Domain

The PSNR value and MSE computations are calculated using the variable \(\hat{\theta }\) and c which are randomly initialized with some value is illustrated in Table 4.

Table 4 Edge-based neighborhood search method using DCT for Lena image

3.3 Results of Neighborhood Search in DWT Domain

In Table 5, the neighborhood search in DWT domain is used to process the encoding for three different images on the basis of PSNR value, MSE computations and the CPU time.

Table 5 Evaluation of three different images using neighborhood search in DWT domain

4 Conclusion

In this paper, three major neighborhood fractal image compression approaches are observed and analyzed. Parameters used to compare and analyze are mentioned below:

  1. 1.

    PSNR value

  2. 2.

    Compression ratio

  3. 3.

    MSE computations

  4. 4.

    Computational time.

The neighborhood search strategy used in spatial domain reduces the encoding time from linear time to logarithmic time. This method reduces the search space by using Euclidean space of feature vectors of domain. It eventually provided multi-dimensional neighborhood search. The same principle was followed by neighborhood search in DCT-based lowest frequency domain by reducing the search space to gather image blocks with similar edge shapes. In addition to spatial domain-based approach, it take less time to map the range blocks and domain blocks which consecutively consumes lesser time to encode image. The discrete wavelet-based classification used particle swarm optimization to find the best domain block in global domain which considers only the same type of range and domain blocks. After classification, only four dihedral transformations are made for domain blocks which reduce the encoding time. All the three algorithms gave high PSNR value, high compression ratio and reduced MSE computations with little decay in image quality which is acceptable. For references, follow the given guidelines.