1 Introduction

Unlike conventional image compression techniques, fractal image coding is characterized by its novel idea, potentially high compression ratio, fast decoding, resolution independence and predictable quality of decoded images [1,2,3,4]. Thus, it has attracted much attention from researchers worldwide. However, it suffers from high computational complexity in the encoding process. To overcome this problem, some researchers perform block matching in a feature space [5,6,7]. Owing to the low dimension of the selected features, the encoding process can be finished in a short time. To accelerate the encoding process further, no-search fractal image encoding has also been proposed, at the expense of the quality of decoded images [8].

In our research, we find that the range blocks with large variances play the dominant role in determining the quality of decoded images, whereas the effect of the remaining range blocks can be ignored. Thus, we design an extended domain block pool (EDBP) for the range blocks with large variances. The EDBP reduces the collage errors effectively, so the quality of decoded images can be expected to improve; however, it increases both the computational complexity of the encoding process and the bits per pixel (Bpp). To accelerate the encoding process and reduce the Bpp, we adopt the no-search fractal coding method for the remaining range blocks with small variances. Finally, two fractal encoding methods, Jacquin's and He's [1, 6], are used as baselines to assess the performance of the proposed method. Experiments show that, compared with these previous methods, the proposed method simultaneously achieves shorter encoding time, fewer Bpp and better quality of decoded images.

This paper is organized as follows: Conventional fractal image coding is reviewed in Sect. 2. In Sect. 3, a hybrid fractal encoding method is proposed: by analyzing the importance of variances, we encode the range blocks with larger variances over an extended domain block pool and the remaining range blocks with no-search fractal image coding. In Sect. 4, the experimental procedures and the performance of the proposed method are presented and discussed. Finally, conclusions are given in Sect. 5.

2 Conventional Fractal Image Coding

Fractal image coding establishes an iterated function system (IFS) whose fixed point approximates the input image well. It mainly consists of a series of block-matching operations between domain blocks and range blocks. The range blocks are obtained by dividing the input image uniformly into nonoverlapping B × B blocks. The domain blocks are obtained by sliding a 2B × 2B window over the input image from left to right and from top to bottom; generally, the sliding step is set to 2B. To make the domain blocks match the range blocks well, the domain blocks are first contracted to the size of the range blocks and then extended with eight isometric transformations. After performing affine transformations on the contracted domain blocks, the best-matched domain block of an arbitrary range block can be found by minimizing the following function:

$$ \mathrm{CE}(\mathbf{R}) = \min_{\alpha, \beta} \left\| \mathbf{R} - \alpha \mathbf{D} - \beta \mathbf{I} \right\|^{2} $$
(1)

where I denotes a matrix whose elements are all ones, α and β denote the scaling and offset coefficients of the affine transformation, respectively, CE(R) is the collage error of the range block R, and D denotes the associated best-matched domain block. To minimize Eq. (1), we set its derivatives with respect to α and β to zero, and the optimal α and β can be computed as

$$ \alpha = \frac{\left\langle \mathbf{R} - \bar{r}\mathbf{I},\ \mathbf{D} - \bar{d}\mathbf{I} \right\rangle}{\left\| \mathbf{D} - \bar{d}\mathbf{I} \right\|^{2}}, \qquad \beta = \bar{r} - \alpha \bar{d} $$
(2)

where \( \langle \cdot , \cdot \rangle \) denotes the inner product, and \( \bar{r} \) and \( \bar{d} \) are the mean values of R and D, respectively. Figure 1 illustrates the above fractal encoding process.
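
For concreteness, the per-block computation of Eqs. (1) and (2) can be written in a few lines of NumPy. This is a minimal sketch rather than the paper's MATLAB implementation; the function name and array representation are our own:

```python
import numpy as np

def collage_error(R, D):
    """Collage error CE(R) of a B x B range block R against a
    contracted domain block D, together with the optimal affine
    coefficients of Eq. (2). Returns (ce, alpha, beta)."""
    r_bar, d_bar = R.mean(), D.mean()
    Rc, Dc = R - r_bar, D - d_bar             # zero-mean blocks
    denom = np.sum(Dc * Dc)                   # ||D - d_bar * I||^2
    alpha = np.sum(Rc * Dc) / denom if denom > 0 else 0.0
    beta = r_bar - alpha * d_bar              # Eq. (2)
    ce = np.sum((R - alpha * D - beta) ** 2)  # Eq. (1)
    return ce, alpha, beta
```

In a full encoder this function is evaluated for every candidate domain block, and the candidate with the smallest returned error is kept.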

Fig. 1. Illustration of the fractal encoding process.

In the decoding phase, an arbitrary image can be selected as the initial image. The same transformations as those in the encoding process are applied to the initial image recursively. After about ten iterations, the decoding process converges to the final fixed-point image. Take the 256 × 256 Lena image as the input image, for example. If the 256 × 256 Goldhill image shown in Fig. 2a is selected as the initial image, the first five iteration images of the decoding process are illustrated in Fig. 2b–f, respectively.
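
A hedged sketch of this decoding iteration follows; the code-book layout, a dict mapping each range-block origin to its stored domain-block origin and affine coefficients, is our own assumption, and the eight isometries are omitted for brevity:

```python
import numpy as np

def contract(block):
    """Average every 2 x 2 pixel group to halve each dimension."""
    return 0.25 * (block[0::2, 0::2] + block[0::2, 1::2] +
                   block[1::2, 0::2] + block[1::2, 1::2])

def decode(codes, shape, B=4, n_iter=10):
    """Run the decoding iteration from an arbitrary initial image.
    `codes` maps each range-block origin (i, j) to its stored
    parameters (di, dj, alpha, beta)."""
    img = np.zeros(shape)                     # any initial image works
    for _ in range(n_iter):
        out = np.empty_like(img)
        for (i, j), (di, dj, alpha, beta) in codes.items():
            D = contract(img[di:di + 2 * B, dj:dj + 2 * B])
            out[i:i + B, j:j + B] = alpha * D + beta
        img = out
    return img
```

Because each affine map is contractive, the iteration converges to the same fixed point whatever the initial image, which is why an unrelated image such as Goldhill can serve as the starting point in Fig. 2.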

Fig. 2. a Initial image. b–f First five iteration images in the decoding process.

3 Proposed Method

From Sect. 2, we know that the encoding process spends the same encoding time and memory space on each range block, yet the quality of the decoded range blocks differs greatly. In our research, we find that the range blocks with larger variances play a more important role in the degradation of decoded range blocks. Thus, a larger domain block pool is designed for them, yielding better quality of decoded images at the expense of longer encoding time and a lower compression ratio. For the range blocks with small variances, adopting the no-search fractal coding method yields a higher compression ratio and shorter encoding time at the expense of slightly worse quality of decoded images. Overall, for the whole input image, the encoding time, compression ratio and quality of decoded images can all be expected to improve.

In the fractal encoding process, the range blocks are encoded one by one. We define the accumulated collage error (ACE) and accumulated variance (AVAR) as follows:

$$ \mathrm{ACE} = \sum_{i=1}^{\mathrm{Num1}} \mathrm{CE}(\mathbf{R}_{i}), \qquad \mathrm{AVAR} = \sum_{i=1}^{\mathrm{Num1}} \mathrm{VAR}(\mathbf{R}_{i}) $$
(3)

where Num1 denotes the number of range blocks coded so far. Substituting Eq. (2) back into Eq. (1), we get

$$ \mathrm{CE}(\mathbf{R}) = \left\| \mathbf{R} - \alpha \mathbf{D} - \beta \mathbf{I} \right\|^{2} = \left\| \mathbf{R} - \bar{r}\mathbf{I} \right\|^{2} - \alpha^{2} \left\| \mathbf{D} - \bar{d}\mathbf{I} \right\|^{2} \le \left\| \mathbf{R} - \bar{r}\mathbf{I} \right\|^{2} $$
(4)
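
To see why the middle equality of Eq. (4) holds, write \( \mathbf{R}_{c} = \mathbf{R} - \bar{r}\mathbf{I} \) and \( \mathbf{D}_{c} = \mathbf{D} - \bar{d}\mathbf{I} \). Since \( \beta = \bar{r} - \alpha\bar{d} \),

$$ \left\| \mathbf{R} - \alpha \mathbf{D} - \beta \mathbf{I} \right\|^{2} = \left\| \mathbf{R}_{c} - \alpha \mathbf{D}_{c} \right\|^{2} = \left\| \mathbf{R}_{c} \right\|^{2} - 2\alpha \left\langle \mathbf{R}_{c}, \mathbf{D}_{c} \right\rangle + \alpha^{2} \left\| \mathbf{D}_{c} \right\|^{2}, $$

and substituting \( \alpha = \left\langle \mathbf{R}_{c}, \mathbf{D}_{c} \right\rangle / \left\| \mathbf{D}_{c} \right\|^{2} \) from Eq. (2) collapses the last two terms into \( -\alpha^{2} \left\| \mathbf{D}_{c} \right\|^{2} \).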

From Eq. (4), we can see that for an arbitrary range block R, its variance provides an upper limit on its collage error, so range blocks with large collage errors must have correspondingly large variances. Thus, only the range blocks with large variances can lead to large collage errors, and from Eqs. (3) and (4) we have

$$ \mathrm{ACE} \le \mathrm{AVAR}, \quad \text{in particular} \quad \mathrm{ACE}^{\mathrm{All}} \le \mathrm{AVAR}^{\mathrm{All}} $$
(5)

where \( \mathrm{ACE}^{\mathrm{All}} \) and \( \mathrm{AVAR}^{\mathrm{All}} \) denote the ACE and AVAR of all range blocks, and NumR denotes the total number of range blocks. Furthermore, we can calculate the average collage error (ACER) as follows:

$$ \mathrm{ACER} = \frac{\mathrm{ACE}^{\mathrm{All}}}{\mathrm{NumR}} = \frac{1}{\mathrm{NumR}} \sum_{i=1}^{\mathrm{NumR}} \mathrm{CE}(\mathbf{R}_{i}) $$
(6)

In our research, we adopt the peak signal-to-noise ratio (PSNR) to measure the quality of decoded images as follows:

$$ \mathrm{PSNR} = 10 \log_{10} \left( \frac{255^{2}}{\frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( \mathbf{f}^{\mathrm{Original}}(i,j) - \mathbf{f}^{\mathrm{Decoded}}(i,j) \right)^{2}} \right) $$
(7)

where \( \mathbf{f}^{\mathrm{Original}} \) and \( \mathbf{f}^{\mathrm{Decoded}} \) denote the original image and the decoded image, respectively, and M and N are the height and width of the images. Moreover, the logarithmic relationship between the average collage error (ACER) and the peak signal-to-noise ratio (PSNR) of the decoded image can be formulated as

$$ f\left( x \right) = \beta_{1} + \beta_{2} \log_{10} \left( x \right) $$
(8)

where \( \beta_{1} \) and \( \beta_{2} \) are constant coefficients, and x and f(x) denote the ACER and the PSNR of the decoded image, respectively.
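
A direct transcription of Eq. (7), shown here as a small helper assuming 8-bit grayscale arrays (the function name is ours):

```python
import numpy as np

def psnr(original, decoded):
    """PSNR of Eq. (7) for 8-bit M x N images."""
    diff = original.astype(float) - decoded.astype(float)
    mse = np.mean(diff ** 2)                  # (1 / (M*N)) * sum of squares
    return 10.0 * np.log10(255.0 ** 2 / mse)
```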

We take the 512 × 512 Lena, Boat, Zelda and Mandrill images as examples to illustrate the importance of the variances of range blocks quantitatively. In the fractal encoding process, we sort the range blocks by their variances from largest to smallest and then encode the range blocks one by one in that order. The actual percentage of accumulated collage error (APACE) is then

$$ \mathrm{APACE} = \frac{\mathrm{ACE}}{\mathrm{ACE}^{\mathrm{All}}} = \frac{\sum_{i=1}^{\mathrm{Num1}} \mathrm{CE}(\mathbf{R}_{i})}{\sum_{i=1}^{\mathrm{NumR}} \mathrm{CE}(\mathbf{R}_{i})} $$
(9)

where Num1, as in Eq. (3), is the number of range blocks encoded so far.
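
The APACE curve of Eq. (9) is easy to reproduce once per-block collage errors and variances are available; a minimal sketch (the array names are ours):

```python
import numpy as np

def apace_curve(collage_errors, variances):
    """APACE of Eq. (9): sort range blocks by variance, largest first,
    and accumulate their collage errors as a fraction of the total."""
    order = np.argsort(variances)[::-1]       # largest variance first
    ce_sorted = np.asarray(collage_errors)[order]
    return np.cumsum(ce_sorted) / ce_sorted.sum()
```

Reading this curve at the 50% mark yields the percentages discussed in connection with Fig. 3 below.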

From Fig. 3, we can see that in the sorted range-block sequence, the first 50% of the range blocks contribute 91.57%, 89.06%, 79.72% and 90.74% of the collage errors for Lena, Boat, Zelda and Mandrill, respectively. This verifies the importance of the range blocks with large variances. Thus, if we can reduce the collage errors of the range blocks with large variances, \( \mathrm{ACE}^{\mathrm{All}} \) will decrease effectively, and then ACER can be expected to decrease as well. According to Eqs. (7) and (8), the PSNR of the decoded images will then improve.

Fig. 3. APACE versus area for Lena, Boat, Zelda and Mandrill.

We set a threshold σ on the variances of range blocks, dividing all range blocks into two categories: those with large variances and those with small variances. For the former, the best-matched domain blocks are searched in an extended domain block pool. Besides the conventional domain block pool (DBP) of Sect. 2, we design a supplementary domain block pool (SDBP), constructed in two steps: (1) slide a 4B × 4B window over the input image from left to right and from top to bottom, with the sliding step set to 4B; (2) contract each such block to size B × B by averaging every 4 × 4 pixels, and perform the eight isometric transformations on the results. As stated in Sect. 2, for an arbitrary range block with a large variance, the best-matched domain block is searched in both DBP and SDBP. Compared with the conventional method of Sect. 2, the collage errors can thus be expected to decrease effectively, at the expense of more encoding time and memory space.

For the latter category, as illustrated in Fig. 4, the domain block that shares its center with the range block is appointed directly as the best-matched domain block. Although this no-search rule leads to larger collage errors, Eq. (4) shows that a small variance places a small upper limit on the collage error, so the collage errors of these blocks remain small, while the no-search method provides shorter encoding time and less memory space. In addition, since \( \beta_{2} \) in Eq. (8) is negative, smaller collage errors imply better decoded image quality. Finally, for the whole image, we can expect shorter encoding time, less memory space and better quality of decoded images.
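
The two ingredients just described, SDBP construction and no-search pairing, might be sketched as follows (the border clipping and the omission of the eight isometries are our own simplifications):

```python
import numpy as np

def build_sdbp(img, B=4):
    """Supplementary domain block pool: slide a 4B x 4B window with
    step 4B and contract each block to B x B by 4 x 4 averaging."""
    H, W = img.shape
    pool = []
    for i in range(0, H - 4 * B + 1, 4 * B):
        for j in range(0, W - 4 * B + 1, 4 * B):
            block = img[i:i + 4 * B, j:j + 4 * B]
            # 4 x 4 averaging: view the block as a (B, 4, B, 4) grid of cells
            pool.append(block.reshape(B, 4, B, 4).mean(axis=(1, 3)))
    return pool

def no_search_domain(img, i, j, B=4):
    """No-search pairing: the 2B x 2B domain block sharing the range
    block's center, clipped at the image border."""
    ci = min(max(i - B // 2, 0), img.shape[0] - 2 * B)
    cj = min(max(j - B // 2, 0), img.shape[1] - 2 * B)
    return img[ci:ci + 2 * B, cj:cj + 2 * B]
```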

Fig. 4. Range block and its associated best-matched domain block.

4 Experiments

In this section, four 256 × 256 images, Lena, Camera, Couple and Zelda, are used. The size of the range blocks and the threshold σ on the variances of range blocks are set to 4 and 8, respectively. The size of the domain blocks and the search step δ are both set to 8. The scaling and offset coefficients, α and β, are quantized with 5 and 7 bits, respectively. The experiments are carried out on a Pentium Dual-Core 2.93 GHz PC and programmed in MATLAB. We compare Jacquin's and He's methods with the proposed method in terms of bits per pixel (Bpp), encoding time and quality of decoded images [1, 6]. Based on the discussions in Sects. 2 and 3, we design the encoding procedure as follows (a sketch of the full procedure appears after the steps):

Step 1: Given an input image, divide it uniformly into B × B range blocks. Calculate the variance of each range block and sort the range blocks by variance from largest to smallest.

Step 2: Slide a 2B × 2B window over the input image from left to right and from top to bottom, with the sliding step set to δ. After contracting the domain blocks and performing the eight isometric transformations on them, the domain block pool (DBP) is obtained.

Step 3: Slide a 4B × 4B window over the input image as in Step 2, with the sliding step set to 4B. Contract the domain blocks to the size of the range blocks and then perform the eight isometric transformations on them; this yields the supplementary domain block pool (SDBP).

Step 4: For the range blocks with large variances, the best-matched domain blocks are searched within both DBP and SDBP. For each such range block, one bit labels which pool, DBP or SDBP, its best match comes from, so an extra label vector V1 is needed. The remaining range blocks, with small variances, are encoded with the no-search fractal encoding method, and another label vector V2 of length NumR distinguishes the two kinds of range blocks.
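
Putting Steps 1 to 4 together, the hybrid encoder might look as follows. This is a hedged outline, not the paper's MATLAB code; it reuses collage_error, contract and no_search_domain from the earlier sketches and treats a block's mean squared deviation as its variance:

```python
import numpy as np

def iter_range_blocks(img, B):
    """Yield ((i, j), block) over the nonoverlapping B x B partition."""
    for i in range(0, img.shape[0], B):
        for j in range(0, img.shape[1], B):
            yield (i, j), img[i:i + B, j:j + B]

def encode_hybrid(img, dbp, sdbp, B=4, sigma=8.0):
    """Steps 1-4 in outline. `dbp` and `sdbp` are lists of contracted
    B x B candidate blocks; V1 and V2 are the Step 4 label vectors.
    The Step 1 sorting is skipped: only the large/small split matters."""
    codes, V1, V2 = [], [], []
    for (i, j), R in iter_range_blocks(img, B):
        if R.var() > sigma:                    # large variance: full search
            V2.append(1)
            ce, alpha, beta, k, p = min(
                ((*collage_error(R, D), k, p)
                 for p, pool in enumerate((dbp, sdbp))
                 for k, D in enumerate(pool)),
                key=lambda t: t[0])            # smallest collage error wins
            V1.append(p)                       # 0 = DBP, 1 = SDBP
            codes.append((i, j, k, alpha, beta))
        else:                                  # small variance: no search
            V2.append(0)
            D = contract(no_search_domain(img, i, j, B))
            _, alpha, beta = collage_error(R, D)
            codes.append((i, j, None, alpha, beta))
    return codes, V1, V2
```

Note that no domain-block position needs to be stored for the no-search blocks, which is where the Bpp saving reported below comes from.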

Table 1 lists the comparison results between Jacquin's method and the proposed method, and Table 2 lists those between He's method and the proposed method. From the second and third rows of Tables 1 and 2, we can see that the proposed method provides better quality of decoded images than Jacquin's and He's methods. Since the range blocks with large variances can find their best-matched domain blocks in the larger domain block pool, EDBP, the collage errors become smaller, and better quality of decoded images is obtained. From the fourth to seventh rows of Table 1, we can see that, compared with Jacquin's method, the proposed method requires less encoding time and less memory space. For the range blocks with small variances, the best-matched domain blocks are designated directly without searching, and there is no need to store the positions of the best-matched domain blocks, so shorter encoding time and less memory space are achieved.

Table 1. Performance comparison between Jacquin’s method and the proposed method [1].
Table 2. Performance comparison between He’s method and the proposed method [6].

5 Conclusions

In our research, we propose a method to improve the performance of fractal coding methods. By analyzing the importance of the variances of range blocks, we find that the range blocks with large variances play the dominant role in determining the quality of decoded images. Thus, we design an extended domain block pool for the range blocks with large variances, while the remaining range blocks, with small variances, are encoded with the no-search fractal coding method. Experiments show that the proposed method improves the performance of fractal coding methods effectively and makes them more useful in practical applications.