1 Introduction

In recent years, the coding efficiency of wavelet-based image encoders has improved, reducing the bandwidth or amount of memory needed to transmit or store a compressed image. Unfortunately, many of these coding optimizations involve higher complexity, requiring faster and more expensive processors. For example, the JPEG 2000 standard [9] uses a large number of contexts with an iterative, time-consuming optimization algorithm (called PCRD) to improve coding efficiency. Other encoders (like the one proposed in [18]) achieve very good coding efficiency by introducing high-order context modeling, but model formation is a slow process. Even the bit-plane coding employed in many encoders (like [16] and [3]) results in a slow coding process, since the image is scanned several times, focusing on a different bit-plane in each pass, which in addition causes a high memory access overhead (an increased cache miss rate).

The above-mentioned encoders are designed to obtain high performance in rate-distortion terms and a broad functionality, but unfortunately other parameters, like complexity or memory resources, are not considered as critical as the former ones.

Recently, several authors have shown interest in developing very fast and simple wavelet encoders that are able to achieve reasonably good R/D performance with reduced computing requirements. These fast and efficient image encoders mainly target interactive real-time applications running on resource-constrained devices. In that scenario, the data must be encoded as soon as possible to meet the application constraints using the scarce resources available in the system (memory and processing power).

Basically, these encoders do not use any iterative method, and each coefficient is encoded as soon as it is visited. However, this results in the loss of SNR scalability and precise rate control capabilities (in other words, the image cannot be compressed to a specific user-defined file size). They simply apply a constant quantization to all wavelet coefficients, encoding the image at a constant and uniform quality, as in the former JPEG standard [8], where only a quality parameter was available and no precise rate control was performed.

In [15], the first non-embedded encoder was proposed with the aim of reducing the complexity of a wavelet-based image encoder. This algorithm is a modified version of SPIHT [16] in which, once a coefficient is found to be significant, all its significant bits are encoded to avoid refinement passes (losing the SNR scalability and rate control available in the original SPIHT). However, bit-plane processing is still needed in the sorting passes, and thereby the coding process is not significantly sped up.

One of the first fast non-embedded image encoders was LTW [13], a tree-based wavelet encoder that avoids bit-plane processing and predictive encoding techniques; instead, it uses a one-pass coefficient coding process with a very reduced number of contexts for arithmetic encoding.

Another very fast non-embedded encoder, called PROGRESS, was proposed in [2]. It follows the same ideas as [13], avoiding bit-plane coding and using coefficient trees to encode wavelet coefficients in only one pass. In this encoder, all coefficients, not only the zero coefficients, are arranged in trees. The number of bits needed to encode the highest coefficient in each tree is computed, and all the coefficients at the current subband level are binary encoded with that number of bits. Then the next subband level is encoded (in decreasing order) by again computing the number of bits needed to represent each sub-tree at that level and encoding with that number of bits.

Another fast non-embedded image encoder is BCWT [7]. It offers high coding speed, low memory usage and good R/D performance. The key to the BCWT encoder is its one-pass backward coding, which starts from the lowest-level sub-bands and travels backwards. The calculation of the Map of Maximum Quantization Levels of Descendants (MQD map) and the coefficient encoding are carefully integrated into this pass in such a way that there is as little redundancy as possible in computation and memory usage.

None of the above non-embedded encoders support rate control, so in this paper we propose several rate control algorithms for them. We have chosen the LTW encoder to evaluate the rate control algorithms not only in terms of rate/distortion (R/D) performance but also in terms of coding delay and overall memory usage.

Our first rate control proposal extracts some features from the wavelet-transformed image and finds correlations with the quantization parameter for a specific target bit-rate. The second proposal is based on a simple model of the encoding engine, in a similar way as in [6] and [11], where statistical models were employed to accomplish rate control. To set the finer scalar uniform quantization parameter (Q), we model the bit-rate evolution with a second-order polynomial function. Finally, to cope with the rate accuracy demands of certain applications, we propose an iterative version that bounds the estimation error with a minimum number of iterations.

The rest of the paper is organized as follows. In Section 2, the LTW algorithm is outlined. In Section 3, we describe the proposed rate control algorithms and evaluate their accuracy in Section 4. Then, in Section 5, we show the results of the global encoder system (including rate control) and compare it with SPIHT and JPEG 2000 in R/D performance, complexity and memory requirements. For further evaluation, we have compared its performance with a fully optimized version of JPEG2000 (Kakadu). Finally, in Section 6 some conclusions are drawn.

2 LTW: A Fast Non-embedded Image Encoder

LTW is a tree-based wavelet image encoder, with state-of-the-art coding efficiency, but less resource demanding than other encoders in the literature. The basic idea of this encoder is very simple: after computing a dyadic wavelet transform of an image [1], the wavelet coefficients are first quantized using two quantization parameters (rplanes and Q) and then encoded with arithmetic coding.

For the coding stage, if the absolute value of a coefficient and all its descendants (considering the classic quad-tree structure from [16]) are lower than a threshold value (\(2^{rplanes}\)), the entire tree is encoded with a single symbol, which we call a LOWER symbol. If a coefficient is lower than the threshold but not all its descendants are, that coefficient is encoded with an ISOLATED_LOWER symbol. On the other hand, for each wavelet coefficient higher than \(2^{rplanes}\), we encode a symbol indicating the number of bits needed to represent that coefficient, along with a binary-coded representation of its bits and sign (note that the rplanes least significant bits are not encoded).

More details about the coding and decoding algorithms, as well as a formal description can be found in [14].

In this work, rate control is achieved by properly tuning the quantization parameters: the rplanes parameter for a coarser quantization (in which the rplanes least significant bits are removed from each coefficient), and the Q parameter for a finer adjustment through a typical uniform scalar quantization process.
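As an illustration, the two quantization stages can be sketched as follows; the exact composition used by LTW (in particular the step size of the finer quantizer) is an assumption here, not the encoder's actual formula:

```python
def quantize(c, rplanes, q):
    """Two-tier quantization sketch: drop rplanes least significant bits,
    then apply a uniform scalar quantizer with an assumed step of 2*Q."""
    sign = -1 if c < 0 else 1
    coarse = abs(int(c)) >> rplanes   # coarser: remove rplanes LSBs
    fine = int(coarse / (2 * q))      # finer: assumed uniform step of 2*Q
    return sign * fine
```

For example, with rplanes = 2 and Q = 0.5, a coefficient of magnitude 37 keeps a quantized magnitude of 9, while a coefficient of magnitude 3 is quantized to zero.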

3 Rate Control Support for Non-embedded Encoders

In this section, we propose several lightweight rate control algorithms for non-embedded encoding, with increasing complexity and accuracy. These algorithms predict the quantization values that lead to a final bit-rate close to the target one. Although these rate control algorithms could be applied to any non-embedded wavelet encoder, we use the LTW encoder in the evaluation to observe their behavior in the overall encoding system.

3.1 Zero-order Entropy Based Rate Control Algorithm

This method is based on the zero-order entropy (Eq. 1) of the wavelet coefficients. The estimation of the quantization parameters is based on the correlation between entropy, target bit-rate and quantization parameters.

$$ H(x) = -\sum\limits_{x} p(x) \log_{2}\left(p(x)\right) $$
(1)
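For concreteness, Eq. 1 over a finite set of quantized coefficients, with p(x) estimated from symbol frequencies, can be computed as:

```python
from collections import Counter
from math import log2

def zero_order_entropy(symbols):
    """H(x) in bits/symbol (Eq. 1), with p(x) estimated from frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())
```

A uniform four-symbol source yields log2(4) = 2 bits/symbol, and a constant source yields zero entropy.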

We use the Kodak image set [4] as a representative set of natural images, and the LTW encoder with both the Q and rplanes quantization parameters. As there is a correlation between the wavelet coefficient entropy and the quantization parameters, we can establish a relationship between them for a given target bit-rate by means of curve and surface fitting techniques [5, 12, 17, 19]. In particular, the surface fitting process was driven by bivariate (bit-rate and entropy) polynomial equations due to their low computational complexity. Eqs. 2, 3 and 4 give the surface fitting expressions for the fine quantizer estimation (\(Q_{\rm rplanes}\left(x,y\right)\)) for rplanes values of 2, 3 and 4, respectively. The variables x and y represent the wavelet coefficient entropy and the target bit-rate, respectively, and the constants a through j have been computed through the aforementioned surface fitting methods using the wavelet coefficient entropy information extracted from the Kodak image set. For each equation, we also show the coefficient of determination (\(r^{2}\)) that measures the goodness of fit (ideally \(r^{2} = 1\)).

$$\begin{array}{rll} Q_{rp2}\left(x,y\right) &=& a+bx+c/{y}+dx^{2}+e/y^{2}+fx/y \\ && + gx^{3}+h/y^{3}+ix/y^{2}+jx^{2}/y \end{array}$$
$$\begin{array}{lll} && a=23.99, \;\, b=-23.68, \;\, c=-1.27, \;\, d=7.36, \\ && e=0.06, \;\, f=1.10, \;\, g=-0.72, \;\, h=0.003, \nonumber \\ && i=-0.06, \;\, j=-0.009 \end{array}$$
$$\left(r^{2}\right)=0.949$$
(2)
$$\begin{array}{rll} Q_{rp3}\left(x,y\right) &=& a+b/{x}+c \ln{y}+d/{x^{2}}+e\left(\ln{y}\right)^{2} \\ && + f\left(\ln{y}\right)/{x}+g/{x^{3}}+h\left(\ln{y}\right)^{3} \\ &&+ i\left(\ln{y}\right)^{2}/{x}+j\left(\ln{y}\right)/{x^{2}} \end{array}$$
$$\begin{array}{lll} && a=13.04,\thickspace b=-69.08,\thickspace c=-5.98,\thickspace d=129.67,\nonumber \\ && e=0.58,\thickspace f=19.07,\thickspace g=-82.79,\thickspace h=-0.07, \nonumber \\ && i=-0.66,\thickspace j=-16.31 \end{array}$$
$$\left(r^{2}\right)=0.950$$
(3)
$$\begin{array}{rll} Q_{rp4}\left(x,y\right) &=& a+b/{x}+c \ln{y}+d/{x^{2}}+e\left(\ln{y}\right)^{2} \nonumber \\ && + f\left(\ln{y}\right)/{x} +g/{x^{3}}+h\left(\ln{y}\right)^{3} \nonumber \\ &&+ i\left(\ln{y}\right)^{2}/{x}+j\left(\ln{y}\right)/{x^{2}} \end{array}$$
$$\begin{array}{lll} && a=5.29,\thickspace b=-19.55,\thickspace c=-2.29,\thickspace d=25.77, \nonumber \\ && e=0.15,\thickspace f=4.74,\thickspace g=-11.72,\thickspace h=-0.02, \nonumber \\ && i=-0.02,\thickspace j=-2.54 \end{array}$$
$$\left(r^{2}\right)=0.968 $$
(4)
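As a sketch, evaluating the fitted surface of Eq. 2 for rplanes = 2 is a direct polynomial evaluation, using the constants reported above:

```python
def q_rp2(x, y):
    """Fine quantizer estimate Q_rp2(x, y) of Eq. 2: x is the wavelet
    coefficient entropy, y the target bit-rate; constants from the fit."""
    a, b, c, d, e = 23.99, -23.68, -1.27, 7.36, 0.06
    f, g, h, i, j = 1.10, -0.72, 0.003, -0.06, -0.009
    return (a + b * x + c / y + d * x ** 2 + e / y ** 2 + f * x / y
            + g * x ** 3 + h / y ** 3 + i * x / y ** 2 + j * x ** 2 / y)
```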

The Entropy-based algorithm shown in Fig. 1 works as follows:

  • First (E1), the rplanes value is determined as a function of the target bit-rate (T_bpp). We have experimentally determined the proper rplanes value for the working bit-rate ranges: the best choice is rplanes = 4 for a target bit-rate in the range 0.0625–0.5 bpp, rplanes = 3 in the range 0.5–1.5 bpp, and rplanes = 2 in the range 1.5–2 bpp.

  • Second (E2), we apply the coarser quantization by removing the previously computed rplanes least significant bits from all wavelet coefficients.

  • Third (E3), the zero-order entropy of the coarse-quantized wavelet coefficients (S_e) is calculated.

  • Finally (E4), the finer quantization parameter Q is obtained from the surface fitting equation corresponding to the selected rplanes value:

    \(Q = Q_{\rm rplanes}\left(S_{e},T_{bpp}\right)\).
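Putting the four steps together, a minimal sketch could look as follows; packaging the fitted surfaces of Eqs. 2–4 as callables in a `q_surface` mapping is an assumption made for illustration:

```python
from collections import Counter
from math import log2

def entropy_based_estimate(coeffs, target_bpp, q_surface):
    """Steps E1-E4; `q_surface` maps rplanes -> fitted Q_rplanes(x, y)."""
    # E1: rplanes from the experimentally determined bit-rate ranges.
    rplanes = 4 if target_bpp <= 0.5 else (3 if target_bpp <= 1.5 else 2)
    # E2: coarser quantization (drop the rplanes least significant bits).
    coarse = [abs(int(c)) >> rplanes for c in coeffs]
    # E3: zero-order entropy S_e of the coarse-quantized coefficients.
    n = len(coarse)
    s_e = -sum((k / n) * log2(k / n) for k in Counter(coarse).values())
    # E4: finer quantizer from the surface fitted for this rplanes value.
    return rplanes, q_surface[rplanes](s_e, target_bpp)
```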

Figure 1

Entropy-based algorithm.

3.2 Rate Control Based on a Trivial Coding Model

It is difficult to estimate the quantization parameters with a certain degree of accuracy for a particular target bit-rate using only the zero-order entropy of the wavelet coefficients. So, we decided to study how the encoder works in order to define a simplified statistical model of the encoding engine, in a similar way as in [6] and [11]. In [6], the authors propose an expensive rate allocation scheme based on Lagrangian optimization that offers a rate control capability of low accuracy. In [11], another statistical model based on the generalized Gaussian density (GGD) approach is proposed, obtaining an expensive but highly accurate rate control behavior.

In this work we use the LTW coding engine to define a simple model able to supply a fast and accurate estimation of the resulting bit-rate. The encoding model uses a two-tier quantization process based on (1) a coarse quantization, where the rplanes least significant bits of all wavelet coefficients are removed, and (2) a finer quantization applied to all wavelet coefficients through a uniform scalar quantizer (Q). So, the proposed model of the encoding system is defined in two parts:

First, we need to determine the appropriate rplanes value for a given target bit-rate. For each image in the Kodak set, we perform the DWT and apply the coarse quantization for each specific rplanes value (from 2 to 7). Then, for each rplanes value, the coarse-quantized DWT coefficients are classified to build the symbol map. The symbol map is composed of (1) non-significant symbols (zero coefficients) and (2) significant symbols with magnitude n (wavelet coefficients that require n bits to represent their value). Taking this representation into account, we compute the probability distribution function (pdf) of the LTW symbol map.

After that, we obtain an estimation of the bit-rate required to encode the symbol map by means of its zero-order entropy (S_e).

Since we also know the number of significant wavelet coefficients and the number of bits needed to code their value and sign, we can calculate the exact number of bits sent to the output bitstream, as they are raw binary encoded (Bits_total).

So, the final bit-rate estimation for each rplanes value (E_bpp(rplanes)) is obtained by adding the arithmetic encoder estimation of the symbol map (S_e) to the raw encoding bit count of significant coefficients (Bits_total) (Eq. 5):

$$ E_{bpp}(rplanes)=S_{e}(rplanes)+Bits_{\rm total}(rplanes) $$
(5)
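A sketch of this estimation follows; normalizing both terms to bits per pixel, and counting m raw bits per significant coefficient of magnitude m (m − 1 magnitude bits with the MSB implicit in the symbol, plus one sign bit), are modelling assumptions made here:

```python
from collections import Counter
from math import log2

def estimate_bpp(coeffs, rplanes, num_pixels):
    """Bit-rate estimate E_bpp(rplanes) of Eq. 5, in bits per pixel."""
    # Symbol map: 0 for non-significant, else the magnitude n in bits.
    sym_map = [(abs(int(c)) >> rplanes).bit_length() for c in coeffs]
    n = len(sym_map)
    entropy = -sum((k / n) * log2(k / n) for k in Counter(sym_map).values())
    s_e = entropy * n / num_pixels                  # symbol-map cost (bpp)
    # Raw bits of significant coefficients: assumed m - 1 magnitude bits
    # (MSB implicit in the symbol) plus one sign bit, i.e. m bits in total.
    bits_total = sum(m for m in sym_map if m > 0) / num_pixels
    return s_e + bits_total                         # Eq. 5
```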

The resulting estimation gives a biased measure of the real bit-rate over the whole operative bit-rate range (from 0.0625 to 2 bpp) considered in this work. The model bit-rate estimation (E_bpp) uses a zero-order entropy model, whereas the LTW encoding scheme uses an adaptive arithmetic encoder with context modeling. This difference produces an error between the estimated bit-rate (E_bpp) and the target bit-rate, as shown in Fig. 3. We observed that the bit-rate estimation depends on the symbol map entropy (S_e): the lower the entropy, the lower the estimation error. This led us to reduce the estimation error by means of an adjustment function defined from the entire Kodak image set.

In Fig. 2 we show the error adjustment function for rplanes = 4 (no finer quantization Q is considered at the moment). For each image of the Kodak set we measure the real estimation error, and by curve fitting we define the adjustment error as a function of the symbol map entropy (S_e). In other words, this curve determines the model estimation error for rplanes = 4. For each rplanes value we have obtained the adjustment function (\(\Delta\left(rplanes,S_{e}\right)\)) that we apply to reduce the model estimation error. The bit-rate estimation expression is thus rewritten as:

$$\begin{array}{rll} E_{bpp}(rplanes) &=& S_{e}(rplanes)+Bits_{\rm total}(rplanes) \nonumber \\ &&+ \Delta\left(rplanes,S_{e}(rplanes)\right) \end{array}$$
(6)

In Fig. 3 we show the estimated and target bit-rates resulting from encoding the whole Kodak image set with an rplanes value of 4. As can be seen, the estimation error is significantly reduced after applying the corresponding adjustment function.

Figure 2

(Model-based)—Estimation error as a function of symbol map entropy (S_e) over the entire Kodak image set for rplanes = 4.

Figure 3

(Model-based)—Estimated vs. real bits per pixel for the entire Kodak image set (rplanes = 4).

After that, the target bit-rate (T_bpp) determines the proper value of rplanes, chosen so that E_bpp(rplanes) ≥ T_bpp > E_bpp(rplanes + 1).

Once the rplanes value is determined, we have to estimate the uniform scalar quantization value (Q) that will produce a bit-rate as close as possible to the target bit-rate. For this purpose, we observed that the bit-rate progression from one rplanes value to the next (rplanes + 1) follows a second-order polynomial curve (\(y = Ax^{2} + Bx + C\)) that shares nearly the same x-value of the vertex (\(K_{\rm min}=\frac{-B}{2A}\)) for all images in the Kodak set (see Fig. 4). Since we know three points of that quadratic polynomial curve, E_bpp(rplanes), E_bpp(rplanes + 1) and the curve vertex (K_min), we can build the corresponding expression that supplies the estimated value of Q for a given target bit-rate.
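A sketch of this construction: Newton interpolation through three known (x, y) points yields the A, B, C coefficients, and solving the quadratic for the target bit-rate gives the Q estimate. The choice of root branch, and how the three points are laid out on the Q axis, are assumptions made for illustration:

```python
def newton_quadratic(p0, p1, p2):
    """Fit y = A x^2 + B x + C through three points by Newton interpolation."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    d1 = (y1 - y0) / (x1 - x0)                       # 1st divided difference
    d2 = ((y2 - y1) / (x2 - x1) - d1) / (x2 - x0)    # 2nd divided difference
    # Expand the Newton form y0 + d1(x - x0) + d2(x - x0)(x - x1).
    return d2, d1 - d2 * (x0 + x1), y0 - d1 * x0 + d2 * x0 * x1

def solve_for_q(A, B, C, target_bpp):
    """Solve A q^2 + B q + C = target_bpp for q (positive root branch)."""
    disc = B * B - 4 * A * (C - target_bpp)
    return (-B + disc ** 0.5) / (2 * A)
```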

Figure 4

(Model-based)—Bit-rate progression of four images from the Kodak set from rplanes 3 to rplanes 4.

As mentioned, the K_min value is not exactly the same for all images in the Kodak set, so we have estimated this value taking into account the symbol map entropy (S_e) of all images in the Kodak set, in a similar way as for the error adjustment function. We could have used the mean K_min value from the Kodak set, but in order to obtain a more accurate estimate of K_min we use a curve fitting method instead. In Fig. 5 we show the progression of the x-value of K_min as a function of the symbol map entropy S_e of all images in the Kodak set for rplanes = 2. We have obtained a fitting equation for each rplanes value (from 2 to 7).

Figure 5

(Model-based)—Kmin as a function of symbol map entropy from the entire image Kodak set for rplanes = 2.

The whole algorithm, shown in Fig. 6, works as follows:

  • First (E1), we estimate the resulting bit-rate after applying only the coarser rplanes quantization to the wavelet coefficients, for rplanes values from 2 to 7 (E_bpp(rplanes)).

  • Second (E2), we apply the corresponding error adjustment functions to these estimations.

  • Third (E3), we set the appropriate rplanes value for the requested target bit-rate (T_bpp), so that E_bpp(rplanes) ≥ T_bpp > E_bpp(rplanes + 1).

  • Next (E4), we compute the K_min value using the symbol map entropy (S_e) and rplanes.

  • Then (E5), we obtain the quadratic expression for determining the value of Q by using the Newton interpolation algorithm.

  • Finally, we solve the expression, obtaining the estimated Q value.
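Step E3 above reduces to a simple search over the adjusted estimates; a minimal sketch, assuming the estimates are available as a mapping from rplanes to E_bpp:

```python
def pick_rplanes(e_bpp, target_bpp):
    """Choose rplanes with E_bpp(rplanes) >= T_bpp > E_bpp(rplanes + 1)."""
    for rp in range(2, 7):
        if e_bpp[rp] >= target_bpp > e_bpp[rp + 1]:
            return rp
    return 7  # coarsest setting if the target is below every estimate
```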

Figure 6

Model-based algorithm.

3.3 Lightweight Iterative Rate Control

With the Model-based rate control algorithm described in the previous subsection we can define an iterative version to reduce the estimation error with a moderate computational complexity increase. Thus, depending on the application requirements, we can get the proper trade-off between both rate control factors: complexity and accuracy. Now, we can define the maximum allowed estimation error (MAE) as a relative or absolute error and the algorithm will perform coding iterations until this condition is satisfied or a maximum number of iterations is reached.

In the first iteration, the proposed algorithm estimates the rplanes and Q values for the target bit-rate using the algorithm described in the previous subsection (see Fig. 6). Then the source image is coded with the quantization parameters found. If the resulting bit-rate error is lower than the maximum allowed error (MAE), the algorithm finishes; otherwise, it performs a new Q estimation based on the observed error. This is done during the first three iterations, obtaining pairs of Q and real bit-rate values. In the following iterations, if needed, the three most recent real points are used to compute a new quadratic polynomial curve for Q by means of the Newton interpolation algorithm, which will be more accurate than the previous one (see the algorithm in Fig. 7).
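The iterative loop can be sketched as follows; `encode_at` (running the encoder and returning the real bit-rate) and `refit_q` (re-estimating Q from the accumulated (Q, bit-rate) pairs, e.g. via Newton interpolation) are hypothetical callables standing in for the steps described above:

```python
def iterative_rate_control(q0, target_bpp, mae, encode_at, refit_q,
                           max_iters=6):
    """Refine Q until |real bpp - target| <= MAE or the iteration cap hits."""
    history = []                      # accumulated (Q, real bit-rate) pairs
    q = q0
    for _ in range(max_iters):
        bpp = encode_at(q)            # code the image with the current Q
        history.append((q, bpp))
        if abs(bpp - target_bpp) <= mae:
            return q, bpp             # accuracy condition satisfied
        q = refit_q(history, target_bpp)
    return q, bpp                     # best effort after max_iters
```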

Figure 7

Iterative algorithm.

4 Rate Control Evaluation

Using a C++ implementation of the LTW encoder, the different proposals were developed and tested on an Intel Pentium M 1.6 GHz processor. To determine the curve fitting and error adjustments in the first two methods, we have used the Kodak image set as a representative set of natural images. We restrict our proposals to work in the range from 0.0625 to 1 bpp. Finally, we used the Lena (512 × 512), Barbara (512 × 512), Goldhill (512 × 512) and Peppers (512 × 512) test images (outside the Kodak set) to validate the proposed methods.

In Fig. 8 we can see that, for all bit-rates in the range 0.0625 to 1 bpp, the Model-based algorithm gets the best results. Although for several images in the Kodak set the Entropy-based algorithm performs better, the maximum error peaks of the Model-based algorithm are significantly lower than those of the Entropy-based one.

Figure 8

Entropy-based vs. Model-based % prediction error for the entire Kodak set. Vertical lines indicate maximum and minimum error values, boxes indicate quartiles q1–q3, and q2 (the median) corresponds to the horizontal line inside each box.

The Model-based algorithm yields a lower estimation error over the Kodak image set than the Entropy-based algorithm. Furthermore, the choice of the rplanes parameter has been included in the estimation process. Table 1 shows the average estimation error of both the Entropy-based and the Model-based algorithms at different target bit-rates. The Model-based relative error is around 5% on average at low compression rates and grows up to 9% at higher compression rates. This is due to the high slope of the quadratic expression used to obtain Q (see Fig. 4) as the rplanes parameter grows: a slight change in the Q parameter implies a large variation in the final bit-rate and, as a consequence, a higher estimation error.

Table 1 Average % relative estimation error.

In Fig. 9, we show the bit-rate accuracy of the proposed rate control methods for the Lena and Goldhill test images. Although not shown, the behavior is very similar for the other non-Kodak test images. In general, the Model-based method does not work efficiently at very low bit-rates, in the range 0.0625–0.125 bpp. This behavior is due to the simplicity of the model, which makes no symbol differentiation within the insignificant coefficient set. In particular, the roots of lower trees (LOWER symbols) and their members, which are very common symbols at these compression rates, are not handled separately. However, at moderate and low compression rates, the Model-based proposal is more accurate than the Entropy-based one.

Figure 9

Bit-rate accuracy for a Lena and b GoldHill images.

In Fig. 10, we measure the computational cost (in CPU cycles) of the proposed methods when coding the Goldhill test image. As can be seen, the Model-based method is the fastest one, due to the simplicity of the computations required to issue an estimation. The Entropy-based proposal is three times slower than the Model-based one, mainly due to the higher cost of floating-point computations. Note that in the Entropy-based proposal the zero-order entropy of all wavelet coefficients must be computed, while in the Model-based one we only need to compute the symbol map entropy, which involves a reduced number of symbols. In the case of the iterative versions, we can observe that with a maximum 2% relative error we obtain a very fast rate control estimator (sometimes faster than the Entropy-based one). Also, the computational cost does not depend on the target bit-rate, although in the iterative versions the number of iterations may produce some deviations.

Figure 10

Complexity evaluation of different proposals for GoldHill image. x% in the iterative version indicates the maximum percent error allowed.

5 Global Performance Evaluation

In order to analyze the impact of the rate control proposals when applied to the LTW encoder, we have performed several experiments comparing the obtained results with the original encoder. In addition to R/D performance, we also analyze other performance metrics like coding delay and memory consumption.

To perform a fair evaluation, we have chosen SPIHT (original version), JPEG 2000 (Jasper 1.701.0) and LTW version 1.1, since their source code is available for testing. The corresponding binaries were obtained with the Visual C++ (version 6.0) compiler using the same project options. The test images used in the evaluation were: Lena (512 × 512), Barbara (512 × 512), GoldHill (512 × 512), Cafe (2560 × 2048) and Woman (2560 × 2048).

Table 2 shows the coding delay for all the encoders under evaluation. LTW_RC is the Model-based rate control version of LTW (described in Section 3.2). LTW_RCi is the iterative rate control version of LTW (described in Section 3.3) with a relative (%value) or an absolute (ABS suffix) maximum allowed rate control error, which we have experimentally fixed to ±0.04 bpp. We discard the first rate control method (Entropy-based) due to its lower accuracy with respect to the Model-based one. As expected, JPEG 2000 is the slowest encoder and the original LTW is one of the fastest. As shown in Table 2, the LTW_RC version does not introduce a great overhead and has an acceptable accuracy. If the precision of this rate control algorithm is not enough for the application, LTW_RCi is the candidate, at the expense of increased complexity. In general, all the rate control versions of LTW are faster than SPIHT, especially the non-iterative version, LTW_RC, which performs the encoding process twice as fast as SPIHT.

Table 2 Comparison of coding and decoding delays, excluding DWT (time in million of CPU cycles).

Although it might be thought that the original LTW should be faster than the LTW_RC version due to the rate control overhead, the results show just the opposite. The reason for this behavior lies in the differences between the quantization processes of both encoders. An image is encoded with the original LTW encoder using a fixed value of the rplanes parameter (rp = 2), moving the Q parameter through a wide range (\([0.5,\infty)\)). However, the LTW_RC version uses the estimated value of the rplanes parameter (from 2 to 7), limiting the Q parameter to a shorter range (\([0.5, 1.2]\)). So, as more non-significant symbols are produced before applying the finer quantization parameter (Q), the algorithm becomes faster, because more multiplication operations are avoided. However, as a side effect, coding efficiency decreases slightly, as we will see later.

For the iterative rate control versions, we have found two ways of defining the maximum allowed error (MAE): relative or absolute. The relative maximum error shows a non-linear behavior, since a rate control precision of 1% is not the same at 2 bpp as at 0.125 bpp; for very low bit-rates, achieving an accuracy of 1% has no effect on R/D performance. The maximum absolute error is fixed independently of the target bit-rate, so it produces different relative errors at different bit-rates. It is important to take into account that the proposed rate control methods have an average precision error of around 5% at 1 bpp and 9% at 0.125 bpp, as previously shown.

Table 3 shows the R/D evaluation of the proposed encoders. In general, the original LTW encoder obtains very good performance results, especially on the Lena and Woman test images. The iterative rate control versions of LTW have slightly lower PSNR performance than SPIHT and JPEG 2000, with LTW_RCi at 2% showing the best R/D behavior. Table 3 also shows, in brackets, the absolute bit-rate error for all LTW rate control versions. The lower performance of the rate control versions is mainly due to the achieved final bit-rate always being lower than the target one (obviously, the higher the accuracy, the better the R/D performance).

Table 3 PSNR (dB) with different bit-rate and coders.

In Table 4, the memory requirements of the encoders under test are shown. The original LTW needs only the amount of memory required to store the source image. LTW_RC also requires an extra 1.2 KB, basically used to store the histogram of significant symbols needed by the Model-based rate control algorithm. On the other hand, the LTW_RCi version requires twice the memory of LTW and LTW_RC, since at each iteration the original wavelet coefficients must be restored to avoid a new time-consuming DWT procedure. SPIHT requires nearly the same memory as LTW_RCi, and JPEG 2000 needs three times the memory of LTW.

Table 4 Memory Requirements for evaluated encoders (KB).

Figure 11 shows the Lena test image (512 × 512) compressed at 0.125 bpp and 0.0625 bpp with LTW_RCi, SPIHT and JPEG 2000. Although the SPIHT encoder is slightly better than LTW_RCi and JPEG 2000 in terms of PSNR, a subjective test does not show perceptible differences between the reconstructed versions of the Lena image. At 0.0625 bpp the difference in PSNR between LTW_RCi or SPIHT and Jasper is nearly 0.5 dB, but this difference is only visible when zooming over the eye area, as can be seen in Fig. 12. Both SPIHT and LTW_RCi show a similar behavior.

Figure 11

Lena compressed at 0.125 bpp a LTW_RCi_2%, b SPIHT, c JPEG2000 and Lena compressed at 0.0625 bpp e LTW_RCi_2%, f SPIHT, g JPEG2000.

Figure 12

Zoom over eyes zone in reconstructed Lena at 0.0625 bpp - a LTW_RCi_2%, b SPIHT, c JPEG2000.

5.1 Optimized Encoders

All LTW versions were developed with a focus on maximizing R/D performance, so their software code is not speed-optimized, just like the JPEG 2000 reference software. However, we have compared their performance with a fully optimized implementation of the JPEG 2000 algorithm, Kakadu [10], in order to evaluate whether a full optimization of LTW is worth the effort. For that purpose, we have used version 5.2.5, a fully optimized Kakadu version that includes multi-thread and multi-core hardware support, processor intrinsics like MMX/SSE/SSE2/SIMD, and fast multicomponent transforms.

As shown in Fig. 13, LTW_RC is a very fast encoder even though it is not fully optimized. The speed of LTW_RC lies in its simple coding engine model. LTW_RC is on average 1.2 times as fast as Kakadu 5.2.5 for images like Cafe or Woman. For smaller images like Lena or Barbara, LTW_RC is on average 3.2 times as fast as Kakadu 5.2.5.

Figure 13

Execution time comparison (end-to-end) of the coding process at 0.5 bpp.

Regarding memory requirements, LTW_RC needs only the amount of memory required to store the source image, plus 1.2 KB to store the model histogram, as mentioned before; Kakadu's memory requirements are independent of the image size, due to its block-based DWT implementation, and are on average 1660 KB.

In terms of R/D, there are slight differences between both codecs, as Table 3 shows. For images with many high-frequency components, like Barbara, Kakadu provides a better PSNR than LTW, but for images like Lena or Woman, LTW outperforms Kakadu. So, a full optimization of the LTW codec would certainly increase its coding speed and further reduce its memory requirements, making it a very competitive still-image coding solution.

6 Conclusions

In this paper, we have presented three different rate control algorithms suitable for non-embedded wavelet-based encoders. We have implemented them on top of the LTW encoder in order to evaluate their behavior and compare the performance results with the JPEG 2000 and SPIHT encoders in terms of R/D, execution time and memory consumption. Furthermore, we have shown that rate control functionality can be added to non-embedded wavelet encoders without a significant increase in complexity and with little performance loss. Among the proposed simple rate control algorithms, the LTW_RC proposal exhibits the best trade-off between R/D performance, coding delay (twice as fast as SPIHT and 8.8 times as fast as JPEG 2000) and overall memory usage (similar to the original LTW).

We have also compared the LTW_RC coder with a highly optimized version of JPEG 2000 (Kakadu), and it remains competitive in terms of coding delay (up to 3.9 times as fast as Kakadu for medium-sized images) with slightly lower R/D performance. A full optimization process would make LTW_RC even faster and with lower memory requirements. These optimizations will mainly focus on the DWT step, using fast and low-memory DWT techniques like line-based or block-based ones, and exploiting the parallel capabilities of modern processors (such as multithreading and SIMD instructions).