1 Introduction

An important stage in Computed Tomography (CT) is archiving the acquired images efficiently with respect to both the data volume they occupy and the image quality. A vast number of medical image compression techniques exist [1]; they fall into two large groups, lossless [2, 3] and lossy [4], depending on whether the image can be fully restored. In both groups some type of image decomposition is often applied, either a linear orthogonal transform or a wavelet transform, combined with rearrangement of the spectral coefficients and entropy coding. Some authors propose completely different methods, such as the min–max method developed by Karadimitriou and Tyler [5].

Wu [6] proposes an approach based on adaptive sampling of DCT coefficients that achieves compression in the interval 0.18–0.25 bpp at PSNR between 41 and 43 dB. As the author shows, the image quality at this compression is comparable to that of JPEG2000, while the JPEG coder produces images with PSNR between 31 and 40 dB at the same levels. Erickson et al. [7] confirm that wavelet decomposition assures better quality than the JPEG coder for images compressed to 0.1–0.4 bpp.

Many more authors exploit the advantages of wavelet decomposition for medical image compression, combining it with other techniques to construct more efficient coders: joint statistical characterization [8], linear prediction of the spectral coefficients [9], region of interest (ROI) coding [10], planar coding [11], etc. Despite the higher compression levels achieved, some authors point out a significant reduction in the visual quality of these images [4]. While cumulative quality measures such as PSNR stay high, the smoothing of vast image areas caused by quantization of the wavelet coefficients becomes intolerable for compression ratios (CR) smaller than 0.8 bpp in some cases.

In this chapter a new approach for lossy CT image compression is suggested. It is based on linear orthogonal transforms with a new type of hierarchical grouping of the spectral coefficients, provided by the Branched Inverse Pyramidal Decomposition (BIDP). Combined with entropy coding, the approach assures higher image quality than the previously developed methods at the same CR.

The chapter is arranged as follows: Sect. 2 gives the steps of the proposed algorithm, Sect. 3 presents some experimental results, and Sect. 4 concludes.

2 Compression of CT Images with BIDP

A new opportunity for achieving highly effective compression of CT images is the use of BIDP with 3 levels, based on orthogonal transforms. It is a generalization of the Inverse Pyramidal Decomposition (IDP) [12] to a group of CT images.

The new approach called BIDP includes the following stages:

  1. Selection of a referent image from the group of CT images based on correlation analysis. For this purpose the correlation coefficient \( \rho_{x,y} \) is found between the vectors \( \vec{X} = [x_{1} ,x_{2} , \ldots ,x_{S} ]^{t} \) and \( \vec{Y} = [y_{1} ,y_{2} , \ldots ,y_{S} ]^{t} \) describing the pixel intensities of a pair of images from the group:

$$ \rho_{x,y} = \frac{\sum\limits_{s = 1}^{S} (x_{s} - \bar{x})(y_{s} - \bar{y})}{\sqrt{\sum\limits_{s = 1}^{S} (x_{s} - \bar{x})^{2}} \times \sqrt{\sum\limits_{s = 1}^{S} (y_{s} - \bar{y})^{2}}}. $$
(1)

Here \( \bar{x} = \frac{1}{S}\sum\nolimits_{s = 1}^{S} {x_{s} } \) and \( \bar{y} = \frac{1}{S}\sum\nolimits_{s = 1}^{S} {y_{s} } \) are the average values of the elements \( x_{s} \) and \( y_{s} \) of the two vectors, and S is the number of pixels in each image. The referent image is selected after the correlation coefficients of all possible pairs in the group of CT images have been calculated. The number N of consecutive images that form a group for compression, out of all the images in the CT packet, is found according to the relation \( \text{var} (\rho_{xg,yg} ) > \rho_{xd,yd} \), where \( \rho_{xg,yg} \) is the correlation coefficient between all the pairs of images in the group and \( \rho_{xd,yd} \) is the correlation coefficient between the referent image of the group and the most distant one in the CT packet. As shown in Sect. 3, the correlation coefficient varies significantly inside the selected group, while outside it the value saturates, which indicates the limits of the group. For a group of N images the number L of all pairs l(p, q) is:

$$ L = \sum\limits_{p = 1}^{N - 1} {\,\sum\limits_{q = p + 1}^{N} {1(p,q)} } . $$
(2)

After calculating all L correlation coefficients \( \rho_{pq} \), the index \( p_{0} \) is found for which:

$$ \sum\limits_{q = 1}^{N} {\rho_{{p_{0} q}} } \ge \sum\limits_{q = 1}^{N} {\rho_{pq} } \;{\text{for}}\;p,\,q = 1,2, \ldots ,N,\;{\text{when}}\;p \ne q\;{\text{and}}\;p \ne p_{0} . $$
(3)

Then the consecutive number of the referent image for the group is \( p_{0} \), that is \( [B_{R} ] = [B_{{p_{0} }} ]. \)
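
The selection rule of Eqs. (1) and (3) can be illustrated with a minimal NumPy sketch. The function names, the synthetic test group and the handling of the diagonal (the p = q terms are simply set to zero) are assumptions made for illustration and are not taken from the original implementation.

```python
import numpy as np

def correlation_matrix(images):
    """Pairwise correlation coefficients (Eq. 1) for a stack of shape (N, H, V)."""
    x = images.reshape(len(images), -1).astype(np.float64)  # one vector of S pixels per image
    x -= x.mean(axis=1, keepdims=True)                      # subtract the mean of each image
    norms = np.sqrt((x ** 2).sum(axis=1))                   # square roots in the denominator
    return (x @ x.T) / np.outer(norms, norms)

def select_reference(images):
    """Index p0 whose summed correlation with the other images is largest (Eq. 3)."""
    rho = correlation_matrix(images)
    np.fill_diagonal(rho, 0.0)                              # exclude the p == q terms
    return int(np.argmax(rho.sum(axis=1)))

# Synthetic group (not CT data): image k deviates from a common base by noise
# whose strength grows with |k - 2|, so image 2 is the most central one.
rng = np.random.default_rng(0)
base = rng.normal(size=(64, 64))
group = np.stack([base + 0.3 * abs(k - 2) * rng.normal(size=base.shape)
                  for k in range(5)])
p0 = select_reference(group)
```

In this toy group the image with index 2 coincides with the common base, so it is expected to be returned as the referent.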

  2. The matrix of the referent image R is divided into blocks with dimensions \( 2^{n} \times 2^{n} \), and each of them is represented by the Inverse Pyramidal Decomposition (IDP) with 3 levels:

$$ [B_{R} (2^{n} )] = [\tilde{B}_{0R} (2^{n} )] + \sum\limits_{p = 1}^{2} {[\tilde{E}_{p - 1,R} (2^{n} )]} + [E_{2,R} (2^{n} )], $$
(4)

where \( [E_{2,R} (2^{n} )] \) is the matrix of the residual from the decomposition. In the last expression each matrix has dimensions \( 2^{n} \times 2^{n} \). The first component \( [\tilde{B}_{0R} (2^{n} )] \), for level p = 0, is a rough approximation of the block \( [B_{R} (2^{n} )] \). It is obtained by applying the inverse 2D-DCT to the transformed block \( [\tilde{S}_{0R} (2^{n} )] \) in accordance with the expression:

$$ [\tilde{B}_{0R} (2^{n} )] = [T_{0} (2^{n} )]^{ - 1} [\tilde{S}_{0R} (2^{n} )][T_{0} (2^{n} )]^{ - 1} , $$
(5)

where \( [T_{0} (2^{n} )]^{ - 1} \) is a matrix with dimensions \( 2^{n} \times 2^{n} \) for the inverse 2D-DCT.

The matrix \( [\tilde{S}_{0R} (2^{n} )] = [m_{0} (u,v)\,s_{0R} (u,v)] \) is the transformed block of the cut 2D-DCT over \( [B_{R} (2^{n} )] \). Here \( m_{0} (u,v) \) are the elements of the binary matrix-mask \( [M_{0} (2^{n} )] \), which determines the preserved coefficients of \( [\tilde{S}_{0R} (2^{n} )] \) in accordance with the equation:

$$ m_{0} (u,v) = \begin{cases} 1, & \text{if } s_{0R} (u,v) \text{ is a preserved coefficient}, \\ 0, & \text{otherwise}, \end{cases} \;{\text{for}}\;u,v = 0, 1, \ldots , 2^{n} - 1. $$
(6)

The values of the elements \( m_{0} (u,v) \) are chosen so that the preserved coefficients \( \tilde{s}_{0R} (u,v) = m_{0} (u,v)\,s_{0R} (u,v) \) correspond to those with the highest average energy in the transformed blocks \( [S_{0R} (2^{n} )] \) over all the blocks into which the image has been divided. The transformed block \( [S_{0R} (2^{n} )] \) of \( [B_{R} (2^{n} )] \) is found by the 2D-DCT:

$$ [S_{0R} (2^{n} )] = [T_{0} (2^{n} )][B_{R} (2^{n} )][T_{0} (2^{n} )], $$
(7)

where \( [T_{0} (2^{n} )] \) is a matrix with dimensions \( 2^{n} \times 2^{n} \) for level p = 0, which is used for implementing the DCT.
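
A minimal sketch of the level p = 0 approximation (Eqs. 5–7) follows. It assumes an orthonormal DCT-II matrix and writes the separable transform as T·B·Tᵀ, which is the usual form for the non-symmetric DCT matrix; the function names, the synthetic test image and the default values n = 4 and 7 preserved coefficients (taken from Sect. 3) are illustrative choices rather than the original implementation.

```python
import numpy as np

def dct_matrix(size):
    """Orthonormal DCT-II matrix T, so that T @ T.T is the identity."""
    k = np.arange(size).reshape(-1, 1)            # frequency index (rows)
    m = np.arange(size).reshape(1, -1)            # sample index (columns)
    t = np.sqrt(2.0 / size) * np.cos(np.pi * (2 * m + 1) * k / (2 * size))
    t[0, :] /= np.sqrt(2.0)
    return t

def split_blocks(image, b):
    """Split an H x V image into an array of shape (H/b, V/b, b, b)."""
    h, v = image.shape
    return image.reshape(h // b, b, v // b, b).swapaxes(1, 2)

def level0_approximation(image, n=4, keep=7):
    """Return the approximated blocks of level p = 0 and the binary mask M0."""
    b = 2 ** n
    t = dct_matrix(b)
    blocks = split_blocks(image.astype(np.float64), b)
    spectra = t @ blocks @ t.T                    # Eq. (7), applied to every block
    energy = (spectra ** 2).mean(axis=(0, 1))     # average energy per frequency (u, v)
    mask = np.zeros((b, b))
    top = np.argsort(energy, axis=None)[::-1][:keep]
    mask.flat[top] = 1.0                          # Eq. (6): highest average energy
    approx = t.T @ (mask * spectra) @ t           # Eq. (5): inverse of the cut spectrum
    return approx, mask

# Synthetic stand-in for a CT slice (not real data)
rng = np.random.default_rng(0)
image = 1000.0 + rng.normal(scale=20.0, size=(64, 64))
approx, mask = level0_approximation(image, n=4, keep=7)
```

Because the mask is derived from the energy averaged over all blocks, a single mask is shared by every block of the image, which matches the description above.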

The remaining components in decomposition (4) are the approximation matrices \( [\tilde{E}_{p - 1,R} (2^{n - p} )] \) for p = 1, 2. Each of them consists of sub-matrices \( [\tilde{E}_{p - 1,R}^{{k_{p} }} (2^{n - p} )] \) with dimensions \( 2^{n - p} \times 2^{n - p} \) for \( k_{p} = 1, 2, \ldots , 4^{p} \), obtained by its quad-tree split. Each sub-matrix \( [\tilde{E}_{p - 1,R}^{{k_{p} }} (2^{n - p} )] \) is in turn calculated by:

$$ [\tilde{E}_{p - 1,R}^{{k_{p} }} (2^{n - p} )] = [T_{p} (2^{n - p} )]^{ - 1} [\tilde{S}_{pR}^{{k_{p} }} (2^{n - p} )][T_{p} (2^{n - p} )]^{ - 1} \;{\text{for}}\; k_{p} = 1, 2, \ldots , 4^{p} , $$
(8)

where \( 4^{p} \) is the number of branches of the quad-tree in level p of the decomposition; \( [T_{p} (2^{n - p} )]^{ - 1} \) is the matrix for the inverse 2D-WHT; and \( [\tilde{S}_{pR}^{{k_{p} }} (2^{n - p} )] \) is the transformed block of the cut 2D-WHT of the difference matrix \( [E_{p - 1,R}^{{k_{p} }} (2^{n - p} )]. \) The elements \( \tilde{s}_{pR}^{{k_{p} }} (u,v) = m_{p} (u,v)\,s_{pR}^{{k_{p} }} (u,v) \) of the matrix \( [\tilde{S}_{pR}^{{k_{p} }} (2^{n - p} )] \) depend on the elements \( m_{p} (u,v) \) of the binary mask \( [M_{p} (2^{n - p} )] \):

$$ m_{p} (u,v) = \begin{cases} 1, & \text{if } s_{pR}^{{k_{p} }} (u,v) \text{ is a preserved coefficient}, \\ 0, & \text{otherwise}, \end{cases} \;{\text{for}}\;u,v = 0, 1, \ldots ,2^{n - p} - 1. $$
(9)

Here \( s_{pR}^{{k_{p} }} (u ,v ) \) are elements of the transformed block \( [S_{pR}^{{k_{p} }} (2^{n - p} )] \) which is obtained by the 2D-WHT:

$$ [S_{pR}^{{k_{p} }} (2^{n - p} )] = [T_{p} (2^{n - p} )][E_{p - 1,R}^{{k_{p} }} (2^{n - p} )][T_{p} (2^{n - p} )]. $$
(10)

where \( [T_{p} (2^{n - p} )] \) is a matrix with dimensions \( 2^{n - p} \times 2^{n - p} \) for level p = 1, 2, by which the WHT is applied.

It is possible to represent each group of four neighbouring elements \( \tilde{s}_{pR}^{{k_{p} }} (u,v) \) for one and the same u and v in the following way:

$$ \left[ {\begin{array}{*{20}c} {\tilde{d}_{pR}^{{k_{p} }} (u,v)} \\ {\tilde{d}_{pR}^{{k_{p} + 1}} (u,v)} \\ {\tilde{d}_{pR}^{{k_{p} + 2}} (u,v)} \\ {\tilde{d}_{pR}^{{k_{p} + 3}} (u,v)} \\ \end{array} } \right] = \frac{1}{4}\left[ {\begin{array}{*{20}c} 1& 1 & 1 & 1 \\ 0 & 4 & 0 & { - 4} \\ { - 4} & 0 & 4 & 0 \\ 0 & 0 & { - 4} & 4 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\tilde{s}_{pR}^{{k_{p} }} (u,v)} \\ {\tilde{s}_{pR}^{{k_{p} + 1}} (u,v)} \\ {\tilde{s}_{pR}^{{k_{p} + 2}} (u,v)} \\ {\tilde{s}_{pR}^{{k_{p} + 3}} (u,v)} \\ \end{array} } \right], $$
(11)

which yields even higher correlation between the spectral coefficients, since the last three of them, for positions (0, 1), (1, 0) and (1, 1), form pairwise differences, and these differences are often zero because neighbouring blocks contain almost identical content.

The inverse transform which leads to full restoration of \( \tilde{s}_{pR}^{{k_{p} }} (u,v) \) is given by:

$$ \left[ {\begin{array}{*{20}c} {\tilde{s}_{pR}^{{k_{p} }} (u,v)} \\ {\tilde{s}_{pR}^{{k_{p} + 1}} (u,v)} \\ {\tilde{s}_{pR}^{{k_{p} + 2}} (u,v)} \\ {\tilde{s}_{pR}^{{k_{p} + 3}} (u,v)} \\ \end{array} } \right] = \frac{1}{4}\left[ {\begin{array}{*{20}c} 4& { - 1} & { - 3} & { - 2} \\ 4 & { \, 3} & 1 & { \, 2} \\ 4 & { - 1} & 1 & { - 2} \\ 4 & { - 1} & 1 & { \, 2} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\tilde{d}_{pR}^{{k_{p} }} (u,v)} \\ {\tilde{d}_{pR}^{{k_{p} + 1}} (u,v)} \\ {\tilde{d}_{pR}^{{k_{p} + 2}} (u,v)} \\ {\tilde{d}_{pR}^{{k_{p} + 3}} (u,v)} \\ \end{array} } \right]. $$
(12)
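
As a quick numerical check, the matrices of Eqs. (11) and (12) can be multiplied to confirm that they are mutual inverses. The sample coefficient values below are invented for illustration only.

```python
import numpy as np

# Forward matrix of Eq. (11) and inverse matrix of Eq. (12), both with the 1/4 factor
forward = np.array([[ 1, 1,  1,  1],
                    [ 0, 4,  0, -4],
                    [-4, 0,  4,  0],
                    [ 0, 0, -4,  4]]) / 4.0
inverse = np.array([[4, -1, -3, -2],
                    [4,  3,  1,  2],
                    [4, -1,  1, -2],
                    [4, -1,  1,  2]]) / 4.0

# The product is the identity, so Eq. (12) restores the coefficients exactly
assert np.allclose(inverse @ forward, np.eye(4))

# The same (u, v) coefficient taken from four neighbouring sub-blocks (made-up values)
s = np.array([10.0, 10.0, 10.0, 11.0])
d = forward @ s              # -> [10.25, -1.0, 0.0, 1.0]: a mean and three differences
s_restored = inverse @ d     # -> [10.0, 10.0, 10.0, 11.0]
```

The first output element is the mean of the four coefficients, while the other three are differences that stay small when the neighbouring sub-blocks are similar.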

The difference matrix \( [E_{p - 1,R} (2^{n - p} )] \) for level p containing the sub-matrices \( [E_{p - 1,R}^{{k_{p} }} (2^{n - p} )] \) is determined by the following equation:

$$ [E_{p - 1,R} (2^{n - p} )] = \begin{cases} [B_{R} (2^{n} )] - [\tilde{B}_{0R} (2^{n} )] & \text{for } p = 1; \\ [E_{p - 2,R} (2^{n - p} )] - [\tilde{E}_{p - 2,R} (2^{n - p} )] & \text{for } p = 2. \end{cases} $$
(13)
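
The whole three-level decomposition of a single block (Eqs. 4, 8–10 and 13) can be sketched as follows. The sketch assumes an orthonormal DCT at level p = 0 and an orthonormal Walsh-Hadamard matrix in natural (Sylvester) order at levels p = 1, 2; the helper names and the square top-left masks are hypothetical placeholders, not the masks used in the experiments.

```python
import numpy as np

def dct_matrix(size):
    """Orthonormal DCT-II matrix."""
    k = np.arange(size).reshape(-1, 1)
    m = np.arange(size).reshape(1, -1)
    t = np.sqrt(2.0 / size) * np.cos(np.pi * (2 * m + 1) * k / (2 * size))
    t[0, :] /= np.sqrt(2.0)
    return t

def hadamard_matrix(size):
    """Orthonormal, symmetric Walsh-Hadamard matrix (size must be a power of 2)."""
    h = np.array([[1.0]])
    while h.shape[0] < size:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(size)

def low_freq_mask(size, k):
    """Hypothetical mask keeping the top-left k x k coefficients."""
    mask = np.zeros((size, size))
    mask[:k, :k] = 1.0
    return mask

def truncated_approx(block, t, mask):
    """Forward transform, zero everything outside the mask, inverse transform."""
    return t.T @ (mask * (t @ block @ t.T)) @ t

def idp_block(block, masks):
    """Return the four components of Eq. (4) for one 2^n x 2^n block."""
    size = block.shape[0]
    residual = block.astype(np.float64)
    components = []
    for p, mask in enumerate(masks):                 # p = 0, 1, 2
        sub = size >> p                              # sub-block side 2^(n-p)
        t = dct_matrix(sub) if p == 0 else hadamard_matrix(sub)
        approx = np.zeros_like(residual)
        for r in range(0, size, sub):                # quad-tree split: 4^p sub-blocks
            for c in range(0, size, sub):
                approx[r:r + sub, c:c + sub] = truncated_approx(
                    residual[r:r + sub, c:c + sub], t, mask)
        components.append(approx)
        residual = residual - approx                 # Eq. (13): difference for next level
    components.append(residual)                      # final residual E_2
    return components

rng = np.random.default_rng(1)
block = rng.integers(0, 4096, size=(16, 16)).astype(np.float64)
masks = [low_freq_mask(16, 2), low_freq_mask(8, 2), low_freq_mask(4, 2)]
parts = idp_block(block, masks)
assert np.allclose(sum(parts), block)                # Eq. (4) holds exactly
```

Since every level stores only the masked coefficients of the current difference, the sum of the three approximations and the residual always restores the block, whatever masks are chosen.
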

  3. A branch is taken only for level p = 0 of the pyramid (4) of the referent image R with dimensions H × V. The preserved coefficients \( \tilde{s}_{0R} (u,v) \) with the same spatial frequencies (u, v) from all blocks \( [\tilde{B}_{0R} (2^{n} )] \) for p = 0 are united into two-dimensional arrays \( [\tilde{S}_{0R} (u,v)] \) with dimensions \( (H/2^{n} ) \times (V/2^{n} ) \). The number of these matrices is equal to the number of preserved coefficients \( \tilde{s}_{0R} (u,v) \) in each block of the referent image. By analogy with Eq. (4), every matrix \( [\tilde{S}_{0R} (u,v)] = [B_{uv} ] \) found in this way is represented by an IDP with 2 levels:

$$ [B_{uv} ] = [\tilde{B}_{uv} ] + [\tilde{E}_{0,uv} ] + [E_{1,uv} ], $$
(14)

where \( [E_{1,uv} ] \) is the residual from the decomposition. Its components are matrices with dimensions \( (H/2^{n} ) \times (V/2^{n} ) \), and they are found in a similar fashion to those in (4). The first component for level p = 0 is found by:

$$ [\tilde{B}_{uv} ] = [T_{0} ]^{ - 1} [\tilde{S}_{0} ][T_{0} ]^{ - 1} , $$
(15)

where \( [T_{0} ]^{ - 1} \) is a matrix with dimensions \( (H/2^{n} ) \times (V/2^{n} ) \) used for the inverse 2D-WHT. The matrix \( [\tilde{S}_{0} ] \) is the transformed block of \( [\tilde{B}_{uv} ] \) obtained by the cut 2D-WHT:

$$ [\tilde{S}_{0} ] = [T_{0} ][\tilde{B}_{uv} ][T_{0} ]. $$
(16)

The preserved coefficients of the transformed block \( [\tilde{S}_{0} ] \) are calculated according to Eq. (6). The next component for p = 1 of decomposition (14) is estimated based on the difference:

$$ [E_{0} ] = [B_{uv} ] - [\tilde{B}_{uv} ]. $$
(17)

The approximation of this difference is given by:

$$ [\tilde{E}_{0}^{{k_{1} }} ] = [T_{1} ]^{ - 1} [\tilde{S}_{1}^{{k_{1} }} ][T_{1} ]^{ - 1} \;{\text{for}}\; k_{1} = 1, 2, 3, 4, $$
(18)

where \( [\tilde{S}_{1}^{{k_{1} }} ] \) is the transformed block returned by the cut 2D-WHT:

$$ [S_{1}^{{k_{1} }} ] = [T_{1} ][E_{0}^{{k_{1} }} ][T_{1} ]. $$
(19)

Here \( [T_{1} ] \) is a matrix for the WHT with dimensions \( (H/2^{n + 1} ) \times (V/2^{n + 1} ) \).
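
The grouping step that forms the branch can be sketched as a simple re-indexing of the block spectra. The array layout (one spectrum per block), the function name and the particular set of 7 preserved positions below are assumptions for illustration; random numbers stand in for the actual coefficients.

```python
import numpy as np

def coefficient_planes(spectra, mask):
    """Collect coefficients of equal frequency (u, v) from all blocks.

    spectra: array of shape (H/2^n, V/2^n, 2^n, 2^n), one spectrum per block.
    mask:    binary 2^n x 2^n matrix of preserved coefficients.
    Returns a dict mapping (u, v) to a plane [B_uv] of shape (H/2^n, V/2^n).
    """
    planes = {}
    for u, v in np.argwhere(mask > 0):
        planes[(int(u), int(v))] = spectra[:, :, u, v].copy()
    return planes

H = V = 512
n = 4
b = 2 ** n
rng = np.random.default_rng(3)
spectra = rng.normal(size=(H // b, V // b, b, b))    # placeholder block spectra

mask = np.zeros((b, b))
# Hypothetical positions of the 7 preserved low-frequency coefficients
for u, v in [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (2, 1)]:
    mask[u, v] = 1.0

planes = coefficient_planes(spectra, mask)           # 7 planes of size 32 x 32
```

Each resulting plane plays the role of \( [B_{uv} ] \) in Eq. (14) and would then be decomposed with the two-level pyramid described above.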

  4. For every block of the i-th CT image in the group that is not the referent one, a difference is found:

$$ [E_{0i} (2^{n} )] = [B_{i} (2^{n} )] - [\tilde{B}_{0R} (2^{n} )]\;{\text{for}}\;i = \, 0, 1, 2, \ldots ,N - 1, $$
(20)

where N is the number of the CT images in the group.

The difference matrices \( [E_{p - 1,i} (2^{n} )] \) for the next levels p = 1, 2 are divided into \( 4^{p} \) sub-matrices with dimensions \( 2^{n - p} \times 2^{n - p} \), and the cut 2D-WHT is applied to each of them. The obtained matrices are then processed in the same way as the components of the referent image R for p = 1, 2.

It should be noted that when the number of preserved coefficients in a block is 4 and the 2D-WHT is used, this number can be reduced for levels p = 1, 2. As shown in [4], for each of these levels it is not necessary to calculate the coefficients \( s_{p}^{{k_{p} }} ( 0 , 0 ) \), as they are always zero. Thus the number of coefficients necessary for lossless reconstruction of the image becomes smaller by a factor of 1.33.
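
One way to see why the (0, 0) coefficients can vanish is the following sketch. It covers only the transition between two WHT levels and assumes that the preceding level preserved exactly the four coefficients whose basis functions are constant on each quadrant (indices 0 and half the block size in the natural Sylvester ordering used here). The ordering, the block size and the random test block are assumptions of the sketch, and the transition from the DCT level p = 0 is not modelled.

```python
import numpy as np

def hadamard_matrix(size):
    """Orthonormal, symmetric Walsh-Hadamard matrix in natural (Sylvester) order."""
    h = np.array([[1.0]])
    while h.shape[0] < size:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(size)

rng = np.random.default_rng(2)
size = 8
half = size // 2
block = rng.normal(size=(size, size))

# Current level: keep the four coefficients at rows/columns {0, half}
t = hadamard_matrix(size)
spectrum = t @ block @ t
mask = np.zeros((size, size))
mask[np.ix_([0, half], [0, half])] = 1.0
residual = block - t @ (mask * spectrum) @ t     # difference passed to the next level

# Next level: the (0, 0) WHT coefficient of each of the four sub-blocks
t_sub = hadamard_matrix(half)
dc = [(t_sub @ residual[r:r + half, c:c + half] @ t_sub)[0, 0]
      for r in (0, half) for c in (0, half)]
assert np.allclose(dc, 0.0)                      # all four are zero up to rounding
```

Under these assumptions the four preserved coefficients reproduce the mean of every quadrant, so the next-level residual has zero mean in each sub-block. Only 3 of the 4 coefficients then need to be stored, which gives the ratio 4/3 ≈ 1.33.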

From the output of the coder the following arrays containing spectral coefficients are passed:

  1. From level p = 0 of the referent image, represented by a branch in the form of a pyramid with levels p = 0, 1 and a residual, 3 arrays of coefficients are formed for each frequency (u, v). The total number of arrays is then 3 × (number of preserved coefficients), and the length of each array is \( (H/2^{n} ) \times (V/2^{n} ) \);

  2. From levels p = 1, 2 of each of the N images in the group, arrays of preserved coefficients with frequencies (u, v) are formed. The number of arrays is equal to the number of preserved coefficients, and their lengths are equal to \( 4^{p} (H/2^{n + p} ) \times (V/2^{n + p} ) = (H/2^{n} ) \times (V/2^{n} ) \).
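
As a worked check of these array sizes, take the values reported in Sect. 3: H = V = 512 and n = 4, with 7 preserved coefficients at level p = 0 of the main pyramid. The array lengths for the branch and for levels p = 1 and p = 2 are then:

$$ \frac{H}{2^{n}} \times \frac{V}{2^{n}} = 32 \times 32 = 1024,\quad 4^{1} \cdot \frac{512}{2^{5}} \cdot \frac{512}{2^{5}} = 4 \cdot 16 \cdot 16 = 1024,\quad 4^{2} \cdot \frac{512}{2^{6}} \cdot \frac{512}{2^{6}} = 16 \cdot 8 \cdot 8 = 1024. $$

With these values every array holds 1024 coefficients, and the branch contributes 3 × 7 = 21 arrays.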

Lossless entropy coding (EC), which includes run-length coding (RLC), Huffman coding (HC) and arithmetic coding (AC), is applied to the coefficients from the output of the coder for all levels of the branched pyramid and for every CT image in the group.
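
The run-length stage can be illustrated with a generic sketch. The exact RLC variant used in the coder is not specified here, so the (value, run length) pair format and the sample coefficient sequence below are assumptions chosen only to show the principle; the Huffman and arithmetic stages are not shown.

```python
def run_length_encode(values):
    """Encode a sequence as a list of (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1            # extend the current run
        else:
            runs.append([v, 1])         # start a new run
    return [tuple(r) for r in runs]

def run_length_decode(runs):
    """Invert run_length_encode."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

# Made-up sequence of quantized coefficients with runs of zeros
coeffs = [12, 0, 0, 0, 0, -1, 0, 0, 3, 3]
encoded = run_length_encode(coeffs)
# encoded == [(12, 1), (0, 4), (-1, 1), (0, 2), (3, 2)]
decoded = run_length_decode(encoded)
```

Long runs of equal values, typically zeros in the difference levels, collapse into single pairs, and decoding restores the sequence exactly, as required of a lossless stage.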

At the decoding stage all the operations on the compressed data for the group of CT images are carried out in reverse order: lossless decoding, restoration of the branch matrices based on Eq. (14), decoding of the referent image according to (4), and decoding of the remaining images according to (20). As a result all CT images in the group are restored.

3 Experimental Results

The CT test images are 576 greyscale slices in DICOM format. The size of all images is H × V = 512 × 512 pixels with an intensity depth of 16 bpp.

Fig. 1a presents the correlation coefficient between every pair of images in the packet, and Fig. 1b the same coefficient only between the first image and all the others. As suggested in Sect. 2, the correlation varies strongly inside a candidate group around a proper referent, while outside the group it tends asymptotically to a constant.

Fig. 1 The correlation coefficient for (a) all pairs of images, and (b) the first image and all the others

The first test group, for which experimental results are presented in Table 1, consists of 9 images shown in Fig. 2; the first one is the referent. The size of the initial block is 16 × 16 (n = 4) and the number of preserved coefficients is 7, all of them low-frequency. In the zero and first levels of the branch 4 coefficients are preserved, again low-frequency. For the main branch of the inverse pyramid, 4 low-frequency coefficients are preserved in the first and second levels.

Table 1 Compression and quality for the group images at three levels of the main pyramid

Fig. 2 First test group of 9 images: (a) base image, and (b–i) 8 side images

Fig. 3 shows the average PSNR and the average SSIM as functions of the average CR for the group, compared with those obtained when the JPEG2000 coder is applied to each image separately.

Fig. 3 Quality comparison between BIDP and JPEG2000 based on (a) PSNR, and (b) SSIM

With the exception of the range of low compression (under 1.5 bpp), the suggested approach produces a higher average PSNR than JPEG2000. The difference is especially large for the high CR values, 6 dB on average. With regard to preserving the structural similarity of the compressed images, the proposed approach also outperforms JPEG2000, by more than 0.05 in SSIM at 0.1 bpp.
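
For reference, the two quantities on the axes of such a comparison can be computed as in the sketch below. The standard PSNR definition is used; the peak value of 65535 for 16-bit data and the function names are assumptions, since the peak used in the experiments is not stated here.

```python
import numpy as np

def psnr(original, restored, peak=65535.0):
    """Peak signal-to-noise ratio in dB; peak is the assumed maximum intensity."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")             # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def bits_per_pixel(compressed_bytes, height, width):
    """Compression level expressed in bits per pixel (bpp)."""
    return 8.0 * compressed_bytes / (height * width)

# Toy check: a constant offset of 1 gives MSE = 1
a = np.full((4, 4), 100.0)
b = a + 1.0
value = psnr(a, b, peak=255.0)          # equals 20 * log10(255)
rate = bits_per_pixel(32768, 512, 512)  # 32768 bytes for a 512 x 512 image -> 1.0 bpp
```

Averaging these per-image values over the group gives curves of the kind compared in Fig. 3a.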

Fig. 4a shows the referent image with an isolated area magnified, Fig. 4b the same area after compression with BIDP, and Fig. 4c after compression with JPEG2000, at CR = 0.11 bpp.

Fig. 4 Visual quality comparison for a segment of (a) the original referent image, compressed at CR = 0.11 bpp using (b) BIDP, and (c) JPEG2000

The quality degradation for JPEG2000 is obvious in comparison with the BIDP coder: even vast homogeneous areas are highly blurred and practically no details are visible. Only a slight blocking effect is present when applying BIDP at such a high CR, and it changes the smaller details insignificantly.

4 Conclusion

The presented experimental results make evident the advantages of the proposed BIDP-based approach for compressing CT images. The presented coder proves to be more efficient than the JPEG2000 coder widely used in practice. Considerably high compression ratios are achieved while preserving high image quality: around 39 dB on average and over 44 dB in some cases. The structural similarity index is close to 1. Introducing quantization tables for the transmitted spectral coefficients makes it possible to change the compression ratio smoothly. As the size of the images increases, it is suitable to increase the size of the initial block (working window, \( 2^{n} \times 2^{n} \)) and the number of pyramid levels. A possibility for further development of the proposed approach is applying it to MRI images.