1 Introduction

Transform coding is at the heart of the majority of 2D image/video coding systems and standards. Spatial image data (image samples or motion-compensated residual samples) are transformed into a different representation, the transform domain. There are good reasons for transforming image data in this way. Spatial image data are inherently ‘difficult’ to compress: neighbouring samples are highly correlated and the energy tends to be evenly distributed across an image, making it difficult to discard data or reduce their precision without adversely affecting image quality [1, 2]. With a suitable choice of transform, the data become ‘easier’ to compress in the transform domain. A transform for compression should have several desirable properties. It should compact the energy in the image, i.e., concentrate the energy into a small number of significant values; it should de-correlate the data so that discarding ‘insignificant’ data (normally high-frequency data) has a minimal effect on image quality; and it should be suitable for practical implementation in software and hardware [3, 4].

The two most widely used image compression transforms are the discrete cosine transform (DCT) and the discrete wavelet transform (DWT) [3, 5]. The DCT is usually applied to small, regular blocks of image samples (e.g. 8 × 8 squares) and the DWT is usually applied to larger image sections or to complete images. Many alternatives have been proposed, for example 3D transforms (dealing with spatial and temporal correlation), variable block size transforms, fractal transforms, and Gabor analysis. The DCT has proved particularly useful and is at the core of most current-generation image and video coding standards, including JPEG, H.261, H.263, H.263+, MPEG-1, MPEG-2 and MPEG-4 [6, 7].

To demonstrate the effectiveness of our approach, we focus on compressing 2D image data appropriate for 3D reconstruction. This includes 3D reconstruction from structured light images and 3D reconstruction from multiple viewpoint images. Previously, we have demonstrated that while the geometry and connectivity of a 3D mesh can be tackled by several techniques, such as high-degree polynomial interpolation [8] or partial differential equations [9, 10], the efficient compression of 2D images both for 3D reconstruction and for texture mapping has not yet been addressed in a satisfactory manner. Moreover, in most applications that share common data, it is necessary to transmit 3D models over the Internet, for example to share CAD/CAM assets, support e-commerce applications, update content for entertainment applications, or support the collaborative design, analysis and display of engineering, medical and scientific datasets. Bandwidth imposes hard limits on the amount of data that can be transmitted and, together with storage costs, calls for more efficient 3D data compression for exchange over the Internet and other networked environments. Using structured light techniques for 3D reconstruction, surface patches can be compressed as a 2D image together with 3D calibration parameters, transmitted over a network and remotely reconstructed (geometry, connectivity and texture map) at the receiving end with the same resolution as the original data [11, 12].

Related to the techniques proposed in this paper, our previous work on data compression is summarized as follows. (1) Focusing on compressing structured light images for 3D reconstruction, Siddeq and Rodrigues [11] proposed a method in 2014 in which a single-level DWT is followed by a DCT on the LL sub-band, yielding the DC components and the AC-matrix. A second DWT is applied to the DC components, whose second-level LL2 sub-band is transformed again by DCT. A matrix minimization algorithm was applied to the AC-matrix and the other sub-bands. Compression ratios of up to 98% were achieved. (2) In the same year, Siddeq and Rodrigues [13] proposed a technique in which a DWT was applied to different arrangements of data blocks followed by arithmetic coding. The novel aspect of that work is the decompression stage, where a parallel sequential search algorithm was proposed and demonstrated. Compression ratios of up to 98.8% were achieved. (3) In Siddeq and Rodrigues [14], a two-level DWT was applied followed by a DCT to generate a DC-component array and an MA-Matrix (Multi-Array Matrix). The MA-Matrix was then partitioned into blocks and a minimization algorithm coded each block, followed by the removal of zero-valued coefficients and arithmetic coding. At the decompression stage, a new algorithm called Fast-Match-Search decompression was used to reconstruct the high-frequency matrices by computing data probabilities through a binary search algorithm in association with a look-up table. A comparative analysis of various combinations of DWT and DCT block sizes was performed, with compression ratios of up to 99.5%.

In this paper, we introduce a new method based on the DCT and DST for compressing 2D images with structured light patterns for 3D surface reconstruction (i.e., the 2D images have embedded stripe patterns, and the detection and processing of these patterns are used to generate a 3D surface). Additionally, the method is applied to a series of 2D images (with no structured light patterns) used to convert multiple 2D images into a 3D surface [15]. Following the discrete transformations, a high-frequency minimization method is used to convert each group of three adjacent coefficients into a single integer, reducing the data to a third of its original size. The final step is to apply arithmetic coding to the reduced data. The main steps of the compression algorithm are depicted in Fig. 1.

Fig. 1
figure 1

The main steps of the proposed compression algorithm

2 Using One-Dimensional Discrete Cosine Transform (DCT)

The one-dimensional DCT is used to transform each row of an image (spatial domain) to obtain a transformed image called “Tdct”, as shown in the following [3, 5]:

$$T_{dct} \left( i \right) = \sqrt {\frac{2}{n}} \;C(i)\sum\limits_{t = 0}^{n - 1} {I(t)\;\cos \left[ {\frac{(2t + 1)i\pi }{2n}} \right]}$$
(1)
$$I\left( t \right) = \sqrt {\frac{2}{n}} \sum\limits_{j = 0}^{n - 1} {C(j)\;T_{dct} (j)\;\cos \left[ {\frac{(2t + 1)j\pi }{2n}} \right]}$$
(2)

where

$$C\left( i \right) = \begin{cases} 2^{ - 1/2} , & \text{if } i = 0 \\ 1, & \text{if } i > 0 \end{cases}$$

where i = 0, 1, 2, 3, …, n − 1, n is the length of each row of the image I, and the output is a set of DCT coefficients Tdct. The first coefficient is called the DC coefficient, and the rest are referred to as the AC coefficients. Note that the coefficients are real numbers and are rounded off to integers. The key feature of the DCT is its usefulness in image compression [16]. It takes correlated input data and concentrates its energy in just the first few transform coefficients. If the input data consist of correlated quantities, then most of the “n” transform coefficients produced by the DCT are zeros or small numbers [17], and only a few are large (normally the first ones). The early coefficients contain the most important (low-frequency) image information and the later coefficients contain the less important (high-frequency) image information [6, 18]. This property allows good compression performance, since a proportion of the less important coefficients can be discarded without much degradation of image quality. Figure 2 shows the DCT applied to each row of a small 8 × 8 image without using scalar quantization.
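To make the row-wise transform concrete, the following sketch applies an orthonormal DCT-II to every row of a small block, on the assumption that SciPy's `dct` with `norm='ortho'` matches Eq. (1); the block contents and function names are illustrative only.

```python
import numpy as np
from scipy.fft import dct

def dct_rows(image):
    """Apply the one-dimensional DCT of Eq. (1) to each row independently."""
    I = np.asarray(image, dtype=float)
    return dct(I, type=2, norm='ortho', axis=1)   # one 1D DCT per row

# Illustrative 8 x 8 block (cf. Fig. 2); energy compacts into the first
# (low-frequency) coefficients of every row.
block = np.random.randint(0, 256, size=(8, 8))
T_dct = np.round(dct_rows(block)).astype(int)     # coefficients rounded to integers
print(T_dct)
```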

Fig. 2
figure 2

Left original block of data; right Tdct produced by applying the one-dimensional DCT to each row independently

3 One-Dimensional Discrete Sine Transform (DST)

Our research indicates that a one-dimensional DCT combined with a one-dimensional DST yields a large number of high-frequency components. These high-frequency components are useful for obtaining high compression ratios comparable to the JPEG technique. In this research, we apply the one-dimensional DST to each column of the transformed matrix “Tdct” from the previous section. The DST is defined as follows [17, 19]:

$$T_{dst} (k) = \sum\limits_{i = 1}^{n} {T_{dct} (i)\;\sin \left( {\frac{\pi \,k\,i}{n + 1}} \right)} ,\quad k = 1,\,2, \ldots ,n$$
(3)
$$T_{dct} (i) = \frac{2}{n + 1}\sum\limits_{k = 1}^{n} {T_{dst} (k)\;\sin \left( {\frac{\pi \,k\,i}{n + 1}} \right)}$$
(4)

Equation (3) transforms the “n” values of each column of “Tdct” into “n” coefficients. These are the low- and high-frequency coefficients containing the important and less important image information, respectively. The one-dimensional DST is applied to each column of “Tdct” to produce a new transformed matrix “Tdst”. The DST is equivalent to the imaginary part of the discrete Fourier transform (DFT) and the results of the DST are real numbers [20, 21]. The main advantage of using the DST for image compression in this context is that it preserves the image quality encoded by the low-frequency components of “Tdct” and increases the number of zeros, which can be discarded without loss of quality.

After the DST, we apply a quantization to the high-frequency components of the transformed matrix “Tdst”. In this way, quantization discards only insignificant information from the matrix. Each coefficient in the matrix is divided by the corresponding entry of a quantization table and the result is rounded off to the nearest integer. The following equation is proposed as the quantization table:

$$Q\left( {i,\,j} \right) = \left( {i + j} \right)*F$$
(5)

where F > 0, i = 1, 2, 3, …, n and j = 1, 2, 3, …, m (n × m are the image dimensions).

In Eq. (5), “F” is a real number greater than zero. This value affects image quality: for F > 1 the image quality decreases. There is no upper limit for F; however, based on our experiments we suggest values of F between 0.1 and 10. Figure 3 shows the DST applied to each column and quantized by Eq. (5).
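The column-wise DST of Eq. (3) and the quantization of Eq. (5) can be sketched as below; the sine matrix is built directly from Eq. (3) rather than taken from a library routine, so the scaling follows the paper's definition, and the choice F = 2 simply mirrors the example of Fig. 3.

```python
import numpy as np

def dst_columns(T_dct):
    """Apply the one-dimensional DST of Eq. (3) to each column of Tdct."""
    n = T_dct.shape[0]
    k = np.arange(1, n + 1)
    S = np.sin(np.pi * np.outer(k, k) / (n + 1))   # S[k-1, i-1] = sin(pi*k*i/(n+1))
    return S @ T_dct                               # one 1D DST per column

def quantize(T_dst, F=2.0):
    """Quantize with Q(i, j) = (i + j) * F, Eq. (5)."""
    n, m = T_dst.shape
    i = np.arange(1, n + 1)[:, None]
    j = np.arange(1, m + 1)[None, :]
    Q = (i + j) * F
    return np.round(T_dst / Q).astype(int), Q
```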

Fig. 3
figure 3

One-dimensional DST applied to each column of Tdct independently, followed by quantization with F = 2 (see Eq. (5))

In the above example, the split between low- and high-frequency components is determined by the user. The low-frequency components are not compressed any further; we simply represent them in fewer bytes by arithmetic coding. Meanwhile, the high-frequency components, whether horizontal or vertical, are compressed by the high-frequency minimization algorithm described in the next section.
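A possible partition of the quantized transform matrix into the three regions named above is sketched below; the exact layout (a top-left low-frequency block, with the remaining columns and rows treated as horizontal and vertical high frequencies) is an assumption made for illustration, since the text only names the regions and leaves the block size p to the user.

```python
import numpy as np

def split_bands(T, p):
    """Hypothetical split of the quantized matrix T into the three regions."""
    low        = T[:p, :p]   # low-frequency sub-matrix, arithmetic coded directly
    horizontal = T[:p, p:]   # horizontal high-frequency coefficients
    vertical   = T[p:, :]    # vertical high-frequency coefficients
    return low, horizontal, vertical
```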

4 High Frequency Minimization Algorithm

In this section, we describe an algorithm to convert the high-frequency coefficients (i.e., the results of the previous section passed to the minimization algorithm) into a compressed array called the Minimized-Array, through a matrix minimization method involving the elimination of zeros and triplet encoding, whose output is then subjected to arithmetic coding. Normally, the high-frequency components contain a large number of zeros with a few non-zero values. The technique eliminates the zeros and thereby increases the compression ratio [11, 13, 14, 16, 18].

The high-frequency minimization algorithm is applied to reduce the size of each high-frequency sub-matrix by 2/3. This process hinges on defining three key values and multiplying them by three adjacent entries of H (the matrix of high-frequency coefficients), which are then summed to produce a single integer value, as shown in Fig. 4 [13, 14].

Fig. 4
figure 4

High-Frequency Minimization Algorithm used to compress coefficients (D1, D2…, Dnm) from matrix H (i.e. matrix H scanned row-by-row for compression)

Thus, each set of three entries from H is converted into a single value, which is then stored in a new coded array (the Minimized-Array). Assuming that \(N\) is the length of H, \(i = 1,\,4,\,7, \ldots ,N - 2\) steps through H in non-overlapping triplets, and \(j\) is the index into the new coded array, the following transformation defines the high-frequency encoding [11, 13, 14, 18]:

$$\varvec{Minimized}\_\varvec{Array}_{j} = K_{1} \varvec{H}_{i} + K_{2} \varvec{H}_{{\left( {i + 1} \right)}} + K_{3} \varvec{H}_{{\left( {i + 2} \right)}}$$
(6)

The key values \(K_{1} ,\,K_{2} ,\,K_{3}\) are generated by a key generator algorithm as in [14, 18], described through Eqs. (7)–(10) below. Because the keys depend on max(H), each matrix has its own unique set of keys provided that the max(H) values are distinct.

$$M = 1.5\,\hbox{max} \left( H \right)$$
(7)
$$K_{1} = 1$$
(8)
$$K_{2} = K_{1} + M + Factor$$
(9)
$$K_{3} = Factor*M(K_{1} + K_{2} )$$
(10)

where \(Factor \ge 1\) and \(K_{1} \ge 1\) are integer values. The quantity Factor is a scaling factor that enlarges the degree of separation between the three generated keys. The keys themselves are the weights of each triplet summation in the Minimized-Array. The original values of each triplet can later be recovered by estimating the H values (see Sect. 5) for that Minimized-Array. Following the equations above, the Minimized-Array for the example in Fig. 3 is illustrated in Table 1.
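A minimal sketch of the key generation (Eqs. (7)–(10)) and of the triplet encoding of Eq. (6) is given below; it assumes non-overlapping triplets taken from H scanned row by row, with zero padding when the length of H is not a multiple of three, and the value Factor = 2 is illustrative only.

```python
import numpy as np

def generate_keys(H, factor=2):
    """Key generator following Eqs. (7)-(10)."""
    M = 1.5 * np.max(H)            # Eq. (7)
    K1 = 1                         # Eq. (8)
    K2 = K1 + M + factor           # Eq. (9)
    K3 = factor * M * (K1 + K2)    # Eq. (10)
    return K1, K2, K3

def minimize_high_frequencies(H, K1, K2, K3):
    """Collapse each non-overlapping triplet of H into one value, Eq. (6)."""
    h = np.asarray(H, dtype=float).ravel()     # row-by-row scan of H
    pad = (-len(h)) % 3                        # zero-pad to a multiple of 3
    h = np.concatenate([h, np.zeros(pad)])
    triplets = h.reshape(-1, 3)
    return triplets[:, 0] * K1 + triplets[:, 1] * K2 + triplets[:, 2] * K3
```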

Table 1 From the example of Fig. 3: each high-frequency sub-matrix is compressed independently

Our compression method also creates a header array, called Limited-Data, containing the unique values that appear in H; it is used later by the decompression algorithm to estimate the original data values. This information is kept in the header of the compressed file as a string. Figure 5 illustrates the concept through a numerical example.

Fig. 5
figure 5

Limited-Data appearing in H are kept in the header file for recovery

For the example of Fig. 3, the Limited-Data is obtained from the high-frequency sub-matrices (horizontal and vertical): Limited-Data(Vertical) = {−1, 0} and Limited-Data(Horizontal) = {2, 0}.

The encoded triplets in the Minimized-Array may contain a large number of zeros, which can be further encoded through a process proposed in [14]. For example, assume the encoded Minimized-Array = {125, 0, 0, 0, 73, 0, 0, 0, 0, 0, −17}. The zero-array is then {0, 3, 0, 5, 0}, where each 0 marks the position of a non-zero value and the other numbers give the count of zeros between two consecutive non-zero values. To increase the compression ratio, the number 5 can be broken up into 3 and 2 to increase data redundancy. Thus, the equivalent zero-array becomes {0, 3, 0, 3, 2, 0} and the nonzero-array is {125, 73, −17}. According to this method, the horizontal and vertical Minimized-Arrays are coded as illustrated in Table 2.
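The zero/nonzero split described above can be sketched as follows; the maximum run length of 3 reproduces the example in the text, where the run of five zeros is broken into 3 and 2.

```python
def split_zeros(minimized, max_run=3):
    """Code a Minimized-Array into a zero-array and a nonzero-array."""
    zero_array, nonzero_array = [], []
    run = 0
    for v in minimized:
        if v == 0:
            run += 1
            continue
        while run > max_run:        # break long runs of zeros, e.g. 5 -> 3, 2
            zero_array.append(max_run)
            run -= max_run
        if run:
            zero_array.append(run)
            run = 0
        zero_array.append(0)        # a 0 marks the position of a non-zero value
        nonzero_array.append(v)
    while run > max_run:            # flush any trailing run of zeros
        zero_array.append(max_run)
        run -= max_run
    if run:
        zero_array.append(run)
    return zero_array, nonzero_array

# split_zeros([125, 0, 0, 0, 73, 0, 0, 0, 0, 0, -17])
# -> ([0, 3, 0, 3, 2, 0], [125, 73, -17])
```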

Table 2 Each Minimized-Array is coded to zero-array and nonzero-array

The final step of compression is arithmetic coding, which computes the probability of each symbol and assigns each a range (low and high) to generate a stream of compressed bits [5]. The arithmetic coding applied here takes a stream of data and converts it into a single floating-point value. The output lies between zero and one and, when decoded, returns the exact original stream of data.
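For illustration, the textbook floating-point form of arithmetic coding described above can be sketched as follows; it is only suitable for short symbol streams because of double-precision limits, whereas a practical codec would use an integer, renormalized implementation.

```python
from collections import Counter

def build_ranges(symbols):
    """Assign each symbol a probability range [low, high) from its frequency."""
    freq, total = Counter(symbols), len(symbols)
    ranges, low = {}, 0.0
    for s in sorted(freq):
        ranges[s] = (low, low + freq[s] / total)
        low += freq[s] / total
    return ranges

def encode(symbols, ranges):
    """Narrow the interval [0, 1) symbol by symbol; return a value inside it."""
    low, high = 0.0, 1.0
    for s in symbols:
        span = high - low
        s_low, s_high = ranges[s]
        low, high = low + span * s_low, low + span * s_high
    return (low + high) / 2

def decode(code, n, ranges):
    """Recover n symbols from the single floating-point code value."""
    out = []
    for _ in range(n):
        for s, (s_low, s_high) in ranges.items():
            if s_low <= code < s_high:
                out.append(s)
                code = (code - s_low) / (s_high - s_low)
                break
    return out
```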

5 The Fast-Matching Search Decompression Algorithm

The decompression algorithm is the inverse of compression. First, the Minimized-Array for both the horizontal and vertical components is decoded by combining the zero-array with the nonzero-array. Second, the high frequencies are decoded from the Minimized-Array using the fast matching search (FMS) algorithm [14]. Third, the inverse DST and DCT are applied to reconstruct the original 2D image. The images are then assessed on their perceptual quality and on their ability to reconstruct the 3D structures compared with the original images. Figure 6 illustrates the decompression method.

Fig. 6
figure 6

The steps in the decompression algorithm

The fast matching search (FMS) algorithm has been designed to recover the original high-frequency data. The compressed data contain information about the compression keys (K1, K2 and K3) and the Limited-Data, followed by streams of compressed high-frequency data. The FMS algorithm therefore takes each compressed high-frequency value, decodes it using the key values, and checks whether the result corresponds to entries of the Limited-Data. Given the candidate values from the Limited-Data (one per key position), there is only one correct combination for each compressed value, so the data are uniquely decoded. We illustrate the FMS algorithm through the following steps A and B [14]:

  (A)

    Initially, the Limited-Data is copied into three separate arrays, given that three keys were used for compression. The algorithm picks three items of data (one from each copy of the Limited-Data) and applies them to Eq. (6) using the three compression keys. The method builds an interconnected array D, where each value is combined with every other value, similar to a network, as shown in Fig. 7a.

    Fig. 7
    figure 7

    a, b FMS algorithm for reconstructing high-frequency data from the Limited-Data. A, B and C are the original data, determined by the unique combination of keys. a All possible compressed values are estimated and saved in the D-array (i.e. each possible compressed value is connected with its relevant original data). b Decompression using the binary search algorithm

    Since the three arrays of Limited-Data contain the same values, that is, A1 = B1 = C1, A2 = B2 = C2 and so on, the searching algorithm computes all possible combinations of A with K1, B with K2 and C with K3, yielding an intermediate array D. As an example, consider Limited-Data1 = [A1 A2 A3], Limited-Data2 = [B1 B2 B3] and Limited-Data3 = [C1 C2 C3]. According to Eq. (6), these represent H(i), H(i + 1) and H(i + 2) respectively. The equation is executed 27 times (3³ = 27), testing all possibilities. One of these combinations in array D will match the value in the compressed data, and the match indicates that the corresponding unique combination of A, B and C is the original data. Applying this to our example, with Limited-Data(Vertical) = {−1, 0} (see Table 1), the possibilities are A = [−1, 0], B = [−1, 0] and C = [−1, 0], as shown in Table 3.

    Table 3 All possible data computed according to Eq. (6) to generate the D-Array (K1 = 1, K2 = 3 and K3 = 4)
  (B)

    The searching algorithm used in the decompression method is a binary search. It finds the original data (A, B, C) for any input value from the compressed data file “Minimized-Array”. For the binary search, the D-array must be arranged in ascending order.

    The decompression algorithm compares a value from the Minimized-Array with the middle element of the array “D”. If the values match, then a matching element has been found and its position is returned (i.e. the relevant A, B and C are the decompressed data we are after) [22]. Otherwise, if the value is less than the middle element of “D”, the algorithm repeats its action on the sub-array to the left of the middle element or, if the value is greater, on the sub-array to the right. A “not matched” outcome is impossible, because the FMS algorithm computes all possible compressed values, as shown in Fig. 7b. A sketch of this search is given after this list.
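A minimal sketch of the FMS idea, under the assumption that every triplet sum in the D-array is unique (as the key construction intends), is given below; it builds all |Limited-Data|³ combinations once, sorts them, and recovers each triplet by binary search.

```python
import bisect
import itertools

def fms_decode(minimized_array, limited_data, K1, K2, K3):
    """Recover the original (A, B, C) triplets behind each minimized value."""
    # Build the D-array: every combination of Limited-Data values weighted by
    # the three keys, together with the triplet that produced it (Eq. (6)).
    d_array = sorted(
        (K1 * a + K2 * b + K3 * c, (a, b, c))
        for a, b, c in itertools.product(limited_data, repeat=3)
    )
    sums = [v for v, _ in d_array]
    recovered = []
    for value in minimized_array:
        idx = bisect.bisect_left(sums, value)   # binary search in the sorted D-array
        recovered.extend(d_array[idx][1])       # the unique matching triplet
    return recovered
```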

Once the horizontal and vertical high-frequency components are recovered by the FMS algorithm, they are combined to regenerate the 2D matrix (see Fig. 6). Each value in the matrix is then multiplied by the corresponding value in Q (Eq. 5), followed by the inverse DST (Eq. 4) applied to each column. Finally, each value is multiplied by F, followed by the inverse DCT (Eq. 2) applied to each row, to recover the original 2D image as shown in Fig. 6. If we compare the results in Fig. 6 with the original 8 × 8 matrix of Fig. 2, we find that there is little difference, and these differences do not affect image quality. For this reason, our technique is very attractive for image compression.
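The inverse stage of Fig. 6 can be sketched as below: de-quantization by Q (Eq. 5), the inverse DST of Eq. (4) on each column, and the inverse DCT of Eq. (2) on each row. The additional multiplication by the DCT scalar quantization value is omitted here (i.e. assumed to be 1), and SciPy's orthonormal `idct` is assumed to invert the DCT of Eq. (1).

```python
import numpy as np
from scipy.fft import idct

def inverse_transform(T_quantized, Q):
    """Invert quantization, the column-wise DST and the row-wise DCT."""
    T_dst = np.asarray(T_quantized, dtype=float) * Q       # undo Eq. (5)
    n = T_dst.shape[0]
    k = np.arange(1, n + 1)
    S = np.sin(np.pi * np.outer(k, k) / (n + 1))
    T_dct = (2.0 / (n + 1)) * (S @ T_dst)                   # inverse DST, Eq. (4)
    return idct(T_dct, type=2, norm='ortho', axis=1)        # inverse DCT, Eq. (2)
```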

6 Experimental Results

The methods described here were implemented in MATLAB R2013a and Visual C++ 2008 running on an AMD quad-core microprocessor. We describe the results in two parts. First, we apply the method to general 2D images of different sizes and assess their perceived visual quality and RMSE; we also compare our compression method with JPEG and JPEG2000 through the visualization of 2D images, 3D surface reconstruction from multiple views and RMSE error measures.

Second, we apply the compression and decompression algorithms to 2D images that contain structured light patterns, allowing 3D surface data to be generated from those patterns. The rationale is that high-quality image compression is required, otherwise the resulting 3D structure from the decompressed image will contain apparent dissimilarities when compared to the 3D structure obtained from the original (uncompressed) data. We report on these differences in 3D through visualization and the standard root mean square error (RMSE) measure.

6.1 Results for 2D Images

In this section, we apply the algorithms to generic 2D images, that is, images that do not contain structured light patterns. In this case, the quality of the compression is assessed perceptually and by the RMSE measure. We use images of sizes varying from 2.25 to 9 MB. We also present a comparison with JPEG and JPEG2000, highlighting the differences in compressed image sizes and the perceived quality of the compression.

Figure 8a gives an indication of the compression ratios achieved with our approach, while Fig. 8b shows details of a comparative analysis with JPEG2000 and JPEG. First, the ‘baby’ image decoded by JPEG2000 contains some blurring in places, while the same image decoded by our approach and by JPEG is of higher quality. Second, the ‘eyes’ image decoded by the JPEG algorithm shows some block artefacts, resulting in a lower-quality compression, while the same image decoded by our approach and JPEG2000 at equivalent compression ratios has excellent image quality. Finally, the ‘girl’ image decoded by JPEG2000 is slightly degraded, while our approach and JPEG show good image quality.

Fig. 8
figure 8

Images compressed by JPEG and JPEG2000 at compressed file sizes equivalent to those of our approach. a 2D images compressed and decompressed by our approach. b Details of compression/decompression by our approach, JPEG2000 and JPEG respectively

Additionally, we applied our compression technique to a series of 2D images and used the Autodesk 123D Catch software to generate a 3D model from multiple images. The objective is to perform a direct comparison between our approach and both JPEG and JPEG2000 on the ability to perform 3D reconstruction from multiple views. Images are uploaded to the Autodesk server for processing, which normally takes a few minutes. The 123D Catch software uses photogrammetric techniques to measure distances between objects and produce a 3D model (i.e. the images are stitched together along common seams). The application may ask the user to select common points on the seams that could not be determined automatically [15, 23]. Compressed sizes and RMSE for all images used are reported in Table 4.

Table 4 Compressed sizes and 2D RMSE measures

Figure 9 shows two series of 2D images for the objects “APPLE” and “FACE” (all images are available from the 123D Catch website). We start by compressing each series of images; their compressed sizes and 2D RMSE measures are shown in Table 4. A direct comparison of compression with JPEG and JPEG2000 is presented in Table 5. It is clearly shown that our approach and JPEG2000 can reach an equivalent compression ratio, while the JPEG technique cannot. It is important to stress that both our technique and JPEG depend on the DCT. The main difference is that our approach combines the DCT with the DST and compresses the coefficients with the frequency minimization algorithm, which renders our technique far superior to JPEG, as shown in the comparative analysis of Fig. 10.

Fig. 9
figure 9

a, b Series of 2D images used to generate 3D models with 123D Catch. a APPLE images (48 images). b FACE images (28 images)

Table 5 Comparison of 3D reconstruction for images compressed to the same size. Note that JPEG failed to reconstruct the 3D structure as the images were too deteriorated
Fig. 10
figure 10

a, b Successful 3D reconstruction following compression by our approach. Images were compressed to the same size by our approach, JPEG and JPEG2000. a 3D model for series of APPLE images decompressed by our approach (48 images, average 2D RMSE = 8.33, total compressed size = 1.94 MB). The compression ratio for the 3D mesh is 99.4% for connectivity and vertices. b 3D model for series of FACE images decompressed by our approach (28 images, average 2D RMSE = 5.68, total compressed size = 1.72 MB). The compression ratio for the 3D mesh is 99.1% for connectivity and vertices

In our method, the DCT and DST are applied to the image as a single block. The low-frequency block size used for colour images was 150 × 150, and the scalar quantization applied after the DCT was 1, 5 and 5 for the Y, Cb and Cr layers respectively. Furthermore, the quantization matrix used after the DST performs an aggressive quantization; this means that approximately 50% of the coefficients are zeros (i.e. the bottom-left part of the image matrix contains a large number of zeros after the quantization process, see Eq. 5).

6.2 Results for Structured Light Images and 3D Surfaces

3D surface reconstruction was performed with our own software developed within the GMPR group [9, 10, 12]. The justification for introducing 3D reconstruction is that we can make use of a new set of metrics, in terms of error measures and the perceived quality of the 3D visualization, to assess the quality of the compression/decompression algorithms. The principle of operation of GMPR 3D surface scanning is to project patterns of light onto the target surface, whose image is recorded by a camera. The shape of the captured pattern is combined with the spatial relationship between the light source and the camera to determine the 3D position of the surface along the pattern. The main advantages of the method are speed and accuracy: a surface can be scanned from a single 2D image and processed into a 3D surface in a few milliseconds [24]. The scanner is depicted in Fig. 11.

Fig. 11
figure 11

a depicts the GMPR scanner together with an image captured by the camera (b) which is then converted into a 3D surface and visualized (c). Note that only the portions of the image that contain patterns (stripes) can be converted into 3D; other parts of the image are ignored by the 3D reconstruction algorithms

Figure 12 shows several test images used to generate 3D surfaces, both in grayscale and colour. The top row shows two grayscale face images, FACE1 and FACE2, with size 1.37 MB and dimensions \(1392 \times 1040\) pixels. The bottom row shows the colour images CORNER and METAL with size 3.75 MB and dimensions \(1280 \times 1024\) pixels. We use the RMSE measure to compute the differences between the decompressed images and the original ones. The RMSE, however, cannot give an absolute indication of which is the ‘best’ reconstructed image or 3D surface, as errors may be concentrated in a region that may or may not be relevant to the perception of quality. To get a better assessment of quality, we analyse the 3D surface images at various compression ratios.
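The 2D RMSE values reported below are assumed to follow the standard definition, computed over all pixels of the original and decompressed images; a minimal sketch:

```python
import numpy as np

def rmse(original, reconstructed):
    """Root mean square error between two images of equal size."""
    a = np.asarray(original, dtype=float)
    b = np.asarray(reconstructed, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))
```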

Fig. 12
figure 12

Structured light images used to generate 3D surfaces. Top row grayscale images FACE1 and FACE2; bottom row colour images CORNER and METAL. Images were compressed to the same size by our approach, JPEG and JPEG2000

Table 6 shows the compressed sizes for our approach using two different values of quantization. First, the quantization scalar for FACE1 and FACE2 is 1; this means that after the DCT each coefficient is divided by 1, which amounts to rounding each floating-point value off to an integer. Similarly, after the DST the quantization equation is applied with the factor F (Eq. 5).

Table 6 Structured light images compressed by our approach

Colour images are converted by a colour transformation [5, 25] into YCbCr format. We then apply the proposed approach to each layer independently. For this reason, after the DCT the quantization scalars for colour images are {1, 5, 5} for the Y, Cb and Cr layers respectively.

Figure 13 shows the visualization of the decompressed 2D images using different values of quantization. These decompressed images are converted to 3D surfaces. For FACE1 (top of Fig. 13), the quality of the surfaces varies from left to right according to the 3D RMSE; in fact, some parts of the 3D surface have disappeared at the higher compression ratio. For FACE2 (middle), however, the 3D reconstruction at the higher compression ratio is approximately the same as at the lower compression ratio. This means that the 3D reconstruction depends on the quality of the structured light patterns in an image. Figure 13 (bottom) shows zoomed-in regions of the two images; the structured light patterns are still clearly present at a 99% compression ratio.

Fig. 13
figure 13

Top FACE1: decompressed 3D surface shown with texture and shaded, at compressed sizes of 18.7 and 11.7 KB. Middle FACE2: decompressed 3D surface shown with texture and shaded, at compressed sizes of 15.6 and 7.8 KB. Bottom details of the 2D images FACE1 and FACE2 respectively at the higher compression ratio

Figure 14 shows the 3D reconstructed surfaces for the CORNER and METAL images respectively. The top row shows the quality of the CORNER 3D surface at a 99% compression ratio; the 3D surface on the top right has some artefacts that are not present in the original or in the decompressed 2D image at the lower compression ratio. Artefacts appear when the structured light patterns are not clearly defined in the image, or are degraded after compression and decompression. In the middle of Fig. 14, the decompressed METAL image is converted to a 3D surface; the reconstructed 3D surface on the middle right is degraded in all cases in which the compression ratio exceeds 99%. To analyse the 2D colour image compression, we zoomed in on the decompressed 2D images; the structured light patterns are still clearly visible at compression ratios as high as 99%.

Fig. 14
figure 14

Top shows decompressed 3D surface of CORNER with texture and shaded at compressed sizes 21.2 and 14.7 KB. Middle shows decompressed 3D surface of METAL with texture and shaded at compressed sizes 27.5 and 12.1 KB. Bottom zoomed-in details for 2D images CORNER and METAL respectively at higher compression ratio

For a comparative analysis, we compressed and decompressed the 2D images with JPEG2000 and JPEG and then converted them to 3D surfaces. Table 7 and Fig. 15 show the compression and decompression results respectively for JPEG2000. The comparison is based on applying the same compression ratios to JPEG2000 and to our approach and visualizing the results of the two methods; the JPEG algorithm simply failed to compress the images at the required ratios. Another important point is that JPEG2000 [26] cannot decompress some 2D images to a quality sufficient for 3D reconstruction or, if it does, the 3D surface contains some degradation. Additionally, Fig. 16 shows the 2D images compressed by JPEG2000 with zoomed-in image details.

Table 7 Compression and decompression for 3D reconstruction by JPEG2000 and JPEG at higher compression ratios. All images were compressed to the same size by the techniques
Fig. 15
figure 15

Top 3D reconstructed surface for FACE1 and FACE2 respectively using JPEG2000. Bottom CORNER image successfully 3D reconstructed, while METAL image failed 3D reconstruction

Fig. 16
figure 16

Details of 2D images decompressed by JPEG2000: Top FACE1 on the left is clearly blurred, leading to a degraded 3D reconstruction. Bottom the METAL image on the right is blurred, making it impossible to reconstruct a 3D surface

7 Conclusions

This paper has presented and demonstrated a new method for image compression and illustrated the quality of compression through 2D and 3D reconstruction and 2D and 3D RMSE measures. Our compression algorithm is based on a DCT applied to each row of an image, followed by a DST applied to each column of the resulting matrix. After the transformation stage, the high-frequency minimization algorithm is used to reduce the number of high-frequency coefficients. The compression stage is then completed with arithmetic coding. In the decoding stage, the Fast-Matching-Search algorithm, based on binary search, is used to recover the original data. The results show that our approach produces better image quality at higher compression ratios than JPEG and JPEG2000, as it can reconstruct 3D surfaces more accurately than both techniques. A slight disadvantage of the proposed method is that it is more complex than JPEG2000 and JPEG. This is because our approach uses two types of transform and a search-based decoder, whereas neither JPEG nor JPEG2000 relies on a search method.

The most important aspects of the method and their role in providing high image quality at high compression ratios are identified as follows:

1. The one-dimensional DCT can be applied to an entire image row (i.e. larger block sizes, \(\ge 8\)). Equally, the one-dimensional DST can be applied to each column of the output of the DCT.

2. The user can omit the scalar quantization that removes higher-frequency coefficients (i.e. keeping more coefficients increases image quality).

3. The two-dimensional quantization (cf. Eq. 5) is more aggressive, removing most of the matrix contents: about 50% of the matrix entries become zero. Applying it after the DST preserves image quality at higher compression ratios.

4. The final transformed matrix is divided into a low-frequency sub-matrix and horizontal and vertical high-frequency matrices.

5. The high-frequency minimization algorithm produces a Minimized-Array that replaces each set of three values from the high-frequency sub-bands with a single integer value. This process reduces the number of coefficients by 2/3, leading to increased compression ratios.

6. Since the Minimized-Arrays for both the vertical and horizontal high frequencies contain a large number of zeros, we applied a new method to eliminate the zeros and keep the non-zero data. The process keeps the significant information while reducing the data by up to 80%.

7. At the decompression stage, the Fast-Matching-Search algorithm is the engine for estimating the original data from the Minimized-Array; it depends on the organized key values and the availability of a set of unique data. The efficient C++ implementation allows this algorithm to recover the high-frequency matrices very efficiently.

8. The key values and the unique data are used for coding and decoding an image; without this information, images cannot be recovered. This is an important point, as a compressed image is equivalent to an encrypted image that can only be reconstructed if the keys are available. This has applications to the secure transmission and storage of images and video data.

9. Our proposed image compression algorithm was tested on true-colour and YCbCr layered images at high compression ratios. Additionally, the approach was tested on images resulting in better 3D reconstruction than JPEG2000 and JPEG.

10. The experiments indicate that the technique can be used for real-time applications such as 3D data objects and video data streaming over the Internet.

Future work is focused on efficient implementation of the decoding steps and their application to video compression. Research is underway and will be reported soon.