1 Introduction

Nowadays, LoRa technology is present in many Internet of Things (IoT) applications that use wireless sensor networks [1, 2]. Compared to other technologies such as Wi-Fi, Zigbee, GSM or Bluetooth, LoRa has a longer range, lower energy consumption and operates in unlicensed frequency bands. Nonetheless, due to its low bitrate, it is restricted to telemetry applications [3, 4]. It cannot be employed in monitoring systems with greater capabilities, such as those that transmit larger amounts of information in short bursts via images [5]. In this context, the proposed image encoder is designed for low-bitrate transmission modules, so that they can handle digital image transmission and thus gain enhanced monitoring capabilities. This work presents a use case in which the proposed encoder, implemented on a single-board computer, transmits images of hydrometric rulers, which are measuring tools that indicate water levels in remote regions. Quick visualization of the rulers’ images aids monitoring centers in regions with flooding hazards caused by seasonal El Niño events. The proposed encoder achieves this through progressive transmission: a low-resolution image is quickly displayed at the receiver first, and the visualization steadily improves in subsequent moments. Thus, the encoder design and the chosen transmission mode aim for minimum latency in image transmission when compared to other encoders such as JPEG2000, while maintaining the quality of the reconstructed image.

2 Related work

This section presents two groups of selected articles that focus on image transmission in low-bandwidth communication systems. On one hand, Table 1 describes the first group, which contains proposals that use low-bandwidth IoT modules. On the other hand, Table 2 presents the second group, which contains articles that seek to improve information transmission efficiency over a wider range of applications, not necessarily IoT settings.

Table 1 List of similar works using low-bandwidth IoT modules
Table 2 List of similar general-purpose works

The image transmission methods in Table 1 mostly handle only low-resolution, grayscale images. Clearly, these images cannot carry the information otherwise available in the color bands. The transmission schemes in Table 2 can handle higher resolutions and color images efficiently, but neither group of proposals evaluates implementations on embedded devices. These devices require balancing an acceptable quality of the decompressed image, reasonable image visualization delays and manageable computational loads.

With these considerations, the proposed encoder is a successful alternative that balances the use of color images, high compression rates and reduced transmission times with progressive decoding and low complexity. The encoder slightly outperforms the JPEG2000 encoder, which serves as a benchmark to highlight this work’s contribution.

3 Proposed methodology

The proposed encoder uses wavelet subband decomposition and reconstruction with multiresolution analysis, which gives it the distinctive ability to progressively reconstruct the image as each subband is transmitted and decoded. Thus, it achieves quality comparable to that of JPEG2000 [validated by evaluating the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) at fixed compression rates], at slightly better compression rates, for RGB images. Moreover, it enables a rapid first, low-quality visualization of the decoded image, whose quality then progressively increases as the subsequent subbands are received and decoded. The encoder and decoder were implemented in LoRa modules, where image transmission times were evaluated with and without encoding. A maximum transmission time reduction of 99.09% was achieved. For these reasons, the proposed image encoder and decoder constitute the main contribution of this work.

The following sections describe the encoding, data transmission and decoding stages.

3.1 Design of the proposed encoder

Figures 1 and 2 summarize the proposed image compression scheme, presenting the image encoder and decoder respectively. This section details the encoding and decoding steps, which aim to balance an acceptable image quality and a low computational load.

Fig. 1 Block diagram of the proposed encoder

Fig. 2 Block diagram of the proposed decoder

The proposed encoder and decoder were implemented on a Raspberry Pi 3B+ single-board computer [16]. Image acquisition was performed either via a webcam or a Raspberry Pi Camera Module 2, using the RGB color format in both cases.

Table 3 Evaluation of wavelet filters on hydrometric ruler images

The encoding process is described below:

Step 1:

Acquire the image in its primary RGB components, with a digital resolution of \(M\) rows and \(N\) columns, and convert it to the YCoCg color space to separate the luminance and chrominance components. Each component can be expressed as \({I}_{Y}\left(x,y\right)\), \({I}_{Co}\left(x,y\right)\) and \({I}_{Cg}\left(x,y\right)\) [17].
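For illustration, this conversion can be sketched as follows, assuming the standard floating-point YCoCg transform and NumPy arrays (the function name is illustrative, not part of the implementation):

```python
import numpy as np

def rgb_to_ycocg(rgb):
    """Separate an (M, N, 3) 8-bit RGB image into Y, Co and Cg planes."""
    r = rgb[..., 0].astype(np.float64)
    g = rgb[..., 1].astype(np.float64)
    b = rgb[..., 2].astype(np.float64)
    y  =  0.25 * r + 0.5 * g + 0.25 * b   # luminance I_Y(x, y)
    co =  0.5  * r           - 0.5  * b   # chrominance I_Co(x, y)
    cg = -0.25 * r + 0.5 * g - 0.25 * b   # chrominance I_Cg(x, y)
    return y, co, cg
```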

Step 2:

The Co and Cg chrominance components are subsampled according to the 4:2:0 color format [18]. The resulting subsampled chrominance components are \({I}_{SCo}\left(x,y\right)\) and \({I}_{SCg}\left(x,y\right)\).
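A minimal sketch of the 4:2:0 subsampling, assuming it is implemented by keeping one chroma sample per 2 × 2 pixel block (the decimation method is not detailed in the text):

```python
def subsample_420(chroma):
    """4:2:0 subsampling: keep one chroma sample per 2 x 2 pixel block,
    halving both dimensions (yields I_SCo or I_SCg)."""
    return chroma[::2, ::2]
```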

Step 3:

The luminance component is processed by a 3-level decomposition tree, while the chrominance components are processed by a 4-level decomposition tree [19]. By evaluating the PSNR and SSIM of the different subbands between the original and reconstructed images, the approximation subband \({I}_{Y,\mathrm{3,1}}\left(x,y\right)\), the horizontal detail subband \({I}_{Y,\mathrm{3,2}}\left(x,y\right)\) and the vertical detail subband \({I}_{Y,\mathrm{3,3}}\left(x,y\right)\) are chosen for the luminance component. The remaining subbands are neither encoded nor transmitted.

For both chrominance components, only the approximation subbands (\({I}_{SCo,\mathrm{4,1}}\left(x,y\right)\) and \({I}_{SCg,\mathrm{4,1}}\left(x,y\right)\)) are chosen; the remaining subbands are neither encoded nor transmitted. The different subbands will subsequently be expressed as \({I}_{subband}\left(x,y\right)\), and this notation is employed in the rest of the article.

Different wavelet families were analyzed and compared in order to choose the low-pass and high-pass filters [20]. A total of 35 hydrometric ruler images were used in the evaluation, which considered the computational load as well as the PSNR and SSIM metrics [21]. Table 3 compares the filters’ impulse response lengths in each family; this parameter is directly related to the computational load of the convolutions performed during subband decomposition and reconstruction.

The biorthogonal family offered the best trade-off between quality and computational load, and the bior3.9 wavelet was chosen for this encoder.
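The subband selection of Step 3 can be sketched with PyWavelets as follows; the bior3.9 wavelet and the tree depths come from the text above, while the helper names and the use of PyWavelets are illustrative assumptions:

```python
import pywt

def luminance_subbands(y):
    # wavedec2 returns [cA3, (cH3, cV3, cD3), (cH2, cV2, cD2), (cH1, cV1, cD1)]
    coeffs = pywt.wavedec2(y, wavelet='bior3.9', level=3)
    approx = coeffs[0]        # I_{Y,3,1}: approximation subband
    h3, v3, _ = coeffs[1]     # I_{Y,3,2} and I_{Y,3,3}; diagonal detail is dropped
    return approx, h3, v3

def chrominance_subband(c):
    coeffs = pywt.wavedec2(c, wavelet='bior3.9', level=4)
    return coeffs[0]          # I_{SCo,4,1} or I_{SCg,4,1}
```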

Step 4:

A uniform scale-factor quantizer is used, expressed in Eq. (1) [22]. Each quantizer’s resolution was chosen considering the human visual tolerance to distortion, through testing and comparison of the subjective quality and of the average PSNR and SSIM of diverse images. Finally, luminance subbands were quantized with \(r=5\) bits per sample and chrominance subbands with \(r=4\) bits per sample.

$${I}_{subband}^{Q}\left(x,y\right)=round\left(\frac{I_{subband}^{\prime}\left(x,y\right)}{fe_{subband}}\cdot\left(2^{r}-1\right)\right)$$
(1)

where \({I}_{subband}^{\prime}\left(x,y\right)={I}_{subband}\left(x,y\right)-{Imin}_{subband}\), which ensures that the subbands only have positive values.

Every scale factor and minimum scaling value for each luminance and chrominance subband must also be quantized and encoded. These numbers were quantized with 24 bits to avoid degrading the obtained image quality. Since the minimum scaling value of every subband is always negative, its absolute value is quantized. The scale factor and minimum scaling value quantization are expressed in Eqs. (2) and (3), respectively.

$$fe^Q_{subband} = round\left( fe_{subband} \cdot 10^5 \right)$$
(2)
$$Imin^Q_{subband} = round\left( \left| Imin_{subband} \right| \cdot 10^5 \right)$$
(3)
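A sketch of the quantizer in Eqs. (1)–(3) follows; it assumes that \(fe_{subband}\) is the peak value of the shifted subband (so the quantizer spans its full range), which the text does not state explicitly:

```python
import numpy as np

def quantize_subband(subband, r):
    """Uniform scale-factor quantization per Eqs. (1)-(3)."""
    imin = subband.min()                    # Imin_subband (negative in practice)
    shifted = subband - imin                # I'_subband, non-negative
    fe = float(shifted.max()) or 1.0        # assumed scale factor; guard all-zero case
    q = np.round(shifted / fe * (2 ** r - 1)).astype(np.uint8)  # Eq. (1)
    fe_q = round(fe * 1e5)                  # Eq. (2), stored in 24 bits
    imin_q = round(abs(imin) * 1e5)         # Eq. (3), stored in 24 bits
    return q, fe_q, imin_q
```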
Step 5:

The quantized subbands are further compressed with a Huffman entropy coder [23]. Static codebooks were generated from the subbands of 100 evaluation images. Two types of value distributions were predominantly obtained: those with occurrences at all numerical values for a given quantization bit depth, and distributions with missing values. The second type generated codebooks with, on average, a 5% higher coding gain. The total number of codebooks is 44, so a 6-bit number ("Codebook number") is transmitted in the packet header to indicate which codebook encodes a particular subband.

Before Huffman coding is applied to a subband, an algorithm determines which option occupies the fewest bits in the transmission packet. Whether Huffman coding has been used is indicated in the packet header by the 2-bit field "Huffman Enable": "00" indicates that no coding has been used, while "11" means that Huffman coding has been applied. Finally, every Huffman-encoded subband is subsequently denoted by \(I_{subband}^{QH} \left( {x,y} \right).\)
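The raw-versus-Huffman decision can be sketched as below, where `codebook` is assumed to map each quantized value to its Huffman bitstring (the codebook construction itself is not shown, and the helper name is illustrative):

```python
def encode_subband_bits(q_values, codebook, r):
    """Emit the subband either raw or Huffman-coded, whichever is shorter,
    together with the 2-bit "Huffman Enable" flag (Step 5)."""
    huffman = ''.join(codebook[v] for v in q_values)
    raw = ''.join(format(int(v), f'0{r}b') for v in q_values)
    if len(huffman) < len(raw):
        return '11', huffman   # Huffman coding applied
    return '00', raw           # raw quantized samples
```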

3.2 Data transmission and reception

The images and encoded data are transmitted in the LoRa packet payload (see Fig. 3), which contains a maximum of 146 bytes. The explicit packet type is used, which allows payloads of variable size [24]. The "Data type identifier" field (1 byte) indicates whether the packet carries header data or a coded subband: it has a value of 0 for header data and 255 for a coded subband. The "Counter" field (1 byte) contains the count of packets sent, and the "Information" field (up to 144 bytes) carries the header or coded subband data.

Fig. 3 Explicit LoRa packet format and data structure carried in the payload
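A sketch of the packetization, assuming frames are simply split into 144-byte chunks of the "Information" field (the splitting policy is an assumption):

```python
def build_packets(header_frame, subband_frames):
    """Assemble LoRa payloads per Fig. 3: "Data type identifier" (1 byte),
    "Counter" (1 byte) and "Information" (up to 144 bytes)."""
    packets, counter = [], 0
    for _ in range(3):                         # header redundancy (see below)
        packets.append(bytes([0, counter & 0xFF]) + header_frame)
        counter += 1
    for frame in subband_frames:
        for i in range(0, len(frame), 144):    # split into 144-byte chunks
            packets.append(bytes([255, counter & 0xFF]) + frame[i:i + 144])
            counter += 1
    return packets
```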

Header data is sent three times in three consecutive LoRa packets. This redundancy helps avoid reception errors in the header data, which would have a large negative impact on image reconstruction. Figure 4 shows the header data frame sent in the “Information” field.

Fig. 4 Header data frame

The 8-bit “Number padding bits” field indicates the number of bits appended to a field containing an encoded subband so that it always comprises an integer number of bytes. The 8-bit “Subband identifier” field indicates which subband is being encoded; each of the 5 encoded subbands has a unique 8-bit identifier code. The 8-bit “M” and “N” fields correspond to the number of rows and columns of the matrix that represents the subband. The 24-bit \(fe^Q\) field holds the quantized scale factor, and the 24-bit \(Imin^Q\) field holds the minimum scaling value used during subband quantization. The 2-bit “Huffman Enable” field indicates whether Huffman encoding was applied. The 16-bit “Subband size in bits” field stores the number of bits used to store the coded subband. Finally, the 6-bit “Codebook number” field indicates the codebook used for Huffman encoding, if applicable.
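Packing the Fig. 4 header into bytes can be sketched as follows; the field widths follow the description above (104 bits in total, i.e., 13 bytes), and the function name is illustrative:

```python
def pack_header(pad_bits, subband_id, m, n, fe_q, imin_q,
                huffman_enable, size_bits, codebook_id):
    """Serialize the Fig. 4 header data frame as a byte string."""
    bits = (format(pad_bits, '08b') + format(subband_id, '08b') +
            format(m, '08b') + format(n, '08b') +
            format(fe_q, '024b') + format(imin_q, '024b') +
            format(huffman_enable, '02b') + format(size_bits, '016b') +
            format(codebook_id, '06b'))        # 104 bits in total
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```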

Figure 5 shows the data frame of the encoded subband, sent in the “Information” field of Fig. 3. “Padding bits” holds the 0-valued bits required so that the frame comprises an integer number of bytes. Finally, \(I_{subband}^{QH} \left( {x,y} \right)\) or \(I_{subband}^Q \left( {x,y} \right)\) represents the subband, encoded with or without Huffman coding respectively.

Fig. 5 Encoded subband data frame

In the reception process, the binary data is reconstructed from the header and the received subband frames; Fig. 6 shows a flowchart of this process. Since the frames contain coded subbands, the data is concatenated until a complete subband is obtained. If a subband packet is lost, zeros are concatenated to the subband so that the image can still be reconstructed.

Fig. 6 Data reception process
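A sketch of the zero-filling logic for one subband, assuming lost packets are detected through gaps in the "Counter" field (the detection mechanism is an assumption):

```python
def assemble_subband(packets, expected_bytes):
    """Concatenate "Information" fields until a complete subband is obtained;
    gaps left by lost packets are filled with zeros (Fig. 6)."""
    data, last_counter = bytearray(), None
    for pkt in packets:                             # received packets of one subband
        counter, info = pkt[1], pkt[2:]
        if last_counter is not None:
            missing = (counter - last_counter - 1) & 0xFF
            data.extend(b'\x00' * (144 * missing))  # lost packets become zeros
        data.extend(info)
        last_counter = counter
    data.extend(b'\x00' * max(0, expected_bytes - len(data)))
    return bytes(data[:expected_bytes])
```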

3.3 Image reconstruction and restoration

The header data serves to reconstruct the 2-D matrix of each decoded subband and, with these matrices, to progressively reconstruct the image. Figure 7 illustrates the flowchart of the decoding [23] and de-quantization [22] process. This process may or may not include Huffman binary decoding, according to the header data. The process is described below:

Fig. 7 Decoding and subband de-quantization process

Step 1:

The de-quantization process is executed. Equations (4)–(6) describe the de-quantization, the scale factor \(fe^D_{subband}\) decoding and the \(Imin^D_{subband}\) value computation for each subband, respectively.

$$I_{subband}^D \left( {x,y} \right) = \left( {\frac{{I_{subband}^Q \left( {x,y} \right)}}{2^r - 1}\cdot fe^D_{subband} } \right) - Imin^D_{subband}$$
(4)
$$fe^D_{subband} = \frac{{fe^Q_{subband} }}{10^5 }$$
(5)
$$Imin^D_{subband} = \frac{{Imin^Q_{subband} }}{10^5 }$$
(6)
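A direct sketch of Eqs. (4)–(6):

```python
import numpy as np

def dequantize_subband(q, fe_q, imin_q, r):
    """Recover subband samples from their quantized form, Eqs. (4)-(6)."""
    fe = fe_q / 1e5                # Eq. (5): decoded scale factor
    imin = imin_q / 1e5            # Eq. (6): decoded |minimum| value
    return q.astype(np.float64) / (2 ** r - 1) * fe - imin   # Eq. (4)
```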
Step 2:

The components of the YCoCg color model are reconstructed from the subbands using the wavelet family and reconstruction trees described in Sect. 3.1 [19]. The reconstruction trees employ the inverse wavelet transform, where the subbands that were neither encoded nor sent are set to 0 during reconstruction. The resulting components are \(I_Y^D \left( {x,y} \right)\), \(I_{SCo}^D \left( {x,y} \right)\) and \(I_{SCg}^D \left( {x,y} \right)\).
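This reconstruction with zeroed subbands can be sketched with PyWavelets by decomposing a zero image of the original component size, which provides correctly shaped zero placeholders (the template trick is an illustrative assumption):

```python
import numpy as np
import pywt

def reconstruct_component(approx, details, full_shape, level):
    """Inverse wavelet reconstruction where unsent subbands enter as zeros."""
    template = pywt.wavedec2(np.zeros(full_shape), 'bior3.9', level=level)
    template[0] = approx                        # received approximation subband
    if details is not None:                     # luminance: H and V details
        h, v = details
        template[1] = (h, v, np.zeros_like(h))  # diagonal detail was never sent
    return pywt.waverec2(template, 'bior3.9')
```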

Step 3:

This last stage restores the image from the reconstructed YCoCg components. First, the chrominance components must be interpolated, since they were subsampled with the 4:2:0 scheme in the encoder. To keep the computational burden low, zero-order interpolation was applied [25]. The resulting components are \(I_Y^{D^{\prime}} \left( {x,y} \right)\), \(I_{SCo}^{D^{\prime}} \left( {x,y} \right)\) and \(I_{SCg}^{D^{\prime}} \left( {x,y} \right)\).
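Zero-order interpolation amounts to sample replication, for instance:

```python
import numpy as np

def upsample_420(chroma):
    """Zero-order interpolation: replicate each chroma sample over a
    2 x 2 block, restoring the full-resolution plane."""
    return np.repeat(np.repeat(chroma, 2, axis=0), 2, axis=1)
```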

With the recovered chrominance components, the YCoCg image is converted back to the RGB primary color space [17], where the numerical values are rounded and scaled to the range 0 to 255. Finally, the reconstructed RGB primary components \(I^{\prime}_R \left( {x,y} \right)\), \(I^{\prime}_G \left( {x,y} \right)\) and \(I^{\prime}_B \left( {x,y} \right)\) are arranged into a 3-D array for display in any graphical user interface. The complete process is summarized in Fig. 8.
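The inverse color transform, with rounding and clipping to the 8-bit range, can be sketched as (assuming the floating-point YCoCg transform used in Step 1):

```python
import numpy as np

def ycocg_to_rgb(y, co, cg):
    """Invert the YCoCg transform and shape the result as an (M, N, 3) array."""
    tmp = y - cg
    r, g, b = tmp + co, y + cg, tmp - co
    rgb = np.stack([r, g, b], axis=-1)
    return np.clip(np.round(rgb), 0, 255).astype(np.uint8)
```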

Fig. 8 Image restoration process

4 Experimental results and discussion

The PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index) metrics were used to validate the proposed encoder [21]. Both metrics are always reported together with the obtained compression factor, so that different image encoders can be compared at similar compression levels. Firstly, the PSNR measures the ratio of the maximum signal power to the average encoding noise, in decibels (dB), and is most meaningful in low-frequency, uniform image regions [26]. It is computed as:

$$PSNR\left( {dB} \right) = 10\log_{10} \left( {\frac{255^2 }{{MSE}}} \right)$$
(7)

where

$$MSE = \frac{1}{3 \times M \times N}\left( {\mathop \sum \limits_{x = 0}^{M - 1} \mathop \sum \limits_{y = 0}^{N - 1} \left[ {I^{\prime}_R \left( {x,y} \right) - I_R \left( {x,y} \right)} \right]^2 + \mathop \sum \limits_{x = 0}^{M - 1} \mathop \sum \limits_{y = 0}^{N - 1} \left[ {I^{\prime}_G \left( {x,y} \right) - I_G \left( {x,y} \right)} \right]^2 + \mathop \sum \limits_{x = 0}^{M - 1} \mathop \sum \limits_{y = 0}^{N - 1} \left[ {I^{\prime}_B \left( {x,y} \right) - I_B \left( {x,y} \right)} \right]^2 } \right)$$
(8)

Next, the SSIM is an objective evaluation metric ranging from 0 to 1, where 0 indicates null similarity. The metric is most meaningful in high-frequency, highly varying image regions [27]. The SSIM correlates better with subjective metrics, such as the Mean Opinion Score (MOS), than the PSNR does. The SSIM is computed between an original image, \(F\left( {x,y} \right)\), and a reconstructed image, \(V\left( {x,y} \right)\), both monochrome. The metric is computed over image blocks or regions and then averaged over all \(L\) blocks to obtain the mean SSIM (MSSIM) [27]:

$$MSSIM_{FV} = \frac{1}{L}\mathop \sum \limits_{l = 0}^{L - 1} SSIM_{FV,l}$$
(9)

where

$$\begin{aligned} & SSIM_{FV,l} = \frac{{\left( {2 \times \mu_{F,l} \times \mu_{V,l} + C_1 } \right) \times \left( {2 \times \sigma_{FV,l} + C_2 } \right)}}{{\left( {\mu_{F,l}^2 + \mu_{V,l}^2 + C_1 } \right) \times \left( {\sigma_{F,l}^2 + \sigma_{V,l}^2 + C_2 } \right)}} \\ & {\text{for}}\quad l = 0,1,2, \ldots , L - 1 \\ \end{aligned}$$
(10)
$$\mu_{F,l} = \mathop \sum \limits_{x = 0}^{P - 1} \mathop \sum \limits_{y = 0}^{P - 1} w\left( {x,y} \right)F_l \left( {x,y} \right)$$
(11)
$$\mu_{V,l} = \mathop \sum \limits_{x = 0}^{P - 1} \mathop \sum \limits_{y = 0}^{P - 1} w\left( {x,y} \right)V_l \left( {x,y} \right)$$
(12)
$$\sigma_{F,l}^2 = \mathop \sum \limits_{x = 0}^{P - 1} \mathop \sum \limits_{y = 0}^{P - 1} w\left( {x,y} \right)\left( {F_l \left( {x,y} \right) - \mu_{F,l} } \right)^2$$
(13)
$$\sigma_{V,l}^2 = \mathop \sum \limits_{x = 0}^{P - 1} \mathop \sum \limits_{y = 0}^{P - 1} w\left( {x,y} \right)\left( {V_l \left( {x,y} \right) - \mu_{V,l} } \right)^2$$
(14)
$$\sigma_{FV,l} = \mathop \sum \limits_{x = 0}^{P - 1} \mathop \sum \limits_{y = 0}^{P - 1} w\left( {x,y} \right)\left( {F_l \left( {x,y} \right) - \mu_{F,l} } \right)\left( {V_l \left( {x,y} \right) - \mu_{V,l} } \right)$$
(15)

Here, \(w\left( {x,y} \right)\) is a circularly symmetric Gaussian filter of size 7 × 7 \(\left( {P = 7} \right)\) [28], so that the SSIM computation is done over \(P \times P\) blocks. The l-th original image block is \(F_l \left( {x,y} \right)\) and the corresponding reconstructed block is \(V_l \left( {x,y} \right)\). Following the recommendations in [27], \(C_1 = 6.5025\) and \(C_2 = 58.5225\). For color images, as used by the proposed encoder, the final SSIM value is the average of the SSIM of each color component.

Another metric of interest is the compression factor \(\left( {Fc} \right)\), the ratio between the number of bits in the original image and the number of bits in the compressed image at the encoder output. It indicates how much smaller the compressed image is with respect to the original image. It is computed as:

$$Fc = \frac{Nbo}{{Nbc}}$$
(16)

where \(Nbo\) is the number of bits of the original image and \(Nbc\) is the number of bits of the encoded image.
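For reference, the PSNR of Eqs. (7)–(8) and the compression factor of Eq. (16) reduce to a few lines (an SSIM implementation is available, e.g., in scikit-image):

```python
import numpy as np

def psnr_db(original, reconstructed):
    """PSNR over the three RGB components, Eqs. (7)-(8); inputs are 8-bit."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)          # averages over all 3 x M x N samples
    return 10.0 * np.log10(255.0 ** 2 / mse)

def compression_factor(nbo, nbc):
    """Eq. (16): original size over compressed size, in bits."""
    return nbo / nbc
```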

4.1 Comparative evaluation of compression results

To evaluate the proposed encoder, the PSNR and SSIM metrics were first calculated at different image compression factors, for both the proposed encoder and the JPEG2000 encoder. The test images "Lena" [28] and "Monument" [29] were used. As mentioned in the introduction, other tests used images of hydrometric rulers, which measure water levels in remote areas; Fig. 9 shows these images. The acquired images were encoded and transmitted to a monitoring station with LoRa transceivers. The numbers on the rulers must not be distorted by the encoder, to avoid errors in the determination of water levels.

Fig. 9 Images of “Lena”, “Monument” and hydrometric rulers 1 and 2, ordered from left to right

Table 4 shows the results for the "Lena" and "Monument" images and the hydrometric rulers. In all cases, the PSNR and SSIM quality metrics for JPEG2000 and the proposed encoder are very similar, but the proposed encoder achieves a slightly better compression factor and is adaptable to different situations.

Table 4 Results of comparison between JPEG2000 encoder and the proposed encoder

4.2 Evaluation of progressive transmission results

Transmission timing tests were done with hydrometric ruler images, using the SX1272 LoRa transceiver [24]. Testing was done with a total of 20 images of size 248 × 1504 pixels. Table 5 shows the obtained results, with time measurements for each received subband and additional time measurements for the header frames. The transmission distances of the links were 2068, 4848 and 16,150 m. The maximum test distance required a minimum transmission time of 3 min and 33 s.

Table 5 Transmission time test results for three distances using the proposed encoder

The subband decomposition and reconstruction technique allows a progressive display of the image, and the quality of the reconstruction improves as the subbands are decoded at the receiver. The first decoded subband is the approximation subband, which gives a general idea of the characteristics of the transmitted image.

The proposed encoder can achieve a low-quality image transmission in 34 s over a 2068-m link, and in 66 s over a 16,150-m link. The quality then improves as the rest of the subbands reach the decoder. Figure 10 illustrates this progressive quality improvement.

Fig. 10 Progressive reconstruction stages, where each image (a–e) corresponds to the sequential reception of each subband. Left to right: first to fifth subband

4.3 Comparative evaluation of encoded and un-encoded transmissions

Table 6 presents the results of encoded and un-encoded image transmission tests. Clearly, there is a significant improvement in the transmission time of the encoded images compared to the un-encoded ones. For a 2068-m link, and using the same transceiver configuration as for the coded images, there is a 99.07% reduction in transmission time with respect to transmitting un-encoded images; for a 16,150-m link, the reduction is 99.09%. Therefore, the benefit of using the proposed encoder is evident: a faster transmission time and, consequently, lower energy usage and longer device autonomy.

Table 6 Comparison of transmission times with and without the proposed encoder

5 Conclusions and future scope

When compared to the standard JPEG2000 image encoder, the proposed encoder achieved similar quality levels, measured by the PSNR and SSIM, while reaching compression levels beyond the limits of JPEG2000. This is summarized in Table 4, where the proposal yields similar PSNR and SSIM to JPEG2000 but at slightly higher compression rates.

Moreover, the proposed encoder obtained a reduction in transmission times of up to 99.09% with respect to transmitting un-encoded images. Increasing the transmission distance did not have a significant impact on this time reduction, which could translate into lower energy consumption and greater device autonomy when the device is powered by batteries. Furthermore, the largest transmission distance was 16,150 m, and it could be increased by using the maximum capacity of the LoRa transceiver.

As another concluding remark, this work demonstrates that the proposed encoder first achieves an acceptable-quality image visualization with the reconstruction of the first received subband (the approximation subband). This quality then progressively improves as the rest of the subbands are received and decoded. This progressive transmission, reception and decoding scheme accommodates the encoder to the limitations of LoRa technology. Hence, there is a significant reduction in the transmission and visualization times of the information of interest.

Finally, a particular application of this encoder was explored: transmitting images of hydrometric rulers installed in water sources, as part of a project in which the environmental parameters and levels of these sources were remotely monitored. In this application, where water overflows after El Niño events can severely harm the surrounding population, the images complement the information transmitted by the water level sensors. Moreover, these sensors are prone to failure, which makes the complementary information brought by the hydrometric ruler images even more valuable. The proposed image encoder enables adding this redundancy in low-connectivity regions via LoRa networks.

Future work could improve the performance of the encoder by using artificial intelligence and machine learning in combination with the subband decomposition technique, while maintaining a low computational load in the overall algorithm. The encoder could also be implemented on other single-board computers, such as Toradex, Jetson Nano or Orange Pi boards.