1 Introduction

Wireless Video Sensor Networks (WVSNs), which consist of a large number of low-cost, low-power sensor nodes that transfer multimedia content, are becoming increasingly popular. However, establishing robust multimedia communication among sensor nodes is quite challenging [1]. In fact, traditional video coding techniques such as High Efficiency Video Coding (HEVC) [2] and H.266/VVC [3], with their high-complexity encoders and error drift, cannot be easily employed in WVSNs because of the low-power and low-cost requirements of the sensors. Therefore, many techniques have been proposed for WVSNs [4,5,6,7,8].

Such scenarios urgently need a low-complexity video coding technology with high compression performance and error resilience. Distributed Video Coding (DVC) [9] is a flexible video coding paradigm that shifts the complex tasks from the encoder to the decoder, which makes it very suitable for WVSNs [10,11,12,13,14,15]. Another important issue in multimedia transmission is the high error and loss rate of wireless channels [16], especially in WVSNs.

The fundamental idea of DVC is to employ a Distributed Joint Source-Channel Coding (DJSCC) scheme. Some works focus on wireless image transmission based on DJSCC [13, 17]. Bourtsoulatze et al. [18] proposed a novel JSCC scheme for wireless image transmission. Kurka et al. [17] introduced deep learning into JSCC for the successive refinement of images over wireless channels. Robust transmission schemes for DVC in WVSNs are urgently needed, but few research activities have focused on them. Yang et al. proposed a Robust Distributed Video Coding (RDVC) framework [19] and an efficient Scalable Distributed Video Coding (SDVC) scheme [20]. Duong proposed a novel Multiple Description Coding (MDC) method to enhance the robustness of video transmission over error-prone networks [21].

On the other hand, an important aspect of a DVC system is the Side Information (SI), which plays an important role in the compression and transmission performance because it serves as the estimate of the Wyner-Ziv Frames (WZF). The SI can be generated from WZF and Key Frames (KF), the latter being coded by traditional video coding [22,23,24,25,26,27,28,29]. Some researchers investigated transmission and error concealment of traditional video coding schemes in the presence of bit errors [30]. Nevertheless, the mentioned algorithms do not comprehensively consider how information loss in both KF and WZF affects the quality of the SI.

To sum up, there is still much room for improvement in video compression and transmission performance over wireless channels. At the same time, how to effectively implement robust transmission and obtain high-quality reconstructed video remains an important task in WVSNs. In order to achieve good compression while enabling progressive refinement of multimedia quality in WVSNs, this paper proposes a Progressively Refined Scheme (PRS) for DVC systems. The main contributions are:

  1. An Error Concealment (EC) algorithm for KF, which is based on the best match block. This is the first refinement step of the PRS and yields better SI.

  2. Repaired Layers (RLs) that can effectively reduce the impact of channel losses on the video quality. This step is also defined inside the encoder and decoder of the PRS.

  3. A refinement strategy for the decoder of the PRS, which relies on different Bit Plane (BP) levels.

The proposed PRS can significantly improve the quality of the SI and reduce the decoded bit rate, thus obtaining better rate-distortion (RD) performance for DVC in WVSNs. The rest of this paper is organized as follows. In Sect. 2, the necessary background of the proposed PRS is presented and the DVC system is briefly introduced. Then, the proposed PRS is presented in Sect. 3, followed by the details of the PRS strategies in Sect. 4. The simulation results and analysis are shown in Sect. 5. Finally, Sect. 6 concludes the paper.

2 Background

This section briefly presents the implementation of DVC based on DJSCC, which can potentially solve the problems of energy constraint of the sensors, video compression and transmission errors.

2.1 Compression in DJSCC

The DJSCC scheme involves two channels, as shown in Fig. 1. The ‘virtual’ correlation channel is assumed to be a Binary Symmetric Channel (BSC) and the actual channel is a Binary Erasure Channel (BEC).

Fig. 1 Scheme of distributed joint source-channel coding (DJSCC)

Let \(X\) be a source with entropy \(H(X)\) and let \(U\) be the parity check part of the encoded \(X\). \(U\) is transmitted through the BEC and \(Z\) is received by the joint decoder. The rate-distortion function is:

$$ R(d) = \min_{p_{\hat{X}|X}} I(X;\hat{X}) $$
(1)

where \(I\) refers to the mutual information between the source bits and the decoded bits, and \(d \ge 0\) refers to the distortion between the source \(X\) and the decoded bits \(\hat{X}\). Kaspi [31] defined the distortion function as:

$$ D:X \times \hat{X} \to \left[ {0,\infty } \right) $$
(2)

and then:

$$ E_{{X|\hat{X}}} \left( {D\left( {X,Dec\left( {Z,Y^{\prime}} \right)} \right)} \right) \le d $$
(3)

that is, when \(Y^{\prime}\) is closer to \(X\), the DJSCC decoder needs fewer parity check bits \(Z\) to produce \(\hat{X}\), which means that the compression of the DJSCC scheme is higher. In addition, because the DJSCC scheme produces fewer bits, fewer of them are lost in the wireless channel.
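
As a rough numerical illustration of this argument (not part of the original scheme), assume a uniform binary source, a BSC correlation channel with crossover probability `crossover`, and a BEC with erasure probability `erasure`. Under these assumptions the Slepian-Wolf bound requires at least h(crossover) parity bits per source bit, and the BEC delivers only a fraction 1 − erasure of the transmitted bits. The function names below are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def min_parity_rate(crossover: float, erasure: float) -> float:
    """Lower bound on transmitted parity bits per source bit.

    Assumptions of this sketch: uniform binary source, BSC(crossover)
    between the source and the side information, BEC(erasure) as the
    actual channel (capacity 1 - erasure).
    """
    if not 0.0 <= erasure < 1.0:
        raise ValueError("erasure probability must be in [0, 1)")
    return binary_entropy(crossover) / (1.0 - erasure)


if __name__ == "__main__":
    # Better side information (smaller crossover) lowers the bound.
    for p in (0.20, 0.10, 0.05):
        rate = min_parity_rate(p, erasure=0.10)
        print(f"crossover={p:.2f}  parity rate >= {rate:.3f} bits per source bit")
```

The bound decreases monotonically as the crossover probability drops below 0.5, which matches the statement that side information closer to \(X\) needs fewer parity bits.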

2.2 Low-complexity channel coding

Yang et al. [19] designed the Group Puncture Rate Adaptive Irregular Repeat Accumulate (GPRA-IRA) scheme, in which the \(N\)-bit source \(B\) is divided into \(k_{0}\) vectors of dimension \(l\), i.e., \(B_{0}\), …, \(B_{{k_{0} - 1}}\). The parity check can then be calculated as:

$$ P_{0} = \sum\limits_{j = 0}^{l - 1} {\left( {\sum\limits_{i = 0}^{l - 1} {L_{i,j} } } \right)} B_{j} $$
(4)

$$ P_{1} = \sum\limits_{j = 0}^{l - 1} {L_{0,j} B_{j} } + P_{0} $$
(5)

$$ P_{i + 1} = \sum\limits_{j = 0}^{l - 1} {L_{i,j} } B_{j} + L_{i,0} P_{0} + P_{i} ,\quad i = 1, \cdots ,l - 2 $$
(6)

where \(L\) is the check matrix of the GPRA-IRA codec. The encoding algorithm has linear complexity [19], so using the GPRA-IRA scheme as the channel coder of DJSCC further reduces the encoding complexity of DJSCC.
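
The recursion in Eqs. (4)–(6) can be sketched as follows. This is a literal transcription under assumptions that the text does not state explicitly: all arithmetic is over GF(2), each \(L_{i,j}\) is a square binary block, and each \(B_{j}\) and \(P_{i}\) is a binary vector of matching length. The names `encode_parity`, `matvec_gf2` and `xor_vec` are hypothetical, and the random blocks in the demo are placeholders rather than a real GPRA-IRA check matrix.

```python
import random


def matvec_gf2(mat, vec):
    """Multiply a binary matrix by a binary vector over GF(2)."""
    return [sum(a & b for a, b in zip(row, vec)) & 1 for row in mat]


def xor_vec(u, v):
    """Element-wise addition over GF(2)."""
    return [a ^ b for a, b in zip(u, v)]


def encode_parity(L, B):
    """Compute parity blocks P_0 .. P_{n-1} following Eqs. (4)-(6).

    L[i][j] is an m x m binary block and B[j] a length-m binary vector,
    with i, j in range(n). These shapes are an assumption of this sketch.
    """
    n, m = len(B), len(B[0])
    zero = [0] * m
    # row_sum[i] = sum_j L[i][j] * B[j]
    row_sum = []
    for i in range(n):
        acc = zero
        for j in range(n):
            acc = xor_vec(acc, matvec_gf2(L[i][j], B[j]))
        row_sum.append(acc)
    # Eq. (4): swapping the two sums shows P_0 is the XOR of all row sums.
    p0 = zero
    for r in row_sum:
        p0 = xor_vec(p0, r)
    parity = [p0]
    if n > 1:
        # Eq. (5): P_1 = sum_j L[0][j] B[j] + P_0
        parity.append(xor_vec(row_sum[0], p0))
    # Eq. (6): P_{i+1} = sum_j L[i][j] B[j] + L[i][0] P_0 + P_i, i = 1..n-2
    for i in range(1, n - 1):
        term = xor_vec(row_sum[i], matvec_gf2(L[i][0], p0))
        parity.append(xor_vec(term, parity[i]))
    return parity


if __name__ == "__main__":
    rng = random.Random(1)
    n, m = 4, 3  # placeholder sizes
    L = [[[[rng.randint(0, 1) for _ in range(m)] for _ in range(m)]
          for _ in range(n)] for _ in range(n)]
    B = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n)]
    print(encode_parity(L, B))
```

Each parity block after \(P_{0}\) depends only on one row of blocks, \(P_{0}\) and the previous parity block, which is the accumulate structure that keeps the encoder simple.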

3 The proposed PRS for wireless video sensor networks

The purpose of robust video transmission is to produce the least distortion for the decoded video while achieving suitable compression performance. A complete PRS is designed in this paper and shown in Fig. 2. Some of the abbreviations used in the figure are listed in Table 1.

Fig. 2 The Progressively Refined Scheme for WVSN

Table 1 Full names of the abbreviations used in Fig. 2

In short, the proposed scheme consists of two DJSCC schemes: one is the repair layer for the corrupted KF, and the other one is the coder for the WZF. Therefore, the rate-distortion function in Eq. (3) is modified as Eq. (7).

$$ E_{{X\hat{X}}} \left( {D\left( {X,Dec\left( {Z,Dec\left( {Y^{\prime},Z^{\prime}} \right)} \right)} \right)} \right) \le E_{{X\hat{X}}} \left( {D\left( {X,Dec\left( {Z,Y^{\prime}} \right)} \right)} \right) $$
(7)

In this scheme, if some bits of \(Y\) are lost, the parity check bits of the RLs used in the decoder bring \(Y\) closer to \(X\), which means that the correlation noise between \(X\) and \(Y\) is reduced. Thanks to the two SR schemes, the distortion is reduced further with respect to the value in Eq. (3) in Sect. 2.1. On the other hand, if no losses affect \(Y\), the PRS does not need the RLs for KF.

4 Progressively refined strategy for corrupted KF

Since the DVC system is a joint decoder framework, much more useful information can be referenced by the decoded frame, such as the previous and next adjacent frames. In addition, if some information or the entire frame is lost during transmission, an error concealment scheme can exploit this abundant auxiliary information to repair the lost block. As shown in Fig. 2, when the decoder receives the HEVC stream, it can determine whether any information has been lost. If there is no loss, the video stream can be reconstructed directly by the HEVC Intra-DEC without any RL. If there is some packet loss, the corrupted KF is first processed by the EC scheme, then by SR, and lastly repaired by the RLs, as shown in Fig. 3.

Fig. 3 Scheme of the Refine Layer with the PIRA based on DWT

4.1 The error concealment strategy for KF

Based on the concept of SI, some researchers proposed that, in case of partial data loss in KF, the corrupted KF first be taken as SI and then processed by the WZ codec [32]. However, this method considers neither the compression efficiency of KF nor the value of the correctly decoded part of KF. Exploiting the temporal correlation of frames, the proposed error concealment algorithm always relies on the currently decoded KF before it is modified, even if it has lost some information, as shown in Fig. 4a. If a portion of the corrupted KF (\(X_{t}\)) is lost, the HEVC decoder knows which block \(B_{t}\) is lost and its pixel position \((x,y)\). The diagram of the EC is shown in Fig. 4b.
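
The EC procedure itself is given only as a diagram, so the following is a minimal sketch of one common best-match-block approach and not necessarily the authors' exact algorithm. It assumes that the reference is an adjacent decoded frame, that the pixels surrounding the lost block were received correctly, and that candidates are ranked by the sum of absolute differences (SAD) over a one-pixel ring around the block. The names `conceal_block` and `ring_coords`, the block size and the search range are hypothetical.

```python
def ring_coords(x, y, size, width, height):
    """Pixels of the one-pixel ring around the block whose top-left is (x, y)."""
    coords = []
    for cy in range(y - 1, y + size + 1):
        for cx in range(x - 1, x + size + 1):
            inside_block = x <= cx < x + size and y <= cy < y + size
            inside_frame = 0 <= cx < width and 0 <= cy < height
            if inside_frame and not inside_block:
                coords.append((cx, cy))
    return coords


def conceal_block(current, reference, x, y, size=8, search=4):
    """Replace the lost block at (x, y) in `current` by its best match.

    Frames are lists of rows (frame[row][col]) of luminance values and the
    lost block must lie inside the frame. The ring around the lost block is
    compared against displaced rings in `reference`; the displacement with
    the smallest SAD wins. Returns the chosen displacement (dx, dy).
    """
    height, width = len(current), len(current[0])
    ring = ring_coords(x, y, size, width, height)
    best = None  # (cost, dx, dy)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that push the ring or block out of the frame.
            if any(not (0 <= cx + dx < width and 0 <= cy + dy < height)
                   for cx, cy in ring):
                continue
            if not (0 <= x + dx <= width - size and 0 <= y + dy <= height - size):
                continue
            cost = sum(abs(current[cy][cx] - reference[cy + dy][cx + dx])
                       for cx, cy in ring)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    _, dx, dy = best
    # Copy the best-matching block from the reference into the lost area.
    for row in range(size):
        for col in range(size):
            current[y + row][x + col] = reference[y + dy + row][x + dx + col]
    return dx, dy
```

Matching on the ring rather than on the block itself is what makes the search possible when the block content is unknown at the decoder.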

Fig. 4 The EC strategy for corrupted KF

4.2 The refined layers for KF

In order to design more suitable channel codes for a lossy environment, some researchers have studied channel codes based on check bits. Chi [33] found that the Irregular Repeat Accumulate (IRA) code performs close to the Shannon limit. The encoder and decoder of the RLs and WZ bits are very important in the proposed PRS. This paper proposes a WZ bits encoder scheme based on DWT, which builds on the GPRA-IRA [19] and is called the Parallel Irregular Repeat Accumulation (PIRA) code, as shown in Fig. 3. At the encoder side, frames (WZF or KF) are transformed by DWT and decomposed with an N-level wavelet. When N = 3, we obtain 10 sub-bands:

$$ DWT(F) = \left\{ {HL_{n} ,LH_{n} ,HH_{n} ,LL_{3} } \right\}(n = 1,2,3) $$
(8)

where \(F\) is the encoded frame. The proposed PRS takes the level-1 wavelet sub-bands as the Refined Layer 1 (RL1), the level-2 sub-bands as the Refined Layer 2 (RL2) and the level-3 sub-bands as the Refined Layer 3 (RL3):

$$ \begin{gathered} RL1 = \left\{ {HL_{1} ,LH_{1} ,HH_{1} } \right\} \hfill \\ RL2 = \left\{ {HL_{2} ,LH_{2} ,HH_{2} } \right\} \hfill \\ RL3 = \left\{ {HL_{3} ,LH_{3} ,HH_{3} ,LL_{3} } \right\} \hfill \\ \end{gathered} $$
(9)
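
The grouping in Eq. (9) can be sketched as follows. The wavelet filter is not specified in the text, so this example assumes an unnormalized Haar (averaging) filter and frame dimensions divisible by 8. The names `haar2d` and `decompose` and the dictionary layout are hypothetical, and the HL/LH naming convention varies between references.

```python
def haar2d(frame):
    """One level of 2-D Haar analysis. Returns (LL, HL, LH, HH)."""
    half_h, half_w = len(frame) // 2, len(frame[0]) // 2
    ll = [[0.0] * half_w for _ in range(half_h)]
    hl = [[0.0] * half_w for _ in range(half_h)]
    lh = [[0.0] * half_w for _ in range(half_h)]
    hh = [[0.0] * half_w for _ in range(half_h)]
    for r in range(half_h):
        for c in range(half_w):
            a, b = frame[2 * r][2 * c], frame[2 * r][2 * c + 1]
            d, e = frame[2 * r + 1][2 * c], frame[2 * r + 1][2 * c + 1]
            ll[r][c] = (a + b + d + e) / 4.0  # low-pass in both directions
            hl[r][c] = (a - b + d - e) / 4.0  # high-pass across columns
            lh[r][c] = (a + b - d - e) / 4.0  # high-pass across rows
            hh[r][c] = (a - b - d + e) / 4.0  # high-pass in both directions
    return ll, hl, lh, hh


def decompose(frame, levels=3):
    """Group the sub-bands of a `levels`-level DWT into refined layers.

    Returns {"RL1": {...}, "RL2": {...}, "RL3": {...}} where the last
    layer also carries the final low-pass band, as in Eq. (9).
    """
    step = 1 << levels
    if len(frame) % step or len(frame[0]) % step:
        raise ValueError("frame dimensions must be divisible by 2**levels")
    layers = {}
    ll = frame
    for n in range(1, levels + 1):
        ll, hl, lh, hh = haar2d(ll)
        layers[f"RL{n}"] = {f"HL{n}": hl, f"LH{n}": lh, f"HH{n}": hh}
    layers[f"RL{levels}"][f"LL{levels}"] = ll
    return layers


if __name__ == "__main__":
    frame = [[(r * c) % 256 for c in range(16)] for r in range(16)]
    for name, bands in decompose(frame).items():
        sizes = {k: (len(v), len(v[0])) for k, v in bands.items()}
        print(name, sizes)
```

For a 352 × 288 frame, this layout gives 176 × 144 sub-bands in RL1, 88 × 72 in RL2 and 44 × 36 in RL3, so the coarser layers carry far fewer coefficients.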

Considering that the size of RL2 is double that of RL3, in this paper we partition RL3 into different Bit Plane (BP) levels, code them by PIRA to generate parity check bits, and store these bits in a buffer. Finally, a certain number of bits is supplied to the decoder for WZF decoding.
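
A minimal sketch of bit-plane partitioning is given below. It assumes that the coefficients have already been quantized to non-negative integers (sign handling and the PIRA coding of each plane are omitted), and the names `to_bit_planes`, `from_bit_planes` and `n_planes` are hypothetical.

```python
def to_bit_planes(coeffs, n_planes):
    """Split non-negative integers into bit planes, most significant first."""
    if any(c < 0 or c >= (1 << n_planes) for c in coeffs):
        raise ValueError("coefficients must fit in n_planes bits")
    return [[(c >> k) & 1 for c in coeffs] for k in range(n_planes - 1, -1, -1)]


def from_bit_planes(planes, n_planes):
    """Rebuild integers from the planes decoded so far (MSB first).

    Planes that have not been received yet are treated as zero, so passing
    more planes progressively refines the reconstruction.
    """
    if not planes:
        return []
    values = [0] * len(planes[0])
    for idx, plane in enumerate(planes):
        shift = n_planes - 1 - idx
        for i, bit in enumerate(plane):
            values[i] |= bit << shift
    return values


if __name__ == "__main__":
    coeffs = [5, 0, 7, 2]
    planes = to_bit_planes(coeffs, n_planes=3)
    for received in range(1, len(planes) + 1):
        print(received, from_bit_planes(planes[:received], n_planes=3))
```

Decoding only the leading planes gives a coarse approximation of each coefficient, and every additional plane halves the worst-case error, which is the behavior a BP-based refinement strategy relies on.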

As in [19], the \(N\)-bit source \(B\) is divided into \(k_{0}\) vectors of dimension \(l\), i.e., \(B_{0}\), …, \(B_{{k_{0} - 1}}\), and then encoded as shown in Eqs. (4)–(6). This framework can effectively reduce the complexity of the coder and the impact of channel losses on video quality, thus improving the RD performance in a wireless error-prone environment.

5 Simulation results

5.1 Experimental settings

In order to verify the performance of the proposed PRS, experiments are implemented using the following conditions:

  1. Sequences: two video sequences, Foreman.cif and Hall.cif (resolution 352 × 288, 30 frames per second in both cases), are tested and shown in the evaluations. The first 100 frames of each sequence are tested.

  2. Conditions for KF: HEVC/H.265 Intra-coding, which is popularly used in DVC systems, is employed to code the KF. The quantization step of KF is chosen among five different values: {22, 28, 30, 32, 34}.

  3. Conditions for WZF: the DISCOVER system presented in [9] is used. Its \(4 \times 4\) quantization matrices Q1, Q4, Q7 and Q8 are selected as RD points. The experimental conditions for KF in item 2 are also used to code the KF in DISCOVER.

  4. Channel coding: following our proposed work, each BP of the sub-bands is coded by PIRA.

  5. Packet loss: in order to measure the effect of different loss rates on the transmission, two loss rates are used: 5% and 10%.

In our simulations, the sequences are encoded and decoded by the HEVC reference software, and the Peak Signal-to-Noise Ratio (PSNR) of the luminance component is used as the objective video quality measure. The corrupted frames are then rebuilt by means of the Progressively Refined Scheme (PRS) proposed in this paper and, for comparison purposes, by means of the Motion Vector Recovery (MVR) method in [30], the Robust Distributed Video Coding (RDVC) method in [19] and the Multiple Description Coding (MDC) method in [21], respectively.
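
For reference, the luminance PSNR used as the quality measure can be computed as in the sketch below. It assumes 8-bit samples (peak value 255) and frames stored as lists of rows; the function name `psnr` is hypothetical and the code is independent of the HEVC reference software.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized luminance frames."""
    if len(reference) != len(decoded) or any(
        len(a) != len(b) for a, b in zip(reference, decoded)
    ):
        raise ValueError("frames must have the same dimensions")
    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for a, b in zip(ref_row, dec_row):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    if squared_error == 0:
        return math.inf  # identical frames
    mse = squared_error / count
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [[100, 120], [140, 160]]
    damaged = [[100, 118], [143, 160]]
    print(f"{psnr(original, damaged):.2f} dB")
```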

5.2 The progressive refinement performance of KF

This first set of experiments illustrates the performance of the proposed PRS, which is composed of RL1, RL2 and RL3, when some blocks of KF are lost. The KF is coded by HEVC/H.265 Intra-coding with the quantization step set to 22. The loss rate is set to 10%. The impact of transmission losses is measured by the PSNR of each frame. The PSNR values for the first 20 frames are shown in Fig. 5, in which the abscissa represents the frame number and the ordinate represents the PSNR value. For comparison purposes, the figure also shows the upper bound of the PSNR, i.e., the performance of HEVC/H.265 Intra-coding when there is no loss, and the lower bound, i.e., the PSNR when HEVC/H.265 Intra-coding is used with a 10% loss rate and no refinement techniques.

Fig. 5 The PSNR of the experiment sequences at 10% loss rate

In the RDVC method [19], the encoder structure of KF includes the protection of WZ bits in the Discrete Cosine Transform domain based on DJSCC. Figure 5 shows fluctuations of the PSNR under this protection of WZ bits [19]. They are due to the decoding failure of some bit planes, which affects the reconstruction quality of the video frames. As expected, the PSNR of the corrupted KF repaired by the three RLs in the proposed PRS is higher than the lower bound. Moreover, the results show a progressively refined quality when moving from RL1 to RL2 and RL3. Note also that the quality of RL3 in the proposed PRS is very close to the upper bound, i.e., the no-loss performance. This result suggests that, in general, the proposed PRS is more suitable than the other compared methods for a wireless error-prone transmission environment, such as the one found in WVSNs.

5.3 The rate-distortion performance of KF

In order to show the contribution of the proposed PRS, we test the RD performance of the KF. Results are shown in Fig. 6. Four different HEVC Intra-ENC quantization steps are used: {28, 30, 32, 34}. When some information is lost in KF, the quality of HEVC Intra-DEC deteriorates, which results in a significant decrease in the RD performance, as shown by the bottom line in Fig. 6.

Fig. 6 The RD performance of KF for 5% loss rate

However, the refinement system of the proposed PRS significantly improves the RD performance for the tested sequences compared to the other methods in the comparison. The MVR method reconstructs the motion vector of a damaged macro-block by means of the fitting plane algorithm which, in turn, is based on the improved boundary matching algorithm; the optimal motion vector is then selected to reconstruct the damaged images [30]. In the MDC method, instead, WZ frames are sub-sampled into four parallel low-resolution images to generate four descriptions, which are then encoded and transmitted to the decoder independently. At the receiver, the MDC method uses a successively refined SI algorithm to exploit both temporal and spatial correlations between subsequent video frames [21]. For instance, for the ‘Hall’ sequence at 5% loss rate, Fig. 6a shows that, after repair at 350 kbps, the PSNR of the proposed PRS is 10 dB higher than the non-refined case, 5 dB higher than the MVR method [30] and 2 dB higher than the MDC method [21]. The essence of the proposed PRS is to improve the PSNR of KF by means of the RLs (additional code rate), thereby improving the SI quality and ultimately the performance of the whole DVC system. The figure also shows that the performance of two methods, i.e., RDVC [19] and MDC [21], is very similar. A similar behavior can be observed for the Foreman sequence, shown in the second part of the figure.

Finally, note that the proposed PRS can achieve robust transmission of KF, which is very important for a DVC transmission scheme when used in a wireless communication environment.

5.4 The rate-distortion performance of the proposed system

In order to evaluate the effect of the proposed PRS on the performance of the whole DVC system used in a wireless environment, Fig. 7 shows the overall RD performance under different loss rates (5% and 10%). The GOP size is set to 8 (i.e., one KF followed by seven WZF). Four quantization parameters \(Q_{w}\) {5, 10, 15, 20} are tested.

Fig. 7 The overall RD performance in WVSN for different loss rates

As the loss rate increases, the performance of the proposed PRS is less affected than that of the MDC method [21] and is closer to the no-loss performance provided by DISCOVER [9], as shown in Fig. 7. In particular, for the Hall sequence, when the channel conditions deteriorate, the PSNR of the PRS is 2 to 3 dB higher than that of the MDC method [21], as shown in Fig. 7a.

Note also that the PRS exhibits better PSNR for the reconstructed KF when subject to loss (which is also shown in Fig. 6). This in turn results in better SI, so the PRS needs less rate for decoding WZF. This further confirms the rate-distortion relation in Eq. (7): a lower distortion between the source and the SI means that fewer bits are needed for decoding. In case of loss, the code rate of the PRS is reduced by half compared to the MDC method, which is the advantage of the proposed PRS. Similar behavior can be observed for the Foreman sequence.

The results of Fig. 7 clearly show that the proposed PRS is a robust transmission scheme suitable to protect the transmission of KF and WZF. The decoder of SR for KF can, in fact, generate better SI, which reduces the transmission rate of WZF. In summary, the proposed PRS is more robust.

6 Conclusions

In order to achieve better compression and error concealment performance when transmitting multimedia content in WVSNs, this paper proposed a Progressively Refined Scheme suitable for both KF and WZF. The proposed scheme is shown to significantly improve RD performance compared to other existing methods. The experimental results show that the PSNR achieved by the proposed algorithm is closer to the performance that can be achieved when no errors or losses are present. The overall RD performance illustrates that the proposed PRS has robust transmission characteristics, which is a key element for solving the video transmission problem in WVSNs.