1 Introduction

3D Multi-view Video Coding (3D-MVC) has received broad attention and is expected to rapidly replace traditional 2D video in numerous applications [1,2,3]. In 3D-MVC, the original 3DV consists of multiple video streams captured of the same scene by different cameras. 3DV streaming over wireless and Internet networks has increased dramatically [4, 5]. Thus, to transport 3DV over resource-limited networks, a highly efficient compression algorithm must be employed whilst maintaining high reception quality. Therefore, the MVC must exploit the temporal and spatial correlations between frames of the same stream, in addition to the inter-view correlation among the different 3DV streams, to enhance the coding performance. However, highly compressed video is more sensitive to transmission channel errors.

3DV transmission over wireless channels is inevitably subject to both burst and random packet errors [6, 7]. The predictive 3D-MVC framework presented in Fig. 1 is utilized to compress the broadcast 3DV sequences [8,9,10]. It uses inter-coded (P and B) and intra-coded (I) frames. Consequently, errors may propagate to neighbouring views or to subsequent frames, severely degrading the 3DV quality. Because of the delay constraints of video broadcasting, it is difficult to re-transmit all lost or erroneous Macro-Blocks (MBs). Therefore, error control techniques are needed, such as pre-processing Error Resilience (ER) algorithms at the encoder and post-processing Error Concealment (EC) techniques at the decoder. The ER algorithms make the transmitted 3DV streams more robust to errors and improve the performance of the EC techniques at the decoder side. The EC algorithms are recommended because they can mitigate channel errors without increasing the latency or requiring complicated encoder modifications [11,12,13,14,15].

Fig. 1 The 3D video hierarchical B-prediction coding structure

Different ER and data retrieval algorithms may be employed to reduce the impact of packet losses before 3DV transmission. In previous works [16,17,18,19,20,21,22,23], Automatic Repeat reQuest (ARQ), Forward Error Correction (FEC), and various multimedia retrieval algorithms have been proposed. Unfortunately, they increase both the transmission bit rate and the transmission delay. Therefore, in this paper, we utilize more dynamic ER tools that neither add redundant data nor increase the transmission bit rate. The proposed ER algorithms adaptively exploit the available 3DV content information to protect the transmitted 3DV bit streams prior to transmission. We also propose EC algorithms that enhance the received 3DV quality without modifying the encoder software or hardware or increasing the transmitted bit rate. The proposed EC algorithms exploit the inter-view and intra-view correlations among views and frames to adequately recover the erroneous MBs of inter- and intra-frames.

3DV-plus-Depth (3DV + D) is another 3D video representation format that has recently been discussed intensively [3]. It contains per-pixel geometry data about the scene. The depth information can be utilized to identify object boundaries and thereby assist the EC mechanism. Moreover, it reduces the number of stored 3DV bits compared to traditional 2D video. Existing depth-based EC works require the depth maps to be encoded and transmitted through the network [24,25,26,27,28,29,30]. Thus, they increase the transmission latency and are unsuitable for band-limited wireless networks. Yan et al. [24] introduced a Depth-aided Boundary Matching Algorithm (DBMA) for color-plus-depth sequences. In addition to the zero MV and the MVs of the collocated and spatially neighbouring MBs, an MV estimated from depth map motion calculations is appended to the set of candidate MVs. Yunqiang et al. [25] proposed a Depth-dependent Time Error Concealment (DTEC) technique for the color streams. In addition to the available MVs of the MBs surrounding a corrupted MB, the MV of the corresponding depth map MB is added to the candidate set. Furthermore, the depth MB corresponding to a corrupted color MB is classified as homogeneous or boundary. A homogeneous MB is recovered as a whole, whereas a boundary MB may be divided into background and foreground parts that are recovered separately. The DTEC method can differentiate between background and foreground, but it fails to give adequate results when depth information is absent at the positions corresponding to the color sequence. In [26], Chung et al. suggested a frame loss concealment scheme, where precise estimates of the corrupted color MVs are computed with the assistance of depth data.

Tai et al. [27] suggested an efficient full-frame concealment algorithm for depth-based 3D videos. This method segments objects according to temporal and depth information. The TME and the depth map are exploited to estimate the MVs of the lost depth MBs by extrapolating each object in the reference frame to recover the corrupted frame. Khattak et al. [28] presented a consistent framework for 3D video-plus-depth concealment that achieves high levels of temporal and inter-view consistency. Assuncao et al. [29] introduced an EC method for intra-coded depth maps based on spatial intra- and inter-view techniques. This method exploits neighbouring depth data and the correctly received color images. An efficient three-stage processing algorithm was devised to reconstruct sharp depth transitions, i.e. lost depth contours. Wang et al. [30] presented an MB distinction model for both texture and depth video concealment to alleviate the virtual view quality degradation induced by packet loss. It is noticed that most of the above-mentioned depth-assisted EC works extract the depth data at the encoder and require the depth maps to be coded and transported through the network to the decoder. Hence, they increase the computational complexity and the bit rate of the 3DV transmission system, which conflicts with the bandwidth constraints of future wireless communication networks.

To avoid the limitations of the related EC works, we propose a depth-assisted EC algorithm that predicts the depth maps at the decoder side, rather than at the encoder, by estimating the depth DVs and MVs from the received color and depth 3DV bit streams. Hence, to transmit the 3DV bit streams efficiently over erroneous wireless channels and to handle the heavy MB losses with which the literature EC works fail to deal [8,9,10,11,12,13,14,15, 24,25,26,27,28,29,30], we suggest multi-stage error control schemes for robust color and depth 3DV transmission over severely corrupted wireless channels. At the encoder, we propose to employ Context Adaptive Variable Length Coding (CAVLC), Slice Structured Coding (SSC), and Explicit Flexible Macro-block Ordering (EFMO). These encoder-side methods improve the performance of the suggested concealment schemes at the decoder, enabling efficient recovery of the corrupted frames. At the decoder, a hybrid approach of the Directional Interpolation Error Concealment Algorithm (DIECA) and the Directional Textural Motion Coherence Algorithm (DTMCA) is suggested to estimate the DVs and MVs of the corrupted color frames. For the lost depth frames, an improved depth-based EC algorithm is proposed, called the Encoder Independent Decoder Dependent Depth-Assisted Error Concealment (EIDD-DAEC) algorithm. Furthermore, the Weighted Overlapping Block Motion and Disparity Compensation (WOBMDC) algorithm is proposed to select the best concealment MVs and DVs. Finally, to further enhance the decoded 3DV quality, we propose applying a Bayesian Kalman Filter (BKF) to refine the overall output of the proposed multi-stage error control algorithms.

The rest of this paper is organized as follows. Section 2 introduces the 3D-MVC prediction framework, the BKF model, and the basic EC algorithms. In Sect. 3, the proposed multi-stage error control algorithms are explained. Simulation results and comparative analysis are presented and summarized in Sect. 4. The concluding remarks are given in Sect. 5.

2 Related works

2.1 3D video prediction structure (3DV-PS)

2D video encoding differs greatly from 3D video coding [31]. In this work, the 3DV-PS shown in Fig. 1 is used because of its efficient encoding and decoding. The 3DV-PS benefits from the high inter-view correlation between different views, as well as from the spatio-temporal correlations among frames within the same view. It uses a Group Of Pictures (GOP) containing eight temporal pictures. The vertical direction represents the multiple camera views, while the horizontal direction represents the temporal axis. Disparity Compensated Prediction (DCP) and Motion Compensated Prediction (MCP) are used at the encoder to achieve efficient 3DV compression. Moreover, they may be utilized at the decoder to perform an efficient concealment process [32].

In the 3DV-PS, the MCP is utilized to estimate the MVs between frames of the same view, and the DCP is used to estimate the DVs between frames of adjacent views. Hence, each frame in the 3DV-PS can be predicted from temporally neighboring frames and/or from frames of other views. For example, the V0 view is encoded using temporal correlation only, based on the MCP. The other even views V2, V4, and V6 are likewise encoded based on the MCP, while their key frames are encoded via inter-view prediction. In the odd views V1, V3, and V5, inter-view and temporal prediction are used simultaneously to enhance the coding performance. In this work, each 3DV view is named after its key-frame type. Therefore, as presented in Fig. 1, the odd views (V1, V3, and V5) are called B-views, the even views (V2, V4, and V6) are referred to as P-views, and the V0 view is referred to as the I-view. The final view may be even or odd depending on the chosen 3DV GOP prediction structure; here, it is a P-view.

The 3DV-PS consists of two encoded frame types: inter- and intra-frames. The inter-frames in the B-views are predicted from the intra-frames of the I-view and from the inter-frames of the P-views. Hence, if an error occurs in an I-frame or a P-frame, it propagates to the related inter-view frames and to the adjacent temporal frames of the same view. Therefore, we propose different EC scenarios for concealing the lost color and depth inter- and intra-frames in the B-, I-, and P-views. The suitable EC scenario is then selected depending on the lost view and frame types to recover the corrupted MBs.
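These view-dependent prediction sources can be summarized programmatically. The following minimal sketch (our own illustrative Python, not part of the JMVC reference software; the direction labels are assumptions) encodes which reference directions are available for concealing a lost frame, mirroring the scenarios of Fig. 1:

```python
# Illustrative sketch: reference directions available for concealing a lost
# frame in the 3DV-PS of Fig. 1, assuming eight views whose last view is a
# P-view (as in the structure adopted in this work).
def concealment_candidates(view_index, is_key_frame, num_views=8):
    """Return the prediction directions usable for concealing a lost frame."""
    last = (view_index == num_views - 1)       # final view: a P-view here
    if view_index == 0:                        # I-view V0
        return ["spatial"] if is_key_frame else ["temporal_prev", "temporal_next"]
    if view_index % 2 == 0 or last:            # P-views V2, V4, V6, ...
        return ["inter_view_left"] if is_key_frame else ["temporal_prev", "temporal_next"]
    return ["temporal_prev", "temporal_next",  # B-views V1, V3, V5
            "inter_view_left", "inter_view_right"]

# A lost inter-frame in B-view V3 has four candidate reference directions.
print(concealment_candidates(view_index=3, is_key_frame=False))
```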

2.2 Bayesian Kalman Filter (BKF) model

A Kalman Filter (KF) is a recursive Bayesian filtering mechanism that describes the mathematical model of a discrete-time observed process [33]. It is used to predict the discrete-time states s(k), which evolve according to a stochastic linear difference equation, from a series of measurements y(k). The mathematical BKF model is expressed in (1) and (2).

$${\mathbf{s}}(k) = {\mathbf{A}}(k - 1){\mathbf{s}}(k - 1) + {\mathbf{w}}(k)$$
(1)
$${\mathbf{y}}(k) = {\mathbf{H}}(k){\mathbf{s}}(k) + {\mathbf{r}}(k)$$
(2)

In (1) and (2), s(k) = [s1(k), s2(k), …, sN(k)]T is the N × 1 state vector, y(k) is an M × 1 measurement vector, r(k) is an M × 1 measurement noise vector with distribution N(0,\(\sigma_{r}^{2}\)), H(k) is an M × N measurement matrix, w(k) is an N × 1 white process noise vector with distribution N(0,\(\sigma_{w}^{2}\)), and A(k − 1) is an N × N transition matrix. The process and measurement noises w(k) and r(k) are assumed to be independent and normally distributed.

In this paper, the BKF is employed for MV and DV prediction. We assume that the estimation of the MVs and DVs of the lost MBs from their neighboring reference MBs follows a Markov model, so the BKF can manage their prediction for recovering the corrupted MBs. The BKF is an optimal recursive method that alternates between two steps, the time update and the measurement update, as depicted in Fig. 2 and explained in more detail in Sect. 3.5. The \({\hat{\mathbf{s}}}_{{\bar{k}}}\), \({\mathbf{p}}_{{\bar{k}}}\), \({\hat{\mathbf{s}}}_{k}\), \(\mathop {\mathbf{p}}\nolimits_{k}\), and \(\mathop K\nolimits_{k}\) are the a priori state estimate, the a priori error covariance, the a posteriori state estimate, the a posteriori error covariance, and the KF gain, respectively. The BKF estimates the filtered states s(k) from a set of noisy measurements y(k) [34]. It combines the state predicted from the previous step with the measurement at the current step to estimate the current state. Further discussion, details, equations, mathematical proofs, conditions, and assumptions for the BKF can be found in [33, 34].

Fig. 2 Bayesian Kalman Filter (BKF) prediction process
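As a concrete illustration of the two-step recursion of Fig. 2, the following minimal sketch (plain NumPy, our own illustrative code rather than any reference implementation) implements the prediction and update steps for the linear-Gaussian model of (1) and (2):

```python
import numpy as np

def kf_predict(s_post, P_post, A, Q):
    """Time update: a priori state estimate and error covariance."""
    s_prior = A @ s_post                        # predicted state
    P_prior = A @ P_post @ A.T + Q              # predicted error covariance
    return s_prior, P_prior

def kf_update(s_prior, P_prior, y, H, R):
    """Measurement update: Kalman gain and a posteriori estimates."""
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)  # KF gain
    s_post = s_prior + K @ (y - H @ s_prior)    # corrected state
    P_post = (np.eye(len(s_prior)) - K @ H) @ P_prior         # corrected covariance
    return s_post, P_post
```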

2.3 The basic EC algorithms

3DV communication over unreliable channels faces significant challenges due to unavoidable bit errors and packet losses that deteriorate the decoded 3DV quality. EC is therefore a promising tool for efficient 3DV transmission. It is an effective methodology that substitutes already decoded and recovered MBs of the video stream for the lost 3DV MBs, reducing or eliminating the effect of bit stream errors. The EC techniques used in 2D video systems to compensate for transmission errors [9,10,11] can, with particular modifications, be applied to corrupted 3DV frames. They are expected to be even more valuable for concealing transmission errors in 3DV systems, as they can exploit the inter-view correlations among different 3DV views. The Frame Temporal Replacement Algorithm (FTRA) is a straightforward temporal concealment algorithm that replaces the erroneous MBs with the co-located MBs of the correlated reference frame. The Decoder Motion Vector Estimation Algorithm (DMVEA) and the Outer Block Boundary Matching Algorithm (OBBMA) are more advanced temporal EC algorithms [9].

The OBBMA estimates the MVs by matching the outer boundary pixels of candidate replacement MBs against the outer boundary pixels of the corrupted MB. It utilizes only the external borders of the reference MB to select the most correlated neighboring MV, as shown in Fig. 3a. The DMVEA determines the corrupted MVs of the MBs by a full search in the corresponding reference 3DV frames. It is effective at finding the replacement MBs that minimize the error distortion, as shown in Fig. 3b. The DMVEA offers better EC performance than the OBBMA at approximately the same computational complexity. Further details of the OBBMA and DMVEA can be found in [9]. The Directional Interpolation Error Concealment Algorithm (DIECA) and the Weighted Pixel Averaging Interpolation (WPAI) algorithm are applied for inter-view and spatial EC [11]. The WPAI recovers the lost pixels by utilizing the vertical and horizontal pixels of the adjacent MBs, as indicated in Fig. 4a. The DIECA conceals the lost MBs by searching the neighboring blocks to estimate the object boundary direction, and the strongest object boundary orientation is selected to conceal the corrupted MBs, as shown in Fig. 4b. Further details of the DIECA and WPAI can be found in [13].

Fig. 3 a Outer block boundary matching, and b decoder motion vector estimation algorithms

Fig. 4 a Weighted pixel averaging interpolation, and b directional interpolation algorithms
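To make the boundary-matching idea of Fig. 3a concrete, the sketch below (illustrative NumPy, simplified to a one-pixel outer boundary and full-pixel candidate MVs; the function and argument names are our own assumptions) scores each candidate MV by the mismatch between the outer boundary of the lost MB and the corresponding boundary in the reference frame:

```python
import numpy as np

def obbma_select_mv(ref, cur, x, y, candidates, mb=16):
    """Pick the candidate MV whose displaced MB in `ref` best matches the
    one-pixel outer boundary around the lost MB at (x, y) in `cur`.
    `ref` and `cur` are float luminance frames; coordinates are assumed
    to keep every access inside the frame."""
    top    = cur[y - 1, x:x + mb]
    bottom = cur[y + mb, x:x + mb]
    left   = cur[y:y + mb, x - 1]
    right  = cur[y:y + mb, x + mb]

    def boundary_sad(mv):
        dx, dy = mv
        rx, ry = x + dx, y + dy
        return (np.abs(top    - ref[ry - 1, rx:rx + mb]).sum() +
                np.abs(bottom - ref[ry + mb, rx:rx + mb]).sum() +
                np.abs(left   - ref[ry:ry + mb, rx - 1]).sum() +
                np.abs(right  - ref[ry:ry + mb, rx + mb]).sum())

    return min(candidates, key=boundary_sad)   # smallest boundary mismatch wins
```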

3 Proposed multi-stage error control algorithms

In this section, the proposed dynamic color and depth error control algorithms are introduced. We present the proposed pre-processing ER algorithms at the encoder side, the proposed post-processing EC algorithms for the corrupted color frames, and the proposed depth-assisted EC algorithm for the lost depth frames at the decoder side. Furthermore, the proposed WOBMDC selection algorithm is explained. Finally, the proposed BKF enhancement algorithm is discussed. A full schematic diagram of the suggested adaptive combined color-plus-depth framework is given in Fig. 5 and is explained in more detail in the following subsections.

Fig. 5 General framework of the suggested 3DV transmission system

3.1 Proposed ER algorithms

The proposed ER procedures are employed to improve the concealment performance at the decoder so that the corrupted color and depth frames can be recovered adequately. They neither add redundant information to the transmitted bit streams nor increase the transmission bit rate. They adaptively exploit the available 3DV content information to protect the transmitted 3DV bit streams prior to transmission. Specifically, we use the SSC modes, EFMO mapping, and CAVLC entropy coding.

The main entropy encoders available for the 2D H.264/AVC system are Context Adaptive Binary Arithmetic Coding (CABAC) and CAVLC [31]. CAVLC is an efficient entropy coding scheme that provides low-delay characteristics for unprotected bit stream transmission over wireless channels. In this paper, CAVLC entropy coding is adopted for 3D H.264/MVC transmission due to its robust performance under heavy error conditions and its limited processing power requirements. With the CAVLC encoder, synchronization does not depend on previously received codewords, so CAVLC presents a minimal likelihood of codeword synchronization loss.

To obtain better 3DV quality and to limit the loss of bit stream consistency and the propagation of errors, the SSC tool is employed. We also deploy the EFMO technique to limit both spatial and temporal error propagation. In this work, the EFMO is used with four slice groups within each encoded frame, as shown in Fig. 6. The EFMO scatters the erroneous MBs across the frame rather than letting them accumulate at certain positions within the corrupted frame. It thus works as an interleaving method that exploits the successfully decoded surrounding MBs to facilitate the EC process. In the EFMO, consecutive MBs are assigned to different slice groups, so that the neighbours of any lost MB are likely to be received correctly. Losses are distributed over the entire video frame to avoid error aggregation at certain locations, using a mapping of 3DV MBs that is known to both the encoder and the decoder. Therefore, it enhances the performance of the proposed EC algorithms at the decoder.

Fig. 6 Proposed EFMO with four slice groups
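A minimal sketch of an explicit FMO map with four slice groups is given below (illustrative Python; the actual slice-group pattern is signalled in the bit stream, and the dispersed arrangement shown here is only one plausible mapping consistent with Fig. 6):

```python
def efmo_map(mb_cols, mb_rows, groups=4):
    """Assign each MB to one of `groups` slice groups in a dispersed,
    checkerboard-like pattern, so the neighbours of any lost MB tend to
    belong to other, independently decodable slices."""
    return [[(r + c) % groups for c in range(mb_cols)]
            for r in range(mb_rows)]

# A 1024 x 768 frame holds 64 x 48 MBs of 16 x 16 pixels.
slice_map = efmo_map(64, 48)
print(slice_map[0][:8])   # -> [0, 1, 2, 3, 0, 1, 2, 3]
```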

3.2 Proposed EC algorithms for color frames

The temporal OBBMA and the spatial WPAI algorithms were proposed for 2D video concealment [31, 32]. They can be exploited to recover the MVs and DVs of the lost 3DV frames. Unfortunately, they fail under heavily erroneous channel conditions, which degrades the EC performance [8,9,10,11]. Thus, in this work, an efficient temporal EC scheme named DTMCA and an effective spatial and inter-view EC algorithm named DIECA are proposed to reconstruct the MVs and DVs of the lost MBs in the 3DV color frames.

The proposed intra- and inter-frame color concealment scenarios are shown in Fig. 7 and explained in more detail in the pseudo-code steps of Algorithm (1). In the first view V0, the lost MBs of the first key I-frame are reconstructed using the spatially neighboring MB pixels of the frame itself. The lost MBs of the other B-frames within the V0 view are reconstructed from the preceding and following frames of the same view. In the even views V2, V4, and V6, the lost MBs of the first key P-frame are recovered from the reference inter-view left frames, while the lost MBs of the other B-frames are reconstructed from the preceding and following frames of the same view. In the odd views V1, V3, and V5, concealment employs the preceding and following frames of the same view in addition to the right and left frames of the neighboring views. Consequently, excluding the initial key frames, there are four concealment candidates for the corrupted MBs of inter-frames within the odd views, and two concealment candidates for the corrupted MBs of inter-frames within the even views.

Algorithm (1) Pseudo-code of the proposed color frame concealment scenarios

Fig. 7 Framework of the suggested 3DV color frames concealment schemes

The proposed temporal DTMCA divides the MV space of the lost MB into eight directional textural regions (R, T, L, B, TR, BR, TL, BL), as illustrated in Fig. 8, where R, B, T, L, BR, TL, TR, and BL denote the right, bottom, top, left, bottom-right, top-left, top-right, and bottom-left surrounding MBs, respectively. The DTMCA first checks the direction of the corrupted MB and its adjacent MBs against the co-located MBs in the reference frame. Then, it forms eight directional candidate MV groups from the corrupted MB and the reference MB that has the same position and moving textural direction as the faulty MB.

Fig. 8 The proposed temporal DTMCA scheme

The eight estimated candidate MV groups are as follows: Group T: {MVref, MVT, MV0}; Group B: {MVref, MVB, MV0}; Group R: {MVref, MVR, MV0}; Group L: {MVref, MVL, MV0}; Group TR: {MVref, MVTR, MVT, MVR, MV0}; Group TL: {MVref, MVTL, MVT, MVL, MV0}; Group BR: {MVref, MVBR, MVB, MVR, MV0}; Group BL: {MVref, MVBL, MVB, MVL, MV0}; and Group O: {MVref, MVT, MVTR, MVR, MVBR, MVB, MVBL, MVL, MVTL, MV0, MVavg}. The MVavg, given by (3), and MV0 denote the average MV and the zero MV, respectively; they account for MBs in slow-moving regions and background MBs. Group O is employed when the lost MB or its reference MB lies on the edge of a moving object.

$$MV_{avg} = (MV_{T} + MV_{B} + MV_{R} + MV_{L} )/4$$
(3)

The DTMCA selects the suitable candidate MV group from the above eight groups depending on the textural directions of both the adjoining MBs and the reference MB. For instance, if the directions of both MVref and MVT lie in the T region, then group T is chosen. All selection conditions of the DTMCA scheme are listed below, followed by a compact sketch of the selection logic:

  • Condition 1 If both textural directions of MVT and MVref are defined in T zone, then Group T is chosen.

  • Condition 2 If both textural directions of MVB and MVref are defined in B zone, then Group B is chosen.

  • Condition 3 If both textural directions of MVR and MVref are defined in R zone, then Group R is chosen.

  • Condition 4 If both textural directions of MVL and MVref are defined in L zone, then Group L is chosen.

  • Condition 5 If both textural directions of MVTR and MVref are defined in TR zone, then Group TR is chosen.

  • Condition 6 If both textural directions of MVTL and MVref are defined in TL zone, then Group TL is chosen.

  • Condition 7 If both textural directions of MVBR and MVref are defined in BR zone, then Group BR is chosen.

  • Condition 8 If both textural directions of MVBL and MVref are defined in BL zone, then Group BL is chosen.

  • Else Group O is chosen.
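The region test behind these conditions can be written compactly. The sketch below is illustrative Python (the `region()` helper that maps an MV angle to one of the eight zones is our own assumption, not part of the original algorithm description); it returns the candidate group label according to Conditions 1-8, falling back to group O:

```python
import math

ZONES = ["R", "TR", "T", "TL", "L", "BL", "B", "BR"]

def region(mv):
    """Map an MV (dx, dy) to one of the eight 45-degree zones of Fig. 8;
    the image y axis points downwards, so dy is negated."""
    angle = math.degrees(math.atan2(-mv[1], mv[0])) % 360.0
    return ZONES[int(((angle + 22.5) % 360.0) // 45.0)]

def dtmca_group(mv_ref, neighbour_mvs):
    """Apply Conditions 1-8: if the neighbour MV taken from the zone of
    MV_ref also lies in that zone, return that group label; otherwise
    return group O. `neighbour_mvs` maps zone labels ('T', 'TR', ...)
    to the corresponding neighbour MVs."""
    ref_zone = region(mv_ref)
    mv_nb = neighbour_mvs.get(ref_zone)
    if mv_nb is not None and region(mv_nb) == ref_zone:
        return ref_zone
    return "O"   # edge of a moving object: fall back to the full group

print(dtmca_group((4.0, -3.0), {"TR": (5.0, -4.0)}))   # -> TR
```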

The proposed spatial and inter-view DIECA is used to estimate the DVs of the erroneous MBs in the intra- and inter-frames. It conceals the lost MBs by searching the contiguous pixels and MBs to predict the object boundary orientation, and the strongest object boundary orientation is selected to conceal the corrupted MBs, as shown in Fig. 9. The DIECA estimates the DVs of the lost MBs of the I intra-frames in the V0 view from their surrounding reference pixels. It also estimates the DVs of the corrupted MBs of the P inter-frames in the even views from their reference left inter-view frames. In addition, it recovers the DVs of the corrupted MBs of the B-frames in the odd views from the related right and left frames of the neighboring views.

Fig. 9 The proposed spatial and inter-view DIECA scheme

3.3 Proposed EC algorithm for depth frames

In this section, the suggested EIDD-DAEC algorithm is introduced for concealing the depth intra- and inter-frames of the received 3DV streams. A full description of the suggested depth-assisted EC algorithm at the decoder is presented in the pseudo-code steps of Algorithm (2). The proposed EIDD-DAEC algorithm conceals the errors in the potentially corrupted 3DV intra- and inter-frames. It has the advantage of neither requiring alterations at the encoder side nor increasing the transmission bit rate. Moreover, it does not require the transmission of the depth maps from the encoder to the decoder.

Algorithm (2) Pseudo-code of the proposed EIDD-DAEC depth frame concealment

As stated in Sect. 2.1, every frame in the 3DV-PS can be predicted from temporally neighboring frames and/or from inter-view frames. Thus, the errors of each corrupted frame are concealed with different concealment scenarios depending on their locations in the 3DV-PS. The proposed EIDD-DAEC algorithm therefore adapts itself to the corruptly received MBs and the view type. It jointly exploits the available temporal, spatial, and inter-view correlations to reconstruct the corrupted MBs of the inter- and intra-coded frames. The transmitted 3DV bit stream may suffer errors in B-, P-, or I-frames of any view type (I-, P-, or B-view). For error concealment in the I intra-frames, the EC exploits the available spatial and temporal correlations. For P and B inter-frame error concealment, the EC is carried out in the inter-view and/or temporal directions. Hence, the proposed depth-assisted algorithm chooses among different EC scenarios depending on the error location to recover the corrupted MBs, as shown in the pseudo-code steps of Algorithm (2).

The MVs and DVs are calculated at the encoder and then transmitted to the decoder, where they can be used to estimate the depth maps of the received MBs from their reference MBs in the surrounding frames. For correctly received MBs, the depth values are computed from their correctly received MVs and DVs using (4) and (5).

$$\mathop d\nolimits_{MV} (m,n) = k \cdot \sqrt {MV(m,n)_{x}^{2} + MV(m,n)_{y}^{2} }$$
(4)
$$\mathop d\nolimits_{DV} (m,n) = k \cdot \sqrt {DV(m,n)_{x}^{2} + DV(m,n)_{y}^{2} }$$
(5)

Here, \(MV(m,n)\) and \(DV(m,n)\) are the motion and disparity vectors of the correctly received MBs at location (m, n), k is a calibration factor, and the subscripts x and y denote the horizontal and vertical components of the MB, respectively. The spatio-temporal and inter-view depth values of the corruptly received MBs can be determined from their reference MBs in the neighboring frames, as indicated in (6) and (7).

$$D_{MV} (m,n) = k \cdot \sqrt {\frac{{MV_{F - 1} (m,n)_{x}^{2} + MV_{F - 1} (m,n)_{y}^{2} + MV_{F + 1} (m,n)_{x}^{2} + MV_{F + 1} (m,n)_{y}^{2} }}{2}}$$
(6)
$$D_{DV} (m,n) = k \cdot \sqrt {\frac{{DV_{S - 1} (m,n)_{x}^{2} + DV_{S - 1} (m,n)_{y}^{2} + DV_{S + 1} (m,n)_{x}^{2} + DV_{S + 1} (m,n)_{y}^{2} }}{2}}$$
(7)

The subscripts F + 1, F − 1, S + 1, and S − 1 refer to the right, left, up, and down reference frames, respectively, as shown in Fig. 10. The motion and disparity depth values of the erroneously received MBs are computed from their collocated adjacent MBs. Thus, the DCP and MCP mechanisms are applied to the previously estimated depth maps obtained from (6) and (7) to estimate the depth DVs and depth MVs, respectively. The Sum of Absolute Differences (SAD) is used as the matching metric to collect the concealment candidate depth MVs, as given in (8) and (9), and the concealment candidate depth DVs, as given in (10) and (11).

$$SAD_{MV} (m,n) = \sum\limits_{i = - R}^{R - 1} {\sum\limits_{j = - R}^{R - 1} {\left| {D_{F} (m,n) - D_{F - 1} (m + i,n + j)} \right|} }$$
(8)
$$SAD_{MV} (m,n) = \sum\limits_{i = - R}^{R - 1} {\sum\limits_{j = - R}^{R - 1} {\left| {D_{F} (m,n) - D_{F + 1} (m + i,n + j)} \right|} }$$
(9)
$$SAD_{DV} (m,n) = \sum\limits_{i = - R}^{R - 1} {\sum\limits_{j = - R}^{R - 1} {\left| {D_{F} (m,n) - D_{S - 1} (m + i,n + j)} \right|} }$$
(10)
$$SAD_{DV} (m,n) = \sum\limits_{i = - R}^{R - 1} {\sum\limits_{j = - R}^{R - 1} {\left| {D_{F} (m,n) - D_{S + 1} (m + i,n + j)} \right|} }$$
(11)
Fig. 10 3DV frames prediction structure

Here, \(D_{i} (m,n)\) denotes the depth value at position \((m,n)\) of frame i, and R is the search range. In these formulas, we assume that the MBs within frame F are corrupted. Therefore, the value \(D_{F} (m,n)\) is estimated by (6) or (7). The \(D_{F - 1} (m,n)\) and \(D_{F + 1} (m,n)\) values are calculated by (4), while the \(D_{S - 1} (m,n)\) and \(D_{S + 1} (m,n)\) values are determined by (5).
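A hedged sketch of these computations is shown below (illustrative NumPy; the calibration factor `k`, the search range `R`, and the function names are assumptions). It covers the per-MB depth proxy of (4)-(5) and a literal reading of the SAD matching in (8)-(11):

```python
import numpy as np

def depth_from_vector(vx, vy, k=1.0):
    """Eqs. (4)-(5): depth proxy from the magnitude of an MV or DV."""
    return k * np.hypot(vx, vy)

def sad_depth(d_F_mn, D_ref, m, n, R=8):
    """Eqs. (8)-(11), taken literally: SAD between the estimated depth
    d_F_mn of the corrupted MB at (m, n) and the 2R x 2R window centred
    at (m, n) in a neighbouring reference depth map D_ref."""
    window = D_ref[m - R:m + R, n - R:n + R]
    return np.abs(d_F_mn - window).sum()

# Depth proxy of a correctly received MB with MV = (3, 4) and k = 1.
print(depth_from_vector(3.0, 4.0))   # -> 5.0
```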

After calculating the candidate depth MVs and DVs of the corruptly received MBs, we use both the correctly and the corruptly received MBs, as discussed previously and detailed in the steps of the proposed Algorithm (2). Moreover, to enhance the concealment efficiency, we also gather the co-located texture color candidate MVs and DVs of the lost MBs from their reference frames. The texture color MVs and DVs are those associated with the corrupted MBs and their collocated MBs in the reference frames. They are estimated with the detailed concealment steps shown in the pseudo-code of Algorithms (3), (4), and (5). Algorithm (3) recovers the color-assisted candidate DVs of the lost MBs of the I-frame within the I-view or of the P-frame within the P-view. Algorithm (4) restores the color-assisted candidate MVs of the lost MBs of the B-frames within the I-view or the P-view. Algorithm (5) determines the color-assisted candidate DVs and MVs of the lost MBs of the B-frame within the B-view.

As discussed previously, the depth MVs and DVs of the lost MBs are derived from the depth map values at the same locations as the texture color MVs and DVs, whose depth is estimated by (6) and (7). In addition, we include the depth MVs and DVs associated with the correctly received MBs, whose depth values are estimated by (4) and (5). At this stage, several texture color-assisted candidate MVs and DVs are available for concealment, in addition to the estimated depth-assisted candidates. Therefore, the DTMCA and DIECA are employed to select the optimum concealment MVs and DVs, respectively, from the whole set of available color-assisted and depth-assisted candidates, to conceal the lost depth and color MBs of the received 3DV streams efficiently.

Algorithm (3) Pseudo-code for recovering the color-assisted candidate DVs (I-frame in the I-view; P-frame in the P-view)
Algorithm (4) Pseudo-code for recovering the color-assisted candidate MVs (B-frames in the I- and P-views)
Algorithm (5) Pseudo-code for recovering the color-assisted candidate DVs and MVs (B-frame in the B-view)

3.4 Proposed WOBMDC selection algorithm

The appropriate color and depth EC scheme depends on the view and frame types of the lost MB. After the initial EC process, the WOBMDC is employed as a selection stage that picks the optimum MVs and DVs among the previously estimated color and depth candidates, improving the visual quality and accuracy. The WOBMDC refines the initially concealed MBs by predicting more accurate candidate MVs and DVs for the lost color and depth MBs. The Weighted Overlapping Block Motion Compensation (WOBMC) is employed to enhance the initially concealed MBs of the B-frames within the I-view or the P-view, whilst the Weighted Overlapping Block Disparity Compensation (WOBDC) is exploited to improve the initially reconstructed MBs of the P-frames within the P-view. The full WOBMDC is used to select the optimum MVs and DVs from the initially chosen candidates to conceal the color and depth MBs of the B-frames within the B-view. The WOBMC, WOBDC, and WOBMDC schemes thus suppress the blocking artifacts left by the primary concealment of the corruptly received color and depth MBs, retaining the spatial smoothness of the previously concealed MBs. They split the initially selected candidate MBs, estimated in the previous color or depth EC stages, into four 8 × 8 blocks, and then every pixel of each 8 × 8 block is concealed separately using the surrounding MB pixels and the predefined weighting matrices, according to (12) [35, 36].

$$p(i,j) = \left( {p_{e} (i,j){\mathbf{H}}_{e} (i,j) + p_{ab} (i,j){\mathbf{H}}_{ab} (i,j) + p_{lr} (i,j){\mathbf{H}}_{lr} (i,j) + 4} \right) \gg 3$$
(12)
$${\mathbf{H}}_{\text{e}} = \left[ {\begin{array}{*{20}c} 4&5&5&5&5&5&5&4 \\ 5&5&5&5&5&5&5&5 \\ 5&5&6&6&6&6&5&5 \\ 5&5&6&6&6&6&5&5 \\ 5&5&6&6&6&6&5&5 \\ 5&5&6&6&6&6&5&5 \\ 5&5&5&5&5&5&5&5 \\ 4&5&5&5&5&5&5&4 \end{array} } \right],\;{\mathbf{H}}_{\text{ab}} = \left[ {\begin{array}{*{20}c} 2&2&2&2&2&2&2&2 \\ 1&1&2&2&2&2&1&1 \\ 1&1&1&1&1&1&1&1 \\ 1&1&1&1&1&1&1&1 \\ 1&1&1&1&1&1&1&1 \\ 1&1&1&1&1&1&1&1 \\ 1&1&2&2&2&2&1&1 \\ 2&2&2&2&2&2&2&2 \end{array} } \right],\;{\mathbf{H}}_{\text{lr}} = \left[ {\begin{array}{*{20}c} 2&1&1&1&1&1&1&2 \\ 2&2&1&1&1&1&2&2 \\ 2&2&1&1&1&1&2&2 \\ 2&2&1&1&1&1&2&2 \\ 2&2&1&1&1&1&2&2 \\ 2&2&1&1&1&1&2&2 \\ 2&2&1&1&1&1&2&2 \\ 2&1&1&1&1&1&1&2 \end{array} } \right]$$
(13)

\(\forall \left( {i, j} \right) \in 8 \times 8\;{\text{block}}\;{\text{of}}\;{\text{the}}\;{\text{primary}}\;{\text{recovered}}\;{\text{MB}}\), pe is the best estimated pixel of the 8 × 8 block of the initially reconstructed MB, pab is the pixel predicted from the 8 × 8 blocks above and below the current MB, and plr is the pixel predicted from the 8 × 8 blocks to the left and right of the current MB. The He, Hab, and Hlr are their weighting matrices, given by (13), p(i, j) is the final reconstructed pixel of the corrupted MB, and "≫" is the bit-wise right-shift operation. Since the three weights sum to 8 at every position, the shift by three bits normalizes the weighted sum.
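A minimal sketch of the blending in (12) is given below (illustrative NumPy, our own code; the argument names are assumptions, and the weight matrices are those of (13)):

```python
import numpy as np

def wobmdc_blend(p_e, p_ab, p_lr, H_e, H_ab, H_lr):
    """Eq. (12): weighted overlap of three 8 x 8 pixel predictions.
    p_e  - block predicted by the initially selected MV/DV,
    p_ab - block predicted from the above/below neighbours' vectors,
    p_lr - block predicted from the left/right neighbours' vectors.
    The integer weight matrices of (13) sum to 8 at every position, so
    (weighted sum + 4) >> 3 performs a rounded division by 8."""
    acc = (p_e.astype(np.int32) * H_e +
           p_ab.astype(np.int32) * H_ab +
           p_lr.astype(np.int32) * H_lr + 4)
    return (acc >> 3).astype(np.uint8)
```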

3.5 Proposed BKF refinement algorithm

In this section, the proposed BKF enhancement algorithm is introduced. It refines the optimum color and depth MVs and DVs previously selected by the WOBMDC algorithm, suppressing the inherent measurement errors among the chosen candidate MVs and DVs. Thus, the BKF serves as a refinement stage for further improvement of the 3DV visual quality.

We assume that the desired predicted MV or DV is defined by the state vector s = [sn, sm], and that the previously recovered measurement MV or DV is given by y = [yn, ym], where m and n represent the height and width indices of the MB [34]. The desired state s(k) of the kth frame is assumed to evolve from the state at frame k − 1 as in (14). Likewise, the measurement y(k) of the state vector s(k) is given by (15). The state and measurement models for the MV or DV prediction are thus described by (14) and (15). In the proposed algorithm, we assume that the process noise w(k) and the observation noise r(k) follow zero-mean white Gaussian distributions with constant covariance \(\sigma^{2}_{w} = \sigma_{r}^{2} = 0.5I_{2}\) (I2 is the 2 × 2 identity matrix). We also adopt constant transition A(k) and measurement H(k) matrices equal to the 2 × 2 identity matrix, assuming a high correlation between horizontally and vertically neighboring MBs.

The BKF optimization process can be characterized by the prediction and updating steps as discussed in Sect. 2.2, and shown in Fig. 2. The detailed proposed BKF refinement mechanism can be summarized in the following three phases:

  • Phase (1): Initialization phase

    1. Define the previously measured noisy DTMCA-estimated MVs and/or DIECA-estimated DVs.

    2. Set the initial state and covariance vectors as in (16).

  • Phase (2): Prediction phase

    1. Calculate the a priori state estimate (MV or DV) \({\hat{\mathbf{s}}}_{{\bar{k}}}\) of frame k from the updated state \({\hat{\mathbf{s}}}_{k - 1}\) of frame k − 1, as in (17).

    2. Calculate the a priori error covariance \({\mathbf{p}}_{{\bar{k}}}\) from the updated covariance \({\mathbf{p}}_{k - 1}\) and the process noise covariance \(\sigma^{2}_{w}\), as in (18).

  • Phase (3): Updating phase

    1. Calculate the updated state estimate (the finally estimated MV or DV) \({\hat{\mathbf{s}}}_{k}\) from the measurement \({\mathbf{y}}_{k}\) (the recovered DTMCA-estimated MVs and/or DIECA-estimated DVs) and the observation noise covariance \(\sigma_{r}^{2}\), as in (19).

    2. Calculate the updated (filtered) error covariance \({\mathbf{p}}_{k}\) from the predicted covariance \({\mathbf{p}}_{{\bar{k}}}\) and the observation noise covariance \(\sigma_{r}^{2}\), as in (20), where the BKF gain \(K_{k}\) is given by (21).

Therefore, the above three phases are applied as a refinement process to each lost color and depth MB, indexed by its position in the corrupted frame and view. The proposed BKF steps are repeated for every corrupted MB until the final optimal concealment candidate MVs and/or DVs of (19) are obtained for all corrupted color and depth MBs. Thus, if the previously estimated MVs and DVs are unreliable for fully optimal EC, due to heavy errors and scarce surrounding boundary pixel information, they are enhanced by the best predicted MVs and DVs of the proposed BKF model to achieve high objective and subjective 3DV quality.

$$\left[ {\begin{array}{*{20}l} {\mathop s\nolimits_{{\mathop n\nolimits_{k} }} } \hfill \\ {\mathop s\nolimits_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} 1 \hfill & 0 \hfill \\ 0 \hfill & 1 \hfill \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {\mathop s\nolimits_{{\mathop n\nolimits_{k - 1} }} } \hfill \\ {\mathop s\nolimits_{{\mathop m\nolimits_{k - 1} }} } \hfill \\ \end{array} } \right] + \left[ {\begin{array}{*{20}l} {\mathop w\nolimits_{{\mathop n\nolimits_{k - 1} }} } \hfill \\ {\mathop w\nolimits_{{\mathop m\nolimits_{k - 1} }} } \hfill \\ \end{array} } \right];\quad \mathop w\nolimits_{{\mathop n\nolimits_{k - 1} }} ,\;\mathop w\nolimits_{{\mathop m\nolimits_{k - 1} }} \sim N\left( {0,\mathop \sigma \nolimits_{w}^{2} } \right)$$
(14)
$$\left[ {\begin{array}{*{20}l} {\mathop y\nolimits_{{\mathop n\nolimits_{k} }} } \hfill \\ {\mathop y\nolimits_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} 1 \hfill & 0 \hfill \\ 0 \hfill & 1 \hfill \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {\mathop s\nolimits_{{\mathop n\nolimits_{k} }} } \hfill \\ {\mathop s\nolimits_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right] + \left[ {\begin{array}{*{20}l} {\mathop r\nolimits_{{\mathop n\nolimits_{k} }} } \hfill \\ {\mathop r\nolimits_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right];\quad \mathop r\nolimits_{{\mathop n\nolimits_{k} }} ,\;\mathop r\nolimits_{{\mathop m\nolimits_{k} }} \sim N\left( {0,\mathop \sigma \nolimits_{r}^{2} } \right)$$
(15)
$$\left[ {\begin{array}{*{20}l} {\mathop s\nolimits_{{\mathop n\nolimits_{0} }} } \hfill \\ {\mathop s\nolimits_{{\mathop m\nolimits_{0} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} 0 \hfill \\ 0 \hfill \\ \end{array} } \right],\quad \left[ {\begin{array}{*{20}l} {\mathop p\nolimits_{{\mathop n\nolimits_{0} }} } \hfill \\ {\mathop p\nolimits_{{\mathop m\nolimits_{0} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} 0 \hfill \\ 0 \hfill \\ \end{array} } \right]$$
(16)
$$\left[ {\begin{array}{*{20}l} {\hat{s}_{{\mathop n\nolimits_{{\bar{k}}} }} } \hfill \\ {\hat{s}_{{\mathop m\nolimits_{{\bar{k}}} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} 1 \hfill & 0 \hfill \\ 0 \hfill & 1 \hfill \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {\mathop {\hat{s}}\nolimits_{{\mathop n\nolimits_{k - 1} }} } \hfill \\ {\mathop {\hat{s}}\nolimits_{{\mathop m\nolimits_{k - 1} }} } \hfill \\ \end{array} } \right]$$
(17)
$$\left[ {\begin{array}{*{20}l} {p_{{\mathop n\nolimits_{{\bar{k}}} }} } \hfill \\ {p_{{\mathop m\nolimits_{{\bar{k}}} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} 1 \hfill & 0 \hfill \\ 0 \hfill & 1 \hfill \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {\mathop p\nolimits_{{\mathop n\nolimits_{k - 1} }} } \\ {\mathop p\nolimits_{{\mathop m\nolimits_{k - 1} }} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} 1 \hfill & 0 \hfill \\ 0 \hfill & 1 \hfill \\ \end{array} } \right]^{T} + \mathop \sigma \nolimits_{w}^{2}$$
(18)
$$\left[ {\begin{array}{*{20}l} {\hat{s}_{{\mathop n\nolimits_{k} }} } \hfill \\ {\hat{s}_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} {\hat{s}_{{\mathop n\nolimits_{{\bar{k}}} }} } \hfill \\ {\hat{s}_{{\mathop m\nolimits_{{\bar{k}}} }} } \hfill \\ \end{array} } \right] + \left[ {\begin{array}{*{20}l} {\mathop k\nolimits_{{\mathop n\nolimits_{k} }} \cdot \left( {\mathop y\nolimits_{{\mathop n\nolimits_{k} }} - \hat{s}_{{\mathop n\nolimits_{{\bar{k}}} }} } \right)} \hfill \\ {\mathop k\nolimits_{{\mathop m\nolimits_{k} }} \cdot \left( {\mathop y\nolimits_{{\mathop m\nolimits_{k} }} - \hat{s}_{{\mathop m\nolimits_{{\bar{k}}} }} } \right)} \hfill \\ \end{array} } \right]$$
(19)
$$\left[ {\begin{array}{*{20}l} {\mathop p\nolimits_{{\mathop n\nolimits_{k} }} } \hfill \\ {\mathop p\nolimits_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} {(I - k_{{n_{k} }} ) \cdot p_{{\mathop n\nolimits_{{\bar{k}}} }} } \hfill \\ {(I - k_{{m_{k} }} ) \cdot p_{{\mathop m\nolimits_{{\bar{k}}} }} } \hfill \\ \end{array} } \right]$$
(20)
$$\left[ {\begin{array}{*{20}l} {\mathop k\nolimits_{{\mathop n\nolimits_{k} }} } \hfill \\ {\mathop k\nolimits_{{\mathop m\nolimits_{k} }} } \hfill \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} {p_{{\mathop n\nolimits_{{\bar{k}}} }} /\left( {p_{{\mathop n\nolimits_{{\bar{k}}} }} + \mathop \sigma \nolimits_{r}^{2} } \right)} \hfill \\ {p_{{\mathop m\nolimits_{{\bar{k}}} }} /\left( {p_{{\mathop m\nolimits_{{\bar{k}}} }} + \mathop \sigma \nolimits_{r}^{2} } \right)} \hfill \\ \end{array} } \right]$$
(21)
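Under the stated assumptions (A = H = I2 and \(\sigma^{2}_{w} = \sigma_{r}^{2} = 0.5\)), equations (17)-(21) decouple per vector component, so the refinement reduces to the scalar recursion sketched below (illustrative Python; the measurement sequence stands for the stream of DTMCA/DIECA-recovered vector components over successive frames):

```python
def bkf_refine(measurements, sigma_w2=0.5, sigma_r2=0.5):
    """Scalar-per-component BKF of Sect. 3.5 for one MV/DV component.
    `measurements` is the sequence y_k of recovered (noisy) values."""
    s, p = 0.0, 0.0                    # initialization, Eq. (16)
    refined = []
    for y in measurements:
        s_prior = s                    # Eq. (17): A is the identity
        p_prior = p + sigma_w2         # Eq. (18)
        K = p_prior / (p_prior + sigma_r2)   # Eq. (21): BKF gain
        s = s_prior + K * (y - s_prior)      # Eq. (19): updated state
        p = (1.0 - K) * p_prior              # Eq. (20): updated covariance
        refined.append(s)
    return refined

# Example: smooth a jittery sequence of horizontal MV components.
print(bkf_refine([3.0, 3.5, 9.0, 4.0, 3.8]))
```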

4 Simulation results and discussion

4.1 Performance evaluation

To assess the performance of the suggested multi-stage color and depth error control algorithms, we carried out several simulation tests on the standard well-known 3DV sequences (Kendo, Ballet, and Balloons) [37]. The resolution of the standard test sequences is 1024 × 768. The three tested 3DV sequences have different characteristics: Kendo is a moderately animated video, Ballet is a fast-moving video, and Balloons is a slow-moving video. For each 3DV sequence, eight views with 250 frames per view are encoded at a frame rate of 20 frames per second (fps). In the experiments, the first frame of each GOP is an I-frame and the GOP size is 8. The Quantization Parameter (QP) values used in the experiments are 27, 32, and 37. For each sequence, the encoded 3D H.264/MVC bit streams are produced by employing the proposed ER tools at the encoder and then transmitted over the wireless channel with Packet Loss Rates (PLRs) of 10, 20, 30, 40, and 50%. The received bit streams are subsequently decoded and concealed via the proposed multi-stage EC algorithms at the decoder. In the simulation work, the reference JMVC codec [38], which is based on the 2D video codec [31], is employed. The encoding parameters are selected according to the well-known JVT experiment settings [37]. The Peak Signal-to-Noise Ratio (PSNR) is used to measure the objective quality of the concealed and recovered MBs and thereby evaluate the efficiency of the suggested algorithms.
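For reference, the PSNR used throughout the evaluation is the standard 8-bit definition; a minimal sketch is given below (illustrative NumPy, not the measurement tool of the reference codec itself):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Standard PSNR (dB) between two equally sized 8-bit frames."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```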

To clarify the influence of the suggested error control algorithms on recovering the corruptly received color and depth MBs, we compare their performance with the case of utilizing the ER and color-plus-depth EC techniques without the WOBMDC and BKF schemes (ER/Color-based EC/Depth-based EC/No-WOBMDC/No-BKF), with the case of utilizing the ER, color-plus-depth EC, and WOBMDC algorithms without the BKF algorithm (ER/Color-based EC/Depth-based EC/WOBMDC/No-BKF), and with the case where no error control methods are applied (No-ER/No Color-based EC/No Depth-based EC/No-WOBMDC/No-BKF). In the presented results, ER/Color-based EC/Depth-based EC/WOBMDC/BKF refers to the full suggested multi-stage error control algorithms. Figures 11, 12 and 13 compare the color-plus-depth subjective visual results at QPs of 37, 32, and 27 and severe PLRs of 30, 40, and 50% for the color and depth intra- and inter-frames of the three selected 3DV sequences (Kendo, Ballet, and Balloons). For each sequence, we choose different erroneous frame positions and types (I-, B-, P-frame) at different view locations (I-, B-, P-view) to show the effectiveness of the proposed schemes in recovering heavily lost color and depth MBs in any frame and view for various 3DV characteristics at different PLRs.

Fig. 11 Visual subjective results of the chosen color and depth 41st intra- I frame in I view (V0) of the "Kendo" sequence: a error-free original I41 color frame, b erroneous I41 color frame with PLR = 30%, QP = 37, c concealed I41 color frame with ER/Color-assisted EC/Depth-assisted EC/No-WOBMDC/No-BKF, d concealed I41 color frame with ER/Color-assisted EC/Depth-assisted EC/WOBMDC/No-BKF, e concealed I41 color frame with ER/Color-assisted EC/Depth-assisted EC/WOBMDC/BKF, f error-free original I41 depth frame, g erroneous I41 depth frame with PLR = 30%, QP = 37, h concealed I41 depth frame with ER/Depth-assisted EC/Color-assisted EC/No-WOBMDC/No-BKF, i concealed I41 depth frame with ER/Depth-assisted EC/Color-assisted EC/WOBMDC/No-BKF, j concealed I41 depth frame with ER/Depth-assisted EC/Color-assisted EC/WOBMDC/BKF

Fig. 12 Visual subjective results of the chosen color and depth 81st inter- B frame in B view (V1) of the "Ballet" sequence: a error-free original B81 color frame, b erroneous B81 color frame with PLR = 40%, QP = 32, c concealed B81 color frame with ER/Color-assisted EC/Depth-assisted EC/No-WOBMDC/No-BKF, d concealed B81 color frame with ER/Color-assisted EC/Depth-assisted EC/WOBMDC/No-BKF, e concealed B81 color frame with ER/Color-assisted EC/Depth-assisted EC/WOBMDC/BKF, f error-free original B81 depth frame, g erroneous B81 depth frame with PLR = 40%, QP = 32, h concealed B81 depth frame with ER/Depth-assisted EC/Color-assisted EC/No-WOBMDC/No-BKF, i concealed B81 depth frame with ER/Depth-assisted EC/Color-assisted EC/WOBMDC/No-BKF, j concealed B81 depth frame with ER/Depth-assisted EC/Color-assisted EC/WOBMDC/BKF

Fig. 13 Visual subjective results of the chosen color and depth 57th inter- P frame in P view (V2) of the "Balloons" sequence: a error-free original P57 color frame, b erroneous P57 color frame with PLR = 50%, QP = 27, c concealed P57 color frame with ER/Color-assisted EC/Depth-assisted EC/No-WOBMDC/No-BKF, d concealed P57 color frame with ER/Color-assisted EC/Depth-assisted EC/WOBMDC/No-BKF, e concealed P57 color frame with ER/Color-assisted EC/Depth-assisted EC/WOBMDC/BKF, f error-free original P57 depth frame, g erroneous P57 depth frame with PLR = 50%, QP = 27, h concealed P57 depth frame with ER/Depth-assisted EC/Color-assisted EC/No-WOBMDC/No-BKF, i concealed P57 depth frame with ER/Depth-assisted EC/Color-assisted EC/WOBMDC/No-BKF, j concealed P57 depth frame with ER/Depth-assisted EC/Color-assisted EC/WOBMDC/BKF

Figure 11 introduces the color and depth subjective visual results of the Kendo sequence at channel PLR = 30% and QP = 37 for the chosen 41st intra-coded color and depth I-frame of the I-view V0. The tested 41st color and depth intra-frames are recovered with the suggested ER and color-plus-depth EC schemes without the WOBMDC and BKF stages in Fig. 11c, h, with the ER, color-plus-depth EC, and WOBMDC algorithms without the BKF in Fig. 11d, i, and with the full proposed ER/Color-based EC/Depth-based EC/WOBMDC/BKF schemes in Fig. 11e, j. We observe that the subjective results of the proposed algorithms for the color and depth frames, given in Fig. 11e, j, outperform those of the other schemes, shown in Fig. 11c, d, h, i.

Figure 12 shows the subjective visual results of the color and depth frames of the Ballet sequence at channel PLR = 40% and QP = 32. We have chosen the 81st inter-coded frames as test color and depth B-frames within the B-view V1. Comparing the concealed color and depth frames in Fig. 12e, j with those in Fig. 12c, d, h, i reveals the superiority of the suggested multi-stage schemes over the other error control algorithms for recovering the corrupted color-plus-depth MBs under heavily erroneous channel conditions such as PLR = 40%. Figure 13 introduces the subjective visual results of the color and depth frames of the Balloons stream at channel PLR = 50% and QP = 27 for the chosen 57th inter-coded color and depth P-frames of the P-view V2. From the color frame results in Fig. 13e compared to those in Fig. 13c, d, and from the depth frame results in Fig. 13j compared to those in Fig. 13h, i, we observe that the proposed ER/Color-based EC/Depth-based EC/WOBMDC/BKF schemes outperform the ER/Color-based EC/Depth-based EC/No-WOBMDC/No-BKF and ER/Color-based EC/Depth-based EC/WOBMDC/No-BKF configurations. The benefit of the full proposed multi-stage error control algorithms for concealing the lost color-plus-depth MBs is thus clear, especially under heavily erroneous channel conditions such as PLR = 50%.

In Figs. 14, 15 and Table 1, we compare the objective PSNR values for the tested 3DV sequences at different PLRs and QP values of 27, 32, and 37. We compare the PSNR results of the suggested error control schemes (ER/Color-based EC/Depth-based EC/WOBMDC/BKF) with the ER/Color-based EC/Depth-based EC/No-WOBMDC/No-BKF case, the ER/Color-based EC/Depth-based EC/WOBMDC/No-BKF case, and the case where no error control methods are applied (No-ER/No-Color-based EC/No-Depth-based EC/No-WOBMDC/No-BKF). For all tested 3DV sequences, the suggested ER/Color-based EC/Depth-based EC/WOBMDC/BKF schemes always achieve superior PSNR values. The proposed multi-stage algorithms achieve meaningful average PSNR gains of about 0.45–0.80 dB and 1.15–2.10 dB at the different QPs and PLRs over the ER/Color-based EC/Depth-based EC/WOBMDC/No-BKF case and the ER/Color-based EC/Depth-based EC/No-WOBMDC/No-BKF case, respectively. We also notice that the proposed algorithms are crucial, as they provide about 12.55–23.75 dB average PSNR gain at the different QPs and PLRs compared to the case of applying no error control algorithms (No-ER/No-Color-based EC/No-Depth-based EC/No-WOBMDC/No-BKF).

Fig. 14 PSNR (dB) objective results of the three tested a Kendo, b Ballet, and c Balloons 3DV streams at different PLRs and QP = 27

Fig. 15 PSNR (dB) objective results of the three tested a Kendo, b Ballet, and c Balloons 3DV streams at different PLRs and QP = 37

Table 1 PSNR (dB) objective results of the three tested Kendo, Ballet, and Balloons videos at various PLRs and QP = 32

All the presented objective and subjective results indicate that the proposed multi-stage error control schemes are especially recommended under heavily erroneous channel conditions, where they give the best objective and visual results among the compared algorithms. The results demonstrate the significance of the proposed WOBMDC selection and BKF refinement algorithms in reinforcing the subjective video quality, and they also achieve a significant objective PSNR gain. Furthermore, the suggested algorithms provide efficient experimental results for 3DV streams with different characteristics, and they can recover any erroneous frame type at any view position.

4.2 Comparative analysis and discussions

To verify the effectiveness of the proposed multi-stage error control algorithms for reliable 3DV transmission over heavy-packet-loss channels, various experiments have been carried out to compare the results of the proposed algorithms with those of the state-of-the-art algorithms using the 3D H.264/MVC reference software. We compared the performance of the proposed algorithms with those of the FTRA [9], Disparity Vector Extrapolation (DVE) [15], Hybrid Motion Vector Extrapolation (HMVE) [39], a hybrid algorithm of WPAI and OBBMA [8], joint DIECA and DMVEA [10], a hybrid DIECA and DMVEA EC method employing the Dispersed Flexible Macro-block Ordering (DFMO) error resilience technique [11], and the four-directional MV/DV extrapolation algorithm for whole-frame loss concealment [14]. None of these algorithms employs the ER tools, depth-assisted EC, WOBMDC, or BKF stages. In the proposed algorithms, by contrast, we exploit color-plus-depth-assisted MV and DV concealment in addition to the ER tools, the WOBMDC selection algorithm, and the BKF refinement algorithm.

We ran the comparative simulation experiments on the Ballet, Kendo, and Balloons standardized test sequences with the same simulation parameters introduced in Sect. 4.1 at PLR = 40%. All comparison tests were performed on an Intel® Core™ i7-4500U CPU @ 1.80–2.40 GHz with 8 GB RAM running the Windows 10 64-bit operating system. The PSNR and average per-frame execution time results are presented to validate the performance of the suggested algorithms over a variety of experimental configurations. Table 2 shows the average PSNR and average frame execution time of the proposed algorithms compared to the FTRA, DVE, and HMVE algorithms and [8, 10, 11, 14] at QP = 27, 32, and 37. The proposed algorithms outperform the state-of-the-art methods in all experimental configurations. With the proposed error control algorithms, the recovery quality is further enhanced by exploiting the depth-assisted concealment in addition to the color-assisted concealment, the proposed ER tools, the WOBMDC algorithm, and the BKF algorithm. The proposed algorithms incur only a small additional computational cost compared to the literature algorithms, so their execution times may be acceptable for online and real-time 3DV communication applications.

Table 2 The objective average PSNR (dB) and the average execution time comparison between the proposed algorithms and the state-of-the-art EC works on the tested Ballet, Kendo, and Balloons 3DV sequences at different QPs and PLR = 40%

We also verified the achievements of the proposed error control algorithms against the state-of-the-art algorithms that exploit depth-aided concealment [24, 29, 30]. These algorithms exploit depth-assisted concealment alongside color-assisted concealment, but they extract and encode the depth map data at the encoder and transmit it through the network. They therefore increase the transmission bit rate and the computational complexity of the 3DV communication system, and are thus incompatible with the resource constraints of future wireless communication networks. We ran the comparative simulation experiments on the Kendo and Balloons 3DV test sequences at QP = 27 and transmission PLR = 20%. Table 3 shows the average PSNR comparison between the proposed algorithms and the state-of-the-art depth-based EC algorithms [24, 29, 30]. The proposed algorithms outperform the state-of-the-art methods and achieve further objective quality improvements by exploiting the color-plus-depth concealment in addition to the proposed ER tools, the WOBMDC selection algorithm, and the BKF refinement algorithm. The proposed algorithms also have the advantage of extrapolating the depth map values at the decoder rather than at the encoder. Moreover, they incur only a small increase in execution time compared to the related works, and thus remain acceptable and compatible with real-time 3DV transmission applications.

Table 3 The objective average PSNR (dB) and average execution time comparison between the proposed algorithms and the state-of-the-art depth-assisted EC methods on the tested Kendo and Balloons 3DV sequences at PLR = 20% and QP = 27

5 Conclusions

This paper proposed efficient multi-stage error control algorithms for 3D video transmission over severely erroneous wireless channels to recover the lost color and depth MBs of the encoded intra- and inter-frames. The proposed algorithms achieve good objective and subjective quality while keeping the transmission bit rate unchanged and without increasing the latency or requiring complicated modifications at the encoder side. The proposed color-plus-depth EC algorithms adaptively select the appropriate EC mode based on the extent of the corrupted color or depth MBs and the affected frame and view types. They are particularly reliable under heavy and severe channel loss conditions, where they outperform the state-of-the-art error control techniques. The proposed ER tools neither add redundant data nor increase the transmission bit rate. The proposed WOBMDC and BKF algorithms suppress the blocking artifacts of the erroneous color-plus-depth MBs and thus retain the spatial smoothness of the concealed MBs. Experimental simulation results demonstrate the encouraging performance of the proposed algorithms in concealing heavily corrupted color and depth 3DV sequences at high PLRs. The proposed algorithms have proved their capability to adequately conceal various standard 3DV streams with different temporal and spatial characteristics at high 3DV quality. Our simulation results confirm the value of the proposed algorithms in reinforcing the 3DV subjective quality while also achieving a considerable objective PSNR gain compared to the state-of-the-art algorithms.