1 Introduction

In recent years, advances in computing, networking and communications have made immersive systems popular. Tele-immersion is widely regarded as the next generation of communications. Immersive audio/visual systems provide end users with a level of multimodal multimedia intercommunication that cannot be achieved by conventional 2D systems.

Research projects such as blue-c [10] at ETH Zurich and TEEVE (Tele-immersive Environments for Everybody) [17] at the University of Illinois and the University of California at Berkeley have provided platforms for the development of real-time immersive systems. These projects aim to provide users with an immersive environment using existing three-dimensional (3D) capture, transmission and visualization tools. Many of the significant challenges associated with the transmission of 3D video have been overcome by advances in video compression (MPEG, H.264/AVC) and in transmission technologies (4G LTE).

The proliferation of immersive audio/visual applications calls for the secure transmission of multimedia content, especially over wireless networks. One of the most challenging security issues is to guarantee the confidentiality of media content against a malicious intruder (eavesdropper). The wireless medium is easy to eavesdrop on because of its broadcast nature. A commonly used model to characterize physical-layer security schemes was presented in [16]. Specifically, a transmitter (called Alice) wants to send a secret message to the legitimate receiver (called Bob) without the message being overheard by the eavesdropper (called Eve). The eavesdropper is assumed to be passive, and thus it is hard to prevent the eavesdropper from accessing the wireless network.

The secure transmission of wireless multimedia content mainly relies on cryptographic primitives applied at upper layers such as the application, transport and network layers, while physical layer security [18] is often overlooked. Figure 1 shows a typical example of multimedia communication over a wireless channel, where the multimedia content is transmitted from a transmitter to a receiver in the presence of a passive eavesdropper. Secrecy in typical wireless networks is usually achieved by key-based enciphering techniques at upper protocol layers. In particular, a secret key is shared between the transmitter and the legitimate receiver to encrypt and decrypt the data. It is widely assumed to be computationally infeasible for the eavesdropper to decipher the data without knowledge of the secret key. However, encryption techniques such as ciphers may become insufficient as computational power continues to grow. The main idea behind this paper is to limit the amount of information that an unauthorized user can extract by means of physical-layer techniques rather than encryption techniques. Our goal is to decrease the information leaked to the eavesdropper at the bit level by means of physical-layer security, with no limitations assumed on the channel quality or computational resources of the unauthorized receiver. We present a novel noise aggregation method for the secure transmission of data to the legitimate user over the wireless channel. The proposed method is able to enhance secrecy even when the eavesdropper experiences the same channel conditions as the intended receiver.

Fig. 1 Multimedia transmission over wireless channel

The rest of the paper is organized as follows. In Section 2, existing approaches for secure video communication are reviewed. In Section 3, we describe the system model for the noise aggregation method. In Section 4, we propose the noise aggregation method, conduct a probabilistic analysis of the proposed scheme, and apply it to immersive systems. In Section 5, we present the simulation results for our noise aggregation scheme. The paper concludes with Section 6.

2 Related work

For a long time, the major focus in the field of multimedia communication over wireless channels was the optimization of video compression algorithms. Researchers mainly aimed to meet video Quality of Service (QoS) requirements and to cope with copyright violations. Secure video communication over the wireless channel relied largely on cipher-based encryption techniques at the network or higher layers. Voloshynovskiy et al. [14] presented a framework for multimedia security and secure communication based on data hiding methods. They proposed visual scrambling and steganography methods for secure multimedia communication. These are key-based enciphering methods for video communication that are resilient to errors on the wireless channel. However, an eavesdropper with high computational power can defeat these methods.

Liang Zhou et al. [18] presented a cross-layer architecture for secure multimedia communication that also copes with copyright violations. Application layer technologies such as watermarking and authentication [6] are used to cope with copyright violations and to verify the integrity of the multimedia content and its source. Authentication methods help the receiver verify that the multimedia content was transmitted by the legitimate transmitter, while Secrecy Capacity (SC) techniques [12] at the physical layer provide secrecy against the passive eavesdropper. The authors investigated the existing application layer and physical layer technologies and presented a joint application and physical layer security mechanism. The fundamental principle behind physical layer security is to limit the amount of information that can be extracted by an unauthorized user at the bit level [1]. In the past, ciphers were considered unbreakable without knowledge of the secret key. However, with the relentless growth of computational power, ciphers are continually being broken [12].

The initial theoretical work by Shannon [11] and Wyner [16] on secrecy capacity (SC) has attracted a lot of attention in the field of information security. Shannon describes SC as the maximum achievable transmission rate when legitimate users suffer from eavesdropping by unauthorized users. He showed that a perfect cipher, also known as the one-time pad proposed by Vernam [13], can achieve perfect secrecy if the secret key is as long as the plaintext message. However, such a cipher is impractical to implement in real time. Hence, based on Shannon’s result, all ciphers presently in use can theoretically be broken.

Wyner [16] introduced the concept of the wiretap channel to achieve virtually perfect secrecy. The wiretap channel (the eavesdropper’s channel) is assumed to be noisier than the main channel (the legitimate receiver’s channel). Under this condition it is possible to achieve perfect secrecy without relying on ciphers. However, in wireless communication systems it is impossible to ensure that the eavesdropper’s channel is always noisier than the legitimate receiver’s channel.

During the 1970s and 1980s, research in the field of physical layer security was limited, because the strict secrecy capacity defined by Shannon and the classical wiretap channel of Wyner require the legitimate users to have an advantage over the eavesdropper. However, the work of Maurer [9] revived interest in the field of physical layer security. He describes the secrecy capacity as the maximum rate at which a transmitter can send information to the intended receiver in the presence of an eavesdropper, as a function of the binary entropy of the channel error probabilities.

Baldi et al. [2] presented a physical layer security scheme based on scrambled codes and the ARQ protocol. They proposed non-systematic channel codes (based on scrambling) over the AWGN wiretap channel. The legitimate user also has access to ARQ, whereas the eavesdropper does not. Secrecy is achieved only by assuming a limitation on the eavesdropper’s channel. Hence this scheme cannot work when Eve has better channel quality than Bob.

Harrison and Boyce presented a physical layer security method based on linear block codes for the binary erasure wiretap channel (BEC) [7, 8]. The idea behind the method is to intentionally erase certain bits from the coded message signal in order to weaken the corrective capability of the code at the eavesdropper’s end. The main channel between the legitimate users Alice and Bob is assumed to be error free whereas the wiretap channel is assumed to be a BEC; hence further erasures in the transmitted bits limit the information leaked to Eve. The method was tested against the maximum likelihood and maximum passing strategies that the eavesdropper can use to correct the erased bits and extract information. However, this method is also impractical because it assumes a noiseless main channel.

In [5], Bloch et al. presented a physical layer security method based on secret key agreement between the legitimate users. The key idea is to exploit the randomness of the fading channel to generate a secret key between the legitimate users Alice and Bob. No limitation on the eavesdropper’s channel is assumed; however, it is presumed that the transmitter has complete knowledge of the channel gains of both the legitimate receiver and the eavesdropper. The authors also extended their work to the imperfect CSI case in [3].

3 System model

The system model for this paper is shown in Fig. 2. It is similar to the Wyner wiretap channel model [16], except that in our model the main channel is not noiseless and a noiseless feedback channel is added for the authenticated receiver. The transmitter is required to achieve reliable and secure communication with the legitimate receiver in the presence of a passive eavesdropper. Alice sends video packets to Bob over the main channel, while Eve receives the same data over the wiretap channel. The main and wiretap channels are assumed to be independent of each other, which implies that the passive eavesdropper Eve may enjoy better channel quality than the legitimate user Bob. In other words, the proposed method is not limited to degraded scenarios. Fig. 2 also shows that Bob can request the retransmission of lost packets whereas Eve cannot; hence Eve can obtain a missing packet only if Bob requests its retransmission. For example, if a packet is lost by Eve but received successfully by Bob, Bob will not request the retransmission of that packet.

Fig. 2 System Model with feedback channel assuming the same main and wiretap wireless channel

Figure 2 shows that Alice wants to transmit a private message X to the desired receiver Bob. She encodes the message X using the noise aggregation method and sends the encoded message M over the wireless channel. The desired receiver Bob and the passive eavesdropper Eve receive the encoded message as Y and Z respectively. Assuming that the passive eavesdropper already knows the encoding algorithm, both Bob and Eve decode the received messages using the noise aggregation method and obtain the decoded messages \( \tilde{X} \) and \( \widehat{X} \) respectively. The security encoding algorithm is discussed in detail in Section 4.

4 Noise aggregation approach for security enhancement in immersive systems

4.1 Principles of noise aggregation for security enhancement

Let’s assume Alice has a collection of message packets \( X = X_1, X_2, \dots, X_N \) to be transmitted. The security encoder at the transmitter performs a bitwise exclusive-or (XOR) of each even-indexed packet with the preceding odd-indexed packet, while odd-indexed packets are transmitted without encoding. To decode the message correctly, the receiver must receive all odd-indexed packets correctly; the loss of an odd-indexed packet therefore also causes the loss of the following even-indexed packet. The input and output of the security encoder at the transmitter are denoted X and M, as shown in Fig. 3, where

$$ {M}_i={X}_i \quad \text{for } i=1,3,5,\dots ,2n-1 $$
(1)
$$ {M}_i={X}_i\oplus {X}_{i-1} \quad \text{for } i=2,4,6,\dots ,2n $$
(2)

Fig. 3 Security Enhancement module at Source

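Let’s assume the message packets received by Bob and Eve after corruption by the channel are \( Y = Y_1, Y_2, \dots, Y_N \) and \( Z = Z_1, Z_2, \dots, Z_N \) respectively, as shown in Fig. 2, where \( W_i \) and \( U_i \) denote the error patterns introduced by the main and wiretap channels: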

$$ {Y}_i={M}_i\oplus {W}_i $$
(3)
$$ {Z}_i={M}_i\oplus {U}_i $$
(4)

When a receiver decodes the received packets, the noise of the previous packet is superimposed on the next packet; hence this technique is called the noise aggregation method. The outputs of the security decoder at Bob and Eve are illustrated in Figs. 4 and 5 respectively. The following observations can be made based on the received packets.

Fig. 4 Example of Decoding Module at Bob

Fig. 5 Example of Decoding Module at Eve

  • If Bob receives packet \( {\tilde{X}}_1 \) correctly, he is able to decode the next security-enhanced packet \( Y_2 \) and extract the required information \( {\tilde{X}}_2 \). If Eve fails to receive packet \( {\widehat{X}}_1 \), however, Eve is unable to decode \( Z_2 \). The noise of the previous packet is thus superimposed on the next packet and degrades the quality of service at Eve’s end.

  • If Eve receives packet \( {\widehat{X}}_1 \) correctly, Eve uses it to decode \( Z_2 \) and extract \( {\widehat{X}}_2 \). If Bob does not receive packet \( {\tilde{X}}_1 \) correctly, he requests a retransmission via ARQ, and Alice resends the packet until Bob receives it correctly. Bob is then able to decode \( Y_2 \) and extract the required information \( {\tilde{X}}_2 \).

  • Retransmissions are made only when requested by Bob and hence do not leak more information to Eve.

The ith packet (for even i) obtained by Bob and Eve at the output of the security decoder is given by Eqs. (5) and (6) respectively.

$$ {\tilde{X}}_i={X}_i\oplus {W}_i \oplus {W}_{i-1} $$
(5)
$$ {\widehat{X}}_i={X}_i\oplus {U}_i\oplus {U}_{i-1} $$
(6)

It can be observed that \( W_{i-1} \) and \( U_{i-1} \) appear as extra noise terms, i.e. the noise of the previous packet is aggregated into the next packet. The probabilistic model for the security capacity of this technique is presented in Section 4.2.
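As an illustration of Eqs. (1) to (5), the following minimal Python sketch encodes a short packet sequence, applies a hypothetical error pattern and decodes the result. The function names (`xor_bytes`, `encode`, `decode`), the packet size and the error pattern are assumptions made for this example and are not part of the scheme description above.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets):
    """Eqs. (1)-(2): odd-indexed packets (1-based) pass through unchanged;
    each even-indexed packet is XORed with the packet just before it."""
    out = []
    for k, pkt in enumerate(packets):  # k is 0-based, so k + 1 is the index i
        if (k + 1) % 2 == 1:
            out.append(pkt)
        else:
            out.append(xor_bytes(pkt, packets[k - 1]))
    return out


def decode(received):
    """The decoder applies the same XOR rule to the received packets, so any
    error in a received odd packet is carried into the next even packet."""
    return encode(received)


# Hypothetical example: four 8-byte packets, one bit error in packet 1 only.
rng = random.Random(0)
size = 8
X = [bytes(rng.randrange(256) for _ in range(size)) for _ in range(4)]
W = [bytes(size) for _ in range(4)]        # all-zero error patterns ...
W[0] = bytes([0x01]) + bytes(size - 1)     # ... except one flipped bit in W_1

M = encode(X)
Y = [xor_bytes(m, w) for m, w in zip(M, W)]  # Eq. (3)
X_tilde = decode(Y)

for i, (x, xt) in enumerate(zip(X, X_tilde), start=1):
    print(f"packet {i}: {'ok' if x == xt else 'corrupted'}")
```

In this sketch the single bit error in the first (odd) packet reappears in the decoded second (even) packet, which is the aggregation effect expressed by Eq. (5), while packets 3 and 4 are unaffected.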

4.2 Analyses for noise aggregation

According to the principle of the noise aggregation method presented in Section 4.1, Eve can decode the message correctly if and only if Eve receives every packet correctly within the retransmissions of that packet between Alice and Bob. Otherwise Eve cannot decode the message encoded by the noise aggregation method. The loss of one packet leads to the loss of the next packet, as it serves as a key to decode the next packet. To derive an exact expression for the error probability of Bob and Eve, consider the successive retransmissions that Bob requests for a single packet transmitted by Alice. Let α and β be the error probabilities of Bob’s and Eve’s channels respectively. Let n be the total number of transmissions of a packet, i.e. the initial transmission from Alice plus the retransmissions requested by Bob; n is a random variable that depends on the quality of Bob’s channel. Assuming that each transmission of a packet is independent of the previous ones, the probability that Bob first receives the packet without error on the nth transmission follows from independent and identically distributed Bernoulli trials [7] as

$$ \Pr \left({B}_c\right)=\left(1-\alpha \right){\alpha}^{n-1} $$
(7)

The probability that Eve receives the packet correctly at least once in n independent transmissions is \( 1-\beta^n \). The probability that Eve obtains the correct packet within the total number of transmissions needed before Bob gets the correct packet then follows from the total probability theorem [4] as

$$ \begin{array}{rcl} \Pr \left({E}_C\right) &=& {\displaystyle \sum_{n=1}^{\infty }} \Pr \left({E}_C\mid {B}_c\right) \Pr \left({B}_c\right) \\ &=& {\displaystyle \sum_{n=1}^{\infty }}\left(1-{\beta}^n\right)\left(1-\alpha \right){\alpha}^{n-1} \\ &=& \dfrac{1-\beta }{1-\alpha \beta} \end{array} $$
(8)
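The closed form in Eq. (8) can be checked numerically. The sketch below is a hedged illustration, not the simulation used in this paper: it compares the closed form with a truncated version of the sum and with a Monte Carlo simulation of the ARQ process, assuming that every (re)transmission is received in error independently with probability α by Bob and β by Eve. The function names, the seed and the example values α = 0.3 and β = 0.2 are assumptions chosen for the example.

```python
import random


def eve_success_closed_form(alpha: float, beta: float) -> float:
    """Last line of Eq. (8)."""
    return (1 - beta) / (1 - alpha * beta)


def eve_success_partial_sum(alpha: float, beta: float, terms: int = 200) -> float:
    """Second line of Eq. (8), truncated after `terms` transmissions."""
    return sum(
        (1 - beta ** n) * (1 - alpha) * alpha ** (n - 1)
        for n in range(1, terms + 1)
    )


def eve_success_simulated(alpha: float, beta: float,
                          trials: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate: Alice repeats a packet until Bob receives it;
    Eve succeeds if at least one of those transmissions reaches her intact."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        eve_has_packet = False
        while True:
            if rng.random() >= beta:    # Eve overhears this transmission correctly
                eve_has_packet = True
            if rng.random() >= alpha:   # Bob receives it correctly, ARQ stops
                break
        hits += eve_has_packet
    return hits / trials


alpha, beta = 0.3, 0.2  # illustrative channel error probabilities
print(eve_success_closed_form(alpha, beta))
print(eve_success_partial_sum(alpha, beta))
print(eve_success_simulated(alpha, beta))
```

The three values should agree up to truncation and sampling error, which supports the geometric-series step between the second and third lines of Eq. (8).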

Now let us find the probability of correctly receiving a packet encoded by the physical layer security mechanism based on the noise aggregation method. Assume that packets are received correctly independently of each other, and let the error probabilities of the ith and (i-1)th packets be \( \epsilon \) and \( \overline{\epsilon} \) respectively. With \( {\tilde{U}}_i={U}_i\oplus {U}_{i-1} \) denoting the aggregated noise, the probability of an error caused by the physical layer security mechanism, as a function of the packet error probabilities, is given as

$$ \Pr \left({\tilde{U}}_i=1\right)= \Pr \left({U}_i=1,{U}_{i-1}=0\right)+ \Pr \left({U}_i=0,{U}_{i-1}=1\right) $$
(9)
$$ \Pr \left({\tilde{U}}_i=1\right) = \Pr \left({U}_i=1\right)\ \Pr \left({U}_{i-1}=0\right)+ \Pr \left({U}_i=0\right) \Pr \left({U}_{i-1}=1\right) $$
(10)
$$ \Pr \left({\tilde{U}}_i=1\right)=\epsilon \left(1 - \overline{\epsilon}\right)+\overline{\epsilon}\left(1-\epsilon \right)=\epsilon +\overline{\epsilon}-2\epsilon \overline{\epsilon}=\widehat{\epsilon} $$
(11)

For binary modulation schemes the error rate is less than 0.5, i.e. \( 0<\epsilon, \overline{\epsilon}<\frac{1}{2} \); hence the probability of decoding an even packet erroneously is increased, \( \Pr \left({\tilde{U}}_i=1\right)=\widehat{\epsilon}>\epsilon,\ \overline{\epsilon} \). Now let us find the probability that Eve cannot correctly decode both packets encoded using the noise aggregation scheme. Let \( \widehat{\beta} \) denote the probability that Eve cannot decode the ith packet, and assume that each packet received by Eve is independent of all others. The probability that Eve cannot extract information by decoding via the noise aggregation method is then given as

$$ \Pr \left({E}_E\right)=1-\left(\frac{1-\beta }{1-\alpha \beta}\right)\left(\frac{1-\widehat{\beta}}{1-\alpha \widehat{\beta}}\right) $$
(12)

Hence the probability that Eve is unable to decode the packets correctly is greater than that of Bob, unless the error probability of the wiretap channel is much smaller than that of the main channel, i.e. β ≪ α.
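To make Eqs. (11) and (12) concrete, the following sketch evaluates them for one set of illustrative numbers. The function names and all numerical values are assumptions for this example. In particular, the text does not state how \( \widehat{\beta} \) is obtained from β; here it is simply taken as an input, and the value 0.18 is chosen by plugging ε = ε̄ = 0.1 into Eq. (11) as one possible reading.

```python
def aggregated_error(eps: float, eps_prev: float) -> float:
    """Eq. (11): probability that exactly one of two independent errors occurs."""
    return eps + eps_prev - 2 * eps * eps_prev


def eve_success(alpha: float, beta: float) -> float:
    """Eq. (8): probability that Eve obtains a packet before ARQ stops."""
    return (1 - beta) / (1 - alpha * beta)


def eve_failure(alpha: float, beta: float, beta_hat: float) -> float:
    """Eq. (12): probability that Eve cannot decode both packets of a pair."""
    return 1 - eve_success(alpha, beta) * eve_success(alpha, beta_hat)


# Illustrative (assumed) values: identical main and wiretap channels.
alpha = beta = 0.1
beta_hat = aggregated_error(beta, beta)  # assumption: Eq. (11) with eps = eps_prev = beta

print("aggregated error probability:", beta_hat)
print("Pr(E_E):", eve_failure(alpha, beta, beta_hat))
```

For error probabilities below 0.5 the aggregated error probability exceeds both of its inputs, which is the enhancement stated after Eq. (11).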

4.3 Application of noise aggregation approach to immersive systems

In immersive systems, transmitting 3D video content requires a high Quality of Experience (QoE) in addition to intensive bandwidth. To transmit multimedia content over wireless networks, the data generated by multimedia sources has to be compressed efficiently. Compression standards such as H.264/AVC, MPEG-2 and MPEG-4 aim to achieve a high compression ratio. The compressed video bit-streams are very sensitive to channel errors. Even a single bit error in the compressed bit-stream may lead to a loss of synchronization at the receiver’s side, which can significantly degrade the quality of the reconstructed multimedia content [15].

In this paper we exploit the error-sensitive nature of compressed video content by using the noise aggregation approach, while keeping in mind the delay-sensitive nature of video. The eavesdropper experiences a degraded Quality of Experience (QoE) for the video content because noise aggregation in the encoded message packets increases the error probability. In addition, no secret key has to be shared in order to decode the message packets at the receiver’s end, because the previous packet serves as the key for the next encoded packet. Therefore, unlike cipher-based techniques, the proposed method does not add overhead to the wireless communication system. The proposed noise aggregation method effectively enhances the security of video communication by limiting the amount of information eavesdropped by the unauthorized user at the bit level.

5 Performance evaluation

The QoE for video content is evaluated using subjective and objective measurements. Subjective measurements are considered the most precise because they are based on human experience, while objective measurements are based on statistical methods. In this paper, we present both subjective and objective measurements to evaluate the video quality. The objective measurements are based on the Peak Signal-to-Noise Ratio (PSNR), one of the most commonly used video quality metrics, which is used here to assess the quality of the 3D video.
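For reference, the PSNR between a reference frame and a received frame is commonly computed from the mean squared error (MSE) as 10 log10(MAX² / MSE). The following sketch assumes 8-bit samples (MAX = 255) and frames given as flat sequences of sample values; the function name `psnr` and the toy frames are illustrative and do not reproduce the evaluation pipeline behind Table 1.

```python
import math


def psnr(reference, received, max_value: int = 255) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized frames,
    each given as a flat sequence of sample values."""
    if len(reference) != len(received):
        raise ValueError("frames must have the same number of samples")
    if len(reference) == 0:
        raise ValueError("frames must not be empty")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


# Toy 2x2 "frames" (assumed values): one sample differs by 10 grey levels.
original = [100, 100, 100, 100]
degraded = [100, 100, 100, 110]
print(f"PSNR = {psnr(original, degraded):.2f} dB")
```

A higher PSNR indicates a received frame closer to the original, so under this metric a lower value for Eve than for Bob corresponds to a lower objective video quality at the eavesdropper.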

We assume a frequency-flat block-fading additive white Gaussian noise channel for the system model presented in Section 3. The performance of the system is evaluated in terms of the bit error rate (BER) of the legitimate user and the passive eavesdropper. Decoding at Bob’s end is straightforward: Bob can request the retransmission of lost packets until he gets the correct packet or the maximum number of ARQ attempts is reached. Figure 6 presents the BER of Eve on three-dimensional axes as a function of the SNR of the main and wiretap channels. The error rate is presented on a logarithmic scale along the z-axis. It can be observed that when the SNR of the main channel is below a certain limit, the probability that Eve receives missing packets through retransmissions is high, and Eve therefore enjoys a low BER. However, if the SNR of the main channel is above this limit, the error rate of Eve increases. If the average SNR of the main channel is better than that of the wiretap channel, there is a significant degradation of service for the eavesdropper.

Fig. 6 Bit Error Rate of Eve as a function of Bob and Eve’s channel SNR (dB)

Figure 7 shows the frame error rate (FER) of Bob and Eve under the same channel conditions, i.e. both experience the same average SNR. FER is presented on a logarithmic scale along the y-axis versus the average SNR of the wireless channel along the x-axis. There is an SNR gain of approximately 1 dB between Bob and Eve for the same error rate, which indicates that Bob experiences better quality than Eve. However, this difference diminishes beyond a certain average SNR, when the packet loss for Eve becomes significantly lower.

Fig. 7 FER of legitimate user Bob and passive eavesdropper Eve for same channel conditions

To show that the proposed scheme can provide security for video communication, the system model presented in Fig. 2 is simulated for video content transmission. The main and wiretap channels are assumed to be identical and independent of each other. Fig. 8 presents a frame from the original video together with the corresponding frames received by the legitimate user and by the passive eavesdropper; it can be observed that Bob experiences better quality than Eve. An objective comparison of the quality of the video content received by the legitimate user Bob and the passive eavesdropper Eve, in terms of Peak Signal to Noise Ratio (PSNR), is presented in Table 1. The table reports the PSNR values of the video content received by Bob and Eve for three channel conditions: (i) when the average SNR of both channels is the same, (ii) when the average SNR of Bob’s channel is better than that of Eve’s channel, and (iii) when the average SNR of Bob’s channel is worse than that of Eve’s channel.

Fig. 8 Comparison of the video frames a) original transmitted frame, b) received by legitimate user Bob and c) passive eavesdropper Eve

Table 1 Peak Signal to Noise Ratio (PSNR) vs. channel SNR of legitimate user Bob and eavesdropper Eve

6 Conclusion

In this paper, we have presented a noise aggregation method for secure video communication over a wireless channel. The noise aggregation method is a physical-layer security approach based on the wiretap channel model, with no assumptions on the channel quality of the eavesdropper. The proposed physical-layer security scheme provides security against the passive eavesdropper above the SC of the main channel. Simulation results are provided which demonstrate the effectiveness of the proposed scheme. The scheme can be implemented by adding a security encoder to applications that use the Transmission Control Protocol (TCP) for video transmission or streaming. We believe that this noise aggregation approach will reveal the potential of physical-layer security for securing wireless communication systems.

For future work, we plan to extend our scheme to achieve perfect secrecy against the passive eavesdropper. The scope of the scheme can also be extended to provide more complete video security, including protection against active eavesdroppers and copyright violations.