1 Introduction

Steganography is widely used in various fields, like information integrity certification, identity authentication, confidential communication, and digital copyright protection [13]. Video steganography is an important field of research in industrial as well as military applications, due to its size and memory requirements. It is a process of hiding the secret information inside the video or image such that the hidden data is only known to the sender and the receiver. Steganography otherwise referred to as data hiding is an effective method used to protect the privacy and security of digital information, as it embeds the secret message in the digital media. The digital media contains secret information, which has some meaningful form. Hence, the privacy and security of embedded secret data is guaranteed, which prevent the attention of attackers [22]. In the past decades, various steganography methods were developed in the digital multimedia. The digital media contains various files, such as image, text, and video [17]. Among these media types, video is the most popularly used embedding framework, due to its wide range of applications in storage device, internet, surveillance camera etc. To save the storage space, coding model is applied for compressing the video sequences, as the video sequences has large file size [24]. The steganographic system uses various multimedia objects, namely audio, video and image as cover medium. Based on the category of cover medium, the steganography is categorized into image, video, audio, protocol, and text steganography [38].

The steganography method uses discrete cosine transform (DCT) coefficients, as it is the major transformation domain used for compressing the video files in order to embed the secret data into video coding structure [5, 22, 28]. It is the process used for embedding the secret data into digital media, such as video, text, and image [9, 14, 30]. The key role of the steganography procedure is to maximize the imperceptibility for increasing the level of security while dealing with the high embedding capability. Among different types of cover media, the video file plays a major attention in the area of research, as it provides large capacity. The existing video steganography techniques are classified into two different types, namely re-encoding embedding, and distortion compensated embedding [34]. Due to high complexity of computational process, the above methods are not effective in performing the data hiding mechanism. The existing data embedding techniques uses video coding structure that includes motion vector [31], DCT coefficient, and prediction mode for embedding the secret data [16]. The DCT coefficients utilize lesser modification and low amount of distortions. The present steganography method also uses various optimization techniques to get the best possible results. The well-known methods are genetic algorithm (GA) [7], particle swarm optimization (PSO) [12], monarch butterfly optimization (MBO) [6], artificial bee colony (ABC) [11], grey wolf optimization (GWO) [27], fractional-grey wolf optimization (Fractional-GWO) [29] etc.

In this paper, a novel video steganography method to hide the secret data using the Exponential Fractional-CSO is proposed. The proposed optimization framework involves three different phases to hide the secret data in the video sequence, namely key frame extraction, optimal region selection, and embedding of the secret data. Firstly, the key frames are extracted from the input video, and then grid lines are formed on the extracted key frame. Then, to select the optimal region proposed Exponential Fractional-CSO algorithm, which is developed using the CSO with the fractional concepts based on the Exponential parameter is used. The proposed optimization framework considers the factors, such as intensity, edge, wavelet energy, LBP energy, kurtosis, and neighborhood information associated with the multi-objective cost function before hiding the secret data. Once, the optimal region is selected then the embedding process is achieved using LWT to get the embedded image. After embedding all the frames are combined together to get the stego-video which is then transmitted through the network to the intended receiver. Upon receiving, extraction of the secret data is done from the stego image with the help of original image and optimal region.

The significant contribution of the proposed work are summarized as follows,

  • This scheme employs optimal region selection for hiding the secret data using Exponential Fractional-CSO algorithm.

  • The embedding process is achieved using LWT approach which increases the complexity level of security.

  • The secret data is hard to find as it’s embedded in an optimal region which is calculated from the fitness measure. After going through a series of hiding procedure the hidden data is not degraded and is imperceptible.

  • The system is simple yet robust even if any attacks happens.

The organization of this research is explained as follows: Section 2 elaborates the existing steganography methods. Section 3 explains the proposed optimization framework, Section 4 describes the results and discussion of the proposed model, and conclusion of this paper is made in Section 5.

2 Literature survey

Liu et al. [15] modeled a robust steganography model based on secret sharing to embed the secret message. It increased the robustness in message hiding. It improved the quality of video and achieved better performance. However, this method obtained low survival rate. Xue et al. [37] presented a secure steganographic scheme using quantized DCT coefficients. Here, the const function was constructed using the block distortion, texture complexity and bit rate change of DCT block. However, this method failed because of security performance and embedding capacity. Jalali et al. [10] proposed a video sparse representation method. This method exhibited better invisibility, and was highly robust against various attacks. However, it failed to increase the embedding payload.

Chan et al. [4] proposed a hybrid method in which scene change analysis was used to embed the data and error correcting code is embedded as a watermark in the audio channel. In this approach, the compressed video is first decomposed into standard video format, embedding of the data is done in the video signal and finally the video is recompressed. The proposed method gave a good embedding capacity but takes a lot of computer processing time. Noda et al. [21] presented a video steganography algorithm employing bit-plane complexity segmentation (BPCS) and compressed video in wavelet domain. The DWT domain are used by two coding techniques namely, 3D set partitioning in hierarchical tress (SPIHT) and Motion-JPEG2000. The proposed technique gave a higher embedding payload however it is not sure that all types of cover-video will have enough of noise-like bit-plane regions. Moon et al. [20] proposed a new approach of video steganography algorithm for hiding secret data and for authentication computer forensic tool was used. The secret data prior to embedding using the LSB method is first encrypted and authenticated using a key. The secret data is hidden into the 4LSB of the cover frame of video. Since the method uses spatial domain it not robust against noise. Stanescu et.al [26] presented a video steganography algorithm called StegoStream which uses MPEG-2 video streams to embed the subtitle message without consuming any extra bandwidth. After the quantization process and when the predefined threshold T is reached, the bocks are selected for embedding the data. The secret subtitle message are embedded only in the I frames of the video stream blocks. However, this method failed as the video subtitle will not repeat continuously since for every one second the MPEG-2 video have 4–5 intra frames.

Wang et al. [35] developed a video steganography method based on Intra Prediction Mode (IPM) and effectively evaluated the probability distribution. It increased the performance of security but failed to increase the performance of coding efficiency. Younus et al. [39] introduced Knight Tour algorithm for video steganography. The secret message was encrypted using the mathematical equation such that the key specified was set of random numbers. It was highly reliable in terms of security, but failed to use the key for selecting the frames. In Song et al. [25] to avoid the difficulties of distortion drift, an improved scheme based on Multi-view coding (MVC) standard for 3D video was introduced. The selected coefficients of 4 × 4 Quantized Discrete Cosine Transforms (QDCT) is first modified and then the data are hidden into b4-frames thereby preventing distortion drift. The proposed method uses two random sequenced to maintain randomness. However, this method failed to ensure security of the hidden data.

Gurunathan et al. [8] presented a method named stegano-visual cryptography to embed the secret data within an input image. The cover image is splitted into nblocks of 8 × 8pixels and the secret data into n partitions to enhance the embedding capacity and quality of stego image. For finding an optimum result, Cuckoo Search (CS) optimization technique was employed. It helps to detect the best substitution matrix which change protected data into whole block but failed to find single optimum matrix for substitution. Banharnsakun et al. [2] proposed a LSB based image steganography which is enhanced with the help of an Artificial Bee Colony (ABC). To embed the confidential image inside an input image, the block is optimized by ABC. The ABC algorithm is updated using a using a block assignment list is also presented in this. Though, there was no improvement in the embedding capacity. Miri et al. [18] proposed to map the cover image to a frequency domain, adaptive wavelet transform and GA. Based on Kieu and Chang model, the encrypted data is embedded in the edges of the image by which there is least difference in the cover image. Still, this method did not much concentrate on the improving the time complexity.

The detailed study conveyed that the above issues affected the video quality and security level of the hidden data. To overcome these issues, a new model of data hiding using Exponential Fractional-CSO method to achieve improved video quality and security level is proposed.

3 Proposed exponential fractional-CSO for video steganography

The main objective of developing the video steganography algorithm is to find the optimal region for embedding the secret data into the cover image thereby improving the video quality and offering the best security to the secret data. The proposed scheme benefits from the goodness of the Exponential Fractional-CSO along with multi-objective cost function. Initially, an input video is selected from the database, and the key frames are effectively extracted by applying contourlet transform and SSIM measure. After extracting the key frame, regions are formed through the aid of grid lines. Then, optimal regions are selected with the help of proposed optimization algorithm based on the multi-objective factors, like intensity, edge, wavelet energy, LBP energy, kurtosis, and neighborhood information. The proposed Exponential Fractional-CSO algorithm is designed by incorporating the fractional concept with the CSO based on the Exponential concept. Once, the optimal region is selected, the secret data is embedded in the optimal location with the help of LWT which is an effective transform model to hide the secret data. Figure 1 portrays the schematic diagram of the proposed Exponential Fractional-CSO for video steganography.

Fig. 1
figure 1

Schematic diagram of the proposed Exponential Fractional-CSO for video steganography

3.1 Extracting key frames from the input video

The input video Bi is selected from the database in order to perform the key frame extraction process. The key frame extraction process is simple and is effective for summarizing the long video sequence. The key frames represent the salient information of a video shot and they are effectively extracted within a specified shot. The key frame extraction process reduces the hiding complexity in video by concentrating more on the particular frames. For extracting key frames, the input video is divided into various frames and the difference between each frames is calculated and finally, the key frames used for embedding process is extracted. The key frame that is extracted is represented as,

$$ FK=\left\{F{K}_1,F{K}_2,\dots F{K}_i,\dots F{K}_n\right\} $$
(1)

where, FK denotes the key frame, FKn represents the number of key frames, and FKi indicates the ith key frame, respectively.

3.2 Optimal region selection using the proposed exponential fractional-CSO

Once the key frames are extracted, the next step is the formation of grid lines on the specific key frames. The grid lines are formed on the extracted key frame in order to detect optimal region for hiding the secret data. Once the grid lines are formed, the region used to hide the secret data is selected using the proposed Exponential Fractional-CSO which is designed by incorporating the CSO [1] with the fractional concept [3] based on the Exponential parameter.

The selection of optimal region enables the proposed model to hide the secret data more effectively in such a way that the secret data is not visible to the third party. The optimal region selection is done using the proposed method along with multi-objective cost function, which is formed using the factors, like intensity, edge, wavelet energy, LBP energy, kurtosis, and neighborhood information. These functions are associated together to model the fitness function, which in turn is used to obtain the optimal result.

3.2.1 Solution encoding

Solution encoding is used to represent the grids for selecting the optimal region to hide the secret data. It specifies the solution vector, which comprises of vnumber of regions. The dimension of solution vector is specified as [1 × h] which indicates the total number of regions used to embed the secret data, where h takes the value 9. Figure 2 portrays the solution vector.

Fig. 2
figure 2

Solution encoding

3.2.2 Fitness function

The fitness of the selected regions are computed using the multi-objective cost function factors like intensity, edge, LBP energy, kurtosis, neighborhood information, and wavelet energy. The fitness function is evaluated using the below Equation as,

$$ F=A+\left(1-T\right)+\left(1-D\right)+E+L+P $$
(2)

where, Arepresents the wavelet energy, T specifies the intensity of pixels, D represents the edge, E denotes the energy of loop, L represents the kurtosis, and P indicates the average of pixels in the neighborhood information. The wavelet energy is represented as,

$$ A=\frac{1}{h}\sum \limits_{m=1}^h\lambda \left({G}_m\right) $$
(3)

where, Gm represents the image corresponding to mth region and the entropies of m are averaged to calculate the wavelet energy. To calculate the intensity, the intensity difference between the pixels of a block is done which is represented as,

$$ T=\frac{1}{h\times R\times S}\sum \limits_{m=1}^h\sum \limits_{q=1}^R\left[\sum \limits_{r=1}^S{G}_m^q-{G}_m^{q,r}\right] $$
(4)

where, R denotes the total number of pixels at mth region, and S specifies the total number of neighbors. The intensity of qth pixel at mth region is represented as \( {G}_m^q \), and \( {G}_m^{q,r} \) specifies the intensity of rth neighboring pixel with respect to qth pixel of the mth block. The edge of the block is calculated as,

$$ D=\frac{1}{h\times R}\sum \limits_{m=1}^h\sum \limits_{q=1}^R{D}_{m,q} $$
(5)

The edge of pixel is denoted as Dm, q such that the value of Dm, q is one in case of edge pixel. The energy of the loop is calculated for each region by fixing the center pixel in the block. The difference between the center and the neighboring pixel is computed in such a way that if the difference is greater than or equal to zero, then the pixel value is updated by one otherwise the value is specified as zero. The energy of the loop is calculated as,

$$ E=\frac{1}{h}\sum \limits_{m=1}^h\lambda \left[J\left({G}_m\right)\right] $$
(6)

where, λ[J(Gm)] represents the entropy of the image corresponding to mth region. The kurtosis of loop corresponding to the mth region is calculated as,

$$ L=\frac{1}{h}\sum \limits_{m=1}^h\beta \left({G}_m\left(q,z\right)\right) $$
(7)

where, the term β(Gm(q, z)) denotes the kurtosis of loop corresponding to mth region such that Gm(q, z) is represented as,

$$ {G}_m\left(q,z\right)=\sum \limits_{m=0}^ha\left({b}_m-{b}_{\upsilon}\right){.2}^m $$
(8)

The condition used to specify the factor a() is represented as,

$$ a\left(\alpha \right)=\Big\{{\displaystyle \begin{array}{c}1\kern2.1em if\upalpha \ge 0\\ {}0\kern1.2em otherwise\end{array}} $$

The average of pixels present in the neighborhood information is represented as,

$$ P=\frac{1}{h}\sum \limits_{m=1}^hT $$
(9)

where, T represents the intensity. The fitness value of intensity and edge should be minimal and in order to afford maximum value, the intensity and the edge factors are subtracted from one.

3.2.3 Proposed exponential fractional-CSO

The CSO algorithm is integrated with the fractional concept based on the Exponential parameter in order to design the proposed Exponential Fractional-CSO for embedding the secret data. The CSO algorithm operates using the behavioral characteristics of cats. The cat uses two modes in order to chase the prey, namely the tracing mode and seeking mode. In seeking mode, the cats take rest for a long time but with curiosity and high degree of alertness of the objects moving around them and in tracing mode it moves around to chase the prey. Based on these two modes, global solution is obtained for embedding the secret data. As, the cats spend less time for tracing, there exist minimum number of cats in this subgroup such that the number are defined using the mixture ratio (MR).

Once, the cats are sorted in any of the two modes, the best solution is derived and saved based on their position and fitness. The steps are repeated until a stopping criterion is attained. Thus, based on factor such as Self-Position Considering (SPC), Seeking Memory Pool (SMP) and Seeking Range of the selected Dimension (SRD) decides between the modes through which the position of the cats is updated. The algorithmic steps involved in the proposed Exponential Fractional-CSO are discussed as follows:

Initialization

Let us consider N number of cats in the Zth dimensional search space and the solution is represented as Y.

Evaluating the fitness

The fitness of each solution is evaluated using Eq. (2) and the results are organized in ascending order with the maximal value of fitness. Finally, the function with the best fitness measure is declared as the optimal solution.

Position update of search agents

The cats update their position either in the tracing mode or in the seeking mode. The cats engaged in the seeking mode update their position using the below Equation as,

$$ {Y}^{x+1}=\left(1\pm O\times w\right)\times {Y}^x $$
(10)

where, Yx + 1 represents the location of cat in the current iteration, Yx indicates the location of cat in the previous iteration,wrepresents a random number and O denotes the Seeking Range of the selected Dimension (SRD) factor. The location of cat in the tracing mode is represented as,

$$ {Y}^{x+1}={Y}^x+{g}^{x+1} $$
(11)

Here, Yx denotes the position of cat at xth iteration and gx + 1 denotes the velocity of cat in the current iteration. By substituting the velocity gx + 1 in the above Equation we get.

$$ {Y}^{x+1}={Y}^x+{g}^x+{w}_1\times {s}_1\times {Y}^{bst}-{w}_1\times {s}_1\times {Y}^x $$
(12)

By integrating the fractional concept with the CSO algorithm, the above Equation is modified as,

$$ {g}^{\mu}\left({Y}^{x+1}\right)={g}^x+{w}_1\times {s}_1\times {Y}^{bst}-{w}_1\times {s}_1\times {Y}^x $$
(13)
$$ {Y}^{x+1}=\mu .{Y}^x+\frac{1}{2}\left(1-\mu \right){Y}^{x-2}+\frac{1}{6}\left(1-\mu \right){Y}^{x-2}+\frac{1}{24}\mu \left(1-\mu \right)\left(2-\mu \right){Y}^{x-3}+{g}^x+{w}_1\times {s}_1\times {Y}^{bst}-{w}_1\times {s}_1\times {Y}^x $$
(14)

With the concept of Exponential Weighted Moving Average (EWMA), the term \( {Y}_d^x \) is represented as,

$$ {Y}^x=\frac{Y_d^x-\left(1-\delta \right){Y}_d^{x-1}}{\delta } $$
(15)

By substituting the above in Eq. (14), the resultant Equation is expressed as,

$$ {\displaystyle \begin{array}{c}{Y}^{x+1}=\frac{\mu .{Y}_d^x}{\delta }-.\frac{\mu \left(1-\delta \right){Y}_d^{x-1}}{\delta }+\frac{1}{2}\left(1-\mu \right){Y}^{x-2}+\frac{1}{6}\left(1-\mu \right){Y}^{x-2}+\frac{1}{24}\mu \left(1-\mu \right)\left(2-\mu \right){Y}^{x-3}+\\ {}\kern11.27999em {g}^x+{w}_1\times {s}_1\times {Y}^{bst}-\frac{w_1{s}_1\;{Y}_d^x}{\delta }-.\frac{w_1{s}_1\left(1-\delta \right){Y}_d^{x-1}}{\delta}\end{array}} $$
(16)

The final updated equation for the proposed Exponential Fractional-CSO is presented in Eq. (16). The cat updates its position based on the position from the previous iterations Yx,Yx − 1,Yx − 2 and Yx − 3, random numberw1, a constant s1, fractional constant μ, and the parameter δvaries from0 < δ ≤ 1.

Termination

The above steps are repeated until an optimal solution is obtained or till it meets a condition criteria. The pseudo code of the proposed Exponential Fractional-CSO is explained below.

figure a

3.3 Embedding the secret data using lifting wavelet transform

Once the optimal region is identified using the proposed Exponential Fractional-CSO, then the embedding process to hide the secret data is carried out in the embedding phase to provide higher degree of security during data transmission. The secret data is embedded using the LWT, as it offers fast computation than the existing wavelet transforms. The cover image or the video frame Wiis used to embed the secret data with dimension of[U × V]. Firstly, the image Wi is decomposed using LWT, which generates bands, such as LL, LH, HL, and HH that provides enhanced edge details of the image. The application of LWT to image Wi is composed of two levels of decomposition. The first level of decomposition is specified as,

$$ \left\{ LL, LH, HL, HH\right\}=K\left({W}_i\right) $$
(17)

The lifting wavelet transform is denoted by K and {LL, LH, HL, HH} denotes the wavelet coefficients of frame. The size of the individual bands are denoted as \( \left(\frac{U}{2}\times \frac{V}{2}\right) \). Each bands of the image Wi is further send for second level of decomposition which generates 16 sub-bands, which are represented as,

$$ \left\{L{L}_1,L{H}_1,H{L}_1,H{H}_1\right\}=K(LL) $$
(18)
$$ \left\{L{L}_2,L{H}_2,H{L}_2,H{H}_2\right\}=K(LH) $$
(19)
$$ \left\{L{L}_3,L{H}_3,H{L}_3,H{H}_3\right\}=K(HL) $$
(20)
$$ \left\{L{L}_4,L{H}_4,H{L}_4,H{H}_4\right\}=K(HH) $$
(21)

The size of sub-bands from Eq. (18 to 21) is denoted as \( \left(\frac{U}{4}\times \frac{V}{4}\right) \). In this research, the wavelet coefficients LL and HH are used to embed the secret data. The embedding process is specified as,

$$ {X}_t\left(u,y\right)={W}_t\left(u,y\right)+l\ast {Z}_j\left(u,y\right); selected\kern0.24em optimal\;\;\; region $$
(22)

where, t and j denotes the total wavelet bands and it values ranges from 1 to 8. The watermarked image is Xt (u, y), Wt(u, y) denotes the lifting wavelet sub bands, Zj(u, y) indicates the secret data, and ldenotes the embedding strength.

3.4 Inverse wavelet transform

Once, we get the embedded image from the previous step then inverse lifting wavelet transform is applied which helps in getting the original embedded image wherein the secret data is concealed. It also involves two levels of decomposition which is applied on the embedded image. The first level of decomposition is given as,

$$ IDWT\left({X}_t\left(u,v\right)\right)=B\left(u,v\right)=\left\{L{L}^{\bullet },L{H}^{\bullet },H{L}^{\bullet },H{H}^{\bullet}\right\} $$
(23)

where, IDWT(Xt(u, v) represents the inverse lifting wavelet transform of the embedded image. Then, the second level of decomposition follows and the outcome is denoted as B (u, v), which is referred to as the stego image. All the frames along with the stego image frames are then combined together to get the stego-video.

3.5 Retrieval of secret data from the stego-video

The stego-video is then forwarded to the receiver through network for extracting the secret data. At the receiver end, the secret data can be retrieved using the original cover video, optimal regions and the embedded image. The lifting wavelet transform is applied to extract the secret data and the receiver extracts it from the optimal region. The process of extraction is represented as,

$$ M\left(u,y\right)={W}_t^{\ast}\left(u,y\right)-{W}_t\left(u,y\right) $$
(24)

Here, M(u, y) represents the retrieved secret data.

The impact of video steganography developed in this paper gives robust data hiding, ensures perception quality and security of the data.

4 Results and discussion

4.1 Experimental setup

The performance of the proposed method is verified using two input videos taken from [32, 36] along with a manually generated secret data. The experimental work is implemented in MATLAB R2015a, Intel Core i7, CPU @2.60 GHz, 8GB RAM under the Windows 10 operating system.

4.2 Evaluation metrics

The assessment of the proposed system is done using different metrics like correlation, PSNR and MSE which is described in detail as follows:

4.2.1 Correlation

It specifies the relation which exists among the original frames and the embedded video frames, represented as,

$$ \mathfrak{R}=\frac{M\left(\kappa, \lambda \right)}{\varpi_{\kappa }{\varpi}_{\lambda }} $$
(25)

where, ϖκ and ϖλ specifies the standard deviation of κ and λ, and M (κ, λ) denotes the covariance of κ and λ.

4.2.2 MSE

It specifies the error between the original video and the stego-video and is calculated as,

$$ Ms=\frac{1}{No}\sum {\left({W}_t\;\left(u,y\right)-{X}_t\left(u,y\right)\right)}^2 $$
(26)

where, Wt(u, y) specifies the original value, and Xt(u, y) denotes the embedded pixels.

4.2.3 PSNR

It specifies the quality of embedded image that is calculated using embedded frames and the cover frames of video. However, the PSNR is expressed as,

$$ {P}_s=20\;{\log}_{10}\frac{W_{\mathrm{max}}\times \hslash \times \ell }{Ms} $$
(27)

where, Wmax indicates the maximal energy with the size of image as ℏ and ℓ and Ms is the MSE.

4.3 Comparative methods

The proposed approach is compared with the conventional methods, like Particle Swarm Optimization (PSO) [19], Multi-Objective Cat Swarm Optimization (MOCSO) [23] and Wavelet [33] to show the performance improvement.

4.4 Comparative analysis

This section describes the comparative analysis of the proposed Exponential Fractional-CSO with the existing methods based on various metrics.

4.4.1 Analysis based on correlation using video 1

The above section presents the comparative analysis of the proposed Exponential Fractional-CSO method with the existing methods using video 1 in terms of frames, noise, random noise, and histogram respectively. When the histogram bins is 252, the methods, PSO, MOCSO, Wavelet and the proposed method acquired correlation of 0.4281, 0.5005, 0.5461, and 0.6579 respectively. It is clear from Fig. 3(a) that the proposed method rendered an accurate embedding through better security of the secret data.

Fig. 3
figure 3

Results of video steganography with video 1 based on Correlation, (a), varying histogram bins, (b) varying salt and pepper noise, (c) varying random noise, (d) varying the number of frames

When the noise level is 7 × 10−3 the methods, PSO, MOCSO, Wavelet and the proposed attained correlation of 0.4833, 0.5429, 0.6091 and 0.7356, respectively. Although the correlation declines with increase in noise levels, the proposed method acquired a higher value of correlation compared to other existing methods which is depicted in Fig. 3(b). When the random noise level is 7 × 10−3 the methods acquired correlation of 0.5091, 0.5501, 0.6249, and 0.9745, respectively as shown in Fig. 3(c). When the fourth frame is considered, the methods gained correlation of 0.5542, 0.6203, 0.7856and 0.9834 respectively. The analysis reveal that the proposed method acquired a higher correlation as portrayed in Fig. 3(d).

4.4.2 Analysis based on correlation using video 2

The comparative analysis using video 2 in terms of histogram, noise, random noise, and frames respectively are illustrated in this section. When the histogram bins is 252, the methods, PSO, MOCSO, Wavelet and the proposed method acquired correlation of 0.4244, 0.5289, 0.5421, and 0.6881 respectively. It is quite clear from Fig. 4(a) that the proposed method provided an accurate embedding through better security of the secret data.

Fig. 4
figure 4

Results with video 2 based on Correlation, (a), varying histogram bins, (b) varying salt and pepper noise, (c) varying random noise, (d) varying the number of frames

When the noise level is 7 × 10−3 the methods, PSO, MOCSO, Wavelet and the proposed attained correlation of 0.4668, 0.5449, 0.6213 and 0.7845, respectively. Although the correlation declines with increase in noise levels, the proposed method acquired a higher value of correlation compared to other existing methods which is represented in Fig. 4(b). When the random noise level is 7 × 10−3 the methods acquired correlation of 0.5835, 0.6456, 0.7148, and 0.9914, respectively as shown in Fig. 4(c). When the fourth frame is considered, the methods gained correlation of 0.6109, 0.6818, 0.7713, and 0.9854 respectively. The analysis reveal that the proposed method acquired a higher correlation as portrayed in Fig. 4(d).

4.4.3 Analysis based on correlation using video 3

The comparative analysis in terms of histogram, noise, random noise, and frames respectively using video 3 are explained in this section. When the histogram bins is 252, the methods, PSO, MOCSO, Wavelet and the proposed method acquired correlation of 0.4344, 0.5548, 0.5752, and 0.6981 respectively. It is quite evident from Fig. 5(a) that the proposed method provided a precise embedding through better security of the secret data. When the noise level is 7 × 10−3 the methods, PSO, MOCSO, Wavelet and the proposed attained correlation of 0.4832, 0.5759, 0.6313 and 0.7567, respectively which is represented in Fig. 5(b).

Fig. 5
figure 5

Results using video 3 based on Correlation, (a), varying histogram bins, (b) varying salt and pepper noise, (c) varying random noise, (d) varying the number of frames

When the random noise level is 7 × 10−3 the methods acquired correlation of 0.5935, 0.6634, 0.7394, and 0.9974, respectively as shown in Fig. 5(c). When the fourth frame is considered, the methods gained correlation of 0.6209, 0.6918, 0.7934, and 0.9754 respectively. The analysis reveal that the proposed method acquired a higher correlation as portrayed in Fig. 5(d).

4.4.4 Analysis based on correlation using video 4

The comparative analysis in terms of histogram, noise, random noise, and frames respectively using video 4 are explained in this section. When the histogram bins is 252, the methods, PSO, MOCSO, Wavelet and the proposed method acquired correlation of 0.4549, 0.5481, 0.56521, and 0.6881 respectively. It is quite evident from Fig. 6(a) that the proposed method provided a precise embedding through better security of the secret data.

Fig. 6
figure 6

Results using video 4 based on Correlation, (a), varying histogram bins, (b) varying salt and pepper noise, (c) varying random noise, (d) varying the number of frames

When the noise level is 7 × 10−3 the methods, PSO, MOCSO, Wavelet and the proposed attained correlation of 0.5078, 0.5629, 0.6591 and 0.7563, respectively which is represented in Fig. 6(b). When the random noise level is 7 × 10−3 the methods acquired correlation of 0.5181, 0.5201, 0.7242, and 0.9785, respectively as shown in Fig. 6(c). When the fourth frame is considered, the methods gained correlation of 0.5934, 0.6956, 0.8156, and 0.9889 respectively. The analysis reveal that the proposed method acquired a higher correlation as portrayed in Fig. 6(d).

4.4.5 Analysis based on PSNR and MSE

This section portrays the analysis based on the PSNR and MSE values of the proposed method with the existing work. The PSNR using video 1 for the methods PSO, MOCSO, Wavelet and the proposed method is 53.233, 61.504, 71.931, and 84.176, respectively. The MSE for the methods are 0.30887, 0.04599, 0.00416, and 0.00024. Likewise, the PSNR and MSE values for video 2, video 3, and video 4 is illustrated in Table 1. It is quite evident that the proposed Exponential Fractional-CSO method acquired maximal PSNR with much lesser MSE values compared to the existing methods.

Table 1 Comparative discussion based on PSNR and MSE

4.4.6 Analysis based on computational complexity

It is clear from Table 2 that the proposed method achieves low computational complexity than the other existing methods.

Table 2 Computational complexity of proposed method along with conventional methods

5 Conclusion

In this paper, an effective method named Exponential Fractional-CSO is developed to hide the secret data in the video sequence. The proposed steganography process involves three different phase, such as key frame extraction, formation of grid lines, and optimal region selection to hide the secret data in the video sequence. Initially, the input video is collected from the database and the key frames are retrieved from the input video. After extracting the key frames, the optimal region used to hide the secret data is selected through the grid lines. The selection of optimal region is carried out using the proposed Exponential Fractional-CSO, which is designed by integrating the CSO with the fractional concept using the exponential parameter. The optimal solution is evaluated using the fitness measure, which in turn uses the multi-objective cost function based on the constraints, like intensity, edge, wavelet energy, LBP energy, kurtosis, and neighborhood information. After determining the optimal region, the embedding process is achieved to hide the secret data using the lifting wavelet transform. The proposed Exponential Fractional-CSO obtained maximal correlation of 0.9931 by considering the frames, maximal PSNR of 89.70 dB, and MSE of 0.00006 respectively. In future, the performance of the video steganography process can be increased using different optimization model and also by encrypting the secret data prior to embedding.