1 Introduction

In the last few years, establishing ownership of multimedia data, preventing illegal copying and duplication, and protecting copyright have become challenging issues in the age of the growing internet and multimedia techniques. Digital watermarking (audio, video and image) [5, 16, 25] provides a solution to these problems. Digital watermarking [4, 13, 16, 25] is the process of embedding a watermark in the host signal in an imperceptible manner. Imperceptibility, robustness, security and payload are the main requirements of a watermarking scheme [5, 13, 16, 25]. The literature on digital watermarking (image, video and audio) reports that transform domain methods [1, 3, 4, 10, 11, 13–15, 17, 20–23] are more imperceptible and more robust to image processing operations than spatial domain methods [16, 19, 25].

Lin et al. [10] proposed a watermarking scheme based on maximum wavelet coefficient quantization. In this scheme, wavelet coefficients are grouped into blocks and the blocks are selected from different subbands. The watermark is embedded into the local maximum wavelet coefficient, which is obtained by adding different energies to the wavelet coefficients. The normalized correlation (NC) of the extracted watermark under image processing attacks such as histogram equalization, scaling and JPEG compression is very low. Agarwal et al. [1] proposed a watermarking scheme based on a GA-BPN hybrid network in the DWT domain using HVS parameters. Based on these parameters, 27 instances of HVS rules are used for training the GA-BPN network, and the output of the hybrid network is used to embed the watermark. The combined GA-BPN approach enhances the imperceptibility and robustness of the watermarking scheme.

In terms of computational cost and memory requirement, the above mentioned DWT based schemes and others described in [1, 10, 14, 15] are less efficient. A faster and more efficient implementation of the traditional wavelet transform, called the second generation wavelet [6] and also known as LWT, has been used by various researchers in the field of watermarking [9, 11, 21]. Verma et al. [21] proposed a watermarking scheme based on the significant difference of lifting wavelet coefficients. The watermark is embedded into the largest coefficient of randomly shuffled blocks of the CH3 subband. This subband is quantized using a predefined threshold value by comparing the significant difference value with the average significant difference value of all the blocks. Through extensive experiments, they have shown that the LWT based scheme offers good imperceptibility and high robustness against image processing operations. Loukhaoukha et al. [11] presented an image watermarking scheme based on SVD and LWT using multi objective genetic algorithm optimization. The combination of SVD and LWT makes this scheme imperceptible, and the use of the genetic algorithm makes it more robust against image processing attacks.

To increase performance and robustness, neural network based learning algorithms have been employed in watermarking applications by many researchers [1, 23]. Recently, machine learning algorithms such as support vector machine (SVM) [12, 15], support vector regression (SVR) [19] and extreme learning machine (ELM) [8] have been used in watermarking applications because of their faster learning speed and better generalization property compared with iterative neural network algorithms. Shen et al. [19] proposed an image watermarking scheme using SVR in the spatial domain. SVR is used to learn the relationship between the central pixel and its neighboring pixels in selected blocks, and the watermark is then embedded by comparing the SVR output with the original pixel value. Owing to the good generalization ability of SVR, the authors are able to recover the watermark after image processing attacks. However, the scheme is not resistant to common image processing operations because the watermark is embedded in the spatial domain. Peng et al. [15] proposed a watermarking scheme in the multiwavelet domain based on SVM. In this scheme, the mean value modulation method is used to embed the watermark in selected blocks of wavelet coefficients. In the extraction phase, SVM is used as a classifier. Owing to the high generalization property of SVM, a good quality watermark with a very low bit error rate is recovered after image processing operations.

Balasundram [2] proposed a faster machine learning algorithm, a modification to Lagrangian support vector regression (LSVR), which is many times faster than classical SVR [19] and shows good generalization ability on standard datasets [2]. Inspired by the use of LWT and QR decomposition in digital watermarking to extract stable and prominent features for imperceptibility, and by the high generalization ability of LSVR on noisy datasets, a new image watermarking algorithm for copyright protection is proposed. In the proposed approach, firstly, the host image is decomposed by one level LWT and the resulting LL subband is used for embedding the watermark. Secondly, the LL subband is divided into non-overlapping blocks of size 4 × 4 and, based on the correlation of the wavelet coefficients, each selected block is decomposed using QR factorization. Thirdly, the significant element of the first row of the R matrix [20] is regarded as the training objective in which the watermark is to be embedded, and the remaining upper triangular elements are used as the training features for LSVR. The effectiveness of the proposed scheme is evaluated through extensive experiments on differently textured images.

The rest of the paper is organized as follows. The preliminaries of the research work presented in this paper are described in Section 2. The proposed image watermarking scheme is explained in Section 3. Experimental results, discussion and comparison of the proposed scheme with existing SVR based and QR decomposition based schemes are given in Section 4. Finally, the conclusion is drawn in Section 5.

2 Preliminaries

2.1 Arnold transform

The Arnold transformation [24], proposed by V.I. Arnold, is used in many digital image scrambling schemes because of its periodicity property. The generalized form of the two dimensional (2-D) Arnold transform of a square image is:

$$ \left[\begin{array}{l}{x}_j\\ {}{y}_j\end{array}\right]=\left[\begin{array}{cc}\hfill 1\hfill & \hfill a\hfill \\ {}\hfill b\hfill & \hfill ab+1\hfill \end{array}\right]\left[\begin{array}{l}{x}_{j-1}\\ {}{y}_{j-1}\end{array}\right] \mod N $$
(1)

where \(x_j\) and \(y_j\) are the coordinates, after the j-th iteration, of the pixel located at \((x_{j-1}, y_{j-1})\) in the previous iteration; N is the height of the image being processed; and a and b are positive integers (here a = 1, b = 1). Owing to the periodicity property, every pixel returns to its original position (x, y) after \(T_n\) iterations (\(T_n\) is called the period), as shown in Fig. 1a.

Fig. 1 (a) Arnold transform. (b) Decomposition and reconstruction of a signal using the lifting wavelet scheme
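The scrambling of Eq. (1) and its periodicity can be illustrated with a short NumPy sketch. This is only an illustration under stated assumptions: the function names `arnold`, `inverse_arnold` and `arnold_period` are hypothetical, coordinates are taken as 0-based array indices, and the inverse is implemented by undoing the pixel moves rather than by iterating up to the full period.

```python
import numpy as np


def _arnold_targets(n, a, b):
    """Target coordinates given by Eq. (1) for every pixel (x, y) of an n x n image."""
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return x, y, (x + a * y) % n, (b * x + (a * b + 1) * y) % n


def arnold(img, iterations=1, a=1, b=1):
    """Scramble a square image by applying Eq. (1) `iterations` times."""
    img = np.asarray(img)
    if img.shape[0] != img.shape[1]:
        raise ValueError("the Arnold transform needs a square image")
    x, y, nx, ny = _arnold_targets(img.shape[0], a, b)
    out = img
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        scrambled[nx, ny] = out[x, y]  # pixel (x, y) moves to (nx, ny)
        out = scrambled
    return out


def inverse_arnold(img, iterations=1, a=1, b=1):
    """Undo `iterations` applications of Eq. (1)."""
    img = np.asarray(img)
    x, y, nx, ny = _arnold_targets(img.shape[0], a, b)
    out = img
    for _ in range(iterations):
        out = out[nx, ny]  # pixel (x, y) is fetched back from (nx, ny)
    return out


def arnold_period(n, a=1, b=1):
    """Smallest T for which T iterations map every coordinate back to itself."""
    m = np.array([[1, a], [b, a * b + 1]], dtype=np.int64)
    identity = np.eye(2, dtype=np.int64) % n
    power, t = m % n, 1
    while not np.array_equal(power, identity):
        power, t = (power @ m) % n, t + 1
    return t
```

For a 32 × 32 watermark, `arnold(w, arnold_period(32))` returns `w` unchanged, which is the periodicity property described above; the number of iterations actually used for scrambling can then serve as a secret key.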

2.2 QR decomposition

The orthogonal-triangular decomposition [20, 22] of a matrix A (also called QR decomposition) is defined as:

$$ {\left[A\right]}_{M\times N}={\left[Q\right]}_{M\times M}{\left[R\right]}_{M\times N} $$
(2)

where Q is an M × M unitary matrix whose columns form an orthonormal basis for the column space of A, and R is an upper triangular matrix of size M × N. The Gram-Schmidt orthogonalization process [22] is used to obtain the columns of Q. An interesting feature of the R matrix is that, when the columns of A are correlated with each other, the absolute values of the elements in the first row of R are greater than those in the other rows [14], so the first row of R contains most of the energy of the signal. The computational complexity of QR decomposition is also lower than that of other factorization methods such as SVD [22]. Because of this feature of the R matrix, various researchers [3, 14, 20, 22] have used QR factorization in digital watermarking.
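The energy concentration in the first row of R can be checked numerically. The sketch below is only an illustration: the 4 × 4 block is synthetic (a common column profile plus small perturbations, chosen here to mimic correlated columns), the helper name `first_row_energy_share` is hypothetical, and NumPy's Householder based `numpy.linalg.qr` is used in place of Gram-Schmidt, so the signs of individual rows of R may differ from a Gram-Schmidt result while the absolute values agree.

```python
import numpy as np


def first_row_energy_share(block):
    """Return Q, R and the fraction of the squared entries of R found in its first row."""
    q, r = np.linalg.qr(np.asarray(block, dtype=float))  # block = q @ r
    return q, r, float(np.sum(r[0] ** 2) / np.sum(r ** 2))


# Synthetic block with strongly correlated columns (an assumption for the demo).
rng = np.random.default_rng(0)
profile = rng.uniform(100.0, 200.0, size=(4, 1))
block = profile + rng.normal(0.0, 2.0, size=(4, 4))

q, r, share = first_row_energy_share(block)
```

For such a block nearly all of the energy ends up in the first row of `r`, which is why a first-row element tolerates a comparatively large modification.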

2.3 Lifting wavelet transform

In recent years, LWT, proposed by Sweldens [6], has become a powerful tool for image analysis because it is faster and more efficient to implement than DWT. LWT gives better results than DWT in image compression [7], image de-noising [18] and watermarking [9, 11, 21]. The lifting based wavelet transform not only saves time [6] but also has a better frequency localization feature, which overcomes the shortcomings of DWT. Decomposition of a signal using LWT involves three steps, splitting, prediction and update, which are shown in Fig. 1b and described as follows:

  • Split: divide the original signal x[n] into non overlapping even and odd samples that is x e [n] (even samples) and x o [n] (odd samples),

    $$ {x}_e\left[n\right]=x\left[2n\right], \kern0.24em {x}_o\left[n\right]=x\left[2n+1\right] $$
    (3)
  • Predict: if the even and odd samples are correlated, then one can be used as the predictor of the other. To predict x o [n] we use the x e [n] samples:

    $$ d\left[n\right]={x}_o\left[n\right]-P\left({x}_e\left[n\right]\right) $$
    (4)

    where d[n] is the difference between the original sample and its predicted value defined as high frequency component and P(.) is the predictor operator.

  • Update: with the help of update operator U(.) and detail signal d[n], we can update the even samples. Then the low frequency components l[n] which represent the coarse shape to the original signal are obtained as:

    $$ l\left[n\right]={x}_e\left[n\right]+U\left(d\left[n\right]\right) $$
    (5)
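The three lifting steps of Eqs. (3) to (5) can be sketched for a one dimensional signal as follows. This is a minimal sketch that assumes the simplest (Haar) choice of operators, P(x e [n]) = x e [n] and U(d[n]) = d[n]/2; the experiments in this paper use other lifting wavelets, and the function names are hypothetical.

```python
import numpy as np


def haar_lifting_forward(x):
    """One level of lifting: split, predict, update (Haar operators assumed)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    xe, xo = x[0::2], x[1::2]  # split, Eq. (3)
    d = xo - xe                # predict with P(xe) = xe, Eq. (4)
    l = xe + d / 2.0           # update with U(d) = d / 2, Eq. (5)
    return l, d


def haar_lifting_inverse(l, d):
    """Invert the lifting steps in reverse order to rebuild the signal."""
    l = np.asarray(l, dtype=float)
    d = np.asarray(d, dtype=float)
    xe = l - d / 2.0           # undo update
    xo = d + xe                # undo predict
    x = np.empty(l.size + d.size)
    x[0::2], x[1::2] = xe, xo  # merge (inverse of split)
    return x
```

With these operators l[n] is the average of each even/odd pair and d[n] is their difference, and reconstruction is exact because every step is undone by simply reversing its sign.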

2.4 LSVR formulation

In recent years, the support vector machine, a supervised machine learning algorithm based on statistical learning theory, has been used for both classification and regression problems [2, 12]. The aim of the regression model is to find a relationship between the given input samples and their target values.

The 2-norm objective function of linear SVR with ε insensitive error loss function as a constrained minimization problem can be defined as [2]:

$$ \underset{\left(w,b,\xi, {\xi}^{*}\right)}{ \min}\frac{1}{2}\left({w}^tw+{b}^2\right)+\frac{C}{2}{\displaystyle \sum_{i=1}^m\left({\xi}_i^2+{\xi}_i^{*2}\right)} $$
(6)
$$ \mathrm{Subject}\ \mathrm{t}\mathrm{o}\kern1em \begin{array}{l}\left({y}_i-{A}_iw-b\right)\le \left(\varepsilon +{\xi}_i\right)\hfill \\ {}\left({A}_iw+b-{y}_i\right)\le \left(\varepsilon +{\xi}_i^{*}\right)\hfill \end{array} $$
(7)

where \(A_i\) is the i-th row of the input matrix \(A \in R^{m\times n}\), \(y_i\) is the corresponding target value, \(\xi_i\), \(\xi_i^{*}\) are slack variables and ε, C are input parameters. Since none of the components of the vectors \(\xi = (\xi_1, \dots, \xi_m)^t\) or \(\xi^{*} = (\xi_1^{*}, \dots, \xi_m^{*})^t\) will be negative at optimality, their non-negativity constraints have been dropped in the formulation (6). The linear regression estimation function of (6) approximates the vector \(y \in R^m\) of observed values as \(y \approx Aw + be\),

where w and b are the solution of (6) and e is the column vector of ones of dimension m. Using the Lagrange multipliers \(\lambda_1 = (\lambda_{11}, \dots, \lambda_{1m})^t\) and \(\lambda_2 = (\lambda_{21}, \dots, \lambda_{2m})^t\) in \(R^m\), the Lagrangian function L is:

$$ L\left(w,b,\xi, {\xi}^{*},{\lambda}_1,{\lambda}_2\right)=\frac{1}{2}\left({w}^tw+{b}^2\right)+\frac{C}{2}{\displaystyle \sum_{i=1}^m\left({\xi}_i^2+{\xi}_i^{*2}\right)+{\displaystyle \sum_{i=1}^m{\lambda}_{1i}\left({y}_i-{A}_iw-b-\varepsilon -{\xi}_i\right)+{\displaystyle \sum_{i=1}^m{\lambda}_{2i}\left({A}_iw+b-{y}_i-\varepsilon -{\xi}_i^{*}\right)}}} $$
(8)

At optimality the partial derivatives of L with respect to the primal variables are zero, and the dual can be written as a minimization problem of the form defined in [2, 12], in which

$$ w={A}^t\left({\lambda}_1-{\lambda}_2\right)\quad \text{and}\quad b={e}^t\left({\lambda}_1-{\lambda}_2\right) $$
(9)

hold. The linear regression estimation function f(.) using (9) is:

$$ f(x)=\left[{x}^t\kern0.24em 1\right]\left[\begin{array}{l}{A}^t\\ {}{e}^t\end{array}\right]\left({\lambda}_1-{\lambda}_2\right). $$
(10)

Define G = [A e], an augmented matrix. Then the dual problem of (6) can be written as:

$$ \underset{0\le {\lambda}_1,{\lambda}_2\in {R}^m}{ \min}\frac{1}{2}\left[{\lambda}_1^t\kern0.48em {\lambda}_2^t\right]Q\left[\begin{array}{l}{\lambda}_1\\ {}{\lambda}_2\end{array}\right]-{r}^t\left[\begin{array}{l}{\lambda}_1\\ {}{\lambda}_2\end{array}\right] $$
(11)
$$ \mathrm{where}\quad Q=\left[\begin{array}{cc}\frac{I}{C}+G{G}^t & -G{G}^t\\ -G{G}^t & \frac{I}{C}+G{G}^t\end{array}\right] $$
(12)
$$ \mathrm{and}\kern0.5em r=\left[\begin{array}{l}{r}_1\\ {}{r}_2\end{array}\right]=\left[\begin{array}{l}y-\varepsilon e\\ {}-y-\varepsilon e\end{array}\right] $$
(13)

are block matrices. The linear SVR formulation (11) defined in dual variables can be extended to a nonlinear SVR model by replacing \(GG^t\) with the kernel matrix \(K = K(G, G^t)\), which is symmetric and positive semidefinite. The nonlinear SVR problem in dual variables can be formulated in the form of (11), where Q becomes

$$ Q=\left[\begin{array}{ll}\frac{I}{C}+K\left(G,{G}^t\right)\hfill & -K\left(G,{G}^t\right)\hfill \\ {}-K\left(G,{G}^t\right)\hfill & \frac{I}{C}+K\left(G,{G}^t\right)\hfill \end{array}\right] $$
(14)

Thus, for any vector \(x \in R^n\), the kernel regression estimation function f(.) is obtained in the form

$$ f(x)=K\left(\left[{x}^t\kern0.24em 1\right],\kern0.24em {G}^t\right)\left({\lambda}_1-{\lambda}_2\right) $$
(15)

In SVR we have seen that the dual problem for either the linear or nonlinear case can be written as:

$$ \underset{0\le \lambda \in {R}^{2m}}{ \min}\frac{1}{2}{\lambda}^tQ\lambda -{r}^t\lambda $$
(16)

where \( \lambda =\left[\begin{array}{l}{\lambda}_1\\ {}{\lambda}_2\end{array}\right] \) is a vector in \(R^{2m}\). The KKT necessary and sufficient optimality conditions for the dual problem (16) reduce to the classical complementarity problem

$$ 0\le \lambda \perp \left(Q\lambda -r\right)\ge 0 $$
(17)

The optimality condition (17) holds if and only if, for any α > 0, the relation \(Q\lambda - r = \left(\left(Q\lambda - r\right) - \alpha\lambda\right)_{+}\) holds, where \((\cdot)_{+}\) replaces the negative components of a vector by zero. The solution of the above problem is obtained by applying the following iterative procedure

$$ {\lambda}^{i+1}={Q}^{-1}\left(r+{\left(\left(Q{\lambda}^i-r\right)-\alpha {\lambda}^i\right)}_{+}\right) $$
(18)

Following the discussion of the algorithm and its convergence in [8], we define

$$ H=\left\{\begin{array}{ll}G{G}^t & \text{for the linear case}\\ K\left(G,{G}^t\right) & \text{for the nonlinear case}\end{array}\right. $$
(19)

Then, the matrix Q is written as a block matrix of the form

$$ Q=\left[\begin{array}{cc}\frac{I}{C}+H & -H\\ -H & \frac{I}{C}+H\end{array}\right] $$
(20)

and it is used in (18) to obtain the Lagrange multipliers, which are then used in (15) with the RBF kernel to find the regression function.
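The iteration (18) together with the regression functions (10) and (15) can be sketched in a few lines of NumPy. This is a hedged illustration, not the implementation used in the experiments: the names `gram`, `lsvr_fit` and `lsvr_predict` are hypothetical, the RBF kernel is assumed to have the form exp(−γ‖u − v‖²), the step size α = 1.9/C (inside the range 0 < α < 2/C commonly quoted for Lagrangian SVM iterations) and the starting point Q⁻¹r are assumptions, and Q is inverted explicitly, which is only reasonable for small training sets.

```python
import numpy as np


def gram(U, V, gamma=None):
    """Linear kernel U V^t when gamma is None, otherwise an (assumed) RBF kernel."""
    if gamma is None:
        return U @ V.T
    sq = (U ** 2).sum(1)[:, None] + (V ** 2).sum(1)[None, :] - 2.0 * U @ V.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def lsvr_fit(A, y, C=50.0, eps=0.01, gamma=None, max_iter=500, tol=1e-8):
    """Iterate Eq. (18); returns G = [A e] and the vector lambda_1 - lambda_2."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    m = A.shape[0]
    G = np.hstack([A, np.ones((m, 1))])       # augmented matrix G = [A e]
    H = gram(G, G, gamma)                     # Eq. (19)
    D = np.eye(m) / C
    Q = np.block([[D + H, -H], [-H, D + H]])  # Eq. (20)
    r = np.concatenate([y - eps, -y - eps])   # Eq. (13)
    alpha = 1.9 / C                           # assumed step size, 0 < alpha < 2/C
    Q_inv = np.linalg.inv(Q)
    lam = Q_inv @ r                           # assumed starting point
    for _ in range(max_iter):
        nxt = Q_inv @ (r + np.maximum(Q @ lam - r - alpha * lam, 0.0))  # Eq. (18)
        done = np.linalg.norm(nxt - lam) <= tol
        lam = nxt
        if done:
            break
    return G, lam[:m] - lam[m:]


def lsvr_predict(X, G, coef, gamma=None):
    """Evaluate Eq. (10) (linear) or Eq. (15) (kernel) at the rows of X."""
    X = np.asarray(X, dtype=float)
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])  # rows [x^t 1]
    return gram(Xa, G, gamma) @ coef
```

A call such as `G, coef = lsvr_fit(features, targets, gamma=1e-3)` followed by `lsvr_predict(new_features, G, coef, gamma=1e-3)` mirrors the train-then-predict use of LSVR in the proposed scheme; mapping the reported spread σ to the `gamma` argument is an assumption of this sketch.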

3 The proposed watermarking scheme

The low frequency subband is used for embedding the watermark because most of the signal energy is concentrated in the low frequency coefficients, which are also more robust against image processing operations. Noise added to the signal mainly affects the high frequency components, so embedding the watermark in the detail coefficients is not robust. Based on the correlation property of the lifting wavelet coefficients, the selected blocks are further decomposed using QR factorization to obtain a unitary matrix Q and an upper triangular matrix R. When the columns of a selected block are correlated with each other, the absolute values of the elements in the first row of R are greater than those in the other rows; the first row therefore carries most of the signal energy, and its larger values allow a larger modification range. To find the most suitable element of the first row of R in each block, several experiments were performed on different benchmark images: the watermark was embedded in each element of the first row of R in turn and then recovered from the watermarked image. The lower the bit error ratio (BER), the better the quality of the extracted watermark. The results for the watermarked ‘Lena’ image are reported in Table 1. From Table 1, the element \(r_{1,3}\) gives the best results under image processing operations, so the proposed scheme embeds the watermark in element \(r_{1,3}\) of the R matrix. The block diagram of the proposed scheme is shown in Fig. 2.

Table 1 BER comparison for selection of element of first row of R matrix to embed watermark
Fig. 2 Block diagram of the proposed watermarking scheme. (a) Watermark embedding procedure using LSVR. (b) Watermark extraction using trained LSVR

3.1 Algorithm: watermark embedding

  • Step 1: The watermark is secured by applying the Arnold transformation of Eq. (1) to the watermark logo. The resulting scrambled image is converted into a one dimensional (1-D) vector \(SW_m = \{w_i : i = 1, 2, \dots, L_w\}\) before being embedded into the host image, where \(SW_m\) is the scrambled watermark, \(L_w\) is its length and \(w_i \in \{0, 1\}\).

  • Step 2: Let the host image I = {I(x, y) : 1 ≤ x ≤ M, 1 ≤ y ≤ N} be an 8 bit gray scale image of size 512 × 512. It is decomposed into four subbands, low-low (LL), low-high (LH), high-low (HL) and high-high (HH), using one level LWT. The size of each subband is \(M_L \times N_L\):

    $$ {M}_L=\frac{M}{2^k},{N}_L=\frac{N}{2^k} $$

    Here k denotes the level of decomposition (k = 1 in this work). Split the lifting wavelet coefficients of the LL subband into non-overlapping blocks of size 4 × 4. Compute the standard deviation (SD) of each block and arrange the blocks in ascending order of SD. The average SD of all the blocks is taken as the threshold (T) for selecting the blocks in which to embed the watermark. Select m blocks (m = 2 × N1 × N2, where N1 × N2 is the size of the watermark) having an SD value less than T.

  • Step 3: Each selected block of the LL subband is decomposed by QR decomposition using Eq. (2) to obtain a unitary matrix Q and an upper triangular matrix R (shown in Fig. 3), each of size 4 × 4. The absolute values of the elements in the first row of R are greater than those of the elements in the other rows. From the experimental results in Table 1, \(r_{1,3}\) is the best element for embedding the watermark. So \(r_{1,3}\) is taken as the desired output of LSVR, and the remaining upper triangular elements \((r_{1,1}, r_{1,2}, r_{1,4}, r_{2,2}, r_{2,3}, r_{2,4}, r_{3,3}, r_{3,4}, r_{4,4})\), called the feature vector, act as the input to LSVR. Thus a complete dataset of size m × l is formed by extracting the features from all the selected blocks (here l = 10: nine features plus the desired output).

    Fig. 3 R matrix of the QR decomposition of a selected LWT block

  • Step 4: After the prominent features are selected from each block, a dataset DS is formed to train LSVR:

    $$ DS=\left\{\left({x}_i,{d}_i\right)\in {R}^9\times R:i=1,2,\dots, m\right\},\quad {x}_i=\left({r}_{1,1},{r}_{1,2},{r}_{1,4},{r}_{2,2},{r}_{2,3},{r}_{2,4},{r}_{3,3},{r}_{3,4},{r}_{4,4}\right),\quad {d}_i={r}_{1,3} $$

    where \(r_{1,3}\), as depicted in Fig. 3, is the desired output and the other nine upper triangular elements act as the input to LSVR. The odd-indexed samples, DS = {(x i , d i ) : i = 1, 3, 5, …, m − 1}, are used to train the LSVR with the iteration of Eq. (18). After training, the even-indexed samples of the dataset are supplied to the trained LSVR to obtain, through Eq. (15), the predicted output \(r_{1,3}^{Lsvr}\) corresponding to the desired output

    $$ {D}_i=\left\{{r}_{1,3}:i=2,4,6,\dots, m\right\} $$

    in which the watermark is embedded according to the following rule:

    $$ \begin{array}{l} if\;wm\_ bit=1\\ {}\kern1.2em {r}_{1,3}^{\prime }= \max \left({r}_{1,3},{r}_{1,3}^{Lsvr}+\alpha \right)\\ {} else\\ {}\kern1.2em {r}_{1,3}^{\prime }= \min \left({r}_{1,3},{r}_{1,3}^{Lsvr}-\alpha \right)\end{array} $$
    (21)

    where \(r_{1,3}^{\prime}\) is the modified value after embedding the watermark, which replaces the original \(r_{1,3}\) element of the R matrix of each selected block; \(r_{1,3}^{Lsvr}\) is the output predicted by the trained LSVR; α is the watermark strength (not to be confused with the iteration parameter α of Eq. (18)); and wm_bit is the scrambled watermark bit. The value of α was chosen after a number of repeated experiments, which showed that α = 20 gives a good trade-off between imperceptibility and robustness.

  • Step 5: After replacing \(r_{1,3}\) with \(r_{1,3}^{\prime}\) in each selected block, the watermarked image is obtained by the inverse QR process (multiplying Q by the modified R) followed by the inverse LWT. The quality of the watermarked image is evaluated by the peak signal-to-noise ratio (PSNR) defined by Eq. (23).
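Steps 2 to 5 can be sketched for a single block as follows. The sketch makes several assumptions that should be read as such: the LL subband is any 2-D array (the LWT itself is not performed here), "select m blocks with SD less than T" is interpreted as taking the m lowest-SD blocks below the mean SD, the value `predicted` stands in for the output of a trained LSVR, and the names `select_blocks`, `split_block` and `embed_bit` are hypothetical.

```python
import numpy as np

# 0-based positions of r11, r12, r14, r22, r23, r24, r33, r34, r44 in R.
FEATURES = ((0, 0), (0, 1), (0, 3), (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3))


def select_blocks(ll, m, size=4):
    """Top-left corners of up to m blocks whose SD is below the mean SD (Step 2)."""
    rows, cols = ll.shape[0] // size, ll.shape[1] // size
    tiles = ll[:rows * size, :cols * size].reshape(rows, size, cols, size).swapaxes(1, 2)
    sd = tiles.std(axis=(2, 3)).ravel()
    order = np.argsort(sd, kind="stable")          # ascending SD
    chosen = order[sd[order] < sd.mean()][:m]      # threshold T = mean SD
    return [(int(k // cols) * size, int(k % cols) * size) for k in chosen]


def split_block(block):
    """QR-decompose a block and return Q, R and the nine-element feature vector (Step 3)."""
    q, r = np.linalg.qr(np.asarray(block, dtype=float))
    return q, r, np.array([r[i, j] for i, j in FEATURES])


def embed_bit(block, bit, predicted, alpha=20.0):
    """Apply the rule of Eq. (21) to r13 and rebuild the block as Q R' (Steps 4 and 5)."""
    q, r = np.linalg.qr(np.asarray(block, dtype=float))
    if bit == 1:
        r[0, 2] = max(r[0, 2], predicted + alpha)
    else:
        r[0, 2] = min(r[0, 2], predicted - alpha)
    return q @ r
```

Because only \(r_{1,3}\) is changed, decomposing the rebuilt block again yields the same nine features, so the extractor can reproduce the same LSVR prediction and compare it with the modified \(r_{1,3}^{\prime}\).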

3.2 Algorithm: watermark extraction

Extracting the watermark from the watermarked image is the reverse of watermark embedding which includes the following steps:

  • Step 1: The watermarked image is decomposed into four subbands LL′, LH′, HL′ and HH′ using one level LWT, and the blocks are selected according to the indices used in the embedding process.

  • Step 2: Each selected block of the LL′ subband is decomposed by QR decomposition using (2) to obtain a unitary matrix Q′ and an upper triangular matrix R′ (structured as in Fig. 3), each of size 4 × 4. The dataset is formed as in Steps 3 and 4 of the embedding process. To perform watermark extraction, the even-indexed data samples

    $$ DS=\left\{\left({x}_i,{d}_i\right)\in {R}^9\times R:i=2,4,\dots, m\right\},\quad {x}_i=\left({r}_{1,1}^{\prime },{r}_{1,2}^{\prime },{r}_{1,4}^{\prime },{r}_{2,2}^{\prime },{r}_{2,3}^{\prime },{r}_{2,4}^{\prime },{r}_{3,3}^{\prime },{r}_{3,4}^{\prime },{r}_{4,4}^{\prime}\right),\quad {d}_i={r}_{1,3}^{\prime } $$

    are supplied to the trained LSVR (obtained using (18)) to get the output \(r_{1,3}^{Lsvr}\), i = 2, 4, 6, …, m, corresponding to the desired output \(d_i = \{r_{1,3}^{\prime} : i = 2, 4, 6, \dots, m\}\). Then the LSVR output (predicted value) is compared with the desired output of each block of the watermarked image to extract the binary sequence \(W_m^{\prime}\):

    $$ {W}_m^{\prime }=\left\{\begin{array}{ll}1 & \text{if }{r}_{1,3}^{\prime }>{r}_{1,3}^{Lsvr}\\ 0 & \text{otherwise}\end{array}\right. $$
    (22)

    where \(r_{1,3}^{Lsvr}\) is the LSVR output and \(r_{1,3}^{\prime}\) is the desired output value of each block.

  • Step 3: The extracted watermark sequence, obtained in vector form, is reshaped into a two dimensional matrix to obtain the scrambled image, which is then passed through the inverse Arnold transformation to obtain the recovered watermark. The quality of the extracted watermark is evaluated by computing the BER and NC defined by Eqs. (25) and (26) respectively.
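The comparison of Eq. (22) over the even-indexed blocks can be sketched as below. The sketch assumes that the R′ matrices of all m selected blocks are already available in embedding order, and that `predict` is any callable mapping an array of nine-element feature vectors to predicted values (a stand-in for the trained LSVR); the function name `extract_watermark` is hypothetical.

```python
import numpy as np

# 0-based positions of r11, r12, r14, r22, r23, r24, r33, r34, r44 in R'.
FEATURES = ((0, 0), (0, 1), (0, 3), (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3))


def extract_watermark(r_matrices, predict, shape):
    """Recover the scrambled watermark bits from the even-indexed blocks."""
    even = r_matrices[1::2]                          # blocks i = 2, 4, ..., m
    feats = np.array([[r[i, j] for i, j in FEATURES] for r in even])
    observed = np.array([r[0, 2] for r in even])     # desired output r'_{1,3}
    predicted = np.asarray(predict(feats))           # LSVR output r^{Lsvr}_{1,3}
    bits = (observed > predicted).astype(np.uint8)   # Eq. (22)
    return bits.reshape(shape)                       # still Arnold-scrambled
```

Since m = 2 × N1 × N2, the even-indexed blocks supply exactly N1 × N2 bits, one per watermark pixel; the returned matrix must still be passed through the inverse Arnold transformation.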

Fig. 4 The original (a) Lena (b) Pepper (c) Elaine (d) Baboon (e) Boat (f) Plane (g) original watermark1 (h) original watermark2

4 Experimental results, discussion and comparison

All the experiments are carried out on a Windows 7 machine with an Intel Core i3-2350M 2.3 GHz CPU and 4 GB RAM, using MATLAB 7.10. The imperceptibility and robustness of the proposed scheme are verified through extensive experiments on the differently textured images “Lena”, “Pepper”, “Elaine”, “Baboon”, “Plane” and “Boat” of size 512 × 512 (8 bit gray scale), along with the binary watermarks ‘CS’ and ‘IPU’ of size 32 × 32, as shown in Fig. 4. To train LSVR, we select the radial basis function (RBF) as the kernel, with spread \(\sigma = 10^{-3}\). The other parameters used in LSVR training, the penalty parameter C and the insensitivity constant ε, are determined by performing a large number of experiments and are set as C = 50, ε = 0.01. Several lifting wavelets, such as “Haar”, “Daubechies (db2, db4 etc.)”, “sym3” and “sym4”, are tried on all benchmark images to achieve prominent feature extraction. According to the experimental results, the RBF kernel gives better results than the “polynomial” and “linear” kernels, and the lifting scheme “db2” gives better results than the other lifting wavelets, under different types of image processing attacks. The quality of the watermarked image is quantified by the peak signal-to-noise ratio (PSNR) defined by:

$$ PSNR=10{ \log}_{10}\frac{255^2}{MSE}\kern0.6em (dB) $$
(23)

where MSE is the mean square error between the original and the distorted image defined as:

$$ MSE=\frac{{\displaystyle \sum_{x=1}^M{\displaystyle \sum_{y=1}^N{\left(I\left(x,y\right)-{I}^{\prime}\left(x,y\right)\right)}^2}}}{M\times N} $$
(24)

where I(x, y) and I′(x, y) denote the (x, y)th pixel values of the host and watermarked images respectively. BER and NC are used to measure the similarity between the extracted watermark W* and the original watermark W, and are defined by:

$$ BER\left(W,{W}^{*}\right)=\frac{{\displaystyle \sum_{i=1}^{N1}{\displaystyle \sum_{j=1}^{N2}W\left(i,j\right)\otimes {W}^{*}\left(i,j\right)}}}{N1\times N2} $$
(25)
$$ NC\left(W,{W}^{*}\right)=\frac{{\displaystyle \sum_{i=1}^{N1}{\displaystyle \sum_{j=1}^{N2}W\left(i,j\right)*{W}^{*}\left(i,j\right)}}}{\sqrt{{\displaystyle \sum_{i=1}^{N1}{\displaystyle \sum_{j=1}^{N2}{W}^2\left(i,j\right)}}}\sqrt{{\displaystyle \sum_{i=1}^{N1}{\displaystyle \sum_{j=1}^{N2}{W}^{*2}\left(i,j\right)}}}} $$
(26)

where W(i, j) and W*(i, j) denote the (i, j)th pixel values of the original and extracted watermarks respectively, ⊗ denotes the exclusive-OR operation, and N1 × N2 is the size of the watermark. Figure 5 shows the watermarked images along with the extracted watermarks and the corresponding PSNR, NC and BER values. High PSNR values indicate the imperceptibility of the watermark as well as the good quality of the watermarked image. From Figs. 4 and 5, we find that there is no visible degradation in the quality of the watermarked images and extracted watermarks relative to the original ones.
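The three quality measures of Eqs. (23) to (26) translate directly into NumPy. The sketch below assumes 8 bit images (peak value 255), binary watermarks stored as 0/1 arrays, and exclusive-OR as the bitwise operation in Eq. (25); the function names are hypothetical.

```python
import numpy as np


def psnr(host, watermarked):
    """Peak signal-to-noise ratio in dB, Eqs. (23) and (24), for 8 bit images."""
    diff = np.asarray(host, dtype=float) - np.asarray(watermarked, dtype=float)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(255.0 ** 2 / mse))


def ber(w, w_ext):
    """Bit error ratio, Eq. (25): fraction of watermark bits that differ."""
    return float(np.mean(np.asarray(w, dtype=bool) ^ np.asarray(w_ext, dtype=bool)))


def nc(w, w_ext):
    """Normalized correlation, Eq. (26)."""
    w = np.asarray(w, dtype=float)
    w_ext = np.asarray(w_ext, dtype=float)
    return float(np.sum(w * w_ext) / (np.sqrt(np.sum(w ** 2)) * np.sqrt(np.sum(w_ext ** 2))))
```

A perfectly extracted watermark gives `ber(...) == 0` and `nc(...) == 1`; note that Eq. (26) is undefined when either watermark is entirely zero, a case this sketch does not guard against.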

Fig. 5 The watermarked (a) Lena (b) Pepper (c) Elaine (d) Baboon (e) Boat (f) Plane images along with extracted watermark without attacks

4.1 Robustness evaluation

The robustness of the proposed scheme is evaluated by the BER and NC values of the extracted watermark after several image processing attacks are performed on the watermarked image, including JPEG compression, addition of Gaussian noise, salt and pepper noise, median filtering, low pass filtering, contrast enhancement and geometric attacks such as scaling, cropping and rotation. The results of all these attacks are listed in Tables 2, 3 and 4. Figure 6 shows the extracted ‘CS’ watermark corresponding to the watermarked ‘Plane’ image distorted under different attacks. The attacks are described as follows:

Table 2 BER value of extracted watermark for addition of Salt & pepper noise, sharpening, contrast enhancement and addition of Gaussian noise
Table 3 BER value of extracted watermark for average filtering, median filtering, speckle noise, Wiener filtering, Gaussian blurring and JPEG compression
Table 4 BER value of extracted watermark for scaling, cropping, rotation and shearing operations
Fig. 6 Attacks on watermarked ‘Plane’ image and corresponding recovered watermark (a) Cropping from centre (b) Cropping from top (c) Cropping from side (d) addition of salt and pepper noise with density 0.005 (e) addition of salt and pepper noise with density 0.01 (f) addition of salt and pepper noise with density 0.02 (g) addition of Gaussian noise with variance 0.001 (h) addition of Gaussian noise with variance 0.005 (i) addition of Gaussian noise with variance 0.01 (j) Resize (512-256-512) (k) Resize (512-128-512) (l) Rotation (0.5°) (m) average filtering (3 × 3) (n) average filtering (5 × 5) (o) Wiener filtering (3 × 3) (p) median filtering (3 × 3) (q) median filtering (5 × 5) (r) sharpening (s) contrast enhancement (t) Gaussian blurring (u) JPEG (QF = 50) (v) Shearing along horizontal direction

Noise addition: the distorted images are obtained by adding Gaussian noise with variance 0.001, 0.005 and 0.01 and addition of salt and pepper noise with density 0.005, 0.01 and 0.02 to the watermarked images. The BER value of the extracted watermark under different parameters is summarized in Table 2.
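The two noise attacks can be reproduced approximately with the sketch below. It rests on assumptions that are not stated in the text: the noise density is read as the fraction of pixels replaced (half by 0 and half by 255), and the Gaussian variance is read as relative to intensities scaled to [0, 1] with zero mean, which is a common convention for image noise utilities. The function names are hypothetical.

```python
import numpy as np


def salt_and_pepper(img, density, rng):
    """Replace about `density` of the pixels, half with 0 and half with 255."""
    out = np.array(img, dtype=np.uint8, copy=True)
    u = rng.random(out.shape)
    out[u < density / 2] = 0                          # pepper
    out[(u >= density / 2) & (u < density)] = 255     # salt
    return out


def gaussian_noise(img, variance, rng):
    """Add zero-mean Gaussian noise; variance is relative to a [0, 1] intensity scale."""
    scaled = np.asarray(img, dtype=float) / 255.0
    noisy = scaled + rng.normal(0.0, np.sqrt(variance), scaled.shape)
    return np.round(np.clip(noisy, 0.0, 1.0) * 255.0).astype(np.uint8)
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes an attack repeatable, so that BER values computed for different schemes are based on the same distorted image.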

Sharpening and contrast enhancement: an unsharp contrast enhancement filter of size 3 × 3 is formed from the Laplacian filter with parameter α = 0.2 and is used to sharpen the watermarked images. The contrast of the watermarked image is enhanced by histogram equalization. The BER values of the extracted watermark after the sharpening and contrast enhancement attacks are listed in Table 2.

Filtering: median filtering, average filtering and Wiener filtering operations are performed with mask sizes of 3 × 3 and 5 × 5, and the blurring operation is performed using Gaussian filtering with size 3 × 3. The BER values of the extracted watermark under the filtering operations are listed in Table 3.

JPEG compression: after compressing the watermarked image with quality factors ranging from 20 to 100, we are able to recover a recognizable watermark. The results under the JPEG compression attack for all the images, measured by the BER value of the extracted watermark, are shown in Fig. 7, and the BER values of the extracted watermark for quality factors QF = 50, 70 and 90 are listed in Table 3.

Fig. 7 Performance against JPEG compression attacks

Scaling: first, the watermarked image is downscaled from 512 × 512 to 128 × 128 and the downscaled image is then upscaled to the original size. Second, the watermarked image is downscaled from 512 × 512 to 256 × 256 and the downscaled image is then upscaled to the original size using the bi-cubic interpolation method. The results of the scaling attack for all the tested images are listed in Table 4.

Cropping: the watermarked image is cropped in three different ways: (a) cropping from the centre, (b) cropping from the side and (c) cropping at the top corner of the watermarked image. The results of the cropping attack for all images are shown in Table 4.

Shearing: a distortion of shape is produced by applying shearing to the watermarked image. Shearing along the horizontal direction, that is along the x direction, is performed with a shearing factor of 0.005 on the watermarked image. The results of the shearing attack on all the test images are tabulated in Table 4, and the sheared Plane image along with the extracted watermark is shown in Fig. 6.

Rotation: after rotating the watermarked image by a small angle (0.1° or 0.5°), we are able to recover the watermark, but for large rotation angles the BER value of the recovered watermark is high, which shows that the proposed scheme is not robust against the rotation attack. The results for all tested images under different angles of rotation are listed in Table 4.

4.2 Comparison and discussion

4.2.1 Imperceptibility comparison

The performance of the proposed scheme is evaluated by comparing it with the method presented by Naderahmadian et al. [14] using QR factorization in the wavelet domain, the QR decomposition based image watermarking method proposed by Song Wei et al. [22] and image watermarking in the multiwavelet domain based on SVM [15]. To give a fair comparison, the differently textured images used in [14, 15, 22], shown in Fig. 4, are used in our experiments. From Table 5, we see that the proposed scheme has a higher PSNR value than [14, 15, 22], which shows the better imperceptibility of the watermark. A BER value of zero indicates that the extracted watermark is identical to the original watermark. The entry ‘NA’ in Table 5 indicates that results for the image are not available for the corresponding method.

Table 5 Imperceptibility comparison results of proposed scheme

4.2.2 Robustness comparison

The robustness of the proposed scheme is compared with that of the schemes in [14, 15, 22] under various geometric and non-geometric image processing operations; the results are given in Tables 6, 7 and 8. The comparison is summarized as follows:

  (a) From Table 6, we see that the NC value of the extracted watermark under different types of image processing operations is higher than that of the scheme proposed in [14] using QR factorization in the wavelet domain. The authors of [14] report results for salt and pepper noise with density 0.005 only, which is much lower than the densities tested for the proposed scheme (0.01 and 0.02). Also, in the case of average and median filtering attacks with window size 5 × 5, the proposed scheme is able to recover a recognizable watermark.

  (b) The comparison of the proposed scheme with the QR decomposition based scheme of [22] on the ‘Lena’ image is listed in Table 7. The watermarked ‘Lena’ image was subjected to filtering operations, scaling with parameters 0.5 and 0.9, addition of noise until the watermarked image had a PSNR of approximately 20 dB, sharpening, cropping and rotation attacks. From Table 7, we see that our scheme outperforms [22] under most of the attacks, as quantified by the BER and NC values of the extracted watermark. However, in the case of addition of salt and pepper noise and the rotation attack, the scheme of [22] performs better than ours.

Table 6 NC value comparison with Naderahmadian et al. [14] method on “Lena” image
Table 7 BER value comparison with Wei et al. [22] method
Table 8 BER value comparison with Peng et al. [15] method

A comparison of the proposed scheme with the SVM based scheme in the multiwavelet domain [15] is shown in Table 8. For a fair comparison, the same attacks with the same parameters are applied to the Boat image. From Table 8, we deduce that under JPEG compression with quality factors 80 and 50, low pass filtering, median filtering, average filtering, addition of 10 % Gaussian noise, scaling (50 %), cropping (25 %), blurring and sharpening attacks, the BER of the extracted watermark is much lower than that of the scheme proposed in [15]. This demonstrates the robustness of the proposed method. However, in the case of salt & pepper noise (2 %) and rotation attacks, the method of Peng et al. [15] gives slightly better results.
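The salt and pepper attack and the NC measure referred to in this comparison can be illustrated with a short sketch. It rests on two assumptions that this section does not state: the noise density is taken to be the fraction of pixels replaced by 0 or 255, and NC is taken in its common cosine form. The function names are hypothetical, and the NC definition used elsewhere in the paper may differ in normalization.

```python
import numpy as np

def salt_and_pepper(image, density, rng):
    """Replace roughly a fraction `density` of pixels by 0 or 255.

    Assumption: half of the affected pixels become pepper (0) and
    half become salt (255).
    """
    noisy = np.array(image, dtype=np.uint8, copy=True)
    u = rng.random(noisy.shape)
    noisy[u < density / 2] = 0                          # pepper
    noisy[(u >= density / 2) & (u < density)] = 255     # salt
    return noisy

def normalized_correlation(w, w_ext):
    """Cosine-style normalized correlation between two watermarks."""
    w = np.asarray(w, dtype=np.float64).ravel()
    w_ext = np.asarray(w_ext, dtype=np.float64).ravel()
    denom = np.sqrt(np.sum(w ** 2)) * np.sqrt(np.sum(w_ext ** 2))
    return float(np.sum(w * w_ext) / denom) if denom > 0 else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = np.full((256, 256), 128, dtype=np.uint8)
    attacked = salt_and_pepper(image, 0.02, rng)
    print(np.mean(attacked != image))  # close to the requested density
    print(normalized_correlation([1, 0, 1, 1], [1, 0, 0, 1]))
```

Under this convention, a density of 0.02 corrupts about four times as many pixels as the density of 0.005 reported in [14], which is why the two settings are not directly comparable.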

4.2.3 Computational cost

Efficient and robust image watermarking requires a fast feature extraction technique that preserves imperceptibility and robustness. In the proposed image watermarking scheme, this requirement is met by combining LWT and QR decomposition. LWT provides a faster implementation of the transform (DWT is the most widely used transform for image watermarking, and LWT is a faster and more efficient implementation of DWT [6]). Lifting allows an in-place implementation, i.e. the wavelet transform can be computed without allocating auxiliary memory [6], so LWT is more memory efficient than DWT. In the lifting wavelet domain, all operations within one lifting step can be performed completely in parallel, so the only sequential part is the order of the lifting steps themselves. Thus LWT is computationally faster than DWT. QR decomposition provides the robust coefficients in which the watermark bits are embedded. QR decomposition is also faster than SVD: the computational complexity of SVD is O(n^3), whereas that of QR decomposition is O(n^2) for a matrix of order n. In the proposed scheme, LSVR is used to learn the image characteristics and to find the nonlinear regression function between the input vector and the target. Unlike the classical support vector regression (SVR) algorithm, which requires solving a quadratic optimization problem, LSVR is solved by an iterative algorithm [2]. Moreover, the host image is divided into 8 × 8 blocks and, based on the statistical property of each block, the watermark bits are embedded only into the selected blocks, which further reduces the computational cost.
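The in-place property of lifting noted above can be illustrated with one level of a one-dimensional Haar transform. This is a minimal sketch: the Haar wavelet is chosen only for simplicity and is an assumption, since the wavelet filter of the proposed scheme is not specified in this section, and the function names are hypothetical.

```python
import numpy as np

def haar_lifting_forward(x):
    """One level of the Haar transform via lifting, computed in place.

    `x` is a 1-D float array of even length. Afterwards the even
    positions hold the approximation (pairwise mean) and the odd
    positions hold the detail (pairwise difference).
    """
    for i in range(0, len(x), 2):
        x[i + 1] -= x[i]        # predict step: detail = odd - even
        x[i] += x[i + 1] / 2    # update step: approximation = mean
    return x

def haar_lifting_inverse(x):
    """Undo haar_lifting_forward by reversing the steps and the signs."""
    for i in range(0, len(x), 2):
        x[i] -= x[i + 1] / 2    # undo update
        x[i + 1] += x[i]        # undo predict
    return x

if __name__ == "__main__":
    signal = np.array([4.0, 6.0, 10.0, 12.0])
    coeffs = haar_lifting_forward(signal.copy())
    print(coeffs)                        # approximation and detail interleaved
    print(haar_lifting_inverse(coeffs))  # original signal recovered
```

For the input [4, 6, 10, 12] the forward pass yields [5, 2, 11, 2], and the inverse pass restores the input exactly. Each pair of samples is overwritten by its own coefficients, so no second signal-sized buffer is needed, and the pairs within a step are independent of each other, which is what permits parallel execution within a lifting step.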

The main attractions of the proposed scheme are: (1) The LL subband is selected for embedding the watermark, since most of the signal energy is contained in the low frequency coefficients and these coefficients are more robust against distortions. (2) Selecting blocks of wavelet coefficients based on their correlation property, and selecting the embedding element from the first row of the R matrix, provides the imperceptibility of our scheme. (3) The application of LSVR to image watermarking is the novelty of the proposed scheme. LSVR has a good capability for learning the nonlinear relationship among image features, and its high generalization property gives a significant improvement over classical SVR under several image processing attacks, by which high robustness is achieved. (4) The proposed scheme is efficient in terms of computational cost and memory requirement.

5 Conclusion

A novel image watermarking scheme based on the combination of LWT-QR decomposition and LSVR is proposed in this paper. Feature extraction using LWT-QR decomposition results in good imperceptibility. The robustness against several image processing operations is achieved through the high generalization property of LSVR. The security of the watermark is achieved using the Arnold transformation. The faster and more efficient implementations of LWT, QR and LSVR, compared with the traditional wavelet transform, SVD and classical SVR respectively, make the proposed scheme more efficient in terms of memory requirement and computational cost. Comparison with state-of-the-art techniques shows that the proposed scheme not only attains imperceptibility but also has strong robustness. In future work, we will consider rotation and shearing invariant feature extraction methods so that robustness against these attacks can be improved.