1 Introduction

Steganography is a method of hiding secret data within non-secret data (the cover) without visible distortion. Based on the cover, steganography is classified as image, text, audio or video steganography. Image steganography mainly works in the spatial domain or the transform domain, hiding data in the least significant bits (LSBs) of the pixels or in frequency coefficients, respectively. LSB substitution and pixel value differencing (PVD) are the most common spatial hiding techniques. PVD uses the difference between pixel values to hide more data bits in noisy areas than in smooth areas. However, algorithms that hide binary data show low embedding capacity, and those with large hiding capacity result in lower imperceptibility [1].

To enhance payload and imperceptibility, variants of LSB substitution and PVD have been proposed. Seven-way PVD [15] used nine-pixel blocks to increase both capacity and Peak Signal-to-Noise Ratio (PSNR). PVD combined with a modulus function [18], hiding data in the R and B planes, increased payload and PSNR. Nine-pixel differencing with modified LSB [20] readjusted the pixel values to minimize distortion. Nonadaptive PVD resulted in a higher capacity than the adaptive method [21]. Five-way and eight-way PVD with modified LSB [22] were proposed to obtain high bit rates and less distortion, and to address the falloff boundary problem (a pixel value exceeding the upper (255) or lower (0) limit after substitution). LSB combined with eight-directional PVD [23] enhanced capacity and PSNR and evaded steganalysis. Pixel-indicator-based adaptive LSB and PVD [27] on scrambled RGB images increased capacity by 50%. An adaptive method using LSB and PVD [5] for grayscale images improved both capacity and PSNR. Multi-pixel differencing and LSB [10] embedded data in the noisy and smooth blocks of a grayscale image, respectively. Construction-based data hiding [13] transformed the message directly into a fingerprint image, which resulted in accurate extraction and made the message hard to detect. Image steganography robust to JPEG compression [26] hid data in a channel-compressed version of the original image, and the data were extracted exactly. Overlapping block-based PVD [16], obtained by combining color components, improved both capacity and PSNR. QVD (quotient value differencing) and MFPVD (modulus function PVD) [24] were combined, obtaining higher payload and PSNR, respectively. QVD combined with LSB [25] achieved the highest capacity in the available literature, but at the cost of reduced PSNR. A literature review of spatial domain steganography techniques is given in [7].

The technique proposed in this paper hides decimal digits of the data instead of binary data. To the best of our knowledge, it has achieved the second-best capacity in the available literature (after the technique in [25]) together with the highest PSNR. The rest of the paper is organized as follows. Section 2 describes the related work. The methodology of the proposed technique is discussed in Section 3. Section 4 presents a comparative analysis of the proposed technique with existing spatial hiding techniques. Section 5 presents a security analysis of the proposed technique. The conclusion is presented in Section 6.

2 Related work

Digit-by-digit data hiding using a modulus function [28] was proposed for grayscale images. The method resulted in a higher capacity and produced better results than LSB substitution. Pixel Value Modification (PVM) [14] was proposed for RGB images using a modulus-three function and resulted in increased hiding capacity. The PVM method was enhanced in [12] using varying modulus functions; however, the hiding capacity depended on the modulus function. Hiding the ASCII values of characters in the least significant digits of the pixels [19] was proposed using medical cover images; however, only pixel values 0–249 were used for embedding. Parity-bit PVD was combined with improved Right Most Digit Replacement (iRMDR) [8], increasing both capacity and PSNR for grayscale images. RMDR with adaptive LSB [9] was proposed, resulting in high capacity and good visual symmetry for grayscale images. A generalization of the single-digit sum concept [3] was presented to support number systems of any base and so enable hiding of the data bits in pixels.

Compared with the existing digit hiding methods, the proposed method achieves a higher capacity: it embeds 393,216 bytes (3,145,728 bits) of text, a 512 × 512 grayscale image or a 256 × 512 color image in a 512 × 512 color cover image. Indeed, the trade-off between capacity and imperceptibility obtained using the proposed method (approximately 39 dB PSNR at 3,145,728 bits, or 4 bpp) has not been found in the available literature.

3 Proposed method

The method proposed in this paper compresses the ASCII values of text characters, or the pixel values of a data image, into two-digit decimal values. The compression provides encryption and enables a large hiding capacity. Sections 3.1 and 3.2 present an overview of the procedures to hide text and a data image, respectively. After conversion, the individual digits of the two-digit data values are hidden sequentially in the cover pixels using the method provided in Section 3.3. Section 3.4 illustrates mathematically that the changes in cover pixel values after hiding data are minimal. Data extraction is described in Section 3.5.

3.1 Text hiding procedure

  1. Step 1: The text is converted into ASCII format.

  2. Step 2: ASCII values 32–126 (x) are compressed to the range 00–95 using the function

    $$ F(x)=\left\lfloor \left(\frac{x-lb}{ub-lb}\right)\times \mathit{max}\right\rfloor;\quad lb=32,\ ub=126,\ \mathit{max}=95 $$

    The control characters carriage return, new line, horizontal tab and vertical tab are converted to 96, 97, 98 and 99, respectively. Thus, a two-digit value is obtained for every character.

  3. Step 3: A sequence (string) S of digits is created from the obtained two-digit values. The first digit in S is embedded in the blue component of the first pixel, the second digit in its green component and the third in its red component. The next three digits in S are inserted into the components of the second pixel. In this way, the digits in S are embedded sequentially into the pixel components.

A character (two-digit value) is embedded in two pixel components. Therefore, an X × Y RGB image (X × Y × 3 pixel components) can hide (X × Y × 3)/2 bytes using the proposed method. For example, a 512 × 512 RGB image can hide a total of 393,216 bytes, or 3,145,728 bits.
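Steps 2 and 3 and the capacity formula can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the floor in F(x) is evaluated with integer arithmetic (an implementation choice made here to avoid floating-point rounding).

```python
# Hypothetical helper names; the constants and control codes are those of Step 2.
LB, UB, MAX = 32, 126, 95
CONTROL_CODES = {"\r": 96, "\n": 97, "\t": 98, "\v": 99}


def compress_char(ch):
    """Map one character to a two-digit value in 00-99."""
    if ch in CONTROL_CODES:
        return CONTROL_CODES[ch]
    x = ord(ch)
    if not LB <= x <= UB:
        raise ValueError(f"unsupported character: {ch!r}")
    # Integer form of floor((x - lb) / (ub - lb) * max).
    return (x - LB) * MAX // (UB - LB)


def text_to_digits(text):
    """Build the digit sequence S: two decimal digits per character."""
    digits = []
    for ch in text:
        digits.extend(divmod(compress_char(ch), 10))
    return digits


def capacity_bytes(width, height):
    """One digit per colour component, two digits per character."""
    return width * height * 3 // 2
```

With these constants, F(x) evaluates to x − 32 for x ≤ 125 and to 95 for x = 126, so every printable character receives a distinct two-digit value.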

3.2 Data image hiding

Data image pixels are compressed to the range 00–64 using their six MSBs, which are sufficient to recover the image in acceptable detail. Therefore, the data comprise pixel values 00–64. Step 3 of Section 3.1 is then followed to hide the compressed grayscale image. For a color data image, a pixel (00–64) of each plane is inserted into two consecutive pixels of the corresponding cover plane. For example, the individual digits of the 1st red data pixel are embedded into the 1st and 2nd red pixels of the cover image, and those of the 2nd red data pixel into the 3rd and 4th red cover pixels. Following the same process for the other two planes, an X × (Y/2) or (X/2) × Y color image can be embedded in an X × Y color image.
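The compression of a data plane into digits can be illustrated as follows. The sketch assumes that "six MSBs" means a right shift by two bits, which yields values 0–63 (inside the stated 00–64 range); the function names and the zero-filled reconstruction of the two dropped bits are illustrative assumptions.

```python
def compress_pixel(p):
    """Keep the six MSBs of an 8-bit value (assumed: right shift by two bits)."""
    return p >> 2


def expand_pixel(v):
    """Approximate inverse (assumption): restore the six MSBs, two LSBs set to zero."""
    return v << 2


def plane_to_digits(plane):
    """Two decimal digits per data pixel, in row-major order."""
    digits = []
    for row in plane:
        for p in row:
            digits.extend(divmod(compress_pixel(p), 10))
    return digits
```

For instance, a data pixel of 255 becomes the two-digit value 63, which occupies two consecutive cover components and is restored as 252 under the zero-fill assumption.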

3.3 Hiding digit in a pixel value

Pixel values lie in the range 000–255. Fig. 1 shows the three digits of a pixel value p: the most significant or leftmost digit (LMD) α, the middle digit (MD) β, and the least significant or rightmost digit (RMD) γ. The pixel value is split into individual digits using place values as p = 100α + 10β + γ. Based upon the difference between the RMD and the digit d to be embedded, the new pixel value is computed as p’ = 100α’ + 10β’ + γ’. The proposed method replaces the RMD by d and then changes the MD and/or LMD, if required, to keep the change in the pixel value minimal.

When the difference between the RMD and d is at most 5, the RMD is replaced by d without any change in the MD or LMD. In the rare case that the modified value exceeds 255, the MD is decreased by 1 to solve the falloff boundary problem.

When the difference between the RMD and d is more than 5, replacing the RMD alone would change p by more than 5, so the MD and/or LMD are also changed. In this case, the method uses the sign of the difference as an indicator: a positive difference increases the pixel value and a negative difference decreases it. To increase the pixel value, the MD is increased by 1 after replacing the RMD by d: a middle digit β ∈ {0, …, 8} becomes β’ ∈ {1, …, 9}, whereas an MD equal to 9 becomes 0 and the LMD increases by 1, as in ordinary addition. To decrease the pixel value, the MD is decremented by 1: an MD equal to 0 becomes 9 and the LMD decreases by 1, as in ordinary subtraction. The threshold of 5 was selected because it resulted in minimal value changes. Fig. 2 illustrates the manipulation of the individual digits of a pixel value to hide a digit in its rightmost digit. The method thus performs rightmost (least significant) digit replacement.

Fig. 1 Individual digits of a pixel value

Fig. 2 Procedure to hide a digit in a pixel value
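The digit-hiding rule described in this section can be sketched as below. This is a reading of the prose, not a transcription of Fig. 2: the difference is taken as RMD − d, adding or subtracting 10 stands in for the carry and borrow between MD and LMD, and the handling of a single-digit cover value (plain replacement, since no lower value exists) is an assumption inferred from the remark in Section 3.4. The name `embed_digit` is hypothetical.

```python
def embed_digit(p, d):
    """Return the stego value for cover value p (0-255) carrying digit d (0-9)."""
    gamma = p % 10          # rightmost digit (RMD)
    diff = gamma - d        # its sign decides the direction of the correction
    q = p - gamma + d       # RMD replaced by d
    if diff > 5:
        q += 10             # raise the MD; a carry reaches the LMD automatically
    elif diff < -5:
        if q >= 10:
            q -= 10         # lower the MD; a borrow reaches the LMD automatically
        # Assumption: a single-digit cover value keeps the plain replacement.
    elif q > 255:
        q -= 10             # falloff boundary: lower the MD
    return q
```

Under these assumptions, `embed_digit(29, 1)` gives 31 and `embed_digit(21, 9)` gives 19, a change of 2 in each case instead of 8, while `embed_digit(255, 9)` gives 249 to stay within range.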

3.4 Changes in pixel values after hiding data

The amount of distortion introduced into the cover pixels by the proposed method has been computed mathematically for each case of Fig. 2. Let the original pixel value be p = 100α + 10β + γ and the new pixel value be p’ = 100α’ + 10β’ + γ’.

[figures d and e: case-by-case expressions for p’ in terms of p, one for each case of Fig. 2]

This shows that pixel values change by 0–4 or 0–5 in almost all cases. A change of up to 9 is rare and occurs only when solving the falloff boundary problem or when the cover pixel value has a single digit.
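The bounds above can be checked with a short case analysis. The following is a reconstruction from the prose of Section 3.3, not the authors' original derivation; it assumes the difference is defined as δ = γ − d and ignores the two boundary situations, which are treated separately afterwards.

$$ p^{\prime }-p=\begin{cases}-\delta , & \left|\delta \right|\le 5\\ 10-\delta , & \delta >5\\ -10-\delta , & \delta <-5\end{cases}\qquad \Rightarrow \qquad \left|p^{\prime }-p\right|=\begin{cases}\left|\delta \right|\in \left\{0,\dots ,5\right\}, & \left|\delta \right|\le 5\\ 10-\left|\delta \right|\in \left\{1,\dots ,4\right\}, & \left|\delta \right|>5\end{cases} $$

In the falloff case (|δ| ≤ 5 and p − δ > 255), the MD is lowered and p’ = p − δ − 10, so |p’ − p| = 10 − |δ| ∈ {5, …, 9}. For a single-digit cover value with δ < −5, assuming the digit is simply replaced, |p’ − p| = |δ| ∈ {6, …, 9}.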

3.5 Data extraction

A digit is extracted from a stego pixel by taking the pixel value modulo 10. To recover text or a grayscale image, digits are extracted sequentially and grouped into two-digit values. To recover the text, values (y) 00–95 are converted back to ASCII values using ⌈ (y*(ub - lb)/max + lb) ⌉ and values 96–99 are converted to the respective control characters. Values 00–64 are used to recover the grayscale image. For a color data image, the planes are recovered by extracting pixels (00–64) from every two consecutive pixels of the corresponding stego plane.
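Text extraction can be sketched as follows. The function names are hypothetical, the input is assumed to be the stego component values already listed in embedding order (blue, green, red per pixel), and the ceiling is evaluated with integer arithmetic.

```python
# Hypothetical helper names; constants and control codes follow Section 3.1.
LB, UB, MAX = 32, 126, 95
CONTROL_CHARS = {96: "\r", 97: "\n", 98: "\t", 99: "\v"}


def extract_digits(stego_values):
    """One hidden digit per stego component: the value modulo 10."""
    return [v % 10 for v in stego_values]


def decode_value(y):
    """Invert the compression: ceil(y * (ub - lb) / max + lb), or a control character."""
    if y in CONTROL_CHARS:
        return CONTROL_CHARS[y]
    # -(-a // b) is the integer ceiling of a / b.
    return chr(-(-y * (UB - LB) // MAX) + LB)


def digits_to_text(digits):
    """Group the digits in pairs and decode each two-digit value."""
    pairs = zip(digits[0::2], digits[1::2])
    return "".join(decode_value(10 * hi + lo) for hi, lo in pairs)
```

For example, the component values 120, 41, 7 and 33 carry the digits 0, 1, 7, 3, that is, the two-digit values 01 and 73, which decode to "!" and "i".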

4 Results and analysis

Standard 512 × 512 color test images in JPG format, obtained from the SIPI image database (http://sipi.usc.edu/database/database.php?volume=misc), have been used as cover images. The cover images are shown in Fig. 3 (a–h). Fig. 4 presents the stego images, with PSNR values, resulting from embedding 393,216 bytes (3,145,728 bits) in the respective cover images using the proposed technique. PSNR measures imperceptibility in dB between cover C and stego S for X × Y images using Eq. (10).

$$ PSNR=10\ast {\log}_{10}\ \frac{X\ast Y\ast 255\ast 255}{\sum \limits_{i=1}^X{\sum}_{j=1}^Y{\left({C}_{ij}-{S}_{ij}\right)}^2} $$
(10)
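Eq. (10) translates directly into code. The sketch below works on a single plane stored as a 2-D list; how the three color planes are combined is not stated in the text, so that step is left out. The name `psnr` and the infinite value returned for identical images are choices made here.

```python
import math


def psnr(cover, stego):
    """PSNR in dB between two equally sized 2-D lists of 8-bit values, as in Eq. (10)."""
    squared_error = sum(
        (c - s) ** 2
        for cover_row, stego_row in zip(cover, stego)
        for c, s in zip(cover_row, stego_row)
    )
    if squared_error == 0:
        return math.inf  # identical images: the ratio in Eq. (10) is unbounded
    n = len(cover) * len(cover[0])  # X * Y
    return 10 * math.log10(n * 255 * 255 / squared_error)
```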
Fig. 3 512 × 512 color cover images: (a) Lena, (b) Baboon, (c) Pepper, (d) Jet, (e) Boat, (f) House, (g) Barbara, (h) Pot

Fig. 4 Stego images on hiding 3,145,728 bits in respective cover images using the proposed technique

The Quality Index (QI), a measure of similarity, is computed using Eq. (11), where Cm and Sm are the average pixel values of the cover and stego images, respectively.

$$ QI=\frac{4\ast {C}_m\ast {S}_m\ast \left\{{\sum}_{i=1}^X{\sum}_{j=1}^Y\left({C}_{ij}-{C}_m\right)\ \left({S}_{ij}-{S}_m\right)\right\}}{\left\{\ {\sum}_{i=1}^X{\sum}_{j=1}^Y{\left({C}_{ij}-{C}_m\right)}^2+{\sum}_{i=1}^X{\sum}_{j=1}^Y{\left({S}_{ij}-{S}_m\right)}^2\ \right\}\ast \left\{{C_m}^2+{S_m}^2\right\}} $$
(11)
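Eq. (11) can be evaluated as in the sketch below, again for a single plane given as a 2-D list. The function name is hypothetical; note that the expression is undefined for a completely flat image (zero variance), which the sketch does not guard against.

```python
def quality_index(cover, stego):
    """Quality Index of Eq. (11) for two equally sized 2-D lists of pixel values."""
    c = [v for row in cover for v in row]
    s = [v for row in stego for v in row]
    c_mean = sum(c) / len(c)
    s_mean = sum(s) / len(s)
    covariance_sum = sum((a - c_mean) * (b - s_mean) for a, b in zip(c, s))
    variance_sum = sum((a - c_mean) ** 2 for a in c) + sum((b - s_mean) ** 2 for b in s)
    return (4 * c_mean * s_mean * covariance_sum) / (
        variance_sum * (c_mean ** 2 + s_mean ** 2)
    )
```

For identical images the index equals 1, and it falls below 1 as the stego image departs from the cover.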

The results of the existing LSB + PVD techniques and the proposed technique are presented in Tables 1 to 3. In Table 2, the PSNR for MFPVD [24] is provided for 840,000 bits, whereas for the proposed technique the PSNR has been computed for 3,145,728 bits. The proposed technique has achieved a relatively high capacity of 4 bpp with a considerably good PSNR of 38.74 dB (approximately 39 dB). A comparison of the proposed technique with QVD techniques for a payload of 700,000 bits is presented in Table 4. QVD [24] and the proposed technique have attained the same capacity of 4 bpp. To compare the two, the median capacity has been computed for both techniques, as the median is more robust to outliers. The median capacity obtained for the proposed technique and the QVD technique [24] is 3,145,728 bits and 3,140,444 bits, respectively. Therefore, both the mean capacity (as shown in Table 4) and the median capacity of the proposed technique are higher than those of [24].

Table 1 Results of existing LSB + PVD techniques
Table 2 Results of other LSB + PVD techniques

The results of Tables 1–4 are summarized in Figs. 5 and 6. Fig. 5 shows that the proposed technique has outperformed the existing LSB + PVD techniques in terms of capacity, attaining 4 bpp. Additionally, the PSNR of about 39 dB at 4 bpp obtained with the proposed technique is fairly high and comparable to that of MFPVD [24], which shows lower capacity. With its higher median and mean capacity, the proposed technique has also outperformed QVD [24], as shown in Fig. 6. However, QVD + LSB [25] has achieved the highest capacity in the available literature, 4.55 bpp. Thus, to the best of our knowledge, the proposed technique has attained the second-best capacity in the available literature (3,145,728 bits, or 4 bpp) and the highest PSNR of 45.18 dB (for 700,000 bits). The trade-off of about 39 dB PSNR at 4 bpp has not been found in the available literature. The proposed digit hiding technique has also outperformed the existing digit-based hiding techniques, as shown in Table 5.

Table 3 Results of the other and the proposed technique
Table 4 Results of existing QVD techniques and the proposed technique for 700,000 bits
Fig. 5 Comparison of capacity and PSNR for different techniques

Fig. 6 Comparison of PSNR (for 700,000 bits) and maximum capacity achieved using different techniques

Table 5 Comparison of the proposed technique with existing digit hiding techniques

The efficiency of the proposed technique at embedding rates of 20%, 40%, 60%, 80% and 100% over the set of cover images (Fig. 3) is presented in Fig. 7. It shows a PSNR of approximately 46 dB at an embedding rate of 0.2 (629,144 bits), approximately 43 dB at 0.4 (1,258,288 bits) and about 41 dB at 0.6 (1,887,440 bits). The PSNR decreases as the embedding rate increases, reaching about 38.74 dB at the 100% rate (3,145,728 bits). At each embedding rate, the proposed technique has resulted in a considerably high PSNR.
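The payload sizes quoted for the partial embedding rates are consistent with taking the stated fraction of the 393,216-byte capacity and rounding to whole bytes. That rounding rule is an inference from the numbers, not something the text states; the sketch below, with a hypothetical function name, reproduces the three quoted values under that assumption.

```python
CAPACITY_BYTES = 512 * 512 * 3 // 2  # 393,216 bytes for a 512 x 512 RGB cover


def payload_bits(rate):
    """Payload in bits at a given embedding rate (assumed: rounded to whole bytes)."""
    return round(rate * CAPACITY_BYTES) * 8
```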

Fig. 7 Efficiency of the proposed technique over different embedding rates

The proposed technique has also been extended to hide an image in a color image. Fig. 8 (a1–h1) presents the data images: 512 × 512 grayscale and 256 × 512 color images. Table 6 presents the results of hiding these images in Lena and Baboon. It shows a fairly good PSNR of about 40.6 dB and 38.8 dB on hiding grayscale and color images, respectively. Figs. 9 and 10 show the stego images, with PSNR values, on hiding the grayscale (Fig. 8 (a1–d1)) and color (Fig. 8 (e1–h1)) data images, respectively, in Lena and Baboon.

Fig. 8 512 × 512 grayscale and 256 × 512 color data images

Table 6 PSNR on hiding grayscale and color data images in Lena and Baboon

Fig. 9 Stego images resulting from hiding grayscale images (Fig. 8 (a1–d1)) in Lena and Baboon

Fig. 10 Stego images on hiding color images (Fig. 8 (e1–h1)) in Lena and Baboon

For the proposed technique, the time complexity depends upon the size n of the secret data: embedding one data unit alters the digits of a pixel, which takes O(1) time, so the overall time complexity is O(n). Table 7 compares the time complexity of the proposed technique with that of other techniques. The embedding and extraction times for 700,000 bits are also provided in the table.

Table 7 Time complexity of different techniques

5 Security analysis

To evaluate the security of the proposed method, modified Weighted Stego-Image (WS) steganalysis methods [2, 11] have been used. Fridrich introduced the standard WS method [6] to estimate the number of random embedding changes made using LSB steganography in the spatial domain. It outperformed both RS and Sample Pairs Analysis (SPA) for embedding rates near 100%. Ker and Böhme [11] improved the standard method and specialized it to detect sequential embedding, outperforming structural detectors. Böhme further enhanced WS for never-compressed and JPEG pre-compressed covers [4] and proposed a WS variant [17] to detect content-adaptive embedding. The specialized WS method for sequential embedding [11] was upgraded in [2]. Both the upgraded and the specialized WS methods have been used for the security analysis of the proposed technique, which also embeds data sequentially.

Suppose that a cover image C consists of n samples, with n = X × Y and C = (c1, …, cn). A payload of length q ≤ n is embedded by LSB substitution, resulting in a stego image S = (s1, …, sn). Let \( \overline{S} \) be the stego image with every sample’s LSB flipped, i.e. \( \overline{s_i}={s}_i+1-2\left({s}_i\bmod 2\right) \), and for z ∈ [0, 1], let Sz be the weighted stego image formed using Eq. (12).

$$ {s}_i^z=z\overline{s_i}+\left(1-z\right){s}_i $$
(12)
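The two definitions above amount to the following small helpers (the names are hypothetical):

```python
def flip_lsb(s):
    """Sample with its least significant bit flipped: s + 1 - 2 * (s mod 2)."""
    return s + 1 - 2 * (s % 2)


def weighted_stego(s, z):
    """Weighted stego sample of Eq. (12): z * flipped + (1 - z) * original."""
    return z * flip_lsb(s) + (1 - z) * s
```

An even sample such as 100 flips to 101 and an odd one such as 101 flips to 100; with z = 1/2 both give the midpoint 100.5.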

As per Theorem 1 [6], the weighted stego image Sz is closest to the cover, with the difference between Sz and C measured using the Euclidean L2 norm, when z = r/2. Thus, the embedding rate \( r=\frac{q}{n} \) can be estimated as r’ from the stego image using Eq. (13), which finds the z that minimizes the distance between the cover and the weighted stego image.

$$ r^{\prime }=2\,{\arg \min}_zE(z),\quad E(z)=\sum \limits_{i=1}^n{\left({s}_i^z-{c}_i\right)}^2 $$
(13)

Since the cover is unavailable for steganalysis, the WS method estimates ci using F(si), a function of the neighbors of si excluding si itself. The embedding rate is then estimated using Eq. (14); setting the derivative with respect to z to zero, and using \( {\left({s}_i-\overline{s_i}\right)}^2=1 \), gives the closed form in Eq. (15). In [6], the cover pixel predictor F(si) estimated ci as the mean of the four closest neighbors of si. However, this predictor was more accurate in flat areas than in noisy areas of the image. Thus, to improve performance, weights wi for each stego pixel (with relatively less weight for pixels in noisy areas) were introduced in expression (15) such that Ʃi wi = 1. This yielded the estimate in Eq. (16) (the unweighted expression (15) corresponds to each wi = 1/n).

$$ r^{\prime }=2\,{\arg \min}_zE(z),\quad E(z)=\sum \limits_{i=1}^n{\left({s}_i^z-F\left({s}_i\right)\right)}^2 $$
(14)
$$ r^{\prime }=2\,{\arg \min}_zE(z)=\frac{2}{n}\sum \limits_{i=1}^n\left({s}_i-F\left({s}_i\right)\right)\left({s}_i-\overline{s_i}\right) $$
(15)
$$ r^{\prime }=2\,{\arg \min}_z\sum \limits_{i=1}^n{w}_i{\left({s}_i^z-F\left({s}_i\right)\right)}^2=2\sum \limits_{i=1}^n{w}_i\left({s}_i-F\left({s}_i\right)\right)\left({s}_i-\overline{s_i}\right) $$
(16)
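The closed-form estimate 2 Ʃi wi (si − F(si))(si − s̄i) of Eq. (16), with the unweighted Eq. (15) as the default, can be sketched as follows. Samples are passed as flat lists together with their predicted cover values; the function names are hypothetical, and `mean4_predictor` is only defined for interior pixels.

```python
def flip_lsb(s):
    """Sample with its least significant bit flipped."""
    return s + 1 - 2 * (s % 2)


def mean4_predictor(img, i, j):
    """Mean of the four closest neighbours of interior pixel (i, j), as in [6]."""
    return (img[i - 1][j] + img[i + 1][j] + img[i][j - 1] + img[i][j + 1]) / 4


def ws_estimate(stego, predicted, weights=None):
    """Estimated embedding rate from stego samples and predicted cover values."""
    n = len(stego)
    if weights is None:
        weights = [1 / n] * n  # unweighted case: every w_i equals 1/n
    return 2 * sum(
        w * (s - f) * (s - flip_lsb(s))
        for w, s, f in zip(weights, stego, predicted)
    )
```

With a perfect predictor, each flipped sample contributes 1 to the sum and each untouched sample 0, so one flipped sample out of four gives an estimate of 0.5: LSB substitution flips about half of the samples it visits.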

To keep the variance of the estimate low, it was suggested that wi be proportional to 1/(1 + σi²), where σi² is the local variance of the four neighboring pixels. Ker [11] further improved the WS components to improve detection. An adaptive convolution filter F of the symmetrical form (17), given by (18), was proposed to enhance pixel prediction. Weights wi ∝ 1/(5 + σi²) were suggested in place of those in [6] to reduce the higher weights that over-emphasized the flat areas of the image.

$$ {\displaystyle \begin{array}{ccc}b& a& b\\ {}a& 0& a\\ {}b& a& b\end{array}} $$
(17)
$$ {\displaystyle \begin{array}{ccc}-1/4& 1/2& -1/4\\ {}\ 1/2& 0&\ 1/2\\ {}-1/4& 1/2& -1/4\end{array}} $$
(18)
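Applying the filter in (18) to predict a pixel, and turning local variances into normalized weights, can be sketched as below. The function names are hypothetical, only interior pixels are handled, and the local variances are taken as given rather than computed.

```python
# Kernel of expression (18); the centre is zero so a pixel never predicts itself.
KERNEL = [
    [-0.25, 0.5, -0.25],
    [0.5, 0.0, 0.5],
    [-0.25, 0.5, -0.25],
]


def predict_pixel(img, i, j):
    """Predicted cover value of interior pixel (i, j) from its eight neighbours."""
    return sum(
        KERNEL[di + 1][dj + 1] * img[i + di][j + dj]
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
    )


def normalised_weights(local_variances, offset=5):
    """Weights proportional to 1 / (offset + variance), scaled to sum to 1."""
    raw = [1 / (offset + v) for v in local_variances]
    total = sum(raw)
    return [r / total for r in raw]
```

The kernel entries sum to 1, so a flat neighborhood is predicted exactly; `offset=1` gives the weighting suggested in [6] and `offset=5` the one suggested in [11].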

A specialized WS method for detecting sequential embedding [11] is described next. For a payload hidden in the first t samples, z is fixed at 1/2 for those samples and at 0 for the rest. Using (12) and (14),

$$ E(t)=\sum \limits_{i=1}^t{\left(\frac{1}{2}\left({s}_i+\overline{s_i}\right)-F\left({s}_i\right)\right)}^2+\sum \limits_{i=t+1}^n{\left({s}_i-F\left({s}_i\right)\right)}^2 $$
(19)

is minimized in expectation at t = q. However, its derivative has no closed form and the function can have multiple minima, so all values of t must be tried to locate the minimum. Computing the sum (19) directly for each t = 0, …, n would need O(n²) operations. Instead, note that the linear recurrence

$$ {e}_0=0,\quad {e}_t={e}_{t-1}+{\left(\frac{1}{2}\left({s}_t+\overline{s_t}\right)-F\left({s}_t\right)\right)}^2-{\left({s}_t-F\left({s}_t\right)\right)}^2 $$

generates \( {e}_t=E(t)-{\sum}_{i=1}^n{\left({s}_i-F\left({s}_i\right)\right)}^2 \); the minimum term et thus gives the minimum of E(t), and it takes linear time to generate and examine the sequence et. This linear-time algorithm has been used in this paper to estimate the embedding rate r = q/n using two cover pixel predictors.

As stated in [4, 11], WS steganalysis crucially depends on the accuracy of the pixel predictor. For the proposed method, the dynamic pixel predictor that uses the Canny Edge Detection algorithm [2] and the adaptive filter in expression (18) used in [11] have been used. The dynamic pixel predictor uses Canny Edge Detection to separate the pixels into edge and non-edge pixels. For a target pixel that is an edge pixel, only those neighbors that are edge pixels are used to predict ci. For a non-edge target pixel, ci is predicted using the non-edge neighbors. The estimated rates obtained with both predictors over stego images with 100% embedding rate (r = 1) are presented in Table 8. For the adaptive filter, the estimated rate r’ is negligible, i.e. the filter has largely failed to detect any message; for the dynamic pixel predictor, r’ is far lower than the true embedding rate r = 1. Fig. 11 compares the Mean Absolute Error (MAE) in rate estimation for different embedding rates over a set of N = 8 stego images, where MAE = (1/N) Ʃ |r’ − r|. It shows that the dynamic pixel predictor gives better results than the adaptive filter in estimating r. However, for both predictors the MAE is high and increases with r. Thus, for the proposed technique, Weighted Stego-Image steganalysis has shown poor detection performance.
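The linear-time search over t and the MAE used in Fig. 11 can be sketched as follows. The samples are assumed to be listed in embedding order together with their predicted cover values; the function names are hypothetical, and ties are resolved in favor of the smallest t.

```python
def flip_lsb(s):
    """Sample with its least significant bit flipped."""
    return s + 1 - 2 * (s % 2)


def sequential_ws(stego, predicted):
    """Return the t in 0..n that minimises E(t), via the recurrence for e_t."""
    best_t, best_e, e = 0, 0.0, 0.0  # e_0 = 0
    for t, (s, f) in enumerate(zip(stego, predicted), start=1):
        half = 0.5 * (s + flip_lsb(s))           # weighted stego sample at z = 1/2
        e += (half - f) ** 2 - (s - f) ** 2      # e_t from e_{t-1}
        if e < best_e:
            best_t, best_e = t, e
    return best_t


def mean_absolute_error(estimates, true_rate):
    """MAE = (1/N) * sum of |r' - r| over N estimates."""
    return sum(abs(r - true_rate) for r in estimates) / len(estimates)
```

The estimated embedding rate is then `sequential_ws(stego, predicted) / len(stego)`. For the toy input `[101, 101, 100, 100]` with every cover value predicted as 100, the sequence et is 0, −0.75, −1.5, −1.25, −1.0, so the minimum is at t = 2.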

Table 8 Estimated embedding rate over the actual rate of 1.0 using two predictors

Fig. 11 Mean Absolute Error (MAE) over different embedding rates using two detection schemes of WS steganalysis

6 Conclusion

This paper has proposed a novel digit-based image steganography technique, unlike the common techniques, which hide binary data. The technique has outperformed some existing spatial hiding techniques in terms of capacity and imperceptibility, achieving the second-best capacity in the available literature. Indeed, the trade-off attained between payload and imperceptibility at 100% embedding rate (about 39 dB PSNR at 3,145,728 bits, or 4 bpp) has not been found in the available literature. Also, the method can hide any data (text, a grayscale image or a color image) without incurring the falloff boundary problem. Moreover, the proposed linear-time technique secures the data through encryption and has resisted the Weighted Stego-Image steganalysis methods evaluated in Section 5.