Introduction and related work

Recently, Web- and Cloud-based medical information systems have dynamically entered the market of electronic healthcare systems. Thus, the exchange of medical data over the Internet and the storage of data in Cloud infrastructures are expected to become common practice in modern hospital information systems [1, 2]. In addition, online collaboration systems such as the one presented in [3] require the exchange of medical images over the Internet. A key performance indicator for the adoption of such systems is data security [4]. It must be ensured that medical data is protected both during transmission and at the storage site. In the case of medical images, sensitive data is embedded in image files as header information defined in the Digital Imaging and Communications in Medicine (DICOM) standard [5]. Medical image exchange over the Internet is also specified by the Web Access to DICOM Objects (WADO) standard, which involves a Web-based service for accessing and presenting DICOM persistent objects [5]. More specifically, WADO provides a simple mechanism for accessing a DICOM image from HTML pages or XML documents, through HTTP/HTTPS. Data can be retrieved either in a presentation-ready form as specified by the requestor (e.g., JPEG or GIF) or in a native DICOM format.

One of the main security concerns in the aforementioned case is guaranteeing the integrity and identity of the medical images [6]. Thus, both authenticating the origin of the image and proving that it was not modified before or during transmission are of crucial importance. Data hiding (specifically watermarking) techniques exist that may satisfy one or more of the above security requirements [6–9]. These techniques insert additional data inside the corresponding files and alter the images. Thus, an important requirement for exploiting data hiding methodologies is that the image must be of a sufficient size to conceal data. In addition, at least the regions of interest must be reversible, to prevent the loss of any diagnostic information.

As far as image watermarking techniques are concerned, they can be evaluated by the following metrics:

  • Capacity: Defined as the amount of information that can be embedded. For example, when using simple least significant bit (LSB) modification, the maximum amount of data (in bits) that can be hidden in an 8-bit image is equal to the number of bytes that compose the image.

  • Robustness: Defined as the resistance of the embedded information to image changes. For example, when using simple LSB modification, compression of the image would probably change the LSBs of the pixels and thus destroy the inserted data.

  • Imperceptibility: Defined as how well the data is hidden with regard to human senses.

  • Privacy/Security: Defined as the amount of secret knowledge needed for accessing the embedded information. For example, a secret key may be needed to extract the hidden data [10, 11].

A categorization of watermarking techniques applied in medical images was introduced by Coatrieux et al. in [10]. Three categories of watermarking algorithms are the most prominent. The first class of algorithms embeds the information into the region of non-interest (RONI), which is specified as the black background and possibly some grey regions of minor interest. The second class corresponds to reversible watermarking, which allows retrieval of the original image once the watermark is read. Finally, the third class corresponds to classical watermarking techniques, which modify small visual aspects like the LSBs of the pixel values, minimizing image distortion.

An additional categorization based on the purpose of medical image watermarking is presented by Al-Qershi et al. in [6]. More specifically, medical image watermarking can be divided into three schemes. The first, called an authentication scheme, aims at authenticating the origin of the image while also providing tamper detection and recovery of the pixels that have been altered. The second, called a data-hiding scheme, aims to conceal patient data with high imperceptibility. Finally, the third category is a combination of the first two.

As far as steganography is concerned, the main goal is to conceal the presence of hidden communication. Thus, perceptual and algorithmic undetectability is crucial. Furthermore, in image steganography the original image is referred to as the cover image while the image containing the hidden data as the stego image. In addition, with the exception of one major difference, steganography requirements are the same as the watermarking requirements described above. More specifically, in steganography the imperceptibility requirement refers to how well the data is hidden regarding not only visual recognition but also steganalysis algorithms. Nevertheless, it must be noted that while steganography has high demands in capacity and imperceptibility and low demands in robustness, watermarking has high demands in robustness and low demands regarding capacity [11].

One of the most common steganography methods is LSB replacement, which is also known as LSB substitution. This method simply overwrites the LSBs of the pixel values with the bits of the message, while the pixels to be manipulated are randomly chosen using a pseudorandom number generator (PRNG). Thus, the manipulated LSB of the pixel value will either be modified or remain unaltered. However, by embedding a uniformly distributed message, statistical patterns may appear on the image’s histogram leading to easy detection [12]. Another technique, named LSB matching, overcomes this issue by randomly applying a +1 or −1 operation when a pixel value must be modified. Furthermore, a method known as optimal pixel adjustment process (OPAP) was introduced by Chan et al. in [13], which provided much improved stego image quality compared to LSB replacement. While the above methods use only one pixel as an embedding unit, methods known as pixel pair matching (PPM) methods, which use pairs of pixels to hide the message, also exist. The message in these methods is represented using a specific B-ary notational system. A clear example is LSB matching revisited (LSBMR) introduced by Mielikainen in [12], which considers pairs of pixels and their relationship for insertion. More specifically, the LSB of the first pixel carries one bit while the odd-even relationship between the pixels is used to extract the other. LSBMR provides a mean square error (MSE) of 0.375 for an embedding rate of 1 bpp, which is a significant advantage compared to the MSE of 0.5 which LSB replacement provides [12]. In addition, improvements to LSBMR were introduced such as the exploiting modification direction (EMD) method formed by Zhang et al. in [14]. In EMD only one pixel in a pixel pair is changed, by at most one gray-scale unit, and a message digit in a 5-ary notational system can be embedded. Thus, a maximum payload of \( \frac{1}{2}\log_2 5 \approx 1.161 \) bpp is provided [14].
Furthermore, the diamond encoding (DE) method, which enhanced the payload of EMD, was introduced by Chao et al. in [15]. In addition, the adaptive pixel pair matching (APPM) formed by Hong et al. in [16] provided an even lower distortion rate than DE. The above algorithms do not take into account the way human sight reacts to changes in pixels with diverse content, such as smooth blocks of pixels and edges. Nevertheless, edge adaptive algorithms exist. One of the most common schemes used in such algorithms is pixel value differencing (PVD), in which the number of bits to be embedded is calculated based on the difference between a pixel and its neighbors [17–21]. Greater differences in relation to one’s neighbor indicate that more bits can be embedded [17]. A recently published method for steganography is the “Dual-Level Security based Cyclic18 Steganographic” method presented by Muhammad et al. in [22]. This method applies a variety of methods such as image scrambling, the use of custom encryption schemes and the insertion of the encrypted message using LSB and intermediate LSB substitution.
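The pixel-pair idea behind LSBMR can be illustrated with a short sketch. It follows the commonly cited formulation in which the second bit is read from the binary function f(a, b) = LSB(⌊a/2⌋ + b). The function names, the deterministic choice of +1 where the published method chooses +1 or −1 at random, and the boundary fallback for pixel values 0 and 255 are assumptions made for illustration only.

```python
from typing import Tuple


def f(a: int, b: int) -> int:
    """Binary function of a pixel pair: LSB of floor(a / 2) + b."""
    return ((a // 2) + b) & 1


def lsbmr_embed(x1: int, x2: int, m1: int, m2: int) -> Tuple[int, int]:
    """Hide bits m1 and m2 in the 8-bit pixel pair (x1, x2).

    Each pixel changes by at most one gray-scale unit.
    """
    y1, y2 = x1, x2
    if (x1 & 1) != m1:
        # Both x1 - 1 and x1 + 1 fix the first bit; exactly one of them
        # also makes f(...) equal to m2, so prefer that one.
        y1 = x1 - 1 if f(x1 - 1, x2) == m2 else x1 + 1
        # Illustrative boundary fallback (assumption): stay inside 0..255.
        if y1 < 0:
            y1 = x1 + 1
        elif y1 > 255:
            y1 = x1 - 1
    if f(y1, y2) != m2:
        # Moving the second pixel by one unit flips f. The published method
        # picks the direction at random; this sketch is deterministic.
        y2 = x2 + 1 if x2 < 255 else x2 - 1
    return y1, y2


def lsbmr_extract(y1: int, y2: int) -> Tuple[int, int]:
    """Recover the two hidden bits from a stego pixel pair."""
    return y1 & 1, f(y1, y2)
```

For example, embedding the bits (1, 0) in the pair (100, 57) changes only the first pixel, giving (99, 57), from which `lsbmr_extract` returns (1, 0).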

The algorithm proposed in this work is an edge adaptive scheme, which also belongs to the category of Secret Sharing algorithms. It is built on the Secret Sharing methods given by Yuan in [23]. An (n,k) Secret Sharing scheme/algorithm “inserts” the message into n shares. Subsequently, extraction of the message can be accomplished by using x ≥ k shares, while extraction using fewer than k shares is computationally infeasible. Furthermore, none of the n shares by itself reveals any information about the message. The proposed algorithm is applied to DICOM images and inserts sensitive patient data, recovery data needed to recover the region of interest (ROI) of the image, and authentication data needed to validate the integrity of the ROI of the image and of the patient’s data. The proposed implementation takes into account the fact that DICOM images are of considerable size and contain large RONI with zero diagnostic value.

The rest of the paper is structured as follows: In Section 2 we present an overview and the distinct modules of the insertion and extraction schemes, while in Section 3 a use case application of the proposed insertion scheme is provided. In Section 4 experimental results are provided and in Section 5 we discuss the results and conclude the paper.

The proposed steganography methodology

In this Section we provide the technical details of the proposed steganography method for the DICOM medical images. In the rest of this paper the following conventions are made:

  • Sensitive patient data (extracted from the DICOM Tags included in image header) that is chosen for insertion is referred to as ‘data’.

  • Data used to recover the ROI part of the image is referred to as ‘recovery data’. More specifically the ‘recovery data’ consists of:

    • The ‘map’ which is a mapping of the pixels modified in the ROI during insertion and is crucial for the recovery of the ROI.

    • The ‘size of map’ which is a variable that represents the size of the ‘map’. This variable consists of 16 bits.

    • The ‘start row’ which contains the value of the starting row of the ROI. This variable consists of 10 bits. For example, if this variable has the value ‘3’ it is represented as ‘0000000011’. It must be noted that 16 and 10 bits are more than adequate to store the values of the ‘size of map’ and ‘start row’ respectively.

  • Data used to validate the origin and the integrity of the image is referred to as ‘authentication data’. The ‘authentication data’ actually consists of two hashes, one for the ROI part of the image (‘hash ROI’) and one for the ‘data’ hidden (‘hash data’).

  • The encrypted form of the ‘data’ and ‘authentication data’ is referred to as ‘encrypted data’. More specifically, the above are encrypted to ensure confidentiality even if steganography is detected.
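The fixed-width fields of the ‘recovery data’ can be illustrated as follows. This is a minimal sketch: the function names are hypothetical, and the most-significant-bit-first ordering is an assumption.

```python
from typing import Tuple

START_ROW_BITS = 10    # width of the 'start row' field
SIZE_OF_MAP_BITS = 16  # width of the 'size of map' field


def to_bits(value: int, width: int) -> str:
    """Encode a non-negative integer as a fixed-width binary string (MSB first)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in {} bits".format(width))
    return format(value, "0{}b".format(width))


def encode_recovery_fields(start_row: int, size_of_map: int) -> Tuple[str, str]:
    """Return the bit strings for 'start row' and 'size of map'."""
    return to_bits(start_row, START_ROW_BITS), to_bits(size_of_map, SIZE_OF_MAP_BITS)


def decode_field(bits: str) -> int:
    """Decode a fixed-width binary string back to an integer."""
    return int(bits, 2)
```

With these widths the ‘start row’ can address rows 0 to 1023 and the ‘size of map’ can hold values up to 65535.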

The insertion scheme

An overview of the modules that comprise the proposed insertion scheme is illustrated in Fig. 1.

Fig. 1 Modules that comprise the insertion scheme

The following steps comprise the proposed insertion algorithm:

Insertion of the original image

At this step the user inputs the medical image in DICOM format. The data to be hidden consists of the sensitive elements of the DICOM header (i.e. patient name and demographic data, referring physician data etc.). The exact definition of data to be hidden is achieved in the next step either manually by the user or using pre-defined rules.

DICOM header fields selection

In this step the fields of the DICOM header which need to be hidden and anonymized are determined. The majority of the sensitive DICOM header information is placed under the patient information group (‘0010’). Nevertheless, additional fields exist which could be used to identify the patient directly, or indirectly through a combination of other fields. Supplement 142 of the DICOM standard provides guidelines that must be followed in order to properly de-identify an image. Because the list of fields that must be anonymized according to Supplement 142 is very long, shorter lists such as the one provided in [24] can be used. This list (see Table 1) was utilized in order to evaluate the efficiency of 10 free DICOM anonymizers and thus adequately describes a minimum subset of fields that should be anonymized in order to protect the identity of the patient. There is no need to store both the group-element and Tag description values, as only one of them is required to define the field.

Table 1 Minimum subset of fields that should be anonymized in order to protect the identity of the patients

After the selection of fields, the algorithm is able to determine the maximum available ROI. This is because the ROI is defined in this paper as the part of the image that does not belong to the RONI, and the size of the RONI can be calculated once the ‘data’ is selected. How the RONI size is defined will be clarified below. In this scheme, the maximum ROI is considered; however, a smaller ROI could be specified manually. In Fig. 2 an example of the ROI and RONI is given. More specifically, the RONI contains two parts: one that consists of the first rows of the image, which will be referred to as ‘RONI part 1’ in the rest of this paper, and one that consists of the last rows, which will be referred to as ‘RONI part 2’. The above describes the way in which the ROI and RONI are defined throughout this paper. Nevertheless, if the first or last rows of the image contain valuable information, different regions (and subsequently shares) can be selected, e.g., windows at the corners, with the sole restriction that the information needed to locate every share must be provided to the user. As aforementioned, in our case this information is provided through the ‘start row’ variable. This variable could be substituted by a set of points (vertices) that locate the shares of the ROI and RONI.

Fig. 2 Example of ROI and RONI definition: (a) RONI part 1, (b) ROI, (c) RONI part 2

Creation of the ‘authentication data’

In the proposed scheme, calculation of the ‘authentication data’ must be accomplished using a secure cryptographic hash function due to the significance of the data hidden. Nevertheless, considering the fact that the main goal of this algorithm is to ensure integrity and since the hashes are encrypted and thus only retrievable by the legitimate user, there is no need to choose a keyed cryptographic hash function, which although more secure is less efficient. In the proposed scheme two hash values, ‘hash data’ and ‘hash ROI’, are calculated in order to confirm the integrity of the patient’s data and of the ROI part of the image respectively.

Encryption of ‘data’ and ‘authentication data’

The encryption of the ‘data’ and ‘authentication data’ can be implemented using any private (symmetric) or public key cryptography algorithm. Obviously, the user must have provided the corresponding encryption key prior to this step. Public key cryptography could be used in a scenario in which key pairs have been assigned to the doctors of a specific hospital. Thus, by applying encryption with the public key of the recipient it would be assured that only that specific user (using his private key) could decrypt the data. The drawback of this choice is the higher demand for resources of public key cryptography in comparison to private key cryptography. Private key cryptography, in turn, could be used in a scenario in which the key is also used as a session key during transmission of the image among the communicating parties.

A description of how the maximum available ROI is calculated is presented below. It should be noted that the ‘recovery data’ is inserted at the RONI, while the ‘encrypted data’ at the ROI.

Supposing that the ‘data’ is x bits in size and every hash consists of y bits, there will be a total of \( x + 2y \) unencrypted data bits. If a stream cipher is used, the encrypted data will also occupy \( z = x + 2y \) bits. In contrast, if a block cipher is used, the encrypted data will occupy \( z = \left\lceil (x + 2y) / \mathit{block\_size} \right\rceil \cdot \mathit{block\_size} \) bits. In addition, taking into account that 2 bits per map entry are needed, that 10 bits are used to represent the ‘start row’ and that 16 bits are used to represent the ‘size of map’, there will be a total of \( 2z + 26 \) bits of insertion at the RONI. Furthermore, insertion of the ‘start row’ takes place considering two shares (share 1 and share 2) by dividing ‘RONI part 1’ vertically, and the insertion of the ‘size of map’ and ‘map’ takes place considering four shares (shares 1 to 4) by dividing both parts of the RONI vertically (see Fig. 3). Thus, since two bits per pixel are inserted using the 2LSB scheme, the RONI consists of \( \left\lceil (z + 13) / \left\lfloor \mathit{width\_of\_image} / 2 \right\rfloor \right\rceil \cdot 2 \) rows and the ROI consists of the remaining image rows. The reason why the ‘start row’ is inserted using only the first two shares of the RONI is that there is no way of knowing where the remaining shares (3 and 4) start, as this information is given by the ‘start row’ variable itself. To be more precise, the ROI starts at ‘start row’ and ends ‘start row’ rows before the last row of the image. Thus, if the image consists of rows 0 to N−1 (where N is the number of rows in the image), then shares 1 and 2 of the RONI end at row (‘start row’ − 1) while shares 3 and 4 of the RONI start after the last row of the ROI.
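The size calculation above can be sketched in a few lines. The function and parameter names are illustrative and not part of the described implementation; sizes are given in bits.

```python
from typing import Optional


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive integers."""
    return -(-a // b)


def encrypted_size_bits(x: int, y: int, block_size: Optional[int] = None) -> int:
    """Size z of the 'encrypted data' for x data bits and two hashes of y bits.

    block_size=None models a stream cipher (no expansion); otherwise the
    plaintext is rounded up to a whole number of cipher blocks.
    """
    plain = x + 2 * y
    if block_size is None:
        return plain
    return ceil_div(plain, block_size) * block_size


def roni_rows(z: int, image_width: int) -> int:
    """Number of image rows occupied by the RONI (both parts together).

    2*z + 26 bits are inserted at two bits per pixel position, which gives
    z + 13 positions; each row of a vertically halved RONI part offers
    floor(width / 2) positions.
    """
    positions_per_row = image_width // 2
    return ceil_div(z + 13, positions_per_row) * 2
```

For instance, `encrypted_size_bits(1808, 256, 128)` returns 2432 and `roni_rows(2432, 512)` returns 20, which are the values obtained in the use case presented later in the paper.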

Fig. 3 Available shares in the RONI: (a) share 1, (b) share 2, (c) share 3 and (d) share 4

Insertion of ‘encrypted data’ using the LSB scheme and creation of ‘recovery data’

The insertion of the ‘encrypted data’ is done by adopting the LSB method presented by Yuan in [23]. Two methods of insertion are proposed in [23]: the former manipulates the LSBs of the cover (original) images, while the latter manipulates both the LSBs and the second LSBs. The author showed that these methods had excellent results against steganalysis when natural images are used. In this work, slightly modified versions of these methods are applied to medical images. As stated in [23], the complexity of both the LSB and 2LSB (n,n) secret sharing methods is proportional to the amount of data hidden and the number of shares used. Thus, for both methods, if the amount of data inserted is equal to “x” and “n” shares are used, the computational complexity of each method is O(x*n). The same applies to the slightly modified versions used.

The LSB method first creates the shares that will be used, i.e., the sets of pixels that make up every cover image. Then the Sobel operators are applied to each share [25]. Afterwards, insertion of the message occurs bit by bit. More specifically, for every bit of insertion the XOR of the corresponding LSBs of the pixels in every share is calculated. If the result differs from the current bit of insertion, then LSB matching is applied to the pixel which has the largest gradient magnitude among the pixels of the shares that are currently processed. This guarantees that the pixel that is most robust against steganalysis will be modified. Reconstruction of the hidden message is simply achieved by applying XOR operations to the LSBs of the pixels of the stego shares [23].
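The insertion and extraction steps just described can be sketched as follows. This is a simplified illustration rather than the implementation of [23]: shares are flat lists of 8-bit pixel values, the gradient magnitudes are assumed to be precomputed, the ±1 direction comes from a plain `random.Random` instance, and pixels at 0 or 255 are simply moved inwards.

```python
import random
from typing import List


def embed_bits(shares: List[List[int]], gradients: List[List[float]],
               bits: List[int], rng: random.Random) -> List[List[int]]:
    """Embed one bit per pixel position across all shares.

    The XOR of the LSBs at position i must equal bits[i]; if it does not,
    the pixel with the largest gradient magnitude at that position is
    changed by +1 or -1 (LSB matching).
    """
    stego = [list(share) for share in shares]
    for i, bit in enumerate(bits):
        parity = 0
        for share in stego:
            parity ^= share[i] & 1
        if parity == bit:
            continue  # the shares already encode this bit
        # Index of the share whose pixel lies on the strongest edge.
        k = max(range(len(stego)), key=lambda j: gradients[j][i])
        pixel = stego[k][i]
        if pixel == 0:
            step = 1            # avoid underflow
        elif pixel == 255:
            step = -1           # avoid overflow
        else:
            step = rng.choice((-1, 1))
        stego[k][i] = pixel + step
    return stego


def extract_bits(stego: List[List[int]], n_bits: int) -> List[int]:
    """Recover the message as the XOR of the LSBs of all stego shares."""
    message = []
    for i in range(n_bits):
        parity = 0
        for share in stego:
            parity ^= share[i] & 1
        message.append(parity)
    return message
```

Because extraction needs the pixels of every share at each position, a single share on its own does not reveal the message, which is the (n,n) property referred to above.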

In order to reconstruct the original ROI, the proposed scheme requires a way of knowing whether a +1 or −1 operation was applied, since LSB matching chooses between them at random. To address this issue a binary sequence (the ‘PRNG binary’) is created using a secure PRNG with the cryptographic key provided by the user as a seed. A PRNG produces a sequence of pseudorandom numbers, and by using the same seed the same binary sequence will be produced during insertion and extraction. When a pixel needs to be modified, the corresponding value of the PRNG binary is checked. If this value is ‘1’, then a ‘+1’ operation is applied, otherwise a ‘−1’ operation. At this point it must be noted that, if public key cryptography is used, using only the public key of the recipient as a seed in the PRNG will produce the same PRNG binary values every time. To address this issue, an additional value could be added to the seed, or could be used alone instead of the public key. This value should be inserted with the rest of the data so that it is available during extraction.
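The reproducibility requirement on the PRNG binary can be illustrated with a small sketch. The generator below (SHA-256 applied to the seed followed by a counter) is only a stand-in chosen so that the example is deterministic and self-contained; it is an assumption and not the generator used in the described implementation, and it has not been vetted as a secure PRNG.

```python
import hashlib
from typing import List


def prng_binary(seed: bytes, n_bits: int) -> List[int]:
    """Return n_bits pseudorandom bits that depend only on the seed."""
    bits: List[int] = []
    counter = 0
    while len(bits) < n_bits:
        # Each block is the hash of the seed and an 8-byte block counter.
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for shift in range(7, -1, -1):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:n_bits]


def direction(prng_bit: int) -> int:
    """Map a PRNG bit to the LSB matching operation: 1 -> +1, 0 -> -1."""
    return 1 if prng_bit == 1 else -1
```

Calling `prng_binary(key, n)` with the same `key` during insertion and extraction yields the same sequence. In the public key scenario, the extra value mentioned above could be concatenated to the seed, for example `prng_binary(public_key + extra_value, n)`, where both names are hypothetical.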

The encrypted data is inserted using the LSB method and the PRNG binary starting at ‘start row’ which defines the starting row of the ROI. Two shares are considered by dividing the ROI vertically. This step will produce the extraction ‘map’ needed for the reconstruction of the image and will contain 2 bits for every bit inserted.

More specifically the combination:

  • ‘00’, states that no modification occurred.

  • ‘01’, states that a modification occurred in the first half (share 1) of the ROI.

  • ‘11’, states that a modification occurred in the second half (share 2) of the ROI.

  • ‘10’, states that a border value was modified using the opposite operation to the one indicated by the PRNG binary. For example, if the PRNG binary states that a −1 operation must take place and the value of the chosen pixel is 0, then a +1 operation is actually applied to avoid an underflow. Thus, during reconstruction, information is needed in order to apply a −1 operation and not a +1. The reconstruction process checks the pixel values of both shares and applies a −1 operation to the pixel that has the value ‘1’. At this point it must be noted that a special condition occurs if the pixel of the other share also has the value ‘1’, since during reconstruction it would be impossible to ascertain which pixel should be modified. Thus, in this rather rare case a ‘−1’ operation is applied to the pixel of the share that was not chosen for manipulation and the map value is subsequently set to ‘01’ or ‘11’. The equivalent applies in the case of possible overflow.
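One possible reading of the four map values, including the border case and its special condition, is sketched below for a single pixel position of two shares. The function names and data layout are assumptions for illustration; the authoritative description of the operations is the one given in Table 2.

```python
from typing import List, Tuple

MAX_VALUE = 255                 # 8-bit grayscale
SHARE_CODE = {0: "01", 1: "11"}  # index 0 is share 1, index 1 is share 2


def apply_change(pixels: List[int], chosen: int, prng_bit: int) -> Tuple[List[int], str]:
    """Modify one of two share pixels and return (new pixels, map value).

    pixels: [pixel of share 1, pixel of share 2]; chosen: index (0 or 1) of
    the pixel with the largest gradient magnitude.
    """
    out = list(pixels)
    step = 1 if prng_bit == 1 else -1
    other = 1 - chosen
    if 0 <= out[chosen] + step <= MAX_VALUE:
        out[chosen] += step
        return out, SHARE_CODE[chosen]
    # Border case: the PRNG direction would under/overflow the chosen pixel.
    flipped = out[chosen] - step          # 0 -> 1 or 255 -> 254
    if out[other] == flipped:
        # Special condition: '10' would be ambiguous during recovery, so the
        # other pixel is modified in the PRNG direction instead.
        out[other] += step
        return out, SHARE_CODE[other]
    out[chosen] = flipped
    return out, "10"


def reverse_change(pixels: List[int], code: str, prng_bit: int) -> List[int]:
    """Undo apply_change using the map value and the same PRNG bit."""
    out = list(pixels)
    step = 1 if prng_bit == 1 else -1
    if code == "01":
        out[0] -= step
    elif code == "11":
        out[1] -= step
    elif code == "10":
        # The modified pixel is the one holding 1 (underflow case) or
        # MAX_VALUE - 1 (overflow case); move it back to the border value.
        target = 1 if step == -1 else MAX_VALUE - 1
        out[out.index(target)] += step
    return out
```

In every branch exactly one pixel changes by one unit, so the XOR of the two LSBs flips as required, and `reverse_change` restores the original pair when it is given the same PRNG bit.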

Table 2 displays the operations that take place for 8-bit grayscale images during the LSB insertion and the equivalent map values. In the table it is assumed that the first share has the biggest gradient magnitude among the chosen pixels. Nevertheless, the equivalent operations take place when the pixels of the second share have the biggest gradient magnitude among the pixels chosen for manipulation.

Table 2 Description of pixel manipulation and map values during LSB insertion based on the values of the PRNG binary

Insertion of ‘recovery data’ using the 2LSB scheme

As stated earlier, Yuan in [23] also proposed a method which can modify both the LSB and the second LSB of the pixel values. In this step the ‘recovery data’ is inserted using this 2LSB insertion method. More specifically, the ‘recovery data’ consists of (see Fig. 4):

  • The ROI starting row (‘start row’) so it is known at which row extraction of the data using the LSB extraction method should start. This component is inserted using only the two shares that belong to the first part of the RONI.

  • The map’s size (‘size of map’) so that the ending point of the extraction is known during extraction using the 2LSB extraction method. This component is inserted using all shares of the RONI.

  • The ‘map’ so that the ROI part of the image can be reconstructed. This component is inserted using all shares of the RONI.

Fig. 4 Example of data for insertion using the 2LSB scheme: (a) ‘start row’, (b) ‘size of map’, (c) ‘map’

Anonymization of the header and stego image creation

At this step, the fields that the user chose for insertion are anonymized. The detailed workflow of the insertion scheme is presented in Fig. 5.

Fig. 5 Workflow of the insertion scheme: (1) Insertion of the original image, (2) DICOM header tags selection, (3) Creation of the ‘authentication data’, (4) Encryption of ‘data’ and ‘authentication data’, (5) Insertion of ‘encrypted data’ using the LSB scheme and creation of ‘recovery data’, (6) Insertion of ‘recovery data’ using the 2LSB scheme, (7) Anonymization of header and stego image creation

The extraction scheme

In this section an overview of the modules that comprise the proposed extraction scheme is presented. More specifically the extraction scheme can be divided into the following modules illustrated in Fig. 6:

Fig. 6 Modules that comprise the extraction scheme

Insertion of stego image

At this step the stego medical image is inserted by the user.

Extraction of the ‘recovery data’ using the 2LSB extraction scheme from the RONI

At this step starting at the first row, extraction of the ‘recovery data’ using the 2LSB extraction method takes place. More specifically, the following components are extracted:

  • The first row of the ROI (‘start row’), so that the starting row from where extraction of the encrypted data will take place, as well as the rows that constitute the ‘RONI part 2’, are known. This component is extracted using only the two shares that belong to the first part of the RONI.

  • The size of the map (‘size of map’), so that the algorithm knows when to stop extracting the map and subsequently the encrypted data. This component is extracted using all of the shares of the RONI.

  • The ‘map’, so that the ROI can be recovered. This component is extracted using all of the shares of the RONI.

Extraction of the ‘encrypted data’ using the LSB extraction scheme from the ROI

At this step, starting at the ‘start row’ row and for ‘size of map’ bits, extraction of the encrypted data using the LSB extraction method takes place.

Decryption of ‘encrypted data’

At this step decryption of the ‘encrypted data’ takes place. Again, the cryptographic key must have been provided by the user prior to this step. After the decryption of the ‘encrypted data’, the ‘data’, the ‘hash data’ (referred to as ‘old hash data’) and the ‘hash ROI’ (referred to as ‘old hash ROI’) are separated.

Confirmation of the integrity of the ‘data’

At this step the hash of the extracted ‘data’ (referred to as ‘new hash data’) is calculated and compared with the retrieved hash of the ‘data’ (‘old hash data’). If the two values differ, the integrity of the ‘data’ cannot be validated, as the ‘data’ has been altered, and consequently the user must be informed. Otherwise, if the values are identical, the integrity of the ‘data’ has been successfully validated.

Recreation of the original ROI of image

At this step recalculation of the original ROI of the image occurs. First, the PRNG binary is created using the cryptographic key as a seed. Then using the PRNG binary and the ‘map’ the following is applied for every entry of the map. If the map’s value is not ‘00’ then a modification during insertion has occurred. Thus, reversal of the operation that took place during insertion is required. Table 3 includes the operations that take place for 8-bit grayscale images during recovery of the ROI.

Table 3 Description of pixel manipulation during recovery of the ROI, based on the PRNG binary and extraction map values

Confirmation of the integrity of the ROI

At this step the hash of the ROI (referred to as ‘new hash ROI’) is calculated and compared with the retrieved hash of the ROI (‘old hash ROI’). If the two values differ, the integrity of the ROI cannot be validated, as the ROI has been altered, and consequently the user must be informed. Otherwise, if the values are identical, the integrity of the ROI has been successfully validated. The overall workflow of the extraction scheme is depicted in Fig. 7.
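Both integrity checks reduce to recomputing a hash and comparing it with the retrieved one. A minimal sketch follows, assuming SHA-256 (the hash function chosen in the use case of this paper), a decrypted payload laid out as ‘data’, then ‘hash data’, then ‘hash ROI’, and any cipher padding already removed. The layout and the function names are assumptions.

```python
import hashlib
import hmac
from typing import Tuple

HASH_BYTES = 32  # length of a SHA-256 digest


def split_payload(plain: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split the decrypted payload into data, old hash data and old hash ROI."""
    if len(plain) < 2 * HASH_BYTES:
        raise ValueError("payload too short to contain two hashes")
    data = plain[:-2 * HASH_BYTES]
    old_hash_data = plain[-2 * HASH_BYTES:-HASH_BYTES]
    old_hash_roi = plain[-HASH_BYTES:]
    return data, old_hash_data, old_hash_roi


def is_intact(content: bytes, old_hash: bytes) -> bool:
    """Recompute the hash of content and compare it with the retrieved hash."""
    new_hash = hashlib.sha256(content).digest()
    return hmac.compare_digest(new_hash, old_hash)
```

`is_intact(data, old_hash_data)` corresponds to the confirmation of the integrity of the ‘data’, and `is_intact(recovered_roi_bytes, old_hash_roi)` to that of the ROI, where `recovered_roi_bytes` is a hypothetical byte serialization of the recovered ROI pixels.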

Fig. 7 Workflow of the extraction scheme: (1) Insertion of the stego image, (2) Extraction of the ‘recovery data’ using the 2LSB extraction scheme from the RONI, (3) Extraction of the ‘encrypted data’ using the LSB extraction scheme from the ROI, (4) Decryption of ‘encrypted data’, (5) Confirmation of the integrity of the ‘data’, (6) Recreation of the original ROI of image, (7) Confirmation of the integrity of the ROI

The proposed algorithm in practice

In this section, the application of the proposed insertion scheme to DICOM medical images is presented. The DICOM medical image, which will be used as an example, is displayed in Fig. 8.

Fig. 8 Medical image in DICOM format tested at the use case

The DICOM tags/fields selected in this example for anonymization are provided in Table 4. As aforementioned, there is no need to store both the group-element and tag description values. Thus, only the group-element of each field along with its VR and value are stored.

Table 4 Sensitive patient data of header, chosen for anonymization at the use case

The hash function chosen for the specific implementation is SHA-256, which produces an output of 256 bits. The hash values (represented in hexadecimal) of the sensitive ‘data’ and the ROI are given in Fig. 9. The chosen ‘data’ size is 226 bytes. Thus, by adding the 32 bytes of the ‘hash data’ and the 32 bytes of the ‘hash ROI’, a total of 290 bytes, i.e., 2320 bits, of unencrypted data is obtained. Furthermore, as the application was developed in Java, the Java secure random generator (SecureRandom class) was used to produce the PRNG binary.
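The size of the unencrypted payload can be checked with a short sketch. The concatenation order and the byte serialization of the ROI are assumptions, and the inputs in the usage note are placeholders rather than the actual use case values.

```python
import hashlib


def build_plain_payload(data: bytes, roi_bytes: bytes) -> bytes:
    """Concatenate the 'data' with its SHA-256 hash and the SHA-256 hash of the ROI."""
    hash_data = hashlib.sha256(data).digest()      # 32 bytes
    hash_roi = hashlib.sha256(roi_bytes).digest()  # 32 bytes
    return data + hash_data + hash_roi


def size_in_bits(payload: bytes) -> int:
    """Payload size in bits."""
    return 8 * len(payload)
```

With any 226-byte ‘data’ value, for example `build_plain_payload(bytes(226), b"roi")`, the payload is 290 bytes long and `size_in_bits` returns 2320.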

Fig. 9 Authentication data (‘hash data’ and ‘hash ROI’) produced at the use case

In the encryption step of the ‘data’ and ‘authentication data’, AES-128 is used, which is a block cipher that uses a block size of 128 bits [26]. The key used (represented in hexadecimal) is given in Fig. 10. In addition, the ‘encrypted data’ is given in Fig. 11. AES was chosen as the encryption algorithm because it is assumed that the key is used as a session key during transmission of the image. It must be noted that algorithms like Blowfish are faster than AES [27]. On the other hand, given that the data that is encrypted is of a small size and that AES is a standard, it is considered to be the optimum choice. The size of the ‘encrypted data’, which as mentioned is used to calculate the ‘size of map’ and subsequently the maximum ROI, is calculated as follows:

Fig. 10 Encryption key used at the use case

Fig. 11 ‘encrypted data’ produced by the insertion algorithm at the use case

Since the unencrypted data occupies 2320 bits, \( \left\lceil 2320/128 \right\rceil = 19 \) blocks, i.e., 19 × 128 = 2432 bits of ‘encrypted data’, are produced. Furthermore, since the image used has 512 rows and 512 columns, dividing the image in half vertically allows a maximum insertion of 512/2 = 256 bits per row using the LSB method and 256 × 2 = 512 bits per row using the 2LSB method. Thus, \( \left\lceil (2432 + 13)/256 \right\rceil \cdot 2 = 20 \) rows are needed for the insertion of the ‘recovery data’. Furthermore, it has been chosen not to take into account the first and last row of pixels, because they have a gradient magnitude of zero. Thus, the first part of the RONI will consist of rows 1 to 10 and the second part of the RONI will consist of rows 501 to 510. Consequently, the insertion of the ‘encrypted data’, and therefore the maximum ROI, must start at the 12th row (row 11) and end at the 501st row (row 500). Thus, the ‘start row’ value is 11.

In the specific example, data insertion using the LSB method begins at the 12th row. In Fig. 12, the values produced by the Sobel operators corresponding to the first 60 rows of the image and the pixels that were chosen for manipulation using the LSB method (represented in white) are illustrated. The left and right parts of the image were not manipulated because they consist of smooth areas. In contrast, manipulation took place in the central part of the image, where the pixels had increased gradient magnitude values. As aforementioned, this step produces the ‘map’ given in Fig. 13.
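The gradient magnitudes that drive the pixel selection can be illustrated with a plain implementation of the 3×3 Sobel operators. It is a sketch under two assumptions: the magnitude is taken as the Euclidean norm of the horizontal and vertical responses, and border pixels, whose 3×3 neighbourhood is incomplete, are assigned a magnitude of zero.

```python
import math
from typing import List


def sobel_magnitude(img: List[List[int]]) -> List[List[float]]:
    """Return the Sobel gradient magnitude of a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Horizontal response: right column minus left column.
            gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]) \
               - (img[r - 1][c - 1] + 2 * img[r][c - 1] + img[r + 1][c - 1])
            # Vertical response: bottom row minus top row.
            gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]) \
               - (img[r - 1][c - 1] + 2 * img[r - 1][c] + img[r - 1][c + 1])
            mag[r][c] = math.hypot(gx, gy)
    return mag
```

On a perfectly smooth region every response is zero, whereas pixels next to an intensity step receive a large magnitude, which is why such pixels are preferred for modification.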

Fig. 12 (a) Sobel values of first 60 rows and (b) Difference of images due to LSB insertion at the use case

Fig. 13 Part of Extraction ‘map’ produced by LSB insertion at the use case

The next step involves the insertion of the ‘map’, along with the ‘size of map’ and the ‘start row’ (Fig. 14), in the RONI using the 2LSB method. As mentioned earlier, the ‘start row’ is inserted using the first two shares of the RONI, while the ‘size of map’ and the ‘map’ are inserted using all four shares. Fig. 15 depicts the values produced by the Sobel operators for the first 60 rows of the image, together with the pixels that were chosen for manipulation using the 2LSB method (shown in white). The equivalent values for the last 60 rows of the image are given in Fig. 16. Once again, the left and right parts of the images were not manipulated because they consist of smooth areas. In contrast, manipulation took place in the central part of the image, where the pixels had increased gradient magnitude values.
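The difference between the LSB and 2LSB methods at the level of a single pixel can be sketched as follows. This is generic bit substitution only; the helper names are assumptions, and the choice of which pixel in which share is modified is not modelled.

```python
def embed_lsb(pixel: int, bit: int) -> int:
    """Replace the least significant bit of a pixel value with one payload bit."""
    return (pixel & ~1) | (bit & 1)

def embed_2lsb(pixel: int, bits: int) -> int:
    """Replace the two least significant bits with two payload bits (0..3)."""
    return (pixel & ~0b11) | (bits & 0b11)

def extract_lsb(pixel: int) -> int:
    return pixel & 1

def extract_2lsb(pixel: int) -> int:
    return pixel & 0b11

pixel = 182                               # binary 10110110
print(embed_lsb(pixel, 1))                # 183: value changes by at most 1
print(embed_2lsb(pixel, 0b01))            # 181: value changes by at most 3
```

The 2LSB method doubles the payload per pixel, which is consistent with the 256 versus 512 bits per row quoted for the use case, at the cost of a larger maximum change in each modified pixel.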

Fig. 14 ‘recovery data’ inserted using the 2LSB insertion algorithm at the use case: (a) ‘start row’, (b) ‘size of map’, (c) ‘map’

Fig. 15 (a) Sobel values of first 60 rows and (b) Difference of images due to 2LSB insertion at the use case

Fig. 16 (a) Sobel values of last 60 rows and (b) Difference of images due to 2LSB insertion at the use case

Finally, the stego image is created.

Quantitative and qualitative results

In this section, evaluation metrics for the proposed steganography scheme are discussed. More specifically, the algorithm is assessed against the steganography evaluation factors presented in the introductory section, and execution time metrics are given.

Capacity

In order to measure the algorithmic capacity, the following conventions are assumed:

  • The encrypted form of the ‘data’ plus ‘authentication data’ is given by the z variable.

  • The width of the image is given by the w variable.

  • The height of the image is given by the h variable.

  • The image consists of grayscale pixels.

  • ‘RONI’ refers to the number of pixels that constitute the RONI.

  • ‘ROI’ refers to the number of pixels that constitute the ROI.

Then, as mentioned earlier, in the proposed scheme the RONI consists of \( \left\lceil \frac{\left(z+13\right)}{\left\lfloor \frac{w}{2}\right\rfloor}\right\rceil *2 \) rows, or equivalently

$$ \hbox{'} RONI\hbox{'}=\left\lceil \frac{\left(z+13\right)}{\left\lfloor \frac{w}{2}\right\rfloor}\right\rceil \ast 2\ast w\ \mathrm{Pixels}. $$

For simplicity, the ceiling and floor operators are dropped from the above equation. Thus,

$$ \hbox{'} RONI\hbox{'}=\frac{\left(z+13\right)}{\frac{w}{2}}\ast 2\ast w=4\ast z+52\ \mathrm{Pixels}. $$
(1)
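The effect of dropping the rounding operators can be checked numerically. The sketch below compares the exact pixel count with the simplified form of Eq. 1 for the use-case values z = 2432 and w = 512; the function names are illustrative assumptions.

```python
import math

def roni_pixels_exact(z: int, w: int) -> int:
    """RONI size in pixels with the ceiling and floor operators kept."""
    return math.ceil((z + 13) / (w // 2)) * 2 * w

def roni_pixels_simplified(z: int) -> int:
    """RONI size in pixels according to the simplified Eq. 1."""
    return 4 * z + 52

z, w = 2432, 512
exact = roni_pixels_exact(z, w)           # 10 rows * 2 parts * 512 pixels
simplified = roni_pixels_simplified(z)    # 4 * 2432 + 52
print(exact, simplified)                  # 10240 9780
```

Because rounding up can only increase the row count, the simplified expression is a lower bound on the exact RONI size, so capacity figures derived from it should be read as approximate upper limits.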

Furthermore, if every pixel of the ROI is used during insertion, the ROI will consist of 2*z pixels, since it consists of two shares. In addition, since the image contains w*h pixels in total, the sum of ‘ROI’ and ‘RONI’ must be less than or equal to w*h.

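$$ \hbox{'} ROI\hbox{'}+\hbox{'} RONI\hbox{'}=6\ast z+52\le w\ast h $$
(2)

In addition, if the first and last rows of the image are not used during insertion, due to the fact that they have zero gradient magnitude values, Eq. 2 can be rewritten as:

$$ z\le \frac{\left(h-2\right)\ast w-52}{6} $$
(3)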

If either no encryption or a stream cipher is used, z will be equal (in size) to the ‘data’ and ‘authentication data’. Thus, the amount of ‘data’ that can be hidden is:

$$ \hbox{'} data\hbox{'}\le \frac{\left(h-2\right)\ast w-52}{6}-\hbox{'} authentication\ data\hbox{'} $$
(4)

For example, based on Eq. 3, for an image of size 512 × 512 the maximum amount of ‘encrypted data’ that can be hidden if a stream cipher is used is 43511 bits ≈ 5.3 KB. Furthermore, when a stream cipher is used, the total insertion in bits equals:

$$ 3*\left(\hbox{'} data\hbox{'}+\hbox{'} authentication\ data\hbox{'}\right)+26\ \mathrm{bits} $$
(5)

This is because the ‘map’ occupies twice the size of the ‘encrypted data’, which has a size of (‘data’ + ‘authentication data’) bits, while the ‘start row’ and ‘size of map’ occupy 26 bits. Thus, for an image of size 512 × 512, if SHA-256 is used, the maximum amount of ‘data’ that can be hidden is (43511 − 256 ∗ 2) = 42999 bits. Furthermore, based on Eq. 5, a total of 3 ∗ 43511 + 26 = 130559 bits can be inserted, which corresponds to \( \frac{130559}{512\ast 512}\approx 0.50\kern0.28em \mathrm{bpp} \).
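The figures quoted for the 512 × 512 example can be reproduced with the following sketch. It evaluates the bound of Eq. 4 before the ‘authentication data’ term is subtracted, and the total of Eq. 5; the function names are assumptions, and the 2 ∗ 256 authentication bits are taken from the example above.

```python
def max_encrypted_bits(w: int, h: int) -> int:
    """Largest z (in whole bits) with z <= ((h - 2) * w - 52) / 6."""
    return ((h - 2) * w - 52) // 6

def total_inserted_bits(payload_bits: int) -> int:
    """Eq. 5: payload, a map of twice its size, and 26 header bits."""
    return 3 * payload_bits + 26

w = h = 512
z_max = max_encrypted_bits(w, h)
data_max = z_max - 2 * 256                # authentication bits as in the example
total = total_inserted_bits(z_max)
bpp = total / (w * h)
print(z_max, data_max, total, round(bpp, 3))   # 43511 42999 130559 0.498
```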

Finally, regarding the amount of ‘data’ that can be inserted when a block cipher is used, in the worst case an additional block must be deducted. Thus, the following inequality is obtained:

$$ \hbox{'} data\hbox{'}\le \frac{\left(h-2\right)*w-52}{6}-\hbox{'} authentication\ data\hbox{'}-\hbox{'} block\ size\hbox{'} $$
(6)
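A brief sketch shows why deducting one block is sufficient. It evaluates Eqs. 4 and 6 for a 512 × 512 image and checks that the padded ciphertext still fits within the 43511-bit limit. The helper names are assumptions, 2 ∗ 256 authentication bits are assumed as in the SHA-256 example, and padding is modelled simply as rounding up to a whole number of blocks.

```python
def max_data_bits(w: int, h: int, auth_bits: int, block_bits: int = 0) -> int:
    """Eq. 4 when block_bits == 0 (stream cipher or no encryption), Eq. 6 otherwise."""
    return ((h - 2) * w - 52) // 6 - auth_bits - block_bits

def padded_bits(plain_bits: int, block_bits: int) -> int:
    """Ciphertext size after rounding up to a whole number of blocks."""
    return -(-plain_bits // block_bits) * block_bits

AUTH_BITS = 2 * 256
stream_limit = max_data_bits(512, 512, AUTH_BITS)           # Eq. 4
block_limit = max_data_bits(512, 512, AUTH_BITS, 128)       # Eq. 6 with AES blocks
ciphertext = padded_bits(block_limit + AUTH_BITS, 128)      # size actually embedded
print(stream_limit, block_limit, ciphertext)                # 42999 42871 43392
```

The padded ciphertext of 43392 bits remains below the 43511-bit bound, at the cost of 128 bits of usable ‘data’ compared with the stream-cipher case.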

Robustness

Since the proposed schemes manipulate the LSB, or both the LSB and the second LSB, of the pixels, any modification of the pixels will destroy the hidden data. Nevertheless, since the integrity of the images is of utmost importance, even slightly modified images should not be used.

Imperceptibility

In this subsection, the imperceptibility of the stego images is evaluated. More specifically, peak signal-to-noise ratio (PSNR) values are used to test the visual similarity between the stego and cover images. Furthermore, the stego images are tested against two steganalysis algorithms. In order to obtain objective results, ten 16-bit grayscale medical images (CTs and MRIs) of size 512 × 512 and two of size 256 × 256 are used. These images are presented in Table 5. In every image, an insertion of 0.25 bits per pixel takes place using randomly generated data drawn from a uniform probability distribution. The total amount of insertion (in bits per pixel) is calculated based on Eq. 7, which takes into account all the data inserted except the ‘start row’ and ‘size of map’. In Eq. 7, ‘total pixels’ refers to the number of pixels that comprise the image.

$$ \frac{3*\left(\hbox{'} data\hbox{'}+\hbox{'} authentication\ data\hbox{'}\right)}{\hbox{'} total\ pixels\hbox{'}} $$
(7)

Table 5 DICOM images used for testing

PSNR calculation

Given a reference (cover) image f and a test (stego) image g, both of size M × N, PSNR is defined using the mean square error (MSE) as follows:

$$ \mathrm{PSNR}\left(\mathrm{f},\ \mathrm{g}\right)=10\ {\log}_{10}\left(\frac{{\left(\hbox{'} \max\ value\hbox{'}\right)}^2}{\mathrm{MSE}\left(\mathrm{f},\ \mathrm{g}\right)}\right) $$
(8)

where

$$ \mathrm{MSE}\left(\mathrm{f},\ \mathrm{g}\right)=\frac{1}{M\ast N}{\displaystyle \sum_{i=1}^M{\displaystyle \sum_{j=1}^N{\left({f}_{ij}-{g}_{ij}\right)}^2}} $$
(9)

and ‘max value’ is the maximum possible pixel value. For example, the ‘max value’ for 8-bit grayscale images is \( {2}^8-1=255 \).
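Eqs. 8 and 9 translate directly into code. The sketch below is a minimal pure-Python version operating on nested lists; the function names and the `bit_depth` parameter are assumptions, with ‘max value’ taken as 2^bit_depth − 1 in line with the 8-bit example above.

```python
import math

def mse(f, g):
    """Mean square error between two equally sized images (Eq. 9)."""
    m, n = len(f), len(f[0])
    return sum((f[i][j] - g[i][j]) ** 2 for i in range(m) for j in range(n)) / (m * n)

def psnr(f, g, bit_depth=16):
    """Peak signal-to-noise ratio in dB (Eq. 8); infinite for identical images."""
    max_value = 2 ** bit_depth - 1
    err = mse(f, g)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)

cover = [[1000, 2000], [3000, 4000]]
stego = [[1001, 2000], [3000, 3999]]      # two pixels changed by one grey level
print(mse(cover, stego))                  # 0.5
print(round(psnr(cover, stego), 1))       # about 99.3 dB for 16-bit data
```

For 16-bit images the large ‘max value’ makes PSNR values much higher than those typically reported for 8-bit images with the same MSE, which should be kept in mind when comparing results across bit depths.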

In Table 6, based on Eqs. 8 and 9 the PSNR and MSE values for every set of images are given. To be more precise every set consists of the images before and after embedding.

Table 6 PSNR values of the 12 DICOM images tested for an insertion rate of 0.25 bpp

Similar measurements in [6] had much lower PSNR values for even smaller insertion of data. In this context, the PSNR values produced are considered more than acceptable.

Resistance versus steganalysis

In order to evaluate the proposed scheme's resistance to steganalysis, the stego images are tested against two structural LSB detectors. The first is the weighted stego steganalyser described in [28] (referred to in this paper as “WS”), while the second is the structural LSB detector described in [29] (referred to in this paper as “TRIPLES”). For both detectors, the implementations given by Jessica Fridrich in [30] are used. Both detectors attempt to estimate the length of the hidden message and thus the payload (bits per pixel) that has been inserted into the stego image. The results for an embedding rate of 0.25 bpp are given in Table 7.

Table 7 Estimation of bits per pixel inserted given by the structural detectors in [28] (WS) and [29] (Triples)

In all cases, the estimated embedding rate is much smaller than the real one. More specifically, in all images except 7, 8, 9, and 11 the estimated embedding rate is below 0.014, while for most images it is below 0.008. The estimated embedding rate is significantly higher in images 7, 8, 9, and 11 because they contain large black (smooth) areas, which are susceptible to steganalysis if modified. Thus, it can be stated that, at least for DICOM images that do not contain a large amount of smooth regions, the proposed algorithm has a high resistance against steganalysis.

Privacy/security

The specific evaluation criterion refers to the amount of secret knowledge needed during the algorithm execution. This corresponds to the size of the cryptography key, which depends on the encryption scheme used. In the experimental results, AES with a key size of 128 bits was selected.

Complexity and execution times

In this subsection, metrics regarding the execution time of the proposed schemes are given. To be more precise, the execution times refer to the insertion and extraction processes and are listed in Table 8 for the evaluated images, measured on an Intel i5-3230M 2.6 GHz CPU. For an insertion rate of 0.25 bpp in images of 512 × 512 pixels, the mean execution time of the insertion scheme is 697.4 milliseconds with a standard deviation of 85.3 milliseconds, while the mean execution time of the extraction scheme is 403.4 milliseconds with a standard deviation of 9.4 milliseconds. For an insertion rate of 0.25 bpp in images of 256 × 256 pixels, the mean execution time of the insertion scheme is 452.5 milliseconds with a standard deviation of 65.8 milliseconds, while the mean execution time of the extraction scheme is 137.5 milliseconds with a standard deviation of 4.9 milliseconds.

Table 8 Execution times of the insertion and extraction scheme during analysis

Evidently, even for an insertion rate of 0.25 bpp, the insertion scheme requires less than 1 s to execute, while the corresponding time for the extraction scheme is less than 0.5 s. Furthermore, these execution times could likely be reduced by using more than one physical or logical core.

The insertion scheme requires more execution time than the extraction scheme because extra calculations, such as the calculation of the image gradient, are required during insertion.

Conclusions and future work

In this paper, we proposed and evaluated a steganography algorithm for medical images which provides reversibility of the ROI, as well as data integrity. The presented experimental results showed that the proposed scheme can be used as an efficient steganography scheme when images with limited smooth areas are used. Furthermore, it was also shown that, as far as visual equality (in terms of PSNR) is concerned, excellent results are produced. Finally, since the algorithm is used in spatial domain DICOM images, high capacity insertion is applicable.

The proposed method can be utilized for the exchange of medical images over the Internet and the storage of image data in Cloud infrastructures. Nowadays, cloud computing is recognized as a dominant computing model in IT infrastructures, enabling flexible, ubiquitous, on-demand and cost-effective access to a wide pool of shared resources in all business sectors, including healthcare. One of the main concerns in such information systems remains data privacy. Furthermore, sharing health information raises the level of complexity and increases the stakes for issues of data confidentiality and the need for robust security measures. The presented methodology exhibited very good results in all aspects of evaluation concerning Capacity, Robustness, Imperceptibility and Privacy. Furthermore, execution times and data overheads are low, allowing easy and transparent integration in existing distributed medical information systems.

As described in Section 2.1.5, regarding the pixels selected to be modified, there is a likelihood that the pixel with the smallest gradient magnitude is modified. To address this issue, 3 bits per map entry could be used. In this case, the first bit would state whether a modification occurred, the second bit in which share the modification took place, and the third bit whether a border value was modified. Nevertheless, this was not considered a beneficial choice, since it would significantly increase the amount of data inserted, while the experimental results indicate that this case is rare. In fact, during our experimentation this case did not appear at all.
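The 3-bit map entry discussed above, and its cost, can be sketched as follows. The bit ordering and the helper names are assumptions made for illustration; the overhead estimate relies on the earlier statement that the 2-bit ‘map’ occupies twice the size of the ‘encrypted data’, i.e., one entry per embedded bit.

```python
def pack_entry(modified: bool, share: int, border: bool) -> int:
    """Pack one hypothetical 3-bit map entry: [modified | share | border]."""
    return (int(modified) << 2) | ((share & 1) << 1) | int(border)

def unpack_entry(entry: int):
    """Inverse of pack_entry."""
    return bool((entry >> 2) & 1), (entry >> 1) & 1, bool(entry & 1)

def map_bits(encrypted_bits: int, bits_per_entry: int) -> int:
    """Map size when one entry is stored per embedded bit."""
    return encrypted_bits * bits_per_entry

entry = pack_entry(modified=True, share=1, border=False)
print(entry, unpack_entry(entry))                 # 6 (True, 1, False)
print(map_bits(2432, 2), map_bits(2432, 3))       # 4864 7296
```

For the 2432 bits of ‘encrypted data’ in the use case, the map would grow from 4864 to 7296 bits, a 50% increase in ‘recovery data’, which is consistent with the decision not to adopt the extra bit.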

When DICOM images with large smooth areas are used, the scheme could be used as a watermarking scheme. More specifically, it can be derived that two watermarks are inserted: Watermark W1 which is composed of the ‘recovery data’ and is inserted in the RONI and W2 which is composed of both the ‘data’ and ‘authentication data’ (in an encrypted form) and is inserted in the ROI. Thus, future work involves inserting error control mechanisms in our scheme to make it more robust and subsequently even more suitable for watermarking [6, 8, 31].