1 Introduction

The implementation of Picture Archiving and Communication Systems (PACS) in the medical area has allowed medical images to be stored and distributed more efficiently. Their administration is governed by the Digital Imaging and Communications in Medicine (DICOM) standard, which provides the communication protocols for the transmission of medical information and the technical specifications of the files corresponding to the stored data [1]. With the accelerated growth of the digital imaging paradigm over its analog counterpart, it is necessary to guarantee the integrity, confidentiality, and security of medical images, since they usually contain reserved information about patient health [2]. If this information is not protected with adequate security levels, it can be used in medical insurance fraud, identity theft, and other illegal activities.

Even though medical images contain sensitive information in their metadata, they are usually transferred from imaging stations through unencrypted or unauthenticated transmission channels to PACS servers, where they are stored and distributed to their users, such as display stations, DICOM printers, CD and DVD recording equipment, clinical diagnostic equipment, or even consultations via mobile services. Although these systems have various means to protect the stored files, millions of reports of security breaches related to medical information are filed every year [3]. In these incidents, attackers or intruders accessed the medical information while it was stored on the PACS servers or during its transmission, mainly to acquire patient information or to modify, delete, or hijack the data, as illustrated in Fig. 1.

Fig. 1 Transmission of DICOM files through PACS systems

To confront some of these issues, and considering that the DICOM standard allows the implementation of additional measures that increase data security, several data hiding techniques based on digital image watermarking [1, 2] have been proposed in the scientific literature. In general terms, watermarking techniques embed a small number of data bits, called "watermarks", into the medical images so that authorized medical staff can extract the embedded data for specific purposes. However, a drawback of watermarking methods is the visual distortion introduced into the medical image by the data embedding operation; moreover, restoring the image to its original form after watermark extraction or detection is usually not possible. To overcome this drawback, reversible data hiding (RDH) techniques (also known as lossless or invertible data hiding) can be used, since they make it possible to extract the embedded data without errors and to restore the cover medium to its original state [4]. Over time, RDH proposals have been applied to digital media such as color and grayscale natural images, as well as digital audio and video, where the main efforts have focused on a) improving the capacity of the methods, b) obtaining robustness against JPEG compression, and c) concealing information while enhancing the contrast of the images.

In general, these security measures were developed to verify the integrity and guarantee the privacy of the information contained in the image metadata; however, none of them considers preventing the image pixels from being modified to the point where a healthy patient is diagnosed as sick or vice versa. For example, in 2019 the authors of [5] presented a scheme in which deep learning was used to add and remove evidence of several pathologies in 3D medical images, which can lead to a complete misdiagnosis.

Therefore, to guarantee the security and confidentiality of both the hidden data and the image content, reversible data hiding schemes in the encrypted domain (RDH-ED) [4] are presented as a promising solution. RDH-ED refers to embedding additional data into encrypted images without losing the ability to recover both the data and the original content of the image. A characteristic that defined the first RDH-ED schemes [6,7,8], and that differentiates them from other data concealment techniques, is the presence of three entities: a) the content owner, who owns the original media and is in charge of encrypting it, b) the data hider, who embeds additional information into the encrypted media, and c) the receiver, who recovers the data or the original image from the encrypted marked media. Thus, the information can be embedded into the image without exposing its visual content, ensuring that the data hider does not know what the image represents, while allowing the receiver to access the content of the image or the data, depending on the security elements that he or she owns. Likewise, RDH-ED methods can be classified into two categories, Vacating Room Before Encryption (VRBE) and Vacating Room After Encryption (VRAE), depending on whether the image is processed to define the space where the information will be embedded before or after it is encrypted. In both categories, however, the data embedding is not carried out in the plaintext domain, due to the characteristics of the entities already mentioned.

The most recent RDH-ED schemes [9,10,11,12,13] were designed for natural images with a depth of 8 bits per pixel. The authors of [9] encrypted the image with a tailored stream cipher and embedded the information through prediction-error expansion. In [10], Tromino scrambling is used with an AES stream cipher, and data are then hidden by shifting a two-dimensional prediction-error histogram (2D-PEH). The scheme presented in [11] used a conventional stream cipher in conjunction with the prediction errors obtained from the median edge detector predictor to embed information using the least significant bit (LSB) substitution technique. In [12], the image was encrypted using NTRU (Number Theory Research Unit), while data embedding was done by shifting the histogram of the absolute differences of adjacent pixels. As in [11], the scheme of [13] used a conventional stream cipher and embedded the information using the LSB substitution technique; however, it employed a linear regression model to generate the error map required to restore the image to its original state. Although the works [9,10,11,12,13] used different encryption techniques, they all took advantage of the spatial redundancy present in natural images to embed the additional data.

In the context of medical imaging, several methods based on RDH [14,15,16,17,18,19,20,21] and RDH-ED [22, 23] have been reported in the literature. The works presented in [14,15,16,17,18,19,20,21] were carried out in the plaintext domain, i.e., the image information is not encrypted, so the content is legible to the naked eye. For the sake of brevity, we summarize the works [22, 23], which, to the best of our knowledge, are the most representative RDH-ED methods directly related to the field of medical imaging.

Given the importance of the region of interest (ROI) in medical diagnosis, an RDH-ED method for encrypted medical images was proposed in [22]. In general terms, the scheme in [22] divides the medical image into non-overlapping blocks and encrypts them using a conventional stream cipher [24, 25] with a public encryption key. Later, the encrypted blocks of the image are classified as belonging to either the ROI or the region of non-interest (RONI). Finally, control information and patient data are embedded in a customized manner using a data hiding key, modifying only the blocks belonging to the RONI by flipping the 3 least significant bits of each encrypted pixel. The extraction and image recovery stage is performed in a similar way to the embedding procedure. Thus, both the patient data and the control information necessary to recover the medical image can be reversibly hidden.

On the other hand, Liu et al. [23] presented an ROI-based RDH-ED scheme for encrypted images. As in their previous work reported in [21], the image is divided into the ROI, the RONI and the border area; in general terms, it is then segmented, rearranged as a stack and encrypted using a conventional stream cipher [24, 25]. Using the LSB substitution technique, the encrypted LSBs from the ROI are concatenated with the electronic patient record (EPR) information and embedded in the encrypted domain, while the vertices that define the ROI, along with the MD5 [24, 25] hash value of the ROI, are embedded at the appointed position in the border area. After the embedding process, access to the content is controlled according to the secret keys that the receiver owns. Thus, with the encryption key, a receiver can obtain an approximation of the original image; with the data hiding key, the receiver can extract the embedded data without error; and with both keys, the ROI can be recovered without loss, as well as all the data contained in the image.

However, a drawback of both [22] and [23] is that the embedded data cannot be extracted when the image with hidden data is directly decrypted, i.e., in the plaintext domain. This implies that data recovery and image reversibility can only be performed in the encrypted domain. Consequently, a receiver that possesses an approximation of the image (the directly decrypted image with hidden data) as well as the data hiding key will not be able to obtain the original content. A promising solution to this problem is the framework for completely separable reversible data hiding in encrypted images [26], which, in general terms, implies that a receiver can extract the embedded data and recover the original image not only from the encrypted image with hidden data but also from the approximate version obtained by directly decrypting it.

Since medical images contain valuable information, the proposed scheme must not alter the visual quality of the image; thus, the data hiding process should not produce changes perceptible to the human eye after the data are embedded. Furthermore, the data embedding technique should allow hiding enough information to store mainly patient data, the clinical diagnosis and the information needed to reverse the changes to the image. Given these needs, this paper proposes a completely separable RDH-ED scheme applied to DICOM [27] medical images, which allows the extraction of additional data, as well as the full restoration of the ROI, from either the encrypted or the plaintext domain of the DICOM medical image. Our proposal is inspired by the previous work reported in [23], which, to the best of our knowledge, is the most recent and relevant RDH-ED scheme applied to medical images in native DICOM format; however, our proposal presents notable differences from [23]. The main contributions of this paper are:

  1. Tailored implementation of the AES cipher in CTR mode for DICOM medical images with 12 or 16-bit resolution, to perform an effective extraction of hidden data either from the encrypted DICOM image with hidden data (encrypted domain) or from the directly decrypted version (plaintext domain).

  2. Data embedding through LSB substitution with a pseudorandom walk in both the ROI and the RONI, to increase the security of the proposed RDH-ED method.

  3. Implementation of the cryptographic hash function SHA-512 [24, 25] to guarantee the integrity of the embedded data as well as of the fully restored ROI.

  4. Evaluation of the visual quality of the obtained DICOM medical images with additional metrics based on the human visual system (HVS) model, specifically the Structural Similarity Index (SSIM) and the Visual Information Fidelity (VIF).

The rest of this paper is organized as follows: Sect. 2 describes the materials and methods of the proposed scheme designed for embedding data in the encrypted domain of the medical image, as well as the data extraction and the full restoration processes of the ROI. The experimental results, including a discussion and a comparison with previous RDH-ED algorithms, are shown in Sect. 3. Finally, the conclusions and future work are presented in Sect. 4.

2 Material and methods

The proposed scheme consists of the following stages: ROI-based preprocessing, medical image encryption, data embedding in the encrypted domain, and data extraction and ROI recovery. The framework of the proposed method is shown in Fig. 2.

Fig. 2 Framework of the proposed scheme

2.1 ROI-based preprocessing

To obtain an encrypted medical image, the process described in this paper is applied to grayscale DICOM medical images with 12 bit/pixel resolution \([0,{2}^{12}-1=4095]\) and dimensions M × N.

To increase the security level of the scheme and vacate the room where additional data will be embedded, the image is first preprocessed: it is rearranged according to an interactive selection of the ROI in the medical image. The preprocessing of the image is described below:

  • Step 1- Read the original DICOM medical image I and, through interaction between the content owner and the computer, select a rectangular area that contains the ROI of the image I according to the given clinical criteria. The coordinates of the vertices that define the selected region are recorded and denoted as VROI. Note that the capacity of the proposed RDH-ED method depends on the sizes of the ROI and RONI areas, as well as on the spatial resolution of the DICOM modality of the image I; in particular, the capacity of the algorithm decreases as the ROI size increases. The maximum size of the ROI is discussed later in Sect. 3.

  • Step 2- Once ROI area is selected, to verify its integrity after the reversibility process, the hash value of ROI is calculated with the algorithm SHA-512 [24, 25], according to (1):

$${H}_{ROI}={H}_{SHA-512}\left(ROI\right)$$
(1)

where \(H_{SHA-512}\left(\cdot\right)\) denotes the SHA-512 cryptographic hash function and HROI denotes the 512-bit binary representation obtained by converting the hexadecimal output of \(H_{SHA-512}\left(\cdot\right)\) to binary. Other cryptographic hash functions can be adapted to perform this step.

  • Step 3- Finally, the image is rearranged by stacking all the RONI pixels followed by the pixels that belong to the ROI, preserving the original dimensions M × N. The index IndROI of the pixel that marks the beginning of the ROI in the reordered image is also obtained.
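The three preprocessing steps can be sketched in Python as follows. This is a minimal illustration and not the MATLAB implementation used in the experiments: the function name `preprocess`, the half-open rectangle bounds, the 2-byte big-endian serialization of each pixel before hashing, and the 0-based value of IndROI are all assumptions made for the example.

```python
import hashlib

def preprocess(image, top, left, bottom, right):
    """Hash the rectangular ROI and stack the RONI pixels before the ROI pixels.

    image is a list of rows of integer pixel values; the rectangle covers
    rows top..bottom-1 and columns left..right-1 (assumed convention).
    """
    roi, roni = [], []
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            inside = top <= r < bottom and left <= c < right
            (roi if inside else roni).append(pixel)

    # Step 2: SHA-512 of the ROI; each pixel is serialized as 2 bytes (assumption)
    roi_bytes = b"".join(p.to_bytes(2, "big") for p in roi)
    h_roi = hashlib.sha512(roi_bytes).hexdigest()

    # Step 3: RONI first, then ROI, keeping the original M x N shape
    stacked = roni + roi
    ind_roi = len(roni)                      # 0-based index where the ROI begins
    width = len(image[0])
    rearranged = [stacked[i:i + width] for i in range(0, len(stacked), width)]
    return rearranged, ind_roi, h_roi


# Toy 4 x 4 image with a 2 x 2 ROI in the centre
image = [[r * 4 + c for c in range(4)] for r in range(4)]
rearranged, ind_roi, h_roi = preprocess(image, 1, 1, 3, 3)
```

In this toy case the four ROI pixels end up in the last row of the rearranged image, and `h_roi` holds 128 hexadecimal digits, i.e., the 512 bits of HROI.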

2.2 Medical image encryption

After ROI-based preprocessing, the reordered medical image is encrypted using the AES-128 block cipher algorithm in counter mode (CTR) [24, 25]. This algorithm is an alternative to dedicated ciphers tailored to data hiding in the encrypted domain [28], due to the high efficiency of its implementation and the security it provides to the information systems, as well as its ability to emulate the operation of a conventional stream cipher. Unlike [23], where a conventional stream cipher is used to obtain the encrypted domain, in this paper we adapt and customize the AES-128 block cipher algorithm to achieve the completely separable reversibility explained earlier, which is a property not provided by [23]. In this way, considering 12 and 16 bit/pixel grayscale resolution of DICOM images in conjunction with the AES-128 block cipher algorithm in counter mode (CTR), the process of image encryption is described as follows:

  • Step 1- The rearranged image I is divided into non-overlapping blocks Bi, each of them of 4 × 4 pixels, according to (2).

$$I={\left\{{B}_{i}\right\}}_{i=1}^{j},\quad \text{where}\ j=\frac{n}{4\times 4}$$
(2)

where j is the total number of blocks, obtained by dividing the total number of pixels n of the image by the 16 pixels contained in each block. The size of Bi is determined by the operating specifications of the AES-CTR algorithm: AES operates on 128-bit blocks, which matches the sixteen 8-bit MSB values taken from the 16 pixels of a block in Step 3.

  • Step 2- The decimal representation of the pixels of a given block Bi is converted to 3-digit hexadecimal numbers when bit-depth of the image is 12 bit/pixel and 4-digit hexadecimal numbers when bit-depth of the image is 16 bit/pixel, in order to facilitate the manipulation of the bits of each pixel.

  • Step 3- Each hexadecimal number is partitioned so that the first two digits from left to right, which represent the eight most significant bits and are denoted as \({MSB}_{{B}_{i}}\), are separated from the 3rd digit when the bit depth is 12 bit/pixel or the 4th digit when it is 16 bit/pixel, i.e., the least significant bits, denoted as \({LSB}_{{B}_{i}}\).

  • Step 4- To obtain the encrypted block \({MSB}_{{B}_{i}^{e}}\) of the \({MSB}_{{B}_{i}}\), an encryption key Ke is required, which determines the result of the application of the AES-CTR stream cipher to the hexadecimal digits \({MSB}_{{B}_{i}}\). If not specified, the default length of Ke is 128 bits.

  • Step 5- Once the encrypted block \({MSB}_{{B}_{i}^{e}}\) is obtained, concatenate the unaltered \({LSB}_{{B}_{i}}\) with the encrypted \({MSB}_{{B}_{i}^{e}}\).

  • Step 6- The values obtained from the concatenation in Step 5 are converted from hexadecimal to decimal representation to generate the final encrypted block of pixels \({B}_{i}^{e}\). This block encryption procedure can be expressed as (3):

$${B}_{i}^{e}=e\left({B}_{i},{K}_{e}\right)={B}_{i}\oplus K$$
(3)

where the encryption function e(Bi, Ke) denotes all the steps described above. The ciphertext is obtained by an XOR operation between the plaintext bits and the pseudorandom bitstream K generated by the AES-CTR stream cipher with the encryption key Ke. For illustrative purposes, Fig. 3 shows an example of the above encryption procedure for a bit depth of 12 bit/pixel.

Fig. 3 Example of tailored encryption for medical images with 12 bit/pixel depth

  • Step 7- In this way, by repeating the procedure described from Steps 2 to 6 for each one of the blocks Bi in plaintext domain, the encrypted version of the DICOM medical image Ie is obtained by (4):

$${I}_{e}={\left\{{B}_{i}^{e}\right\}}_{i=1}^{j}$$
(4)

where i = 1, …, j, with j = n/(4 × 4), and e denotes the encrypted domain.

Finally, to complete the creation of the encrypted medical image Ie, the data of VROI and IndROI are concealed at an appointed position R(x,y) in the first pixels of the RONI using the LSB substitution technique.

The VROI parameter allows a receiver with the key Ke to reconstruct the ROI and RONI when the image is decrypted. On the other hand, IndROI is required by the data hider to embed additional data in the corresponding region; this guarantees that the plaintext of the image is not exposed when embedding or extracting data.
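The tailored encryption of Steps 2 to 6 can be illustrated for the 12 bit/pixel case with the sketch below. It is not the authors' implementation: the Python standard library has no AES, so the keystream here is produced by SHA-256 applied to a counter, which is only a stand-in for the AES-128 block function in CTR mode. The names `keystream` and `crypt_block` and the all-zero key are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Counter-mode keystream; SHA-256 is a stand-in for the AES block function."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def crypt_block(block, key: bytes):
    """Encrypt or decrypt 12-bit pixels: XOR the 8 MSBs, keep the 4 LSBs.

    XOR is its own inverse, so the same call covers encryption and decryption.
    """
    ks = keystream(key, len(block))
    result = []
    for pixel, k in zip(block, ks):
        msb_e = (pixel >> 4) ^ k             # first two hex digits, encrypted
        lsb = pixel & 0xF                    # third hex digit, left unaltered
        result.append((msb_e << 4) | lsb)    # concatenation of Step 5
    return result

ke = bytes(16)                               # hypothetical 128-bit key (all zeros)
block = [0xABC, 0x123, 0xFFF, 0x000] * 4     # one 4 x 4 block, flattened
encrypted = crypt_block(block, ke)
decrypted = crypt_block(encrypted, ke)
```

Because the last hexadecimal digit never passes through the cipher, any bit later written into the LSB plane of `encrypted` is still present after decryption, which is the property the completely separable scheme relies on. For a whole image, all pixels would be passed in a single call so that the counter keeps advancing across blocks.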

2.3 Data embedding in the encrypted domain

Once the content owner has obtained the preprocessed and encrypted DICOM medical image Ie, as shown in Fig. 4, the data hider can embed data into the encrypted medical image without knowing the image content.

Fig. 4 Sample of the preprocessing in conjunction with medical image encryption according to the selection of ROI

To embed additional data, the LSB substitution technique is used, which has been widely adopted in data hiding schemes [29,30,31] due to its high capacity to embed data into images without affecting their visual quality. This technique consists of substituting the LSB of a pixel with a bit b ∈ {0,1}. The changes caused in the value of the modified pixel are as follows.

Considering the value of an even pixel as 2i, with \(i = 0, \dots, {2}^{12}/2-1 = 2047\), the value changes according to (5):

$$2i\xrightarrow[{LSB-substitution}]{}\left\{\begin{array}{lc}2i,&\mathrm{if}\;b=0\\2i+1,&\mathrm{if}\ b=1\end{array}\right.$$
(5)

Considering the value of an odd pixel as 2i + 1, with i = 0, …, 2047, its value is given by (6):

$$2i+1\xrightarrow[{LSB-substitution}]{}\left\{\begin{array}{lc}2i,&\mathrm{if}\;b=0\\2i+1,&\mathrm{if}\ b=1\end{array}\right.$$
(6)
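Expressions (5) and (6) amount to clearing the lowest bit of the pixel and writing b in its place, so a pixel changes by at most one gray level. A short Python check, with the hypothetical helper name `lsb_substitute` and i = 1000 chosen arbitrarily:

```python
def lsb_substitute(pixel: int, bit: int) -> int:
    """Replace the least significant bit of pixel with bit (0 or 1)."""
    return (pixel & ~1) | bit

i = 1000
even_results = [lsb_substitute(2 * i, b) for b in (0, 1)]      # case (5)
odd_results = [lsb_substitute(2 * i + 1, b) for b in (0, 1)]   # case (6)
```

Both lists evaluate to `[2000, 2001]`: the result depends only on i and b, not on the parity of the original pixel, which is why the original LSBs of the ROI must be stored separately to make the embedding reversible.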

In order to guarantee patient privacy, as well as the integrity of their associated information, before embedding data into Ie, the following additional information is required:

- From metadata of DICOM medical image Ie, the fields associated with the name, date of birth, gender, and patient identifier are obtained and stored in binary form into Mp.

- From the DICOMDIR file associated with DICOM medical image Ie, the diagnosis summary of the patient is obtained and stored in binary form into Dp.

- To verify the integrity of the Mp and Dp information after the data extraction and ROI recovery procedures, the SHA-512 hashes of Mp and Dp, denoted as HM and HD, are obtained by applying (7) and (8), respectively.

$${H}_{M}={H}_{SHA-512}\left({M}_{p}\right)$$
(7)

$${H}_{D}={H}_{SHA-512}\left({D}_{p}\right)$$
(8)

The embedding process is performed by the data hider entity, which is described in detail as follows:

  • Step 1- From DICOM medical image Ie, read the LSBs embedded into Ie according to the appointed position R(x,y) and recover the index IndROI, which determines the pixels belonging to ROI and RONI, respectively.

  • Step 2- Based on the hash values HROI, HM and HD, obtain a set of control data LROI composed of the location indexes of each of them.

  • Step 3- Concatenate the data bits of HROI, HM, HD, and LROI, and obtain the information that will be concealed into the ROI, which is denoted as WROI and generated by (9):

$${W}_{ROI}={L}_{ROI}+{H}_{ROI}+{H}_{M}+{H}_{D}$$
(9)

the operator " + " in (9) indicates the concatenation of data bits.

  • Step 4- Embed the WROI information into the encrypted ROI pixels, using the index IndROI and LSB substitution with a pseudorandom walk defined by the secret key Kd. For this purpose, Kd seeds a pseudorandom number generator that produces a pseudorandom permutation of the pixel positions, from the beginning of the ROI to the last pixel of the image, which increases the security level of the hidden data.

  • Step 5- During the LSB replacement process, the original LSBs of the ROI that were modified are recorded in a string denoted as LSBROI. A set of control data LRONI, composed of the location indexes corresponding to LSBROI, the metadata, and the patient diagnosis, is obtained.

  • Step 6- Concatenate the data bits of LRONI, LSBROI, Mp, and Dp, to obtain the information that will be concealed into the RONI, which is denoted as WRONI and generated by (10).

$${W}_{RONI}={L}_{RONI}+{LSB}_{ROI}+{M}_{p}+{D}_{p}$$
(10)

  • Step 7- Embed the WRONI information into the encrypted RONI pixels, using LSB substitution with a pseudorandom walk defined by the secret key Kd, excluding the first encrypted pixels of the RONI where VROI and IndROI were concealed by the content owner according to the appointed position R(x,y).

In summary, RONI conceals sensitive patient and diagnostic information, as well as the LSBs required for the full restoration of ROI; meanwhile, ROI conceals data for integrity verification related to patient, diagnostics and ROI data respectively. In this way, the capacity W is defined as the total sum of bits embedded in both the ROI and the RONI, as shown in (11).

$$W={W}_{ROI}+{W}_{RONI}$$
(11)

Note that this embedding process can be done in any LSB plane. Thus, when W is embedded in the vacated room reserved during the preprocessing and encryption procedures, the encrypted image with hidden data I'e can be represented as in (12):

$${I}_{e}^{^{\prime}}=RDHED\left({I}_{e},W,{K}_{d}\right)$$
(12)

where RDHED(∙) refers to the embedding procedure by using LSB substitution in the encrypted domain which was described earlier in Steps 1–7. Finally, the image I'e is written in DICOM format.
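Steps 4 to 7 can be sketched as follows. This is an illustrative sketch under several assumptions: Python's `random.Random` seeded with Kd plays the role of the pseudorandom number generator (it is not cryptographically secure), the image is a flat list of already encrypted pixels, only 4 pixels are reserved instead of the 58 used in the paper, and the control data LROI and LRONI are omitted. The names `walk` and `embed` are hypothetical.

```python
import random

def walk(start, stop, kd):
    """Pseudorandom permutation of the pixel indexes in [start, stop), seeded by Kd."""
    order = list(range(start, stop))
    random.Random(kd).shuffle(order)
    return order

def embed(pixels, bits, start, stop, kd):
    """LSB substitution along the walk; returns marked pixels and replaced LSBs."""
    order = walk(start, stop, kd)
    if len(bits) > len(order):
        raise ValueError("payload exceeds the capacity of the region")
    marked = list(pixels)
    replaced = []
    for pos, bit in zip(order, bits):
        replaced.append(marked[pos] & 1)          # original LSB, kept for reversal
        marked[pos] = (marked[pos] & ~1) | bit
    return marked, replaced

RESERVED = 4                                      # toy value for the VROI/IndROI area
ie = [(37 * k) % 4096 for k in range(40)]         # toy "encrypted" image, RONI then ROI
ind_roi, kd = 30, 12345

w_roi = [1, 0, 1, 1]                              # stands in for the hash bits
marked, lsb_roi = embed(ie, w_roi, ind_roi, len(ie), kd)

w_roni = lsb_roi + [0, 1, 1, 0, 1]                # LSB_ROI followed by toy Mp/Dp bits
marked, _ = embed(marked, w_roni, RESERVED, ind_roi, kd)
```

The ROI is marked first so that its original LSBs are known before they are appended to the RONI payload; the reserved pixels at the start of the RONI are never visited by the walk.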

2.4 Data extraction and ROI restoration

In this stage, a receiver with the image I'e can perform several operations, depending on whether he or she has the encryption key Ke, the data hiding key Kd, or both. Unlike RDH-ED methods that are not completely separable, such as [23], completely separable schemes such as the proposed method make it possible to extract the concealed data and recover the ROI from either the plaintext or the encrypted domain. The cases available to a legitimate receiver are illustrated in Fig. 5. Cases 1, 2 and 3 in Fig. 5 are performed in the encrypted domain, while Cases 4 and 5 are carried out in the plaintext domain.

Fig. 5 Reception cases in the receiver stage of the complete separable RDH-ED scheme proposed

2.4.1 Case 1: Data extraction from the encrypted domain

A receiver with the pseudorandom walk data hiding key Kd will have access to the additional data embedded inside I'e, without being able to decrypt the image. To achieve this, the following procedure is required.

  • Step 1- Read the DICOM image file and obtain the image I'e.

  • Step 2- From I'e, obtain the parameter IndROI by reading the LSBs embedded in the appointed position R(x,y).

  • Step 3- Using the pseudorandom walk data hiding key Kd and the parameter IndROI that contains the beginning of ROI, read the LSBs of ROI and RONI, and obtain the sequences WROI and WRONI respectively.

  • Step 4- From sequences WROI and WRONI, identify the control information LROI and LRONI, and extract all data bits from ROI and RONI, respectively.

  • Step 5- The binary sequences corresponding to the metadata Mp and the patient diagnosis Dp are converted to ASCII code, while the bits of the hash values HM and HD are formatted in hexadecimal representation.

  • Step 6- Finally, the recovered hash values HM and HD are used to verify the integrity of the recovered metadata Mp and diagnosis Dp.

In Case 1, the original LSBs of ROI, LSBROI, and the hash value HROI are not employed.
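Steps 4 to 6 of Case 1 can be sketched as below. The exact meaning of the three 18-bit indexes in LROI and LRONI is not detailed in the text, so this example assumes a hypothetical layout in which each index stores the length of one field; the helper names `pack` and `unpack` and the toy patient strings are likewise invented for illustration.

```python
import hashlib

INDEX_BITS = 18                                   # length of each control index

def to_bits(data: bytes) -> str:
    return "".join(f"{byte:08b}" for byte in data)

def from_bits(bits: str) -> bytes:
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def pack(fields):
    """Prefix three bit strings with three 18-bit lengths (assumed layout)."""
    header = "".join(f"{len(f):0{INDEX_BITS}b}" for f in fields)
    return header + "".join(fields)

def unpack(stream: str):
    """Split a packed stream back into its three fields."""
    lengths = [int(stream[i * INDEX_BITS:(i + 1) * INDEX_BITS], 2) for i in range(3)]
    fields, pos = [], 3 * INDEX_BITS
    for length in lengths:
        fields.append(stream[pos:pos + length])
        pos += length
    return fields

# Toy sequences standing in for the extracted W_RONI and W_ROI
mp, dp = b"DOE^JANE|19700101|F|ID0001", b"No acute findings"
w_roni = pack(["1011", to_bits(mp), to_bits(dp)])
w_roi = pack([to_bits(hashlib.sha512(b"roi").digest()),
              to_bits(hashlib.sha512(mp).digest()),
              to_bits(hashlib.sha512(dp).digest())])

_, mp_bits, dp_bits = unpack(w_roni)              # LSB_ROI is not used in Case 1
_, h_m_bits, h_d_bits = unpack(w_roi)             # neither is H_ROI
mp_ok = hashlib.sha512(from_bits(mp_bits)).digest() == from_bits(h_m_bits)
dp_ok = hashlib.sha512(from_bits(dp_bits)).digest() == from_bits(h_d_bits)
```

With this layout the packed ROI sequence is 3 × 18 + 3 × 512 bits long, and a single flipped bit in the recovered Mp or Dp makes the corresponding check fail.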

2.4.2 Case 2: Obtaining of the medical image approximation from the encrypted domain

A receiver that has only the encryption key Ke can directly decrypt the image to generate an approximate version of the original image; however, he or she cannot extract any of the other embedded data. This process is described as follows:

  • Step 1- Read the DICOM image file and obtain the image I'e.

  • Step 2- To rearrange the approximate image after the decryption process, the vertices of the ROI must be recovered; VROI is therefore obtained by reading the LSBs at the appointed position R(x,y). Note that this parameter can also be obtained after the decryption procedure.

  • Step 3- To decrypt the image I'e and obtain its approximate version in plaintext, the decryption process is performed to each block of pixels, according to (13).

$${B}_{i}=d\left({B}_{i}^{e},{K}_{e}\right)={B}_{i}^{e}\oplus K$$
(13)

where the decryption function \(d\left({B}_{i}^{e},{K}_{e}\right)\) implies the same operations as in (3) in a decryption mode operation. Once the image I'e has been decrypted, with VROI information, ROI and RONI pixels are arranged to their original positions and the approximate version of the medical image I' is obtained, whose visual quality is close to the original in spite of the hidden data that remains in its content.

2.4.3 Case 3: Recovery of ROI from the encrypted medical image

A receiver with both the encryption key Ke and the pseudorandom walk data hiding key Kd can extract the embedded data without error and recover the ROI losslessly, both from the encrypted medical image I'e, as follows:

  • Step 1- Read the DICOM medical image I'e and extract the VROI and IndROI parameters from the LSBs of the pixels in the appointed position R(x,y).

  • Step 2- Using the pseudorandom walk data hiding key Kd and the IndROI parameter, all data bits W concealed in the image I'e are extracted.

  • Step 3- Recover the LSBROI from the extracted data bits W and, in the order given by the secret key Kd, restore the LSBs of the encrypted ROI to its original state, i.e., the state prior to the LSB substitution.

  • Step 4- Once ROI is restored, decrypt the DICOM medical image I'e with encryption key Ke using (13).

  • Step 5- The decrypted image in Step 4 is rearranged using the information of VROI. The resultant medical image is denoted by Ir and contains the ROI information completely restored to its original form.

  • Step 6- Finally, ROI integrity can be verified using the hash value HROI and ROI restored information.

Note that the extraction of all data bits W, as well as the restoration of the ROI LSBs, can be performed after the decryption procedure or before the reconstruction of the image.

2.4.4 Case 4: Data extraction from the plaintext domain

The receiver entity with the approximate version of the medical image I' and the pseudorandom walk data hiding key Kd, can extract the additional data W from the plaintext domain, using the control parameters VROI and IndROI from the appointed position R(x,y) of the image I'. This is possible because the encryption method does not modify the first LSB planes of the pixels, allowing them to keep the embedded bits even after decryption.

In this way, the image I' is rearranged using the parameter VROI and subsequently, the receiver can extract the additional data W employing Kd and IndROI parameters.

2.4.5 Case 5: ROI recovery from the decrypted image

This case extends the capabilities of Case 4. Once the image I' has been rearranged and the additional data W have been extracted, the modified LSBs in the ROI are replaced by the bits of LSBROI using Kd and, as a consequence, the ROI is restored to its original condition. Then, employing the VROI information, an image Ir is obtained that contains the losslessly restored ROI.
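The equivalence between restoring in the encrypted domain (Case 3) and in the plaintext domain (Cases 4 and 5) can be checked on a toy ROI. The sketch below is a simplification under stated assumptions: a single SHA-256 digest of Ke serves as the keystream in place of AES-CTR, `random.Random` seeded with Kd defines the walk, and the rearrangement with VROI and the RONI payload are left out. All function names and key values are hypothetical.

```python
import hashlib
import random

def xor_msb(pixels, ke: bytes):
    """Toy cipher for 12-bit pixels: XOR the 8 MSBs, leave the 4 LSBs untouched."""
    ks = hashlib.sha256(ke).digest()              # 32 keystream bytes, enough here
    return [p ^ (k << 4) for p, k in zip(pixels, ks)]

def walk(n, kd):
    order = list(range(n))
    random.Random(kd).shuffle(order)
    return order

def set_lsbs(pixels, bits, kd):
    out = list(pixels)
    for pos, bit in zip(walk(len(out), kd), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out

def get_lsbs(pixels, count, kd):
    return [pixels[pos] & 1 for pos in walk(len(pixels), kd)[:count]]

ke, kd = b"encryption-key", 77                    # hypothetical keys
roi = [1200, 873, 4095, 0, 2048, 77, 3001, 512]   # toy ROI pixels
payload = [1, 0, 0, 1, 1]

encrypted = xor_msb(roi, ke)
lsb_roi = get_lsbs(encrypted, len(payload), kd)   # original LSBs, stored in the RONI
marked = set_lsbs(encrypted, payload, kd)         # encrypted image with hidden data

# Case 3: restore the LSBs in the encrypted domain, then decrypt
case3 = xor_msb(set_lsbs(marked, lsb_roi, kd), ke)

# Cases 4 and 5: decrypt first, then extract and restore in the plaintext domain
approx = xor_msb(marked, ke)                      # approximate image
extracted = get_lsbs(approx, len(payload), kd)
case5 = set_lsbs(approx, lsb_roi, kd)
```

Both `case3` and `case5` equal the original `roi`, and `extracted` equals `payload`, because the cipher never touches the bit plane that carries the hidden data; the approximate image differs from the original by at most one gray level per pixel.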

3 Results and discussion

To evaluate the performance of the algorithm applied to DICOM imaging, this section presents the experimental results and discussion of the proposed scheme. A set of 200 computed tomography (CT) scans in DICOM format [27] was used, composed of skull, thorax and abdomen images of 512 × 512 pixels in size and 12 bit/pixel grayscale resolution. Considering the spatial and grayscale resolutions of CT images, experiments were performed using a ROI with a size of 25% of the spatial resolution of the original image.

Although the schemes in [9,10,11,12,13] are more recent and have obtained competitive results, they cannot be directly applied to medical imaging, since they rely on the spatial correlation present in the image pixels. As can be seen in Fig. 6a, the regions that appear to be flat in a DICOM image are represented by very different intensities, and the pixel values lie in a range from 0 to 1023, 4095, or 65,535, because DICOM images can have a depth of 10, 12 or 16 bit/pixel, depending on the modality. In contrast, the intensities of neighboring pixels in a natural image are usually close to each other and lie in a range from 0 to 255, as shown in Fig. 6b, which allows the use of methods based on predicting pixel values or calculating the difference between pixels. Therefore, the comparison of results is restricted to schemes that have been designed for medical images.

Fig. 6 Samples of pixel values from 3 × 3 flat regions (a) from a DICOM CT scan with 12 bit/pixel grayscale resolution, and (b) from the conventional image Lena with 8 bit/pixel grayscale resolution

The lengths of the parameters VROI and IndROI are as follows. The VROI control parameter consists of 4 coordinates represented by 10 bits each, for a length of 40 bits, while IndROI is a sequence of 18 bits that indicates the pixel index at which the ROI begins, giving a total length of VROI + IndROI = 58 bits. These control data are embedded into the LSBs of the first 58 pixels of the medical image. To perform data extraction, the control information LROI and LRONI is each formed by 3 indexes, which identify the beginning and the end of the binary sequences embedded in the ROI and RONI, respectively; each index is represented by 18 bits, hence the length of each of LROI and LRONI is 3 × 18 = 54 bits. Since SHA-512 produces an output of 512 bits, the total amount of data embedded in the ROI according to (9) is WROI = LROI + HROI + HM + HD = 54 + 512 + 512 + 512 = 1590 bits; consequently, the string LSBROI of original LSBs needed to recover the ROI also has a length of 1590 bits. The sensitive patient information Mp is obtained from the metadata of the DICOM medical images and is composed of the fields associated with the name, date of birth, gender, and patient identifier. Meanwhile, the size of the patient diagnosis Dp is variable and, for illustrative purposes, its length was adjusted with pseudorandom data in the experiments.
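The bit budget above can be reproduced with a few lines of arithmetic. The first part restates the lengths given in the text; the second part is an extrapolation that assumes a 512 × 512 image, a ROI of 25%, a single LSB plane, and one payload bit in every non-reserved RONI pixel. The resulting room for Mp and Dp is therefore an estimate under those assumptions, not a figure reported in the experiments.

```python
# Control data hidden by the content owner in the first RONI pixels
V_ROI_BITS = 4 * 10                                # 4 coordinates of 10 bits
IND_ROI_BITS = 18
reserved_bits = V_ROI_BITS + IND_ROI_BITS          # 58

# Payload concealed in the ROI, following (9)
INDEX_BITS, HASH_BITS = 18, 512
l_roi_bits = l_roni_bits = 3 * INDEX_BITS          # 54 each
w_roi_bits = l_roi_bits + 3 * HASH_BITS            # 1590
lsb_roi_bits = w_roi_bits                          # one original LSB per embedded bit

# Assumed scenario: 512 x 512 image, ROI of 25%, first LSB plane only
total_pixels = 512 * 512
index_fits = total_pixels <= 2 ** IND_ROI_BITS     # 18 bits can address every pixel
roi_pixels = total_pixels // 4
roni_pixels = total_pixels - roi_pixels

roni_room = roni_pixels - reserved_bits            # bits available in the RONI
patient_room = roni_room - l_roni_bits - lsb_roi_bits   # bits left for Mp and Dp
max_rate_bpp = (w_roi_bits + roni_room) / total_pixels
```

Under these assumptions the upper bound on the embedding rate lies slightly above 0.75 bpp, which is in line with the 0.01 to 0.7 bpp range used in the timing experiments.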

All tests were carried out on a personal computer with the Microsoft Windows 10 © operating system, an Intel © Core ™ i7 (2.66 GHz) processor and 8 GB of RAM, and the algorithms of the proposed scheme were implemented in MATLAB © R2017b. For the encryption stage, the average processing time was 10.06 s. The restoration of the ROI with embedding rates of 0.01 to 0.7 bits per pixel (bpp) took between 9.95 and 11.24 s in the encrypted domain and between 0.53 and 1.48 s in the plaintext domain. Considering an embedding rate of 0.5 bpp in the 1st LSB plane and a ROI size of 25% of the original image, Fig. 7 shows five test DICOM medical images used in the proposed scheme together with their encrypted versions with hidden data, as well as their approximate and recovered versions with the ROI restored. The DICOM images in Fig. 7 were rendered using the specialized software Radiant DICOM Viewer ©.

Fig. 7

(a) Original test medical images, (b) Encrypted medical images with data hidden, (c) Approximate images without ROI restored and (d) Images with restored ROI. Embedding rate of 0.5 bpp in the 1st LSB plane and a ROI size of 25% of the original image

The performance of the proposed algorithm is evaluated in terms of imperceptibility using several embedding rates for the first 3 LSB planes. A widely used conventional metric to evaluate the visual quality of images carrying concealed data is the Peak Signal to Noise Ratio (PSNR) [32], given by (14):

$$PSNR\left(dB\right)=10{\mathrm{log}}_{10}\left(\frac{N\cdot M\cdot {\left(Max\ Pixel\ Value\right)}^{2}}{\sum_{x=1}^{N}\sum_{y=1}^{M}{\left(I\left(x,y\right)-{I}^{\prime}\left(x,y\right)\right)}^{2}}\right)$$
(14)

where N and M are the image dimensions, Max Pixel Value is the maximum value a pixel can take (for example, 4095 for a 12 bit/pixel image), and I and I' are the original image and the image with hidden data, respectively.
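As an illustration of (14), a minimal NumPy sketch is given below. The function name `psnr` and the `bit_depth` parameter are assumptions made for this example; the peak value is taken as 2^bit_depth − 1, which is one common convention and may differ from the one used in the authors' MATLAB code.

```python
import numpy as np

def psnr(original, marked, bit_depth=12):
    """PSNR in dB following (14); peak is assumed to be 2**bit_depth - 1."""
    original = np.asarray(original, dtype=np.float64)
    marked = np.asarray(marked, dtype=np.float64)
    peak = 2 ** bit_depth - 1
    # Sum of squared errors over all N x M pixels (denominator of (14)).
    sse = np.sum((original - marked) ** 2)
    if sse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * np.log10(original.size * peak ** 2 / sse)

# Toy usage: a 4 x 4 image in which a single pixel differs by one gray level.
reference = np.zeros((4, 4), dtype=np.int64)
modified = reference.copy()
modified[0, 0] = 1
print(psnr(reference, modified))
```

Because the peak value enters squared, the same absolute error yields a much higher PSNR on a 12 or 16 bit/pixel DICOM image than on an 8 bit/pixel natural image, which should be kept in mind when comparing values across bit depths.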

In this context, Fig. 8 shows a performance comparison in terms of PSNR using test images 1, 2, 3 and 4 shown in Fig. 7, with several embedding rates from 0 to 0.5 bpp. The RDH-ED methods included in the comparison are [6,7,8, 22] and [23]. The PSNR in Fig. 8 is measured directly on the decrypted images that still carry the concealed data. Fig. 8 shows that the imperceptibility decreases quickly in the works [6,7,8] and [22] for embedding rates below 0.1 bpp, where the PSNR already drops below 80 dB. This indicates that the methods [6,7,8] and [22] inevitably introduce visual distortion as the embedding rate increases; consequently, they do not seem suitable for medical imaging scenarios that require a high data embedding capacity.

Fig. 8

PSNR comparison obtained from directly decrypted images with the methods Zhang [6], Hong et al. [7], Zhang [8], Lavanya et al. [22] and Liu et al. [23]

On the other hand, Fig. 8 shows that the RDH-ED method reported in [23] and the one proposed in this paper both obtain PSNR values greater than 102 dB for embedding rates from 0 to 0.5 bpp, avoiding any visual distortion of the medical images while allowing a high data embedding capacity. Based on these results, and considering that the algorithm presented by Liu et al. [23] is, to the best of our knowledge, the most recent and relevant method in the context of RDH-ED for DICOM medical images, from now on the comparison is performed between the method of [23] and our proposed scheme.

Considering embedding rates of 0.005, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4 and 0.5 bpp, LSB planes 1, 2 and 3, and decrypted images with and without restored ROI, the comparison between the average PSNR values obtained with the scheme of Liu et al. [23] and with the proposed one is shown in Tables 1 and 2, respectively. Tables 1 and 2 show that the visual quality provided by our proposed method exceeds that offered by Liu et al. [23] in terms of PSNR. When the embedding rate is small (less than 0.1 bpp), the PSNR differences between our proposed RDH-ED scheme and Liu et al. [23] are around 1.5–7.5 dB; they become smaller as the embedding rate increases, falling to approximately 1.5–0.15 dB.

Table 1 Average PSNR comparison obtained in approximate images for the first three LSB planes with respect to several embedding rates
Table 2 Average PSNR comparison obtained in recovered images for the first three LSB planes with respect to several embedding rates

To evaluate the performance of the proposed method more rigorously, we also use metrics reported in the literature that assess the visual quality of an image more accurately than the PSNR, based on the perceptible distortions of an image with respect to a reference. One of these metrics is the Structural Similarity Index (SSIM) [33], defined by (15):

$$SSIM(I,I^{\prime}) = \frac{{\left( {2\mu_{I} \mu_{I^{\prime}} + C_{1} } \right)\left( {2\sigma_{II^{\prime}} + C_{2} } \right)}}{{\left( {\mu_{I}^{2} + \mu_{I^{\prime}}^{2} + C_{1} } \right)\left( {\sigma_{I}^{2} + \sigma_{I^{\prime}}^{2} + C_{2} } \right)}}$$
(15)

In (15), I is the original medical image and I' is the decrypted medical image with or without restored ROI; \(\mu_{I}\) and \(\mu_{I^{\prime}}\) are their mean intensities, \(\sigma_{I}^{2}\) and \(\sigma_{I^{\prime}}^{2}\) their variances, and \(\sigma_{II^{\prime}}\) their covariance, while C1 and C2 are small constant values defined in [33].
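To make the structure of the SSIM index concrete, the sketch below evaluates it once over the whole image from the means, variances and covariance of the two images. This is a simplification for illustration: SSIM is usually computed over local windows and then averaged, and the source does not state which variant was used. The function name `global_ssim` is hypothetical, and the constants follow the commonly used defaults C1 = (0.01·L)² and C2 = (0.03·L)², with L the peak pixel value, which are assumed here to correspond to the definitions in [33].

```python
import numpy as np

def global_ssim(x, y, bit_depth=12, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (illustrative simplification)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    peak = 2 ** bit_depth - 1
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2  # assumed default constants
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    # Covariance between the two images (the cross term of the index).
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Toy usage: an image compared with itself and with a spatially reversed copy.
image = np.arange(16).reshape(4, 4)
print(global_ssim(image, image), global_ssim(image, image[::-1, ::-1]))
```

The cross term is what distinguishes SSIM from a comparison of per-image statistics: two images with identical means and variances can still score low if their structures are uncorrelated.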

Another well-known criterion to determine the level of fidelity of a processed image with respect to the original, based on a model of the human visual system (HVS), is the Visual Information Fidelity (VIF) [32], given by (16):

$$VIF = \frac{{\sum\limits_{k \in channels} {I(\overrightarrow {C}^{Z,k} ;\overrightarrow {G}^{Z,k} |s^{Z,k} )} }}{{\sum\limits_{k \in channels} {I(\overrightarrow {C}^{Z,k} ;\overrightarrow {E}^{Z,k} |s^{Z,k} )} }}$$
(16)

where E and G refer to the visual signals of the original medical image and the processed one, respectively, obtained from the HVS model and from which the brain extracts cognitive information. \(I\left({\overrightarrow{C}}^{Z,k};{\overrightarrow{G}}^{Z,k}|{s}^{Z,k}\right)\) denotes the information that, theoretically, the brain extracts from a specific channel of the processed image, while \(I\left({\overrightarrow{C}}^{Z,k};{\overrightarrow{E}}^{Z,k}|{s}^{Z,k}\right)\) corresponds to the information extracted from the original image.

Both the SSIM and the VIF metrics assess perceptible distortions in a given image more accurately than the conventional PSNR, and both take values in the range [0, 1]. SSIM and VIF values close to 1 indicate a good visual quality with respect to the original image. The average values of SSIM and VIF obtained with embedding rates ranging from 0.01 to 0.6 bpp for each of the first three LSB planes are shown in Tables 3 and 4, for directly decrypted images and for those with the ROI completely restored, respectively.

Table 3 Comparison of SSIM and VIF average values obtained in the approximate images for the first three LSB planes with respect to several embedding rates
Table 4 Comparison of SSIM and VIF average values obtained in the recovered images for the first three LSB planes with respect to several embedding rates

Regarding the above analysis of the imperceptibility and capacity of the proposed method, several points should be noted. Since the number of bits embedded into the ROI (WROI) is fixed, the number of bits required for ROI restoration (LSBROI) does not depend on the size of the selected ROI area. However, a large ROI selected in the preprocessing stage limits the embedding capacity for the patient data Mp and the patient diagnosis Dp, because the number of pixels belonging to the RONI (where Mp and Dp are concealed) decreases as the selected ROI grows. Thus, for images of 512 × 512 pixels with 12 bit/pixel grayscale resolution, the proposed scheme requires a ROI of at least 40 × 40 pixels so that it can store the 1,590 bits of WROI; this leaves 260,544 RONI pixels, which can host up to 260,544 bits, corresponding to a maximum embedding capacity of 0.99 bpp in a single LSB plane. On the other hand, the maximum size of the ROI depends on the amount of information that the data hider wants to embed; in the case where no additional data are embedded and only the control information (1,648 bits) is stored in the RONI, the ROI can comprise up to 260,496 pixels, so that, like the RONI, the ROI can occupy at most 99% of the image.
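The capacity limits quoted above follow from simple pixel counting. The sketch below reproduces them under two assumptions made only for this illustration: the ROI is square, and each pixel carries one bit in a single LSB plane. The variable names are hypothetical.

```python
import math

WIDTH = HEIGHT = 512   # image size considered in the text
LSB_ROI_BITS = 1590    # bits that must fit inside the ROI
CONTROL_BITS = 1648    # control information stored in the RONI

total_pixels = WIDTH * HEIGHT

# Smallest square ROI side s such that s * s >= LSB_ROI_BITS (one bit per pixel).
min_roi_side = math.isqrt(LSB_ROI_BITS - 1) + 1
roni_pixels = total_pixels - min_roi_side ** 2
max_rate_bpp = roni_pixels / total_pixels

# Largest ROI when the RONI only has to hold the control information.
max_roi_pixels = total_pixels - CONTROL_BITS
max_roi_fraction = max_roi_pixels / total_pixels

print(min_roi_side, roni_pixels, round(max_rate_bpp, 2))
print(max_roi_pixels, round(max_roi_fraction, 2))
```

Using further LSB planes would raise the number of bits per pixel, so the figures above should be read as single-plane limits.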

Finally, Tables 1, 2, 3 and 4 show that the proposed scheme allows the creation of approximate versions, as well as recovered versions with restored ROI, both with high visual quality, obtaining average values of PSNR, SSIM and VIF greater than 101 dB, 0.98 and 0.97, respectively, when the embedding rate is 0.5 bpp in the first LSB plane. This shows that a receiver that possesses only the decryption key Ke is able to obtain decrypted DICOM medical images with hidden data that are visually identical to the original ones.

Another aspect analyzed is the capability of the proposed RDH-ED scheme to remain unnoticed under steganalysis. For this purpose, we employed the Pairs of Values (PoV) steganalysis method [34], because it is one of the most efficient algorithms for detecting hidden data when the embedding technique is LSB substitution. In general terms, PoV is based on a statistical analysis using the probability density function (PDF) of the χ2 distribution [34]. Considering embedding rates ranging from 0 to 0.6 bpp and the first three LSB planes of the medical images, Fig. 9a shows the behavior of the χ2 distribution for the approximate versions, while Fig. 9b shows the results obtained from the recovered versions with restored ROI. As shown in Fig. 9, the pseudorandom walk implemented in the data embedding stage of the proposed RDH-ED method yields higher PDF values than the sequential walk used in the method of Liu et al. [23]. Since a low PDF value means a high probability of detecting the presence of hidden data in an image, Fig. 9a and b show that, compared with [23], our proposed RDH-ED method is better at keeping the data concealed in medical images unnoticed under steganalysis, which demonstrates the benefit of using a pseudorandom walk in RDH-ED schemes.
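For readers unfamiliar with the pairs-of-values test, the sketch below computes the χ² statistic as it is commonly described for this kind of attack: each pair of gray levels (2k, 2k + 1) is expected to have equal frequencies after LSB substitution, so the observed count of the even value is compared with the pair average. This is an assumption-laden illustration, not the procedure of [34] as used by the authors; in particular, how the statistic is mapped to the PDF values plotted in Fig. 9 is not reproduced here, and the function name `pov_chi_square` is hypothetical.

```python
import numpy as np

def pov_chi_square(pixels, bit_depth=12):
    """Chi-square statistic over pairs of values (2k, 2k+1).

    Assumes non-negative integer pixels below 2**bit_depth.
    Returns the statistic and the degrees of freedom.
    """
    values = np.asarray(pixels, dtype=np.int64).ravel()
    hist = np.bincount(values, minlength=2 ** bit_depth).astype(np.float64)
    even, odd = hist[0::2], hist[1::2]
    # Under full LSB substitution both members of a pair are equally likely.
    expected = (even + odd) / 2.0
    used = expected > 0  # skip pairs that never occur in the image
    statistic = np.sum((even[used] - expected[used]) ** 2 / expected[used])
    dof = int(used.sum()) - 1
    return statistic, dof

# Toy usage with a 2-bit "image": perfectly balanced pairs give a statistic of 0.
print(pov_chi_square(np.array([0, 1, 2, 3, 0, 1]), bit_depth=2))
```

A statistic close to zero means the pair frequencies are nearly equalized, which is the footprint that LSB substitution tends to leave in the pixels it has visited.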

Fig. 9

Comparison of the values of PoV steganalysis obtained from (a) approximate images and (b) images with restored ROI, for several embedding rates

To complete the evaluation, Table 5 presents a performance comparison with the state of the art reported in [6,7,8, 22] and [23] in terms of the capability to be separable (data extraction and image recovery can be done from only one of the encrypted or plaintext domains, not from both), the ability to be completely separable (data extraction and image recovery can be performed from either the encrypted domain or the plaintext domain), and the presence of errors in data extraction and image recovery. Table 5 shows that the methods of Zhang [6], Hong et al. [7] and Lavanya et al. [22] are not separable in their stages of data extraction and image recovery, whereas the works of Zhang [8], Liu et al. [23] and our proposed method are separable. Moreover, our proposed RDH-ED method is completely separable, while the other schemes are not. Regarding errors in data extraction, Zhang [8], Liu et al. [23] and our proposed method are error-free, while the rest of the methods present errors in the extracted bits. Finally, concerning errors in image recovery, Lavanya et al. [22], Liu et al. [23] and our proposed method present errors in the recovered images only in the RONI content, because these three RDH-ED methods are ROI-based, i.e., they recover only the ROI information intact. The remaining methods [6, 7] and [8] contain errors in the recovered image in both the ROI and RONI areas. Although the results obtained are very similar to those of [23], the proposed method increases the recovery options of the receiver, since it allows restoring the image and extracting the data from the plaintext domain of the approximate image without losing embedding capacity or visual quality of the approximation, which is not possible with the work of [23]. Additionally, the proposed scheme provides a higher level of protection against steganalysis, thanks to the implementation of the LSB substitution in pseudorandom order.
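The idea of LSB substitution in pseudorandom order can be sketched as follows. The example is a generic illustration rather than the authors' algorithm: the function names are hypothetical, the visiting order is a permutation drawn from NumPy's default generator seeded with a shared integer, and that generator is not cryptographically secure, so a real scheme would derive the order from the data hiding key with a secure construction.

```python
import numpy as np

def embed_lsb(pixels, bits, seed, plane=0):
    """Write `bits` into bit `plane` of pixels visited in a seeded random order."""
    flat = np.asarray(pixels).astype(np.int64).ravel().copy()
    bits = np.asarray(bits, dtype=np.int64)
    # Same seed -> same permutation, so the receiver can retrace the walk.
    order = np.random.default_rng(seed).permutation(flat.size)[: bits.size]
    flat[order] = (flat[order] & ~(1 << plane)) | (bits << plane)
    return flat.reshape(np.shape(pixels))

def extract_lsb(pixels, n_bits, seed, plane=0):
    """Read back `n_bits` from the same pseudorandom positions."""
    flat = np.asarray(pixels).astype(np.int64).ravel()
    order = np.random.default_rng(seed).permutation(flat.size)[:n_bits]
    return (flat[order] >> plane) & 1

# Toy usage: hide 8 bits in an 8 x 8 image and read them back.
cover = np.arange(64).reshape(8, 8)
message = np.array([1, 0, 1, 1, 0, 0, 1, 0])
marked = embed_lsb(cover, message, seed=1234)
print(extract_lsb(marked, message.size, seed=1234))
```

Spreading the modified pixels over the whole image, instead of filling it sequentially, avoids concentrating the statistical footprint of the embedding in one contiguous region.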

Table 5 Performance comparison

4 Conclusions

To protect the privacy of a medical image, as well as the personal patient information associated with it, in this paper we propose a reversible data hiding scheme for encrypted medical images whose reversibility, unlike related works in the state of the art, is completely separable: it allows the extraction of the additional data and the restoration of the region of interest from both the plaintext domain and the encrypted domain of DICOM medical images. Depending on the keys available, a legitimate receiver can perform the following tasks: a) with the encryption key, obtain an approximate image of high visual quality with respect to the original version by directly decrypting the cryptogram; b) with the data hiding key, extract the embedded data free of any error, either from the encrypted image or from its approximate version; and c) with both keys, extract the embedded data and obtain the recovered image with the region of interest fully restored, without loss of information. In this context, the proposed RDH-ED method is suitable for applications where the information security and the management of medical images need to be ensured in terms of reliability, integrity, and confidentiality.

The high visual quality of DICOM images with restored ROI has been demonstrated by obtaining average values of PSNR, SSIM and VIF higher than 101 dB, 0.98 and 0.97, respectively. These values indicate that the medical images do not present distortions perceptible to the human eye that could alter their visual content and, as a result, lead to an erroneous diagnosis. Moreover, a high capacity for concealing data is provided, with embedding rates up to 0.6 bpp in the first three LSB planes. Information security is also improved by using the AES cipher in CTR operation mode to obtain the encrypted domain, the SHA-512 algorithm to verify data integrity, and a pseudorandom walk in data embedding to remain unnoticed under steganalysis. A performance comparison with the most recent work reported in the state of the art was provided, demonstrating the superiority of our proposed method in terms of the visual quality of the obtained medical images, as well as an improvement in the versatility of separability, since data extraction and ROI restoration can be performed from either the encrypted domain or the plaintext domain, preserving in all cases the performance in terms of data embedding capacity.

As future work, we consider the implementation of a reversible data hiding technique that allows the restoration of the whole image, or the use of lossless data compression methods, in order to achieve total reversibility and not only that of the region of interest (ROI). We also plan to extend the application of our method to other modalities of medical imaging, such as magnetic resonance imaging (MRI), radio fluoroscopy (RF) and computed radiography (CR), among others.