1 Introduction

Medical systems are steadily migrating to cloud and mobile environments. In telemedicine, a doctor examines medical images together with patient data transmitted from remote locations, which expedites diagnosis and allows immediate treatment. Secure transmission must address authentication, integrity, confidentiality and availability. The Department of Health and Human Services (DHHS) has imposed regulations for data security and privacy under the Health Insurance Portability and Accountability Act (HIPAA) of 1996 [13].

Cryptographic techniques encrypt the secret records with a password and assume that only authorized parties have access to the password [4]. While this works most of the time, the encrypted data remains exposed to attackers, who could decipher sensitive information such as the patient's insurance service provider or medication history.

Steganography offers an alternative: it hides the very existence of sensitive data by concealing it in nondescript areas of a carrier image, so that the changes made to the image are imperceptible and the secret information can be retrieved only by an authorized person [5–10]. The performance of a steganography method is judged by three criteria: secrecy, capacity, and visual imperceptibility [11]. Secrecy protects the data from unauthorized attackers or intruders. The hiding capacity should be sufficient to conceal the data in a cover image. The visual quality of the stego image should be such that no one can perceive the embedding [12].

One of the most prominent concerns in information and communication technology is the security of medical records and patients' personal information, such as unique ID, patient name and disease information.

In the least significant bit (LSB) procedure, secret data bits are substituted into the least significant bit positions of every byte of the cover file [13–15]. An LSB-array-based mechanism has also been used, which combines the LSBs of different pixels into an LSB array [16].

Parvez et al. described a steganography method that uses the RGB intensity values of a pixel. They suggested using one channel as an indicator and the remaining two channels to hide the secret data bits; the last two bits of the indicator channel indicate how the data is hidden in the other two channels [17]. Alseelawi et al. proposed an LSB-based steganography technique that uses the alpha channel of JPG cover images [18]. Thiyagarajan et al. proposed a steganography method for hiding patient information inside a medical cover image using a dynamic key produced by the graph 3-coloring problem [19].

Anderson et al. discussed some limitations of steganography methods [20] and took an information-theoretic approach, using Shannon's theory, to the perfect security of data. Steganography and cryptography are therefore used together to meet the security challenges of transmitting medical records and patients' personal information over the internet and other networks [11]. The encrypted message block is mapped to the LSB array, and a matching process is applied to find the maximum matching portion for embedding the secret data [21, 22].

Wang et al. proposed a fuzzy model based on learning from big data with uncertainty in data mining [23–27].

The proposed method provides a two-level security paradigm. First, the medical records and the patient's personal information are encrypted using the RSA cryptography algorithm; then the encrypted data is concealed in the least significant bits of the cover image using the decision tree matching concept. The strength of steganography is thus increased by cryptography. Hiding the data in the least significant bits of the cover image does not noticeably affect its visual quality.

The rest of the paper is organized as follows. Sect. 2 discusses the proposed techniques for image steganography using a decision tree and the RSA cryptosystem. Sect. 3 presents the problem formulation and work methodology. Sect. 4 reports the results, and Sect. 5 provides the discussion and analysis. Finally, Sect. 6 concludes the work.

2 Proposed techniques for image steganography using decision tree and RSA cryptosystem

This section describes the techniques used in the proposed method.

2.1 RSA cryptosystem

The RSA algorithm involves three steps: key generation, encryption and decryption [4].

The RSA algorithm can be illustrated as follows:

Step 1: Select two large strong prime numbers, \(p\) and \(q\). Calculate

$$n = p*q$$
(1)

Step 2: Compute Euler’s totient value for \({\text{n}}\)

$${\text{f }}({\text{n}}) = ({\text{p }} - 1) * ({\text{q }} - 1)$$
(2)

Step 3: Choose a number \({\text{e}}\) satisfying \(1 \,< \,\text{e} \,< \,\text{f}(\text{n})\) that is relatively prime to \(\text{f}(\text{n})\)

$${\text{gcd }}({\text{e}}, {\text{f }}({\text{n}})) = 1$$
(3)

Step 4: Calculate a number \({\text{d}}\) such that

$${\text{d }} = {\text{e}}^{ - 1} {\text{ mod f }}({\text{n}})$$
(4)

Step 5: The receiver's public key and private key are \(({\text{e}}, {\text{n}})\) and \(({\text{d}}, {\text{n}})\) respectively.

Step 6: Encryption: Given a plain text message m satisfying \(\text{m} \,< \,\text{n}\), the cipher text is

$${\text{c }} = {\text{m}}^{\text{e}} {\text{mod n}}$$
(5)

Step 7: The cipher text is decrypted by

$${\text{m }} = {\text{c}}^{\text{d}} {\text{mod n}}$$
(6)
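The seven steps above can be traced with a small worked example. The following is a minimal sketch in Python, not the authors' MATLAB implementation; the function names and the toy primes 61 and 53 are assumptions chosen for illustration, and such small primes offer no security.

```python
from math import gcd

def generate_keys(p, q, e):
    """Steps 1-5: return ((e, n), (d, n)) for distinct primes p and q."""
    n = p * q                          # Eq. (1)
    phi = (p - 1) * (q - 1)            # Eq. (2), written f(n) in the text
    if not (1 < e < phi and gcd(e, phi) == 1):   # Eq. (3)
        raise ValueError("e must satisfy 1 < e < f(n) and gcd(e, f(n)) = 1")
    d = pow(e, -1, phi)                # Eq. (4): modular inverse (Python 3.8+)
    return (e, n), (d, n)

def encrypt(m, public_key):
    """Step 6: c = m^e mod n, for a message m < n."""
    e, n = public_key
    if not 0 <= m < n:
        raise ValueError("message must satisfy m < n")
    return pow(m, e, n)                # Eq. (5)

def decrypt(c, private_key):
    """Step 7: m = c^d mod n."""
    d, n = private_key
    return pow(c, d, n)                # Eq. (6)

# Toy parameters (illustrative only, far too small for real use).
public_key, private_key = generate_keys(61, 53, 17)
cipher = encrypt(65, public_key)
plain = decrypt(cipher, private_key)
```

With these toy values, n = 3233 and f(n) = 3120, and decrypting the cipher text returns the original message 65.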

2.2 Security of the RSA cryptosystem

The security of the RSA cryptosystem rests on the difficulty of factoring a large composite number that is the product of two large primes. A secure padding scheme must be added to protect against partial decryption. The RSA problem is the task of taking \(e\)-th roots modulo a composite number \(n.\)

The task is to recover the message \({\text{m}}\) such that

$$c \equiv {\text{m}}^{\text{e}} ({\text{mod }}n)$$
(7)

where \((n, e)\) is an RSA public key and \(c\) is the RSA cipher text.

2.3 Decision tree

A decision tree is used for classification, prediction and deriving decision-making rules in linear decision problems. It is a graphical representation that shows decisions and their feasible outcomes. A decision tree consists of three types of nodes:

  1. Decision node: It is represented by a square. The outgoing lines of a square show all the options available at that node.

  2. Chance node: It is represented by a circle and gives chance outcomes. Chance outcomes are events that can occur.

  3. Terminal node: It is represented by a triangle. Terminal nodes describe the final feasible outcomes of the decision-making process.

In a decision tree, entropy measures how mixed the classes of the data elements in a set are. The term derives from information theory. Higher entropy means more information.

$${\text{Entropy}} = - \mathop \sum \limits_{\text{i}} {\text{p}}_{\text{i}} \log_{2} {\text{p}}_{\text{i}}$$
(8)

\(p_{i}\) is the probability of class i, computed as the proportion of class i in the set.

Entropy is 0 if all members of the set belong to the same class (the data is perfectly classified). For two classes, entropy ranges from 0 ("perfectly classified") to 1 ("totally random") [23, 24]. Figure 1 shows a decision tree as a graphical representation for describing linear decision problems.

Fig. 1 Decision tree using graphical representation for describing linear decision problems

Information gain is a statistical measure of how well a given attribute separates the classes. The attribute with the highest information gain is selected first, which decides the ordering of attributes in the nodes of a decision tree [28].

The information gain \({\text{Gain}}({\text{S}}, {\text{A}})\) of set \({\text{S}}\) on attribute \({\text{A}}\) is defined as

$${\text{Gain}}({\text{S}}, {\text{A}}) = {\text{Entropy}}({\text{S}}) - \mathop \sum \limits_{\text{v}} \frac{{|{\text{S}}_{\text{v}} |}}{{|{\text{S}}|}} \, {\text{Entropy}}({\text{S}}_{\text{v}} )$$
(9)

where the sum Σ runs over each value \({\text{v}}\) of all possible values of attribute \({\text{A}}\)

\({\text{S}}_{\text{v}}\) = subset of \({\text{S}}\) for which attribute \({\text{A}}\) has value \({\text{v}}\)

\(\left| {{\text{S}}_{\text{v}} } \right|\) = number of elements in \({\text{S}}_{\text{v}}\)

\(|{\text{S}}|\) = number of elements in \({\text{S}}\)
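Eqs. (8) and (9) can be checked numerically with a short sketch. The Python code below is illustrative only; the function names, the dictionary-based row format and the toy data are assumptions, not part of the proposed method.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Eq. (8): -sum(p_i * log2(p_i)) over the classes present in labels."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Eq. (9): Entropy(S) minus the weighted entropy of each subset S_v.

    rows is a list of dicts; attribute is a key present in every dict.
    """
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        # Group the class labels by the value v that the attribute takes.
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(subset) / total * entropy(subset)
                    for subset in subsets.values())
    return entropy(labels) - remainder

# Toy data: attribute "a" separates the classes perfectly, "b" not at all.
rows = [{"a": "x", "b": "u"}, {"a": "x", "b": "w"},
        {"a": "y", "b": "u"}, {"a": "y", "b": "w"}]
labels = ["0", "0", "1", "1"]
gain_a = information_gain(rows, labels, "a")
gain_b = information_gain(rows, labels, "b")
```

In this toy set the two classes are equally likely, so the entropy of the whole set is 1; attribute "a" removes all of the uncertainty, while attribute "b" removes none, so "a" would be placed first in the tree.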

2.4 Breadth first search (BFS) technique

BFS is an algorithm for traversing or searching a tree or graph. It starts at the root of the tree or graph, explores all the neighbor nodes first, and then moves on to the next-level neighbors. BFS can be used to find a shortest path (in number of edges) between two vertices \(x\) and \(y\). It uses a queue \(q\) to hold the vertices that have been discovered but not yet explored. For a tree, the time complexity of BFS is \(O(n)\), where \(n\) is the number of nodes traversed [29]. The procedure for BFS is as follows and is shown in Fig. 2.

Fig. 2 Breadth first search technique

  1. Search for all vertices that are directly reachable from the root (called level 1 vertices).

  2. Visit all vertices that are directly reachable from any level 1 vertex (called level 2 vertices), and so on.

  3. In general, level \(p\) vertices are directly reachable from level \(p - 1\) vertices.
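The level-by-level procedure can be sketched as follows. This is a minimal Python illustration; the adjacency-dictionary representation, the function name and the small example tree are assumptions made for the sketch.

```python
from collections import deque

def bfs_levels(tree, root):
    """Return {vertex: level} in visiting order, with the root at level 0."""
    level = {root: 0}
    queue = deque([root])              # vertices discovered but not yet explored
    while queue:
        vertex = queue.popleft()
        for neighbor in tree.get(vertex, []):
            if neighbor not in level:  # visit each vertex only once
                level[neighbor] = level[vertex] + 1
                queue.append(neighbor)
    return level

# Hypothetical binary tree whose vertex names spell the bits followed so far.
tree = {
    "root": ["0", "1"],
    "0": ["00", "01"],
    "1": ["10", "11"],
}
levels = bfs_levels(tree, "root")
order = list(levels)                   # dicts preserve insertion (visiting) order
```

Because each vertex enters the queue once, the work grows linearly with the number of vertices in a tree, and every level 1 vertex is visited before any level 2 vertex.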

3 Problem formulation and work methodology

3.1 Work flow of the algorithm

The problem is to hide RSA-encrypted secret medical data in a grey-alpha channel cover image with the help of a decision tree, creating a secure crypto-stegano algorithm. Figure 3 shows the work flow of this algorithm.

Fig. 3 Work flow diagram of the methodology

3.2 Sender side methodology

3.2.1 Cover image and secret message

In the proposed system, an RGBA channel image (32 bits) and a grey channel medical image (8 bits) are selected. The alpha channel is separated from the RGBA image and merged with the grey image; the resulting grey-alpha channel medical image is used as the cover image. The cover image is divided into image blocks (except a reserved location, byte numbers 3045–4045), and each image block has 4 grey-alpha pixels (64 bits). The secret message is the patient's medical record and personal information, such as unique ID, name and disease information, which will be embedded into the cover image. The reserved bytes hold the number of blocks of secret medical data and cipher text, and the dynamic key values. These key values, together with breadth first search in the decision tree, are used to select one of the RGB channels as an indicator and a position within that channel, as well as the positions for embedding two bits per channel in the grey-alpha channel (5th–8th bit).

3.2.2 RSA cryptosystem

RSA encryption is used to encrypt the secret medical data. Before embedding, the cipher text is divided into 8-bit blocks using the dynamic key.
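The division of cipher text into 8-bit blocks can be illustrated with a short sketch. The Python code below rests on assumptions that the text does not specify: each cipher value is written as a fixed-width binary string (the hypothetical parameter `bits_per_value`), and the role of the dynamic key in ordering the blocks is left out.

```python
def to_bit_blocks(cipher_values, bits_per_value=16):
    """Write each cipher integer in fixed-width binary and cut into 8-bit blocks."""
    if bits_per_value % 8 != 0:
        raise ValueError("bits_per_value must be a multiple of 8")
    if any(not 0 <= c < 2 ** bits_per_value for c in cipher_values):
        raise ValueError("cipher value does not fit in bits_per_value bits")
    bitstring = "".join(format(c, f"0{bits_per_value}b") for c in cipher_values)
    return [bitstring[i:i + 8] for i in range(0, len(bitstring), 8)]

def from_bit_blocks(blocks, bits_per_value=16):
    """Inverse operation: join the 8-bit blocks and read back the integers."""
    bitstring = "".join(blocks)
    return [int(bitstring[i:i + bits_per_value], 2)
            for i in range(0, len(bitstring), bits_per_value)]

blocks = to_bit_blocks([2790, 65])
restored = from_bit_blocks(blocks)
```

A fixed width keeps the block boundaries unambiguous, so the receiver can convert the binary values back into integers before decryption.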

3.2.3 Decision tree

The decision tree is used as a reactive mechanism that decides which channel of the grey-alpha image to select. It is constructed from the binary values 0 and 1 using their entropy and information gain values. Binary 0 and 1 have the same entropy, so both values are used at the decision node. Here, the binary sequences range from 0000 to 1111, corresponding to the integer values 0–15. The decision tree is divided into four levels. The fourth level has eight children in total, which act as terminal nodes. The eight terminal nodes are numbered from 1 to 8, from left to right, as shown in Fig. 4.

Fig. 4 Binary decision tree

3.2.4 RGB channel image using decision tree matching process

In this process, the RGB channel image and the decision tree are used to select the grey or alpha channel for embedding the secret cipher data. The RGB channel image is divided into blocks of two pixels each. One block is selected at random at a time, together with one of its red, green and blue channels; the selected channel gives the channel byte value. Four bits of the channel byte value are then selected, in ascending order, using a random number; these four bits are called the channel value. All four bits are matched in the decision tree at the fourth level using breadth first search. One indicator value out of the eight at the fourth level is matched by following exactly the same bit sequence as the channel value. If the indicator value is from 1 to 4, the grey channel is used for embedding; otherwise the alpha channel is used. This process continues until all the secret cipher bits have been embedded. Table 1 shows an example of the RGB channel decision tree matching process.

Table 1 Example for RGB channel decision tree matching process

3.2.5 LSB embedding

Secret cipher insertion at various LSB positions in the cover image is done as follows:

  a. Embed the number of blocks of plain and cipher text, the dynamic key values and the number of matched channels in the reserved location.

  b. Convert the values of the cipher text and the grey-alpha channel medical cover image into binary form.

  c. Select the grey or alpha channel using the decision tree, as discussed in the previous section.

  d. Select two of the 5th to 8th bit LSB positions (5th and 6th, 5th and 7th, 5th and 8th, 6th and 7th, 6th and 8th, or 7th and 8th) using the dynamic key.

  e. Embed the data into the selected LSB positions.

Continue this process until the cipher data blocks are exhausted and all the secret cipher text is embedded in the grey–alpha channel medical cover image, then send the resulting stego image to the receiver.
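Steps (d) and (e), and the matching retrieval of the two bits, can be sketched for a single channel byte. This Python sketch assumes that bit positions are counted from the most significant bit (position 1) to the least significant bit (position 8), so that the 5th–8th positions form the low half of the byte; the function names are hypothetical, and how the dynamic key picks one of the six pairs is not modelled.

```python
# The six position pairs listed in step (d).
# Assumption: position 1 is the most significant bit, position 8 the LSB.
POSITION_PAIRS = [(5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]

def embed_two_bits(byte, bits, pair):
    """Write two secret bits into the two chosen positions of one channel byte."""
    if pair not in POSITION_PAIRS:
        raise ValueError("pair must be one of the six allowed position pairs")
    for bit, position in zip(bits, pair):
        mask = 1 << (8 - position)
        if bit:
            byte |= mask               # set the selected bit
        else:
            byte &= ~mask & 0xFF       # clear the selected bit
    return byte

def extract_two_bits(byte, pair):
    """Read the two bits back from the same positions."""
    return tuple((byte >> (8 - position)) & 1 for position in pair)

cover_byte = 0b10110100
stego_byte = embed_two_bits(cover_byte, (1, 1), (7, 8))
recovered = extract_two_bits(stego_byte, (7, 8))
```

Under this numbering only the four low-order bits can change, so a channel byte changes by at most 15 out of 255, and the four high-order bits of the cover byte are left intact.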

3.3 Receiver side methodology

Retrieval of the plain text from the secret cipher text, using the grey–alpha channel medical cover image, is done as follows:

  a. Retrieve the number of blocks of plain and cipher text, the dynamic key values and the number of matched channels from the reserved location.

  b. Retrieve the cipher bits from the two LSB positions of each grey-alpha channel medical cover image block, from the dynamically selected channel (either the grey or the alpha channel of the cover image).

  c. Repeat step (b) until all the secret cipher text bits are retrieved.

  d. Take all the cipher bits and arrange them in blocks according to their dynamic key value.

  e. Convert the binary values into integers.

  f. Apply RSA decryption to the secret cipher blocks to get the plain text blocks.

  g. After decryption, arrange all the medical data plain text blocks according to their dynamic key value.

This system yields exactly one plain text value for each cipher text. Only the authorized receiver can recover the correct medical information, using their private key.

4 Results

The simulation and experiments were carried out in MATLAB. The simulated results for different medical cover images and their stego images are displayed in Fig. 5, together with their histograms. The histograms show a negligible difference between the original cover image and the stego image.

Fig. 5 (1), (8), (15), (22) are RGBA images and (3), (10), (17), (24) are their RGB images respectively, (2), (9), (16), (23) are grey channel medical cover images, (4), (11), (18), (25) are grey-alpha medical cover images and (5), (12), (19), (26) are their medical stego images respectively, (6), (13), (20), (27) are histograms of grey-alpha medical cover images and (7), (14), (21), (28) are their medical stego images histograms respectively

The patient information used in this work is listed in Table 2.

Table 2 Patient medical record

PSNR (peak signal-to-noise ratio) is the ratio between the maximum power of a signal and the power of the corrupting noise. It is an important index for assessing the quality of steganography images. The original cover image acts as the signal, and the noise is the distortion introduced by the steganography mechanism. The PSNR, MSE (mean square error) and maximum embedding capacity values at different payloads for cover images of various sizes are given in Table 3. PSNR is measured in decibels (dB). A high quality stego image should aim for 40 dB and above [30].

Table 3 Observed capacity, max squared error, ratio of squared norms, MSE and PSNR (different medical cover images of same/different size with various secret cipher data of same/different size)

PSNR is defined via the mean square error (MSE) between two \(P \times Q\) monochrome images, the stego image \(SG_{xy}\) and the cover image \(CV_{xy}\), where \(x\) and \(y\) are image coordinates and one image is regarded as a noisy approximation of the other:

$$MSE = \frac{1}{PQ} \mathop \sum \limits_{x = 1}^{P} \mathop \sum \limits_{y = 1}^{Q} (SG_{xy} - CV_{xy} )^{2}$$
(10)
$$PSNR = 10 \log_{10} \left( {\frac{{CV_{max}^{2} }}{MSE}} \right)$$
(11)

where \(CV_{max}\) = 255 is the maximum pixel value for 8-bit cover images [8].
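The two quality measures can be computed with a short sketch. The Python code below uses the standard squared-difference form of the MSE and represents images as nested lists of pixel values; the function names and the tiny 2 × 2 example are assumptions for illustration and do not reproduce the values reported in Table 3.

```python
from math import inf, log10

def mse(cover, stego):
    """Mean of the squared pixel differences over a P x Q image, Eq. (10)."""
    rows, cols = len(cover), len(cover[0])
    total = sum((s - c) ** 2
                for cover_row, stego_row in zip(cover, stego)
                for c, s in zip(cover_row, stego_row))
    return total / (rows * cols)

def psnr(cover, stego, max_value=255):
    """PSNR in dB, Eq. (11); infinite when the two images are identical."""
    error = mse(cover, stego)
    if error == 0:
        return inf
    return 10 * log10(max_value ** 2 / error)

# Hypothetical 2 x 2 image: two pixels change by 1 and 2 grey levels.
cover = [[0, 255], [128, 64]]
stego = [[1, 255], [128, 66]]
error = mse(cover, stego)
quality = psnr(cover, stego)
```

Here the squared differences are 1 and 4, so the MSE is 5/4 = 1.25, and the PSNR follows from Eq. (11) with \(CV_{max}\) = 255.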

5 Discussion and analysis

The results are analyzed using Table 3. For the grey-alpha medical cover images Brain, Kidney, Lungs and Uterus of size 262 kilobytes with a secret medical data size of 256 bytes, the PSNR and MSE values range from 62.06 to 66.24 dB and from 0.0405 to 0.0403 respectively; when the medical data size increases to 1024 bytes, they range from 56.38 to 59.38 dB and from 0.3623 to 0.5741 respectively. When the grey-alpha cover image size increases to 1048 kilobytes with a secret medical data size of 256 bytes, the PSNR and MSE values range from 71.55 to 74.12 dB and from 0.0513 to 0.0443 respectively; when the secret medical data size increases to 1024 bytes, they range from 61.26 to 65.72 dB and from 0.6143 to 0.0664 respectively.

  (a) For the Brain grey-alpha cover image, the maximum embedding capacity is 65.53 and 357.37 kb, which is 26 and 34 % respectively of the image size.

  (b) For the Kidney cover image, it is 69.32 and 389.65 kb, which is 27 and 37 % respectively of the image size.

  (c) For the Lungs cover image, it is 71.45 and 396.38 kb, which is 28 and 38 % respectively of the image size.

  (d) For the Uterus image, it is 73.01 and 409.47 kb, which is 28 and 39 % respectively of the image size.

The analysis shows that increasing the size of the medical grey-alpha cover image and decreasing the size of the secret medical data raises the PSNR to as much as 74.12 dB and lowers the MSE to as little as 0.0443, while the maximum embedding capacity rises to as much as 39 %. Performance is therefore high with respect to PSNR, MSE and maximum embedding capacity. Figure 6 shows the result analysis of the proposed algorithm using various performance parameters.

Fig. 6 Result analysis of proposed algorithm using various performance parameters

Fig. 5 shows no visual artifacts in the medical stego images or their histograms; they look the same as the corresponding original medical cover images.

This image steganography approach is highly secure for sending patients' medical records and personal information, since it allocates message blocks to the medical cover image through the decision tree matching concept. The secret cipher information is embedded in dynamically selected pixels of the medical cover image using the decision tree. Embedding is done only at the 5th–8th bit LSB positions of a pixel, for better visual quality of the medical stego image. Existing intruders and attacks cannot identify the existence of the steganography [27]. Moreover, the cipher text rather than the medical plain text is hidden, which adds one more level of security. The enciphering algorithm is the RSA public key cryptography algorithm.

Table 4 compares the proposed scheme with the algorithms proposed by other researchers on the basis of minimum calculated PSNR, embedding capacity and visual imperceptibility. Compared with these algorithms, the proposed scheme is the strongest technique and can be used for securing any variety of secret medical data.

Table 4 Comparison with other researchers

6 Conclusion

In this paper, a novel secret transmission scheme is proposed that uses LSB substitution with the help of the decision tree matching concept. The algorithm also uses RSA to provide confidentiality of information in end-to-end data center communication, which makes it computationally secure against various types of attack. Medical organizations can use this multilevel security approach to protect their patients' medical records. With the multilevel encoding approach presented, the medical image itself may also be hidden. From the result and histogram analysis, it is concluded that the PSNR and MSE values and the percentage of maximum embedding capacity are better than those of some of the existing algorithms.