Abstract
This paper presents a novel technique for the secure transmission of patient medical records. The secret medical data is hidden inside a scanned grey medical image or magnetic resonance image with the help of a red, green, blue, and alpha (RGBA) image and a decision tree. In this technique, the alpha channel is separated from the RGBA image and merged with the grey medical image to improve the hiding capacity. The medical data is encrypted with the RSA cryptosystem, and the cipher text is divided into blocks using a dynamic key. In the steganography process, the grey-alpha channel medical cover image is likewise organized into blocks using a dynamic key. Secret cipher blocks are assigned to the grey-alpha channel medical cover image blocks using breadth first search and a decision tree for data embedding. Performance is evaluated using several performance measures computed between the medical stego images and their cover images.
1 Introduction
In this era, various medical systems are continuously migrating to cloud and mobile environments. In telemedicine, the doctor examines the medical image together with the patient data transmitted from a remote place, which expedites diagnosis and allows immediate treatment. Security parameters such as authentication, integrity, confidentiality and availability have to be considered for secure transmission. The Department of Health and Human Services (DHHS) imposed regulations for data security and privacy under the Health Insurance Portability and Accountability Act (HIPAA) of 1996 [1–3].
Cryptographic techniques encrypt the secret records with a password and assume that only authorized parties have access to the password [4]. While this works most of the time, the encrypted data is visibly present and remains exposed to attackers, who may decipher sensitive information such as the patient's insurance service provider, medication history, etc.
Steganography provides an alternative: it hides the very existence of sensitive data by concealing it in nondescript areas of the carrier image, such that the changes made to the image are imperceptible and the secret information can be retrieved only by an authorized person [5–10]. The performance of a steganography method is judged by three criteria: secrecy, capacity, and visual imperceptibility [11]. Secrecy protects the data from unauthorized attackers or intruders. The hiding capacity should be sufficient to conceal the data in the cover image. The visual quality of the stego image should be high enough that an observer cannot perceive that it has been altered [12].
One of the most prominent concerns in information and communication technology today is the security of medical records and of patients' personal information, such as unique ID, patient name, and disease information.
In the least significant bit (LSB) procedure, secret data bits replace the least significant bits of every byte of the cover file [13–15]. An LSB array based mechanism has also been used, which combines the LSBs of different pixels into an LSB array [16].
Parvez et al. described a steganography method based on the RGB intensity values of a pixel. They suggested using one channel as an indicator and the remaining two channels to hide secret data bits; the last two bits of the indicator channel give information about the data hidden in the other two channels [17]. Alseelawi et al. proposed a steganography technique based on the LSB method that uses the alpha channel of JPG cover images [18]. Thiyagarajan et al. proposed a steganography method for hiding patient information inside a medical cover image using a dynamic key produced by the graph 3-coloring problem [19].
Anderson et al. discussed some limitations of steganography methods [20], taking an information-theoretic approach based on Shannon's theory of perfect security. Steganography and cryptography are therefore used together to meet the security challenges of transmitting medical records and patients' personal information over the internet and other networks [11]. In related work, the encrypted message block is mapped to an LSB array, and a matching process is applied to find the maximum matching portion for embedding secret data [21, 22].
Wang et al. proposed a fuzzy model which is fundamentally based on learning from big data with uncertainty in data mining [23–27].
The proposed method provides two levels of security. First, the medical records and the patient's personal information are encrypted using the RSA cryptography algorithm; then the encrypted data is concealed in the least significant bits of the cover image using the decision tree matching concept, so that cryptography strengthens the steganography. Hiding data in the least significant bits of the cover image has little effect on its visual quality.
The remainder of the paper is organized as follows. Section 2 discusses the techniques used for image steganography with a decision tree and the RSA cryptosystem. Section 3 presents the problem formulation and work methodology. Section 4 reports the results, and Sect. 5 discusses and analyzes them. Finally, Sect. 6 concludes the work.
2 Proposed techniques for image steganography using decision tree and RSA cryptosystem
In this section, various techniques have been discussed as follows:
2.1 RSA cryptosystem
The RSA algorithm involves three steps: key generation, encryption and decryption [4].
The RSA algorithm can be illustrated as follows:
Step 1: Select two large strong prime numbers, \(p\) and \(q\). Calculate \(n = p \times q\).
Step 2: Compute Euler's totient value for \(n\): \(f(n) = (p - 1)(q - 1)\).
Step 3: Find a dynamic number \(e\) satisfying the condition \(1 < e < f(n)\) that is relatively prime to \(f(n)\).
Step 4: Calculate a number \(d\) such that \(e \cdot d \equiv 1 \pmod{f(n)}\).
Step 5: The public key and private key of the receiver are now \((e, n)\) and \((d, n)\) respectively.
Step 6: Encryption: given a plain text message \(m\) satisfying the condition \(m < n\), the cipher text is \(c = m^{e} \bmod n\).
Step 7: The cipher text is decrypted by \(m = c^{d} \bmod n\).
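The seven steps can be traced with a toy example. The following is a minimal Python sketch, not the authors' MATLAB implementation; the function names and the tiny primes (61 and 53) are illustrative assumptions, and real keys use primes of a thousand bits or more together with a padding scheme.

```python
from math import gcd


def rsa_keypair(p, q, e):
    """Build a toy RSA key pair from primes p, q and public exponent e."""
    n = p * q                      # Step 1: modulus
    totient = (p - 1) * (q - 1)    # Step 2: Euler's totient, f(n) in the text
    # Step 3: e must lie in (1, f(n)) and be relatively prime to f(n)
    if not (1 < e < totient) or gcd(e, totient) != 1:
        raise ValueError("e must satisfy 1 < e < f(n) and gcd(e, f(n)) = 1")
    d = pow(e, -1, totient)        # Step 4: modular inverse (Python 3.8+)
    return (e, n), (d, n)          # Step 5: public key, private key


def rsa_encrypt(m, public_key):
    """Step 6: cipher text c = m^e mod n, for a message m < n."""
    e, n = public_key
    if not 0 <= m < n:
        raise ValueError("message must satisfy 0 <= m < n")
    return pow(m, e, n)


def rsa_decrypt(c, private_key):
    """Step 7: plain text m = c^d mod n."""
    d, n = private_key
    return pow(c, d, n)


# Illustrative values only: p = 61, q = 53, e = 17
public_key, private_key = rsa_keypair(61, 53, 17)
cipher = rsa_encrypt(65, public_key)
plain = rsa_decrypt(cipher, private_key)
```

With these values the modulus is 3233 and the private exponent is 2753, and decrypting the cipher text returns the original message 65.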
2.2 Security of the RSA cryptosystem
The security of the RSA cryptosystem is fundamentally based on the difficulty of factoring the product of two large prime numbers. Providing security against partial decryption may require the addition of a secure padding scheme. The RSA problem is described as the task of taking \(e\)th roots modulo a composite number \(n\), that is, recovering a value of message \(m\) such that
\(c \equiv m^{e} \pmod{n}\)
where \((n, e)\) is an RSA public key and \(c\) is the RSA cipher text.
2.3 Decision tree
A decision tree is used for classification, prediction and deriving decision-making rules in linear decision problems. It is a graphical representation that shows decisions and their feasible outcomes. A decision tree consists of three types of nodes:
1. Decision node: It is represented by a square. The outgoing lines of a square show all the options available at that node.
2. Chance node: It is represented by a circle and gives chance outcomes. Chance outcomes are events that can occur.
3. Terminal node: It is represented by a triangle. Terminal nodes describe the final feasible outcomes of the decision-making process.
In a decision tree, entropy measures the degree of heterogeneity of the data elements in a set. The term derives from information theory: higher entropy means more information. For a set \(S\), it is defined as
\({\text{Entropy}}({\text{S}}) = - \sum\nolimits_{i} p_{i} \log_{2} p_{i}\)
where \(p_{i}\) is the probability of class i, computed as the proportion of class i in the set.
Entropy is 0 if all members of the set belong to the same class (the data is perfectly classified). For two classes, entropy ranges from 0 (“perfectly classified”) to 1 (“totally random”) [23, 24]. Figure 1 shows a decision tree in graphical form for describing linear decision problems.
Information gain is a statistical measure of how well a given attribute separates the different classes. The attribute with the highest information gain is selected first, which decides the ordering of attributes in the nodes of a decision tree [28].
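As a small illustration of these two boundary cases, the following Python sketch computes the entropy of a list of class labels. The function name and the sample labels are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum of -p_i * log2(p_i), where p_i is the proportion of class i
    return sum(-(count / total) * log2(count / total)
               for count in counts.values())


pure = entropy([1, 1, 1, 1])     # every member in the same class
mixed = entropy([0, 1, 0, 1])    # two classes in equal proportion
```

A set with a single class yields 0, and an evenly split two-class set yields 1, which matches the range stated above.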
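The information gain \({\text{Gain}}({\text{S}}, {\text{A}})\) of set \({\text{S}}\) on attribute \({\text{A}}\) is defined as
\({\text{Gain}}({\text{S}}, {\text{A}}) = {\text{Entropy}}({\text{S}}) - \sum\nolimits_{{\text{v}}} \frac{{\left| {{\text{S}}_{\text{v}} } \right|}}{{|{\text{S}}|}}{\text{Entropy}}({\text{S}}_{\text{v}} )\)
where: Σ runs over each value \({\text{v}}\) of all possible values of attribute \({\text{A}}\)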
\({\text{S}}_{\text{v}}\) = subset of \({\text{S}}\) for which attribute \({\text{A}}\) has value \({\text{v}}\)
\(\left| {{\text{S}}_{\text{v}} } \right|\) = number of elements in \({\text{S}}_{\text{v}}\)
\(|{\text{S}}|\) = number of elements in \({\text{S}}\)
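The definition can be checked on a toy data set. This Python sketch is illustrative only: the attribute names `bit` and `noise` and the labels `grey` and `alpha` are assumptions chosen for the example, not data from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return sum(-(c / total) * log2(c / total)
               for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Gain(S, A): entropy of S minus the weighted entropy of each subset S_v."""
    total = len(labels)
    subsets = {}
    # Group the labels by the value v that the attribute takes in each row
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


rows = [
    {"bit": 0, "noise": 0},
    {"bit": 0, "noise": 1},
    {"bit": 1, "noise": 0},
    {"bit": 1, "noise": 1},
]
labels = ["grey", "grey", "alpha", "alpha"]

gain_bit = information_gain(rows, labels, "bit")      # separates the classes
gain_noise = information_gain(rows, labels, "noise")  # carries no information
```

Here `bit` would be placed above `noise` in the tree, because it has the higher information gain.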
2.4 Breadth first search (BFS) technique
BFS is an algorithm for traversing or searching a tree or graph. It starts at the root of the tree or graph, explores all neighbor nodes first, and then moves on to the next-level neighbors. BFS can be used to find a shortest path (in number of edges) between two vertices \(x\) and \(y\). It uses a queue \(q\) to hold the vertices that have been discovered but not yet explored. For a tree, the time complexity of BFS is \(O(n)\), where \(n\) is the number of nodes traversed [29]. The procedure for BFS is as follows and is shown in Fig. 2.
1. Search for all vertices that are directly reachable from the root (called level 1 vertices).
2. Visit all vertices that are directly reachable from any level 1 vertex (called level 2 vertices), and so on.
3. In general, level \(p\) vertices are directly reachable from level \(p - 1\) vertices.
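The level-by-level procedure can be sketched in Python as follows. The adjacency-list representation and the example tree are assumptions made for illustration.

```python
from collections import deque


def bfs_levels(graph, root):
    """Return a dict mapping each reachable vertex to its level (root = 0)."""
    level = {root: 0}
    queue = deque([root])            # vertices discovered but not yet explored
    while queue:
        vertex = queue.popleft()
        for neighbor in graph.get(vertex, []):
            if neighbor not in level:            # first visit fixes the level
                level[neighbor] = level[vertex] + 1
                queue.append(neighbor)
    return level


# Hypothetical tree: A is the root, B and C are level 1, D, E and F are level 2
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
levels = bfs_levels(tree, "A")
visit_order = list(levels)           # dicts preserve insertion (visit) order
```

Each vertex enters the queue once, so the traversal of a tree with \(n\) nodes takes \(O(n)\) steps.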
3 Problem formulation and work methodology
3.1 Work flow of the algorithm
The problem consists of hiding secret RSA-encrypted medical data in a grey-alpha channel cover image with the help of a decision tree, in order to create a secure crypto-stegano algorithm. Figure 3 shows the work flow of this algorithm.
3.2 Sender side methodology
3.2.1 Cover image and secret message
In the proposed system, an RGBA channel image (32 bits per pixel) and a grey channel medical image (8 bits per pixel) are selected. The alpha channel is separated from the RGBA image and merged with the grey image, and the resulting grey-alpha channel medical image is used as the cover image. The cover image is divided into image blocks (except for a reserved location, i.e. byte numbers 3045–4045), and each image block has 4 grey-alpha pixels (64 bits). The secret data is the patient's medical record and personal information, such as unique ID, name and disease information, which will be embedded into the cover image. The reserved bytes hold the number of blocks of secret medical data and cipher text, and the dynamic key values. These key values, together with breadth first search in the decision tree, are used to select one of the RGB channels as an indicator, a position within that channel, and the positions for embedding two bits per channel in the grey-alpha channel (5th–8th bit).
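The construction of the cover image can be sketched in Python under simplifying assumptions: pixels are held in flat lists, both images have the same number of pixels, and the reserved byte range is ignored. The function names are hypothetical.

```python
def build_grey_alpha_cover(grey_pixels, rgba_pixels):
    """Pair each 8-bit grey value with the alpha value of the matching RGBA pixel."""
    if len(grey_pixels) != len(rgba_pixels):
        raise ValueError("both images must have the same number of pixels")
    return [(grey, alpha)
            for grey, (_red, _green, _blue, alpha) in zip(grey_pixels, rgba_pixels)]


def split_into_blocks(pixels, block_size=4):
    """Group pixels into full blocks; a trailing partial block is dropped."""
    usable = len(pixels) - len(pixels) % block_size
    return [pixels[i:i + block_size] for i in range(0, usable, block_size)]


# Illustrative pixel values only
grey = [10, 20, 30, 40, 50]
rgba = [(1, 2, 3, 200), (4, 5, 6, 210), (7, 8, 9, 220), (1, 1, 1, 230), (2, 2, 2, 240)]

cover = build_grey_alpha_cover(grey, rgba)
blocks = split_into_blocks(cover)
bits_per_block = len(blocks[0]) * 2 * 8   # 4 pixels x 2 channels x 8 bits
```

Each grey-alpha pixel carries 16 bits, so a block of 4 pixels holds the 64 bits mentioned above.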
3.2.2 RSA cryptosystem
The RSA encryption technique is used to encrypt the secret medical data. Before embedding, the cipher text is divided into 8-bit blocks using the dynamic key.
3.2.3 Decision tree
The decision tree is used as a decision mechanism for selecting the particular channel (grey or alpha) of the grey-alpha cover image. It is constructed from the binary values 0 and 1 using their entropy and information gain values. Binary 0 and 1 have the same entropy, so both values are used at the decision node. The binary sequences 0000–1111, corresponding to the integer values 0–15, are used to build the tree. The tree has four levels; the fourth level has eight children in total, which act as terminal nodes. The eight terminal nodes are numbered from 1 to 8 from left to right, as shown in Fig. 4.
3.2.4 RGB channel image using decision tree matching process
In this process, the RGB channel image is used together with the decision tree to select the grey or alpha channel for embedding secret cipher data. The RGB channel image is divided into blocks of two pixels each. One block is selected at random at a time, along with one of its red, green and blue channels; the selected channel gives the channel byte value. Four bits of the channel byte value are selected using random numbers, taken in ascending order of position, and are called the channel value. The four bits are matched in the decision tree down to the fourth level using breadth first search: one indicator value out of eight is matched at the fourth level by following exactly the same bit sequence as the channel value. If the indicator value is from 1 to 4, the grey channel is used for embedding; otherwise the alpha channel is used. This process continues until all the secret cipher bits have been embedded. Table 1 shows an example of the RGB channel decision tree matching process.
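The selection rule can be sketched in Python. The text does not fully specify how a 4-bit channel value maps onto the eight terminal nodes, so `indicator_from_bits` below is a hypothetical reading: the first bit is taken as the value at the root, and the remaining three bits are followed as left (0) and right (1) branches. Only the final rule (indicator 1 to 4 selects grey, 5 to 8 selects alpha) comes from the description above.

```python
def indicator_from_bits(bits):
    """Hypothetical mapping from a 4-bit channel value to a terminal node 1..8.

    Assumption: the last three bits are followed as left (0) / right (1)
    branches below the root, and terminal nodes are numbered left to right.
    """
    if len(bits) != 4 or any(b not in "01" for b in bits):
        raise ValueError("expected a string of four binary digits")
    return int(bits[1:], 2) + 1


def select_channel(indicator):
    """Rule from the text: indicator 1-4 selects grey, 5-8 selects alpha."""
    if not 1 <= indicator <= 8:
        raise ValueError("indicator must be between 1 and 8")
    return "grey" if indicator <= 4 else "alpha"


first = select_channel(indicator_from_bits("1011"))   # branch bits 011 -> node 4
second = select_channel(indicator_from_bits("0100"))  # branch bits 100 -> node 5
```

Under a different tree layout only `indicator_from_bits` would change; `select_channel` would stay the same.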
3.2.5 LSB embedding
Secret cipher data is inserted at various LSB positions in the cover image as follows:
a. Embed the number of blocks of plain and cipher text, the dynamic key values and the number of matched channels in the reserved location.
b. Convert the values of the cipher text and of the grey-alpha channel medical cover image into binary form.
c. Select the grey or alpha channel using the decision tree, as discussed in the previous section.
d. Select two positions from the 5th to 8th bit LSB positions (5th and 6th, 5th and 7th, 5th and 8th, 6th and 7th, 6th and 8th, or 7th and 8th) using the dynamic key.
e. Embed the data into the selected LSB positions.
Continue this process until every cipher data block is empty and all of the secret cipher text has been embedded in the grey–alpha channel medical cover image, then send the resulting stego image to the receiver.
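Steps (d) and (e) can be sketched in Python for a single channel byte. This is an illustration under a stated assumption: bit positions are counted from the most significant bit, so position 1 is the MSB and position 8 is the LSB, which makes positions 5 to 8 the lower four bits. The function name is hypothetical.

```python
def embed_two_bits(byte, bits, positions):
    """Write two secret bits into a channel byte at the two given positions.

    Assumption: positions are 1-based and counted from the MSB, and must lie
    in the range 5..8 as in step (d).
    """
    if any(not 5 <= pos <= 8 for pos in positions):
        raise ValueError("positions must lie between 5 and 8")
    for bit, pos in zip(bits, positions):
        shift = 8 - pos
        byte = (byte & ~(1 << shift) & 0xFF) | (bit << shift)  # clear, then set
    return byte


stego_a = embed_two_bits(0b10110000, (1, 1), (7, 8))
stego_b = embed_two_bits(0b11111111, (0, 0), (5, 6))
```

Because only the lower four bits can change, the value of a channel byte moves by at most 15.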
3.3 Receiver side methodology
The plain text is retrieved from the secret cipher text in the grey–alpha channel medical stego image as follows:
a. Retrieve the number of blocks of plain and cipher text, the dynamic key values and the number of matched channels from the reserved location.
b. Retrieve the cipher bits from the two LSB positions of the dynamically selected channel (either grey or alpha) of each grey-alpha channel image block.
c. Repeat step (b) until all the secret cipher text bits have been retrieved.
d. Take all the cipher bits and arrange them in blocks according to their dynamic key value.
e. Convert the binary values into integers.
f. Apply RSA decryption to the secret cipher blocks to get the plain text blocks.
g. After decryption, arrange all the medical data plain text blocks according to their dynamic key value.
This system gives exactly one plain text value for each cipher text value. Only the authorized receiver can recover the correct medical information, using their private key.
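Steps (b) and (e) can be sketched in Python for individual channel bytes. As an assumption for this illustration, bit positions are counted from the most significant bit (position 8 is the LSB), and the function names are hypothetical.

```python
def extract_two_bits(byte, positions):
    """Read the two hidden bits from a channel byte at the given positions."""
    return tuple((byte >> (8 - pos)) & 1 for pos in positions)


def bits_to_int(bits):
    """Convert a sequence of bits (most significant first) into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# Four stego bytes, each carrying two bits of one 8-bit cipher block
stego_bytes = [0b10110010, 0b01000011, 0b11110000, 0b00001110]
recovered = []
for byte in stego_bytes:
    recovered.extend(extract_two_bits(byte, (7, 8)))

cipher_block = bits_to_int(recovered)
```

The recovered integer would then be passed to RSA decryption in step (f).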
4 Results
The simulation and experiments were carried out in MATLAB. The simulated results for different medical cover images and their stego images are displayed in Fig. 5, together with their histograms. The histograms show a negligible difference between each original cover image and its stego image.
The patient information used in this work is listed in Table 2.
PSNR (peak signal-to-noise ratio) is the ratio between the maximum possible power of a signal and the power of the corrupting noise. It is an important index for assessing the reconstruction quality of stego images. The original cover image acts as the signal, and the noise is the distortion introduced by the steganography mechanism. The PSNR, MSE (mean square error) and maximum embedding capacity values at different payloads for cover images of various sizes are given in Table 3. PSNR is measured in decibels (dB). A high quality stego image should achieve 40 dB or above [30].
PSNR is defined through the mean square error (MSE). For two \(P \times Q\) monochrome images \(SG_{xy}\) (stego image) and \(CV_{xy}\) (cover image), where \(x\) and \(y\) are image coordinates and one image is considered a noisy approximation of the other, the MSE is defined as:
\(MSE = \frac{1}{P \times Q}\sum\nolimits_{x = 1}^{P} \sum\nolimits_{y = 1}^{Q} \left( SG_{xy} - CV_{xy} \right)^{2}\)
and the PSNR as:
\(PSNR = 10\log_{10} \left( \frac{CV_{max}^{2}}{MSE} \right)\)
where \(CV_{max}\) = 255, the maximum pixel value for 8-bit cover images [8].
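Both measures can be computed with a few lines of Python. This is a minimal sketch using nested lists for the images rather than the MATLAB code used in the experiments; the function names and the tiny 2 × 2 images are assumptions made for illustration.

```python
from math import inf, log10


def mse(cover, stego):
    """Mean square error between two equally sized monochrome images."""
    rows, cols = len(cover), len(cover[0])
    total = sum((s - c) ** 2
                for cover_row, stego_row in zip(cover, stego)
                for c, s in zip(cover_row, stego_row))
    return total / (rows * cols)


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(cover, stego)
    if error == 0:
        return inf
    return 10 * log10(max_value ** 2 / error)


cover_image = [[10, 20], [30, 40]]
stego_image = [[10, 21], [30, 40]]   # one pixel changed by one grey level

error = mse(cover_image, stego_image)
quality = psnr(cover_image, stego_image)
```

A single grey-level change in four pixels gives an MSE of 0.25 and a PSNR of about 54.15 dB, above the 40 dB threshold cited for a high quality stego image.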
5 Discussion and analysis
The results are analyzed using Table 3. For grey-alpha medical cover images (Brain, Kidney, Lungs and Uterus) of size 262 kilobytes and secret medical data of 256 bytes, the PSNR and MSE values range from 62.06 to 66.24 dB and 0.0405 to 0.0403 respectively; when the medical data size increases to 1024 bytes, they range from 56.38 to 59.38 dB and 0.3623 to 0.5741 respectively. When the grey-alpha cover image size increases to 1048 kilobytes with 256 bytes of secret medical data, the PSNR and MSE values range from 71.55 to 74.12 dB and 0.0513 to 0.0443 respectively; when the secret medical data size increases to 1024 bytes, they range from 61.26 to 65.72 dB and 0.6143 to 0.0664 respectively.
(a) In the Brain grey-alpha cover image, the maximum embedding capacity is 65.53 and 357.37 kb, which is 26 and 34 % respectively of the image size.
(b) In the Kidney cover image, it is 69.32 and 389.65 kb, which is 27 and 37 % respectively of the image size.
(c) In the Lungs cover image, it is 71.45 and 396.38 kb, which is 28 and 38 % respectively of the image size.
(d) In the Uterus image, it is 73.01 and 409.47 kb, which is 28 and 39 % respectively of the image size.
The result analysis thus shows that increasing the medical grey-alpha cover image size and decreasing the secret medical data size raises the PSNR value to as much as 74.12 dB and lowers the MSE value to as little as 0.0443, while the maximum embedding capacity rises to as much as 39 %. Performance is therefore high with respect to PSNR, MSE and maximum embedding capacity. Figure 6 shows the result analysis of the proposed algorithm using various performance parameters.
Figure 5 shows that there are no visual artifacts in the medical stego images or their histograms; they look the same as the corresponding original medical cover images.
This image steganography approach is highly secure for sending patients' medical records and personal information, since it allocates message blocks to the medical cover image through the matching concept of the decision tree. The secret cipher information is embedded in dynamically selected medical cover image pixels using the decision tree. Embedding is done only at the 5th–8th bit LSB positions of a pixel, for better visual quality of the medical stego image. Existing attacks are therefore not expected to detect the presence of steganography [27]. Moreover, it is the cipher text that is hidden, not the medical plain text, which adds one more level of security. The enciphering algorithm is the RSA public key cryptography algorithm.
Table 4 compares the proposed scheme with the algorithms proposed by other researchers on the basis of minimum calculated PSNR, embedding capacity and visual imperceptibility. Compared to these other algorithms, the proposed scheme performs best on these measures and can be used for securing any variety of secret medical data.
6 Conclusion
In this paper, a novel secret transmission scheme is proposed using LSB substitution with the help of the decision tree matching concept. The algorithm also uses RSA to provide confidentiality of information in end-to-end communication with the data center, which makes it computationally secure against various types of attack. Medical organizations can use this multi-level security scheme to protect their patients' medical records. With the multilevel encoding approach presented, a medical image itself may also be hidden inside the cover image. From the result and histogram analysis, it is concluded that the PSNR and MSE values and the percentage of maximum embedding capacity are better than those of some existing algorithms.
References
Randolph C, Barrows JR, Paul MD, Clayton D (1996) Review: privacy, confidentiality and electronic medical records. J Am Med Inf Assoc 3(2):139–148. Available at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC116296/pdf/0030139.pdf
Raman RS, Reddy R, Jagannathan V, Reddy S, Cleetus KJ, Srinivas K (1997) A strategy for the development of secure telemedicine applications. In: Proceedings of the AMIA annual fall symposium, pp 344–348. Available at http://www.ncbi.nlm.nih.gov/pubmed/9357645
United S (2000) Summary of the HIPAA privacy rule. United States Department of Health and Human Services, pp 1–19. Available at http://www.hhs.gov/ocr/privacy/hipaa/understanding/summary/privacysummary.pdf
Chandra MK, Cherif A (2002) Implementation of the RSA algorithm and its cryptanalysis. In: ASEE Gulf-Southwest annual conference, American society for engineering education, USA. Available at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.130.6258&rep=rep1&type=pdf
Wu DC, Tsai WH (2003) A steganograhic method for images by pixel value differencing. Pattern Recogn Lett 24(9-10):1613–1626
Zhang X, Wang S (2004) Vulnerability of pixel-value differencing steganography to histogram analysis and modification for enhanced security. Pattern Recogn Lett 25(12):331–339
Chang CC, Tseng HW (2004) A steganographic method for digital images using side match. Pattern Recogn Lett 25(12):1431–1437
Martin A, Sapiro G, Seroussi G (2005) Is image steganography natural. IEEE Trans Image Process 14(12):2040–2050
Wang R, Chen Y (2006) High payload image steganography using two-way block matching. IEEE Signal Process Lett 13(3):161–164
Kumar PM, Roopa D (2007) An image steganography framework with improved tamper proofing. Asian J Inf Technol 6(10):1023–1029
Provos N, Honeyman P (2003) Hide and seek: an introduction to steganography. Secur Privacy Mag IEEE 1(3):32–44
Cheddad A et al (2010) Digital image steganography survey and analysis of current methods. Signal Process 90:727–752
Mohammad ABY, Jantan A (2008) A new steganography approach for image encryption exchange by using the LSB insertion. IJCSNS Int J Comput Sci Netw Secur 8(6):247–254. Available at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.371.9525&rep=rep1&type=pdf
Nag A, Singh JP, Khan S, Ghosh S (2011) A weighted location based LSB image steganography technique. Springer ACC 2011, CCIS (ISBN: 978-3-642-22714-1), 2(191):620–627. Available at http://www.springerlink.bibliotecabuap.elogim.com/content/pdf/10.1007/978-3-642-22714-1_64.pdf
Maiti C, Baksi D, Zamider I, Gorai P, Kisku DR (2011) Data hiding in images using some efficient steganography techniques. Springer SIP 2011, CCIS (ISBN: 978-3-642-27183-0), 2(260):195–203. Available at http://www.springerlink.bibliotecabuap.elogim.com/chapter/10.1007%2F978-3-642-27183-0_21
Juneja M, Sandhu PS (2009) Designing of robust steganography technique based on LSB insertion and encryption. In: Proceedings of international conference on advances in recent technologies in communication and computing (ISBN: 978-0-7695-38457), pp 302–305. Available at http://www.dl.acm.org/citation.cfm?id=1673335
Parvez MT, Gutub AA (2008) RGB based variable-bits image steganography. In: Proceedings of IEEE Asia pacific services computing conference (ISBN: 978-0-7695-3473-2), pp. 1322–1327. Available at http://www.ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4780862
Alseelawi NS, Ismaiel TZ, Sabir FA (2015) High capacity steganography method based upon RGBA image. Int J Adv Res Comput Commun Eng (ISSN: 2278-1021), 4(6). Available at http://www.ijarcce.com/upload/2015/june-15/IJARCCE%2027.pdf
Thiyagarajan P, Aghila G (2013) Reversible dynamic secure steganography for medical image using graph coloring. Health Policy Technol 2(3):151–161. Available at http://www.sciencedirect.com/science/article/pii/S2211883713000403
Ross J, Anderson F, Petitcolas AP (1998) On the limits of steganography. In: IEEE Journal of selected Areas in communication, Special Issue on Copyright & Privacy protection (ISSN: 0733-8716), 6(4):474–481
Swain G, Lenka SK (2012) LSB array based image steganography technique by exploring the four least significant bits. Springer, In: Proceedings of 4th international conference, Obcom 2011, CCIS (ISBN: 978-3-642-29216-3), 2(270):479–488
Swain G, Lenka SK (2015) A novel steganography technique by mapping words with LSB array. Int J Signal Imaging Syst Eng Indersci (ISSN: 1748-0701), 8(1–2). Available at: http://www.inderscience.com/link.php?id=67052
Wang XZ, Ashfaq RAR, Fu AM (2015) Fuzziness based sample categorization for classifier performance improvement. J Intell Fuzzy Syst 29(3):1185–1196
Wang XZ (2015) Uncertainty in learning from big data-editorial. J Intell Fuzzy Syst 28(5):2329–2330
Lu SX, Wang XZ, Zhang GQ, Zhou X (2015) Effective algorithms of the Moore–Penrose inverse matrices for extreme learning machine. Intell Data Anal 19(4):743–760
He YL, Wang XZ, Huang JZX (2016) Fuzzy nonlinear regression analysis using a random weight network. Inf Sci. doi:10.1016/j.ins.2016.01.037
Ashfaq RAR, Wang XZJ, Huang ZX, Abbas H, He YL (2016) Fuzziness based semi-supervised learning approach for intrusion detection system (IDS). Inf Sci. doi:10.1016/j.ins.2016.04.019
Anthony JM, Robert NF, Yang L, Nathaniel AW, Steven DB (2004) An introduction to decision tree modeling. J Chemomet 18(6):275–285. Available at http://www.onlinelibrary.wiley.com/doi/10.1002/cem.873/pdf
Thomas HC, Charles EL, Ronald LR, Clifford S (2001) Introduction to algorithms, 2nd edn. MIT Press and McGraw-Hill, Section 10.1: Stacks and queues, pp. 200–204. Available at http://www.dcc.ufrj.br/~francisco_vianna/livros/Introduction.To.Algorithms.-.Cormen.-.2nd.Ed.pdf
Li B et al (2011) A survey on image steganography and steganalysis. Journal of Information Hiding and Multimedia Signal Processing (ISSN: 2073-4212), 2(2):142–172. Available at http://www.bit.kuas.edu.tw/~jihmsp/2011/vol2/JIH-MSP-2011-03-005.pdf
Cite this article
Jain, M., Kumar, A. RGB channel based decision tree grey-alpha medical image steganography with RSA cryptosystem. Int. J. Mach. Learn. & Cyber. 8, 1695–1705 (2017). https://doi.org/10.1007/s13042-016-0542-y
DOI: https://doi.org/10.1007/s13042-016-0542-y