
1 Introduction

A polynomial-based secret sharing scheme was first proposed by Shamir [1] in 1979. Later, Thien and Lin [2] put forward image secret sharing by applying Shamir's [1] polynomial-based secret sharing to image pixel values. The approach reduces the size of the image shares generated in the construction phase, and the reconstructed secret image is of good quality (the same as the original image). A wide range of applications becomes possible using the extended capabilities [3] of secret sharing schemes.

Polynomial-based image secret sharing algorithms are highly secure but computationally heavy, because a polynomial operation is required for every pixel value. For large images, the construction and reconstruction computations are performed pixel by pixel over the complete image. This is computationally demanding, since the amount of data to be processed is comparatively large, and it leads to a high computational complexity for both the construction and reconstruction algorithms; for a 512 × 512 grayscale image, for instance, a (k − 1)-degree polynomial must be built and evaluated n times for each of the 262,144/k pixel blocks. In practice, these approaches need acceleration to be usable. Parallel acceleration of image-based algorithms has been suggested by a few researchers [4, 5].

The Hadoop Distributed File System (HDFS) [6, 7] follows a distributed file system design. It is fault tolerant and can be deployed at minimal cost. It can store huge amounts of data and provide easy access to that data. Hadoop offers a command-line interface to HDFS and provides streaming access to file system data. Clusters are formed by its integral servers, the Name Node and the Data Nodes [8, 9].

2 Related Work

2.1 Thien and Lin’s Secret Sharing [2]

Thien and Lin [2] used Shamir's polynomial-based secret sharing [1] for threshold secret sharing of images. The secret image pixel values are used as coefficients of a polynomial from which the share images are constructed. The share image pixel values are then used to reconstruct the secret image using Lagrange interpolation. This is a very effective secret sharing method for distributing confidential images secretly. For very large images, however, it becomes inefficient due to the computationally heavy load.
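As a minimal illustration of this idea (assuming, as in Thien and Lin's scheme, that pixel values are first reduced below the prime 251), the following Java method takes one block of k secret pixels as polynomial coefficients and evaluates the polynomial at x = 1, ..., n to obtain one pixel for each of the n shares; the class and method names are ours, not from [2].

```java
public final class ThienLinShare {

    private static final int P = 251;  // prime modulus used in Thien and Lin's scheme

    /**
     * block: k secret pixel values (each assumed < 251), used as coefficients a0..a(k-1).
     * n:     number of participants; returns one share pixel q(x) for each x = 1..n.
     */
    public static int[] sharePixels(int[] block, int n) {
        int[] sharePixel = new int[n];
        for (int x = 1; x <= n; x++) {
            long q = 0;
            // Horner's rule: q(x) = a0 + a1*x + ... + a(k-1)*x^(k-1) mod P
            for (int d = block.length - 1; d >= 0; d--) {
                q = (q * x + block[d]) % P;
            }
            sharePixel[x - 1] = (int) q;
        }
        return sharePixel;
    }
}
```

Because each block of k secret pixels contributes only one pixel to each share, every share image is 1/k the size of the secret image, which is the share-size reduction mentioned in the introduction.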

2.2 Efficient Image Secret Sharing Using Parallel Processing for Row-Wise Encoding and Decoding [4]

A multithreading approach is implemented in [4], where it is observed that the concurrent approach is effective and efficient enough to be applied in practice. Multithreading logic needs to be implemented in every algorithm to achieve parallelism. The author demonstrated the use of parallel processing for efficient secret construction and reconstruction algorithms; the parallel approach is implemented on a Unix platform. More efficient parallel platforms can further enhance the performance of image secret sharing methods. A thread-level sketch of such row-wise parallelism is given below.
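The implementation details of [4] are not reproduced here; the sketch below is only a hypothetical Java illustration of thread-level row-wise parallelism, in which each submitted task encodes one image row through a RowEncoder callback that stands in for the per-row secret sharing computation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RowParallelEncoder {

    // Hypothetical per-row encoder: applies the secret sharing polynomial to one row.
    interface RowEncoder {
        void encodeRow(int[] row, int rowIndex);
    }

    public static void encode(int[][] image, RowEncoder encoder, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int r = 0; r < image.length; r++) {
            final int rowIndex = r;
            // Rows are independent, so they can be encoded concurrently.
            pool.submit(() -> encoder.encodeRow(image[rowIndex], rowIndex));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```

Since the rows are independent, the pool can keep all available cores busy, which is the essence of the row-wise parallelism described in [4].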

2.3 Hadoop and HDFS [6, 7, 8]

Hadoop is an open-source framework focused on distributed storage and computation. It processes massive amounts of data on commodity hardware. A dedicated file system, the Hadoop Distributed File System (HDFS), stores big data and supports distributed task execution in Hadoop clusters. HDFS is implemented in Java. Hadoop services can be characterized in terms of two components: a storage component and a processing component. HDFS acts as the storage component, whereas MapReduce acts as the processing component.

HDFS provides dependable data storage and follows a master–slave architecture. The Name Node is the master node; it contains only metadata. The Name Node protects and maintains this metadata, which is required for data retrieval because the data itself is distributed over numerous nodes. A Data Node behaves as a slave node: it stores the actual data in HDFS and communicates with the Name Node about the files stored on that particular node. The slave node can create new blocks of data as well as manipulate and remove blocks, and it replicates blocks whenever the Name Node requires it [9].
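For illustration, a client program can move share images in and out of HDFS through Hadoop's Java FileSystem API; the file paths below are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsShareStore {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // reads core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);       // connects to the Name Node

        // Copy a locally generated share image into HDFS (the Data Nodes hold the blocks,
        // the Name Node records only the metadata).
        fs.copyFromLocalFile(new Path("share1.png"), new Path("/shares/share1.png"));

        // Copy a share back out of HDFS for reconstruction.
        fs.copyToLocalFile(new Path("/shares/share1.png"), new Path("share1_copy.png"));

        fs.close();
    }
}
```

During the write, the client contacts the Name Node for block placement while the actual bytes travel to the Data Nodes, matching the master–slave division of responsibilities described above.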

2.4 Hadoop-HDFS-MapReduce [10]

Parallel processing of large amounts of data is modeled through MapReduce. The main advantages are effective storage of and access to large images; filtering and processing of images also become efficient.

2.5 Hadoop Image Processing Interface

A traditional Hadoop MapReduce program struggles to present input and output image data in a convenient format. The image library Hadoop Image Processing Interface (HIPI) is built to be used with Apache Hadoop. HIPI is a solution for storing a huge collection of images on Hadoop's dedicated file system and making them available for effective distributed processing with parallel programming components such as MapReduce. A HIPI Image Bundle (HIB) is a collection of images represented as a single file on HDFS. A minimal job setup over a HIB is sketched below.
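The sketch assumes the class names used in the HIPI 2.x documentation (HibInputFormat, HipiImageHeader, FloatImage); package names and signatures may differ between HIPI versions, so treat this as an outline rather than a drop-in example.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.hipi.image.FloatImage;
import org.hipi.image.HipiImageHeader;
import org.hipi.imagebundle.mapreduce.HibInputFormat;

public class HibExample {

    // Each map call receives one decoded image from the HIB.
    public static class ImageMapper
            extends Mapper<HipiImageHeader, FloatImage, IntWritable, IntWritable> {
        @Override
        protected void map(HipiImageHeader header, FloatImage image, Context context)
                throws java.io.IOException, InterruptedException {
            // Trivial example: emit the dimensions of each image in the bundle.
            context.write(new IntWritable(image.getWidth()), new IntWritable(image.getHeight()));
        }
    }

    public static void configure(Job job) {
        job.setInputFormatClass(HibInputFormat.class);  // read images from a HIPI Image Bundle
        job.setMapperClass(ImageMapper.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
    }
}
```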

3 Proposed Method

The proposed method is implemented in two steps: formation of image shares from the secret image and reformation of the original secret from the shares. The section below elaborates the construction and reconstruction methods for polynomial image secret sharing using Hadoop.

3.1 Image Secret Sharing

Construction and Reconstruction Using the Master–Slave Approach on Hadoop

The proposed parallel approach is based on the master–slave model of Hadoop, as shown in Fig. 1, to achieve parallel computation. It is implemented with one master and multiple slaves. The approach used is described below.

Fig. 1 HDFS framework

As shown in Fig. 1, intermediate results are produced by applying the Mapper to the input records, and the Reducer aggregates these results. Local summation is executed by Combiners, and the Partitioner controls how the intermediate data is shuffled to the Reducers. A minimal, illustrative job wiring these components together is sketched below.
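The listing is not the proposed construction job itself; it is a self-contained toy job that sums pixel intensities per row from a text input of the form "rowIndex p1 p2 ...", purely to show how the Mapper, Combiner, Partitioner, and Reducer of Fig. 1 are wired together in Hadoop.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

public class RowSumJob {

    // Mapper: splits each "rowIndex p1 p2 ..." line and emits (rowIndex, pixel) pairs.
    public static class RowMapper extends Mapper<LongWritable, Text, IntWritable, LongWritable> {
        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] tokens = line.toString().trim().split("\\s+");
            int row = Integer.parseInt(tokens[0]);
            for (int i = 1; i < tokens.length; i++) {
                ctx.write(new IntWritable(row), new LongWritable(Long.parseLong(tokens[i])));
            }
        }
    }

    // Used both as combiner (local summation on each slave) and reducer (global aggregation).
    public static class SumReducer extends Reducer<IntWritable, LongWritable, IntWritable, LongWritable> {
        @Override
        protected void reduce(IntWritable row, Iterable<LongWritable> values, Context ctx)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) sum += v.get();
            ctx.write(row, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "row intensity sum");
        job.setJarByClass(RowSumJob.class);
        job.setMapperClass(RowMapper.class);
        job.setCombinerClass(SumReducer.class);           // local summation
        job.setPartitionerClass(HashPartitioner.class);   // shuffles intermediate data to reducers
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```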

3.2 Construction of Shares

Figure 2 shows the construction of shares on a Hadoop framework consisting of a single Name Node (master node) and two Data Nodes (slave nodes). The Name Node distributes partitioned datasets to the Data Nodes, and the Data Nodes perform the polynomial computations on the pixel values of their respective partitions of the secret image.

Fig. 2 Construction of shares

The detailed formation of shares of the secret image using the polynomial method is illustrated in Fig. 2.

Master node

i. The master will split the image row-wise into k blocks (partitions) of fixed size.

ii. The master will transfer each partition to the respective slave.

iii. Each partition will contain parameters with its row and column numbers.

Slave node

i. Each slave will apply Thien and Lin's method on every row of the received partition to construct a (k − 1)-degree polynomial.

ii. The polynomial will be evaluated for each participant x = 1…n (a sketch of this slave-side computation is given after these lists).

iii. No slave will quit until its task is finished.

iv. All the computed values will be sent to the master node.

Master node

i. The master node will collect all the values from all slaves and will create the shares.

ii. These shares will be distributed to all the participants.
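A hypothetical mapper for the slave-side computation of the construction phase is sketched below, as referenced in the slave node steps. It assumes each input line carries one secret-image row as "rowIndex p1 p2 ... pW" with all pixels already reduced below 251, that the row width is a multiple of k, and that k and n are passed through the job configuration; the reducer that assembles the emitted rows into share images is omitted.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
 * For every block of k pixels in the row, evaluates the (k-1)-degree polynomial
 * at x = 1..n and emits (shareNumber, "rowIndex sharePixels...") so that a reducer
 * can assemble each share image row by row.
 */
public class ConstructionMapper extends Mapper<LongWritable, Text, IntWritable, Text> {

    private static final int P = 251;
    private int k, n;

    @Override
    protected void setup(Context ctx) {
        k = ctx.getConfiguration().getInt("sss.k", 2);   // threshold
        n = ctx.getConfiguration().getInt("sss.n", 4);   // number of shares
    }

    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
            throws IOException, InterruptedException {
        String[] t = line.toString().trim().split("\\s+");
        String rowIndex = t[0];
        StringBuilder[] shareRows = new StringBuilder[n];
        for (int x = 1; x <= n; x++) shareRows[x - 1] = new StringBuilder(rowIndex);

        // Slide over the row in blocks of k pixels (coefficients a0..a(k-1)).
        for (int start = 1; start + k <= t.length; start += k) {
            for (int x = 1; x <= n; x++) {
                long q = 0;
                for (int d = k - 1; d >= 0; d--) {       // Horner: q(x) mod P
                    q = (q * x + Integer.parseInt(t[start + d])) % P;
                }
                shareRows[x - 1].append(' ').append(q);
            }
        }
        for (int x = 1; x <= n; x++) {
            ctx.write(new IntWritable(x), new Text(shareRows[x - 1].toString()));
        }
    }
}
```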

3.3 Reconstruction of Original Secret from Shares

Figure 3 shows the reconstruction of the original secret on a Hadoop framework with a single Name Node (master) and two Data Nodes (slaves).

Fig. 3 Reconstruction of original secret

Master node

i. The k shares collected from the interested participants will be given as input to the master node.

ii. The master node will decide k partitions for the slaves.

iii. Each partition will be passed with parameters containing the row and column numbers of all shares.

iv. The partitions will be distributed to all the slaves.

Slave node

i. Each slave will choose the first pixel from every selected share.

ii. Each slave will use Lagrange's interpolation formula to derive an equation from the k selected pixel values of the k shares (a sketch of this interpolation step is given after these lists).

iii. All coefficients of the derived equation will be used as pixel values of the resultant image.

iv. Each slave will repeat steps (ii) and (iii) for each and every assigned row.

Master node

i. The master node will collect all the pixel values from all slaves and will present the reconstructed secret image.
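As a sketch of the slave-side interpolation step referred to above (again working modulo the prime 251, with the share numbers as interpolation points), the following Java method recovers all k coefficients of the (k − 1)-degree polynomial from k share pixel values; each recovered coefficient is one pixel of the reconstructed image. The class and helper names are ours.

```java
public final class LagrangeRecovery {

    private static final int P = 251;

    /** xs: the k distinct share numbers; ys: the corresponding share pixel values. */
    public static int[] recoverBlock(int[] xs, int[] ys) {
        int k = xs.length;
        long[] coeffs = new long[k];
        for (int i = 0; i < k; i++) {
            // Lagrange basis polynomial L_i(x) = prod_{j!=i} (x - xs[j]) / (xs[i] - xs[j]) mod P.
            long[] basis = new long[] {1};
            long denom = 1;
            for (int j = 0; j < k; j++) {
                if (j == i) continue;
                basis = multiplyByLinear(basis, mod(-xs[j]));   // multiply by (x - xs[j])
                denom = (denom * mod(xs[i] - xs[j])) % P;
            }
            long scale = (mod(ys[i]) * modInverse(denom)) % P;
            for (int d = 0; d < basis.length; d++) {
                coeffs[d] = (coeffs[d] + scale * basis[d]) % P;
            }
        }
        int[] pixels = new int[k];
        for (int d = 0; d < k; d++) pixels[d] = (int) coeffs[d];
        return pixels;
    }

    // Multiply a polynomial p (ascending coefficients) by (x + c) mod P.
    private static long[] multiplyByLinear(long[] p, long c) {
        long[] r = new long[p.length + 1];
        for (int d = 0; d < p.length; d++) {
            r[d] = (r[d] + p[d] * c) % P;       // constant-term contribution
            r[d + 1] = (r[d + 1] + p[d]) % P;   // x-term contribution
        }
        return r;
    }

    private static long mod(long a) {
        return ((a % P) + P) % P;
    }

    // Modular inverse via Fermat's little theorem (P is prime).
    private static long modInverse(long a) {
        long result = 1, base = mod(a), exp = P - 2;
        while (exp > 0) {
            if ((exp & 1) == 1) result = (result * base) % P;
            base = (base * base) % P;
            exp >>= 1;
        }
        return result;
    }
}
```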

The proposed method is implemented on HDFS with a varying number of slave nodes, which results in better performance compared to the traditional approach.

4 Result and Analysis

The proposed approach is implemented using Apache Hadoop with one master node and multiple slave nodes. The secret images Lena, Baboon, Barbara, and Pepper, each of size 512 × 512 pixels, are taken from the standard image dataset.

Tables 1 and 2 show the experimental results obtained with both approaches. Readings are recorded for the sequential approach (standalone machine), for a Hadoop cluster with one slave, and for a Hadoop cluster with two slaves.

Table 1 Time comparison for construction using the sequential versus the Hadoop approach
Table 2 Time comparison for reconstruction using the sequential versus the Hadoop approach

As observed from Table 1, the Hadoop cluster speeds up the processing of large images compared to the sequential method as the number of slaves in the system is increased.

The construction time is higher than the reconstruction time because the number of shares from which the original image is reconstructed (k) is smaller than the number of shares (n) that must be created in the construction phase.

5 Conclusion

It is observed that the time required by the system on a Hadoop-based distributed platform is much less than that required by the sequential approach on a standalone machine. Compared to the standalone machine, the system with one slave requires 18% of the time and the system with two slaves requires 14%. This demonstrates that increasing the number of slaves makes the system more efficient, constructing and reconstructing in even less time and reducing the load on a single machine. For smaller images, the sequential approach proves to be better, but as the image size increases, the time efficiency of the distributed approach becomes apparent. Hadoop accelerated the construction and reconstruction of bigger images, for which the computations are larger than for smaller images, by distributing the task to the various slaves.