
1 Introduction

Cryptographic hash functions (CHFs) [1] are a relatively recent addition to the cryptographic toolbox. They are used in application areas where the integrity of data, the authenticity of the source, and the non-repudiation of delivery are important, including digital forensics, digital signatures, communication protocols such as SSL, cryptocurrencies, etc. They are also critical for cloud computing services that require ensuring the integrity and correctness of transmitted data, e.g., validation of downloaded files. Security is considered the most critical aspect of all cryptographic primitives, and CHFs are no different. For a majority of the popular CHFs, such as the SHA-x family [2], the notion of security is based on computational complexity. This has adverse implications for performance, as the sequential nature of digest calculations in most algorithms leads to unwanted delays. Because of this sequential design, the software performance of many hash functions on modern architectures, while decent in terms of speed and the no. of cycles used, is not optimal, resulting in wasted CPU cycles. The volume of data, easily running into terabytes, further deteriorates performance, as the compute-intensive nature of these algorithms makes the calculation of hash codes for large files cumbersome and time-consuming. In 2007, the National Institute of Standards and Technology (NIST) announced a call for candidate hash functions for the next standard, SHA-3 [3], even though SHA-2 was still secure, because the growing sophistication of attacks since Wang et al. [4] had reduced the expected lifespan of hash functions. The SHA-3 competition introduced parallelizability for increased performance as a desirable feature for the new hash function, suggesting that hashing algorithms need to be fast and efficient as much as they need to be secure. It is notable, however, that the focus was on parallelizable rather than parallel CHFs, implying that the design approach continued to remain sequential. It is suggested that a parallel and distributed approach toward CHFs would not only improve performance but also enhance their security. This paper presents a new parallel framework for CHFs to deal with the problem of long computations by lowering the computational complexity through parallel implementation on multi-core processors.

The paper is organized as follows: Sect. 2 reviews previous work on parallel CHFs, followed by a detailed description of the proposed framework in Sect. 3. Section 4 presents the security analysis of the framework, giving arguments in support of the pre-image resistance, weak collision resistance, collision resistance, non-correlation resistance, partial pre-image resistance, and pseudorandom oracle preservation properties. Section 5 briefly discusses the performance metrics used to evaluate the framework. The conclusions, along with the scope for future research, are given in Sect. 6.

2 Previous Work

A comprehensive study of the literature suggests that, prior to the SHA-3 competition, parallelization of CHFs was not on the agenda of many researchers [5]. The evolution of parallel CHFs should thus be viewed in the light of the SHA-3 competition, dividing the timeline into three phases, namely pre-SHA-3, the years of the competition itself, and post-SHA-3. In the initial phase, efforts were concentrated on software performance analysis of existing CHFs on the newly arrived Pentium processors [6,7,8], with the focus largely on fine-grained parallelism. Hardware parallelism with FPGAs and ASICs was also prevalent and discussed in a few papers [9,10,11]. The arrival of GPUs in 2006 was a real game-changer for parallel CHFs, as demonstrated by [12, 13]. The SHA-3 competition established parallelism/parallelizability of algorithm designs as an evaluation criterion, leading to the emergence of several promising designs such as BLAKE [14], Grøstl [15], the MD6 message-digest algorithm [16], Skein [17, 18], Keccak [19], etc. These were subject to rigorous tests and analyses, and eventually, in 2013, Keccak was declared the winner of the SHA-3 competition because of its novel sponge construction [20]. The post-SHA-3 era saw the emergence of improved versions of SHA-3 candidates such as BLAKE2 [21], SHAKE [22], ParallelHash [23], etc. Some work was also done in the domain of lightweight hash functions for resource-constrained environments [24,25,26,27]. Presently, efforts are being made toward standardization of tree modes for hash functions in order to achieve medium- to coarse-grained parallelism [28,29,30]. The efforts in the three phases should be seen as a continuation of one another; the three-stage classification does not imply obsolescence of the techniques of the previous stage(s) but merely signifies a change in approach toward parallel CHFs.

In 2014, Kishore and Kapoor [31] proposed the ITRH transformation for a faster, more secure CHF based on recursive tree hashing and presented its framework along with a detailed analysis. An algorithm, RSHA-1, was proposed as well. A speedup of up to 3.5\(\times \) was observed when large files were hashed using the proposed framework. Additionally, linear speedup proportional to the no. of processing elements was observed. However, during security analysis, it was observed that the proposed transformation was vulnerable to certain attacks, a side effect of removing the chaining dependencies of the underlying function.

In order to overcome these deficiencies, this paper proposes an improved parallel transformation, Enveloped Inverted Tree Recursive Hashing (EITRH), which comes with an enhanced security level. The design is modeled along the lines of ITRH but uses enveloping [32] as an additional security measure.

3 The EITRH Transformation

Definition 1

Let \(\{0,1\}^*\) denote the set of all bit strings, let \(M \in \{0,1\}^* \) be a message, and let k be the no. of blocks of size l. Then the keyless EITRH (\(H{:}\,\{0,1\}^* \rightarrow \{0,1\}^n\)) accepts a message \(M \in \mathcal {M}\) and maps it to an n-bit digest as follows:

$$ h(i) = \left\{ \begin{array}{ll} h(M_i,H_0), &{} \quad 0 < i \le k-1 \\ h(M_i,H_{00}), &{} \quad i = k \end{array} \right. $$

and

$$\begin{aligned} H_i = h_1 \parallel h_2 \ldots \parallel h_k, M \rightarrow H_i\,\text {until}\, H_i \rightarrow \{0,1\}^n \end{aligned}$$

Definition 2

Let \(\{0,1\}^*\) denote the set of all bit strings, let \(M \in \{0,1\}^* \) be a message, and let k be the no. of blocks of size l. Then the keyed EITRH (\(H_k{:}\, \{0,1\}^{\mathscr {K}} \times \{0,1\}^* \rightarrow \{0,1\}^n\)) accepts a message \(M \in \mathcal {M}\) along with a fixed-length key \(\mathscr {K} \in \mathcal {M}\) and maps it to an n-bit digest as follows:

$$ h(i) = \left\{ \begin{array}{ll} h(M_i,H_0), &{} \quad 0 < i \le k-1 \\ h(M_i,H_{\mathscr {K}}), &{} \quad i = k \end{array} \right. $$

and

$$ H_i = h_1 \parallel h_2 \ldots \parallel h_{k-1} \parallel h_k, M \rightarrow H_i\,\text {until}\, H_i \rightarrow \{0,1\}^n $$

Definitions 1 and 2 formally describe the keyless EITRH and keyed EITRH, respectively [33]. The sub-sections of this section further explain the structure and properties of the framework.

3.1 Describing the Framework

As depicted in Fig. 1, EITRH has an inverted tree structure. The transformation comprises three recursive steps: “message expansion,” “parallel reduction,” and “hash value generation.” The process terminates when the size \(h_n\) of the final hash value is reached, where \(h_n\) depends upon the EITRH variant being used.

Fig. 1 Enveloped Inverted Tree Recursive Hashing (EITRH) framework

Step 1: “Message Expansion” This step is performed at every level of the inverted tree of height \(h_t\) (determined by the file size and the EITRH variant being used) and is necessary for improving the sensitivity of the hashing process. The message \(M_i\) at level \(h_i\) (\(0< i < h_t\)) is divided into k blocks of size l (determined by the variant used), where \(k = \lceil | M_i | / l \rceil \). Padding is applied to the kth (last) block by appending the bit string 10* followed by the length of \(M_i\). The no. of 0s is adjusted to make the padded block length a multiple of l.

$$ M_i = M_i \parallel 10 \ldots 0 \parallel \left| M_i \right| \quad \text {for each level}\ h_i . $$

Parallelism is achieved by processing the blocks concurrently on a multi-core processor. For enveloping, two distinct initial vectors of size \(h_n\) are used: \(\text {IV}_1 (H_0)\) for the first \(k-1\) blocks and \(\text {IV}_2 (H_{00})\) for the kth block.
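To make the expansion step concrete, the following is a minimal sketch in Python. It works at byte granularity and uses an illustrative padding layout (a 0x80 marker byte, a run of zero bytes, and an 8-byte length field); the function name, byte-level granularity, and exact layout are assumptions for illustration, not the reference specification.

```python
def expand_message(message: bytes, l: int) -> list[bytes]:
    """Step 1 sketch: pad the message with 10...0 || |M_i| and split it
    into k blocks of l bytes each (illustrative byte-level layout)."""
    # reserve 1 byte for the 0x80 (binary 10000000) marker and 8 bytes for the length
    pad_zeros = (-(len(message) + 1 + 8)) % l
    padded = (message
              + b"\x80"
              + b"\x00" * pad_zeros
              + len(message).to_bytes(8, "big"))
    return [padded[i:i + l] for i in range(0, len(padded), l)]
```

The first \(k-1\) blocks would then be hashed with \(\text {IV}_1\) and the kth block with \(\text {IV}_2\), as described in Step 2.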

Step 2: “Parallel Reduction” For each block, reduction is performed in parallel using the nonlinear compression function f() and the corresponding IVs (\(\text {IV}_1 (H_0)\) for the first \(k-1\) blocks and \(\text {IV}_2 (H_{00})\) for the kth block). Once the hash values are computed for each of the blocks, they become internal state hashes for the next level. These values are concatenated to generate \(H_i\), which acts as the message M for the next level of the tree. Thus, an \(h_n\)-bit collective hash \(H_i\) is obtained by recursive application of steps 1 and 2.

$$ h(i) = \left\{ \begin{array}{ll} h(M_i,H_0), &{} \quad 0 < i \le k-1 \\ h(M_i,H_{00}), &{} \quad i = k \end{array} \right. $$

This step is uniform for each block of \(M_i\) and is performed independently, since there is no chaining input from the previous blocks.

Step 3: “Hash Generation” Once all the blocks of level \(h_t - 1\) have been processed using steps 1 and 2, the final digest is generated by \(h = H(M_{h_t-1},H_{00})\).

Algorithm 1 presents the pseudocode for the EITRH domain extender in a data parallel model.

Algorithm 1 Pseudocode for the EITRH domain extender (data-parallel model)
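The following self-contained Python sketch illustrates one way such a data-parallel domain extender could look. SHA-1 from hashlib stands in for the nonlinear compression function f(); the block size, the IV values, and the convention of prepending the IV to the block are illustrative assumptions rather than the reference design.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

BLOCK_SIZE = 64                   # l, in bytes (illustrative)
DIGEST_SIZE = 20                  # h_n for an RSHA-1-like variant (160 bits)
IV1 = b"\x00" * DIGEST_SIZE       # H_0   (placeholder value)
IV2 = b"\xff" * DIGEST_SIZE       # H_00  (placeholder value)

def expand_message(message: bytes, l: int = BLOCK_SIZE) -> list[bytes]:
    """Step 1: pad with 10...0 || |M_i| and split into l-byte blocks (as sketched above)."""
    pad_zeros = (-(len(message) + 1 + 8)) % l
    padded = message + b"\x80" + b"\x00" * pad_zeros + len(message).to_bytes(8, "big")
    return [padded[i:i + l] for i in range(0, len(padded), l)]

def compress(block: bytes, iv: bytes) -> bytes:
    """Stand-in for the nonlinear compression function f(M_i, IV)."""
    return hashlib.sha1(iv + block).digest()

def eitrh(message: bytes, iv2: bytes = IV2) -> bytes:
    """Recursive EITRH domain extender sketch: expand, reduce in parallel, recurse."""
    with ProcessPoolExecutor() as pool:
        level = message
        while True:
            blocks = expand_message(level)                    # step 1: message expansion
            ivs = [IV1] * (len(blocks) - 1) + [iv2]           # envelope the last block
            hashes = list(pool.map(compress, blocks, ivs,     # step 2: parallel reduction
                                   chunksize=1024))
            level = b"".join(hashes)                          # H_i = h_1 || h_2 || ... || h_k
            if len(level) <= DIGEST_SIZE:                     # step 3: final digest reached
                return level

if __name__ == "__main__":
    print(eitrh(b"abc" * 1_000_000).hex())
```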

3.2 Properties

As described in Sect. 2, EITRH is an improved construction designed to overcome the security flaws in the ITRH construction. Like ITRH, it is a recursion-based construction. Passing \(\text {IV}_1 (H_0)\) to each block separately removes data dependencies, making parallel reduction possible in both constructions. However, unlike its predecessor, EITRH is an enveloped construction: it uses an additional input \(\text {IV}_2 (H_{00})\) for hashing the last block at each level of the tree. Enveloping the last block in this manner improves the security level of the transformation and its non-correlation, making it resistant to multi-collision attacks and partial pre-image attacks, in addition to the generic attacks. Also, due to \(\text {IV}_2\), it is possible to use EITRH as both a keyless and a keyed construction: if \(H_{00}\) in Definition 1 is replaced with a fixed-length key \(\mathscr {K}\), as defined in Definition 2, a keyless hash function is transformed into a keyed variant simply by replacing \(\text {IV}_2\) in the compression function with \(\mathscr {K}\), without any separate provision for handling the key.
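In terms of the illustrative sketch given after Algorithm 1 (where the second IV is already a parameter of the assumed eitrh function), the keyed variant of Definition 2 then needs no separate key-handling code; the snippet below is a hedged illustration reusing those assumed names.

```python
def eitrh_keyed(message: bytes, key: bytes) -> bytes:
    """Keyed EITRH sketch: the fixed-length key K simply takes the place of IV2 (H_00)."""
    assert len(key) == DIGEST_SIZE        # K is a fixed-length key of h_n bits
    return eitrh(message, iv2=key)
```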

3.3 EITRH Variants

Table 1 lists the CHF variants using the EITRH construction and their internal state sizes. The proposed variants are derived from the SHA family, with EITRH applied to enhance security and parallelization. RSHA-1 [31], like SHA-1, gives a digest of 160 bits. Similarly, the digest sizes of RSHA-224, RSHA-256, RSHA-384, and RSHA-512 correspond to those in the SHA-2 standard.

This paper presents the analysis of EITRH based on the experiments done using RSHA-1. The outcomes can be extended to other variants as well.

Table 1 CHFs based on EITRH

4 Security Analysis of EITRH

The EITRH transformation is designed to be a multi-property preserving domain extension transformation which can preserve multiple properties in addition to the ideal properties, viz. pre-image resistance (PIR), weak collision resistance (WCR), and collision resistance (CR). The sub-sections of this section provide the security analysis of the proposed framework.

4.1 Pre-image Resistance (PIR)

The pre-image resistance property refers to the one-wayness of hash functions: inverting the hash value to retrieve the original message should be computationally infeasible. The PIR property is strongly preserved in EITRH as compared to the MD construction because of its recursive nature. The fact that the final digest is itself a hash of hashes makes backtracking to deduce the input message computationally difficult.

4.2 Weak Collision Resistance (WCR)

Weak collision resistance, also called second pre-image resistance, is the property which ensures that, given a message M, it is difficult for an adversary (A) to find an \(M''\) that generates an identical hash. Due to this property, attacks involving falsification of the message may be thwarted by using an encrypted hash. For an n-bit digest, the effort required to find a second pre-image is proportional to \(2^n\). In Sect. 4.4, it is shown that the EITRH transformation supports the “avalanche effect.” Therefore, for any message, it would be computationally infeasible for an adversary to find another distinct message mapping to the same hash value, as the level of effort required for this type of attack is \(2^n\), which increases exponentially with n.

4.3 Collision Resistance (CR)

The strong collision resistance property implies that it should be computationally difficult for A to find two distinct messages M and \(M''\) such that \(H(M)=H(M'')\). For an n-bit digest, attacking CR requires an effort of \(2^{\frac{n}{2}}\), much less than for pre-image and second pre-image attacks, on account of the birthday paradox. Consider two inputs M and \(M''\) (where \(M\ne M''\)) of the same size, differing by \(\Delta \), and let the difference be confined to only half of the input blocks. A partial collision may occur, affecting the intermediate hashes and/or the input to the next stage. Due to recursion, a collision in one half of the intermediate hashes ensures that the other half still differs by \(\Delta \). Further, the colliding values become input for the nonlinear compression function f() in the next stage, which uses sub-key values distinct from those used in the previous stages. As a result, the internal collision is unlikely to persist in the succeeding stages.

4.4 Non-correlation Resistance (Confusion and Diffusion Analysis)

The avalanche effect for CHFs, as in encryption, requires a complex correlation between the input and hash value bits, such that a change in the former should cause a drastic change in at least half of the output bits.

Fig. 2 Testing EITRH for the confusion and diffusion properties by measuring the distribution of the changed bits \(L_i\) in a simulation. The algorithm used is RSHA-1 with a digest size of 160 bits. The values of \(L_i\) suggest a change in more than 50% of the output bits with every single-bit change in the input file

EITRH demonstrated the avalanche effect when subjected to tests for confusion and diffusion on a simulator. A random text file from the t5-corpus was used for this purpose; its hash was calculated using the EITRH variant RSHA-1. A random bit in the original input file was toggled and the hash was re-calculated. The two hash values were XORed and the set bits were counted, giving the no. of output bits that had changed. The test was repeated N times (where \(N=128\), 256, 512 or 1024), and the results shown in Fig. 2 were plotted after the following calculations.

$$\begin{aligned} \text {Minimum no. of bits changed}{:}\,\, L_{\text {min}} = \min (\{L_i\}^N_{i=1}) \end{aligned}$$
(1)
$$\begin{aligned} \text {Maximum no. of bits changed}{:}\, \,L_{\text {max}} = \max (\{L_i\}^N_{i=1}) \end{aligned}$$
(2)
$$\begin{aligned} \text {Mean no. of bits changed}{:}\,\, \overline{L} = \frac{1}{N} \sum ^N_{i=1} L_i \end{aligned}$$
(3)
$$\begin{aligned} \text {Mean Changed Probability}{:}\,\, \text {Pr} = \frac{\overline{L}}{\text {Hash Size}} \times 100 \% \end{aligned}$$
(4)
$$\begin{aligned} \text {Standard Deviation of bits changed}{:}\,\, \sigma _L = \sqrt{\frac{1}{N-1}\sum ^N_{i=1} (L_i - \overline{L})^2} \end{aligned}$$
(5)
$$\begin{aligned} \text {Standard deviation of changed probability}{:} \,\, \sigma _{\text {Pr}} = \sqrt{\frac{1}{N-1}\sum ^N_{i=1} \bigg (\frac{L_i}{\text {Hash Size}} - \text {Pr}\bigg )^2} \times 100 \% \end{aligned}$$
(6)

Here, N is the total no. of tests, \(L_i\) is the no. of bits changed in the ith test, and \(\sigma _{L}\) and \(\sigma _{\text {Pr}}\) indicate the stability of the confusion and diffusion.
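The test procedure and the statistics of Eqs. (1)–(6) can be reproduced with a short script such as the sketch below. It uses SHA-1 as a stand-in for an RSHA-1 implementation and a random buffer in place of the t5-corpus file; both substitutions, as well as the helper names, are assumptions made for illustration.

```python
import hashlib
import random
import statistics

HASH_BITS = 160
N = 1024                                                  # no. of tests

def digest_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha1(data).digest(), "big")   # stand-in for RSHA-1

random.seed(1)
message = bytearray(random.randbytes(1 << 20))            # 1 MiB pseudo-random "file"

L = []
for _ in range(N):
    h1 = digest_int(bytes(message))
    bit = random.randrange(len(message) * 8)              # pick a random input bit
    message[bit // 8] ^= 1 << (bit % 8)                   # toggle it
    h2 = digest_int(bytes(message))
    message[bit // 8] ^= 1 << (bit % 8)                   # restore the original input
    L.append(bin(h1 ^ h2).count("1"))                     # no. of changed output bits, L_i

L_bar = statistics.mean(L)
print("L_min =", min(L), " L_max =", max(L))                             # Eqs. (1), (2)
print("mean L = %.2f, Pr = %.2f%%" % (L_bar, 100 * L_bar / HASH_BITS))   # Eqs. (3), (4)
print("sigma_L = %.2f" % statistics.stdev(L))                            # Eq. (5)
print("sigma_Pr = %.2f%%" % (100 * statistics.stdev(x / HASH_BITS for x in L)))  # Eq. (6)
```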

Table 2 records the statistics \(L_{\text {min}}\), \(L_{\text {max}}\), \(\overline{L}\), Pr, \(\sigma _{L}\), and \(\sigma _{\text {Pr}}\) calculated using Eqs. (1)–(6), respectively. The table also shows the mean values of these statistics. It can be noted that the mean values of \(L_i\) and Pr for EITRH-based RSHA-1 are 83.94 and 52.46%, respectively. These are close to the benchmark values for a standard 160-bit CHF such as SHA-1, i.e., 80 bits and 50%. \(\sigma _{L}\) and \(\sigma _{\text {Pr}}\) indicate the stability of the algorithm's confusion and diffusion capability. The results show that the proposed algorithm is resistant to statistical attacks.

Table 2 Statistics of the no. of changed bits \(L_i\)

4.5 Partial Pre-image Resistance (PPR)

Partial pre-image resistance, or local one-wayness, is the computational difficulty of retrieving the message even partially, given that A knows a part of the message. The EITRH transformation preserves the PPR property. Since the final digest is a hash of hashes, it is difficult for A to recover the message even partially: each intermediate hash at every level is itself an n-bit digest and thus can never lead to the discovery of the original message, even partially.

4.6 Pseudorandom Oracle Preservation (PRO-Pr)

The EITRH transformation is an inverted tree hash inspired by the EMD transformation. EITRH, like EMD, uses two distinct initial vectors (IVs). The second IV is used as an input while hashing the last block of each level. This leads to a high probability of the random oracle behaving independently at the last application. Since EMD is proven to be PRO-Pr [32], EITRH, which uses the same principle, is PRO-Pr.

5 Performance Evaluation

The sub-sections of this section present an evaluation of EITRH variants based on the performance metrics for a parallel program, namely complexity, speedup, efficiency, cost optimality and iso-efficiency, using some analytical tools [31].

5.1 Algorithm Complexity

For calculating the algorithmic complexity of EITRH, the following assumptions were made.

Suppose the complexity of calculating the hash value H(M(i)) for each message block M(i) is \(\text {Cmplx}(H(M(i)))\) and the no. of original blocks is b. Then the computational complexity of the standard CHF model is given by Eq. (7).

$$\begin{aligned} b\cdot \text {Cmplx}(H(M(i))) \end{aligned}$$
(7)

The first phase of EITRH on p processors, where \(p \ge 1\), has the complexity

$$\begin{aligned} \frac{b}{p}\cdot \text {Cmplx}(H(M(i))) \end{aligned}$$
(8)

For the parallel reduction phase of EITRH, the no. of message blocks determines the no. of times H(M(i)) is called. After each reduction, the length of the remaining message is \(\frac{x}{y}\) times the length of the message at the previous level, where x is the size of the hash to be generated and y is the block size; x and y depend upon the variant of EITRH. Therefore, to arrive at the x-bit hash value, \(\lceil {\text {log}}_{\frac{x}{y}}b\rceil \) recursive reductions are required, and the no. of times H(M(i)) is called, \(n_{\text {call}}\), is calculated as follows:

$$\begin{aligned} \begin{aligned} n_{\text {call}}&= \bigg \lceil \bigg (\frac{x}{y}\bigg )^1\cdot \frac{b}{p}\bigg \rceil +\bigg \lceil \bigg (\frac{x}{y}\bigg )^2\cdot \frac{b}{p}\bigg \rceil \cdots +\bigg \lceil \bigg (\frac{x}{y}\bigg )^z\cdot \frac{b}{p}\bigg \rceil \\&= \sum ^z_{i=1}\bigg \lceil \bigg (\frac{x}{y}\bigg )^i\cdot \frac{b}{p}\bigg \rceil \\ \end{aligned} \end{aligned}$$
(9)

where

$$ z=\lceil {\log }_{\frac{x}{y}}b\rceil , \quad \frac{x}{y} = \frac{\text {Hash Size}}{\text {Block Size}} $$
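As a small numeric illustration, Eq. (9) can be evaluated directly. In the sketch below, the number of levels z is computed from the magnitude of the exponent, i.e., as \(\lceil \log b / \log (y/x) \rceil \), reflecting the fact that the block count shrinks by the factor \(\frac{x}{y}\) at every level; the concrete values of x, y, b, and p are assumptions chosen for an RSHA-1-like variant.

```python
import math

def n_call(b: int, p: int, x: int = 160, y: int = 512) -> int:
    """Total no. of compression-function calls in the reduction phase, per Eq. (9)."""
    z = math.ceil(math.log(b) / math.log(y / x))      # no. of recursive reductions
    return sum(math.ceil((x / y) ** i * b / p) for i in range(1, z + 1))

# e.g. a 64 MiB file split into 512-bit blocks, hashed on p = 8 cores (illustrative)
b = (64 * 2**20 * 8) // 512
print(b, n_call(b, p=8))          # prints the block count and the reduction-phase call count
```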

Now, the complexity for the complete reduction process is given by:

$$\begin{aligned} O(\text {rdc}) = n_{\text {call}}\cdot \text {Cmplx}(H(M(i))) \end{aligned}$$
(10)

From Eqs. (8) and (10), the overall computational complexity \(f(\text {rdc})\) of EITRH transformation is given as:

$$\begin{aligned} \begin{aligned} f(\text {rdc})&= \frac{b}{p}\cdot \text {Cmplx}(H(M(i)))+n_{\text {call}}\cdot \text {Cmplx}(H(M(i))) \\&= \bigg (\frac{b}{p}+n_{\text {call}}\bigg )\cdot \text {Cmplx}(H(M(i))) \end{aligned} \end{aligned}$$
(11)

Therefore, from Eqs. (7) and (11), it may be concluded that even though the EITRH transformation requires more computations than the Merkle-Damgård (MD) construction, in terms of asymptotic computational complexity EITRH is at par with the MD construction.

5.2 Speedup

Suppose that the reference CHF model (SHA-1 in this case) is able to hash a message block M(i) in time T(M(i)), measured as the no. of elementary steps in the computation (assuming each step completes in constant time).

Thus, for b message blocks, time \(=b\cdot T(M(i))\)

Therefore, the sequential time can be given as:

$$\begin{aligned} T_s=b\cdot T(M(i))=\Theta (b) \end{aligned}$$
(12)

From Fig. 1, we know that the process of calculating a hash using EITRH involves producing the intermediate hashes of the sub-blocks, concatenating these hashes, and using recursion to determine the final hash value. The time complexity of EITRH on p processors, where \(p\ge 1\), is computed as follows:

Let T(M(i)) be the time required to hash the ith block.

Let \(t_c\) be the constant time required for concatenating the intermediate hashes.

From Eq. (9), the no. of recursive calls (\(n_{\text {call}}\)) is calculated as:

$$\begin{aligned} n_{\text {call}}&= \bigg \lceil \bigg (\frac{x}{y}\bigg )^1\cdot \frac{b}{p}\bigg \rceil +\bigg \lceil \bigg (\frac{x}{y}\bigg )^2\cdot \frac{b}{p}\bigg \rceil \cdots +\bigg \lceil \bigg (\frac{x}{y}\bigg )^z\cdot \frac{b}{p}\bigg \rceil \\&= \sum ^z_{i=1}\bigg \lceil \bigg (\frac{x}{y}\bigg )^i\cdot \frac{b}{p}\bigg \rceil \\ \end{aligned}$$

where

$$\begin{aligned} z=\lceil {\text {log}}_{\frac{x}{y}}b\rceil \quad \frac{x}{y} = \frac{\text {Hash Size}}{\text {Block Size}} \end{aligned}$$

Time taken for reduction and recursive calls \(=n_{\text {call}}\cdot T(M(i)).\)

So, parallel time is given by:

$$\begin{aligned} T_p&= \frac{b}{p}\cdot T(M(i))+n_{\text {call}}\cdot T(M(i))+t_c \nonumber \\&=T(M(i))\cdot \bigg (\frac{b}{p}+n_{\text {call}} \bigg ) =\Theta \bigg (\frac{b}{p}+n_{\text {call}} \bigg ) \end{aligned}$$
(13)

Based on the above results, total parallel overhead is given by:

$$\begin{aligned} T_o&= p\cdot T_p-T_s \end{aligned}$$
(14)
$$\begin{aligned}&= p\cdot T(M(i))\cdot \bigg (\frac{b}{p}+n_{\text {call}} \bigg )-b\cdot T(M(i)) \nonumber \\&=p\cdot n_{\text {call}}\cdot T(M(i)) =\Theta (p\cdot n_{\text {call}}) \end{aligned}$$
(15)

Now, from Eqs. (12) and (13), speedup can be given as:

$$\begin{aligned} S(b)&= \frac{T_s}{T_p} =\frac{b}{\bigg (\frac{b}{p}+ n_{\text {call}}\bigg )} =\frac{b}{b\cdot \bigg (\frac{1}{p}+ \frac{n_{\text {call}}}{b}\bigg )}\nonumber \\&=\frac{1}{\bigg (\frac{1}{p}+ \frac{n_{\text {call}}}{b}\bigg )} =\Theta \bigg (\frac{p}{1+ \frac{p}{b}\cdot n_{\text {call}}}\bigg ) \end{aligned}$$
(16)

The speedup S(b) (from Eq. (16)) is a function of both the problem size b and the no. of processing cores p. The growth of S(b) is sub-linear with respect to b. Therefore, if b is increased while keeping p fixed, there is no significant increase in speedup, as observed in Fig. 3. However, there is a proportional increase in speedup if both b and p are increased, as shown in Fig. 4.
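A short calculation based on Eqs. (9) and (16) reproduces this behavior; the variant parameters, file sizes, and core counts below are illustrative assumptions.

```python
import math

def n_call(b: int, p: int, x: int = 160, y: int = 512) -> int:
    """Reduction-phase call count from Eq. (9)."""
    z = math.ceil(math.log(b) / math.log(y / x))
    return sum(math.ceil((x / y) ** i * b / p) for i in range(1, z + 1))

def speedup(b: int, p: int) -> float:
    """S(b) = T_s / T_p = b / (b/p + n_call) from Eq. (16)."""
    return b / (b / p + n_call(b, p))

def blocks(mib: int) -> int:
    """File size in MiB -> no. of 512-bit blocks."""
    return (mib * 2**20 * 8) // 512

# fixed p = 32, growing b: speedup stays nearly flat (cf. Fig. 3)
print([round(speedup(blocks(m), 32), 2) for m in (64, 256, 1024)])

# b and p grown together: speedup grows roughly in proportion (cf. Fig. 4)
print([round(speedup(blocks(64 * k), 8 * k), 2) for k in (1, 2, 4, 8)])
```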

Fig. 3 Speedup versus file size (b): simulation results showing that the speedup is almost constant (around 26) when b is increased while the no. of cores (p) is kept fixed

Fig. 4 Speedup versus the no. of processing elements (p) and problem size (b): simulation results showing that the speedup increases when both b and p are increased proportionally

5.3 Efficiency

Efficiency is an indicator of resource utilization. It is measured as the fraction of time for which the processor is usefully employed. Its value (E(b)) lies in the range \(\frac{1}{b}\le E(b) \le 1.\) The efficiency of the EITRH variant on p processors can be calculated as:

$$\begin{aligned} E(b)&= \frac{\text {Speedup}}{\text {No. of processors}}=\frac{T_s}{p\cdot T_p} \end{aligned}$$

E(b) can be expressed in terms of the parallel overhead, using Eqs. (14) and (15):

$$\begin{aligned}&E(b)= \frac{1}{1+\frac{T_o}{T_s}} \quad \therefore \quad E(b)= \Theta \bigg (\frac{1}{1+ \frac{p\cdot n_{\text {call}}}{b}}\bigg ) \end{aligned}$$
(17)

Generally, \(b \gg p\); nevertheless, \(\frac{p\cdot n_{\text {call}}}{b}\) remains nonzero, so the efficiency cannot reach 1 due to the parallel overhead. From Eq. (17), it is evident that, unlike speedup, efficiency increases if the problem size is increased while keeping the no. of processing elements fixed. However, due to Amdahl's law, the speedup tends to saturate and, correspondingly, the efficiency drops as p increases, as shown in Figs. 4 and 5.
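Under the same assumptions, Eq. (17) can be evaluated by reusing the n_call and blocks helpers from the speedup sketch above; the parameters are again illustrative.

```python
def efficiency(b: int, p: int) -> float:
    """E(b) = 1 / (1 + p * n_call / b) from Eq. (17)."""
    return 1.0 / (1.0 + p * n_call(b, p) / b)

# fixed p = 32, growing file size: efficiency improves slightly and then saturates (cf. Fig. 5)
print([round(efficiency(blocks(m), 32), 3) for m in (1, 8, 64)])
```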

Fig. 5 Efficiency versus the no. of processors (p) and problem size (b): simulation results showing that efficiency improves when b is increased while keeping p constant, and drops as p increases

5.4 Cost Optimality

In parallel systems, cost optimality means that the cost of solving a problem, as a function of input size, has the same asymptotic growth as the fastest known sequential algorithm on a single core [34]. The cost of using EITRH framework for computing hash value is:

$$\begin{aligned} T_{\text {cost}}&= p\cdot T_p \nonumber \\&= p\cdot T(M(i))\cdot \bigg (\frac{b}{p}+n_{\text {call}}\bigg ) \nonumber \\&= \Theta (b+p\cdot n_{\text {call}}) \end{aligned}$$
(18)

From Eq. (18), it may be concluded that as long as the input size \(b=\Omega (p\cdot n_{\text {call}})\), EITRH is cost optimal as the cost is \(\Theta (b)\), which is the same as the cost for sequential algorithm (see Eq. (12)).

5.5 Minimum Execution Time

For a problem of fixed size, increasing the no. of processing cores does not always lead to an improvement in the execution time of the program. Initially, the execution time decreases until it reaches a minimum value; after this minimum, any further increase in p leads to an increase in run-time. Therefore, the minimum parallel execution time \(T_p^{\text {min}}\), given by Eq. (19), can be calculated as follows:

$$\begin{aligned} \frac{d}{dp}T_p&= 0\\ \frac{d}{dp}T_p&=\frac{d}{dp}\bigg (\frac{b}{p}+n_{\text {call}}\bigg )=-\frac{b}{p^2}+c \end{aligned}$$

where c is a constant (taken as unity here).

$$ \therefore \,\,b=p^2 \implies p=\sqrt{b} $$

Substituting \(p=\sqrt{b}\) in Eq. (13), we get

$$\begin{aligned} T_p^{\text {min}}=2\sqrt{b} \end{aligned}$$
(19)

5.6 Scalability

Scalability is defined as the ability of a parallel system to maintain its efficiency at a constant level even as the size of the system is expanded by increasing both the problem size and the no. of processing elements. It is measured using an iso-efficiency function [34], defined as follows:

$$\begin{aligned} \text {Problem size}\ (W)=\frac{E}{1-E}\cdot T_o(T_s,p) \end{aligned}$$

From Eq. (15), the parallel overhead function of EITRH is approximately \(p\cdot n_{\text {call}}.\) Therefore,

$$\begin{aligned} W=\frac{E}{1-E}\cdot p\cdot n_{\text {call}} \end{aligned}$$
(20)

Using asymptotic notation, the iso-efficiency of the EITRH framework is \(\Theta (p\cdot n_{\text {call}}).\) This suggests that, if the no. of processors/cores is increased by a factor of \(\frac{p'}{p}\) (\(p'\) being the increased no. of processors), then in order to gain speedup by the same factor (i.e., to maintain the same efficiency), the problem size b should be correspondingly increased by a factor of \(\frac{p'\cdot n'_{\text {call}}}{p\cdot n_{\text {call}}}.\)

The discussion presented in this section evaluated the EITRH framework with respect to various metrics for parallel programs. The results suggest that the EITRH framework is at par with its sequential counterparts in terms of scalability, efficiency, and cost optimality.

6 Conclusion and Future Work

In this paper, a fast parallel recursive tree-based transformation, called EITRH, was proposed. It can be used to derive parallel hashing algorithms from their sequential counterparts. The proposed transformation combines tree hashing with the EMD construction. It has three recursive steps, viz. “message expansion,” “parallel reduction,” and “hash value generation.” The envelope of the transform, provided by the use of the second initial vector (\(\text {IV}_2\)), makes it suitable for both keyed and keyless CHFs. Further, this paper showed that EITRH satisfies the essential security properties for hash functions, namely PIR, WCR, CR, the avalanche effect, PPR, and PRO-Pr, as discussed in the previous sections. This suggests that EITRH is a secure and flexible hash transformation that is multi-property preserving. Performance evaluation of the framework indicates its scalability for hashing long messages, implying potential applications in digital forensics and the parallelization of digital signatures. In the future, the theoretical results presented in this paper shall be confirmed by practical implementation of CHFs based on the EITRH transformation.