1 Introduction

Cryptography plays an important role in secure communication systems. To protect data, encryption algorithms such as AES and Camellia are used. AES is a symmetric key algorithm specified in Federal Information Processing Standards (FIPS) Publication 197, issued by the National Institute of Standards and Technology (NIST) in 2001 [1] as a replacement for the Data Encryption Standard (DES) [21]. AES was adopted in IEEE Standard 802.15.6 for wireless body area networks (WBAN) because of its high security, efficiency, ease of implementation and support for higher data rates [10]. Moreover, the algorithm is widely used in applications such as secure communication and RFID tags.

The AES algorithm has been realized in hardware using pipelining, sub-pipelining and loop-unrolling architectures to achieve maximum throughput. For example, a fully pipelined encryption architecture implemented in 0.18-\(\upmu \)m CMOS ASIC technology achieved a throughput of 30–70 Gbits/s [13]. Although these architectures are efficient for applications that require high throughput, their hardware realizations occupy more area and consume more power. In these architectures, the classical S-Box was traditionally designed using LUTs [19, 24]. To enhance speed and avoid the unbreakable delay of LUTs, S-Boxes have also been designed using composite field arithmetic, which decomposes the Galois field GF\((2^8)\) into GF\(((2^4)^2)\) or GF\((((2^2)^2)^2)\) using an isomorphic mapping [15, 28]. S-Boxes realized using binary decision diagrams (BDD) and TBoxes provided a throughput of 10 Gbps [2, 9]. An FPGA-based implementation of an AES processor is reported in [18]. The works reported so far mainly emphasize enhancing throughput and reducing hardware complexity [11]. However, there is a need for an alternative architecture that is secure enough while occupying less area and consuming less energy. Since WBAN applications demand ultralow power to increase battery lifetime, we propose a PCA-based S-Box realization, which greatly reduces power consumption compared to the conventional LUT-based S-Box realization. To check the level of security of the proposed PCA-based S-Box, we also evaluate cryptographic properties such as nonlinearity, entropy, correlation immunity bias, balancedness and the strict avalanche criterion. It is found that the proposed PCA-based S-Box gives comparable security performance with respect to the LUT-based S-Box (Table 4).

The rest of this paper is organized as follows. The AES algorithm is revisited in Sect. 2. The formulation of the S-Box using cellular automata is discussed in Sect. 3. The proposed PCA-based dynamic S-Box and its architecture are presented in Sect. 4. The LUT-based S-Box and the PCA-based S-Box are compared in terms of cryptographic properties and architectural implementation in Sect. 5, and conclusions are drawn in Sect. 6.

2 Concept of AES Algorithm

The AES algorithm is a symmetric key cryptographic algorithm, shown in Fig. 1, which uses four transformations in each round, namely substitution bytes (S-Box), shift rows (SR), mix columns (MC) and add round key (ARK), to generate cipher text from plain text with the desired level of security.

The number of rounds \((N_r)\) used in the AES algorithm is determined by the relation \(N_r=\frac{S_k}{32}+6\), where \(S_k\) is the key size in bits. For wireless body area network (WBAN) applications, IEEE Standard 802.15.6 recommends a secret key size of 128 bits for encryption and decryption, which results in 10 rounds [10].
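As a quick check of the relation above, the following Python sketch evaluates \(N_r\) for the three standard AES key sizes. The function name `aes_rounds` is illustrative and does not come from the original work.

```python
def aes_rounds(key_bits):
    """Number of AES rounds N_r = S_k / 32 + 6 for a key of S_k bits."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES key size must be 128, 192 or 256 bits")
    return key_bits // 32 + 6


if __name__ == "__main__":
    for size in (128, 192, 256):
        print(size, "bit key:", aes_rounds(size), "rounds")
```

For the 128-bit key recommended for WBAN this gives the 10 rounds used in the rest of the paper.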

Fig. 1 A block diagram of AES encryption

The initial add round key is the 128-bit secret key itself, which is combined with the input data using an EXOR operation; the subsequent round keys are generated in the key expansion phase. Of the 10 rounds, the first 9 perform all four transformations, whereas the last round performs only three (S-Box, SR and ARK), as illustrated in Fig. 1. In each round of encryption, the algorithm applies these operations to a \(4 \times 4\) array of bytes called the state, as described in the following subsections.

2.1 Substitution Bytes

In S-Box transformation, each byte of the input state is substituted by another byte using a precomputed lookup table (LUT). The S-Box is computed by multiplicative inverse over the Galois finite field GF (\(2^{8}\)), using the irreducible polynomial \(p(x)=x^{8}+x^{4}+x^{3}+x+1\), followed by an affine transformation. Mathematically, the affine transformation of S-Box in matrix is as follows:

$$\begin{aligned} \begin{bmatrix} b^{'}_{0}\\ b^{'}_{1}\\ b^{'}_{2}\\ b^{'}_{3}\\ b^{'}_{4}\\ b^{'}_{5}\\ b^{'}_{6}\\ b^{'}_{7}\\ \end{bmatrix}= \begin{bmatrix} 1&\quad 0&\quad 0&\quad 0&\quad 1&\quad 1&\quad 1&\quad 1 \\ 1&\quad 1&\quad 0&\quad 0&\quad 0&\quad 1&\quad 1&\quad 1 \\ 1&\quad 1&\quad 1&\quad 0&\quad 0&\quad 0&\quad 1&\quad 1 \\ 1&\quad 1&\quad 1&\quad 1&\quad 0&\quad 0&\quad 0&\quad 1 \\ 1&\quad 1&\quad 1&\quad 1&\quad 1&\quad 0&\quad 0&\quad 0 \\ 0&\quad 1&\quad 1&\quad 1&\quad 1&\quad 1&\quad 0&\quad 0 \\ 0&\quad 0&\quad 1&\quad 1&\quad 1&\quad 1&\quad 1&\quad 0 \\ 0&\quad 0&\quad 0&\quad 1&\quad 1&\quad 1&\quad 1&\quad 1 \\ \end{bmatrix} \begin{bmatrix} b_{0}\\ b_{1}\\ b_{2}\\ b_{3}\\ b_{4}\\ b_{5}\\ b_{6}\\ b_{7}\\ \end{bmatrix}+ \begin{bmatrix} 1\\ 1\\ 0\\ 0\\ 0\\ 1\\ 1\\ 0\\ \end{bmatrix} \end{aligned}$$
(1)

Traditionally, the classical S-Box is implemented using memory cells that store the 256 possible 8-bit output values of the \(8 \times 8\) (8-bit input, 8-bit output) S-Box. For a 128-bit input block, a total of sixteen LUT-based S-Boxes are required by the AES algorithm. The LUT-based S-Box in hexadecimal form is given in Table 1. For example, if the input is a5, the substituted value is found in Table 1 at the intersection of row a and column 5, which gives 06.
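The construction described above (multiplicative inverse in GF(\(2^8\)) followed by the affine transformation of Eq. (1)) can be sketched in Python as follows. This is a minimal software illustration, not the hardware realization discussed in this paper; the helper names `gf_mul`, `gf_inv` and `affine` are illustrative, and bit 0 is assumed to be the least significant bit.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo p(x) = x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # reduce when the degree reaches 8
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8); 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):       # a^254 = a^(-1), since a^255 = 1
        result = gf_mul(result, a)
    return result


def affine(b):
    """Affine transformation of Eq. (1) with constant vector 0x63."""
    out = 0
    for i in range(8):
        bit = 0x63 >> i
        for shift in (0, 4, 5, 6, 7):   # nonzero entries of matrix row i
            bit ^= b >> ((i + shift) % 8)
        out |= (bit & 1) << i
    return out


SBOX = [affine(gf_inv(x)) for x in range(256)]

if __name__ == "__main__":
    print(format(SBOX[0xA5], "02x"))
```

The entry for input a5 reproduces the value 06 read from Table 1 in the example above.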

Table 1 LUT-based S-Box

2.2 Shift Rows

In the SR transformation, the first row of the state remains unchanged and the subsequent three rows are shifted cyclically to the left by 1, 2 and 3 bytes, respectively. This transformation provides diffusion in the cipher text.
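A minimal Python sketch of this row rotation is given below. It assumes the state is stored as a list of four rows of four bytes each; the function name `shift_rows` is illustrative.

```python
def shift_rows(state):
    """Rotate row r of a 4x4 state (a list of 4 rows) left by r bytes."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


if __name__ == "__main__":
    state = [[0, 1, 2, 3],
             [4, 5, 6, 7],
             [8, 9, 10, 11],
             [12, 13, 14, 15]]
    for row in shift_rows(state):
        print(row)
```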

2.3 Mix Columns

The MC transformation operates column-wise: each column of the state is treated as a four-term polynomial over GF(\(2^8\)) and multiplied modulo \(x^4+1\) by the fixed polynomial \(A(x)=(03H)x^3+(01H)x^2+(01H)x+(02H)\). Mathematically, this operation can be written in matrix form as follows:

$$\begin{aligned}&S^{'}(x)=A(x)\otimes S(x). \end{aligned}$$
(2)
$$\begin{aligned}&\begin{bmatrix} S^{'}_{0,C}\\ S^{'}_{1,C}\\ S^{'}_{2,C}\\ S^{'}_{3,C}\\ \end{bmatrix}= \begin{bmatrix} 02H&\quad 03H&\quad 01H&\quad 01H \\ 01H&\quad 02H&\quad 03H&\quad 01H \\ 01H&\quad 01H&\quad 02H&\quad 03H \\ 03H&\quad 01H&\quad 01H&\quad 02H \\ \end{bmatrix} \begin{bmatrix} S_{0,C}\\ S_{1,C}\\ S_{2,C}\\ S_{3,C}\\ \end{bmatrix} \end{aligned}$$
(3)

where \( 0\le C < 4\).
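The matrix product of Eq. (3) can be illustrated for a single column with the Python sketch below. Multiplication by 02H is a left shift with conditional reduction by the AES polynomial, and multiplication by 03H is that result EXORed with the original byte. The names `xtime`, `mul_const` and `mix_column` are illustrative.

```python
def xtime(a):
    """Multiply a byte by 02H in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def mul_const(byte, coeff):
    """Multiply a byte by one of the MixColumns coefficients 01H, 02H, 03H."""
    if coeff == 1:
        return byte
    if coeff == 2:
        return xtime(byte)
    if coeff == 3:
        return xtime(byte) ^ byte
    raise ValueError("coefficient must be 1, 2 or 3")


MIX_MATRIX = [[2, 3, 1, 1],
              [1, 2, 3, 1],
              [1, 1, 2, 3],
              [3, 1, 1, 2]]


def mix_column(column):
    """Apply the matrix of Eq. (3) to one 4-byte column of the state."""
    result = []
    for row in MIX_MATRIX:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= mul_const(byte, coeff)   # addition in GF(2^8) is EXOR
        result.append(acc)
    return result


if __name__ == "__main__":
    print([format(b, "02x") for b in mix_column([0xDB, 0x13, 0x53, 0x45])])
```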

2.4 Add Round Key

In the ARK transformation, the state is combined with a round key using an EXOR operation. Each round key consists of four 4-byte words \(w_i\) generated by the key expansion. The key expansion block generates a total of 4(\(N_r\)+1) words, that is, \(N_r+1\) round keys. The first round key is the initial 128-bit secret key, and the subsequent round keys are calculated iteratively using SubWord, RotWord and Rcon. Each ARK is a 4-word output of the key expansion block, denoted by ARK \(i = (w_{4i},~w_{4i+1},~w_{4i+2},~w_{4i+3})\), where \(i=0,\ldots ,N_r\). SubWord applies the nonlinear S-Box substitution to each byte of a word. RotWord is a cyclic left shift of a word by one byte. Rcon is an array of constant words in which only the leftmost byte is nonzero.
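The key expansion for a 128-bit key can be sketched in Python as follows. This is a software illustration of the standard FIPS 197 schedule, not of the hardware key expansion block used in this work; the helper names (`gf_mul`, `sbox_byte`, `sub_word`, `rot_word`, `expand_key`) are illustrative, and the S-Box is recomputed from its definition so that the block is self-contained.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_byte(x):
    """AES S-Box entry: inverse in GF(2^8) followed by the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):            # x^254 = x^(-1)
            inv = gf_mul(inv, x)
    out = 0
    for i in range(8):
        bit = 0x63 >> i
        for shift in (0, 4, 5, 6, 7):
            bit ^= inv >> ((i + shift) % 8)
        out |= (bit & 1) << i
    return out


SBOX = [sbox_byte(x) for x in range(256)]


def sub_word(word):
    """SubWord: apply the S-Box to each byte of a 32-bit word."""
    return int.from_bytes(bytes(SBOX[b] for b in word.to_bytes(4, "big")), "big")


def rot_word(word):
    """RotWord: cyclic left shift of a 32-bit word by one byte."""
    return ((word << 8) | (word >> 24)) & 0xFFFFFFFF


def expand_key(key):
    """Return the 4 * (N_r + 1) = 44 words w_0 .. w_43 for a 16-byte key."""
    words = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rcon = 0x01
    for i in range(4, 44):
        temp = words[i - 1]
        if i % 4 == 0:
            temp = sub_word(rot_word(temp)) ^ (rcon << 24)
            rcon = gf_mul(rcon, 0x02)   # next round constant
        words.append(words[i - 4] ^ temp)
    return words


if __name__ == "__main__":
    key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
    schedule = expand_key(key)
    print([format(w, "08x") for w in schedule[4:8]])   # round key ARK 1
```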

3 Reformulation of S-Box Using Cellular Automata

The basic function of the S-Box is to transform one byte of input data into another byte using a predefined lookup table (LUT). The truth table of the S-Box is a function \( f : B^n\) \(\rightarrow \) \(B^m\), with \(n=m=8\) for AES.

The LUT-based S-Box architecture requires more area and consumes more energy, which makes it unsuitable for IEEE Standard 802.15.6 WBAN applications; these demand a cryptographic algorithm that is highly secure, small in area and low in energy consumption. To meet these requirements, we propose a cellular automata (CA)-based architecture for the realization of the S-Box.

The basic structure of a CA is shown in Fig. 2. It consists of a finite group of cells, here \(R_0\) to \(R_7\), each storing one of the two states 0 and 1, which evolve at discrete time steps according to a deterministic rule. If the rightmost and leftmost (extreme) cells of this finite CA are considered adjacent to each other, the CA is called a periodic boundary CA. The one-dimensional periodic boundary CA evolves according to the neighborhood configurations of an elementary CA. Each elementary CA consists of a central cell i surrounded by neighborhood cells within a defined radius r; the total number of cells in the neighborhood, including the central cell i, is therefore \(n_i=2r+1\). We consider \(r=1\), so the neighborhood consists of the three cells \(R_{i-1}^{t},~R_{i}^{t},~R_{i+1}^{t}\) and the total number of possible neighborhood configurations is \(L=2^{n_i}=8\). The next state of the central cell, \(R_i^{t+1}\), at time step \((t+1)\) depends on the current state of the central cell \(R_i^{t}\) and of its neighbors \(R_{i-1}^{t}\) and \(R_{i+1}^{t}\) at time t through a deterministic rule function \(f_p\). Mathematically, \(R_i^{t+1}\) can be expressed as

$$\begin{aligned} R_i^{t+1}=f_{p}(R_{i-1}^{t},R_{i}^{t},R_{i+1}^{t}) \end{aligned}$$
(4)

The deterministic rules \(f_p\) are identified by their decimal form, as shown in Table 2, and the total number of CA rules considered is \(2^L=256\). If the rule of a CA can be expressed using only EXOR and/or EXNOR logic, the CA is called an additive CA. Additive CAs are used in VLSI testing, bit-error-correcting codes and data encryption. If all the cells in a CA evolve using the same deterministic rule, the CA is called a uniform CA. The dynamics of a one-dimensional periodic uniform CA depend on the deterministic rule \(f_p\) and the number of iterations. In this paper, we consider a programmable cellular automaton (PCA), a modified version of the one-dimensional periodic uniform CA in which all the cells in the lattice obey the same rule [26].
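The decimal rule numbering and the additive property can be illustrated with a short Python sketch. It assumes the usual convention in which the neighborhood \((R_{i-1}^{t},R_{i}^{t},R_{i+1}^{t})\) is read as a 3-bit number with \(R_{i-1}^{t}\) as the most significant bit and selects the corresponding bit of the 8-bit rule; the names `rule_output`, `rule_table` and `is_additive` are illustrative.

```python
def rule_output(rule, left, centre, right):
    """f_p of Eq. (4): next state of the central cell for a decimal rule."""
    index = (left << 2) | (centre << 1) | right
    return (rule >> index) & 1


def rule_table(rule):
    """Rule outputs for the neighborhoods 111, 110, ..., 000 in that order."""
    return [rule_output(rule, (k >> 2) & 1, (k >> 1) & 1, k & 1)
            for k in range(7, -1, -1)]


def is_additive(rule):
    """True if the rule is an EXOR/EXNOR of a subset of its three inputs.

    This is checked with the affinity test
    f(a) ^ f(b) ^ f(a ^ b) ^ f(0) == 0 for all neighborhoods a, b.
    """
    f = [(rule >> k) & 1 for k in range(8)]
    return all(f[a] ^ f[b] ^ f[a ^ b] ^ f[0] == 0
               for a in range(8) for b in range(8))


if __name__ == "__main__":
    for rule in (90, 75):
        print(rule, rule_table(rule), "additive" if is_additive(rule) else "non-additive")
```

Under this convention rule 90 is the EXOR of the two neighbors and is therefore additive, whereas rule 75 is not.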

Fig. 2 A cellular automata array of size \((R_0-R_7)\) with a circular boundary condition

Table 2 Truth table for rule 90 and 75
figure a

The functioning of the PCA-based S-Box with 256 different rules is shown in Algorithm 1. There exists a relationship between the number of iterations (discrete time steps t) and the number of CA cells \(R_0\) to \(R_7\) in the lattice: the diversification from input to output is high if the time step t is not less than the size of the lattice. In the algorithm, D is loaded with an 8-bit random initial value, so all \(2^8\) possible initial states are taken into consideration. Each 8-bit initial state of the PCA evolves under each of the 256 deterministic rules, and NOI denotes the number of iterations, which varies from 1 to 50.
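Since Algorithm 1 is not reproduced in the text, the following Python sketch shows one plausible reading of it under explicitly stated assumptions: the 8-bit input D is loaded as the initial CA state (bit i holding cell \(R_i\)), the same rule is applied to all cells with periodic boundary for NOI iterations, and the final state is taken as the S-Box output. The names `ca_step`, `pca_sbox` and `is_bijective` are illustrative and are not taken from the original algorithm.

```python
def ca_step(state, rule, n=8):
    """One synchronous update of an n-cell uniform CA with periodic boundary.

    Assumption: bit i of `state` holds cell R_i, and the neighborhood
    (R_{i-1}, R_i, R_{i+1}) selects one bit of the 8-bit rule.
    """
    nxt = 0
    for i in range(n):
        left = (state >> ((i - 1) % n)) & 1
        centre = (state >> i) & 1
        right = (state >> ((i + 1) % n)) & 1
        index = (left << 2) | (centre << 1) | right
        nxt |= ((rule >> index) & 1) << i
    return nxt


def pca_sbox(rule, noi, n=8):
    """Map every n-bit input D to the CA state reached after `noi` steps."""
    table = []
    for d in range(1 << n):
        state = d
        for _ in range(noi):
            state = ca_step(state, rule, n)
        table.append(state)
    return table


def is_bijective(table):
    """True if every output value occurs exactly once."""
    return len(set(table)) == len(table)


if __name__ == "__main__":
    for rule in (51, 90, 204):
        table = pca_sbox(rule, noi=20)
        print("rule", rule, "bijective:", is_bijective(table))
```

Whether a given rule and iteration count yield a usable substitution has to be judged with the cryptographic properties of Sect. 5; the sketch only makes the dependence of the output table on the rule and on NOI explicit.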

4 Proposed PCA-Based S-Box

The S-Box provides confusion in the cipher text and hence plays an important role in the AES algorithm. The conventional LUT-based S-Box realization uses a large number of memory cells, which consume more power. Moreover, secret information can be revealed from existing AES architectures using power analysis attacks [23].

To overcome these limitations, we propose a PCA-based architecture for the S-Box that has low energy consumption and is dynamic in nature. Unlike the conventional LUT-based S-Box, the output of the proposed S-Box is a function of an input rule that can be programmed. As discussed in Sect. 2, each encryption round of the AES algorithm consists of the substitution bytes (S-Box), shift rows (SR), mix columns (MC) and add round key (ARK) transformations, except the last round, which omits MC. The substitution bytes transformation in each round of encryption is replaced with the proposed PCA-based S-Box, as shown in Fig. 3. The proposed PCA-based S-Box is described in detail in the following paragraphs.

A total of 256 rules can be programmed into the registers at discrete time steps. The output \(R_{i}^{t+1}\) depends on the input control signals \(R_{i-1}^{t},R_{i}^{t},R_{i+1}^{t}\) as shown in Fig. 4. For example, in Table 2, if 90 is the input rule and 110 is the input data, the output \(R_{i}^{t+1}\) is 1.

Fig. 3 Encryption process of each round

Fig. 4 PCA basic cell structure

The output of the proposed basic PCA structure with given 8-bit input rule is one bit as shown in Fig. 4. In order to implement the S-Box which operates on 8 bits, eight such basic cells shown in Fig. 4 need to be interconnected.

The proposed architecture of the \(8 \times 8\) PCA-based S-Box is implemented using logic gates, multiplexers and registers, as shown in Fig. 5. The initial 8 bits of the CA array are loaded into register R\(_1\) using preset and clear signals. The bits in register R\(_1\) are applied in circular fashion as control signals to the 8:1 MUXs (M\(_1\)–M\(_8\)), whose data input is the 8-bit rule. The first three bits \(R_7\), \(R_0\) and \(R_1\) act as control signals to M\(_1\), the rotated bits \(R_0\), \(R_1\) and \(R_2\) to M\(_2\), and so on up to the last MUX M\(_8\), whose control signals are \(R_6\), \(R_7\) and \(R_0\). Each MUX produces the output given by the rule truth table (Table 2). The multiplexer outputs are used as the CA array bits in the subsequent iteration.

The control logic consists of a 6-bit up-counter and a comparator. When the count value equals the programmed number of iterations, the output of the control logic goes high to enable the register (\(R_2\)). The latency incurred in computing the S-Box therefore depends on the number of PCA iterations. On the other hand, the ASIC implementation of the PCA-based S-Box architecture shown in Fig. 5 uses fewer logic elements than the LUT-based S-Box [14]. As a result, the CA-based S-Box architecture consumes less power and requires a smaller chip area, which makes this hardware realization well suited to WBAN applications.

Fig. 5 Proposed PCA-based S-Box

5 Performance Comparison Between Conventional LUT S-Box and Dynamic PCA S-Box

To analyze the security aspects, the output bits obtained from the proposed PCA-based S-Box architecture described in Sect. 4 are taken as input to a MATLAB program that computes the cryptographic properties. To examine the S-Box using these properties, its m-bit output is decomposed into single-bit Boolean functions \(f_{i}\) : \(B^n \rightarrow B \), where \(i\in \{1,\ldots ,m\}\). Since the S-Box is a mapping \( f :B^{n}\rightarrow B^{m}\), there exist m such functions \( \mu =\{f_{1},f_{2},\ldots , f_{m}\}\). The truth table of a Boolean function f in polarity form is written as \(f_{k}(x)\) = \((-1)^{f(x)}\).

$$\begin{aligned} f_\beta (x)=(\alpha _1f_1(x)\oplus \alpha _2f_2(x)\oplus \alpha _3f_3(x)\ldots \oplus \alpha _mf_m(x)) \end{aligned}$$
(5)

The Boolean function \(f_\beta \) is a linear combination of the m functions \(f_i(x)\), \(i \le m\), where the coefficients \(\alpha _i~\in ~B\) form a vector \((\alpha _1,\ldots ,\alpha _m)\in B^m\).

Table 3 Symbolic representation

5.1 Analysis Using Cryptographic Properties

The level of security provided by the PCA-based S-Box is analyzed using cryptographic properties for 256 different rules, with the discrete time step t varying from 8 to 50. It is also observed that the diversification in the output pattern of the PCA-based S-Box is high if the number of iterations used to evolve the CA is not less than the number of cells K in the lattice \((t \ge K)\) [3]. The symbols used for the cryptographic properties are listed in Table 3.

5.1.1 Balancedness Property

A Boolean function is balanced if the number of 1s in its truth table equals the number of 0s. Balancedness is measured by the Hamming weight: the output is balanced if the Hamming weight is \(2^{n-1}\). This property is observed in both the proposed PCA-based S-Box and the conventional LUT-based S-Box.
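The balancedness test can be sketched in Python as below. The S-Box is assumed to be given as a list of \(2^n\) integers, and each output bit is examined as a separate Boolean function; the permutation used in the demonstration is an arbitrary stand-in, not the proposed PCA S-Box, and the function names are illustrative.

```python
def component(table, j):
    """Truth table of output bit j of an S-Box given as a list of integers."""
    return [(y >> j) & 1 for y in table]


def hamming_weight(truth):
    """Number of 1s in a truth table."""
    return sum(truth)


def is_balanced(truth):
    """True if the Hamming weight equals 2^(n-1), i.e. half the table size."""
    return 2 * hamming_weight(truth) == len(truth)


if __name__ == "__main__":
    # Stand-in example: any bijective 8-bit table has balanced output bits.
    table = [(7 * x + 3) % 256 for x in range(256)]
    print([is_balanced(component(table, j)) for j in range(8)])
```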

5.1.2 Strict Avalanche Criterion

The strict avalanche criterion (SAC) requires that when a single input bit of a Boolean function is complemented, each output bit changes with probability one half. A Boolean function \({f_i}\) satisfies SAC if \(f(x)\oplus f(x\oplus \alpha )\) is balanced for every \(\alpha \) of Hamming weight 1. The SAC of the S-Box is denoted by \(\varUpsilon _{S}\).

$$\begin{aligned} \mathrm{SAC}_{f_i}=\mathrm{max}_{1\le i\le n}|2^{n-1}-\sum \limits _{x\in B^{n}} f(x)\oplus f(x\oplus c_{i}^{n})| \end{aligned}$$
(6)

In an S-Box, \( f :B^{n}\rightarrow B^{m}\), and hence, there exist m functions \( \mu =\{f_{1},f_{2},\ldots f_{m}\}\). For a function of n variables, \(B^{n}\) consists of all \(2^{n}\) possible inputs, and \(c_{i}^{n}\) ranges over the elements of \(B^{n}\) whose Hamming weight is 1.

$$\begin{aligned} \varUpsilon _{S}=\mathrm{max}(\mathrm{SAC}_{\mu }) \end{aligned}$$
(7)

A smaller SAC value makes the cipher more difficult to cryptanalyze. The achieved SAC values lie in [0,128], and the best value, 14, is observed for more than 26% of the rules, as shown in Fig. 6. In terms of SAC, the PCA S-Box performs better than the classical S-Box.
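Equations (6) and (7) can be evaluated with the Python sketch below, which measures how far each single-bit input flip is from changing the output for exactly half of the inputs. The function names are illustrative, and the two small test functions are textbook examples rather than data from this work.

```python
def sac_deviation(truth, n):
    """Eq. (6): max over single-bit flips c_i of |2^(n-1) - sum f(x) ^ f(x ^ c_i)|."""
    worst = 0
    for i in range(n):
        c = 1 << i                                   # input difference of weight 1
        changes = sum(truth[x] ^ truth[x ^ c] for x in range(1 << n))
        worst = max(worst, abs((1 << (n - 1)) - changes))
    return worst


def sbox_sac(table, n, m):
    """Eq. (7): worst SAC deviation over the m output-bit functions."""
    return max(sac_deviation([(y >> j) & 1 for y in table], n)
               for j in range(m))


if __name__ == "__main__":
    and2 = [0, 0, 0, 1]                  # f(x) = x0 AND x1, satisfies SAC
    lin8 = [x & 1 for x in range(256)]   # f(x) = x0, worst case
    print(sac_deviation(and2, 2), sac_deviation(lin8, 8))
```

A value of 0 means the criterion is met exactly, and 128 is the worst case for \(n=8\), which matches the range [0,128] quoted above.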

5.1.3 Nonlinearity

The nonlinearity of a Boolean function f is its minimum distance to the set of affine functions; the nonlinearity of the S-Box is denoted by \(\mathfrak {R}_{S}\).

$$\begin{aligned} N_{f}=\mathrm{min}[d(f, g)],~\mathrm{where}~ g\in A_{n} \end{aligned}$$
(8)

where \(A_{n}\) is the set of all affine functions of n variables.

$$\begin{aligned} d(f,g)=2^{n-1}-2^{-1}(\langle \eta ,\beta \rangle ) \end{aligned}$$
(9)

where \(\eta \) and \(\beta \) represent the polarity-form sequences of f and g, respectively, and \(\langle \eta ,\beta \rangle \) denotes the scalar product of the two sequences.

$$\begin{aligned} N_{f} =2^{n-1}-2^{-1}[\mathrm{max}(\langle \eta ,\beta _{j}\rangle )] \end{aligned}$$
(10)

where \(\beta _{j}\) ranges over the sequences of all the functions in \(A_{n}\). In an S-Box, \( f :B^{n}\rightarrow B^{m}\), and hence, there exist m functions \( \mu =\{f_{1},f_{2},\ldots f_{m}\}\).

$$\begin{aligned} \mathfrak {R}_{S}=\mathrm{min}(N_{\mu }) \end{aligned}$$
(11)

Fig. 6 Value of SAC with different rules

A higher nonlinearity makes the cipher harder to cryptanalyze. The observed nonlinearity varies over [0,109]. Moreover, the nonlinearity is more than 100 for 15% of the 256 CA rules, as shown in Fig. 7, which indicates that the performance of the PCA S-Box is comparable to that of the classical LUT-based S-Box.
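The nonlinearity of Eqs. (8)–(11) is commonly computed with a fast Walsh–Hadamard transform of the polarity form, since the scalar products \(\langle \eta ,\beta _{j}\rangle \) with all linear sequences are exactly the Walsh coefficients, and taking the absolute value accounts for the complemented (affine) functions. The Python sketch below follows that approach; it is not the authors' MATLAB code, and the function names are illustrative.

```python
def walsh_spectrum(truth):
    """Fast Walsh-Hadamard transform of the polarity form (-1)^f(x)."""
    w = [1 - 2 * bit for bit in truth]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for k in range(start, start + h):
                a, b = w[k], w[k + h]
                w[k], w[k + h] = a + b, a - b       # butterfly step
        h *= 2
    return w


def nonlinearity(truth):
    """Eq. (10): 2^(n-1) minus half the largest absolute Walsh coefficient."""
    return len(truth) // 2 - max(abs(v) for v in walsh_spectrum(truth)) // 2


def sbox_nonlinearity(table, m):
    """Eq. (11): minimum nonlinearity over the m output-bit functions."""
    return min(nonlinearity([(y >> j) & 1 for y in table]) for j in range(m))


if __name__ == "__main__":
    # x0*x1 XOR x2*x3 is a bent function of 4 variables (nonlinearity 6).
    bent4 = [((x & 1) & ((x >> 1) & 1)) ^ (((x >> 2) & 1) & ((x >> 3) & 1))
             for x in range(16)]
    print(nonlinearity(bent4))
```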

Fig. 7 Nonlinearity with 256 rules

5.1.4 Input/Output Entropy

The entropy of a Boolean function f is the amount of information remaining in the input bits when the output bits are known. A Boolean function with n inputs and m outputs has \(2^n\) possible inputs and \(2^m\) possible outputs. The \((i,j)\mathrm{th}\) input/output bit-to-bit entropy of the S-Box is computed as \(H(\frac{x_i}{f_j(x)})\), and the entropy of the S-Box is denoted by \(H_S\).

$$\begin{aligned} H=\mathrm{min}[H (\frac{x_i}{f_j(x)})] ~ [i\in \{1,n\},j\in \{1,m\}] \end{aligned}$$
(12)

The entropy \(H_S\) of the output Boolean function f is given by

$$\begin{aligned} H_{f_{i}}(P_i)= P_{i}\mathrm{log}_{2}\left( \frac{1}{P_{i}}\right) +(1- P_{i})\mathrm{log}_{2}\left( \frac{1}{1-P_{i}}\right) \end{aligned}$$
(13)

where \(P_i\) is the fraction of 1s in the output. In a S-Box, \( f :B^{n}\rightarrow B^{m}\), and hence, there exists m number of functions \( \mu =\{f_{1},f_{2},\ldots f_{m}\}\).

$$\begin{aligned} H_{S} = \mathrm{min}(H_{\mu }) \end{aligned}$$
(14)

A higher entropy makes the cipher more difficult to cryptanalyze. The entropy observed for the PCA S-Box lies in [0,1]; the best value attained is 0.99, and the entropy for most of the CA rules lies in [0.95, 0.99], as shown in Fig. 8. The values achieved by the conventional LUT S-Box are given in Table 5. With respect to entropy, the PCA-based S-Box performs better than the classical S-Box.
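Equations (13) and (14) can be evaluated directly from the fraction of 1s in each output bit, as in the Python sketch below. Only the output entropy of Eq. (13) is covered, not the bit-to-bit conditional entropy of Eq. (12); the function names and the stand-in permutation are illustrative.

```python
import math


def binary_entropy(p):
    """Eq. (13): entropy of an output bit that equals 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0                       # limit value, avoids log of zero
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


def sbox_entropy(table, m):
    """Eq. (14): minimum output entropy over the m output-bit functions."""
    size = len(table)
    return min(binary_entropy(sum((y >> j) & 1 for y in table) / size)
               for j in range(m))


if __name__ == "__main__":
    table = [(7 * x + 3) % 256 for x in range(256)]   # stand-in permutation
    print(sbox_entropy(table, 8))
```

A balanced output bit has entropy 1, so values slightly below 1 indicate a small imbalance between 0s and 1s.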

Fig. 8 Entropy with different rules

5.1.5 Correlation Immunity Bias

A Boolean function f is said to be correlation immune of order l if its output is statistically independent of any combination of l input bits; the correlation immunity bias (CIB) measures the deviation from this condition. Mathematically, if l input bits are fixed, then we obtain \(^{n}C_{l}~2^{l}\) restricted functions g, and the correlation immunity bias of the S-Box is denoted by \(\varPhi _{S}(l)\).

$$\begin{aligned} \mathrm{CIB}_{f}(l)=\mathrm{max}|2^{l}*W(g_{j})-W(f)| \end{aligned}$$
(15)

where \(W(g_{j})\) denotes the Hamming weight of each function \(g_j\) obtained by keeping l input bits of the function f fixed, and W(f) is the Hamming weight of the function f. In an S-Box, \( f :B^{n}\rightarrow B^{m}\), and hence, there exist m functions \( \mu =\{f_{1},f_{2},\ldots f_{m}\}\).

$$\begin{aligned} \varPhi _{S}(l)=\mathrm{max}(\mathrm{CIB}_{\mu }) \end{aligned}$$
(16)

Fig. 9 CIB with different rules

A smaller CIB makes the cipher more difficult to cryptanalyze. The CIB observed for the PCA S-Box lies in [0,128]; the best value achieved is 0, and values less than 14 are obtained for 23% of the 256 CA rules, as shown in Fig. 9. The value observed for the classical S-Box is 14, as indicated in Table 5. From these observations, the PCA-based S-Box performs better than the classical S-Box.
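Equation (15) can be evaluated by brute force, enumerating every choice of l fixed input positions and every assignment to them, as in the Python sketch below. The function names are illustrative and the two test functions are textbook examples, not results from this work.

```python
from itertools import combinations, product


def cib(truth, n, l):
    """Eq. (15): max over restrictions g_j of |2^l * W(g_j) - W(f)|."""
    weight_f = sum(truth)
    worst = 0
    for positions in combinations(range(n), l):        # which l bits are fixed
        for values in product((0, 1), repeat=l):        # what they are fixed to
            weight_g = sum(
                truth[x] for x in range(1 << n)
                if all(((x >> p) & 1) == v for p, v in zip(positions, values))
            )
            worst = max(worst, abs((1 << l) * weight_g - weight_f))
    return worst


def sbox_cib(table, n, m, l):
    """Eq. (16): worst bias over the m output-bit functions."""
    return max(cib([(y >> j) & 1 for y in table], n, l) for j in range(m))


if __name__ == "__main__":
    xor2 = [0, 1, 1, 0]                  # x0 XOR x1: immune of order 1
    lin8 = [x & 1 for x in range(256)]   # x0 alone: fully correlated
    print(cib(xor2, 2, 1), cib(lin8, 8, 1))
```

A bias of 0 corresponds to exact correlation immunity of order l, and 128 is the worst case for \(n=8\) and \(l=1\).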

Table 4 CIB, SAC, NL, entropy values for PCA-based S-Box and standard AES S-Box using cryptographic properties
Table 5 Security of LUT-based S-Box using cryptographic properties

The security of the proposed PCA-based S-Box, the CA-based S-Box and the conventional LUT-based S-Box is compared in terms of cryptographic properties in Tables 4 and 5. In Table 4, the nonlinearity of the CA-based S-Box varies over [0,128], with a best value of 128. The SAC of the CA-based S-Box [4, 20] also lies in [0,128], with a best value of 0. Table 4 shows that the proposed PCA-based S-Box attains 10% better nonlinearity than those of Clark et al. [7] and Millian et al. [12]. The value of CIB is 15% better than that of Clark et al. [7]. The nonlinearity and SAC attained by the proposed PCA-based S-Box are also better than those of Hussain et al. [8].

The PCA-based S-Box is flexible and dynamic in nature, and it provides a level of security comparable to that of the LUT-based S-Box.

5.2 Architectural Design

To validate the proposed architecture, the AES algorithm with the PCA-based S-Box was implemented in Verilog, verified on an FPGA board and synthesized with Cadence RTL Compiler. The proposed architecture operates at different clock frequencies with TSMC 0.18-\(\upmu \)m technology (core voltage of 1.62 V) and UMC 0.13-\(\upmu \)m technology (core voltage of 1.08 V) under worst-case conditions. The total time taken to encrypt 128 bits of plain text is calculated as Latency  = Clock cycles \(\times \) Time period. The performance of AES with the PCA-based S-Box and AES with the LUT-based S-Box is compared in Table 6 in terms of area, power dissipation, energy consumption and operating frequency. In the proposed realization, 20 clock cycles (iterations) are used to compute the PCA S-Box, and encrypting 128 bits of plain text using AES with the PCA-based S-Box takes a total of 244 clock cycles (Table 6).

Table 6 Hardware results of the proposed AES algorithm with PCA-based S-Box

The LUT-based and composite field arithmetic-based S-Box realizations used 696 and 294 gates, respectively, in 0.11-\(\upmu \)m technology [5], whereas the proposed dynamic PCA-based S-Box uses 113 and 116 gates with the 0.18- and 0.13-\(\upmu \)m technology libraries, respectively. Sumio et al. [2] presented an optimized low-power S-Box architecture for AES that consumes 29 \(\upmu \)W at 10 MHz in 0.13-\(\upmu \)m CMOS technology, whereas the proposed PCA-based S-Box consumes 10 \(\upmu \)W at 10 MHz in the same technology, that is, about 65% less power than the existing work [2]. The work reported in [25] requires a power consumption of 7.55 mW for encryption in 0.18-\(\upmu \)m technology at 13.56 MHz, while the proposed AES with PCA-based S-Box operates at a clock frequency of 13.69 MHz with a power consumption of 3.259 mW for encryption and an area of 0.184 mm\(^2\), which is 58% less compared to Eslami et al. [25]. The ASIC implementation of the AES algorithm with a composite field arithmetic-based S-Box in 0.18-\(\upmu \)m technology takes 500 clock cycles to encrypt a 128-bit plain text when operated at 1 MHz, with a power consumption of 51.20 \(\upmu \)W [16]. The power dissipation and energy consumption of the AES algorithm with the proposed PCA-based S-Box at 1 MHz in 0.18-\(\upmu \)m technology are 94.07 \(\upmu \)W and 22.95 nJ, whereas the energy consumption in Manoj et al. [16] is 25.60 nJ; this is a reduction in energy consumption of about 10% relative to the existing work [16]. Our result also shows a 28% reduction in energy consumption compared to the results of Kaps et al. [17]. It is clear from Table 6 that the proposed AES with PCA-based S-Box outperforms the existing works in terms of power dissipation and energy consumption.
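The energy figures quoted above follow from Energy = Power \(\times \) Latency with Latency = Clock cycles \(\times \) Time period. The Python sketch below reproduces the arithmetic using only the numbers stated in the text (244 cycles and 94.07 \(\upmu \)W for the proposed design, 500 cycles and 51.20 \(\upmu \)W for [16], both at 1 MHz); the function names are illustrative.

```python
def latency_s(clock_cycles, frequency_hz):
    """Latency = clock cycles x clock period, in seconds."""
    return clock_cycles / frequency_hz


def energy_j(power_w, clock_cycles, frequency_hz):
    """Energy per 128-bit encryption = power x latency, in joules."""
    return power_w * latency_s(clock_cycles, frequency_hz)


if __name__ == "__main__":
    proposed = energy_j(94.07e-6, 244, 1e6)     # AES with PCA-based S-Box
    reference = energy_j(51.20e-6, 500, 1e6)    # composite-field design [16]
    saving = (reference - proposed) / reference
    print(f"proposed:  {proposed * 1e9:.2f} nJ")
    print(f"reference: {reference * 1e9:.2f} nJ")
    print(f"saving:    {saving * 100:.1f} %")
```

Although the proposed design dissipates more power at 1 MHz, it needs fewer than half the clock cycles, which is why its energy per encryption is lower.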

6 Conclusion

In this paper, we proposed an ultralow-power, small-area architecture of AES with a dynamic PCA-based S-Box for WBAN applications. We evaluated the architecture through simulation and synthesis for ASIC implementation. Unlike the designs in [2, 6, 17, 25, 27], the proposed design requires few logic elements; hence, power, energy and chip area are reduced compared to conventional AES with the LUT-based S-Box. In terms of cryptographic properties, the dynamic PCA-based S-Box achieves security performance comparable to that of the classical LUT S-Box. The design was synthesized using Cadence RTL Compiler to evaluate area, power and operating frequency. With TSMC 0.18 \(\mu \)m technology, the maximum operating frequency achieved is 536 MHz, with an area of 0.189 mm\(^2\) and a power consumption of 98.33 mW. With UMC 0.13 \(\mu \)m technology, the design achieves an area of 0.072 mm\(^2\) and a power consumption of 32.059 mW at an operating frequency of 769 MHz. Therefore, AES with the dynamic PCA-based S-Box is an ultralow-power, low-energy encryption algorithm and is hence suitable for WBAN applications.