Keywords

1 Introduction

Electronic cryptographic devices are widely used today for securing the secret information. However, there are emerging issues about side channel attack. In the scope of the statistical power analysis attack on cryptographic systems, two efficient techniques were proposed. The first one is well known Difference Power Analysis (DPA) introduced by Paul Kocher [1, 2] and formalized by Thomas Messerges et al. [3]. It uses statistical tools to find out the information correlates to confidential key. The second one is Correlation Power Analysis (CPA) that proposed by Brier et al. [4]. It exploits the correlation between the power model and real power consumption, in order to leak the secret key. These techniques and measurements are carefully taken into account by embedded system designers to know an attacker can measure the power consumption or electromagnetic emanations, which are two of the main physical quantities used for non-invasive attack.

In this paper, we focus on CPA in order to evaluate the security of the AES cryptography because it is an efficient technique to recover the secret key with a simple power consumption model. In fact, we realize that when the cryptographic devices run in low frequency without or with very small noise, CPA only needs a hundred of power traces to successfully extract all secret key. However, for the device running in high frequency and high noise, CPA needs a huge amount of power traces to recover the whole key. Besides, to evaluate the protection method for a secure device, the designers need to consider to the real scenarios. Side channel analysis with real devices usually leads to the high level of noise, then the efficient attacks may require the large numbers of power traces. Therefore, finding an efficient way to analyze those power traces is critical.

The rest of this paper is organized as follows. The original CPA and some related works are described in Sect. 2. In Sect. 3, we will introduce a new CPA, which is an efficient CPA technique that allows to recover secret key byte at least 2× faster than the original CPA. Then, in Sect. 4, we show and compare the analysis results on cryptographic devices, which is an implementation of AES running on XMEGA MCU (on ChipWhisperer board). Finally, we conclude the paper in Sect. 5.

2 Original CPA and Related Works

2.1 Original CPA

CPA was first proposed by Brier et al. in [4]. It exploits the correlation between the real power consumption and the power consumption model. It is based on a power consumption model of the running device at some certain points of time which must depend on the fixed secret key and the plain text changed for each trace. For the Advanced Encryption Standard (AES) attack, attackers mostly used the performing after the first Sbox function to analyze because it has strongly correlation between the real power consumption and power consumption model. The most widely used consumption model is the Hamming distance between two relevant values in the same register [4], or simply the Hamming weight of the particular value [5, 6]. The correlation between the power model and actual power consumption is calculated through Pearson’s correlation coefficient by evaluating the linear relationship. The formula of this correlation coefficient between ai and bi is given by:

$$ \rho_{{}} = \frac{{\text{cov} (A,B)}}{{\sigma_{A} \sigma_{B} }} $$
(1)

Next, we will discuss the analysis technique for AES algorithm. AES is a cryptographic algorithm executing on one byte separately. Therefore, this paper will focus on analyzing the key bytes independently. Let’s denote that p is Pearson’s correlation coefficient, N power traces of length L, tn,i is the consumption value of point i in trace n (with 1 ≤ iL, 1 ≤ nN). We have K possible values of a subkey (in this paper K = 256), we note hn,k the power consumption model of key k, in trace n and calculated by following formula:

$$ h_{n,k} = \, HW\left( {Sbox\left( {Plaintext_{n} \oplus k} \right)} \right) $$
(2)

where HW denotes the Hamming weight of the Sbox output. With this data, we can see how well the power model and actual power consumption correlation is for each guess key byte k at each time i, by the following formula:

$$ \rho_{k,i} = \frac{{\sum\limits_{n = 1}^{N} {(h_{n,k} - \bar{h}_{k} )(t_{n,i} - \bar{t}_{i} } )}}{{\sqrt {\sum\limits_{n = 1}^{N} {(h_{n,k} - \bar{h}_{k} )^{2} \sum\limits_{n = 1}^{N} {(t_{n,i} - \bar{t}_{i} } )^{2} } } }} $$
(3)

where \( \bar{h}_{k} \) and \( \bar{t}_{i} \) are the average of the power consumption model and real power consumption respectively at instant i, respectively.

It can be seen that, by taking the maximum of \( \rho_{k,i} \) among all values for i and k, we can decide which power consumption model of the key is most correlated with the actual power consumption. In (3), i is the value of number of samples that we acquired in a trace. Fortunately, we don’t need to use all of samples in a power trace, we only focus on a byte that has a leakage position as illustrated in Fig. 1. We use N random plaintexts corresponding to N traces in which each trace has L samples. Note that ti,j is the value of jth sample in the trace number ith (1 < j < L, 1 < i < N), di,B is the byte value of byte B (\( B \in (1;16) \)) in the plaintext number ith. In order to compute the correlation coefficient based on (3), Algorithm 1 as shown below is used. From this algorithm, it is clear that we can easily calculate the mean value of power consumption model and the real power consumption. However, the problem is the iteration of algorithm, it can be computed by 16 × 256 × L × N. It leads to a huge number of iterations that directly impact on the execution time. Therefore, the values of L and N need to be reduced for improving computing effectiveness. Hereafter, some related works will be discussed.

Fig. 1.
figure 1

Data leakage position of AES.

figure a

2.2 Related Works

Recently, CPA technique focuses on the computing effectiveness. In [7], the authors introduced a selection method that can be used to reduce the number of power traces (it means reducing N) to reveal the keys. This is a pre-processing technique which consists selecting a biased subset of the power traces have more information than average traces. Their method improves the performance of the original analysis but the authors does not show the computation analysis in terms of execution time. The work in [8] proposed a technique for CPA enhancement. The authors presented the Partitioning Power Analysis (PPA) technique to combine the techniques of DPA, multi-bit DPA and CPA in a single form. The notion of class was introduced by grouping the power consumption. Each class contains the messages which have the same Hamming distance as the reference state. After that, the correlation is calculated on the classes which use difference coefficients for the difference classes. According to [8], CPA is a particular case of PPA. Despite enhancement of CPA, the computation time is not presented in this method. Another technique is presented in [9], the authors show a new method to recover secret keys with less power consumption traces than the standard CPA. This improvement is done by selecting appropriate plaintexts, namely non-adaptive and adaptive. The authors choose a set of messages with the most pairwise decorrelated subkeys consumption model. This technique reduces the number of power traces for CPA, but the author does not give any information of computation time.

Most recently, in [10], the authors proposed an algorithm that is similar to the original CPA. This technique is based on the idea of creating, at each point in time, a vector of consumption values indexed by the plaintext byte value. This method also refers to L and N and has two phases. In phase 1, the vector consumption is created in a single pass, requiring (N × L) iterations. Phase 2 uses these correlated vectors and corresponding power model with Hamming weight. This phase takes L × K2 (K = 256) iterations. Consequently, this technique uses (N × L + L × K2) iterations for computing one byte key. Despite reducing execution time significantly, this method only works well if the number of power traces is large enough.

This work also employs the idea to reduce the iterations. As mentioned previously, the original CPA requires many power traces and the number of iterations is 16 × 256 × L × N. Therefore, to enhance the effectiveness of CPA, especially the computation time, in this paper, we aim to propose a technique that reduces the value of L for calculating the final result in CPA technique.

3 Proposed CPA Technique

In this Section, we describe the process of our proposed method and explain how to reduce the number of samples of a power trace in computing. CPA exploits the correlation of power consumption model. In this work, we use the Hamming weight with the data leakage positions as describing in Sect. 2.

In Algorithm 1, we need to calculate the mean of modeled power consumption and the real power consumption. The mean value of power trace needs to use all of the samples of all power traces. Therefore, the number of iterations in computing is large number. If we use a small amount of power traces, it means that we can reduce the number of iterations. Moreover, not every points is important in calculating the correlation, the subkeys only influence the power consumption at a few critical times, and corresponding samples are called Points of interest (POI). If we can pick POI, we can ignore most of the samples. Therefore, using the POI will reduce the number of iterations dramatically. For these reasons, we propose a method that can take the POI base on conventional CPA, and then they are used to calculate Pearson correlation coefficient for all of power traces with whole possible key.

The POI mentioned above is similar to the template attack, a simple technique was used is sum of differences method. In our proposed, the plaintext is random, and the secret key is fixed. We cannot use sum of differences, so the Pearson correlation coefficient is chosen instead. Since a power trace has L samples, we will have L values of correlation, after each iteration, if ith sample has the linear relationship, it will steadily increase or reduce, by contrast it will flexible like noise. Figure 2 illustrates the correlation of 5,000 samples after computing 50 power traces. It is clear that some samples of power trace are much higher than the rest. One of these samples, which is correlated to correct byte key, then constantly goes up. A question is posed, what happen if we only compute a part of power traces instead of all traces. In this case, we suppose that all correlation values will reduce but these POI still higher than the rest. This is an important point to extract the POI from the large sample. To implement this hypothesis, we propose Algorithm 2.

Fig. 2.
figure 2

Points of interest for one byte of secret key.

Inspired by the conventional CPA, our proposed algorithm is also based on the correlation coefficients to find out the highest value. However, the novelty of our proposed approach is the two-phase CPA technique with optimal chosen parameters. In the first phase, a smaller number of power traces is used and denoted as N1 (\( N_{1} \le \frac{1}{2}N \)). After that, the Eq. (3) is applied to calculate α highest samples from all guess keys. The value of α is much smaller than L. In the last phase, this equation is used again to compute the correlation for all hypothesis keys again. However, only α samples are calculated instead of using L samples as the first phase. Next, we present the proposed algorithm in detail (Algorithm 2). The notation N, L, di,B of Algorithm 1 are reused and N1 denotes for the power traces using in phase 1.

figure b

4 Experimental Results

In this section, we evaluate the proposed method on unprotected AES implementation compared to the conventional correlation coefficient used for CPA. We will show the impact of N1 in computing time, then present two criteria including the accuracy and the execution time for recovering whole key. To prove the practical effectiveness of the proposed technique, we use the synchronized and low noise power traces of low frequency AES implementation acquired on ChipWhisperer board.

To illustrate the accuracy of algorithm, an AES implementation on XMEGA MCU is used. The number of correct key bytes will be used as the criteria to assess. In this work, 10,000 power traces are acquired from ChipWhisperer board and then added Gaussian noise ranging from 0.1 to 0.2 with steps of 0.02. As mentioned above, we will show the impact of N1 in the proposed algorithm. In this case, the fixed value of α = 50 is chosen. All experiments are implemented in MATLAB software.

In Fig. 3, it is clear that the more noise is added, the higher of N1 is needed for extracting secret key successfully. However, the value of N1 directly affects the computation time, therefore we cannot choose the too large value of N1. In the experiments, with the value of N1 = [1/3 N, 1/4 N, 1/5 N], we find that the value N1 = 1/3 N will leads to the better results than other ones. Hence, this value is used for evaluating the execution time with the detail results in the next section.

Fig. 3.
figure 3

The average of the correct number of key bytes corresponds to N1 and the standard deviation increases.

To demonstrate the effectiveness of timing execution, the proposed algorithm is implemented with the same configurations of the conventional CPA algorithm. In this experiment, we used a number of power traces in the range from 1,000 to 10,000 corresponding to the standard deviation of noise ranging from 0.06 to 0.15 and with N1 = 1/3 N. These parameters are listed in Table 1.

Table 1. Parameters of CPA attack on the whole AES key for computation time measurements.

Figure 4 shows the results of experiment. It can be seen that the proposed technique is much faster than the conventional CPA. As expected, the value of N1 directly affects the calculation time and it is effective with the broad range of number of power of trace, not only for the large number as in [10]. When the number of power traces is larger, with the appropriately chosen value of N1, the higher efficiency of the execution time can be achieved. The smaller N1 is chosen, the lower computation time is achieved. In this case, we only use the value of N1 = 1/3 N, however, depending on the standard deviation value of noise, N1 could be chosen smaller. According to Fig. 3, if the standard deviation is 0.14, the value of N1 = 1/4 N can be applied. Hence, the efficiency of execution time will be higher.

Fig. 4.
figure 4

Results of computation time for conventional and proposed CPA techniques, for power traces containing 5000 samples.

5 Conclusion and Future Work

In this paper, we have proposed a new CPA method in order to recover the secret key. This technique contains two phases, in which the first phase calculate the number of points of interest of all possible key bytes (very small compare to number of power trace samples). Then, the points of interest are used to compute the highest correlation value of all guess key bytes. The effectiveness of execution time is demonstrated in experiments with various number of power traces which have clarified the efficiency of the proposed method. In the future work, a mathematical representation and analysis will be investigated in detail.