1 Introduction

Today’s encryption algorithms, such as AES, are considered resistant to cryptanalysis: the best known way to extract a 128-bit key is about as complex as an exhaustive search over the \(2^{128}\) possibilities, which is not achievable within a reasonable amount of time with current computational power.

However, it is possible to use plaintexts and ciphertexts, along with additional side information, in order to recover the secret key of a device. Indeed, the secret key may leak via side channels, such as the time to compute the algorithm, the power consumption of the device during the computation, or the electromagnetic radiation of the chip.

In order to secure chips against side-channel attacks, designers have to understand how these attacks work and what future security breaches could arise from the cryptographic algorithm as well as from the hardware implementation. A preliminary step is to identify how the secret keys leak and to deduce leakage models. Then, mathematical functions, called distinguishers, take the leakage as argument and return an estimation of the secret key. Such distinguishers come in many flavours and have different figures of merit in different contexts. A given context involves not only the cryptographic algorithm and the device through the leakage model, but also the side-channel acquisition setup through the measurement, characterized by its signal-to-noise ratio (SNR). This is illustrated in Fig. 1, borrowed from Heuser et al. [12] (with our annotations).

Fig. 1. Illustration of the two parts of the side-channel analysis context. (Color figure online)

In practice one may encounter monobit leakages, meaning that the output of the leakage model can only take two values. In this case, as we shall see, the mathematical computations turn out to be simpler, and information-theoretic tools can be used to precisely describe the link between the leakage model and the real-world leaking traces. From another perspective, considering monobit leakages can also be seen as an “abstraction” trick meant to intentionally ignore the complex effects of the way the device leaks, thereby keeping only the contribution of the cryptographic algorithm in the leakage model.

A related question is how the choice of the substitution box in the cryptographic algorithm may “help” the attacker. The standard AES substitution box was designed to be very secure against linear and differential cryptanalysis [6]. By contrast, under side-channel analysis, the substitution box may be helpful to the attacker, especially for monobit leakages, as shown below.

Related Work. Distinguishers have often been studied empirically, yet such an approach does not allow for generalizations to other contexts and measurement campaigns. A theoretical approach consists in analyzing the formal expressions of the distinguishers as mathematical functions. Fei et al. [8] have shown that distinguishers such as DoM and CPA can be expressed in terms of a confusion coefficient; they gave the impetus to extend this formal analysis to other types of distinguishers. In 2014, Heuser et al. [11] related KSA to the confusion coefficient and also noticed that the confusion coefficient can be related to the resistance of a substitution box against differential cryptanalysis.

Whitnall and Oswald [21] have proposed the relative distinguishing margin metric to compare distinguishers. However, it has been shown [18] that this metric may not be relevant in all contexts. Another way to compare distinguishers is to contrast how their success rate (SR) in key recovery depends on the number q of side-channel traces. Works such as [8, 14] provide mathematical models for the SR, but the comparison between different distinguishers has never actually been carried out within such frameworks. Instead, we shall leverage the so-called success exponent (SE) [10], which allows the SRs of various distinguishers to be compared based on a single exponent parameter.

Our Contributions. In this paper, we consolidate the knowledge about side-channel attacks exploiting monobit leakages. We provide a rigorous proof that any distinguisher acting on monobit leakages depends only on two parameters: the confusion coefficient and the noise variance. Some distinguishers, namely DoM, CPA and KSA, have already been expressed as functions of those two parameters [8, 11]. In this article, we derive this expression for MIA, and we obtain a simple analytic function when the nonzero values of the confusion coefficient are near \(\nicefrac 12\), which is the case for leakages occurring at cryptographically strong substitution boxes [4].

We derive the success exponents of these distinguishers in terms of the confusion coefficient and the standard deviation of the noise. Success exponents characterize how efficiently (in terms of the number of traces) distinguishers recover the key. Our closed-form expressions of the success exponent enable the comparison of distinguishers based only on these two parameters. The flow chart of Fig. 2 situates our contributions in relation to the current state of the art.

Organization. The remainder of this paper is organized as follows. In Sect. 2, we recall the main definitions. In Sect. 3, we consider all distinguishers in one mathematical framework and show that they are functions of only two parameters. In Sect. 4, we compare the distinguishers in terms of the success exponent. Section 5 concludes. The appendices provide proofs for technical lemmas.

Fig. 2. The state of the art in relation to our contributions (in yellow boxes; see also Tables 1 and 2 below). (Color figure online)

Notations. Throughout this paper, we use calligraphic letters to denote sets and lower-case letters for their elements (e.g. \(x\in \mathcal {X}\)). Capital letters denote random variables. For example, X is a random variable taking values in \(\mathcal {X}\), and \(x\in \mathcal {X}\) is a realization of X. The probability that X equals x is denoted \(\mathbb {P}(X=x)\), or simply \(\mathbb {P}(x)\) when there is no ambiguity. The expectation of a random variable X is denoted \(\mathbb {E}[X]\) and its variance \(\mathrm {Var}(X)\). The differential entropy h(X) of a random variable X with probability density p(x) is defined as

$$\begin{aligned} h(X) = -\int _{\mathbb {R}} p(x)\log _2 p(x) \,\mathrm {d}x. \end{aligned}$$
(1)

The mutual information between two random variables X and Y is defined as

$$\begin{aligned} I(X;Y) = h(X)-h(X|Y) = \mathbb {E}\biggl [\log _2 \frac{\mathbb {P}(X,Y)}{\mathbb {P}(X)\mathbb {P}(Y)} \biggr ]. \end{aligned}$$
(2)

2 Modelization and Definitions

2.1 The Leakage Model

In order to compare the different distinguishers for monobit leakages, we need a leakage model upon which our computations will be based. A plaintext t meets the secret key \(k^*\) through a leakage function, yielding \(y(k^*) = f(t,k^*)\), called the sensitive variable. The dependence on the plaintext t is omitted when there is no ambiguity, to make the equations easier to read.

The attacker measures a noisy version of \(y(k^*)\), called a trace and denoted by x. When the key is unknown, the attacker computes a sensitive variable under each key hypothesis k, that is, \(y(k)= f(t,k)\). Thus our model takes the form

$$\begin{aligned} x = y(k^*) + n, \end{aligned}$$
(3)

where n is an independent measurement noise.

As we consider monobit leakages, we suppose that y(k) can take only two values. In practice, t (resp. k) is a subset of the full plaintext (resp. key). Typically, in the case of AES, where attacks can be conducted using a divide-and-conquer approach on a per-substitution-box basis, t and k are 8-bit words (i.e., bytes).

The above leakage model can also be written using random variables. Let T be the random variable for the plaintext, Y(k) for the sensitive variable, X for the measurement, and N for the Gaussian noise. We have:

$$\begin{aligned} X = Y(k^*) + N. \end{aligned}$$
(4)

With a view to simplifying further mathematical computations, we suppose that the leakage random variable is standardized, that is, centered (\(\mathbb {E}[Y(k)]=0\) for all k) and of unit variance (\(\mathbb {E}[Y(k)^2]=1\) for all k). The noise is also assumed to be Gaussian with zero mean, and its standard deviation is denoted \(\sigma >0\). Moreover, we assume that for any key hypothesis the sensitive variable is balanced, that is, each of its two values has probability \(\frac{1}{2}\). Since Y(k) is binary, centered and of unit variance, we necessarily have \(Y(k)\in \{\pm 1\}\) in our model, and consequently the signal-to-noise ratio equals \(\textsf {SNR}=1/\sigma ^2\).

Last, we suppose that the attacker has at his disposal q traces \(x_1,\ldots ,x_q\) obtained from leaking sensitive variables \(y_1(k^*), \ldots , y_q(k^*)\) under additive noise \(n_1, \ldots ,n_q\).
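
To make this model concrete, here is a minimal Python sketch (our own illustration, not code from the paper; the names aes_sbox, y and simulate_traces are ours) that generates q traces according to (4) for the AES SubBytes LSB leakage used in the sequel. The S-box is rebuilt from its standard definition (inversion in \(\mathbb {F}_{2^8}\) followed by the affine map).

```python
import numpy as np

def aes_sbox():
    """AES SubBytes table, rebuilt from inversion in GF(2^8) + affine map."""
    def gmul(a, b):  # multiplication in GF(2^8) modulo x^8+x^4+x^3+x+1
        r = 0
        for _ in range(8):
            if b & 1:
                r ^= a
            carry = a & 0x80
            a = (a << 1) & 0xFF
            if carry:
                a ^= 0x1B
            b >>= 1
        return r
    inv = [0] * 256
    for v in range(1, 256):
        inv[v] = next(w for w in range(1, 256) if gmul(v, w) == 1)
    box = []
    for v in range(256):
        b, s = inv[v], inv[v]
        for i in range(1, 5):  # s = b XOR rot1(b) XOR ... XOR rot4(b) XOR 0x63
            s ^= ((b << i) | (b >> (8 - i))) & 0xFF
        box.append(s ^ 0x63)
    return np.array(box)

SBOX = aes_sbox()

def y(t, k):
    """Monobit sensitive variable in {-1,+1}: LSB of SubBytes(t XOR k)."""
    return 1 - 2 * (SBOX[np.asarray(t) ^ k] & 1)

def simulate_traces(q, k_star=0xB2, sigma=4.0, rng=None):
    """q traces x_i = y_i(k*) + n_i, Eq. (4); SNR = 1/sigma^2."""
    rng = rng if rng is not None else np.random.default_rng(0)
    t = rng.integers(0, 256, size=q)  # uniform plaintext bytes
    x = y(t, k_star) + sigma * rng.normal(size=q)
    return t, x
```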

2.2 The Confusion Coefficient

In the side-channel context, the confusion coefficient was defined by Fei et al. as the probability that two sensitive variables arising from two different key hypotheses are different [8, Section 3.1]. Mathematically, the confusion coefficient is written as

$$\begin{aligned} \kappa (k,k^*) = \mathbb {P}(Y(k) \ne Y(k^*)). \end{aligned}$$
(5)

As the secret key \(k^*\) is constant and understood from the context, we write \(\kappa (k,k^*) = \kappa (k)\). Notice that in practical situations the EIS (Equal Images under different Subkeys [20, Def. 2]) assumption holds, so that \(\kappa \) is actually a function of the bitwise XOR key difference \(k\oplus k^*\).

Figure 3 illustrates the confusion coefficient for a monobit leakage \(Y(k) = \mathsf {SubBytes}(T\oplus k) \bmod 2\), where SubBytes is the AES substitution box (a mapping \(\mathbb {F}_2^8\rightarrow \mathbb {F}_2^8\)) and \(\oplus \) is the bitwise exclusive or. We notice that except for \(k = k^*\) (here taken \(= 178= \mathtt {0xb2}\)), the confusion coefficient for the AES SubBytes is close to \(\nicefrac 12\). This results from the fact that the AES SubBytes has been designed to be resistant against differential cryptanalysis. Specifically, Heuser et al. [11, Proposition 6] noticed that a “good” substitution box leads to confusion coefficients near \(\nicefrac 12\).

Fig. 3. Confusion coefficient for the AES SubBytes Least Significant Bit (LSB)

The original definition of the confusion coefficient [8] considers only monobit leakages. An extension to any type of leakage was proposed in [10], where \(\kappa (k)\) is defined by

$$\begin{aligned} \kappa (k) = \mathbb {E}\biggl [ \Bigl (\frac{Y(k^*) - Y(k)}{2} \Bigr )^2 \biggr ]. \end{aligned}$$
(6)

Equation (5) can easily be recovered from this more general expression by noting that when Y(k) and \(Y(k^*)\) take values in \(\{\pm 1\}\), \(\bigl (\frac{Y(k^*) - Y(k)}{2} \bigr )^2\) equals 0 or 1 according to whether \(Y(k) = Y(k^*)\) or \(Y(k) \ne Y(k^*)\).
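
The following sketch (our own illustrative code, reusing SBOX and y(t, k) from the sketch in Sect. 2.1) computes the confusion coefficients of Fig. 3 by exhausting the 256 plaintext bytes, and checks numerically that definitions (5) and (6) coincide for \(\pm 1\) leakages.

```python
import numpy as np
# Reuses SBOX and y(t, k) from the sketch in Sect. 2.1.

K_STAR = 0xB2                 # secret key byte of Fig. 3
T = np.arange(256)            # exhaust all plaintext bytes (uniform T)

def kappa(k, k_star=K_STAR):
    """Confusion coefficient, Eq. (5): P(Y(k) != Y(k*))."""
    return np.mean(y(T, k) != y(T, k_star))

def kappa_general(k, k_star=K_STAR):
    """Generalized definition, Eq. (6)."""
    return np.mean(((y(T, k_star) - y(T, k)) / 2.0) ** 2)

kap = np.array([kappa(k) for k in range(256)])
assert np.allclose(kap, [kappa_general(k) for k in range(256)])
assert kap[K_STAR] == 0.0
# For k != K_STAR the values cluster around 1/2, as in Fig. 3.
```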

2.3 Distinguishers

Distinguishers aim at recovering the secret key \(k^*\) from the traces and the model. For every key k, the attacker computes the associated distinguisher. The key hypothesis that gives the highest value of the distinguisher is the estimated key. The attack is successful if the estimated key is equal to the secret key.

For every key hypothesis k, a distinguisher is denoted \(\widehat{\mathcal {D}}(k)\) and the estimated key is \(\widehat{k} = \arg \max _k \widehat{\mathcal {D}}(k)\). Five classical distinguishers are listed below; a simulation sketch of their estimators follows the list:

  • Difference of Means (DoM) [8], also known as the Differential Power Analysis (DPA) [13] where the attacker computes

    $$\begin{aligned} \widehat{\mathcal {D}}(k) = \frac{\sum _{i\mid y_i(k)=+1}x_i}{\#\{i \mid y_i(k)=+1\}} - \frac{\sum _{i\mid y_i(k)=-1}x_i}{\#\{i \mid y_i(k)=-1\}}. \end{aligned}$$
    (7)
  • Correlation Power Analysis (CPA) [3] where the attacker computes the absolute value of the Pearson coefficient

    $$\begin{aligned} \widehat{\mathcal {D}}(k) =\biggl | \frac{\frac{1}{q}\sum _{i=1}^q x_i y_i(k) - \frac{1}{q}\sum _{i=1}^q x_i \cdot \frac{1}{q}\sum _{i=1}^q y_i(k)}{\sqrt{\mathrm {Var}(X)\mathrm {Var}(Y_i(k))}} \biggr |. \end{aligned}$$
    (8)

    Notice that \(\mathrm {Var}(Y_i(k))\) does not depend on the index i, since repeated measurements are i.i.d.

  • Euclidean distance, which corresponds to the Maximum Likelihood (ML) attack under the Gaussian noise hypothesis, where the attacker computes the negative squared Euclidean distance (averaged over the q traces) between the model and the trace

    $$\begin{aligned} \widehat{\mathcal {D}}(k) = -\frac{1}{q} \sum _{i=1}^q (x_i - y_i(k))^2. \end{aligned}$$
    (9)

    Maximizing the value of the distinguisher amounts to minimizing the Euclidean distance. According to [12], as the noise is Gaussian and additive, the Euclidean distance is the optimal distinguishing rule (ML rule) that maximizes the success probability.

  • Kolmogorov-Smirnov Analysis (KSA) [22] where the traces are used to build an estimation of the cumulative density function \(\widehat{F}(x)\), and the distinguisher is

    $$\begin{aligned} \widehat{\mathcal {D}}(k) = \mathbb {E}_{Y(k)} \bigl [ \Vert \widehat{F}(x|Y(k)) - \widehat{F}(x) \Vert _{\infty } \bigr ] \end{aligned}$$
    (10)

    where the infinity norm is defined as \(\Vert G \Vert _{\infty } = \sup _x |G(x) |\). A larger expected distance between the conditional and unconditional cumulative distributions indicates a stronger dependence between trace and model, so the correct key is expected to maximize the value of the distinguisher.

  • Mutual Information Analysis (MIA) [9] where the attacker computes the mutual information between the traces and each model. The traces are used to build an estimate \(\widehat{p}(X,Y(k))\) of the joint distribution of X and Y(k), from which the mutual information is computed as

    $$\begin{aligned} \widehat{\mathcal {D}}(k) = \sum _{x,y(k)}\widehat{p}(x,y(k))\log _2\frac{\widehat{p}(x,y(k))}{\widehat{p}(x)\cdot \widehat{p}(y(k))}. \end{aligned}$$
    (11)

Given the available data, the attacker computes the distinguisher as a function of \(x_1,\ldots ,x_q\) and \(y_1(k),\ldots ,y_q(k)\). To emphasize the dependence on the data, we may write \(\widehat{\mathcal {D}}(k)=\widehat{\mathcal {D}}(X_1,\ldots ,X_q,Y_1(k),\ldots ,Y_q(k))\). As the traces are realizations of random variables, we may also consider \(\widehat{\mathcal {D}}(k)\) as a random variable, a function of \(X_1,\ldots ,X_q\) and \(Y_1(k),\ldots ,Y_q(k)\), with expectation \(\mathbb {E}[\widehat{\mathcal {D}}(k)]\) and variance \(\mathrm {Var}(\widehat{\mathcal {D}}(k))\).
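
The sketch below gives minimal Python estimators for the five distinguishers (our own illustrative code, not the authors'; the 16-bin histogram for MIA and the equal weighting of the two classes in KSA are our choices, and y(t, k) comes from the sketch in Sect. 2.1).

```python
import numpy as np
# Reuses y(t, k) from the sketch in Sect. 2.1.

def dom(x, ym):        # Difference of Means, Eq. (7)
    return x[ym == +1].mean() - x[ym == -1].mean()

def cpa(x, ym):        # absolute Pearson correlation, Eq. (8)
    return abs(np.corrcoef(x, ym)[0, 1])

def euclidean(x, ym):  # negative squared Euclidean distance, Eq. (9)
    return -np.mean((x - ym) ** 2)

def ksa(x, ym):        # expected Kolmogorov-Smirnov distance, Eq. (10)
    def ecdf(sample, pts):
        return np.searchsorted(np.sort(sample), pts, side="right") / len(sample)
    d = 0.0
    for v in (+1, -1):  # expectation over Y(k), assumed balanced
        d += 0.5 * np.max(np.abs(ecdf(x[ym == v], x) - ecdf(x, x)))
    return d

def mia(x, ym, bins=16):  # histogram-based mutual information, Eq. (11)
    edges = np.histogram_bin_edges(x, bins=bins)
    p_xy = np.stack([np.histogram(x[ym == v], bins=edges)[0] for v in (+1, -1)])
    p_xy = p_xy / p_xy.sum()
    p_y, p_x = p_xy.sum(axis=1), p_xy.sum(axis=0)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / np.outer(p_y, p_x)[mask]))

def attack(distinguisher, t, x):
    """Estimated key: argmax over the 256 hypotheses, cf. Sect. 2.3."""
    return max(range(256), key=lambda k: distinguisher(x, y(t, k)))
```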

When the number of queries q tends to infinity, we assume that the distinguisher converges in the mean-squared sense:

Definition 1

(Theoretical Distinguisher [10]). The theoretical value of the distinguisher is defined as the limit of \(\widehat{\mathcal {D}}(k)\) in the mean-square sense as \(q\rightarrow \infty \). It is denoted \(\mathcal {D}(k)\) and is therefore implicitly defined by:

$$\begin{aligned} \mathbb {E}[ (\widehat{\mathcal {D}}(k) - \mathcal {D}(k))^2] \longrightarrow 0\text { as } q\rightarrow \infty . \end{aligned}$$
(12)

Put differently, \(\widehat{\mathcal {D}}(k)\) can be seen as an estimator of \(\mathcal {D}(k)\). It is easily seen that as \(q\rightarrow +\infty \) the distinguishers presented above converge to the following theoretical values:

  • For DoM, the theoretical distinguisher is

    $$\begin{aligned} \mathcal {D}(k) = \mathbb {E}[XY(k)]. \end{aligned}$$
    (13)
  • For CPA, the theoretical distinguisher is

    $$\begin{aligned} \mathcal {D}(k) = \frac{\bigl | \mathbb {E}[XY(k)] - \mathbb {E}[X]\mathbb {E}[Y(k)] \bigr |}{\sqrt{1+\sigma ^2}}. \end{aligned}$$
    (14)
  • For Euclidean distance (ML) distinguisher, we have:

    $$\begin{aligned} \mathcal {D}(k) = -\mathbb {E}\bigl [(X-Y(k))^2 \bigr ]. \end{aligned}$$
    (15)
  • For KSA, we have:

    $$\begin{aligned} \mathcal {D}(k) = \mathbb {E}_{Y(k)} \bigl [ \Vert F(x|Y(k)) - F(x) \Vert _{\infty } \bigr ]. \end{aligned}$$
    (16)
  • For MIA, it is the mutual information

    $$\begin{aligned} \mathcal {D}(k)=I(X;Y(k)). \end{aligned}$$
    (17)

3 Theoretical Expressions for Distinguishers

In this section, we show that all distinguishers for monobit leakages are functions of only two parameters: the confusion coefficient \(\kappa (k)\) and the \(\textsf {SNR}=1/\sigma ^2\). This is confirmed by the closed-form expressions for classical distinguishers. In particular we derive the one corresponding to MIA.

3.1 A Communication Channel Between Y(k) and \(Y(k^*)\)

To understand the link between any sensitive variable Y(k) and the leaking sensitive variable \(Y(k^*)\), consider the following information-theoretic communication channel between these two variables described in Fig. 4. This communication channel is simply a theoretical construction that helps explain the link between Y(k) and \(Y(k^*)\), which are both binary and equiprobable random variables taking their values in \(\{\pm 1\}\). The parameters p and \(p'\) are the transition probabilities defined as \(p = \mathbb {P}(Y(k^*)=+1 | Y(k) = -1)\) and \(p'=\mathbb {P}(Y(k^*) = -1 | Y(k) = +1)\).

Lemma 1

The communication channel defined in Fig. 4 is a binary symmetric channel (BSC) with transition probability equal to the confusion coefficient \(\kappa (k)\).

Fig. 4. Abstract communication channel between Y(k) and \(Y(k^*)\)

Proof

To prove that the channel is symmetric, we show that both transition probabilities coincide: \(p=p'\). In fact, from Fig. 4, \(\tfrac{1}{2} = \mathbb {P}(Y(k^*)=1) = p\mathbb {P}(Y(k) = -1) + (1-p')\mathbb {P}(Y(k) = 1) = \tfrac{1}{2}(p+1-p')\) hence \(p=p'\). Now the confusion coefficient \(\kappa (k) = \mathbb {P}(Y(k) \ne Y(k^*))\) can be expanded as

$$\begin{aligned} \kappa (k)&= \tfrac{1}{2}\bigl ( \mathbb {P}(Y(k) \ne Y(k^*) | Y(k) = 1) + \mathbb {P}(Y(k) \ne Y(k^*) | Y(k) = -1) \bigr ) \end{aligned}$$
(18)
$$\begin{aligned}&= \tfrac{1}{2}\bigl ( \mathbb {P}(Y(k^*) =-1 | Y(k) = 1) + \mathbb {P}(Y(k^*) = 1 | Y(k) = -1) \bigr ) \end{aligned}$$
(19)
$$\begin{aligned}&= \tfrac{1}{2}\bigl ( p + p' \bigr ) = p = p' . \end{aligned}$$
(20)

This proves that the BSC has transition probability equal to \(\kappa (k)\).    \(\square \)

According to a well-known information-theoretic result [5, p. 187], the Shannon capacity of this channel, in bits per bit, is

$$\begin{aligned} \mathcal {C} = 1 - H_2(\kappa (k)), \end{aligned}$$
(21)

where \(H_2(x)\) is the binary entropy function defined by

$$\begin{aligned} H_2(x) = x\log _2 \Bigl (\frac{1}{x} \Bigr ) + (1-x)\log _2 \Bigl (\frac{1}{1-x} \Bigr ). \end{aligned}$$
(22)

This capacity is represented in Fig. 5 as a function of \(\kappa (k)\). Interestingly, the value \(\kappa (k)=\nicefrac 12\) corresponds to null capacity, while the capacity is evidently 1 bit per bit for \(\kappa (k^*)=0\), since in this case the above communication channel reduces to the identity.
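
As a quick numerical illustration (our own sketch, reusing the array kap of confusion coefficients and K_STAR from the sketch in Sect. 2.2), one can evaluate (21)-(22) for every key hypothesis, which reproduces the behaviour of Fig. 5:

```python
import numpy as np
# Reuses kap and K_STAR from the sketch in Sect. 2.2.

def h2(p):
    """Binary entropy function, Eq. (22)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log2(0) at the endpoints
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

capacity = 1 - h2(kap)                    # Eq. (21)
assert np.isclose(capacity[K_STAR], 1.0)  # identity channel for k = k*
# Wrong keys have kappa near 1/2, hence near-zero capacity (Fig. 5).
```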

3.2 A General Result

We can now explain why all distinguishers for monobit leakages depend only on the two parameters \(\kappa (k)\) and \(\mathsf {SNR}=\sigma ^{-2}\).

Fig. 5. Representation of the channel capacity according to \(\kappa (k)\)

Theorem 1

Any theoretical distinguisher \(\mathcal {D}(k)\) for a binary leakage y can be expressed as a function of \(\kappa (k)\) and \(\sigma \).

Proof

Any theoretical distinguisher is defined in terms of the joint probability distribution of X and Y(k), noted \(p(x,y(k))\). Now for any \(x\in \mathbb {R}\) and \(y(k) =\pm 1\),

$$\begin{aligned} p(x,y(k))&= \mathbb {P}(y(k))\;p(x \mid y(k)) \end{aligned}$$
(23)
$$\begin{aligned}&= \frac{1}{2}\sum _{y(k^*)}p(x \mid y(k),y(k^*))\;\mathbb {P}(y(k^*)\mid y(k)) \end{aligned}$$
(24)
$$\begin{aligned}&= \frac{1}{2}\sum _{y(k^*)}p(x \mid y(k^*))\;\mathbb {P}(y(k^*)\mid y(k)) \end{aligned}$$
(25)

where \(\mathbb {P}(y(k^*)\mid y(k))\) is the transition probability of the channel defined in Fig. 4, and where (25) uses the fact that, given \(y(k^*)\), the measurement \(X = y(k^*)+N\) no longer depends on y(k) because the Gaussian noise is independent of every other random variable. There are two possibilities: either \(y(k) = y(k^*)\), in which case \(\mathbb {P}(y(k^*)\mid y(k)) = 1 - \kappa (k)\), or \(y(k) \ne y(k^*)\), in which case \(\mathbb {P}(y(k^*) \mid y(k)) = \kappa (k)\). The sum over \(y(k^*)\) has two terms and both cases are represented. Therefore, we have two possibilities for the joint probability:

$$\begin{aligned} p(x,y(k)) = {\left\{ \begin{array}{ll} \frac{1}{2} \Bigl ( \frac{1}{\sigma }\phi \bigl ( \frac{x+1}{\sigma }\bigr )\kappa (k) + \frac{1}{\sigma }\phi \bigl (\frac{x-1}{\sigma }\bigr )(1-\kappa (k)) \Bigr ) &{} \text {if } y(k)=+1,\\ \frac{1}{2} \Bigl ( \frac{1}{\sigma }\phi \bigl (\frac{x-1}{\sigma }\bigr )\kappa (k) + \frac{1}{\sigma }\phi \bigl ( \frac{x+1}{\sigma }\bigr )(1-\kappa (k)) \Bigr ) &{} \text {if } y(k)=-1, \end{array}\right. } \end{aligned}$$
(26)

where \(\phi \) is the probability density function of a standard normal random variable. As the noise is centered and Gaussian, the only parameter that characterizes these densities is the standard deviation \(\sigma \). Therefore, the joint distribution of a monobit leakage is fully characterized by \(\sigma \) and \(\kappa (k)\).    \(\square \)
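
A direct transcription of (26) into Python may help check the proof numerically; the function below (our own illustrative code) evaluates the two-component Gaussian-mixture joint density and verifies that its total mass is 1. Only \(\kappa (k)\) and \(\sigma \) enter the computation, as claimed by Theorem 1.

```python
import numpy as np

def phi(u):
    """Standard normal probability density function."""
    return np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)

def joint_density(x, ybit, kap_k, sigma):
    """p(x, y(k)) of Eq. (26): a Gaussian mixture parameterized only
    by kappa(k) and sigma; ybit is the value +1 or -1 of y(k)."""
    same = phi((x - ybit) / sigma) / sigma  # component y(k*) = y(k)
    diff = phi((x + ybit) / sigma) / sigma  # component y(k*) != y(k)
    return 0.5 * (diff * kap_k + same * (1 - kap_k))

# Sanity check: the two rows of Eq. (26) together integrate to 1.
xs = np.linspace(-35.0, 35.0, 70001)
dx = xs[1] - xs[0]
mass = (joint_density(xs, +1, 0.3, 4.0) + joint_density(xs, -1, 0.3, 4.0)).sum() * dx
assert abs(mass - 1.0) < 1e-3
```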

This proves that knowledge of the confusion coefficient and of the noise power is essential to predict the performance of side-channel attacks on monobit leakages.

3.3 Classical Distinguishers as Functions of \(\kappa (k)\) and \(\sigma ^2\)

To illustrate the result of Sect. 3.2, we express the classical distinguishers in terms of the confusion coefficient and the noise power. As mentioned in the introduction, some of them have already been expressed in terms of these parameters: we recall these results in Table 1, with references to the articles where the expression of the distinguisher in terms of \(\kappa (k)\) is proven.

Table 1. Summary of classical distinguishers. Among all the classical theoretical distinguishers, we notice that the expression of the theoretical value of DoM in terms of \(\kappa (k)\) does not depend on \(\sigma \).

The new results are given by the following lemmas.

Lemma 2

For monobit leakages, the Euclidean distance distinguisher can be expressed as:

$$\begin{aligned} \mathcal {D}(k) = 4\Bigl (\nicefrac 12 - \kappa (k)\Bigr ) - 2 - \sigma ^2. \end{aligned}$$
(27)

Proof

We have \(\mathcal {D}(k) = -\mathbb {E}\bigl [ (X - Y(k))^2 \bigr ]=-\mathbb {E}\bigl [ (Y(k^*) - Y(k)+N)^2 \bigr ]=-\mathbb {E}\bigl [ (Y(k^*) - Y(k))^2 \bigr ]-\sigma ^2\) since the noise is independent from \(Y(k^*) - Y(k)\). Then by (6), \(\mathcal {D}(k) = -4\kappa (k) - \sigma ^2 = 4\bigl (\nicefrac 12 - \kappa (k)\bigr ) - 2 - \sigma ^2\), where we have stressed the dependence in \(\nicefrac 12 - \kappa (k)\) as in Table 1.    \(\square \)
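
These closed-form expressions can be checked by Monte-Carlo simulation. The sketch below (our own code, reusing simulate_traces, kappa and y from the sketches of Sects. 2.1-2.2; the sample size and tolerances are arbitrary choices) verifies that (13) equals \(1-2\kappa (k)\), (14) equals \(|1-2\kappa (k)|/\sqrt{1+\sigma ^2}\), and (27) holds.

```python
import numpy as np
# Reuses simulate_traces, kappa and y from the sketches of Sects. 2.1-2.2.

sigma, k_star, q = 4.0, 0xB2, 500_000
t, x = simulate_traces(q, k_star, sigma, np.random.default_rng(1))

k = 0x00                        # arbitrary key hypothesis
kap_k, ym = kappa(k), y(t, k)

# DoM, Eq. (13): E[X Y(k)] = 1 - 2 kappa(k)
assert abs(np.mean(x * ym) - (1 - 2 * kap_k)) < 0.02
# CPA, Eq. (14): |1 - 2 kappa(k)| / sqrt(1 + sigma^2)
assert abs(abs(np.corrcoef(x, ym)[0, 1])
           - abs(1 - 2 * kap_k) / np.sqrt(1 + sigma ** 2)) < 0.01
# Euclidean distance, Eq. (27): 4(1/2 - kappa(k)) - 2 - sigma^2
assert abs(-np.mean((x - ym) ** 2)
           - (4 * (0.5 - kap_k) - 2 - sigma ** 2)) < 0.2
```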

Lemma 3

For monobit leakages, when \(\kappa (k)\) is close to \(\nicefrac 12\) for \(k\ne k^*\), the MIA distinguisher can be expressed at first order as:

$$\begin{aligned} \mathcal {D}(k) = \frac{\log _2(e)}{2}\bigl (1-2\kappa (k)\bigr )^2 g(\sigma ) + o\bigl ((1-2\kappa (k))^2\bigr ) \end{aligned}$$
(28)

where

$$\begin{aligned} g(\sigma ) = \frac{1}{2} \mathbb {E}\Bigl [ \tanh ^2 \Bigl (\frac{Z}{\sigma }+\frac{1}{\sigma ^2} \Bigr ) + \tanh ^2 \Bigl ( \frac{Z}{\sigma } - \frac{1}{\sigma ^2} \Bigr ) \Bigr ] \end{aligned}$$
(29)

and \(Z\sim \mathcal {N}(0,1)\). The function g satisfies

$$\begin{aligned} \lim _{\sigma \rightarrow 0} g(\sigma ) = 1 \qquad \text {and}\qquad \lim _{\sigma \rightarrow \infty } \sigma ^2\times g(\sigma ) = 1. \end{aligned}$$
(30)

Proof

See Appendix A.    \(\square \)

Figure 6 plots \(g(\sigma )\), which tends to 1 when \(\sigma \rightarrow 0\) and is equivalent to \(\frac{1}{\sigma ^2}\) when \(\sigma \rightarrow \infty \).
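
The function \(g\) is easy to estimate by Monte-Carlo sampling of \(Z\sim \mathcal {N}(0,1)\); the sketch below (our own code; sample size and seed are arbitrary) can be used to reproduce Fig. 6 and to check the two limits in (30).

```python
import numpy as np

def g(sigma, n=1_000_000, rng=None):
    """Monte-Carlo estimate of g(sigma), Eq. (29), with Z ~ N(0,1)."""
    rng = rng if rng is not None else np.random.default_rng(2)
    z = rng.normal(size=n)
    return 0.5 * np.mean(np.tanh(z / sigma + 1 / sigma ** 2) ** 2
                         + np.tanh(z / sigma - 1 / sigma ** 2) ** 2)

assert abs(g(0.05) - 1.0) < 1e-3                # g(sigma) -> 1 as sigma -> 0
assert abs(100.0 ** 2 * g(100.0) - 1.0) < 0.01  # sigma^2 g(sigma) -> 1
```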

When \(k=k^*\), the MIA distinguisher also has a simple expression, since it reduces to the known capacity of the binary-input additive Gaussian noise channel [2, p. 274]:

$$\begin{aligned} \mathcal {D}(k^*) = \frac{\log _2(e)}{\sigma ^2} - \int _{\mathbb {R}}\frac{e^{-\frac{1}{2}y^2}}{\sqrt{2\pi }}\log _2\cosh \Bigl (\frac{1}{\sigma ^2}-\frac{y}{\sigma }\Bigr )\,\mathrm {d}y. \end{aligned}$$
(31)
Fig. 6. Representation of \(g(\sigma )\)

Remark 1

With respect to their theoretical distinguishers, DoM is in bijection with the Euclidean distance, and CPA is in bijection with KSA. Indeed, the Euclidean distance is \(4\bigl (\nicefrac 12 - \kappa (k)\bigr ) - 2 - \sigma ^2\) and \(\sigma \) is independent of the choice of the key. Therefore, there is a bijection between \(4\bigl (\nicefrac 12 - \kappa (k)\bigr ) - 2 - \sigma ^2\) and \(1-2\kappa (k)\), which is the theoretical value of DoM. Regarding CPA and KSA, both distinguishers are functions of \(\bigl |\nicefrac 12 - \kappa (k)\bigr |\).

We also notice that MIA is in bijection with CPA (and therefore with KSA). Indeed, according to the expression of MIA in terms of \(\kappa (k)\), the distinguisher is a function of \(\bigl (\nicefrac 12 - \kappa (k)\bigr )^2\), which is in bijection with \(\bigl |\nicefrac 12 - \kappa (k)\bigr |\). This means that for monobit leakages, any attack that works with one of these distinguishers will also work with the others, and vice versa.

4 Comparing Distinguishers with the Success Exponent

In the previous section, we have computed the theoretical values of the classical distinguishers in terms of \(\kappa (k)\) and \(\sigma \). Now, we wish to compare their success rates. As mentioned in Sect. 2.3, the attacker computes the estimated distinguisher \(\widehat{\mathcal {D}}(k)\) to recover the secret key. This is the main reason why distinguishers do not all perform equally in key recovery: they do not converge at the same speed towards their theoretical values.

In order to compare them, we compute their success exponent, a metric proposed by Guilley et al. in [10] that evaluates how fast the success rate of a distinguisher converges to 100%. Under a Gaussian assumption, they prove that the success rate can be modeled as

$$\begin{aligned} \mathsf {SR} = 1 - \exp (-q\times \mathsf {SE}) , \end{aligned}$$
(32)

where q is the number of traces and \(\textsf {SE}\in \mathbb {R}^+\) is the so-called success exponent. Therefore, the greater the success exponent, the faster the convergence of the success rate.
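
Given an empirical SR curve, the model (32) suggests estimating \(\mathsf {SE}\) by a linear fit of \(-\log (1-\mathsf {SR})\) against q. A minimal sketch follows (our own illustrative code; the least-squares fit through the origin is one choice among others):

```python
import numpy as np

def fit_success_exponent(qs, sr):
    """Fit SE in SR = 1 - exp(-q SE): least-squares line through the
    origin of -log(1 - SR) versus q (points with SR = 1 are discarded)."""
    qs, sr = np.asarray(qs, float), np.asarray(sr, float)
    keep = sr < 1.0
    yv = -np.log(1.0 - sr[keep])
    return np.sum(qs[keep] * yv) / np.sum(qs[keep] ** 2)

# e.g. fit_success_exponent([100, 200, 400], [0.39, 0.63, 0.86]) ~ 0.005
```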

Table 2. Success exponents for the classical distinguishers. The numerical values of SE are obtained for the AES SubBytes least-significant-bit leakage model and noise of standard deviation \(\sigma =4\). Notice that in the monobit case, the Euclidean distance and DoM have strictly the same success rate, because \(-(X-Y(k))^2 = -X^2 + 2XY(k) -1\) and \(X^2\) is independent of the choice of the key.

We present the theoretical values of the success exponent for the different distinguishers in Table 2. As a direct consequence of Theorem 1, all of these success exponents are functions of \(\kappa (k)\) and \(\sigma \). Therefore, if the attacker only knows the type of substitution box that is used and the SNR of the leakage, he can predict how fast he will recover the secret key.

Lemma 4

(Success exponent of CPA). The success exponent of CPA is:

(33)

Proof

See Appendix B.    \(\square \)

Lemma 5

(Success exponent of KSA). Assuming that the distributions are estimated with the kernel method using Heaviside step function, the success exponent of KSA is

(34)

Proof

See Appendix C.    \(\square \)

Lemma 6

(Success exponent of MIA). When \(\sigma \gg 1\), the success exponent for an MIA computed with histograms is

$$\begin{aligned} \mathsf {SE} = \frac{4\log _2(e)^2}{\sigma ^4}\min _{k\ne k^*} \kappa (k)^2(1-\kappa (k))^2. \end{aligned}$$
(35)
Fig. 7. Success rate for classical distinguishers (\(\sigma = 4\))

Proof

See Appendix D.    \(\square \)

In order to validate our theoretical results, we have simulated attacks within the monobit model presented in Sect. 2. The success rates of these attacks are presented in Fig. 7. In this figure, we notice that, as expected, the Euclidean distance (ML) is the best distinguisher, closely followed by CPA. Both have almost the same success rate; the small difference is due to the use of absolute values in the distinguishing function of CPA (see the discussion in Remark 9 of [12]). KSA requires a bit less than twice as many traces as the Euclidean distance, DoM and CPA. MIA performs poorly compared to the other distinguishers. Error bars represent the inaccuracy in estimating the SR (here, we ran 100 simulations).

These simulations are therefore fully consistent with the theoretical results of Table 2. Indeed, the ordering of the distinguishers is the same with respect to the success rate as with respect to the success exponent. In addition, according to the definition of the success exponent \(\mathsf {SE}\) in (32), the number of traces q needed to reach a given success rate (e.g., \(\mathsf {SR}=80\%\)) is proportional to the inverse of \(\mathsf {SE}\). This quantitative law is satisfied in the simulations of Fig. 7.
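
For completeness, here is a sketch of the simulation loop behind such a figure (our own reconstruction, reusing simulate_traces from Sect. 2.1 and attack plus the estimators of Sect. 2.3; the number of simulations and the seed are arbitrary):

```python
import numpy as np
# Reuses simulate_traces (Sect. 2.1) and attack plus the estimators
# dom, cpa, euclidean, ksa, mia (Sect. 2.3).

def success_rate(distinguisher, q, n_sim=100, sigma=4.0, k_star=0xB2):
    """Fraction of n_sim independent attacks recovering k_star from q traces."""
    rng = np.random.default_rng(3)
    hits = 0
    for _ in range(n_sim):
        t, x = simulate_traces(q, k_star, sigma, rng)
        hits += (attack(distinguisher, t, x) == k_star)
    return hits / n_sim

# Sweeping q for each estimator reproduces the ordering of Fig. 7:
# euclidean ~ dom, closely followed by cpa, then ksa, with mia far behind.
```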

5 Conclusion

In this paper, we have mathematically proven that only two parameters, the confusion coefficient and the SNR, determine the distinguishing efficiency of side-channel attacks on monobit leakages. Both are easy to compute, because the confusion coefficient can be calculated from the knowledge of the operating substitution box and the SNR can be measured offline.

Our work is useful to predict how fast a distinguisher will succeed in recovering the secret key. Long and painful simulations can advantageously be replaced by the computation of the success exponent using closed-form expressions.

This paper also consolidates the state of the art on the classical distinguishers, especially MIA and KSA. We have derived the success exponents of these two distinguishers as functions of the confusion coefficient and the standard deviation of the noise.