1 Introduction

Digital information is handled by electronic devices, such as smartphones or servers. Some information, such as keys, is sensitive, in the sense that it shall remain confidential. In general, information exists in three states within devices: at rest, in transit, and in computation. The protection of information at rest can be ensured by on-chip encryption of the memories. The same technique applies to data in transit: the buses can be encrypted (or, in a lightweight way, scrambled). Therefore, the protection of information during computation is the main remaining issue. It is a real challenge, as computing devices inadvertently leak some information about the data they manipulate. In this context, three questions are of interest:

  1.

    How does an attacker best exploit the leaked information? The situation is similar to that of a decoding problem, and one aims at finding the optimal decoder.

  2.

    The designer (and the end user) aim at being protected against such attacks. Their goal is thus to weaken the side-channel. Randomization is one option, referred to as masking in the literature. We will illustrate that it can be seen as the use of a code to optimally mix some random bits into the computations, with the possibility to eventually get rid of this entropy, e.g., at the end of the computation. Another interesting usage of codes is to detect faults in circuits. This dual use of codes is of interest in general security settings, where attacks can be either passive or active. It is also very relevant when the circuit is booby-trapped with a Hardware Trojan Horse.

  3.

    It is interesting to know to what extent the circuit leakage itself favors attacks. In particular, we will investigate the effect of glitches as a threat to masking schemes.

Outline. We start with the adversarial strategies in Sect. 2. Protection strategies, especially masking, are presented in Sect. 3. We show how the circuit itself can contribute to the attack, through the analysis of glitches, in Sect. 4. Conclusions are drawn in Sect. 5. Finally, Appendix A gives some computational evidence that masking can be seen as reducing the signal-to-noise ratio, by increasing the noise.

2 Side-Channel Analysis as a Decoding Problem

In this section, we first describe the setup and the objective of the attacker. Second, we derive the attacker's optimal strategy in several different setups.

2.1 Setup

We assume the device manipulates some data known by the attacker, such as a plaintext or a ciphertext, called T. This data is mixed with some secret, say a key \(k^*\). The attacker manages to capture some noisy function of T and \(k^*\), and attempts to extract \(k^*\). For this purpose, he will enumerate (manageable) parts of the key (e.g., bytes), denoted k, and choose the key candidate \(\hat{k}\) which is the most likely. Therefore, the attack resembles a communication channel, where the input is \(k^*\) and the output is \(\hat{k}\). The attack is termed successful if \(\hat{k}=k^*\).

Two kinds of leakage models are realistic in practice:

  1. 1.

    direct probing model, where the attacker uses some kind of probes, each being able to measure one bit,

  2. 2.

    indirect measurement of an aggregated function of the bits, using for instance an electromagnetic probe.

These two ways of capturing the signal are, by nature, very different. They are illustrated in Fig. 1.

Fig. 1. Settings for side-channel analysis. In the probing model (a), a few bits (here, \(d=3\)) are measured with dedicated probes. In the bounded moments model (b), the attacker measures an integrated quantity of several bits.

The first one is noiseless. However, the bits in integrated circuits are nanometric, whereas probes are mesoscopic. Therefore, only a few such probes can be used simultaneously. The security parameter is thus linked to the ability of the attacker to recover some useful information out of d probes (where d is typically 1, 2, 3 or 4). Besides, probing requires physical access to the wires, which is challenging, since the contact may break the very bit to be probed. Such an attack is termed semi-invasive, since it leaves evidence that the circuit has been tampered with (an opening is necessary to insert the probes).

The second one is noisy and also leaks some function of the bits. Therefore, the attacker needs to capture more than one trace to extract some information. This is why we model, in the sequel, traces by random variables. By convention, variables are printed with capital letters, such as X, when designating a random variable, and with small letters, such as x, when designating a realization of the random variable. We also denote by Q the number of queries (i.e., of measurements), and by \(\mathbf {x}=(x_1,\ldots ,x_Q)\) the vector of measurements. This attack requires a statistical analysis, which in general consists in studying the leakage probability distribution, starting with the analysis of the leakage moments.

We will link the two models in the case of the RSM countermeasure (Sect. 3.5). The next Sect. 2.2 discusses the channel \(k^* \rightarrow \hat{k}\) in the second case.

2.2 Example of AWGN Channel

Fig. 2. Side-channel analysis as a communication channel

The key recovery setup is illustrated in Fig. 2 (see Fig. 1 in [24]). When the noise is Gaussian and independent from one measurement to the next, it is referred to as AWGN (additive white Gaussian noise). We write:

$$\begin{aligned} X = y(T,k^*) + N, \qquad \text {where } N\sim \mathcal {N}(0,\sigma ^2). \end{aligned}$$
(1)

The random variable \(y(T,k^*)\) is the aggregated leakage model, and N is the noise (independent of \(y(T,k^*)\)). Let n be the bitwidth of the key k and of the texts T. The function \(y:\mathbb {F}_2^n\times \mathbb {F}_2^n\rightarrow \mathbb {R}\) is, in practice, the composition of two functions \(y=\varphi \circ f\), where:

  • f is an algorithmic function called sensitive variable, such as \(f(T,k^*) = S(T\oplus k^*)\), where S is a substitution box, and

  • \(\varphi : \mathbb {F}_2^n\rightarrow \mathbb {R}\) accounts for the way the sensitive variable leaks, such as the Hamming weight \(\varphi : z\mapsto w_H(z)=\sum _{i=1}^n z_i\) (a simulation sketch of this model follows the list).
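
To make the model concrete, here is a minimal simulation sketch of Eq. (1) in Python (our own illustration, not taken from the references): a toy 8-bit substitution box stands in for S, and the leakage is the Hamming weight of \(S(T\oplus k^*)\) plus Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 8                                   # bitwidth of keys and texts
SBOX = rng.permutation(2 ** n)          # toy S-box standing in for, e.g., AES's

def hamming_weight(z):
    """phi(z) = w_H(z) for an array of 8-bit integers."""
    return np.unpackbits(np.asarray(z, dtype=np.uint8)[:, None], axis=1).sum(axis=1)

def leak(t, k, sigma):
    """X = y(T, k) + N, with y = phi o f and N ~ N(0, sigma^2), cf. Eq. (1)."""
    y = hamming_weight(SBOX[t ^ k])
    return y + rng.normal(0.0, sigma, size=t.shape)

k_star = 0x2B                           # the secret key byte
t = rng.integers(0, 2 ** n, size=1000)  # Q = 1000 known plaintext bytes
x = leak(t, k_star, sigma=2.0)          # the measured traces
```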

2.3 Absence of Countermeasures

The optimal distinguisher is the key guess \(\hat{k}\) which maximizes the success probability, that is the probability that \(\hat{k}\) is actually \(k^*\).

When there is no protection, all the uncertainty resides in the measurement noise. Thus, as the attacker knows T, he also knows \(Y=y(T,k)\) for every key guess k.

Theorem 1

([24, Theorem 4]). In the AWGN setup, the optimal distinguisher is demonstrated to be equal to:

$$\begin{aligned} \mathcal {D}_{opt}(\mathbf {x},\mathbf {t}) = {{\mathrm{argmin}}}_k \left\| \mathbf {x}-\mathbf {y}(\mathbf {t},k) \right\| _2^2 = {{\mathrm{argmax}}}_k \langle \mathbf {x} | \mathbf {y}(\mathbf {t},k) \rangle -\frac{1}{2} \left\| \mathbf {y}(\mathbf {t},k) \right\| _2^2 , \end{aligned}$$
(2)

where \(\left\| \cdot \right\| _2\) is the Euclidean norm and \(\langle \cdot | \cdot \rangle \) is the canonical scalar product.
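
As an illustration, the distinguisher of Eq. (2) can be run directly on simulated traces. The following self-contained sketch (ours, under the same toy leakage model as above) recovers the key byte by least-squares matching.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
SBOX = rng.permutation(256)             # the same toy S-box as above

def hw(z):
    return np.unpackbits(np.asarray(z, dtype=np.uint8)[:, None], axis=1).sum(axis=1)

# Simulate Q traces under the AWGN model of Eq. (1).
Q, k_star, sigma = 2000, 0x2B, 3.0
t = rng.integers(0, 256, size=Q)
x = hw(SBOX[t ^ k_star]) + rng.normal(0.0, sigma, size=Q)

# Optimal distinguisher of Eq. (2): minimize ||x - y(t, k)||_2^2 over k.
scores = [-np.sum((x - hw(SBOX[t ^ k])) ** 2) for k in range(256)]
k_hat = int(np.argmax(scores))
print(f"recovered key 0x{k_hat:02X}", k_hat == k_star)
```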

2.4 Multivariate and Multimodel Setting

In the multivariate and multimodel case, the attacker is able to collect:

  • not only one sample, but D (dimensionality) samples, and

  • each function of the bits (e.g., \(z\mapsto 1\), \(z\mapsto z_i\) for \(1\le i\le n\), but also any monomial \(z\mapsto \bigwedge _{i\in I} z_i\) where \(I\subseteq \{1,\ldots ,n\}\)) has a different contribution.

We call S the number of models, and \(\alpha \) the \(D\times S\) matrix of the leakages, such that Eq. (1) is generalized as:

$$\begin{aligned} \mathbf {X} = \alpha \mathbf {y}(\mathbf {T},k^*) + \mathbf {N}, \qquad \text {where } \mathbf {N}\sim \mathcal {N}(\mathbf {0},\varSigma ), \end{aligned}$$
(3)

where \(\mathbf {N}\) is multivariate normal with \(D\times D\) covariance matrix \(\varSigma \), and \(\mathbf {Y}=\mathbf {y}(\mathbf {T},k^*)\) is a set of S models (e.g., \(S=1\) if the leakage model is the Hamming weight, or \(S=n+1\) if there is a non-zero offset (such an offset is modeled by \(z\mapsto 1\)) and each bit \(1\le i\le n\) of the leakage model leaks differently). In this case also, boldface variables are vectorial (either multivariate or multimodel).

We have a generalization of Theorem 1:

Theorem 2

([7, Theorem 1]). Let us define \(\mathbf {x}'=\varSigma ^{-1/2}\mathbf {x}\) and \(\alpha '=\varSigma ^{-1/2}\alpha \). Then, in the multivariate and multimodel AWGN setup, the optimal distinguisher is demonstrated to be equal to:

$$\begin{aligned} \mathcal {D}^{D,S}_{opt}(\mathbf {x},\mathbf {t})&= {{\mathrm{argmin}}}_k \ \sum _{d=1}^D \Vert \mathbf {x}'_d-\alpha '_d\mathbf {y}(\mathbf {t},k) \Vert _2^2 \\&= {{\mathrm{argmax}}}_k \ {{\mathrm{tr}}}\left( \mathbf {x}' {\left( \alpha ' \mathbf {y}(\mathbf {t},k) \right) }^\mathsf {T}\right) -\frac{1}{2} \left\| \alpha ' \mathbf {y}(\mathbf {t},k) \right\| _F^2 , \end{aligned}$$

where \({{\mathrm{tr}}}\left( \cdot \right) \) is the trace operator of a square matrix and \(\Vert \cdot \Vert _F\) is the Frobenius norm of a (rectangular) matrix.
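
The whitening step is the only new ingredient compared to Theorem 1. A possible sketch (ours; it assumes, as the theorem does, that the attacker knows \(\alpha \) and \(\varSigma \)) with \(D=3\) samples and \(S=8\) single-bit models:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
D, S = 3, 8                              # 3 samples, 8 single-bit models
SBOX = rng.permutation(256)

def bit_models(z):
    """y(t, k) as an S-vector: the individual bits of the sensitive variable."""
    return np.unpackbits(np.asarray(z, dtype=np.uint8)[:, None], axis=1).astype(float)

alpha = rng.normal(0.0, 1.0, size=(D, S))        # per-sample, per-bit weights
A = rng.normal(0.0, 1.0, size=(D, D))
Sigma = A @ A.T + D * np.eye(D)                  # some SPD noise covariance

Q, k_star = 5000, 0x3C
t = rng.integers(0, 256, size=Q)
X = bit_models(SBOX[t ^ k_star]) @ alpha.T \
    + rng.multivariate_normal(np.zeros(D), Sigma, size=Q)

# Whitening: x' = Sigma^{-1/2} x and alpha' = Sigma^{-1/2} alpha.
w, V = np.linalg.eigh(Sigma)
W = V @ np.diag(w ** -0.5) @ V.T                 # Sigma^{-1/2}
Xp, alphap = X @ W.T, W @ alpha

# Least-squares matching in the whitened domain, as in Theorem 2.
scores = [-np.sum((Xp - bit_models(SBOX[t ^ k]) @ alphap.T) ** 2)
          for k in range(256)]
print(int(np.argmax(scores)) == k_star)
```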

2.5 Collision

In some situations, the attacker does not know the leakage function \(y=\varphi \circ f\), but knows that it is reused several times for different bytes, say \(L>1\) of them. We denote by \(x^{(\cdot )}=(x^{(1)},\ldots ,x^{(\ell )},\ldots ,x^{(L)})\) the L leakages. The optimal attack then consists in a collision attack where all the coefficients of the leakage function are regressed.

Theorem 3

([5, Theorem 2.5]). The optimal collision attack is:

$$\begin{aligned} \mathcal {D}^L_{opt}(\mathbf {x}^{(\cdot )}, \mathbf {t}^{(\cdot )})&= {{\mathrm{argmax}}}_{k^{(\cdot )}\in (\mathbb {F}_2^n)^L} \quad \sum _{u\in \mathbb {F}_2^n} \frac{\left( \sum _\ell \sum _{q / t_q^{(\ell )} \oplus k^{(\ell )} = u} \ x^{(\ell )}_q \right) ^2}{\sum _\ell \sum _{q / t_q^{(\ell )} \oplus k^{(\ell )} = u} \ 1} . \end{aligned}$$

Notice that, in general, this attack allows the recovery of \((L-1)\) n-bit keys when the collision involves L samples with identical leakage model; indeed, the criterion is invariant under a common shift of all the \(k^{(\ell )}\), so only the differences \(k^{(\ell )}\oplus k^{(1)}\) are identifiable.
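
A brute-force sketch of Theorem 3 (our own illustration) with \(L=2\) leakages on a small \(n=4\)-bit toy S-box; as noted above, only key differences matter, so \(k^{(1)}\) is fixed to 0:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(seed=2)
n, L, Q, sigma = 4, 2, 2000, 1.0
SBOX = rng.permutation(2 ** n)          # unknown leakage function, reused L times

def hw(z):
    return np.array([bin(v).count("1") for v in np.asarray(z).ravel()])

k_star = (0x0, 0x9)                     # w.l.o.g. the first key is 0
T = rng.integers(0, 2 ** n, size=(L, Q))
X = np.stack([hw(SBOX[T[l] ^ k_star[l]]) + rng.normal(0, sigma, Q)
              for l in range(L)])

def score(ks):
    """Criterion of Theorem 3 for the key tuple ks."""
    s = 0.0
    for u in range(2 ** n):
        sel = [(T[l] ^ ks[l]) == u for l in range(L)]
        num = sum(X[l][sel[l]].sum() for l in range(L))
        den = sum(int(sel[l].sum()) for l in range(L))
        if den:
            s += num ** 2 / den
    return s

best = max(((0,) + ks for ks in product(range(2 ** n), repeat=L - 1)), key=score)
print(best == k_star)
```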

2.6 General Setting, with Countermeasures

In general, the device defends itself by implementing protections. Masking is one of them. In the expression of y, in addition to T and k, another random variable M is introduced, called the mask, which is unknown to the attacker. It is usually assumed to be uniformly distributed.

Theorem 4

([8, Proposition 8]). The optimal attack in case of masking countermeasure is:

$$\begin{aligned} \mathcal {D}^{M;L}_{opt}(\mathbf {x}^{(\cdot )}, \mathbf {t}^{(\cdot )}) = {{\mathrm{argmax}}}_k \ \sum _{q=1}^Q \log \left\{ \sum _{m} \exp \Bigl \{ \sum _{d=1}^D \frac{1}{\sigma ^{{(d)}^2}}\bigl (x_q^{(d)}y_q^{(d)}-\frac{1}{2}{y_q^{(d)}}^2\bigr )\Bigr \} \right\} , \end{aligned}$$

assuming that the noise at each sample d is normal of variance \(\sigma ^{{(d)}^2}\).
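
A sketch of this criterion (ours), for a first-order Boolean masking on \(n=4\) bits: the two shares \(S(T\oplus k)\oplus M\) and \(M\) leak their Hamming weights at \(D=2\) samples, and the attacker marginalizes over the unknown mask with a numerically stabilized log-sum-exp.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n, Q, sigma = 4, 3000, 0.7
SBOX = rng.permutation(2 ** n)
MASKS = np.arange(2 ** n)               # uniform masks, unknown to the attacker

def hw(z):
    return np.array([bin(v).count("1") for v in np.asarray(z).ravel()])

k_star = 0xA
t = rng.integers(0, 2 ** n, size=Q)
m = rng.integers(0, 2 ** n, size=Q)
x1 = hw(SBOX[t ^ k_star] ^ m) + rng.normal(0, sigma, Q)   # leakage of share 1
x2 = hw(m) + rng.normal(0, sigma, Q)                      # leakage of share 2

Y2 = hw(MASKS).astype(float)            # model of share 2, for every mask guess

def score(k):
    """Criterion of Theorem 4, marginalizing the mask per query."""
    s = 0.0
    for q in range(Q):
        y1 = hw(SBOX[t[q] ^ k] ^ MASKS).astype(float)     # share-1 model per mask
        ll = (x1[q] * y1 - y1 ** 2 / 2 + x2[q] * Y2 - Y2 ** 2 / 2) / sigma ** 2
        s += ll.max() + np.log(np.exp(ll - ll.max()).sum())  # log-sum-exp
    return s

print(max(range(2 ** n), key=score) == k_star)
```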

2.7 Link Between Success Probability, SNR and Leakage Function

The optimal distinguishers given in the various scenarios (\(\mathcal {D}_\text {opt}\) for the nominal case in Sect. 2.3, \(\mathcal {D}^{D,S}_\text {opt}\) for the multivariate and multimodel case in Sect. 2.4, \(\mathcal {D}^L_\text {opt}\) for the collision case in Sect. 2.5, and \(\mathcal {D}^{M;L}_\text {opt}\) for the masked case in Sect. 2.6) allow the attacker to recover the secret key with the largest success rate (denoted SR), but do not help in predicting the number of traces needed to reach a given success rate (or vice-versa).

Such a relationship can easily be derived from the analysis of so-called first-order exponents [23]. Let us denote by \(\mathcal {A}_\text {opt}(\mathbf {x}, \mathbf {t}, k)\) the argument of the maximization in any of \(\mathcal {D}_\text {opt}\), \(\mathcal {D}^{D,S}_\text {opt}\), \(\mathcal {D}^L_\text {opt}\) or \(\mathcal {D}^{M;L}_\text {opt}\). We have:

Theorem 5

([23, Corollary 1]).

$$\begin{aligned} 1-\mathrm {SR}(\mathcal {D}) \approx e^{ - Q \cdot \mathrm {SE}(\mathcal {D}) } \end{aligned}$$
(4)

where the first-order success exponent \(\mathrm {SE}(\mathcal {D})\) is equal to:

$$\begin{aligned} \mathrm {SE}(\mathcal {D}) = \frac{1}{2}\ \min _{k\ne k^*}\ \frac{{{\mathrm{\mathbb {E}}}}\bigl ( \mathcal {A}_{opt}(\mathbf {x}, \mathbf {t}, k^*)-\mathcal {A}_{opt}(\mathbf {x}, \mathbf {t}, k)\bigr )^2}{\mathrm {Var}\bigl ( \mathcal {A}_{opt}(\mathbf {x}, \mathbf {t}, k^*)-\mathcal {A}_{opt}(\mathbf {x}, \mathbf {t}, k)\bigr )}. \end{aligned}$$
(5)

For the sake of introducing a signal-to-noise ratio, we rewrite Eq. (1) as:

$$\begin{aligned} X = \alpha y(T,k^*) + N, \text { where } {{\mathrm{\mathbb {E}}}}(y(T,k^*))=0, \mathrm {Var}(y(T,k^*))=1 \text { and } N\sim \mathcal {N}(0,\sigma ^2). \end{aligned}$$

Let us introduce generalized confusion coefficients [20]:

Definition 6

(General 2-way confusion coefficients [23, Definitions 8 and 10]). For \(k\ne k^{*}\) we define

$$\begin{aligned} \kappa (k^*,k)&= {{\mathrm{\mathbb {E}}}}\Bigl \{\Bigl (\frac{Y(k^{*})-Y(k)}{2}\Bigr )^2\Bigr \} , \end{aligned}$$
(6)
$$\begin{aligned} \kappa '(k^*,k)&= {{\mathrm{\mathbb {E}}}}\Bigl \{\Bigl (\frac{Y(k^{*})-Y(k)}{2}\Bigr )^4\Bigr \} . \end{aligned}$$
(7)

For example, for the optimal distinguisher in the nominal case, the success exponent expression is:

Lemma 7

(SE for the optimal distinguisher, [23, Proposition 5]). The success exponent for the optimal distinguisher takes the closed-form expression

$$\begin{aligned} \mathrm {SE}(\mathcal {D})&=\frac{1}{2}\ \min _{k\ne k^*} \; \frac{\alpha ^2\kappa ^2(k^*,k)}{\sigma ^2 \kappa (k^*,k) + \alpha ^2\bigl (\kappa '(k^*,k) - \kappa ^2(k^*,k)\bigr )}. \end{aligned}$$
(8)

For high noise (\(\sigma \gg \alpha \)), this closed-form expression simplifies into a simple equation:

Corollary 8

([23, Corollary 2]).

$$\begin{aligned} \mathrm {SE}(\mathcal {D}) \ \approx \ \frac{1}{2} \min _{k\ne k^*} {\frac{\alpha ^2 \kappa ^2(k^*,k)}{\sigma ^2 \kappa (k^*,k)}} \ =\ \frac{1}{2} \cdot \text {SNR} \cdot \min _{k\ne k^*} \kappa (k^*, k), \end{aligned}$$
(9)

where \(\mathrm {SNR}=\alpha ^2/\sigma ^2\) is the signal-to-noise ratio (see [6] for the definition of SNR in the multivariate case).
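
These quantities are directly computable. The sketch below (ours) evaluates the confusion coefficients of Eq. (6) for a toy 8-bit S-box under the normalized Hamming-weight model, then applies Eqs. (9) and (4) to predict the success rate at a given SNR.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n = 8
SBOX = rng.permutation(2 ** n)
t = np.arange(2 ** n)                   # T uniform: enumerate all texts

def hw(z):
    return np.unpackbits(np.asarray(z, dtype=np.uint8)[:, None], axis=1).sum(axis=1)

def Y(k):
    """Normalized model y(T, k): zero mean, unit variance over T."""
    y = hw(SBOX[t ^ k]).astype(float)
    return (y - y.mean()) / y.std()

k_star = 0x42
kappa = np.array([np.mean(((Y(k_star) - Y(k)) / 2) ** 2)        # Eq. (6)
                  for k in range(2 ** n) if k != k_star])

snr = 0.1                               # alpha^2 / sigma^2 (high-noise regime)
se = 0.5 * snr * kappa.min()            # success exponent, Eq. (9)
for Q in (200, 1000, 5000):
    print(Q, "traces -> SR ~", 1 - np.exp(-Q * se))             # Eq. (4)
```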

3 Side-Channel Protection

Side-channel attacks threaten the security of cryptographic implementations. Protections against such attacks can be devised using coding theory. We illustrate in this section several techniques which randomize leakages with a view to decorrelating them from the internally manipulated data, and which (in some cases) also allow the detection of malicious fault injections.

3.1 Strategies to Thwart Side-Channel Attacks

As discussed in Sect. 2.7 (especially in (9)), an attack is all the more successful as the confusion (6) of the leakage function and the SNR are high. However, the impact of confusion is limited, since \(0\le \min _{k\ne k^*} \kappa (k^*, k)\le 1/2\) is bounded. Moreover, the defender cannot always change the algorithm or its leakage model, that is, \(\min _{k\ne k^*} \kappa (k^*, k)\) is fixed. Thus, the defender is better off focusing on the reduction of the SNR.

This can be achieved in two flavors:

  1.

    reduce the signal, as done in strategies aiming at flattening the leakage. This is easily achieved for some side-channels, such as timing: the execution time is made constant, e.g., by inserting dummy instructions or by balancing the code in each branch when the control flow forks. However, balancing an analogue quantity (such as power or electromagnetic field) is more challenging, if only because of process variations: two identical gates or structures behave differently after fabrication. For instance, this is the working principle of physically unclonable functions (PUFs). Therefore, the quality of the protection depends on the ability of the fabrication plant to produce reproducible patterns. This fact naturally limits the quality of the designer's work, hence does not encourage reaching very high levels of security. In this case, the second option is preferred;

  2.

    increase the noise, by resorting to some extra random variables independent of those involved in the leakage function. Obviously, some artificial noise can easily be produced: one practical example consists in running an algorithm known to produce a lot of leakage (such as an asymmetric engine, e.g., RSA) in parallel to the algorithm to protect. However, there remains the risk that the attacker manages, by a subtle placement of the probes, to limit or completely avoid the externally added noise; imagine an attacker with a very selective electromagnetic probe, placed over the targeted algorithm, which is micrometers apart from the noise source (RSA). Therefore, it sounds wiser to entangle the computation and the random variables. This is what is achieved by so-called masking schemes. Appendix A explains why masking reduces the SNR.

Notice that the two strategies are orthogonal, that is, it is beneficial to employ them at the same time. Still, in the sequel, we focus on masking, since it allows (at least in theory) increasing the noise to the maximal extent.

3.2 Masking Schemes

Masking schemes have been introduced to obfuscate the internals of a computation, with a view to making it more difficult to attack. The strategy in masking is based on randomization:

  • for data (e.g., in algorithms with constant-execution flow, such as AES), and

  • for operations (e.g., in algorithms where the sequence of operations leak some secrets, such as RSA).

In practice, a masking scheme consists in four algorithms, as depicted in Fig. 3.

Fig. 3. Masking schemes

Initially, the input data must be masked, thanks to a first algorithm. Second, the masked data is manipulated, so as to implement the intended cryptographic operation. Many techniques exist. One way to envision masking is to see all the operations making up the cryptographic function as look-up tables. In this case, the masked look-up tables can be implemented as [37, Table 1]:

  • new larger look-up tables, where the masking material is now part of the addressing strategy,

  • table recomputation specifically for the current mask, or

  • computation style which is able to operate on masked data.

After the operation has been computed, it can be necessary to refresh the masks. Indeed, if the value is intended to be used more than once, then some masks would be duplicated during the computation. It is thus wise to re-randomize the current masks. Finally, at the end of the computation, the masked data must be freed from its mask: hence a demasking step. The first three algorithms require entropy, whereas the last one destroys entropy.
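
As a minimal sketch of these four algorithms (our illustration, for a first-order Boolean masking of bytes; real schemes use more shares and must also handle non-linear operations):

```python
import secrets

N_BITS = 8

def mask(x):
    """Masking: split x into two Boolean shares; consumes entropy."""
    m = secrets.randbits(N_BITS)
    return (x ^ m, m)

def masked_xor(a, b):
    """Masked operation: linear operations act share-wise."""
    return (a[0] ^ b[0], a[1] ^ b[1])

def refresh(z):
    """Refresh: re-randomize the sharing without changing the hidden value."""
    r = secrets.randbits(N_BITS)
    return (z[0] ^ r, z[1] ^ r)

def demask(z):
    """Demasking: recombine the shares; destroys the masking entropy."""
    return z[0] ^ z[1]

a, b = mask(0x3A), mask(0xC5)
c = refresh(masked_xor(a, b))           # refresh before the value is reused
assert demask(c) == 0x3A ^ 0xC5
```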

3.3 Security of Masking Schemes

It is easy to measure the amount of entropy consumed by a masking scheme (see top of Fig. 3). However, this does not necessarily reflect its actual security level. Indeed, the entropy can be wasted, e.g., by being badly used: XORing random bits together reduces the entropy, while bringing no additional difficulty for the attacker.

The first attempt to measure security arises from [1, Definition 1]. The order is defined as the minimum number of intermediate values an attacker must collect to recover part of the secret. In this framework, the overall security is that of the weakest link.

Still, the exact definition of an intermediate variable is unclear. The difficulty arises from the fact that the designer would like to link the security to properties of his design. However, the intermediate variables encompass different notions depending on the refinement stage: after compilation, variables are mapped to internal resources. Thus, the granularity [1, Sect. 3] can change between the cryptographic algorithm specification, the source code, the machine code, and what is actually executed on the device.

Some early works considered intermediate values to be bits, as in private circuits [25, 26]. This makes sense for hardware circuits, for which (in general CMOS processes) an equipotential has only two licit values, that is, carries one bit. However, private circuits have been extended to software implementations (see e.g. [40]), where intermediate variables become bitvectors of the machine word length. But after considering some new threats, such as glitches, a new trend has consisted in looking back to bit-oriented masking. This is typically the case of threshold implementations [35], where the granularity is again the bit.

In this article, we are interested in the lowest possible level of security analysis, hence we consider that intermediate variables are bits.

3.4 Orthogonal Direct Sum Masking (ODSM), a Masking Scheme Based on Codes

We illustrate in this section several masking schemes, and show how they relate to coding theory.

We will show that the two security notions related to masking (probing and bounded-moment models) are equivalent when conducting analyses at bit-level. We model a circuit as a parallel composition of bits, seen as elements of \(\mathbb {F}_2\). For example, when there are n wires in the circuit, we model the circuit state as an element of \(\mathbb {F}_2^n\), that is, the Cartesian product \(\mathbb {F}_2 \times \ldots \times \mathbb {F}_2\).

At this stage, we use the following new notations. Let X be a k-bit information word to be concealed. Let Y be an \((n-k)\)-bit mask used to protect X. The protected variable is \(Z=X G + Y H\), where:

  • G is a \(k\times n\) generator matrix of a code,

  • H is an \((n-k)\times n\) generator matrix of a code of dual distance \(d+1\),

  • \(+\) is the bitwise addition in \(\mathbb {F}_2^n\), sometimes also denoted by \(\oplus \).

The random variable YH is the mask. In practice, the bits making up Z can be manipulated in any order, i.e., they can even be scheduled one after the other, as in a bitslice implementation. We call Z an encoding with codes, or ODSM [3]. A small executable sketch of this encoding is given below.
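
The sketch (our illustration, with smaller parameters than in the sequel: \(k=4\), \(n=8\)) takes for H a generator of the self-dual [8,4,4] extended Hamming code, whose dual distance 4 gives order \(d=3\), and for G a complement of it in \(\mathbb {F}_2^8\):

```python
import numpy as np

# Generator of the mask code D (self-dual [8,4,4] extended Hamming code).
H = np.array([[1,0,0,0,0,1,1,1],
              [0,1,0,0,1,0,1,1],
              [0,0,1,0,1,1,0,1],
              [0,0,0,1,1,1,1,0]], dtype=np.uint8)
# Generator of the information code C, a complement of D in F_2^8.
G = np.eye(8, dtype=np.uint8)[4:]

def encode(x, y):
    """ODSM encoding Z = xG + yH over F_2 (x: 4 info bits, y: 4 mask bits)."""
    return (x @ G + y @ H) % 2

def demask(z):
    """Recover x: here H is systematic on the first 4 bits, so y = z[:4]."""
    y = z[:4]
    return (z + y @ H)[4:] % 2

rng = np.random.default_rng(seed=5)
x = rng.integers(0, 2, size=4, dtype=np.uint8)
y = rng.integers(0, 2, size=4, dtype=np.uint8)   # fresh uniform mask bits
z = encode(x, y)
assert np.array_equal(demask(z), x)
```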

Then, we have the following two theorems.

Theorem 9

Encoding with codes is secure against probing of order d.

Proof

By definition of a code of dual distance \(d+1\), any tuple of at most d coordinates is uniformly distributed [9]. Thus, if the attacker probes up to d (inclusive) wires, the probed word, seen as an element of \(\mathbb {F}_2^d\), is perfectly masked. Therefore, no information on X can be recovered.    \(\square \)

Theorem 10

(Masking with codes is d-th order secure in the bounded-moments model). For every pseudo-Boolean function \(\psi :\mathbb {F}_2^n\rightarrow \mathbb {R}\) (leakage function, denoted \(y=\varphi \circ f\) in Sect. 2.2) of degree \(d^\circ (\psi ) \le d\), we have

$$\begin{aligned} \mathsf {Var}\bigl (\mathbb {E}(\psi (X G + Y H) \mid X)\bigr ) = 0 . \end{aligned}$$
(10)

Proof

Let \(\psi '\) be the indicator of the code generated by H. Since H has dual distance \(d+1\), we have, for all \(z\in \mathbb {F}_2^n\) such that \(0<w_H(z)\le d\), \(\hat{\psi '}(z)=0\), where \(\hat{\psi '}(z)=\sum _{z'\in \mathbb {F}_2^n} \psi '(z') (-1)^{z \cdot z'}\). Now, owing to Lemma 1 in [4], we also know that for all \(z\in \mathbb {F}_2^n\) such that \(w_H(z)>d^\circ (\psi )\), \(\hat{\psi }(z)=0\).

Now, we must prove that \(\mathsf {Var}\bigl (\mathbb {E}(\psi (X G + Y H) \mid X)\bigr ) = 0\), that is, that for all \(x\in \mathbb {F}_2^k\), the quantity \(\sum _{y\in \mathbb {F}_2^{n-k}} \psi (x G + y H) = \sum _{z\in \mathbb {F}_2^n} \psi (x G + z) \psi '(z) = (\psi \otimes \psi ')(x G)\) is the same, where \(\otimes \) denotes the convolution product.

Actually, we can prove more than that, namely that \(\psi \otimes \psi '\) is constant on the full \(\mathbb {F}_2^n\). This is equivalent to proving that \(\widehat{\psi \otimes \psi '} = \hat{\psi } \hat{\psi '}\) is equal to zero on \(\mathbb {F}_2^n {\setminus }\{0\}\). Indeed, let \(z\in \mathbb {F}_2^n\), \(z\ne 0\). If \(w_H(z)>d^\circ (\psi )\), then \(\hat{\psi }(z)=0\). And if \(w_H(z)\le d^\circ (\psi )\le d\), then \(\hat{\psi '}(z)=0\). So, in both cases, we have \(\hat{\psi }(z) \hat{\psi '}(z) = 0\).    \(\square \)

Notice that the function \(\psi : \mathbb {F}_2^n\rightarrow \mathbb {R}\) such that \(\psi (z)=\sum _{i=0}^{n-1} z_i 2^i\) has degree one. It is sometimes (abusively) referred to as the identity function. Obviously, if the attacker gets to know \(\psi (Z)\), then he can recover Z, hence deduce X by projection onto the code C generated by G. But this is not our security hypothesis. Our result from Theorem 10 (and in particular its Eq. (10)) is that the inter-class variance of \(\psi (Z)\) knowing X is equal to zero, for every \(\psi \) with \(d^\circ (\psi )\le d\).
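
Theorem 10 can be checked exhaustively on the small ODSM instance sketched in the previous section (again our own code): for every monomial \(\psi \) of degree at most \(d=3\), the class means \(\mathbb {E}(\psi (Z)\mid X=x)\) coincide, whereas some degree-4 monomials do depend on x.

```python
import numpy as np
from itertools import combinations, product

H = np.array([[1,0,0,0,0,1,1,1],
              [0,1,0,0,1,0,1,1],
              [0,0,1,0,1,1,0,1],
              [0,0,0,1,1,1,1,0]], dtype=np.uint8)
G = np.eye(8, dtype=np.uint8)[4:]

words = lambda k: list(product((0, 1), repeat=k))

def class_means(idx):
    """E(psi(Z) | X = x) for the monomial psi(z) = prod_{i in idx} z_i."""
    means = []
    for x in words(4):
        vals = [np.prod(((np.array(x) @ G + np.array(y) @ H) % 2)[list(idx)])
                for y in words(4)]
        means.append(np.mean(vals))
    return np.array(means)

for deg in (1, 2, 3):                    # Eq. (10) holds up to degree d = 3
    for idx in combinations(range(8), deg):
        assert class_means(idx).var() == 0.0

leaking = [idx for idx in combinations(range(8), 4)
           if class_means(idx).var() > 0]
print(len(leaking), "degree-4 monomials leak")   # the order is exactly 3
```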

In Eq. (10), the degree of \(\psi \) can be accounted for by two reasons:

  1.

    High-order leakage in \(y=\varphi \circ f\), owing to glitches (see Sect. 4), capacitive coupling, IR drop, etc. (refer to [18, Sect. 4.2]);

  2.

    The combination function applied by the attacker, which can be either multivariate (involving a product of shares) or monovariate (hence necessarily high-order with zero offset).

As another remark, we notice that, although it is not strictly mandatory, the randomized variable Z can be manipulated by subwords, a bit like in classical masking, where the subwords coincide with shares.

Let us give the example of the look-up table, in the case \(k=8\) and \(n=16\). We know that we can reach 4th-order security [4]. But we can decide not to manipulate Z as a whole, but to cut it into two parts, \(Z=(Z_H, Z_L)\), where \(Z_H, Z_L\in \mathbb {F}_2^8\). This cut is motivated by the match between the masking scheme and the machine architecture, where the basic register size may be 8 bits. Then, we also cut the T-table(s) into two tables, namely \(T_H\) and \(T_L\), both of 256 bytes. Algorithm 1 allows the evaluation of the T-table using bytes only, i.e., without ever placing \(Z_H\) and \(Z_L\) side by side for any data Z.

Algorithm 1. Byte-wise evaluation of the T-table through \(T_H\) and \(T_L\) (not reproduced here)
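
Since Algorithm 1 itself is not reproduced above, here is a purely hypothetical sketch of its spirit (our guess, not the authors' exact procedure): a table with 16-bit outputs is stored as two 256-byte halves \(T_H\) and \(T_L\), so that the two output halves are produced and consumed as separate bytes.

```python
import numpy as np

rng = np.random.default_rng(seed=6)
# A table mapping a byte to a 16-bit encoded word, stored as two 256-byte
# halves T_H and T_L (placeholder contents, for illustration only).
T16 = rng.integers(0, 2 ** 16, size=256, dtype=np.uint16)
T_H = (T16 >> 8).astype(np.uint8)
T_L = (T16 & 0xFF).astype(np.uint8)

def lookup(index: int) -> tuple[int, int]:
    """Evaluate the table byte-wise: Z_H and Z_L never meet in one register."""
    return int(T_H[index]), int(T_L[index])
```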

3.5 Illustration for Some Coding-Based Masking Schemes

In the previous section, we have shown with Theorems 9 and 10 that the two models (bit-level probing and bounded moments) are equivalent, which motivates considering the probing model at bit level (as opposed to word level, as done in many papers, to cite a few: [16, 19]). We give hereafter some examples of masking with codes at bit-level.

Perfect Masking. The masks \(M_1\), \(M_2\), etc., are chosen uniformly in \(\mathbb {F}_2^k\). We assume here that k divides n. It is possible to see perfect masking as a special case of ODSM [3], where:

$$\begin{aligned} G = \begin{pmatrix} I_k&0&\cdots&0 \end{pmatrix} \quad \text {and} \quad H = \begin{pmatrix} I_k&I_k&0&\cdots&0\\ I_k&0&I_k&\cdots&0\\ \vdots&&&\ddots&\\ I_k&0&0&\cdots&I_k \end{pmatrix}, \end{aligned}$$
(11)

where \(I_k\) is the \(k\times k\) identity matrix, so that \(Z=(X\oplus M_1\oplus \cdots \oplus M_{n/k-1}, M_1, \ldots , M_{n/k-1})\) is the usual Boolean sharing.

Rotating Substitution-Box Masking (RSM [32]). Let us illustrate RSM on \(n=8\) bits. The mask M is chosen uniformly in:

  • the set \(\mathcal {C}_0 = \{\mathtt {0x00}\}\) for no resistance,

  • the set \(\mathcal {C}_1 = \{\mathtt {0x00},\mathtt {0xff}\}\) for resistance to first-order attacks,

  • the set \(\mathcal {C}_2\), a non-linear code of length 8, size 12 and dual distance \(d^\perp _{\mathcal {C}_2}=3\), for resistance to second-order attacks,

  • the set \(\mathcal {C}_3\), a linear code of length 8, dimension 4 and dual distance \(d^\perp _{\mathcal {C}_3}=4\), for resistance to third-order attacks. This code is fully described in [15]. It is a self-dual code of parameters [8, 4, 4].

The case of \(\mathcal {C}_3\) is interesting since there are sixteen masks, hence (in hardware) the sixteen substitution boxes (S) of an algorithm such as AES can be implemented masked. When \(\varphi =w_H\) and \(Z=f(T, k^*)=S(T\oplus k^*)\), the leakage distributions of \(X=\varphi (Z\oplus M)\) are represented in Fig. 4.
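
These distributions can be reproduced by direct enumeration. The sketch below (ours) uses the extended Hamming code as a representative [8,4,4] self-dual code for \(\mathcal {C}_3\) (the exact code of [15] may differ by equivalence) and histograms \(X=w_H(Z\oplus M)\):

```python
import numpy as np
from itertools import product
from collections import Counter

Hgen = np.array([[1,0,0,0,0,1,1,1],
                 [0,1,0,0,1,0,1,1],
                 [0,0,1,0,1,1,0,1],
                 [0,0,0,1,1,1,1,0]], dtype=np.uint8)

def to_int(bits):
    return int("".join(str(b) for b in bits), 2)

# The 16 codewords of C_3, used as the RSM masks.
C3 = sorted({to_int(np.array(c) @ Hgen % 2) for c in product((0, 1), repeat=4)})
assert len(C3) == 16

def leakage_histogram(z):
    """Distribution of X = w_H(Z xor M) for M uniform over C_3."""
    return Counter(bin(z ^ m).count("1") for m in C3)

# The two extreme sensitive values give the same histogram (by symmetry),
# and all moments of order <= 3 are independent of Z (dual distance 4).
print(leakage_histogram(0x00))
print(leakage_histogram(0xFF))
```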

Fig. 4. Leakage distribution of RSM using \(M\sim \mathcal {U}(\mathcal {C}_3)\) on \(n=8\) bits

RSM involves a random index, namely the choice of the initial codeword in \(\mathcal {C}_d\), for a protection order d. This choice can be done in a leak-free manner by using a one-hot representation. In the case of \(\mathcal {C}_3\), sixteen such indices can be selected. The one-hot representation is given in Fig. 5. The random index is selected at random initially; then, from round to round, it is simply shifted.

Fig. 5. Example of one-hot counter (out of 16), used to designate the round index position

Leakage Squeezing (LS). In leakage squeezing, the shares are as in perfect masking, except that some bijective functions are applied to them, thereby mixing the bits better [10, 12, 13, 17].

Results. For the illustration of the bounded moment model, we use the Hamming weight leakage model. Notice that any other first-order leakage model would yield comparable results.

Also, we illustrate the leakage based on two extreme plaintexts, that is 0x00 and 0xff. However, in some situations, these two plaintexts lead to the same leakage (e.g., for symmetry reasons).

In all the presented schemes, security holds only provided there is no high-order leakage. Said differently, any high-order leakage degrades the effective security order. For instance, in the recap of Fig. 6, the indicated security order is the total attack order, which combines, multiplicatively, the contribution of the hardware and that of the operations carried out by the attacker (see Sect. 4 for an example). That is, poor hardware which couples bits facilitates attacks, by combining bits on behalf of the attacker.

Fig. 6. Security level of several masking schemes. The order \(d=1,2,3,4\) corresponds both to the number of probes (see Fig. 1(a)) used by the attacker and to the moment of leakage when the attacker uses an integrating probe (see Fig. 1(b))

3.6 Masking and Faults Detection

Codes are also suitable tools when side-channel leakage must be masked and faults must be detected at the same time. This need is general in cryptography, and has specific applications in thwarting Hardware Trojan Horses (HTH) [11, 33, 34]. Indeed, the activation part of an HTH is impeded by masking, whereas the payload part is caught red-handed by a detection code.

4 Leakage Model, and Glitches

The term glitch refers to non-functional transitions occurring in combinational logic. Glitches exist because combinational gates are non-synchronizing, i.e., they evaluate as soon as one input arrives. In terms of hardware description languages (VHDL, Verilog, etc.), they are modelled as processes where all inputs belong to the sensitivity list. Thus, for the vast majority of gates with many inputs, there is the possibility of a race between the inputs. Therefore, some gates can evaluate several times within one clock period. Actually, the deeper the combinational logic, the more likely it is that:

  • there is a large timing difference between the inputs, thereby generating new glitches, and

  • some input is already the output of a glitching gate, thereby amplifying the number of glitches.

It is known that glitches can defeat masking schemes [28,29,30]. Some masking schemes which somehow tolerate [21, 22, 35, 39] or avoid glitches [27, 31] have been put forward. However, the real negative effect of glitches on security is usually perceived in a qualitative manner.

Fig. 7. Example of a 4th-order glitch occurring upon the 4th-order conjunction \(\bigwedge _{i=1}^{4} x_i\)

Therefore, we would like to account quantitatively for the effect of glitches. Let us start with an illustrative example, provided in Fig. 7. The upper part of this figure represents a pipeline, where some combinational gates (AND gates and an XOR gate) form a partial netlist between two barriers of flip-flops (DFF gates). For the sake of this explanation, all the gates are assumed to have the same propagation time, namely 1 ns. The lower part of this figure gives the chronograms of the execution of this netlist, when initially all signals are set to zero. It appears that, owing to the difference of path lengths between the two inputs of the final XOR gate, this gate generates a glitch, which lasts 3 ns, between times 1 and 4 ns within the depicted clock period. The condition for this glitch to appear is the following: \(x_1 \wedge x_2 \wedge x_3 \wedge x_4\). This means that this glitch is a 4th-order leakage. So, if the masking scheme is only 3rd-order resistant, the setup of Fig. 7 generates a glitch which compromises the security through a 1st-order side-channel attack. That is, the circuit itself contributes to the attack, by combining the bits on behalf of the attacker.
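
The chronograms of Fig. 7 can be reproduced with a toy unit-delay simulation (our own sketch; the signal names are ours). The AND chain settles 3 ns after the inputs, while the direct input reaches the XOR after 1 ns, so the output exhibits a transient pulse exactly when the direct input has toggled and \(x_1\wedge x_2\wedge x_3\wedge x_4 = 1\), a 4th-order condition:

```python
from itertools import product

def xor_waveform(x0, x1, x2, x3, x4, horizon=6):
    """Output of the final XOR at t = 0 .. horizon-1 ns, all gates taking
    1 ns, all signals starting at 0 before the inputs switch at t = 0."""
    conj = x1 & x2 & x3 & x4
    wave = []
    for t in range(horizon):
        and_chain = conj if t >= 3 else 0        # three cascaded AND gates
        out = (x0 ^ and_chain) if t >= 1 else 0  # plus 1 ns for the XOR itself
        wave.append(out)
    return wave

def has_glitch(wave):
    """More than one transition within the clock period betrays a glitch."""
    return sum(a != b for a, b in zip(wave, wave[1:])) > 1

for x in product((0, 1), repeat=5):
    if has_glitch(xor_waveform(*x)):
        print(x, xor_waveform(*x))   # pulses iff x0 = 1 and x1&x2&x3&x4 = 1
```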

Fig. 8. Example of two 2nd-order glitches occurring upon the 2nd-order conjunctions \(\bigwedge _{i=1}^{2} x_i\) and \(\bigwedge _{i=3}^{4} x_i\)

Assume now a setup slightly simpler than that of Fig. 7, where there is only one AND gate behind the second input of the XOR gate. However, we assume such a pattern is present twice, once computing \(y_0 = x_0 \oplus (x_1 \wedge x_2)\), and a second time computing \(y_5 = x_5 \oplus (x_4 \wedge x_3)\). Then, in this case, depicted in Fig. 8, the leakage incurred by the glitches at the output of the XOR gates only combines two bits amongst the \(x_i\) (namely \(x_1\) and \(x_2\), and \(x_3\) and \(x_4\)). Therefore, it suffices for the attacker to conduct a 2nd-order attack on the glitchy traces to succeed in a \(2\times 2=4\)th-order attack on the masking scheme. The circuit and the attacker collaborate towards realizing a 4th-order attack: half of the combination is carried out by the circuit (\(x_1 \wedge x_2\) and \(x_3 \wedge x_4\)), while the other half is left to the attacker. Indeed, by raising the traces to the second power, the attacker obtains a term \((x_1 \wedge x_2) \times (x_3 \wedge x_4)\), which coincides with the leakage condition of Fig. 7, that is, \(\bigwedge _{i=1}^{4} x_i\).

To conclude on the complexification of the leakage model, we underline that it has a negative impact in two situations:

  • on low-entropy masking schemes, where the individual shares are not protected at the maximum order (see for instance RSM in Sect. 3.5), and

  • on any masking schemes, where shares interact between themselves by some combinational logic.

In those two cases, great care must be taken; tools such as the one described in [18] can help check whether the design is secure (or not).

5 Conclusion

Throughout this paper, we have seen how coding theory and side-channel analysis can benefit from one another, both for attack and for protection.

This is a nice example of cross fertilization between disciplines, in which Claude Carlet played a decisive role. Thanks to you, Claude!