1 Introduction

In side-channel analysis (SCA), the attacker exploits weaknesses in physical implementations of cryptographic algorithms [12]. Such attacks rely on unintentional leakage through physical channels like power consumption [9] or electromagnetic radiation [19].

In profiled side-channel attacks, a powerful attacker has a device (the clone device) on which the secret key is known and from which he can obtain a set of profiling traces. From these traces, he builds a profiled model, which is then used to conduct an attack on another device (the device under attack). Consequently, profiled attacks have two phases: (1) the profiling phase, where a model is constructed, and (2) the attack phase, where the constructed model is used to attack the actual target device. Profiled SCA performs the worst-case security analysis as it considers the most powerful side-channel attacker, one with access to an open clone device (open since the keys are chosen or known by the attacker). The best-known profiled attack is the template attack, which is based on Bayes' rule. The template attack is considered to be the most powerful attack from the information-theoretic point of view when the attacker has an unbounded number of measurements in the profiling phase [3]. To cope with certain statistical difficulties that can arise in the template attack, there is a variant commonly known as the pooled template attack [4]. Finally, the third example of a profiled attack is the stochastic attack, which uses linear regression in the profiling phase [20].
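
The two phases can be illustrated with a minimal sketch of a univariate Gaussian template attack. Everything in it is an assumption made for illustration and is not taken from the cited works: the leakage is simulated as the Hamming weight of a 4-bit S-box output plus Gaussian noise, the PRESENT S-box serves only as a small example target, the function names are hypothetical, and a single variance is pooled over all classes as a one-dimensional analogue of the pooled template idea. With a uniform prior over key candidates, ranking candidates by accumulated log-likelihood is equivalent to ranking them by posterior probability under Bayes' rule.

```python
import random

# PRESENT 4-bit S-box, used here only as a small illustrative target.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hw(x):
    """Hamming weight of a small integer."""
    return bin(x).count("1")

def simulate(key, n, sigma, rng):
    """Assumed leakage model: HW(SBOX[p ^ key]) plus Gaussian noise."""
    traces = []
    for _ in range(n):
        p = rng.randrange(16)
        traces.append((p, hw(SBOX[p ^ key]) + rng.gauss(0.0, sigma)))
    return traces

def build_templates(profiling, key):
    """Profiling phase: one mean per class and a single pooled variance."""
    groups = {}
    for p, leak in profiling:
        groups.setdefault(hw(SBOX[p ^ key]), []).append(leak)
    means = {c: sum(v) / len(v) for c, v in groups.items()}
    sq = sum((x - means[c]) ** 2 for c, v in groups.items() for x in v)
    pooled_var = sq / (len(profiling) - len(groups))
    return means, pooled_var

def attack(traces, means, var):
    """Attack phase: accumulate Gaussian log-likelihoods per key candidate.

    The normalization constant is identical for all candidates because the
    variance is pooled, so it is omitted.
    """
    scores = [0.0] * 16
    for p, leak in traces:
        for k in range(16):
            mu = means[hw(SBOX[p ^ k])]
            scores[k] -= (leak - mu) ** 2 / (2.0 * var)
    return scores

# Profile on a clone device with a known key, then attack a different key.
rng = random.Random(0)
means, var = build_templates(simulate(0x3, 2000, 1.0, rng), 0x3)
scores = attack(simulate(0xA, 300, 1.0, rng), means, var)
print("best key candidate:", max(range(16), key=scores.__getitem__))
```

A realistic attack would use multivariate templates over several points of interest and measured rather than simulated traces; the sketch only shows how the profiling and attack phases fit together.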

These three techniques represent a standard set of techniques in profiled SCA. Besides these, the SCA community has also started using various machine learning techniques. Common examples are Naive Bayes [15], Support Vector Machines [7], Random Forest [10], and the multilayer perceptron [6, 13]. The multilayer perceptron (when it has multiple hidden layers) also represents the first setting for deep learning-based attacks in profiled SCA. In 2016, Maghrebi et al. conducted a more detailed study of deep learning techniques in profiled SCA, where they also used techniques like convolutional neural networks (CNNs) and recurrent neural networks [11]. The reported results were in favor of CNNs, and since then, a large part of the SCA community has used CNNs, see, e.g., [2, 17]. Such a direction seems to pay off, as current state-of-the-art results suggest CNNs indeed perform very well and can break implementations protected with countermeasures [2, 8, 22].

2 State-of-the-Art and Future Challenges

We emphasize that we provide neither a complete overview of the state-of-the-art nor a list of all related works tackling the future research directions we discuss. Rather, we concentrate on challenges we consider to be important and then offer more precise research questions within those.

Currently, the most explored research direction in machine learning-based SCA uses deep learning techniques like the multilayer perceptron and convolutional neural networks to mount attacks that are as powerful as possible. A common setting is to use publicly available datasets (the more difficult the dataset, the more attractive the target) and report guessing entropy results, i.e., the average rank of the correct key, usually summarized as the number of traces required to break the target. Here, we mention the work of Kim et al., who showed how adding noise to the input improves the performance of CNNs [8]. More recently, Zaid et al. proposed a methodology for CNN-based attacks with which they achieved state-of-the-art results [22]. Some of their results are so good that it remains questionable whether truly better attacks on those datasets and in such scenarios are even possible (as minimal improvements in guessing entropy are not so relevant in practice). Still, there is room for improvement if we consider not only the number of measurements necessary to mount the attack but also aim to:

  • Reduce the complexity of deep learning models. For example, Zaid et al. reported CNN models with a much smaller number of parameters than commonly needed [22].

  • Limit the number of measurements available to the attacker not only in the attack phase (which is usually done) but also in the training phase. By doing so, we force the attacker to use deep learning models that are as powerful as possible and, at the same time, reduce the computational complexity, as the training phase becomes shorter [16].

  • Consider more difficult targets and more realistic settings. Indeed, a quite common procedure in profiled SCA research is to use only a single device for both profiling and attacking as well as to have the same key on both “devices”. While this makes the setting easier for research, it also makes the results less reliable. Recent results indicate that settings using different devices and keys, commonly known as portability settings, are significantly more difficult for machine learning attacks [1, 5].
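
The guessing entropy reported in such evaluations can be estimated as sketched below. This is a minimal illustration under stated assumptions, not the procedure of any cited work: the input format `log_probs[i][k]` (the model's log-probability that trace i was produced under key candidate k), the function names, and the number of repeated experiments are all hypothetical choices.

```python
import random

def key_rank(scores, correct_key):
    """Rank of the correct key (0 = best); ties are resolved in its favor."""
    target = scores[correct_key]
    return sum(1 for s in scores if s > target)

def guessing_entropy(log_probs, correct_key, n_attack, n_experiments=50, seed=0):
    """Average rank of the correct key after 1..n_attack attack traces.

    Each experiment draws n_attack traces at random (without replacement)
    and accumulates the per-candidate log-probabilities trace by trace.
    """
    rng = random.Random(seed)
    n_keys = len(log_probs[0])
    totals = [0.0] * n_attack
    for _ in range(n_experiments):
        subset = rng.sample(range(len(log_probs)), n_attack)
        acc = [0.0] * n_keys
        for t, idx in enumerate(subset):
            for k in range(n_keys):
                acc[k] += log_probs[idx][k]
            totals[t] += key_rank(acc, correct_key)
    return [total / n_experiments for total in totals]

def traces_needed(ge):
    """Smallest number of traces from which the guessing entropy stays at 0.

    Returns None if the guessing entropy is still above 0 at the end.
    """
    needed = None
    for n in range(len(ge), 0, -1):
        if ge[n - 1] > 0:
            break
        needed = n
    return needed
```

Averaging over many random subsets of attack traces smooths out the dependence on the trace order, which is why a single key rank curve is usually not reported on its own.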

Next, despite strong results in deep learning-based SCA, we still understand little of what happens inside the deep learning process and, as such, do not know how to make the attacks even stronger. Common examples of questions one could ask are:

  • How to know when to stop the training phase (as simply observing loss and accuracy does not necessarily reveal the SCA performance)?

  • How to understand what a deep learning model has learned and how different the results would be for another target?

  • How to better connect the performance as measured by side-channel metrics and machine learning metrics?

  • How to select the best deep learning architectures (from both performance and complexity perspectives) for certain scenarios and how to conduct good hyperparameter tuning?

We note that several works partially consider such questions, but the answers are far from complete [14, 18, 21].
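
The gap between machine learning metrics and side-channel metrics raised in these questions can be made concrete with a small simulation. It is only a sketch under strong assumptions: the leakage is simulated as the Hamming weight of a 4-bit S-box output (the PRESENT S-box, chosen only as an example) plus Gaussian noise, the model is idealized and knows the class means exactly, and the function name `evaluate` is hypothetical. Per-trace accuracy is computed from the maximum-likelihood class, while the key rank is computed from log-likelihoods accumulated over all traces.

```python
import random

# PRESENT 4-bit S-box, used only as a small illustrative target.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def hw(x):
    """Hamming weight of a small integer."""
    return bin(x).count("1")

def evaluate(key, n_traces, sigma, seed=0):
    """Return (accuracy, key_rank) for an idealized Gaussian model.

    Assumption: leakage is HW(SBOX[p ^ key]) plus Gaussian noise, and the
    model knows the class means exactly (class c has mean c).
    """
    rng = random.Random(seed)
    correct = 0
    scores = [0.0] * 16
    for _ in range(n_traces):
        p = rng.randrange(16)
        label = hw(SBOX[p ^ key])
        leak = label + rng.gauss(0.0, sigma)
        # Machine learning view: is the maximum-likelihood class the true one?
        predicted = min(range(5), key=lambda c: (leak - c) ** 2)
        correct += predicted == label
        # SCA view: accumulate log-likelihoods for every key candidate.
        for k in range(16):
            scores[k] -= (leak - hw(SBOX[p ^ k])) ** 2 / (2.0 * sigma ** 2)
    rank = sum(1 for s in scores if s > scores[key])
    return correct / n_traces, rank

accuracy, rank = evaluate(key=0x7, n_traces=2000, sigma=2.0)
print("per-trace accuracy:", accuracy, "rank of correct key:", rank)
```

In this noisy setting, most single traces are misclassified, yet the small amount of information in each trace adds up across traces. This is one reason why accuracy observed during training can be a poor indicator of attack performance.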

Finally, while improving the performance of attacks is important, we must not forget that the end goal is to provide more security. As such, we should consider how to use the knowledge from the most powerful machine learning-based attacks to construct stronger countermeasures and how to use machine learning constructively in SCA (i.e., not only to attack).