
1 Introduction

In 2019, Gohr [15] proposed differential-neural cryptanalysis, employing neural networks as superior distinguishers and exploiting them to perform efficient key recovery attacks. Impressively, the differential-neural distinguisher (\(\mathcal{N}\mathcal{D}\)) outperformed the traditional pure differential distinguishers using full differential distribution tables (DDT). However, interpreting these neural network-based distinguishers remains challenging, hindering the comprehension of the additional knowledge learned by differential-neural distinguishers.

Despite the intricate nature of neural network interpretability, researchers have made preliminary progress in understanding the differential-neural distinguisher's inner workings. In EUROCRYPT 2021, Benamira et al. [7] proposed that Gohr's neural distinguisher effectively approximates the cipher's DDT during the learning phase. Moreover, the distinguisher relies on both the differential distribution of ciphertext pairs and that of the penultimate and antepenultimate rounds. Yet, the specific form of the additional information remains undisclosed.

In AICrypt 2023, Gohr et al. [16] showed that the differential-neural distinguisher for Simon32/64 can use only differential features and still achieve the same accuracy as pure differential ones. Applying the same neural network to Speck and Simon thus yields different conclusions: the network learns features beyond the full DDT for one cipher but not for the other. These intriguing findings motivate us to delve deeper into the neural network's mechanisms, aiming to identify the specific features underpinning its conclusions for each cipher and to further improve and exploit the neural distinguishers whenever additional features are captured.

Our Contributions. In this work, we conclude that \(\mathcal {N}\mathcal {D}\)s’ advantage over pure DDT-based distinguishers is in exploiting the differential distribution under the partially known value input to the last non-linear operation. Specifically, \(\mathcal {N}\mathcal {D}\)s exploit the correlation between the ciphertexts’ partial value, ciphertext pair’s differences, and intermediate states’ differences. Furthermore, our work shows that differential-neural cryptanalysis in the related-key (\(\mathcal{R}\mathcal{K}\)) setting can attack more rounds than in the single-key setting, which was not apparent before. The concrete contributions include the following.

  • Improving full DDT-based distinguisher. We observe that, apart from the information of differences, one knows the partial value of the inputs, denoted by y, to the last modular addition of Speck, leveraging which one can improve DDT-based distinguishers. We show that the differential probability conditioned on a fixed value of y can differ from the average differential probability over all possible y. This insight enables more accurate classification based on the ciphertext pair's differences and the ciphertexts' partial value. The high-level idea is to consider conditional probabilities and specific cases where the fulfillment of the differential constraints can be predicted based on the value of y. The results indicate that it is highly likely that \(\mathcal {N}\mathcal {D}\)s rely on these specific cases to outperform pure DDT-based distinguishers.

  • Optimizing the performance and training process of \(\mathcal{N}\mathcal{D}\)s. Addressing the challenge of training high-round, especially 8-round, \(\mathcal{N}\mathcal{D}\) of Speck32/64, we introduce the Freezing Layer Method. By freezing all convolutional layers in a pre-trained 7-round \(\mathcal{N}\mathcal{D}\), we efficiently train an 8-round \(\mathcal{N}\mathcal{D}\) using simple basic training with unaltered hyperparameters. This method matches Gohr’s accuracy but cuts training time and data.

  • Exploring differential-neural attacks in the related-key setting. The conclusion that \(\mathcal {N}\mathcal {D}\)s can efficiently capture features beyond full DDT encourages further exploration of \(\mathcal {N}\mathcal {D}\)-based attacks. We observed that control over the differential propagation is vital for achieving effective high-round \(\mathcal{N}\mathcal{D}\)s. Hence, we introduce related-key (\(\mathcal{R}\mathcal{K}\)) differences to slow down the diffusion of differences, aiding in training \(\mathcal{N}\mathcal{D}\) for higher rounds. As a result, we achieve a 14-round key recovery attack on Speck32/64 using related-key neural distinguishers (\(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s). Results are in Table 1. Furthermore, we constructed various distinguishers under various \(\mathcal{R}\mathcal{K}\) differential trails and conducted comprehensive comparisons, reinforcing \(\mathcal {N}\mathcal {D}\) explainability.

Table 1. Summary of key recovery attacks on Speck32/64

Organization. The paper's structure is as follows: Sect. 2 provides preliminaries. Section 3 provides insights on the \(\mathcal {N}\mathcal {D}\) explainability. Section 4 provides enhancements on the \(\mathcal {N}\mathcal {D}\) training. Section 5 details related-key differential-neural cryptanalysis. The conclusion is presented in Sect. 6.

2 Preliminary

2.1 Notations

Denote by \(C = (C_{n-1}, \ldots , C_0)\) the binary vector of n bits, where \(C_i\) is the bit at position i and \(C_0\) is the least significant. Define n as the word size in bits and 2n as the state size. Let \((C_{L}^{r}, C_{R}^{r})\) represent left and right state branches after r rounds, and \(k^{r}\) the r-round subkey. Bitwise XOR is denoted by \(\oplus \), addition modulo \(2^n\) by \(\boxplus \), bitwise AND by \(\odot \), and bitwise right/left rotation by \(\ggg /\lll \).

2.2 Brief Description of Speck32/64

In 2013, the National Security Agency (NSA) proposed the Speck and Simon block ciphers, aiming to ensure security on resource-constrained devices [5]. By 2018, both ciphers were standardized by ISO/IEC for air interface communication. The Speck cipher uses a Feistel-like ARX design, deriving its non-linearity from modular addition and leveraging XOR and rotation for linear mixing. Speck32/64 is the smallest Speck variant [5]. Its round function, iterated over 22 rounds, takes a 16-bit subkey \(k^{i}\) and a state of two 16-bit words, \((C_{L}^i, C_{R}^i)\). Its key schedule reuses the round function to generate round keys. With K as the master key and \(k^i\) the i-th round key, \(K=(l^2,l^1,l^0,k^0)\). The round function's details are in Fig. 1.

Fig. 1. The round function and key schedule algorithm of Speck32/64
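For concreteness, the following is a minimal Python sketch of the Speck32/64 round function and key schedule, assuming the standard specification from [5] (rotation amounts 7 and 2); the variable names are illustrative and this is not the authors' implementation.

```python
MASK = 0xFFFF  # 16-bit words

def rol(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK

def ror(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK

def round_function(cl, cr, k):
    # One round: rotate, add modulo 2^16, XOR the subkey, rotate, XOR.
    cl = ((ror(cl, 7) + cr) & MASK) ^ k
    cr = rol(cr, 2) ^ cl
    return cl, cr

def expand_key(l2, l1, l0, k0, rounds=22):
    # The key schedule reuses the round-function structure on (l, k).
    ls, ks = [l0, l1, l2], [k0]
    for i in range(rounds - 1):
        l_new = ((ks[i] + ror(ls[i], 7)) & MASK) ^ i
        ks.append(rol(ks[i], 2) ^ l_new)
        ls.append(l_new)
    return ks

def encrypt(pl, pr, round_keys):
    for k in round_keys:
        pl, pr = round_function(pl, pr, k)
    return pl, pr
```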

2.3 Overview of Differential-Neural Cryptanalysis

The differential-neural distinguisher operates as a supervised model, distinguishing whether ciphertext pairs originate from plaintext pairs with a defined input difference or from random pairs. Given m plaintext pairs \(\{(P_j,P'_j), j \in [0, m-1]\}\), the corresponding ciphertext pairs \(\{(C_j,C'_j), j \in [0, m-1]\}\) constitute a sample (In [15], \(m = 1\)). Each training sample is associated with a label Y defined as:

$$Y=\left\{ \begin{array}{l} 1, \text{ if } P_j \oplus P'_j=\varDelta , j \in [0, m-1] \\ 0, \text{ if } P_j \oplus P'_j \ne \varDelta , j \in [0, m-1] \end{array}\right. $$
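As an illustration of this labeling convention (with \(m = 1\) as in [15]), the following sketch generates a balanced set of positive and negative samples; it reuses the hypothetical `encrypt`/`expand_key` helpers from the sketch in Sect. 2.2 and is not the authors' data pipeline.

```python
import numpy as np

def make_dataset(n_samples, n_rounds, delta=(0x0040, 0x0000)):
    rng = np.random.default_rng()
    data = np.zeros((n_samples, 4), dtype=np.uint16)  # ciphertext words (A, B, C, D)
    labels = rng.integers(0, 2, size=n_samples, dtype=np.uint8)
    for i in range(n_samples):
        key = [int(w) for w in rng.integers(0, 1 << 16, size=4)]
        rks = expand_key(*key, rounds=n_rounds)
        pl, pr = int(rng.integers(0, 1 << 16)), int(rng.integers(0, 1 << 16))
        if labels[i] == 1:   # positive: second plaintext differs by delta
            ql, qr = pl ^ delta[0], pr ^ delta[1]
        else:                # negative: an unrelated random plaintext pair
            ql, qr = int(rng.integers(0, 1 << 16)), int(rng.integers(0, 1 << 16))
        data[i, 0:2] = encrypt(pl, pr, rks)
        data[i, 2:4] = encrypt(ql, qr, rks)
    return data, labels
```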

The \(\mathcal {N}\mathcal {D}\) architecture from [15] uses the prevalent ResNet. It comprises an initial input block, several residual blocks, and a prediction output layer.

In [15], three training schemes are proposed: a) Basic training for short-round distinguishers. b) An enhanced method using the KeyAveraging simulation and an \((r-1)\)-round distinguisher, achieving the optimal 7-round \(\mathcal {N}\mathcal {D}\) for Speck. c) A staged training approach evolving a pre-trained \((r-1)\)-round distinguisher to an r-round one in stages, yielding the most extended \(\mathcal {N}\mathcal {D}\) on Speck, covering 8 rounds. In [15], Gohr also showed how to combine a neural distinguisher with a classical differential and use a Bayesian-optimized key-guessing strategy for key recovery. Later, in [16], the authors provided general guidelines for optimizing Gohr's neural network and diverse optimization approaches across different ciphers, highlighting its efficacy and versatility. The authors also clarified for which kinds of ciphers the neural network cannot learn features beyond differential ones.

3 Explicitly Explain Knowledge Beyond Full DDT

Studies show that differential-based neural distinguishers often outperform DDT-based ones on certain ciphers [3, 15, 16]. However, what specific knowledge these neural distinguishers learn beyond the DDT remains elusive. Prior research suggests that these distinguishers rely on differential distributions in the last two rounds and differential-linear (DL) properties [7, 11]. In [15], a “Real Differences Experiment” was conducted to observe how well neural networks could detect real differences beyond the DDT. The experiment used randomized ciphertext pairs with a blinding value R introduced to obscure information beyond the difference. Results showed that neural networks could detect real differences without explicit training, and that ciphertext pairs have non-uniform distributions within their difference equivalence classes. However, when blinding values of the form \(R = (a, a)\) (with a being any 16-bit word) were used, the distinguishers failed (henceforth referred to as Gohr's aaaa-blinding experiment). This underlines that the neural distinguishers are not exploiting the key schedule, and that they can make finer distinctions than mere difference equivalence classes. These insights are crucial to explicitly explaining \(\mathcal {N}\mathcal {D}\)'s superior classification mechanism. Based on these studies, this section takes a further step towards fully interpreting the knowledge that an \(\mathcal {N}\mathcal {D}\) has captured beyond the full differential distribution.

We begin by locating the root of the performance improvement, then deduce the specific pattern that causes the improvement, and finally use this pattern to improve the pure DDT-based distinguisher.

3.1 Locating Information Used by \(\mathcal {N}\mathcal {D}\)s of Speck Beyond DDT

In the following, we start with a generalized definition of information that the differential-neural distinguisher might use.

Generalized Definition of XOR Information. In Gohr’s differential-neural distinguishers, given Speck ’s Feistel-like structure, samples are split into four words: \(\mathcal {A}, \mathcal {B}\) (forming the first ciphertext) and \(\mathcal {C}, \mathcal {D}\) (forming the second), as depicted in Fig. 2. In subsequent discussions, a symbol’s superscript denotes the number of encryption rounds. The absence of a superscript implies r rounds.

Fig. 2. Definition of XOR information

Traditional differential distinguishers focus solely on the difference of ciphertext pairs. Yet, as indicated in prior research [13, 22, 27], internal differentials can also be pivotal in cryptanalytic tasks.

We broaden the focus to include the XOR interactions among \(\mathcal {A}\), \(\mathcal {B}\), \(\mathcal {C}\), and \(\mathcal {D}\). For brevity, XOR combinations like \(\mathcal {A}\oplus \mathcal {B} \oplus \mathcal {C} \oplus \mathcal {D}\) are shortened to \(\mathcal {ABCD}\). In other words, beyond the traditionally focused differences like \(\mathcal{A}\mathcal{C}\) and \(\mathcal{B}\mathcal{D}\), we explore under-emphasized XORs such as \(\mathcal{A}\mathcal{B}\), \(\mathcal{C}\mathcal{D}\), \(\mathcal{A}\mathcal{D}\), \(\mathcal{B}\mathcal{C}\), and \(\mathcal {ABCD}\). For clarity, we classify these XORs as: Inter-XOR (\(\mathcal{A}\mathcal{C}\), \(\mathcal{B}\mathcal{D}\)), Intra-XOR (\(\mathcal{A}\mathcal{B}\), \(\mathcal{C}\mathcal{D}\)), Cross-XOR (\(\mathcal{A}\mathcal{D}\), \(\mathcal{B}\mathcal{C}\)), and Total-XOR (\(\mathcal {ABCD}\)).

In Speck, Intra-XOR and Total-XOR relate to values and differences from the prior round. Specifically, Intra-XOR helps deduce the right-half values, and Total-XOR deduces the right-half differences of the preceding round.
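For clarity, the following small sketch (word names as defined above; purely illustrative) computes the four kinds of XOR information from a ciphertext pair.

```python
def xor_information(A, B, C, D):
    # A, B: words of the first ciphertext; C, D: words of the second.
    return {
        "Inter-XOR": (A ^ C, B ^ D),   # the classical ciphertext-pair differences
        "Intra-XOR": (A ^ B, C ^ D),   # relates to right-half values of the prior round
        "Cross-XOR": (A ^ D, B ^ C),
        "Total-XOR": A ^ B ^ C ^ D,    # relates to the right-half difference of the prior round
    }
```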

Table 2. Experimental results detailing the information harnessed by \(\mathcal {N}\mathcal {D}\)s. Each set comprises both positive and negative samples. The notation (\(\mathcal {A}, \mathcal {B}, \mathcal {C}, \mathcal {D}\)) denotes ciphertext pairs derived from plaintext pairs with an input difference of (0040,0000), while Random signifies pairs generated from random values. \(\mathcal {R}_1\) refers to a random value.

Is XOR Information the Sole Basis for the Differential-Neural Distinguisher's Decision Making? Using a mechanical method to determine relations between information sets, we show that focusing solely on the specified XOR information suffices to locate the source of the information that \(\mathcal {N}\mathcal {D}\)s exploit beyond the difference information.

Determine Relations Between Information Sets Mechanically. Consider a pair of ciphertexts from a round-reduced Speck, denoted as \(C_0 = ({C_0}_L, {C_0}_R)\) and \(C_1 = ({C_1}_L, {C_1}_R)\). Each ciphertext splits into two parts, with \({C_i}_J \in \mathbb {F}_2^{b}\) for \(i\in \{0,1\}\) and \(J \in \{L, R\}\). For Speck32/64, \(b = 16\). Let K be the last round key, with \(K \in \mathbb {F}_2^{b}\). For each \(C_i\), let \({M_i}_L\) and \({M_i}_R\) represent the state value immediately preceding the XOR with key K and before the XOR between the left and right branches for \(i\in \{0,1\}\). That is

$$ {C_0}_L = {M_0}_L \oplus K, \quad {C_0}_R = {M_0}_L \oplus {M_0}_R \oplus K, \quad {C_1}_L = {M_1}_L \oplus K, \quad {C_1}_R = {M_1}_L \oplus {M_1}_R \oplus K. $$

The method to determine relations between information sets can be outlined in the following steps: Let \(\mathcal {R}_1\) and \(\mathcal {R}_2\) be two random values in \(\mathbb {F}_2^{b}\).

  1. Setup:

     (a) Set up a vector space \(\mathcal {V}\) over the field \(\mathbb {F}_2\) with dimension 7.

     (b) Define various basis vectors for \(\mathcal {V}\), acting as linear masks whose non-zero bits indicate the variable selection from the vector \([{M_0}_L, {M_0}_R, {M_1}_L, {M_1}_R, K, \mathcal {R}_1, \mathcal {R}_2]\). Concretely,

     $$ \begin{array}{llll} \varGamma _{{M_0}_L} &= \texttt {[1,0,0,0,0,0,0]} & \varGamma _{{M_1}_L} &= \texttt {[0,0,1,0,0,0,0]} \\ \varGamma _{{M_0}_R} &= \texttt {[0,1,0,0,0,0,0]} & \varGamma _{{M_1}_R} &= \texttt {[0,0,0,1,0,0,0]} \\ \varGamma _{ K} &= \texttt {[0,0,0,0,1,0,0]} & \varGamma _{\mathcal {R}_1} &= \texttt {[0,0,0,0,0,1,0]} \\ & & \varGamma _{\mathcal {R}_2} &= \texttt {[0,0,0,0,0,0,1]} \\ \end{array} $$

     Accordingly, \([{C_0}_L, {C_0}_R, {C_1}_L, {C_1}_R]\) can be obtained using the following masks:

     $$\begin{array}{ll} \varGamma _{{C_0}_L} := \varGamma _{\mathcal {A}} = \varGamma _{{M_0}_L} \oplus \varGamma _{ K} &= \texttt {[1,0,0,0,1,0,0]}, \\ \varGamma _{{C_0}_R} := \varGamma _{\mathcal {B}} = \varGamma _{{M_0}_L} \oplus \varGamma _{{M_0}_R} \oplus \varGamma _{ K} &= \texttt {[1,1,0,0,1,0,0]}, \\ \varGamma _{{C_1}_L} := \varGamma _{\mathcal {C}} = \varGamma _{{M_1}_L} \oplus \varGamma _{ K} &= \texttt {[0,0,1,0,1,0,0]}, \\ \varGamma _{{C_1}_R} := \varGamma _{\mathcal {D}} = \varGamma _{{M_1}_L} \oplus \varGamma _{{M_1}_R} \oplus \varGamma _{ K} &= \texttt {[0,0,1,1,1,0,0]}. \\ \end{array} $$

    Besides, we have \(\varGamma _{\mathcal{X}\mathcal{Y}} = \varGamma _{\mathcal {X}} \oplus \varGamma _{\mathcal {Y}}\) for \(\mathcal {X}, \mathcal {Y} \in \{ \mathcal {A}, \mathcal {B}, \mathcal {C}, \mathcal {D}, \mathcal{A}\mathcal{C}, \mathcal{B}\mathcal{D}, \mathcal{A}\mathcal{B}, \mathcal {R}_1, \mathcal {R}_2\}\).

  2. Subspace Generation: Create the subspaces from the given vectors and combinations:

     • Set-1-1: span of \(\{\varGamma _{\mathcal {A}}, \varGamma _{\mathcal {B}}, \varGamma _{\mathcal {C}}, \varGamma _{\mathcal {D}}\}\).

     • Set-1-2: span of \(\{\varGamma _{\mathcal {A}\mathcal {R}_1}, \varGamma _{\mathcal {B}\mathcal {R}_1}, \varGamma _{\mathcal {C}\mathcal {R}_1}, \varGamma _{\mathcal {D}\mathcal {R}_1}\}\).

     • Set-1-X: span of \(\{\varGamma _{\mathcal{A}\mathcal{C}}, \varGamma _{\mathcal{B}\mathcal{D}}, \varGamma _{\mathcal{A}\mathcal{B}}\}\).

     • Set-2-1: span of \(\{\varGamma _{\mathcal {A}\mathcal {R}_1}, \varGamma _{\mathcal {B}\mathcal {R}_2}, \varGamma _{\mathcal {C}\mathcal {R}_1}, \varGamma _{\mathcal {D}\mathcal {R}_2}\}\).

     • Set-2-2: span of \(\{\varGamma _{\mathcal {A}\mathcal {R}_1}, \varGamma _{\mathcal {B}\mathcal {R}_2}, \varGamma _{\mathcal {C}\mathcal {R}_2}, \varGamma _{\mathcal {D}\mathcal {R}_1}\}\).

     • Set-2-3: span of \(\{\varGamma _{\mathcal {ABCD}}\}\).

     Note that \(\texttt {Set-1-2}\) is the setting of Gohr's aaaa-blinding experiment.

  3. Remove randomness: In light of the observations from [15], where it's determined that \(\mathcal {N}\mathcal {D}\)s in the single-key attack setting don't leverage the key schedule, we can adapt the Speck32/64 key schedule to employ independent subkeys. This means we treat K along with \(\mathcal {R}_1\) and \(\mathcal {R}_2\) as random variables. Consequently, any vector that has a component of \(\varGamma _{ K}\), \(\varGamma _{ \mathcal {R}_1}\), or \(\varGamma _{ \mathcal {R}_2}\) is deemed random, and hence, devoid of information. For example, \(\varGamma _{{C_i}_J}\) has a linear component \(\varGamma _{ K}\); thus, a standalone \({C_i}_J\) lacks information, where \(i\in \{0,1\}\) and \(J \in \{L, R\}\). Accordingly, we do as follows.

     (a) After creating each subspace, randomness is removed from each subspace according to whether a vector has a component from \(\varGamma _{ K}\), \(\varGamma _{\mathcal {R}_1}\), or \(\varGamma _{\mathcal {R}_2}\). Without ambiguity, the sanitized sets are also denoted by Set-i-j for \(i\in \{\texttt {1,2}\}\) and \(j \in \{\texttt {1,2,3,X}\}\).

  4. Comparison: The sanitized sets are then compared against each other to determine if one set equals or is a subset of the other.

The result shows that \(\texttt {Set-1-1}\) equals \(\texttt {Set-1-2}\) and \(\texttt {Set-1-X}\), meaning that the combination of Inter-XOR and Intra-XOR is exactly what an information-theoretically optimal distinguisher accepting ciphertext pairs can use under the assumption that it does not use key-schedule.
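The mechanical comparison itself is straightforward to reproduce; a minimal sketch follows, representing the 7-dimensional masks as 7-bit integers (the bit ordering and names are illustrative, not the authors' code).

```python
M0L, M0R, M1L, M1R, K, R1, R2 = (1 << i for i in range(7))
A, B = M0L ^ K, M0L ^ M0R ^ K          # Gamma_A, Gamma_B
C, D = M1L ^ K, M1L ^ M1R ^ K          # Gamma_C, Gamma_D
RANDOM = K | R1 | R2                   # components that make a mask information-free

def span(generators):
    vecs = {0}
    for g in generators:
        vecs |= {v ^ g for v in vecs}
    return vecs

def sanitize(vecs):
    # Drop every mask touching K, R1, or R2 (it selects random bits only).
    return frozenset(v for v in vecs if v & RANDOM == 0)

set_1_1 = sanitize(span([A, B, C, D]))
set_1_2 = sanitize(span([A ^ R1, B ^ R1, C ^ R1, D ^ R1]))
set_1_X = sanitize(span([A ^ C, B ^ D, A ^ B]))
print(set_1_1 == set_1_2 == set_1_X)   # True: the three sanitized sets coincide
```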

As we proceed, we delve deeper to ascertain the specific XOR information that holds significance.

Table 3. Experimental results of \(\mathcal {N}\mathcal {D}\)s leveraging selected XOR information.

Which of the XOR Information is Significant for Differential-Neural Distinguisher? To isolate the pivotal XOR information, we conducted experiments where a differential-neural distinguisher was given access to only selected XOR data.

All our subsequent experiments were conducted on a 6-round Speck32/64 with an input difference of (0040,0000), adhering to the configurations presented in Table 17 in [4]. The differential-neural distinguishers trained as per Table 2 Set.1-1 to Set.1-3 serve as baselines (Set.1-2 and Set.1-3 correspond to Gohr's aaaa-blinding experiment). In the sequel, we use Set.i-j to refer to the experimental setup, while Set-i-j represents the associated information set for the positive samples, where \(i \in \{1,2\}\) and \(j \in \{1,2,3\}\).

Defining \(\mathcal {R}_1\) and \(\mathcal {R}_2\) as two distinct random values, Set.2-1 in Table 3 retains only Inter-XOR and Total-XOR, while Set.2-2 keeps only Cross-XOR and Total-XOR. Set.2-3, on the other hand, exclusively considers Total-XOR. Firstly, our mechanical analysis of sanitized subspaces reveals the following relations:

[Relations among the sanitized sets Set-2-1, Set-2-2, and Set-2-3 (figure omitted)]

In Table 3 Set.2-1, the differential-neural distinguisher's access is limited to Inter-XOR and Total-XOR – equivalent to what the DDT distinguisher utilizes. Its accuracy aligns closely with the 6-round DDT's accuracy of 0.758, without any noticeable enhancement. This underscores that the differential-neural distinguisher's advantage over the DDT arises from its access to extra information. From this observation, we reinforce the subsequent conclusion.

Conclusion 1

The differential-neural distinguisher \(\mathcal {N}\mathcal {D}^{\textsc {Speck}_{rR}}\) 's superiority over \(\mathcal {D}\mathcal {D}^{\textsc {Speck}_{rR}}\) is mainly due to its exploitation of Intra-XOR and Cross-XOR.

This conclusion naturally prompts a more intricate query: How does the differential-neural distinguisher effectively exploit Intra-XOR and Cross-XOR? Upon closer inspection, we can further dismiss the significance of Cross-XOR alone. Given that \(\texttt {Set-2-3} \subset \texttt {Set-2-2}\), it is evident that Set-2-3 inherently provides less data than Set-2-2. Yet, while in Set.2-2 combining Total-XOR with either Intra-XOR or Cross-XOR results in a valid distinguisher, solely using Total-XOR in Set.2-3 yields an accuracy identical to that of the distinguisher in Set.2-2. From this, we conclude that Cross-XOR on its own lacks significance. The differential-neural distinguisher likely uncovers new patterns by melding Inter-XOR with either Intra-XOR or Cross-XOR. This line of reasoning culminates in the following conclusion.

Conclusion 2

Unlike Inter-XOR, neither Intra-XOR nor Cross-XOR independently offers useful information. The differential-neural distinguisher relies on combinations of Inter-XOR with either Intra-XOR or Cross-XOR.

Remark 1

(On \(\mathcal {N}\mathcal {D}\) exploiting the key schedule). Gohr’s study in [15] indicates that \(\mathcal {N}\mathcal {D}\)s, in a single-key attack on Speck, do not exploit the key schedule. It naturally raises the question: Do \(\mathcal {N}\mathcal {D}\)s behave similarly in related-key scenarios? Motivated by this, we conduct comparison experiments similar to Gohr’s aaaa-blinding experiment (comparing \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s in Set.1-1 and Set.2-1), investigating whether \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s use the same ciphertext equivalence classes as the single-key \(\mathcal {N}\mathcal {D}\)s by [15]. In Sect. 5.1, we delve deep into our \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s and present an interesting observation reinforcing our following \(\mathcal {N}\mathcal {D}\) explainability in Sect. 3.2.

3.2 Explicit Rules to Exploit the Information Beyond Full DDT: From a Cryptanalytic Perspective

This section delves into the exact patterns harnessed by the differential-neural distinguisher. Our exploration commences with an intriguing observation from Experiment A, as described in [7]. The experiment unfolds as follows:

  1. For each 5-round ciphertext pair difference, \(\delta \), which results in extreme scores surpassing 0.9 (indicative of a good score) and exhibiting a high frequency of occurrence:

     (a) Generate a set of \(10^4\) random 32-bit numbers.

     (b) Utilize the difference \(\delta \) to construct a dataset encompassing \(10^4\) data pairs, each bearing the difference \(\delta \).

     (c) Feed the dataset to the differential-neural distinguisher and count the predicted labels.

While DDT-based distinguishers would predict Experiment A's entire data as positive, the differential-neural distinguisher \(\mathcal {N}\mathcal {D}\) does not. For \(\mathcal {N}\mathcal {D}\), the proportion of each difference is consistently at 0.75 (refer to Table 19 in the full version [4]), suggesting that the \(\mathcal {N}\mathcal {D}\) employs criteria beyond simple differential probability in its classifications. The consistent proportion of 0.75 also implies a discernible pattern linked to two specific bits. If a ciphertext pair aligns with this two-bit pattern, it is classified as negative, regardless of high output difference probabilities. This observation prompts an investigation into the potential two-bit pattern, motivating us to look into properties of addition modulo \(2^n\) (\(\boxplus \)) from a cryptanalytic perspective.

Enhancing DDT-Based Distinguishers via Conditional Probabilities. In r-round Speck32/64, denote the input and output differences of the last \(\boxplus \) by (\(\alpha \), \(\beta \), \(\gamma \)), and their respective values by (x, y, z) and (\(x'\), \(y'\), \(z'\)). For each output pair \(((C_L, C_R), (C'_L, C'_R))\), one knows the following information: \(\gamma = C_L \oplus C'_L\), \(\beta = (C_L \oplus C_R \oplus C'_L \oplus C'_R)^{\ggg 2}\), and \(y = (C_L \oplus C_R)^{\ggg 2}\). Namely, apart from knowing two differences (i.e., \(\beta \) and \(\gamma \)), one knows a value (i.e., y) around the last \(\boxplus \). Besides, the input difference \(\alpha \) is unknown but might be biased among positive samples and thus is predictable. Concretely, the attributes of the information around the last \(\boxplus \) are as follows:

  \(\alpha \): unknown but biased,  x: unknown and balanced;
  \(\beta \): known,  y: known;
  \(\gamma \): known,  z: unknown and balanced.

The knowledge of y, which is one of two inputs of the last \(\boxplus \), provides additional information apart from the differences. The concrete analysis is as follows.
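The extraction of these quantities from an r-round output pair is a one-liner each; a small sketch follows (`ror` as in the Speck sketch of Sect. 2.2; names are illustrative).

```python
def last_addition_view(cl, cr, cl2, cr2):
    gamma = cl ^ cl2                    # output difference of the last modular addition
    beta = ror(cl ^ cr ^ cl2 ^ cr2, 2)  # difference of the known input y
    y = ror(cl ^ cr, 2)                 # the known input value to the last modular addition
    return gamma, beta, y
```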

When conditioned on a fixed y, the differential probability can differ from the average probability over all possible y. For a valid differential propagation \((\alpha , \beta \mapsto \gamma )\) through \(\boxplus \), consider each bit position i where \(0 \le i < n - 1\): If \(\texttt{eq} (\alpha , \beta , \gamma )_i = 1\), the difference propagation at the \((i+1)\)-th position is deterministic, as elucidated in [20]; conversely, for \(\texttt{eq} (\alpha , \beta , \gamma )_i = 0\), the \((i+1)\)-th bit's difference propagation is probabilistic; for the \((i+1)\)-th bit difference to be fulfilled, the input values at the i-th position (namely, \(x_i\), \(y_i\), \(c_i\) – the carry's i-th bit) must satisfy a certain linear constraint, detailed in Observation 1.

Observation 1

([10]). Let \(\delta = (\alpha , \beta \mapsto \gamma )\) be a possible XOR-differential through addition modulo \(2^n\) (\(\boxplus \)). Let (x, y) and \((x\oplus \alpha , y \oplus \beta )\) be a conforming pair of \(\delta \); then x and y satisfy the following. For \(0 \le i < n - 1\), if \(\texttt{eq} (\alpha , \beta , \gamma )_i = 0\),

$$ \left. \begin{array}{ll} x_i \oplus y_i = \texttt{xor}(\alpha , \beta , \gamma )_{i+1} \oplus \alpha _i, & \quad \text {if } \alpha _i \oplus \beta _i = 0, \\ \left. \begin{array}{ll} x_i \oplus c_i = \texttt{xor}(\alpha , \beta , \gamma )_{i+1} \oplus \alpha _i, &\quad \text {if } \alpha _i \oplus \texttt{xor}(\alpha , \beta , \gamma )_i = 0, \\ y_i \oplus c_i = \texttt{xor}(\alpha , \beta , \gamma )_{i+1} \oplus \beta _i, &\quad \text {if } \alpha _i \oplus \texttt{xor}(\alpha , \beta , \gamma )_i = 1, \end{array}\right\} &\quad \text {if } \alpha _i \oplus \beta _i = 1, \\ \end{array}\right\} $$

where \(c_i\) is the i-th carry bit, \(x \boxplus y = z\), \(\texttt{eq} (a, b, d) = (\lnot a \oplus b) \wedge (\lnot a \oplus d)\) (i.e., \(\texttt{eq} (a, b, d) = 1\) if and only if \(a = b = d\)), and \(\texttt{xor}(a, b, d) =a\oplus b \oplus d\).

In other words, at bit positions i and \(i+1\), a valid difference tuple \((\alpha _{i+1,i}, \beta _{i+1,i}, \gamma _{i+1,i})\) that satisfies \(\texttt{eq} (\alpha _i, \beta _i, \gamma _i) = 0\) imposes a 1-bit linear constraint on the tuple \((x_i, y_i, c_i)\). As \(c_i\) is determined by lower bits, the freedom for conforming to the constraint comes exclusively from the i-th bits of x and y, independent of constraints at other bit positions. Accordingly, the constraints on (\(x_i\), \(y_i\)), (\(x_i\), \(c_i\)), or (\(y_i\), \(c_i\)) as listed in Observation 1 are necessary and sufficient. Therefore, when the constraint at a bit position is fulfilled, the conditional probability \(\tilde{p}\) of a differential whose unconditional probability is p should be calculated as \(2\cdot p\); when unfulfilled, it is 0. In comparison, the conditional probability for random pairs is still at most \(2^{-n}\). Hence, leveraging conditional probability for classification amplifies the advantage.
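As a quick sanity check of this conditional-probability effect, the following brute-force sketch uses a toy 8-bit modular addition and an illustrative differential exhibiting case Cyc at bit 0 (so the constraint depends only on \(y_0\)); the printed numbers match the predicted "doubled or zero" behavior. The chosen differential is not taken from the paper.

```python
N = 8
M = (1 << N) - 1
alpha, beta, gamma = 0x01, 0x00, 0x03  # illustrative differential through an 8-bit modular addition

def conforms(x, y):
    return (((x + y) & M) ^ (((x ^ alpha) + (y ^ beta)) & M)) == gamma

avg = sum(conforms(x, y) for x in range(1 << N) for y in range(1 << N)) / (1 << (2 * N))
p_y0_is_1 = sum(conforms(x, 0x01) for x in range(1 << N)) / (1 << N)
p_y0_is_0 = sum(conforms(x, 0x00) for x in range(1 << N)) / (1 << N)
print(avg, p_y0_is_1, p_y0_is_0)  # 0.25, 0.5, 0.0: doubled when the constraint holds, zero otherwise
```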

To clarify when the fulfilment of the constraints at the last \(\boxplus \) can be effectively predicted, we catalog the cases from Observation 1 in Table 4, naming them Cxy\(_{(i+1, i)}\), Cxc\(_{(i+1, i)}\), and Cyc\(_{(i+1,i)}\). As analyzed above, in Speck32/64 's last \(\boxplus \), among the tuple (x, y, c) (with \(c = z \oplus x \oplus y\) and unknown z), only y is known. Hence, exploiting knowledge of y requires examining bit positions whose differential constraints fall under case Cyc\(_{(i+1, i)}\) in Table 4.

Table 4. Necessary and sufficient conditions for a one-bit difference from Observation 1

In the Cyc\(_{(i+1, i)}\) case, the constraint is on \(y_i \oplus c_i\). While \(c_i\) may seem unknown, it is determined by lower bits: \(c_i = x_{i-1}y_{i-1} \oplus (x_{i-1} \oplus y_{i-1})c_{i-1}\). The value of \(c_i\) might be inferred if the \((i-1)\)-th bit differences meet the condition \(\texttt{eq} (\alpha _{i-1}, \beta _{i-1}, \gamma _{i-1}) = 0\), as per Observation 1. For example, when \({\left\{ \begin{array}{ll} (\alpha _{i}, \beta _{i}, \gamma _{i}) = (0, 1, 0),\\ (\alpha _{i-1}, \beta _{i-1}, \gamma _{i-1}) = (1, 1, 0) \end{array}\right. }\), one knows that \({\left\{ \begin{array}{ll} \texttt{eq} (\alpha , \beta , \gamma )_{i-1} = 0, \\ \alpha _{i-1} \oplus \beta _{i-1} = 0, \\ \texttt{xor}(\alpha , \beta , \gamma )_{i} \oplus \alpha _{i-1} = 0. \end{array}\right. } \)

From Table 4, one has \(x_{i-1} \oplus y_{i-1} = 0\). Thus, \(c_i = x_{i-1}y_{i-1} \oplus (x_{i-1} \oplus y_{i-1})c_{i-1} = y_{i-1}\). Therefore, \(y_i \oplus c_i = y_i \oplus y_{i-1}\). As a consequence, one can predict the fulfilment of constraint in case Cyc\(_{(i+1, i)}\) by observing whether \(y_i \oplus y_{i-1} = \texttt{xor}(\alpha , \beta , \gamma )_{i+1} \oplus \beta _i\). Table 5 lists more cases where \(c_i\) might be known.

Table 5. Cases for deducing the i-th carry bit \(c_i\)

Incorporating observations from Table 4 and Table 5, one gets Table 6, which lists various cases where the knowledge of y can be used to determine the satisfaction of differential constraints.

Table 6. Cases where the knowledge on y can be used to check the fulfilment of the differential constraints

Note that apart from the general cases (C3 and C4) at the i-th bit, special cases (C1 and C2) emerge at the two least significant bits due to the carry bit \(c_0\) being 0. For example,

  1. at the 0th bit position, observing \(\beta _0 = 0\) and \(\gamma _0 = 1\) determines \(\alpha _0 = 1\) based on Alg. 3 in [4]. From case Cyc\(_{(i+1, i)}\) in Table 4 and given \(c_0 = 0\), one knows that \(\texttt{xor}(\alpha , \beta , \gamma )_1 \oplus \beta _0 = y_0 \oplus c_0 = y_0\);

  2. at the 1st bit position, \(c_1 = x_0y_0 \oplus (x_0 \oplus y_0)c_0 = x_0y_0\). Given an observed \(y_0=0\), one knows \(c_1 = 0\). Consequently, in case Cyc\(_{(2,1)}\) and \(y_0=0\), one knows \(\texttt{xor}(\alpha , \beta , \gamma )_{2} \oplus \beta _1 = y_1 \oplus c_1 = y_1\);

  3. in general case C3, based on Table 5, \(c_i\) is determined as \(y_{i-1}\), leading to the use of \(y_i \oplus y_{i-1}\);

  4. in general case C4, applying Table 5 to the \((i-1, i-2)\)-th bit position, it is inferred that \(c_i = c_{i-1} = y_{i-2}\), leading to the use of \(y_i \oplus y_{i-2}\);

  5. for cases where \(c_{i-1}=c_{i-2}\), one can further observe differences at the \((i-2)\)-th bit position and continue deducing \(c_{i-2}\) by observing bit differences at the \((i-3)\)-th position.

Table 7 lists some concrete examples of differential patterns where the observation of y enables prediction of whether differential constraints are met.

Remark 2

These constraints on values for valid differential propagation resonate with established concepts. Specifically, insights derived from Table 6 align with findings on multi-bit constraints from [18, 19], quasi-differential trails in [8], and extended differential-linear approximations in [11]. Table 7 exhibits the correspondence between examples of cases in Table 6 and these established concepts. For instance, given a differential propagation \((\alpha _{i+1,i,i-1}, \beta _{i+1,i,i-1} \mapsto \gamma _{i+1,i,i-1}) = (\texttt {*01}, \texttt {*11} \mapsto \texttt {*00}) \) (for \(0 < i < n - 1\)),

  1. using the 1.5-bit constraints concept and the finite state machines representing the differential properties of modular addition from [18, 19], one can get a new constraint and refine the propagation to (where the notations \(\{\texttt {-}, \texttt {x}, \texttt {>}, \texttt {<}, \texttt {=}, \texttt {!}\}\) are explained below Table 7); more generally, C3 cases correspond to the 1.5-bit constraints \(\{\texttt {>}, \texttt {<}, \texttt {=}, \texttt {!}\}\) in [18, 19];

  2. using the quasi-differential trail concept from [8], the differential trail \((\texttt {001}, \texttt {011} \mapsto \texttt {000})\) comprises a non-trivial quasi-differential trail with a mask of \((\texttt {000}, \texttt {011} \mapsto \texttt {000})\). The non-trivial quasi-differential trail has correlation \(-2^{-1}\) (i.e., additional weight of 0). Consequently, the “fixed-y” probability of this differential trail is \((1 - (-1)^{y_i \oplus y_{i-1}}) \cdot 2^{-1}\), i.e., the probability equals 1 when \(y_i \oplus y_{i-1} = 1\) and 0 in the opposite case;

  3. using the extended differential-linear connectivity table (EDLCT) concept from [11], assessing the constraint aligns with gauging the bias of the linear approximation that corresponds to selecting bits [\(x_{i+1}\), \(y_{i+1}\), \(z_{i+1}\), ] and [\(x'_{i+1}\), \(y'_{i+1}\), \(z'_{i+1}\), ].

  As noted in [7], \(\mathcal{N}\mathcal{D}\)s rely on differential-linear (DL) properties. We note that pure DL properties do not provide additional information beyond the full DDT; the differential-linear distribution can be directly derived from the full differential distribution. It is the extended differential-linear distribution [11] (which includes the selection of ciphertext values apart from differences) that contains additional information.

Table 7. Concrete examples of differential patterns where one can predict the fulfilment of the differential constraints by observing the value of y

To directly exploit these observations for r-round Speck32/64, a prerequisite is to effectively predict the input difference \(\alpha \) at the last \(\boxplus \), which equals \({({(\delta _R^{r-2})}^{\lll 2} \oplus \delta _R^{r-1})}^{\ggg 7}\). Given the known \(\delta _R^{r-1}\) from the r-round outputs, the focus shifts to predicting \({(\delta _R^{r-2})}^{\lll 2}\). Notably, for \(r \le 7\) and input difference \(\texttt {(0040, 0000)}\), some bits of \({(\delta _R^{r-2})}^{\lll 2}\) exhibit bias, as detailed in Table 8, enabling predictions of \(\alpha \) for positive samples.

Table 8. Bit bias towards ‘0’ of \({(\delta _R^{r-2})}^{\lll 2}\) for \(4\le r \le 7\), where the input difference of the plaintext is (0040,0000). A positive (resp. negative) value indicates a bias towards ‘0’ (resp. ‘1’).

A Simple Procedure to Improve the DDT-Based Distinguisher. To improve a DDT-based distinguisher for an r-round Speck32/64 using its \(\textrm{DDT} _{\texttt {(0040, 0000)}}\), we proceed as follows, resulting in distinguishers named \(\mathcal {Y}\mathcal {D}^{\textsc {Speck}_{rR}} \):

  1. Compute the bias (towards 0) of each bit of \({{(\delta _R^{r-2})}^{\lll 2}}\);

  2. Predict bit values for \({{(\delta _R^{r-2})}^{\lll 2}}\) based on their biases: assign a value of 0 if the bias is \(\ge 0\) and 1 otherwise;

  3. Define the absolute bias of the i-th bit of \({({(\delta _R^{r-2})}^{\lll 2})}^{\ggg 7}\) as \(\epsilon _{\alpha }(i)\);

  4. For each output pair of r-round Speck32/64, use Alg. 1 to predict its classification.

[Algorithm 1]

Results of Improving the DDT-Based Distinguisher. Table 9 presents the performance of \(\mathcal {Y}\mathcal {D}^{\textsc {Speck}_{rR}} \) distinguishers, derived from the described enhancement of \(\mathcal {D}\mathcal {D}^{\textsc {Speck}_{rR}} \). For rounds \(4 \le r \le 7\), \(\mathcal {Y}\mathcal {D}^{\textsc {Speck}_{rR}} \) typically shows improvement. In contrast, when applying a similar method to adjust the \(\mathcal {N}\mathcal {D}^{\textsc {Speck}_{rR}}\) score Z (converting the score Z to a probability p using \(p = Z/(1-Z) \cdot 2^{-n}\)), the accuracy does not improve. It is unchanged for \(\mathcal {N}\mathcal {D}^{\textsc {Speck}_{4R}}\) and marginally degrades for rounds \(5 \le r \le 7\) since the threshold \(\tau \) is set below 0.5. This suggests that the additional information useful for improving DDT-based distinguishers does not help improve the \(\mathcal {N}\mathcal {D}\)s; thus, the \(\mathcal {N}\mathcal {D}\)s might have already maximally utilized this information. Thus, we conclude as follows.

Conclusion 3

By utilizing conditional differential distributions when the input and/or output values of the last nonlinear operation are observable, a distinguisher can surpass pure DDT-based counterparts. Accordingly, if these conditional distributions differ greatly from the averaged differential distribution, and the satisfaction of the conditions is either observable or effectively predictable, then r-round \(\mathcal {N}\mathcal {D}\)s can outperform r-round DDT-based distinguishers.

For Speck, one of the two inputs of the last non-linear operation (\(\boxplus \)) is observable. If conditioned on this input, the conditional differential distribution can diverge significantly from the averaged one. Therefore, an optimal distinguisher can obviously outperform a pure DDT-based counterpart. A similar analysis applies to Simon. In Simon, the values that go through the last nonlinear operation are fully observable. Consequently, it is interpretable that in the case of Simon, an r-round \(\mathcal {N}\mathcal {D}\) can achieve an accuracy close to the \((r-1)\)-round DDT [3].

This conclusion can be further supported by the following experimental result: In a modified r-round Speck32/64 where the last key XORing is omitted, revealing both z and y (equating to full awareness of the satisfaction of the last round’s differential constraints given a predictable input difference \(\alpha \)), a well-trained r-round \(\mathcal {N}\mathcal {D}\) achieves an accuracy close to the \((r-1)\)-round \(\mathcal {D}\mathcal {D}\). Interestingly, subsequent observations on \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s reinforce our conclusion, while the conclusion itself aids in interpreting those observations.

3.3 Distinguishers Using Systematic Computation of Conditional Differential Probability Under Known y

The simple process in Algorithm 1 is fast, but it requires evaluating the bias of each bit of the difference on the right branch of round \(r - 2\) to estimate the input difference \(\alpha \) for the last modular addition. The differential probability can only be adjusted if the estimated bias of the corresponding bit of \(\alpha \) exceeds a certain threshold. As a result, it does not make the most of the information in y. Therefore, we further designed a process, described in Algorithm 2, to systematically calculate the differential probability conditioned on the known value of y and predict based on the \((r-1)\)-round DDT.

In essence, the systematic process involves using \(\beta \), \(\gamma \), and y to determine all possible \(\alpha \)s and the conditional differential probabilities of the last round. It combines this information with the probabilities of the previous \((r-1)\) rounds to calculate the conditional differential probability for r rounds under the known value of y. Finally, it uses the systematically computed conditional probability for prediction.

More concretely, in the process, we have the following procedures:

  1. Precomputation: We generate three b-bit conditional DDTs, denoted as \(\textbf{A}_0\), \(\textbf{A}_{\textrm{next}}\), and \(\textbf{A}_{\textrm{next}}^{c}\), of the single modular addition operation \(\boxplus \). These resemble Dinur's b-bit filter in [12] (a rough brute-force sketch of \(\textbf{A}_0\) is given after this list):

     (a) \(\textbf{A}_0\) lists all valid b-bit values of \(\alpha \) with their associated probability pr for given b-bit inputs \(\beta \), \(\gamma \), and y at the first b least significant bits (LSB), where the first carry bit is zero.

     (b) \(\textbf{A}_{\textrm{next}}\) lists all valid 1-bit values of \(\alpha _{\textrm{next}}\) with their associated probability pr for given b-bit inputs \(\beta \), \(\gamma \), y, and a \((b-1)\)-bit \(\alpha \) at intermediate consecutive b bits where the LSB of the carry is undetermined.

     (c) \(\textbf{A}_{\textrm{next}}^{c}\) is similar to \(\textbf{A}_{\textrm{next}}\) but serves scenarios with known carry LSBs.

  2. Initialization: From a received ciphertext pair, we derive the output difference \(\gamma \), input difference \(\beta \), and input value y; we initialize the to-be-calculated probability p and the last round's probability factor q with 0 and 1, respectively.

  3. Generate candidate LSB b bits of \(\alpha \):

     (a) Using table \(\textbf{A}_0\), we obtain candidates for the LSB b bits of \(\alpha \) based on the LSB b bits of \(\beta \), \(\gamma \), and y, and update q with the associated pr.

     (b) For each valid LSB b bits of \(\alpha \), we invoke ‘ComputeCarryNextBit’ to determine the carry bits wherever possible according to Table 5.

  4. Iterative Calculation: For each valid LSB b bits of \(\alpha \),

     (a) starting from the \((b-1)\)-th bit, we invoke ‘ComputeAlphaPrNextBit’ to sequentially determine \(\alpha \)'s later bits and the respective augmentation of the probability factor q; alongside, we use ‘ComputeCarryNextBit’ to determine the carry bits wherever possible, preparing them to be used to derive later bits of \(\alpha \) in case of Cyc or to look up \(\textbf{A}_{\textrm{next}}^{c}\).

     Within procedure ‘ComputeAlphaPrNextBit’:

     (a) Once \(\alpha \) is fully assigned, we calculate the output difference of the penultimate round and use it to look up the \((r-1)\)-round DDT. The resultant value, multiplied by the last round's probability factor q, yields a contribution term to the final probability p.

     (b) At an intermediate bit position i, when the three input/output bit differences are equal, the subsequent \(\alpha \) bit is directly determined.

     (c) When the input/output bit differences at positions \((i+1, i)\) conform to the Cyc\(_{(i+1, i)}\) condition with a determined value for \(c_{i}\), the subsequent \(\alpha \) bit is deduced using \(y_i \oplus c_i\). After determining \(\alpha _{i+1}\), we invoke ‘ComputeCarryNextBit’ to determine the carry bit \(c_{i+1}\) wherever possible.

     (d) Otherwise (in the absence of conformity or a determined \(c_{i}\) value), \(\alpha _{i+1}\) is enumerated using either \(\textbf{A}_{\textrm{next}}\) or \(\textbf{A}_{\textrm{next}}^{c}\), depending on whether the carry bit b positions below the \((i+1)\)-th bit is determined.

     (e) After obtaining \(\alpha _{i+1}\) and its probability pr, we continue to determine the next \(\alpha \) bit, updating the probability factor by multiplying q by pr.
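The following is a rough brute-force sketch of how a table in the spirit of \(\textbf{A}_0\) could be tabulated for a toy chunk size b = 4; the exact table layout and probability convention are not specified above, so this interpretation (probability taken over the unknown input x, with zero initial carry) is an assumption, not the authors' implementation.

```python
from collections import defaultdict

b = 4
Mb = (1 << b) - 1

def build_A0():
    # A0[(beta, gamma, y)] -> list of (alpha, pr) over the b least significant bits,
    # where pr is the fraction of x values conforming to the b-bit differential.
    A0 = defaultdict(list)
    for beta in range(1 << b):
        for gamma in range(1 << b):
            for y in range(1 << b):
                for alpha in range(1 << b):
                    hits = sum(
                        (((x + y) & Mb) ^ (((x ^ alpha) + (y ^ beta)) & Mb)) == gamma
                        for x in range(1 << b)
                    )
                    if hits:
                        A0[(beta, gamma, y)].append((alpha, hits / (1 << b)))
    return A0
```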

The resulting procedure is slower than the simple one; however, the resulting distinguishers, named “\(\mathcal {A}\mathcal {D}_{\textbf{YD}}\) ”, have accuracy exceeding not only that of the DDT-based distinguishers \(\mathcal {D}\mathcal {D}\)s but also that of the neural distinguishers \(\mathcal {N}\mathcal {D}\)s, and comparable to the \((r-1)\)-round DDT-based key-averaging distinguishers \(\mathcal {A}\mathcal {D}_{\textbf{KD}}\)s [2] (refer to Table 9 and Table 20 in [4]), providing an accuracy benchmark for \(\mathcal {N}\mathcal {D}\)s.

[Algorithm 2]
Table 9. Performance of the improved DDT-based distinguishers (\(\mathcal {Y}\mathcal {D}\)s and \(\mathcal {A}\mathcal {D}_{\textbf{YD}}\)s) on Speck32/64 and comparisons with pure DDT-based distinguishers (\(\mathcal {D}\mathcal {D}\)s), neural distinguishers (\(\mathcal {N}\mathcal {D}\)s), and DDT-based key-averaging distinguishers (\(\mathcal {A}\mathcal {D}_{\textbf{KD}}\)s)

3.4 Discussion on \(\mathcal {N}\mathcal {D}\) ’s Advantages

Based on the above observations and experiments, we can conclude that \(\mathcal {N}\mathcal {D}\) 's advantage over pure differential-based distinguishers comes from exploiting the conditional differential distribution under the partially known value, obtained from the ciphertexts, of the input to the last non-linear operation. More specifically, \(\mathcal {N}\mathcal {D}\)s exploit the correlation between the ciphertexts' partial value, the ciphertext pair's differences, and the intermediate states' differences. In particular, when some of the inputs and outputs of the last-round nonlinear operation are known (i.e., not XORed with independently randomized key bits), a distinguisher can achieve higher distinguishing accuracy than an r-round pure differential-based distinguisher.

Table 10. The accuracy of differential-neural distinguishers using distinct differences obtained by (0040, 0000) after i rounds of propagation. Prob. represents the probability of the highest probability differential (0040,0000) \(\rightarrow \) “Diff.”.

These findings apply not only to Speck but also to other block ciphers, such as Simon and Gift (refer to Appendix D.1 in [4]), and demonstrate the ability of neural networks to capture and utilize complex relationships between ciphertext values and intermediate state differences. Note that the neural distinguishers are not aware of the specific details of the ciphers, including their non-linear components and structure. Therefore, these neural distinguishers can be used for ciphers that have unknown components.

On the Performance of Various Distinguishers. Experiments showed that \(\mathcal {N}\mathcal {D}\)s can be more efficient while achieving comparable accuracy to sophisticated manual methods (Alg. 2). Please refer to Table 9 for detailed benchmarks. Note that in the benchmarks listed in Table 9, all \(\textrm{DDT}\)-based distinguishers are implemented in C++, whereas \(\mathcal {N}\mathcal {D}\)-based distinguishers are implemented in Python with TensorFlow. Although C++ implementations might be inherently faster than their Python counterparts, \(\mathcal {N}\mathcal {D}^{\textsc {Speck}_{*R}}\)s in Python are still more efficient than \(\mathcal {A}\mathcal {D}_{\textbf{YD}}^{{\textsc {Speck}}_{*R}}\) and \(\mathcal {A}\mathcal {D}_{\textbf{KD}}^{{\textsc {Speck}}_{*R}}\) in C++ (all restricted to run in a single CPU thread). Therefore, we can conclude that the neural network-based distinguishers provide a good trade-off between efficiency and accuracy.

4 Insights and Improvements on Training Differential-Neural Distinguisher

4.1 Relations Between Distinguisher Accuracy and Differential Distribution

Traditional differential cryptanalysis predominantly utilizes high-probability differentials as distinguishers. However, differential-neural cryptanalysis exploits all output differences for distinguishing while fixing the input difference of plaintext pairs. In EUROCRYPT 2021, Benamira et al. [7] argued that the differential-neural distinguisher inherently builds a very good approximation of the DDT during the learning phase.

Our study delves into the relation between the accuracy of the differential-neural distinguisher and the differential distribution of ciphertext pairs. We modify the input difference of plaintext pairs, inspired by Gohr’s staged training method [15]. In [15], while the basic training method can produce a valid 7-round distinguisher, an 8-round distinguisher must be trained using the staged training approach. The core of the staged training method is training a pre-trained 7-round distinguisher to learn 5-round Speck32/64 ’s output pairs with the input difference (8000,804a) (the most likely difference to appear three rounds after the input difference (0040,0000)). Employing such plaintext pairs aims to concentrate the difference distribution of ciphertext pairs, escalating the output difference’s likelihood and simplifying the distinguisher’s learning task.

In our work, we first introduce the 4-round highest-probability differential trail starting from (0040,0000):

$$\texttt {(0040,0000)} \rightarrow \texttt {(8000,8000)} \rightarrow \texttt {(8100,8102)} \rightarrow \texttt {(8000,840a)} \rightarrow \texttt {(850a,9520)}$$

Our experiments (see Table 10) initially employ a 4-round high-probability differential trail starting from (0040,0000), leading to (850a,9520).

By default, we use (0040,0000) as the input difference of the plaintext pair to generate the ciphertext pair. Here, in Table 10, we use the difference of the highest probability of (0040,0000) after \( i \ (1 \le i \le 4) \) rounds of propagation as the input difference of the plaintext pair, respectively.

From Table 10, we can observe that the larger i is, the higher the accuracy of the differential-neural distinguisher. As i increases, the difference distribution in the ciphertext becomes more concentrated, and the probability of each difference increases. Therefore, the more distinguishable the ciphertext pairs are from random pairs, the higher the accuracy achieved by the differential-neural distinguisher.

To more comprehensively demonstrate the relation between the accuracy of the differential-neural distinguisher and the differential distribution of the ciphertext pairs, we conducted some experiments from another perspective. We fixed the number of rounds of the differential but chose multiple 2-round differences with gradually decreasing probabilities. In Table 11, we notice that the higher the probability of the fixed differential, the higher the accuracy of the resulting differential-neural distinguisher. In other words, a lower probability means that after i rounds of encryption, the differential distribution of the ciphertext is more dispersed and more difficult for the neural network to learn, resulting in a decrease in both the number of rounds and the accuracy of the differential-neural distinguisher.

Table 11. The accuracy of the differential-neural distinguisher using distinct differences obtained from (0040, 0000) after 2 rounds of propagation. Prob. represents the probability of the differential (0040,0000) \(\rightarrow \) “Diff.”. Round \(2+i\) indicates that the positive samples of the training set are the ciphertext pairs obtained by encrypting, for i rounds, plaintext pairs that satisfy this difference.

In conclusion, controlling differential propagation is imperative to enhance the differential-neural distinguisher's accuracy and its number of rounds. We thus propose a method to control the differential propagation and reduce the diffusion of features, thereby increasing the number of rounds of the differential-neural distinguisher. However, before the formal introduction, we introduce one method that can simplify the training process of high-round distinguishers.

4.2 Freezing Layer Method

In existing experiments on Speck32/64, especially with an input difference of (0040,0000), there has been a notable limitation. Researchers have been able to directly train a differential-neural distinguisher for up to only 7 rounds. Direct training for higher rounds from scratch has been challenging. A potential avenue that has garnered attention is the utilization of various network fine-tuning strategies. Specifically, continuing the training phase from pre-trained models has been proposed to potentially overcome these limitations and expand the distinguisher’s round capability. Examples include the staged training method in [15] and the staged pipeline method in [6].

The inability to directly train the 8-round distinguisher likely stems from the feature diffusion associated with the input difference (0040,0000) over increasing rounds. This makes the 8-round features considerably more challenging for the distinguisher to learn directly from limited data, as compared to lower rounds. One approach is to either mitigate feature diffusion or narrow the distinguisher's solution space. While a technique to constrain feature diffusion is discussed in the subsequent section, here we employ a classic network fine-tuning strategy, the freezing layer method, to limit the solution space.

Our distinguishers consist of two parts: the convolutional layers and the fully connected layers. In the field of artificial intelligence, the convolutional layers are viewed as a feature extractor, while the fully connected layers are viewed as a classifier. We argue that the feature extractor can be reused, and that the classifiers are relatively similar in adjacent rounds. Therefore, to train an 8-round distinguisher for Speck32/64, we can simply load a well-trained 7-round model and freeze all its convolutional layers, meaning that only parameters in the fully connected layers can be updated. Then, we obtain an 8-round distinguisher with accuracy identical to the ones in [6, 15], keeping all hyperparameters in the training process unchanged.
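A minimal sketch of the freezing layer method in Keras follows; the file name and hyperparameters are illustrative assumptions, and `make_dataset` refers to a data-generation routine such as the one sketched in Sect. 2.3.

```python
import tensorflow as tf

# Load a pre-trained 7-round distinguisher (hypothetical file name).
model = tf.keras.models.load_model("nd_speck32_7round.h5")

# Freeze the feature extractor: only the fully connected layers remain trainable.
for layer in model.layers:
    if isinstance(layer, tf.keras.layers.Conv1D):
        layer.trainable = False

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["acc"])

# X8, Y8: 8-round samples and labels (bit-decomposition of the words into the
# network's input format is omitted here).
X8, Y8 = make_dataset(10**7, n_rounds=8)
model.fit(X8, Y8, epochs=20, batch_size=5000, validation_split=0.1)
```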

Relative to the staged training method [15], our approach maintains the same hyperparameters and does not require more samples in the final stage. In comparison with the method in [6], we only need two training rounds instead of multiple rounds in a row as required by the simple training pipeline in [6]. Besides, the simple training pipeline [6] did not produce \(\mathcal {N}\mathcal {D}\)s with the same accuracy as Gohr's on 8-round Speck32/64; it needs a further polishing step to achieve similar accuracy, demanding more time and data. Our freezing layer method also speeds up the training process due to the reduction of trainable parameters. Therefore, we recommend trying the freezing layer method once the number of rounds of the distinguisher is too high to train directly.

5 Related-Key Differential-Neural Cryptanalysis

The \(\mathcal{N}\mathcal{D}\) explainability concept serves as a fundamental theoretical underpinning when aiming to enhance and leverage its capabilities. With the outcome being that \(\mathcal{N}\mathcal{D}\)s can effectively capture additional features and provide a better trade-off between efficiency and accuracy, there is substantial motivation for us to continue refining and exploiting their potential.

In this section, we introduce the related-key into differential-neural cryptanalysis, enabling control over differential propagation and facilitating the training of high-round \(\mathcal{N}\mathcal{D}\)s. Furthermore, we enhance the DDT-based distinguisher under the \(\mathcal{R}\mathcal{K}\) setting by employing the analytical methods and conclusions outlined in Sect. 3. As a result of these advancements, we successfully implement a 14-round key recovery attack for Speck32/64 using the proposed \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s.

5.1 Related-Key Differential-Neural Distinguisher for Speck32/64

Here we present the related-key differential-neural distinguishers on Speck32/64 obtained in this work.

The Choice of the Input Difference. The input difference is a crucial and central component of differential-neural cryptanalysis, and numerous papers delve into the study of the input difference, such as [3, 6, 15, 16, 21]. To maximize the number of rounds for both \(\mathcal{N}\mathcal{D}\) and \(\mathcal{C}\mathcal{D}\), and to make the weak key space as large as possible so as to perform the longest key recovery attack, we use the SMT-based method to search for appropriate \(\mathcal{R}\mathcal{K}\) differentials or differential trails. It is important to note that the largest weak key space does not necessarily come with the longest \(\mathcal{N}\mathcal{D}\) or \(\mathcal{C}\mathcal{D}\), thus requiring a compromise between the three factors. In this paper, the choice of the best input difference is given under different compromises. Table 12 lists the \(\mathcal{R}\mathcal{K}\) differential trails used to constrain the key space in Speck32/64, where we label each distinguisher with an ID. Specifically, \(\text {ID}_1\) is used to restrict the weak key space for the 13-round attack, while \(\text {ID}_2\) and \(\text {ID}_3\) are used for the 14-round attack. Note that part of the \(\text {ID}_2\)/\(\text {ID}_3\) \(\mathcal{R}\mathcal{K}\) differences (2nd to 11th round) are the same as the 10-round optimal \(\mathcal{R}\mathcal{K}\) differential trail for Speck32/64 given in Table 9 of [23]. In addition, round-reduced versions of the trails are used to restrict the weak key space for shorter rounds; e.g., \(\text {ID}_2\) and \(\text {ID}_3\) are used to restrict the weak key space for 13 rounds starting from the second round.

Table 12. Related-key differential trails used to constrain the key space in Speck32/64 where we label each distinguisher with an ID. For example, \(\text {ID}_1\) represents the 13-round \(\mathcal{R}\mathcal{K}\) differential trail for the key schedule algorithm with \((\varDelta l^2, \varDelta l^1, \varDelta l^0, \varDelta k^0) = \texttt {(0044,0011,4000,0080)}\)

Network Architecture. Given the success of the neural network combining Inception blocks with a residual network on Speck, Simon, and Simeck [25, 26], as well as its superior performance as a differential-neural distinguisher, we use the neural network proposed in [26] to train the \(\mathcal{R}\mathcal{K}\) differential-neural distinguishers, with some modifications to the architecture. In deep learning, odd numbers such as 3, 5, and 7 are commonly used as convolution kernel sizes. However, guided by the cyclic shifts in the round function of Speck32/64, we choose 2 and 7 as kernel sizes; using 2 instead of 3 also makes the model’s accuracy converge faster. In [26], the kernel size keeps growing as the depth of the residual network increases. Enlarging the kernel to widen the network’s receptive field is reasonable, but it cannot grow indefinitely; we therefore limit the kernel size to at most 7.
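To illustrate the structure described above, here is a rough Keras sketch of an Inception-style residual block using kernel sizes 2 and 7. It is an illustrative approximation only: the number of blocks, filter counts, and input formatting are assumptions, not the exact architecture of [26].

```python
import tensorflow as tf
from tensorflow.keras import layers

def inception_residual_block(x, filters=32):
    # Two parallel 1-D convolutions with kernel sizes 2 and 7, concatenated.
    b1 = layers.Conv1D(filters, kernel_size=2, padding="same", activation="relu")(x)
    b2 = layers.Conv1D(filters, kernel_size=7, padding="same", activation="relu")(x)
    y = layers.Concatenate()([b1, b2])
    y = layers.BatchNormalization()(y)
    # 1x1 convolution to match the channel count of the residual connection.
    y = layers.Conv1D(x.shape[-1], kernel_size=1, padding="same", activation="relu")(y)
    return layers.Add()([x, y])

def build_toy_nd(num_blocks=5, words=4, word_size=16, filters=32):
    # Input: one ciphertext pair, viewed as `words` 16-bit words split into bits.
    inp = tf.keras.Input(shape=(words * word_size,))
    x = layers.Reshape((words, word_size))(inp)
    x = layers.Permute((2, 1))(x)                      # word_size positions, words channels
    x = layers.Conv1D(filters, kernel_size=1, activation="relu")(x)
    for _ in range(num_blocks):
        x = inception_residual_block(x, filters)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(inp, out)
```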

The Training of Related-Key Differential-Neural Distinguisher. This work still uses the basic training method to train short-round distinguishers. When the basic training method fails, we train the r-round distinguisher from the \((r-1)\)-round distinguisher using the freezing layer method. Please refer to Appendix F in [4] for the detailed training procedure.

Performance Evaluation of the Distinguisher. In artificial intelligence, a model’s accuracy is the most critical evaluation indicator. In differential-neural cryptanalysis, whether a guessed key is correct is judged based on the score of the distinguisher. Therefore, we evaluate the performance of the differential-neural distinguishers in terms of both accuracy and score.

  • Test accuracy. We summarize the accuracy of the differential-neural distinguishers in Table 13. The 8- and 9-round distinguishers were trained using the basic training method, while the 10-round distinguishers were trained using the freezing layer method. For more insight on related-key differential-neural distinguishers, please refer to Appendix F.2 in [4].

    Table 13. The summary of related-key differential-neural distinguishers on Speck32/64, where the plaintext difference is (0000,0000).
  • Wrong key response profile (WKRP). In [15], the key search policy relies on the observation that a distinguisher’s response to wrong-key decryption varies with the bitwise difference between the guessed and real key. Instead of exhaustive trial decryption, it recommends specific subkeys and scores them. Figure 3 shows the mean response for varying Hamming distances between guessed and actual keys for \(\text {ID}_1\). Notably, high scores emerge when the difference between the keys is small, especially when the difference belongs to {16384, 32768, 49152}. This indicates that errors in the 14th and 15th bits of the subkey barely affect the score, allowing the key guessing space to be reduced; this accelerated key recovery in [15]. For the WKRPs of \(\text {ID}_2\) and \(\text {ID}_3\), see Appendix B.2 in [4].

Fig. 3. Wrong key response profile of \(\text {ID}_1\)
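As a rough illustration of how such a profile can be estimated (following the procedure of [15], but simplified), the sketch below decrypts one round under every key \(k \oplus \delta\) and records the mean distinguisher response per difference \(\delta\). The trained model nd and the input-formatting helper to_input are assumed to exist, and the ciphertext words are assumed to be 16-bit values stored in uint32 NumPy arrays.

```python
import numpy as np

MASK = 0xFFFF
def ror(x, r): return ((x >> r) | (x << (16 - r))) & MASK
def rol(x, r): return ((x << r) | (x >> (16 - r))) & MASK

def dec_one_round(cl, cr, k):
    # Inverse of the Speck32/64 round function.
    y = ror(cl ^ cr, 2)
    x = rol(((cl ^ k) - y) & MASK, 7)
    return x, y

def wrong_key_response_profile(c0l, c0r, c1l, c1r, true_key, nd, to_input):
    # c0*/c1*: arrays of ciphertext pairs obtained under the same last-round key.
    means = np.zeros(2**16)
    for delta in range(2**16):
        a = dec_one_round(c0l, c0r, true_key ^ delta)
        b = dec_one_round(c1l, c1r, true_key ^ delta)
        means[delta] = nd.predict(to_input(a, b), verbose=0).mean()
    return means
```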

Table 14. Experiments detailing the information harnessed by \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s using the 9-round \(\text {ID}_{(2,\texttt {9182})}\), \(\text {ID}_{(2,\texttt {9382})}\), and \({\text {ID}_{(3,\texttt {9082})}}\), with settings similar to those in Table 2.

On \({\boldsymbol{\mathcal {R}}}{\boldsymbol{\mathcal {K}}}\text {-}{\boldsymbol{\mathcal {N}}}{\boldsymbol{\mathcal {D}}}{} \mathbf{'s}\) Explainability. Beyond constructing and comparing various \(\mathcal{R}\mathcal{K}\) distinguishers (see Appendix F.2 in [4]), we further undertook experiments analogous to Gohr’s \(\texttt {aaaa}\)-blinding experiment (a generic blinding-style check is sketched after the list below). Some \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\)s behaved similarly to single-key \(\mathcal {N}\mathcal {D}\)s, while others did not; see \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(2,\texttt {9182})}}}\) and \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(2,\texttt {9382})}}}\) in Table 14 as examples of the former and latter case, respectively, where the differential trail \(\text {ID}_{(2,\texttt {9182})}\) differs from \(\text {ID}_{(2,\texttt {9382})}\) only at the last round key, and \(\text {ID}_{(2,\texttt {9382})}\) is \(\text {ID}_2\) from round 4 to 12. Notably, the behavior of \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(3,\texttt {9082})}}}\) presented intriguing phenomena (\(\text {ID}_{(3,\texttt {9082})}\) is \({\text {ID}_{3}}\) from round 4 to 12):

  1. \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(3,\texttt {9082})}}}\) performed differently on \(\texttt {Set-1-1} := \{\varGamma _{\mathcal {A}}, \varGamma _{\mathcal {B}}, \varGamma _{\mathcal {C}}, \varGamma _{\mathcal {D}}\}\) and \(\texttt {Set-1-2} := \{\varGamma _{\mathcal{A}\mathcal{R}_1}, \varGamma _{\mathcal{B}\mathcal{R}_1}, \varGamma _{\mathcal{C}\mathcal{R}_1}, \varGamma _{\mathcal{D}\mathcal{R}_1}\}\), which, under the assumption of a random last-round key K, define the same information per Sect. 3.1 (please refer to Table 14).

  2. \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{{\text {ID}_{(3,\texttt {9082})}}}}\) showed superior performance over \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(2,\texttt {9382})}}}\) (0.7726 vs. 0.7535, refer to Table 22 in [4]). Theoretically, however, if no information on the key is revealed beyond the key difference, \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{{\text {ID}_{(3,\texttt {9082})}}}}\) should perform exactly the same as \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(2,\texttt {9382})}}}\), since the two differential trails differ only in the last round key difference and thus the two output difference distributions are affine-equivalent.

  3. Surprisingly, \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}}}_{{\text {ID}_{(3,\texttt {9082})}}}\) even outperformed our manually enhanced distinguisher \({\mathcal {R}\mathcal {K}\text {-}\mathcal {A}\mathcal {D}^{{\textsc {Speck}}_{9R}}_{{\textbf {YD}}}}\) (0.7726 vs. 0.7574, refer to Table 22 in [4]).
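The blinding-style check mentioned above can be realized, in one generic variant, by XORing the same fresh random mask into both ciphertexts of a pair, so that all difference information is preserved while the value information is destroyed; comparing the distinguisher’s accuracy on masked versus unmasked data then indicates whether value information is being used. The sketch below illustrates this idea only; the concrete experiment sets of Table 14 are defined in Sect. 3.1 and differ in detail.

```python
import numpy as np

def blind_pairs(cl0, cr0, cl1, cr1, seed=0):
    """Apply the same random 16-bit masks to both members of each ciphertext pair,
    keeping differences intact while randomizing values. Inputs are uint16 arrays."""
    rng = np.random.default_rng(seed)
    rl = rng.integers(0, 1 << 16, size=cl0.shape, dtype=np.uint16)
    rr = rng.integers(0, 1 << 16, size=cr0.shape, dtype=np.uint16)
    return cl0 ^ rl, cr0 ^ rr, cl1 ^ rl, cr1 ^ rr
```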

Upon closer examination of the differential trail of \(\text {ID}_{(3,\texttt {9082})}\), we identified the causative factor. Denote the input/output differences and values around the last \(\boxplus \) in the key schedule producing the 8-round key \({k^8}\) (round counting starts from 0) by \({\alpha , \beta , \gamma , x, y, z}\). Then, from the differential trail \({\text {ID}_{(3,\texttt {9082})}}\), focusing specifically on rounds 7 and 8, we have \({\left\{ \begin{array}{ll} \alpha = \texttt {0x8002}^{\ggg 7} &{}= \texttt {0b~0000~0101~0000~0000}, \\ \beta = \texttt {0x8480} &{}= \texttt {0b~1000~0100~1000~0000}, \\ \gamma = \texttt {0x8280} &{}= \texttt {0b~1000~0010~1000~0000}. \\ \end{array}\right. } \) According to Tables 4 and 5, we have the following.

  1. The (8, 7)-th bit position is in case Cxc1\(_{(8,7)}\), so we have \(c_8 = y_7\).

  2. The (9, 8)-th bit position is in case Cxc0\(_{(9,8)}\), so we have \(c_9 = c_8\).

  3. The (10, 9)-th bit position is in case Cxy1\(_{(10,9)}\), so we have \(x_9 \oplus y_9 = z_9 \oplus c_9 = 1\).

Consequently, we have \(z_9 \oplus y_7 = 1\), which implies that the 9th bit of the last round key is constantly 1. This constant key bit therefore does not obscure one bit of information about the output of the last \(\boxplus \) in the encryption path, allowing the resulting distinguisher to achieve better accuracy. This explains all the oddities observed for \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{{\text {ID}_{(3,\texttt {9082})}}}}\).
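The relation above can also be probed empirically: sampling random addition inputs and keeping only pairs that conform to \((\alpha, \beta, \gamma)\), one can estimate the distribution of \(z_9 \oplus y_7\). The sketch below does this under the assumption that \(\alpha\) is the difference on the addend x and \(\beta\) the difference on the addend y, following the ordering used above.

```python
import numpy as np

ALPHA, BETA, GAMMA = 0x0500, 0x8480, 0x8280   # 0x0500 = 0x8002 rotated right by 7
MASK = 0xFFFF

rng = np.random.default_rng(1)
N = 1 << 22
x = rng.integers(0, 1 << 16, N, dtype=np.uint32)
y = rng.integers(0, 1 << 16, N, dtype=np.uint32)

z  = (x + y) & MASK
z2 = ((x ^ ALPHA) + (y ^ BETA)) & MASK
conforming = (z ^ z2) == GAMMA                 # right pairs for the key-schedule addition

bit = ((z >> 9) ^ (y >> 7)) & 1
print("conforming pairs:", int(conforming.sum()))
print("Pr[z_9 xor y_7 = 1 | conforming]:", float(bit[conforming].mean()))
```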

Additionally, for \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(2,\texttt {9382})}}}\), the 10th bit of the last-round key conforming to the round difference is biased towards 0 (it equals 0 with probability 3/4), which could explain its slightly different accuracy on \(\texttt {Set-1-1}\) and \(\texttt {Set-1-2}\) (refer to Table 14). After fixing the 10th bit to 0 and re-training the distinguisher, it achieves almost the same accuracy as \({\mathcal {R}\mathcal {K}\text {-}\mathcal {N}\mathcal {D}^{{\textsc {Speck}}_{9R}} _{\text {ID}_{(3,\texttt {9082})}}}\). Analyzing the probability of related-key pairs under these conditions, we deduced that restricting the 10th bit for \({\text {ID}_{(2,\texttt {9382})}}\) still yields a larger weak-key space than \({{\text {ID}_{(3,\texttt {9082})}}}\) while achieving the same high \(\mathcal {R}\mathcal {K}\text {-}{\mathcal {N}\mathcal {D}}\) accuracy.

5.2 Key Recovery Attack on Round-Reduced Speck32/64

This subsection describes the implementation of \(\mathcal{R}\mathcal{K}\) differential-neural cryptanalysis using the trained distinguishers. The key recovery framework is similar to [3, 15, 26]. Since the whole attack is in the \(\mathcal{R}\mathcal{K}\) setting, we need to specify the difference between the subkeys of each round. Specifically, it is unclear how to perform a key recovery attack if a difference is applied only to the master key without specifying the differences in the round-key states: a guess of one last-round key then does not directly determine the other last-round key of the related pair, since the difference between the last-round keys is not specified.

We first introduce some preparatory work before implementing the key recovery attack.

Generalized Neutral Bits. We prepend a \(\mathcal{C}\mathcal{D}\) to the \(\mathcal{N}\mathcal{D}\) to increase the number of rounds covered by the key recovery attack. Furthermore, to enhance predictive performance, we use the distinguisher to estimate the scores of multiple ciphertext pairs that follow the same distribution (a ciphertext structure) and combine them to obtain the score of the guessed subkey. However, the \(\mathcal{C}\mathcal{D}\) is probabilistic, and a randomly generated plaintext structure does not retain the same distribution after encryption. Hence, we require neutral bits to generate the plaintext structures, which we encrypt to obtain the ciphertext structures needed for a successful key recovery attack. Therefore, the \(\mathcal{C}\mathcal{D}\) should have a high probability and a sufficient number of neutral bits. Appendix B.3 in [4] lists the NBs/SNBSs we used to perform the key recovery attack.
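As an illustration of how a plaintext structure is generated from neutral bits, the sketch below flips every subset of a given list of (simultaneous-)neutral-bit masks over a base plaintext. The masks, base plaintext, and input difference shown are placeholders, not the actual NBs/SNBSs from Appendix B.3 in [4].

```python
import numpy as np

def plaintext_structure(p0, diff, nb_masks):
    """Build one structure of 2^len(nb_masks) plaintext pairs from 32-bit neutral-bit masks."""
    pts = [p0]
    for m in nb_masks:                 # flipping each mask doubles the structure
        pts += [p ^ m for p in pts]
    pts0 = np.array(pts, dtype=np.uint32)
    pts1 = pts0 ^ np.uint32(diff)      # the paired plaintexts at the chosen input difference
    return pts0, pts1

# Placeholder base plaintext, input difference, and neutral-bit masks:
pts0, pts1 = plaintext_structure(0x12345678, 0x00400000,
                                 [1 << 20, 1 << 21, (1 << 5) | (1 << 28)])
```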

The Parameters for Key Recovery Attack. The attacks follow the framework of the improved key recovery attacks in [15]. An r-round main and an \((r-1)\)-round helper \(\mathcal{N}\mathcal{D}\) are employed, and an s-round \(\mathcal{C}\mathcal{D}\) is prepended. The key guessing procedure applies a simple reinforcement learning mechanism: the last and second-to-last subkeys are recovered without exhaustively trying all candidate values in one-round decryptions; instead, a Bayesian key search employing the wrong key response profile is used. We count a key guess as successful if the last round key is guessed correctly and the second-to-last round key is at Hamming distance at most two from the real key. The parameters used to recover the last two subkeys are given below.

Parameter | Definition
\(n_{cts}\) | The number of ciphertext structures
\(n_{b}\) | The number of ciphertext pairs in each ciphertext structure, that is, \(2^{|\text {NB}|}\)
\(n_{it}\) | The total number of iterations in the ciphertext structures
\(c_1, c_2\) | The cutoffs with respect to the scores of the recommended last subkey and second-to-last subkey, respectively
\(n_{byit1/2}\) | The number of iterations, the default value is 5
\(n_{cand1/2}\) | The number of key candidates within each iteration, the default value is 32
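As a reminder of how the cutoffs \(c_1\) and \(c_2\) are used, the sketch below combines the per-pair distinguisher outputs of one decrypted ciphertext structure into a score for a guessed subkey, as in [15]; the function and variable names are ours, not from a specific implementation.

```python
import numpy as np

def combined_key_score(nd_outputs, eps=1e-12):
    """Sum of log-likelihood ratios over the ciphertext pairs of one structure."""
    p = np.clip(nd_outputs, eps, 1 - eps)
    return float(np.sum(np.log2(p / (1 - p))))

def structure_passes(best_key_score, cutoff):
    """A structure is promoted (e.g., to guessing the next subkey) only above the cutoff."""
    return best_key_score > cutoff
```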

Complexity Evaluation of Key Recovery Attack. The experiments are conducted with Python 3.7.15 and TensorFlow 2.5.0 under Ubuntu 20.04, on a machine with two Intel Xeon E5-2680 v4 CPUs at 2.40 GHz, 256 GB RAM, and seven NVIDIA RTX 3080 Ti GPUs (12 GB each). To reduce the experimental error, we perform 210 key recovery attacks for each parameter setting, take the average running time rt as the running time of one experiment, and take the number of successful experiments divided by the total number of experiments as the success rate sr of the key recovery attack.

  1. Data complexity. The data complexity of the experiment is calculated using the formula \(n_{b}\times n_{cts} \times 2\), which is a theoretical value. In the actual experiments, when the accuracy of the differential-neural distinguisher is high, the key can be recovered quickly and successfully; not all data are used, so the actual data complexity is lower than the theoretical one.

  2. Time complexity. We use \(2^{32}\) data to test the speed of encryption and decryption on our device; each core can perform \(2^{26.814}\) rounds of decryption operations per second for Speck32/64. The time complexity in our experiments is thus calculated as \(2^{26.814}\times rt\) (a small worked example follows this list).
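A small worked example of these formulas, using hypothetical parameter values rather than the actual ones from Table 15, is given below.

```python
import math

n_b, n_cts = 2**6, 2**7      # hypothetical: pairs per structure, number of structures
rt = 120.0                   # hypothetical average running time of one attack, in seconds

data = n_b * n_cts * 2       # theoretical data complexity in plaintexts: 2^14
time = 2**26.814 * rt        # time complexity in Speck32/64 round decryptions

print(f"data = 2^{math.log2(data):.1f} plaintexts, time = 2^{math.log2(time):.2f}")
```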

The Result of Key Recovery Attacks. We list the results of key recovery attacks in multiple differential modes in Table 15. We calculate the corresponding weak-key space wks according to the probabilities of \(\text {ID}_1\), \(\text {ID}_2\), and \({\text {ID}_{(3,\texttt {9082})}}\). Adv. denotes the advantage over the time complexity of brute force. The time and data complexity can be reduced by decreasing \(n_{cts}\) and \(n_{it}\), but the success rate sr decreases accordingly. The primary goal of our experiments is to reduce the time complexity.

Table 15. Summary of key recovery attacks on Speck32/64

Remark 3

(The profiling information of the key-recovery attack). To pinpoint the attack’s bottleneck, we profiled a 14-round key-recovery attack using \(\text {ID}_3\). The main result is detailed in Table 16. From the profiling results, the performance of our implementation is mostly limited by the speed of neural network evaluation (the proportion taken by \(\mathcal {N}\mathcal {D}\) predictions is 79.18% + 5.17% = 84.35%). The next limiting factor is the speed of computing the weighted Euclidean distance against the wrong key response profile.

Table 16. Profiling information of the key-recovery attack
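One simple way to obtain such a breakdown is Python’s built-in profiler; the sketch below wraps a hypothetical attack entry point attack_14_rounds with cProfile and prints the functions with the largest cumulative time.

```python
import cProfile
import pstats

cProfile.run("attack_14_rounds()", "attack.prof")   # attack_14_rounds is hypothetical
stats = pstats.Stats("attack.prof")
stats.sort_stats("cumulative").print_stats(15)      # top 15 entries by cumulative time
```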

Remark 4

(Efficiency measures in symmetric-key cryptanalysis attacks). Assessing the efficiency of distinguishers and key recovery attacks in symmetric-key cryptanalysis poses intricate challenges, particularly when deriving computational complexities from real-time attack timings and then extrapolating these to equivalent primitive evaluations, as done in both \(\mathcal {N}\mathcal {D}\)-based and traditional attacks in [12, 14, 24] (listed in Table 1).

Factors influencing these complexities include architecture compatibility and algorithmic suitability, varying computation intensity and operation costs across platforms, memory constraints and flexible trade-offs, and implementation factors. Given these complexities, it is advisable to use secondary metrics for comparison, for instance, power consumption and cost efficiency (please refer to Appendix E in [4] for detailed discussions). While there is a pressing need for universal metrics, formulating such benchmarks is challenging, warranting caution when interpreting comparison results and calling for further exploration.

6 Conclusion

This paper provides explicit rules that a distinguisher can use, beyond the full differential distribution table, to achieve better distinguishing performance. These rules are based on strong correlations between bit values in right pairs of differential propagation through addition modulo \(2^n\). By leveraging the value-dependent differential probability, which is not typically used in traditional differential distinguishers, we can equip DDT-based distinguishers with additional knowledge and enhance their accuracy. These rules, or an equivalent form of them, are likely the additional features beyond the full DDT that the neural distinguishers exploit. While these rules are not difficult to derive with careful analysis, they rely on non-trivial relations that traditional distinguishers often overlook. This indicates that neural networks help break the limitations of traditional cryptanalysis, and studying this unorthodox model can provide new opportunities to understand cryptographic primitives better.

Another investigation in this paper revealed that controlling the differential propagation is crucial for enhancing the accuracy of differential-neural distinguishers. It is typically believed that introducing differences into the keys provides chances to cancel differences in the encryption states, thus resulting in stronger differential propagation. However, unlike traditional differential attacks, differential-neural attacks do not specify the output difference and are thus not limited to a single differential trail. Therefore, it was unclear whether a difference in the key is helpful in differential-neural attacks, and how resistant Speck is against differential-neural attacks in the \(\mathcal{R}\mathcal{K}\) setting. This work confirmed that differential-neural cryptanalysis in the \(\mathcal{R}\mathcal{K}\) setting can be more powerful than in the single-key setting by conducting a 14-round key recovery attack on Speck32/64.