1 Introduction

Balanced Feistel networks (BFNs) are one of the most widely used structures for a block cipher. In fact, BFNs are adopted in a large number of symmetric key primitives, e.g., the former U.S. encryption standard DES [13], the current Russian encryption standard GOST blockcipher [20], and KASUMI which is the core of A5/3 cryptosystem in mobile networks [30]. Besides, a considerable number of analytic papers for the structure of the BFNs and the specific instantiation of the BFNs have been published since it was developed in the 1970s. However, the optimal design strategy with respect to both the security and the efficiency for its F-function is still an open problem.

This article addresses this problem in a wide class of typical underlying functions for a BFN (substitution-permutation functions with any finite number of layers). To do that, for each of them, we first prove tight bounds on the security parameter (number of active S-boxes). Then the security parameter is related to the computational workload of a cipher implementation (modelled as the number of S-boxes computed in the cipher) to obtain an efficiency parameter. Finally, the optimal constructions are those with the maximum resulting efficiency parameter.

The class of ciphers We focus on balanced Feistel networks with SP-type bijective F-functions, that is, with underlying functions whose internal structure is a substitution-permutation network (SPN). An SPN consists of several sequential applications of an S-box layer (S)—several small nonlinear maps applied in parallel—and a diffusion layer (P)—multiplication by a matrix over a binary finite field. We treat F-functions with (SP)\(^{2t}\), (SP)\(^{2t+1}\), (SP)\(^{2t-1}\)S and (SP)\(^{2t}\)S-type F-functions for integer \(t\ge 1\). For instance, an (SP)\(^{2}\)S-type F-function consists of two consecutive SP-functions followed by an S-box layer, namely an SPSPS F-function. The instantiation of a Feistel network with an SP-type F-function is deployed in many cryptographic algorithms including E2 [16], TWOFISH [22], CAMELLIA [2], CLEFIA [28], SHAvite-3 [4], and PICCOLO [23].

Security parameter Counting the minimum number of active S-boxes is a widely accepted argument [12] to demonstrate the immunity of a cryptographic algorithm against differential [5] and linear [18] cryptanalysis which are two fundamental attacks on block ciphers. Lower bounds on the number of active S-boxes are closely related to the probability of differential trails and linear trails [12].

For each of the BFN instantiations above, we prove lower bounds on the number of differentially and linearly active S-boxes. In contrast to the previous works [6, 15], our results with respect to this security parameter:

  • generalize the type of the F-function, while [6, 15] only contain lower bounds for BFNs with SP- and SPS- functions,

  • hold for any number of rounds (those of [6, 15] hold only for a few rounds), and

  • contain proofs of tightness for the bounds when the matrices used in the diffusion layers of BFNs are maximum distance separable (MDS).

Efficiency metric To measure the efficiency of a construction, we are using the ratio between active S-boxes and all S-boxes in a cipher—a reasonable efficiency metric introduced in [25] and extensively used in [69]. It is based on the assumption that most workload one has to perform in the implementation of an SP-type construction is the computation of the S-boxes. Since we are mainly interested in MDS matrices that are equal in all rounds and intend to compare block ciphers of the same block length only, we will ignore the cost of the linear operations for the purposes of comparison. Note that this efficiency metric cannot capture all implementation possibilities and constraints in the field.

Optimality In the wide class of our target ciphers, we prove optimality of several instances with respect to the efficiency parameter. More specifically, among BFN block ciphers with bijective SP-type F-functions and MDS diffusion, we prove BFNs with SPS and SPSP functions to maximize the efficiency in terms of the proportion of active S-boxes in all S-boxes. Interestingly, one SP-layer in the function is not enough to attain optimality, whereas taking more than two S-box layers does not increase the efficiency either.

Organization of the article The remainder of this article is organized as follows. Section 2 describes the target structure and definitions. The duality of differential and linear trails is explained in Sect. 3. Section 4 gives proofs for lower bounds on the numbers of differentially and linearly active S-boxes for the BFNs and its results are summarized in Table 1. Section 5 shows the tightness of those bounds. Section 6 discusses the optimality of the BFNs. Finally, we conclude in Sect. 7.

2 Preliminaries

2.1 Target structures

In this paper, we focus on balanced Feistel networks (BFNs) with bijective F-functions. A \(2mn\)-bit plaintext \(\mathcal {P}\) is divided into two subblocks as \(\mathcal {P} = (X_{L}^{(1)}, X_{U}^{(1)})\), where \(X_{L}^{(i)}, X_{U}^{(i)} \in \{0,1\}^{mn}\). Then the \(i\)-th round output is calculated as follows:

$$\begin{aligned} \left( X_{L}^{(i+1)},X_{U}^{(i+1)}\right) \leftarrow \left( F(X_{L}^{(i)}) \oplus X_{U}^{(i)}, X_{L}^{(i)}\right) , \end{aligned}$$

where \(F: \{0,1\}^{mn} \rightarrow \{0,1\}^{mn}\) is an F-function in the \(i\)-th round. A \(2mn\)-bit ciphertext \(\mathcal {C}\) for the \(r\)-round encryption function is derived as \(\mathcal {C} = (X_{U}^{(r+1)}, X_{L}^{(r+1)})\), i.e., the last exchange is omitted. Each F-function consists of some S-box layers and linear diffusion layers (P-layers), and all S-box layers and P-layers are bijective. While \(mn\)-bit subkeys are XORed before each S-box layer, we omit these subkey additions in this paper for simplicity. An S-box layer consists of \(m\) \(n\)-bit bijective S-boxes, and a linear diffusion layer consists of \(mn\)-bit linear Boolean function. BFN-(SP)\(^{u}\) denotes BFN with F-functions consisting of \(u\) consecutive SP-functions. BFN-(SP)\(^{u}\)S denotes BFN with F-functions consisting of \(u\) consecutive SP-functions followed by one additional S-box layer. See Figs. 1 and 2.

Fig. 1
figure 1

\(r\)-round BFN with bijective F-functions

2.2 Differential and linear cryptanalyses

Differential cryptanalysis was published in 1990 by Biham and Shamir [5] with applications to DES. However, it has been known to the designers of DES at IBM in early 1970s [11]. As the name suggests, the main idea of differential cryptanalysis is to exploit correlations between differences in the inputs and outputs of a block cipher to recover the key. It is a chosen-plaintext attack, in which an attacker is allowed to choose arbitrary plaintexts and obtain the corresponding ciphertexts.

Linear cryptanalysis as applied to DES was proposed by Matsui in 1993 [18]. However, similar ideas were published by Shamir [21] in 1985 as well as Tardy-Corfdir and Gilbert [29] in 1991. Linear cryptanalysis uses linear approximations of block ciphers to perform key recovery. It is a known-plaintext attack, in which an attacker knows some plaintexts and the corresponding ciphertexts.

For an \(n\)-bit function \(f: \{0,1\}^n \rightarrow \{0,1\}^n\), a differential probability used in differential cryptanalysis and a linear probability used in linear cryptanalysis are defined as follows, respectively.

Definition 1

(Differential probability) Given an input difference \(\Delta x\) and an output difference \(\Delta y\), a differential probability of \(f\) is defined as follows:

$$\begin{aligned} DP_{f}(\Delta x, \Delta y) = \Pr _{x \in \{0,1\}^n}(f(x) \oplus f(x \oplus \Delta x) = \Delta y), \end{aligned}$$

where \(x, \Delta x, \Delta y \in \{0,1\}^{n}\).

Definition 2

(Linear probability) Given an input linear mask value \(\Gamma x\) and an output linear mask value \(\Gamma y\), a linear probability of \(f\) is defined as follows:

$$\begin{aligned} LP_{f}(\Gamma x, \Gamma y) = \Bigl ( 2 \cdot \Pr _{x \in \{0,1\}^{n}} \bigl ( x \bullet \Gamma x = f(x) \bullet \Gamma y \bigr ) - 1 \Bigr )^{2}, \end{aligned}$$

where \(\bullet \) denotes dot products and \(x\), \(\Gamma x\), \(\Gamma y\) \(\in \{0,1\}^{n}\).

Fig. 2
figure 2

The \(i\)-th round F-function of BFN-(SP)\(^{u}\) and BFN-(SP)\(^{u}\)S

2.3 Notations

We give the standard definitions of bundle weight and branch number followed by more specific notations [12].

Definition 3

(Bundle Weight) Let \(x \in \{0, 1\}^{mn}\) be represented as \(x =\) \((x_1, x_2,\ldots , x_{m})\), where \(x_i \in \{0, 1\}^n\), then the \(n\)-bit bundle weight \(w_n(x)\) is defined as

$$\begin{aligned} w_n(x) = \sharp \{i | 1 \le i \le m, x_i \ne 0\}. \end{aligned}$$
(1)

Definition 4

(Branch Number) Let \(M: \{0, 1\}^{mn} \rightarrow \{0, 1\}^{mn}\). The branch number of \(M\) is defined as

$$\begin{aligned} \mathcal {B} (M) = \min _{a \ne 0} \{w_n (a) + w_n (M(a))\}. \end{aligned}$$
(2)

Besides the bundle weight \(w_n\) and branch number \(\mathcal {B}\), throughout this paper, we use the following notations:

  • \(x_{j}^{(i)}, y_{j}^{(i)}\): input and output of the \(j\)-th S-box layer in the \(i\)-th round.

  • \(z_{j}^{(i)}\): output of the \(j\)-th linear diffusion layer in the \(i\)-th round.

  • \(\Delta x_{j}^{(i)}\): a difference of \(x_{j}^{(i)}\).

  • \(d_{j}^{(i)}\): a truncated difference weight of \(x_{j}^{(i)}\), i.e., \(d_{j}^{(i)} = w_{n} (\Delta x_{j}^{(i)})\).

  • \(d^{(i)}\): the number of differentially active S-boxes in the \(i\)-th round.

  • \(\mathcal {D}(r)\): the minimum number of active S-boxes in \(r\) consecutive rounds.

  • \(\Gamma y_{j}^{(i)}\): a linear mask value of \(y_{j}^{(i)}\).

2.4 Efficiency metric

The proportion of active S-boxes in all S-boxes is a reasonable efficiency metric with respect to differential and linear cryptanalysis for ciphers based on substitution-permutation. It was introduced in [25] by Shirai and Preneel for BFNs and used in [69] for estimating and comparing the efficiency of diverse Feistel constructions, including BFNs.

Both the number of active S-boxes and the number of all S-boxes over several rounds of a BFN depend on the number \(r\) of rounds considered and the number \(m\) of S-boxes in one F-function.

Definition 5

(\(E_m\, and\, E\)) The efficiency metric \(E_m\) is defined as \(E_m\) \(=\) \(\lim _{r \rightarrow \infty }\) \(A_{m,r}/S_{m,r}\), where \(A_{m,r}\) is the minimum number of active S-boxes over \(r\) rounds and \(S_{m,r}\) is the total number of S-box computations over \(r\) rounds. The efficiency metric \(E\) is defined as \(E = \lim _{m \rightarrow \infty }E_m\), where the number of active S-boxes \(A_{m,r}\) is measured when the underlying diffusion matrix is MDS, i.e., \(\mathcal {B}(M) = m + 1\).

Note that this efficiency metric \(E_m\) cannot capture all implementation possibilities and constraints in the field, though it is believed to provide an indication of the efficiency of a block cipher towards the two fundamental types of cryptanalysis, see [69, 25] for some extensions and discussions with respect to efficiency metrics.

3 Duality of trails

In this section, we demonstrate an equivalence between differential and linear trails for the BFNs. This equivalence follows from Biham’s considerations in [3] and is provided here for completeness. It allows us to work with the minimum numbers of differentially and linearly active S-boxes simultaneously. We first show an equivalent transform for BFN-(SP)\(^{u}\).

Property 1

Suppose that both S-box layer and linear diffusion layer are bijective. Any BFN consisting of \(u\) consecutive SP-functions, BFN-(SP)\(^{u}\), can be equivalently transformed into a BFN consisting of \(u\) consecutive PS-functions with an initial and a final linear function.

This property is seen as a generalization of [15]. Let \(v_{j}^{(i)} \!=\! P^{-1}(x_{j}^{(i)})\). From the definition, \(P(y_{u}^{(i)}) \!=\! x_{1}^{(i-1)} \oplus x_{1}^{(i+1)}\), then \(y_{u}^{(i)} \!=\! P^{-1} (x_{1}^{(i-1)} \oplus x_{1}^{(i+1)})\). Since \(P\) is linear, \(y_{u}^{(i)} \!=\! v_{1}^{(i-1)} \oplus v_{1}^{(i+1)}\). Meanwhile, \(y_{u}^{(i)} \!=\! S(P(S(\cdots P(S(x_{1}^{(i)}))\cdots )))\), then \(y_{u}^{(i)} \!=\! S(P(S(\cdots \) \( P(S(P(v_{1}^{(i)})))\cdots )))\). Combining the above equations, \(v_{1}^{(i+1)} \!=\! S(P(S(\cdots P(S(P(v_{1}^{(i)})))\) \(\cdots ))) \oplus v_{1}^{(i-1)}\). Now we have BFN-(PS)\(^{u}\) from BFN-(SP)\(^{u}\) by using equivalent transforms. Note that BFN-(PS)\(^{u}\) takes \(\mathcal {P}'\) \(=\) \((P^{-1} (X_{L}^{(1)})\), \(P^{-1} (X_{U}^{(1)}))\) as a plaintext and outputs a ciphertext \(\mathcal {C}' = (P(X_{U}^{(r+1)}), P(X_{L}^{(r+1)}))\). Since these initial and final linear functions do not affect the minimum numbers of active S-boxes, we can ignore these functions when studying the minimum numbers of active S-boxes. An illustration of these equivalent transforms is given in Fig. 3.

Fig. 3
figure 3

Equivalent transform (BFN-(SP)\(^{u}\) to BFN-(PS)\(^{u}\)), where thin boxes and thick boxes denote S-box layers and P-layers, respectively

From the concatenation rules [3, 19], \(\Gamma v_{1}^{(i)} = \Gamma y_{u}^{(i-1)} \oplus \Gamma y_{u}^{(i+1)} = ^{t}P (\Gamma x_{1}^{(i)})\), where \(^{t}P\) is the bit-based transpose matrix of \(P\). Thus, for BFN-(SP)\(^{u}\), the linear trails can be transformed to the corresponding differential trails by replacing (\(\Delta x_{1}^{(i)}\), \(\Delta x_{2}^{(i)}, \ldots , \Delta x_{u}^{(i)}\)), (\(\Delta z_{1}^{(i)}\), \(\Delta z_{2}^{(i)}, \ldots , \Delta z_{u}^{(i)}\)) and \(P\) with (\(\Gamma y_{u}^{(i)}\), \(\Gamma y_{u-1}^{(i)},\ldots , \Gamma y_{1}^{(i)}\)), (\(\Gamma v_{u}^{(i)}\), \(\Gamma v_{u-1}^{(i)},\ldots , \Gamma v_{1}^{(i)}\)) and \(^{t}P\), respectively. Similarly, for BFN-(SP)\(^{u}\)S, the linear trails can be treated as the differential trails by replacing (\(\Delta x_{1}^{(i)},\) \(\Delta x_{2}^{(i)},\ldots ,\Delta x_{u+1}^{(i)}\)), (\(\Delta y_{1}^{(i)},\) \(\Delta y_{2}^{(i)},\ldots ,\Delta y_{u+1}^{(i)}\)) and \(P\) by (\(\Gamma y_{u+1}^{(i)},\) \(\Gamma y_{u}^{(i)},\ldots ,\Gamma y_{1}^{(i)}\)), (\(\Gamma x_{u+1}^{(i)},\) \(\Gamma x_{u}^{(i)},\ldots ,\Gamma x_{1}^{(i)}\)) and \(^{t}P\), respectively. Therefore, since the constraints for differential and linear trails for the BFNs are the same, the minimum numbers of differentially and linearly active S-boxes can be derived simultaneously. The above discussions yield the following theorem.

Theorem 1

For BFN-(SP)\(^{u}\) and BFN-(SP)\(^{u}\)S, assuming that both S-box layer and linear diffusion layer are bijective, the lower bounds on the number of differentially active S-boxes derived from the property of the linear diffusion layer hold also for the number of linearly active S-boxes by changing the linear diffusion layer to the transposed one.

Table 1 Summary of our results, where \(\mathcal {B}\) is the branch number of the diffusion matrix or its transpose, \(E_m = \lim _{r \rightarrow \infty }A_{m,r}/S_{m,r}\), and \(E = \lim _{m \rightarrow \infty }E_m\)

In the sequel, we only discuss the minimum numbers of differentially active S-boxes for simplicity, keeping in mind, however, that the minimum numbers of linearly active S-boxes can be derived in the same way.

4 Bounds for active functions

In this section, we give proofs for lower bounds on the minimum number of differentially active S-boxes for BFN-(SP)\(^{2t+1}\), -(SP)\(^{2t}\), -(SP)\(^{2t-1}\)S and -(SP)\(^{2t}\)S. These results are summarized in Table 1.

To prove those bounds, we utilize the following property and lemmata for BFNs consisting of bijective F-functions.

Property 2

For each nonzero input difference, any two and three consecutive rounds of BFN consisting of bijective F-functions have at least one and two active functions, respectively.

Proof

If two consecutive F-functions of the \(i\)-th and \((i+1)\)-th rounds are both non-active, i.e., \(\Delta X_{L}^{(i)}\) and \(\Delta X_{L}^{(i+1)}\) are zero, the input difference \(\Delta X_{L}^{(i)}\) and \(\Delta X_{U}^{(i)} (= \Delta X_{L}^{(i+1)}\), since the output difference of the \(i\)-th round F-function is zero) are zero. Since this contradicts the assumption, at least one of two F-functions is active. From this, each of two consecutive rounds starting from the \((i-1)\)-th round and the \(i\)-th round has at least one active F-function, which is an F-function whose input difference is nonzero. Obviously, if the \(i\)-th round F-function is non-active, three consecutive rounds starting from the \((i-1)\)-th round have at least two active F-functions. If the \(i\)-th round F-function is active, \(\Delta X_{L}^{(i-1)} (= \Delta X_{U}^{(i)})\) and \(\Delta X_{L}^{(i+1)}\) cannot be zero simultaneously since the output difference of the \(i\)-th round F-function is nonzero. Therefore, there exist at least two active F-functions in three consecutive rounds. \(\square \)

Meanwhile, the numbers of differentially active S-boxes for each differentially active F-function, which is an F-function whose input difference is nonzero, are lower-bounded by the following lemmata. Recall that \(\mathcal {B}\) denotes the branch number of the linear layer.

Lemma 1

(active S-boxes for 1-round BFN-(SP)\(^{u}\)) For BFN-(SP)\(^{u}\), if \(d^{(i)}\) is not zero, \(d^{(i)} \ge \lfloor u/2 \rfloor \mathcal {B} + (u \mod 2)\).

Proof

If an input difference of two consecutive SP-functions is not zero, there exist at least \(\mathcal {B}\) active S-boxes, e.g., \(d^{(i)}_{1} + d^{(i)}_{2} \ge \mathcal {B}\). Since BFN-(SP)\(^{u}\) has \(\lfloor u/2 \rfloor \) independent two consecutive SP-functions and \((u \mod 2)\) SP-functions, it has at least \(\lfloor u/2 \rfloor \mathcal {B} + (u \mod 2)\) active S-boxes when the input difference is not zero. \(\square \)

Similarly to Lemma 1, one derives the following lemma.

Lemma 2

(active S-boxes for 1-round BFN-(SP)\(^{u}\)S) For BFN-(SP)\(^{u}\)S, if \(d^{(i)}\) is not zero, \(d^{(i)} \ge \lceil u/2 \rceil \mathcal {B} + ((u + 1) \mod 2)\).

These lemmata show that the number of active S-boxes can be derived from the number of S-box layers when we treat only one active F-function. However, when we consider some consecutive rounds, the number of active S-boxes does not depend only on the number of S-box layers.

Starting from here, we treat four cases of the F-function construction separately: (SP)\(^{2t+1}\), (SP)\(^{2t}\), (SP)\(^{2t-1}\)S, and (SP)\(^{2t}\)S, as those exhibit essential differences.

4.1 Differentially active S-boxes in BFN-(SP)\(^{2t+1}\)

For BFN-(SP)\(^{2t+1}\), which consists of odd number of SP-layers, the proofs for the lower bounds are the most complicated among other BFNs, since the number of differentially active S-boxes cannot be directly obtained from the number of differentially active F-functions. We find tight lower bounds on the minimum number of differentially active S-boxes by carefully observing two cases separately: \(t=0\) and other cases.

For BFN-(SP)\(^{2t+1}\), Lemma 1 directly translates to the following corollary.

Corollary 1

For BFN-(SP)\(^{2t+1}\), if \(d^{(i)}\) is not zero, \(d^{(i)} \ge t \mathcal {B} + 1\).

Property 2 and Corollary 1 directly show that any three consecutive rounds of BFN-(SP)\(^{2t+1}\) have at least \(2(t\mathcal {B}+1)\) active S-boxes. However, when the center of the F-function in the three consecutive rounds is active, there exist more active S-boxes as follows.

Lemma 3

For BFN-(SP)\(^{2t+1}\), if \(d^{(i)}\) is not zero, \(d^{(i-1)} + d^{(i)} + d^{(i+1)} \ge (2t+1)\mathcal {B}\).

Proof

From the definition, \(\Delta x_{1}^{(i-1)} \oplus \Delta x_{1}^{(i+1)} = M(\Delta y^{(i)}_{2t+1})\). If \(d^{(i)}\) is not zero, then \(\Delta y^{(i)}_{2t+1}\) is not zero due to the invertibility. Since \(\Delta y^{(i)}_{2t+1}\) is not zero, \(w_n(\Delta x_{1}^{(i-1)}) + w_n(\Delta x_{1}^{(i+1)}) + w_n(\Delta y^{(i)}_{2t+1}) \ge \mathcal {B}\), i.e., \(d^{(i-1)}_{1} + d^{(i)}_{2t+1} + d^{(i+1)}_{1} \ge \mathcal {B}\). Also, if \(\Delta y^{(i)}_{2t+1}\) is not zero, \(\Delta x_{1}^{(i-1)}\) and \(\Delta x_{1}^{(i+1)}\) cannot be zero simultaneously. Thus \(d^{(i-1)}_{1} + \cdots + d^{(i-1)}_{2t} \ge t \mathcal {B}\) or \(d^{(i+1)}_{1}+ \cdots + d^{(i+1)}_{2t} \ge t \mathcal {B}\). Therefore \(\Sigma _{j=1}^{2t+1} (d^{(i-1)}_{j} + d^{(i)}_{j} + d^{(i+1)}_{j}) \ge (2t + 1) \mathcal {B}\). \(\square \)

The lower bounds on the minimum number of active S-boxes in any consecutive rounds of BFN-(SP)\(^{2t+1}\) are directly derived by the lemmata above. First, we prove the bounds on \(\mathcal {D}(r)\), \(r \le 4\) by Lemma 4, then show the bounds on \(\mathcal {D}(r)\), \(r > 4\) by Lemma 5.

Lemma 4

For BFN-(SP)\(^{2t+1}\), \(\mathcal {D}(1) = 0\), \(\mathcal {D}(2) = t \mathcal {B} + 1\), \(\mathcal {D}(3) = 2 (t\mathcal {B} + 1)\), and \(\mathcal {D}(4) = (2t+1) \mathcal {B}\).

Proof

Since any two consecutive rounds have at least one active F-function, \(\mathcal {D}(2) = t \mathcal {B} + 1\) from Corollary 1. We consider \(d^{(i-1)}\), \(d^{(i)}\) and \(d^{(i+1)}\). If \(d^{(i)}\) is not zero, then \(d^{(i-1)} + d^{(i)} + d^{(i+1)} \ge (2t + 1) \mathcal {B}\). If \(d^{(i)}\) is zero, then both \(d^{(i-1)}\) and \(d^{(i+1)}\) are not zero from Property 2. In that case, \(d^{(i-1)} + d^{(i)} + d^{(i+1)} \ge 2(t \mathcal {B} + 1)\) from Corollary 1. Since \(\mathcal {B} \ge 2\) from the invertibility and \((2t + 1) \mathcal {B} \ge 2(t \mathcal {B} + 1)\), we obtain \(\mathcal {D}(3) = 2(t \mathcal {B} + 1)\). We consider \(d^{(i-1)}\), \(d^{(i)}\), \(d^{(i+1)}\) and \(d^{(i+2)}\). If \(d^{(i)}\) is not zero, \(d^{(i-1)} + d^{(i)} + d^{(i+1)} \ge (2t + 1) \mathcal {B}\) from Lemma 3. If \(d^{(i)}\) is zero, then \(d^{(i+1)}\) is not zero due to the invertibility. Then \(d^{(i)} + d^{(i+1)} + d^{(i+2)} \ge (2t + 1) \mathcal {B}\). Thus, \(\mathcal {D}(4) = (2t + 1) \mathcal {B}\). \(\square \)

The bounds on \(\mathcal {D}(r)\), \(r > 4\), are given as inductive forms.

Lemma 5

Let \(r > 4\), \(\mathcal {D}(r) = \min (\mathcal {D}(r - 3) + (2t+1)\mathcal {B}, \mathcal {D}(r-4) + (3t+1)\mathcal {B}+1)\) for BFN-(SP)\(^{2t+1}\).

Proof

We consider active S-boxes in \(r\) consecutive rounds starting from round \(i+1\), i.e., \(d^{(i+1)}, \ldots , d^{(i+r)}\). If \(d^{(i+r-1)}\) is not zero, then \(d^{(i+r-2)} + d^{(i+r-1)} + d^{(i+r)} \ge (2t + 1) \mathcal {B}\). Also, \(d^{(i+1)} + \cdots + d^{(i+r-3)} \ge \mathcal {D}(r-3)\) from the definition. Therefore, \(d^{(i+1)} + \cdots + d^{(i+r)} \ge \mathcal {D}(r-3) + (2t+1) \mathcal {B}\) when \(d^{(i+r-1)}\) is not zero. If \(d^{(i+r-1)}\) is zero, then both \(d^{(i+r-2)}\) and \(d^{(i+r)}\) are nonzero. \(d^{(i+r-3)} + d^{(i+r-2)} + d^{(i+r-1)} \ge (2t + 1) \mathcal {B}\) and \(d^{(i+r)} \ge t \mathcal {B} + 1\) from Corollary 1. Also, \(d^{(i+1)} + \cdots + d^{(i+r-4)} \ge \mathcal {D}(r-4)\). Therefore, \(d^{(i+1)} + \cdots + d^{(i+r)} \ge \mathcal {D}(r-4) + (3t + 1) \mathcal {B} + 1\) when \(d^{(i+r-1)}\) is zero. Combining both results, we obtain \(\mathcal {D}(r) = \min (\mathcal {D}(r-3) + (2t+1) \mathcal {B}, \mathcal {D}(r-4) + (3t+1) \mathcal {B} + 1)\) when \(r>4\). \(\square \)

Now we have the lower bounds in any consecutive rounds of BFN-(SP)\(^{2t+1}\). However, it is hard to compare its efficiency with other constructions, since the bounds are proven as inductive forms. In order to obtain more accurate bounds, we consider two cases. We start with the special case of \(t=0\).

Theorem 2

(active S-boxes for BFN-(SP)\(^{2t+1}, t=0\)) For any nonzero input difference (nonzero input mask), every \(4R\), \(4R+1\), \(4R+2\), \(4R+3\) rounds of BFN (\(R \ge 1\)) with an SP F-function provide at least \((\mathcal {B}+1)R-1\), \((\mathcal {B}+1)R\), \((\mathcal {B}+1)R+1\), \((\mathcal {B}+1)R+2\) differentially (linearly) active S-boxes, respectively, assuming \(\mathcal {B} > 2\), where \(\mathcal {B}\) is the branch number of the diffusion matrix (of the transposed diffusion matrix).

Proof

If \(\mathcal {D}(r-3) - \mathcal {D}(r-4) \ge 1\), \(\mathcal {D}(r) = \mathcal {D}(r-4)+\mathcal {B}+1\) from Lemma 5. Otherwise \(\mathcal {D}(r) = \mathcal {D}(r-3)+\mathcal {B}\). Clearly, \(\mathcal {D}(r-3) - \mathcal {D}(r-4) = 1\) when \(r = 5\) and 6, and \(\mathcal {D}(r-3)-\mathcal {D}(r-4)=\mathcal {B}-2\) when \(r=7\). Since \(\mathcal {B}>2\) from the assumption, \(\mathcal {D}(r-3)-\mathcal {D}(r-4) \ge 1\) when \(r=7\). Similarly, \(\mathcal {D}(r-3)-\mathcal {D}(r-4) = \mathcal {D}(4)-\mathcal {D}(5) = (\mathcal {B}) - (\mathcal {B}+1) = 1\) when \(r=8\). Since \(\mathcal {D}(r-3)-\mathcal {D}(r-4) \ge 1\) for \(r=5,6,7\) and \(8\), \(\mathcal {D}(r-3)-\mathcal {D}(r-4) \ge 1\) when \(r \ge 5\). Thus \(\mathcal {D}(r) = \mathcal {D}(r-4)+\mathcal {B}+1\) when \(r \ge 5\). Then \(\mathcal {D}(r) = \mathcal {D}(r-4)+\mathcal {B}+1 = \mathcal {D}(r-8)+2(\mathcal {B}+1)=\cdots =\mathcal {D}(r-4u)+(\mathcal {B}+1)u\). Therefore \(\mathcal {D}(4R+1) = \mathcal {D}(4R-3)+\mathcal {B}+1 = \mathcal {D}(4R-7)+2(\mathcal {B}+1) = \cdots = \mathcal {D}(1)+(\mathcal {B}+1)R = (\mathcal {B}+1)R\). Similarly, \(\mathcal {D}(4R+2) = \mathcal {D}(2) + (\mathcal {B}+1)R = (\mathcal {B}+1)R+1\), \(\mathcal {D}(4R+3) = \mathcal {D}(3) + (\mathcal {B}+1)R = (\mathcal {B}+1)R+2\), \(\mathcal {D}(4R) = \mathcal {D}(4) + (R-1)(\mathcal {B}+1) = (\mathcal {B}+1)R-1\). \(\square \)

Note that Theorem 2 was conjectured in [24]. For all other integers \(t>0\), the bounds are stated as follows.

Theorem 3

(active S-boxes for BFN-(SP)\(^{2t+1}, t>0\)) For any nonzero input difference (nonzero input mask), every \(3R\), \(3R+1\), \(3R+2\) rounds of BFN (\(R \ge 1\)) with \((2t+1)\) consecutive SP-layers in the F-function (\(t > 0\)) provide at least \((2t+1)\mathcal {B}R-\mathcal {B}+2\), \((2t+1)\mathcal {B}R\), \((2t+1)\mathcal {B}R+t\mathcal {B}+1\) differentially (linearly) active S-boxes, respectively, where \(\mathcal {B}\) is the branch number of the diffusion matrix (of the transposed diffusion matrix).

Proof

If \(\mathcal {D}(r-3)-\mathcal {D}(r-4) \le t \mathcal {B}+1\), \(\mathcal {D}(r)=\mathcal {D}(r-3)+(2t+1)\mathcal {B}\) from Lemma 5. Otherwise \(\mathcal {D}(r) = \mathcal {D}(r-4)+(3t+1)\mathcal {B}+1\). From Lemma 4, \(\mathcal {D}(r-3)-\mathcal {D}(r-4)=t\mathcal {B}+1\) when \(r = 5\) and \(6\), and \(\mathcal {D}(r-3)-\mathcal {D}(r-4)=\mathcal {B}-2\) when \(r=7\). Since \(t>0\), \(\mathcal {B} - 2 < t\mathcal {B}+1\). Thus \(\mathcal {D}(r) = \mathcal {D}(r-3)+(2t+1)\mathcal {B}\) when \(r \ge 5\). Then \(\mathcal {D}(r) = \mathcal {D}(r-3)+(2t+1)\mathcal {B} = \mathcal {D}(r-6)+2(2t+1)\mathcal {B} = \cdots = \mathcal {D}(r-3u)+(2t+1)\mathcal {B}u\). Therefore \(\mathcal {D}(3R+1)= \mathcal {D}(3R-2)+(2t+1)\mathcal {B} = \cdots = \mathcal {D}(1) + (2t+1)\mathcal {B}R = (2t+1)\mathcal {B}R\). Similarly, \(\mathcal {D}(3R+2) = \mathcal {D}(2) + (2t+1)\mathcal {B}R = (2t+1)\mathcal {B}R+t\mathcal {B}+1\), \(\mathcal {D}(3R) = \mathcal {D}(3) + (2t+1)(R-1)\mathcal {B} = (2t+1)\mathcal {B}R-\mathcal {B}+2\). \(\square \)

Now we have comparable bounds for every four rounds of BFN-(SP)\(^{2t+1}\). For the case of \(t>0\), \((\mathcal {D}(r-3)+(2t+1)\mathcal {B})\) is always less than or equal to \((\mathcal {D}(r-4)+(3t+1)\mathcal {B}+1)\). On the other hands, for the case of \(t=0\), \((\mathcal {D}(r-4)+\mathcal {B}+1)\) is less than or equal to \((\mathcal {D}(r-3)+\mathcal {B})\) when \(\mathcal {B} > 2\) and \(r = 4s+3 (s>0)\) (e.g., \(r = 7, 11, 15,\ldots \)). Thus, the bounds for the case \(t=0\) and \(t>0\) are slightly different and those are separately proven. The tightness of these bounds is proven in Sect. 5.

4.2 Differentially active S-boxes in BFN-(SP)\(^{2t}\)

For BFN-(SP)\(^{2t}\), which comprises even number of SP-layers, the minimum number of differentially active S-boxes is straightforwardly proven by observing the number of differentially active F-functions.

Lemma 1 yields the following corollary.

Corollary 2

For BFN-(SP)\(^{2t}\), if \(d^{(i)}\) is not zero, \(d^{(i)} \ge t \mathcal {B}\).

This corollary allows us to prove the following theorem.

Theorem 4

(active S-boxes for BFN-(SP)\(^{2t}\)) For any nonzero input difference (nonzero input mask), every \(3R\), \(3R+1\), \(3R+2\) rounds of BFN (\(R \ge 1\)) with \(2t\) consecutive SP layers in the F-function provide at least \(2t\mathcal {B}R\), \(2t\mathcal {B}R\), \(2t\mathcal {B}R + t\mathcal {B}\) differentially (linearly) active S-boxes, respectively, where \(\mathcal {B}\) is the branch number of the diffusion matrix (of the transposed diffusion matrix).

Proof

We consider \(d^{(i-1)}, d^{(i)}\) and \(d^{(i+1)}\). If \(d^{(i)}\) is zero, then both \(d^{(i-1)}\) and \(d^{(i+1)}\) are not zero due to the invertibility. Thus there exist at least \(2t\mathcal {B}\) active S-boxes from Corollary 2. If \(d^{(i)}\) is not zero, then \(d^{(i-1)}\) and \(d^{(i+1)}\) cannot be zero simultaneously. Therefore there exist at least \(2t\mathcal {B}\) active S-boxes from Corollary 2. Since two consecutive rounds have at least \(t\mathcal {B}\) active S-boxes, \(3R+2\) consecutive rounds have at least \(2t\mathcal {B}R+t\mathcal {B}\) active S-boxes. \(\square \)

Unlike the case of BFN-(SP)\(^{2t+1}\), the lower bounds for BFN-(SP)\(^{2t}\) are easily proven. In the other words, the minimum number of differentially active S-boxes for BFN-(SP)\(^{2t}\) corresponds to the minimum number of differential active F-functions times \(t\mathcal {B}\).

4.3 Differentially active S-boxes in BFN-(SP)\(^{2t-1}\)S

Since the number of S-box layers is the same in BFN-(SP)\(^{2t-1}\)S, similarly to the bounds for BFN-(SP)\(^{2t}\), one derives the following theorem.

Theorem 5

(active S-boxes for BFN-(SP)\(^{2t-1}\)S) For any nonzero input difference (nonzero input mask), every \(3R\), \(3R+1\), \(3R+2\) rounds of BFN (\(R \ge 1\)) with \((2t-1)\) consecutive SP-layers followed by an S-box layer in the F-function provide at least \(2t\mathcal {B}R\), \(2t\mathcal {B}R\), \(2t\mathcal {B}R+t\mathcal {B}\) differentially (linearly) active S-boxes, where \(\mathcal {B}\) is the branch number of the diffusion matrix (of the transposed diffusion matrix).

The obtained bounds for BFN-(SP)\(^{2t}\)S seem almost same as the bounds for BFN-(SP)\(^{2t}\). However, BFN-(SP)\(^{2t}\) has one more P-layer than BFN-(SP)\(^{2t-1}\)S has when the parameter \(t\) is the same. This implies that the last P-layer of BFN-(SP)\(^{2t}\) does not improve the security in terms of the number of differentially active S-boxes.

4.4 Differentially active S-boxes in BFN-(SP)\(^{2t}\)S

Similarly to BFN-(SP)\(^{2t+1}\), BFN-(SP)\(^{2t}\)S has odd number of S-layers. However, lack of the last P-layer allows us to prove the bounds for BFN-(SP)\(^{2t}\)S easily.

Property 2 and Lemma 2 yield the following theorem.

Theorem 6

(active S-boxes for BFN-(SP)\(^{2t}\)S) For any nonzero input difference (nonzero input mask), every \(3R\), \(3R+1\), \(3R+2\) rounds of BFN (\(R \ge 1\)) with \(2t\) consecutive SP-layers followed by an S-box layer in the F-function provide at least \(2(t\mathcal {B}+1)R\), \(2(t\mathcal {B}+1)R\), \((2(t\mathcal {B}+1)R + (t\mathcal {B}+1))\) differentially (linearly) active S-boxes, where \(\mathcal {B}\) is the branch number of the diffusion matrix (of the transposed diffusion matrix).

The proof for BFN-(SP)\(^{2t}\)S is similar to the proofs for BFN-(SP)\(^{2t}\) and BFN-(SP)\(^{2t-1}\)S. In other words, for BFN-(SP)\(^{2t}\)S, the minimum number of active S-boxes can be proven by studying the number of active F-functions. However, the proven bounds are not same as the bounds for BFN-(SP)\(^{2t}\) and BFN-(SP)\(^{2t-1}\)S, since the number of S-layers is different. In the following sections, we discuss tightness of the bounds proven in this section and their optimality.

5 Tightness of bounds

To demonstrate the tightness of the lower bounds, we provide trails that actually attain those proven bounds when the matrices used in the BFNs are MDS. These trails are given in Figs. 4, 5, 6 and 7 for all the BFN constructions in question. Note that a similar observation for BFN-SP with \(m=8\) was given in Appendix A of [26].

Fig. 4
figure 4

Truncated differential trails of BFN-(SP)\(^{2t}\) (left 3-round iterative trail) and BFN-(SP)\(^{2t-1}\)S (right 6-round iterative trail) attaining the lower bounds of Theorems 4 and 5

In the figures, \(\Delta \) and \(\nabla \) denote S-box truncated difference \(100\ldots 00\) (only the first S-box active out of \(m\)) and \(111\ldots 11\) (all \(m\) S-boxes active), respectively. Thin boxes and thick boxes denote S-box layers (S-layers) and linear layers (P-layers), respectively. XORs with difference cancellation are marked with dashed circles. Differentially active S-box layers are denoted by grey. The underlined numbers denote the minimum numbers of active S-boxes in the area indicated by a dashed line.

From the discussions in Sect. 3, the following observations are directly applicable to the case of the linear cryptanalysis.

Fig. 5
figure 5

Truncated differential trails of BFN-(SP)\(^{2t}\)S (3-round iterative trail) attaining the lower bounds of Theorem 6

Fig. 6
figure 6

Truncated differential trails of BFN-SP attaining the lower bounds of Theorem 2

Fig. 7
figure 7

Truncated differential trails of BFN-(SP)\(^{2t+1}\), \(t>0\) (6-round iterative trail) attaining the lower bounds of Theorem 3

5.1 BFN-(SP)\(^{2t}\)

The left side of Fig. 4 shows a 3-round iterative path that maps \((0, \Delta )\) to \((0, \Delta )\) for BFN-(SP)\(^{2t}\). In other words, the \(i\)-th round input difference \((\Delta X_L^{(i)}, \Delta X_U^{(i)}) = (0, \Delta )\) and the \((i+3)\)-th round input difference \((\Delta X_L^{(i+3)}, \Delta X_U^{(i+3)}) = (0, \Delta )\). Note that, since we use an untwisted form in Fig. 4, an output difference looks reverse in the case of odd number of rounds. The numbers of active S-boxes provided by this figure correspond to the bounds proven in Theorem 4. For instance, the numbers of active S-boxes for 3, 4, 5 and 6 rounds given by the figure are \(2t\mathcal {B}\), \(2t\mathcal {B}\), \(3t\mathcal {B}\) and \(4t\mathcal {B}\), respectively, which correspond to the proven bounds. Since the path is 3-round iterative, it shows that the proven bounds are tight.

5.2 BFN-(SP)\(^{2t-1}\)S and BFN-(SP)\(^{2t}\)S

The right side of Fig. 4 shows a 6-round iterative path for BFN-(SP)\(^{2t-1}\)S that maps \((0, \Delta )\) to \((0, \Delta )\). There does not exist a simple 3-round iterative path, since the output difference of the F-function will be \(\nabla (\Delta )\) when the input difference of F-function is \(\Delta (\nabla )\). However, those become iterative when considered over 6 rounds. The paths shown in the figure provide \(2t\mathcal {B}\) active S-boxes for 3 rounds and prove the tightness of the bounds proven in Theorem 5. Figure 5 shows a 3-round iterative path for BFN-(SP)\(^{2t}\)S that attains the lower bounds proven in Theorem 6.

5.3 BFN-SP

The paths of Fig. 6 for BFN-SP consist of iterative paths and an additional path. In the case of \((4R+3)\) rounds, the tightness is easily proven by the right side of Fig. 6. In the other cases (\(4R\), \(4R+1\) and \(4R+2\) rounds), paths consist of some consecutive 4-round iterative paths on the left and one 4-round additional path in the center of Fig. 6. Each path for \(4R\) rounds consists of \((R-1)\) consecutive 4-round iterative paths and one 4-round additional path. Also paths for \(4R+1\) and \(4R+2\) rounds consist of \(R\) consecutive 4-round iterative paths and one 4-round additional path. For example, a path for 12 rounds of BFN-SP consists of two consecutive 4-round iterative paths followed by one 4-round additional path. Similarly, a path for 13 rounds consists of three 4-round iterative paths (12 rounds) followed by the first one round of the 4-round additional path (1 round).

5.4 BFN-(SP)\(^{2t+1}\), \(t > 0\)

Figure 7 shows a 6-round iterative path that attains the bounds proven in Theorem 3. The path starting from the \(i\)-th round shows the tightness for \(3R+1\) and \(3R+2\) rounds. The path starting from the \((i+2)\)-th round shows the tightness for \(3R\) rounds.

6 Optimality

In this section, it is proven that BFN-SPS and BFN-SPSP are the most efficient with respect to the efficiency metric \(E_m\) of Definition 5. Recall that \(E_m\) shows the ratio between active S-boxes and all S-boxes when the number of rounds is sufficiently large. Table 2 contains the computation of \(E_m\) for the different BFNs in question. The optimality result is formulated as follows.

Theorem 7

When instantiated with MDS matrices for \(m\ge 2\), BFN-(SP)\(^{2t}\) and BFN-(SP)\(^{2t-1}\)S provide a higher or equal proportion of active S-boxes than BFN-SP, BFN-(SP)\(^{2t+1}\) and BFN-(SP)\(^{2t}\)S for any number \(t\) of layers. Thus, BFN-SPSP and BFN-SPS are optimal with respect to \(E_m\).

Table 2 \(E_{m}\) for BFNs with SP-type functions and MDS matrices

Proof

We compute the values of \(E_m\) for all BFN constructions with MDS matrices in Table 2 and compare \(E_m=\frac{m+1}{3m}\) for BFN-(SP)\(^{2t}\) and BFN-(SP)\(^{2t-1}\)S to \(E_m\) for

  • BFN-(SP)\(^{2t+1}\). From Table 2, one immediately observes that \(\frac{m+1}{3m}\) is no lower than \(E_m\) for BFN-(SP)\(^{2t+1}\).

  • BFN-SP. For \(m\ge 2\), the difference \(\frac{m+1}{3m}-\frac{m+2}{4m}=\frac{m-2}{12m}\ge 0\) and \(E_m\) for BFN-SP is no higher than \(\frac{m+1}{3m}\).

  • BFN-(SP)\(^{2t}\)S. In this case, one has to analyze \(\frac{2t(m+1)+2}{3(2t+1)m}\) as a function of \(t\). After taking the value of 0 for \(t=-\frac{1}{m+1}\), it grows monotonously for all \(t>0\) and attains its maximum at the infinity. Since

    $$\begin{aligned} \lim _{t\rightarrow \infty }\frac{2t(m+1)+2}{3(2t+1)m} =\frac{m+1}{3m}, \end{aligned}$$

    \(E_m\) for BFN-(SP)\(^{2t}\)S is no higher than \(\frac{m+1}{3m}\).

Thus, \(E_m\) for BFN-(SP)\(^{2t}\) and BFN-(SP)\(^{2t-1}\)S is no lower than that for BFN-(SP)\(^{2t+1}\), BFN-SP, and BFN-(SP)\(^{2t}\)S, which yields the first claim of the theorem. The second claim follows from choosing \(t=1\). \(\square \)

7 Conclusion

In this work, we considered a wide class of balanced Feistel networks with any number of interleaved S-box layers and linear diffusion layers in their F-function. In this class, we demonstrated that SPS and SPSP F-functions are arguably optimal with respect to the relative number of active S-boxes provided. Our results indicate that one SP-layer in the F-function is not enough to attain optimality, whereas taking more than two S-box layers does not increase the efficiency either. The optimality is shown with respect to the security of a cipher towards differential and linear cryptanalysis.

As nearly any SPN-based block cipher, BFNs with SP-type F-functions exhibit the differential effect—many differential trails contributing to the same differential. Having SPS or SPSP constructions as F-functions—as in the optimal constructions of this paper—simplifies the consideration of upper bounds on the differential probability over several rounds. The work [1] proves that the maximum average differential probability over 3 rounds for a BFN with bijective F-functions is upper-bounded by \(\pi ^2\), where \(\pi \) is the maximum differential probability of the F-function. At the same time, the maximum differential probability of an SPS or SPSP construction with MDS diffusion is known to be upper-bounded by \(p^m\), where \(p\) is the maximum differential probability of the underlying S-box [17]. This provides an upper bound of \(p^{2m}\) on the average differential probability over 3 rounds of BFN-SPS and BFN-SPSP. Similar considerations apply to the linear probability. However, capturing the differential or linear hull effect for an arbitrary number of rounds and incorporating it into the efficiency metric appears to be a challenging task.

Besides BFNs, generalized Feistel networks (GFNs) are often used in the design of block ciphers. Both CLEFIA [28] and PICCOLO [23] follow this design approach with SP-type F-functions. We conjecture that our optimality result also applies to any GFN under the definition of [10]. In other words, our conjecture is that the instantiation of the F-function with SPS and SPSP will be optimal with respect to the relative number of active S-boxes. We leave this as an important open problem.