1 Introduction

Cryptography is currently being used to obscure information to secure the digital marketing transactions over the network. In this contemporary world, cryptography is a technique to convert an original message into coded format (by using ciphers) with the purpose to create confusion in the data and converting back coded data into plaintext at the receiver end (decryption). In digital marketing, a security concern is a big challenge. The speedy, hassle free, and reliable transactions via the internet are a necessity of the digital age as the general public is totally relying on the internet for routine life activities and marketing transactions. In the connection to data protection, confidentiality, authentication, non-repudiation and data integrity are core concerns. These concerns motivate computer scientists for designing cryptographic algorithms to bring ease to an individual’s life. Cryptography is divided into two branches, called as substitution and transposition. In substitution, each letter of the message alters its identity but maintains its position whereas in transposition letters of the actual data are rearranged [29] .Two types of cryptographic models exist for secure digital communications i.e., Symmetric and Asymmetric. In symmetric encryption technique, only one secret key is used for encryption and decryption of plaintext. A cipher system is called symmetric if the decryption key can be easily guessed from the encryption key [43]. The key must be exchanged among both parties with the aim to use in the encryption and decryption process. Whereas, in an asymmetric encryption scheme, two keys are used to encrypt and decrypt the message: one is public and the other is private [41]. Actually, data is “scrambled” using private encryption methods, making it meaningless to the third party which does not have the crypto key to decrypt it. Once the message has been transmitted to the chosen receiver who has the key, the algorithm returns the cipher text to its original message. Secret key is a specific password/code or a random number which is generated through some secure key generator algorithms. Symmetric encryption algorithms are divided into two categories: Block algorithms: The data bits are bound into blocks and encrypted using a particular secret key. As the encryption is in process, the system holds the data into memory till the completion of the block. In stream Algorithms, there is no concept of storing data into memory, data is encrypted as it streams.

Asymmetric-key algorithms are known as “public-key algorithms”. They employ public and private keys, which are logically connected to each other. The pair of public and private keys is considered a ‘key-pair’. From key-pair, one is used for data encryption, while the other is used in the decryption process. Public key is accessible to the general public, while the owner keeps the private key hidden at all times. There is no logical connection between public and private so, one cannot be used to determine the other. To preserve an individual’s privacy, the Statistical Transformation with 3-Dimensional Shearing Algorithm (ST3DSA) distorts the data utilizing three-dimensional shearing transformations, information value, and weight of evidence even without exchanging the mining results. For the sake of analysis, the recommended ST3DSA procedure is applied to both numerical and categorical parameters in the bank marketing, adult income, and lung cancer datasets. The experimental results confirmed that ST3DSA has the capacity to cope with a person’s privacy issues [30]. In order to secure cloud storage and sharing, Kumar et al. [55] presented a certificate-less encoding technique. In this technique, the user gets character-based cryptography to provide attribute - based encryption in the certificate less encryption scheme. Clients can design their own personality via open and private key match using any open public key encryption algorithm. The general public key is sent to the cloud with the character for approval. After confirming the personality of the comparing client, the cloud will generate two or three open keys and private keys. So, each user has their own open public and private key and both of these keys are generated by the cloud. This scheme is composed of RSA and Blowfish algorithms and both of these algorithms belong to a symmetric cryptosystem. Secondly, it is important to know all codes being used in cryptography as code performs real mathematics like twisted path code, shift code, and reverse tail etc. [27]. There are numerous strong arguments why algorithms/crypto codes should be made public. Actually, it is incredibly easy to make a minor change and create a weak cryptographic algorithm. If the algorithm is not accessible to general public, no one will notice the flaw until an attacker attempts to exploit it [17].

In 1997, the National Institute of Standards and Technology started the process of selecting cryptographic encryption algorithms. Finally, the Rinjdael Algorithm (Advanced Encryption Standard) of Symmetric Key Cryptosystem was accepted as a standard for secure digital communication. This recognition led to its widespread usage in the implementation of critical applications requiring reliable architecture. The only nonlinear key component of AES is S-box which is used to obscure the correlation between plaintext and ciphertext and produces resistance to cyber-attacks. Actually, S-box is the indispensable part of a secure block cipher scheme which is also used to build relations between key and ciphertext as complex as possible. The nonlinearity, strict avalanche criterion and differential uniformity are considered as fitness functions in the evaluation of S-box. Crypto S-box should have different and unique values and there should not be a correlation among the values to assure the quality of S-box. Various variants of S-box are more robust and capable of providing interesting properties to cryptographic algorithms. There are several S-box construction methodologies presented in literature to design high quality S-boxes with increased algebraic complexity, nonlinearity and other good cryptographic properties. Chaotic maps play an important role in the generation of random numbers to construct quality S-boxes by using heuristic techniques. A significant number of articles is available in literature based on chaotic maps to generate dynamic and complicated S-boxes.

This article primarily focuses on S-Box analysis for the purpose of securing the digital transmission of multimedia:

  1. 1)

    A detailed review of S-box and its basic anatomy, and an extensive analysis of current S-box construction methodologies.

  2. 2)

    A comparative analysis of old known and the latest newly proposed S-boxes for investigating their cryptographic features.

  3. 3)

    A Comprehensive study of S-box performance evaluation criteria including nonlinearity, strict avalanche criterion, bit independence criterion, differential uniformity, linear probability, and fixed points is conducted. In addition, we propose the challenges in order to judge the quality of an S-box.

Organization of the Paper: The rest of the arrangement of this article is given as follows. In Section 2, we reviewed the S-box construction techniques and their mathematical formulas as presented in literature. In Section 3, a study of S-box performance evaluation criterion is discussed. In Section 4, a comparative study and challenges are presented. In Section 5, a conclusion is given Fig. 1.

Fig. 1
figure 1

Block diagram of 8*8 S-box

2 S-box structure and key methodologies for S-box construction

Security development is a key to communication technologies. Currently, advanced encryption standard (AES) is considered to be a reliable encryption algorithm for encryption of electronic data for secure transmission. A variety of approaches have been made to evaluate the security and performance of AES and it is important to note that Substitution box (S-box) is key in designing and implementing of AES and it is used in the development of advanced cryptographic algorithms as well. S-box is a confusion component and mathematically it is represented as a mapping S: {0, 1}n → {0,1}n, where {0,1}n denotes a vector of n elements from Galois field GF(2n). In other words, S-box receives n input bits and alters them into m output bits. It’s referred to as S-box and frequently utilized as a lookup table. A high-quality S-box is considered to have statistical properties of cryptography like confusion and diffusion that make it resistant to various attacks and in cryptosystem, playing a vital role in cryptography for data transmission and secure communication over the network. Generally, S-box is a table of m rows and n columns and each element of S-box should be an integer with size of one byte and in range of {0,1, 2,3,….,255}.

2.1 Key methodologies for S-box construction

S-box is the only nonlinear component in the AES encryption process which is fixed throughout the algorithm and due to its fixation, the cryptanalysts tried to exploit this weakness. The key idea of the research is that the background Galois field and its underlying primitive polynomial have an impact on the S-boxes’ function and performance. This finding leads to the interesting notion that, rather than developing new algorithms, improving current algorithms is worth researching since it is the least time-consuming yet more successful. An S-box is carefully chosen to give security against any linear or differential cryptanalysis. The Mobius transformation, complete Latin square, linear trigonometric transformation, affine transformation and Lightweight S-box construction methodologies are explored for the structure of newly constructed S-boxes because of its wider use and complicated features. So, the researchers are currently working on construction of those dynamic S-boxes which are more immune against linear and differential attacks. We have covered review of those S-boxes’ construction methodologies in this section.

2.1.1 A. Linear Trigonometric Transformation

An n × n S-box is generated through following trigonometric transformation [69].

$$ T(z)= Cos\left(\left(A+B\right)\ast X\ast z+C\right) $$

Where,

$$ {\displaystyle \begin{array}{c}0\le z\le \left({2}^n-1\right),\\ {}n=8,\\ {}\begin{array}{c}\kern1.5em 0<X<1,\\ {}\kern2.5em A,C\in \left\{1,3,\dots, {2}^n-1\right\},\\ {}\begin{array}{c}\kern0.5em B\in z\ and\\ {}\kern1.25em B= Array\ of\ size\ 256\end{array}\end{array}\end{array}} $$
(1)

The values of the variables A, B, C, and X in the above equation are used by the encryption key. This equation is capable of producing dynamic S-boxes using different values of above-mentioned parameters.

2.1.2 S-box construction using complete Latin Square

A Latin square is a square matrix and consists of rows and columns with the condition that each row and column contains unique values. Complete Latin square can be constructed in one of the following ways.

Method 1: Latin square of size N × N (N = 2y and y∈ {1, 2, 3, …, N} ) is generated as

$$ 0,1,2m-1,2,2m-2,3,2m-3,\dots, m-1,m+1,m $$
(2)

Method 2: Primitive root g of prime number N + 1 is calculated as

$$ {a}_i={g}^i\mathit{\operatorname{mod}}\ N+1,i\in \left\{1,2,3,\dots, N\right\} $$
(3)

Determine the primitive roots g of N + 1; if N + 1 is a prime number. For completion of Latin square, uses of chaotic sequences are very important and generated by the enhanced logistic map. Logistic map is comprised of following three steps for S-box construction [21].

  1. 1)

    Latin Square Generation

  2. 2)

    Generation of two orthogonal matrices

  3. 3)

    Using chaotic sequences, S-box is constructed by scrambling the orthogonal matrices.

2.1.3 Mobius Transformation and Permutation Function

Linear transformation function is also known as Mobius transformation and it is written as given below

$$ \boldsymbol{Z}\left(\boldsymbol{x}\right)=\frac{\boldsymbol{ax}+\boldsymbol{b}}{\boldsymbol{cx}+\boldsymbol{d}} $$
(4)

Where, ad − bc ≠ 0 and a, b, c and d belongs to Galois Field (GF)

The Galois field GF has a fixed number of elements and it is applicable on data which has a vector representation. A finite field has addition and multiplication operations. In cryptographic encryption Galois Field is represented as Mn, where M = 2 and 256 order Galois field GF(28) is used in AES. In this transformation, elements are represented as bytes (one byte contains 8 bits) which are denoted as a polynomial with coefficients from binary Galois field GF(2) as

$$ \kern3.25em {a}_7{x}^7+{a}_6{x}^6+{a}_5{x}^5+{a}_{47}{x}^{47}+{a}_3{x}^3+{a}_2{x}^2+{a}_1x+{a}_0 $$
(5)

Permutation is the process of rearranging the function f from set D into C such that each element of D has unique output as f : D → C is one to one if and only if f(x) = f(y) implies x = y.

Function f : D → C is onto if for each element c ∈ C, there exists some d ∈ D with f(d) = c. A function which has properties of one-to-one and onto is known as bijective function. Total number of permutations is given by n! for a set of n elements. Linear fractional property is utilized for construction of S-box and its application on Galois Field (2n, n = 8) having elements from 0 to 255 in decimal form. By using GF(2n), the generated S-box will consist of 256 elements. Traditional AES S-box uses fixed irreducible polynomial P(x) = x8 + x4 + x3 + x + 1 to find out multiplicative inverse. In [16, 40], P(x) = x8 + x4 + x3 + x2 + x + 1 polynomial was also used for construction of S-box and here is a complete procedure as discussed in [40].

  1. i.

    Chosen linear fractional transformation f\( (z)=\frac{35z+15}{9z+5} \) and P(x) = x8 + x4 + x3 + x2 + x + 1

  2. ii.

    Calculating f(z)by taking all values of GF(28)

  3. iii.

    Matrix generated from linear fractional transformation

  4. iv.

    Applied permutation to the matrix and produced required S-box.

2.1.4 S-box construction based on Bent Function and Logistic Chaotic System

Here are the steps to construct S-box with high nonlinearity based on bent function and discrete chaotic system [26].

  1. i.

    Logistic chaotic system

  2. ii.

    Constructing Boolean function g(x)

  3. iii.

    Construction of Maiorana MC Farland type bent function

  4. iv.

    Construction of MM type Multi output bent function

  5. v.

    Constructed S-box did not meet the standard properties as S-box should have unique values from 0 to 256. To maintain this property; all duplicate values will be removed and replaced with missing numbers.

  6. vi.

    Generating S-boxes with high nonlinearity

S-boxes generated through this method to be used in symmetric cryptography techniques/algorithms and made full use of the properties of chaotic system.

2.1.5 S-box construction using discrete chaotic maps and cuckoo search algorithm

This method has used the properties of 1-D discrete chaotic maps including pseudo random sequence generator to design optimized S-box with high values of bijectivity, nonlinearity, strict avalanche criteria etc. Here is a pseudo code to construct S-box based on chaotic map and cuckoo search algorithm [3].

  1. i.

    A discrete chaotic map is chosen

  2. ii.

    Set the chaotic map’s starting settings

  3. iii.

    Start iteration from 0 to n for chaotic map i.e. S-box[i] = i + 1

  4. iv.

    Keep continue the iteration process until all cells of S-box have been filled

  5. v.

    Use the generated S-box as initialization for optimization

  6. vi.

    Applied Cuckoo search algorithm

  7. vii.

    Compute Nonlinearity as a fitness function

This was all about the process of S-box generation using a chaotic system with the use of a cuckoo search algorithm. However, cuckoos are interesting birds due to their behavior during egg-laying or hatching period as they prefer to store their eggs into nests of cuckoos to increase the hatchability or to gain higher possibility of feeding. Hers are the steps for a cuckoo’s search to design a S-box

  1. i.

    Host nest generation using chaotic map

  2. ii.

    Get cuckoos using levy flight

  3. iii.

    Maintain balance

  4. iv.

    Evaluate fitness

  5. v.

    Select a nest (S-box), n randomly

2.1.6 Affine Transformation Based S-box Construction

S-box construction consists of three affine equations with their corresponding convenient vector. Through affine transformation, there are following ways to design compact nonlinear substitution boxes [13].

  • Forward S-box construction

  • Backward S-box designing

The steps involved in the forward S-box construction are explained as follows

  1. i.

    As per irreducible polynomial f(z) = x8 + x5 + x4 + x3 + x2 + x + 1, taking the multiplicative inverse of 256 values.

  2. ii.

    Compute affine transform and XORed the output with the first constant vector

  3. iii.

    Repeated the above procedure with the second affine transform and the output XORed with the second constant vector.

  4. iv.

    Kept repeating the same procedure with the third affine transform and the result XORed with third constant vector. In this way, by applying multi affine transformation, a compact nonlinear S-box will be generated.

2.1.7 Lightweight 8-bit S-box construction

In 8-bit S-box construction each element is replaced with its inverse and applied suitable affine transformation. To make it convenient, the selected field inversion composite field was F(24)2. Let ax + b is a random number in the said function, where a ϵ 4 most signficant bits, b ∈ 4 least significant bits. The inversion cx + d = (ax + b)−1 can be calculated as follows [46].

$$ \kern1.5em {\left( ax+b\right)}^{-1}=a{\left({a}^2B+ abA+{b}^2\right)}^{-1}x+\left(b+ aA\right){\left({a}^2b+ abA+{b}^2\right)}^{-1} $$
(6)

Where, A = 1, B = λ, the simplified form of the equation is

$$ {\left( ax+b\right)}^{-1}=a{\left({a}^2\lambda +b\left(a+b\right)\right)}^{-1}x+\left(b+a\right){\left({a}^2\lambda +b\left(a+b\right)\right)}^{-1}={aD}^{-1}x+\left(b+a\right){D}^{-1} $$
(7)

The irreducible polynomials are written as follows

$$ {F}_2={F}_{2^2}\kern2.5em {\mathrm{Q}}_0(x)={x}^2+x+1 $$
(8)
$$ {F}_{2^2}={F}_{{\left({2}^2\right)}^2}\kern1.25em {Q}_1(x)={x}^2+x+\varnothing $$
(9)
$$ {F}_{2^4}={F}_{{\left({2}^4\right)}^2}\kern2em {Q}_2(x)={x}^2+x+\lambda $$
(10)

To ensure the irreducible property of above polynomials, the values of ∅and λ need to be selected carefully.

2.1.8 Lightweight S-box Designing for IoT Devices

The term “interconnection” is used to describe the internet of things between the physical computing devices or devices embedded with sensors, processing power having ability to communicate and exchange data with other devices. Developing IoT based devices without security does not make sense and definitely will make the users’ privacy vulnerable. So, IoT integration with security protocols is a challenge. IoT architecture is explained as follows:

For IoT applications, the three-layer (Perception, Network and Application) architecture has become the most popular paradigm.

  • Perception: This layer contains the sensor devices. This is where the information originates. The information might come from any of the linked device’s sensors.

  • Network: The network layer defines how huge volumes of data are transferred across an application. This layer connects all of the devices together and transmits data to the relevant back-end applications.

  • Application: This is the layer that consumers view. This might be an app for controlling a device in an intelligent environment, or a platform that displays the integrity of the system’s components. IoT architecture is illustrated in Fig. 2.

Fig. 2
figure 2

IoT Architecture

In connection with IoT, Alshammari et al. [4] presented a symmetric lightweight cryptosystem in IoT devices using a chaotic S-box, based on the conventional AES. The permutation and diffusion phases are implemented using the Hilbert curve scan pattern and the Lorenz system. Here are the key steps to generate IoT based S-box

  1. i.

    Create a set of Chaotic Map of 2048 bits

  2. ii.

    Generate empty binary set S

  3. iii.

    Iterate 200 times following Lorenz Map equation

$$ {\displaystyle \begin{array}{c}{x}^{\prime }=a\left(y-x\right),\\ {}\begin{array}{c}{y}^{\prime }= cx-y- xz\\ {}{z}^{\prime }= xy- bz\end{array}\end{array}} $$
(11)
  1. iv.

    Iterate 2048 times above equation and represent the current values as x, y, z. x is obtained as x =  mod (floor(104 ∗ x), 2)

  2. v.

    Divide the empty set S into 8 binary sub sets of 256 bits for getting m Boolean functions.

  3. vi.

    S-box created

In context of IoT security, Sohal et al. [59] uses three technologies for identification of hostile edge devices in a fog computing environment: the Markov model, the Intrusion Detection System, and the Virtual Honeypot Device.

2.1.9 S-box Construction using Subgroup of Galois Field (GF)

In all types of block ciphers, S-box is used to create confusion in the data as it is only nonlinear transformation component. Size of the S-box depends on the application’s scope. In [56], S-box is designed based on fifteen order subgroup of the GF of order 256. The method is explained as follows:

  1. i.

    Inversion function representation p: Subgroup named as K15

$$ {\displaystyle \begin{array}{c}p:{K}_{15}\cup \left\{0\right\}\to {K}_{15}\cup \left\{0\right\}\ by\ 0\\ {}\kern0.5em {y}^{-1}, if\ y\in {K}_{15}\\ {}0, if\ y=0\end{array}} $$
(12)

Here is a way to calculate inversion function P(y)

  1. ii.

    Linear scalar multiplication is defined as

$$ {\displaystyle \begin{array}{c}q:{K}_{15}\cup \left\{0\right\}\to {K}_{15}\cup \left\{0\right\}\ by\\ {}\kern0.75em q(y)= uy, For\ all\ y\in {K}_{15}\cup \left\{0\right\},u\in {K}_{15}\ is\ fixed\ \end{array}} $$
(13)
  1. iii.

    In the result of composing p and q, 8 × 8 bit S-box would be produced.

As discussed above fifteen linear scalar functions can be defined on the basis of K15 ∪ {0}. So, therefore as a result 15 different S-boxes would be constructed Tables 1 and 2.

Table 1 Summary of Novel Methodologies
Table 2 Boolean function representation for n = 2

2.2 Analysis of above discussed S-box design methodologies

A block cipher is an encryption method that divides the plaintext into fixed-length groups called blocks. Encryption is carried for the whole block rather than one bit at a time as in stream ciphers. Confusion and diffusion are two wide ranging operations of secure cipher which were proposed by Claude Shannon in the domain of cryptography. The diffusion deals with the changing effects of a single bit of plaintext to cipher text to obscure the relationship between plaintext and cipher text and vice versa. Confusion means each bit of cipher text alters as per change in key or plaintext that obscures the relationship between plaintext and key. The concept of round recurrence plays a key role to attain the concepts of confusion and diffusion. Block cipher cryptosystem consists of four major conversions: substitution, permutation, mixing and key adding. We frequently reuse certain well-known and extensively used S-boxes, but there is always a need for new S-box construction methods for ensuring confusion properties of the cryptosystem. In the discussed cryptosystems, connected chaotic systems and different S-boxes are employed. The chaotic range is improved by combining chaotic systems, which is important for increasing the level of confusion in cryptosystems. So, the main contribution of S-box in discussed methodologies is to create confusion while also enhancing security and robustness. Linear Cryptanalysis is a powerful method used widely on block cipher for cryptanalysis and it is an effective method for developing powerful and complex attachments. By using the concept of high probability and taking its advantage, it posits a relationship between plaintext bits, ciphertext bits and the key. This one is a well-known plaintext attack where the attacker has random information about plaintext bits and corresponding ciphertext bits. Therefore, a linear approximation is a way to find out the action of cipher, i.e. Encrypted text = f(plaintext, key). Linear cryptanalysis refers to mod-2 bit-wise chain of exclusive-OR operation denoted by ⨁. \( {X}_{i_1}\bigoplus {X}_{i_2}\bigoplus \dots \bigoplus {X}_{i_n}\bigoplus {Y}_{j_1}\bigoplus {Y}_{j_2}\bigoplus \dots \bigoplus {Y}_{j_m} \)= \( {Z}_{k_1}\bigoplus {Z}_{k_2}\bigoplus \dots \bigoplus {Z}_{k_{w.}} \)

Xi represents the ith bit of input set X = [X1, X2, …], Yj indicates the jth bit of output Y = [Y1, Y2, …] and Zk represents the kth bit of the key. While performing linear cryptanalysis it is assumed that everything is known except key and requires following three steps.

  1. i.

    Find linear approximation of nonlinear parts of the algorithm usually through S-boxes.

  2. ii.

    Linear approximation is a method used to relate the plaintext bits, ciphertext bits and the bits of key. Pairing the linear approximation of S-boxes with other linear operations of encryption algorithm to compute linear approximation of entire encryption algorithm.

  3. iii.

    Linear approximation works as a guide for which key to try first.

Differential cryptanalysis refers to a set of blueprints for tagging differences in a network of transformations, finding places where a cipher exhibits organized behavior, and using such attributes to recover the private key. Specifically, it is used in the study of block cipher to determine the result in the encrypted ciphertext in case of change in plaintext. It is important because a change in a specific pattern in encrypted text may signify weakness in the encryption scheme. This one is also known as chosen attack as the attacker has a choice to select input and corresponding output in an attempt to derive a key. In this cryptanalysis, a pair of inputs (X, X′′) chosen by the attacker, to satisfy ∆X a corresponding ∆Y value, occurs with high probability. The pair (∆X, ∆Y) referred to as a differential. Chaotic S-boxes are being used in the domain symmetric encryption algorithms. It is, therefore, by the use of chaos properties and Bent function; S-boxes are generated with high nonlinearity. In this technique, a set of bent functions is constructed as an output of an S-box to construct a high quality S-box. Randomness is an important index in a digital chaotic system. Lightweight S-boxes are required to ensure the security of IoT devices. Nonlinear 4 × 4 S-boxes are suited to internet of things (IoT) applications and the structure is developed in the Galois field GF(24and GF(22)2 . The field is realized by taking the multiplicative inversion followed by an affine transformation. The affine transmission is carried out in the Galois Field GF(24). The constructed S-box with basic mathematical structure demonstrates 5% in the gate equivalent area. For 4-bit input, 4-bit Boolean function (BF) gives 1-bit output that is represented as a 16-bit output vector. Similarly, the truth table of a 16-bit Boolean function is represented by a 16-bit vector. The 4-bit sequential input produces 16 rows, each bit specifying the position at the same column and consists of bits and thereby four 4-bit input vectors are generated with the help of four 16-bit columns which are commonly used for all types of 4-bit Boolean function. So, 16 output bits created 216 possibilities and the equivalent decimal varies between 0 to 216. Hence, for four 16-bit input vectors there are 216 possible 16-bit output vectors. The 8-bit Boolean function provides 1-bit output for 8-bit input and displays the output in the form of a 256-bit vector while the truth table is shown by 256-bit output vector using 8-bit Boolean function. The 8-bit sequential inputs contain 256 rows and provide an 8-bit input vector which is common for 8-bit Boolean function. Finally, we can say four valued 4-bit functions can be used to represent a 4-bit S-box.

2.3 Review of Boolean function and its algebraic form

Modern cryptosystems rely heavily on substitution boxes (S-boxes) and Boolean functions. If a function f : GF(2n) → GF(2) receives n tuples as input and produces one of the two output bits {0, 1} [10]. If a Boolean function takes 4-bit input {x1, x2, x3, x4}, produces 16 combinations of decimal values varying from 0 to 15 and for each combination of input, it will provide 1-bit output. This input-output relationship gives a 16-bit output vector against four 16-bit vectors, each one attached to 4-bit input as mentioned above. Boolean’s function is a mapping from (0, 1)4 to (0, 1)1 and algebraically with 16 coefficients is denoted as

$$ \mathrm{F}\left(\mathrm{x}\right)={a}^0+\left({a}^1.{x}^1+{a}^2.{x}^2+{a}^3.{x}^3+{a}^4.{x}^4\right)+\left({a}^5.{x}^1.{x}^2+{a}^6.{x}^1.{x}^3+{a}^7.{x}^1.{x}^3.+{a}^8.{x}^2.{x}^3+{a}^9.{x}^2.{x}^4+{a}^{10}.{x}^3.{x}^4\right)+\left({a}^{11}.{x}^1.{x}^2.{x}^3+{a}^{12}.{x}^1.{x}^2.{x}^4+{a}^{13}.{x}^1.{x}^3.{x}^4+{a}^{14}.{x}^2.{x}^3.{x}^4\right)+{a}^{15}.{x}^1.{x}^2.{x}^3.{x}^4\dots $$

Where x belongs to 4-bit input {x1, x2, x3,x4} and denotes the hex value or decimal value, Boolean function assumes 1-bit output, ‘+’ and ‘.’ represents XOR and AND operations respectively.

For example, if n = 2 and GF(2n)→ GF(2) mapping is given as below

$$ f\left(x,y,z\right)=x.y\bigoplus z $$

Thus, a single input bit results in a single output bit in a Boolean function, but an S-box is made up of multiple Boolean functions with distinct outputs.

2.4 Review of advanced encryption standard

Advanced Encryption Standard (AES) has variable rounds and the number of rounds depends on the length the of key. Each round comprises 4 functions: Sub-Bytes, Shift Rows, Mix Columns and Add round key. Here is a working scenario of AES

2.4.1 AES encryption process

This encryption process is organized into following four sub-processes:

  1. A.

    Byte Substitution

The 16 input bytes are replaced by searching up an AES S-box. The end result is a matrix of size 4 × 4.

  1. B.

    Shift Row

In this phase each row of the matrix is shifted cyclically to the left. The result is a new matrix of the same size and same values but on different positions.

  1. III.

    Mix Columns

In this round, a special mathematical function takes four bytes from one column as input and returns four completely new bytes that replace the original column.

  1. IV.

    Add Round Key

The matrix’s 16 bytes are now treated as blocks of 128 bits and are XORed with the round key’s 128 bits. If this is the final round, the output will be the ciphertext.

Moreover, the decryption process is also organized into four sub-process as discussed in the encryption process but in reverse order. Encryption and decryption process is illustrated in the following Fig. 3.

Fig. 3
figure 3

AES Encryption and Decryption Stages

3 Performance evaluation criteria

An S-box is the only nonlinear component of block cipher and it should be resistant to various linear and differential attacks. S-box is considered to be secure if it generally achieves number of criteria, such as nonlinearity, strict avalanche criteria, bijection, bit independence criterion, input/output XOR distribution and maximum linear probability.

  1. A.

    Nonlinearity

An S-box is a nonlinear component and nonlinearity can be defined as a distance between the Boolean function and the set of all affine functions. S-box has a nonlinear mapping between ciphertext and plaintext as linear mapping is easily understandable by hackers or crackers to get the plaintext from ciphertext. In other words, nonlinearity refers to the change in number of bits in the truth table to reach the nearest affine function. Higher nonlinearity assures that S-box is more immune against linear attacks. The nonlinearity of S-box or upper bound can be calculated as follows [18]

$$ {NL}_f={2}^{n-1}-\frac{1}{2}\underset{\alpha \in {F}_2^n}{\max}\mid {W}_{f\left(\alpha \right)}\mid $$
(14)

AES S-box has nonlinearity is 112. However, for GF(28), the optimal value of nonlinearity is 120 and best known value of nonlinearity found in literature is 114.5 [14]. Nonlinearity comparison of S-boxes, yielded by various methods found in literature given in Table 3.

Table 3 Nonlinearity of well-known S-boxes
  1. B.

    Bit Independent Criterion

The output bit independent criterion (BIC) is another desirable parameter to ensure the quality of S-box which was first introduced by Webster and Tavares [64]. BIC refers to the change of one bit in plaintext or key for a given set of values (avalanche vector), all the pairwise output values of respective set (vector) should be independent. The avalanche vector Vei has two co-efficient components Xth and Yth which are used to measure the bit independent criterion of overall pairs (P, Pi) of avalanche vector which differ only in bit i Pi = (P ⨁ ei).

$$ {BIC}^{ei}\left({d}_j,{d}_k\right)=\mid corr\left({d}_j^{ei}+{d}_k^{ei}\right)\mid $$
(15)

BIC should be in the range of 0 and 1. In the best case it is equal to zero while in the worst case it is equal to 1.

  1. C.

    Strict Avalanche Criterion:

In cryptography, avalanche effect is the desirable condition for block ciphers. Strict avalanche refers to the change of a single bit in plaintext or key results in a change of 50% of output bits. In the case of high-quality S-box, this slight change in input should cause a drastic change in ciphertext. If an error occurs in encrypted data on a wireless channel, the decryption process at the destination results in 50% original message to be an error due to SAC effect. In other words, the probability to switch each bit of input S-box should be one half changes in output S-box. The SAC score of some existing S-boxes presented in literature given in Table 4.

Table 4 SAC Score of well-known S-boxes
  1. D.

    Linear and Differential Probability

In cryptosystem, a user tries to muddle plaintext into ciphertext in such a way that it becomes more meaningless for the invaders. Linear attack is one of the most widely used attacks on cipher blocks. In Nonlinear Transformed S-box, ideally, nonlinear transformation should have differential uniformity. There should be a unique mapping between input differential ∆xi and output differential yi, while ensuring the uniform probability for each i. The differential uniformity for S-box is defined as

$$ \boldsymbol{DP}\left(\Delta \boldsymbol{x}\to \Delta \boldsymbol{y}\right)=\left[\frac{\#\left\{\boldsymbol{x}\in \boldsymbol{X}/\boldsymbol{S}\left(\boldsymbol{x}\right)\bigoplus \left(\boldsymbol{x}+\Delta \boldsymbol{x}\right)=\Delta \boldsymbol{y}\right\}}{{\mathbf{2}}^{\boldsymbol{m}}}\right] $$
(16)

4 Comparative study of S-box

A quality S-box should satisfy a number of criteria which are discussed in detail in Section 4. In order to determine the performance and security of S-box some mandatory properties should be analyzed as shown in Table V. Niemiec and Machowski proposed the simulator in order to analyze the algebraic properties of S-box [39]. However, all properties cannot be analyzed by using this tool. Only a certain number of criteria can be evaluated which are nonlinearity, strict avalanche criterion, completeness, low XOR table, balanced and diffusion order. In order to assess the several properties of S-box a comparative analysis is presented in detail.

As a result, the analysis has found that some of the properties are very important in order to resist algebraic attacks including linear, differential and statistical etc. A detailed analysis is also discussed in [1] as given in Table 5. In the domain of cryptography, the nonlinearity property of S-box is analyzed which makes the S-box protected from linear cryptanalysis. High nonlinearity shows stronger resistance to linear attacks. Based on the previous study, there exists no specific method to evaluate the S-box quality or security while another study shows that only two properties are enough to judge the strength of S-box. These properties are strict avalanche criterion and robustness. In short, most of the researchers analyze the S-box properties based on their perception and common usage. Therefore, benchmark methodology will contribute to protecting S-box from algebraic attacks. The Niemiec and Machowski and Wang et al. [39, 63] simulators are used in order to analyze the algebraic properties of S-boxes Fig. 4.

Table 5 Experimental results of cryptographic features of well-known S-boxes
Fig. 4
figure 4

Analysis of Razaq et al. S-box [52]

4.1 Challenges in S-box construction

Since 2001, AES has been a worldwide standard encryption algorithm and S-box is one of its important nonlinear components, but this component is fixed throughout the algorithm. Due to its fixation, cryptanalysts attempted to take advantage of this flaw. In fact, the S-box is a heart of cryptosystem and its basic function is to create confusion between key and ciphertext. So, here are five biggest challenges confronting the future of S-box:

  1. 1)

    Currently used AES S-box is fixed so the major challenge is the construction of dynamic S-boxes which can be employed to replace static AES S-boxes. Fix mapping makes it relatively easy for an invader to decipher the original plaintext from the ciphertext. Hence, there is always a need to build robust S-boxes with a larger number of input bits that can resist several linear and differential attacks.

  2. 2)

    Many linear, trigonometric, heuristic evolution, spatiotemporal chaotic dynamics and chaotic map based S-box construction techniques are discussed in literature to produce robust S-boxes but the hardest challenge is to consider the critical properties which we consider important for the analysis and strength of S-box. The reason is clear that some algebraic properties have minor contributions and even vary case to case [44] but some have major contributions in the development of a good S-box. Fibonacci numbers and prime factors are also used to increase the security of S-box and presented the detailed analysis of S-box properties in [37]. Such nonlinear transformations to construct resistant S-boxes should be explored by researchers recurrently.

  3. 3)

    Another major issue is designing and selection of tools for the analysis of cryptographic properties of S-box as the existing tools do not have enough features to analyze all properties. Hence, we believe that development of new tools for the analysis of cryptographic properties will help cryptographers in better examining the behavior of S-boxes and researchers will be able to execute desired attributes of an entire S-box.

  4. 4)

    Researchers have investigated for a long time to construct various S-boxes and most of these S-box construction methodologies are random S-box generation. Security of random S-box is acceptable because of using different techniques in the development process and it is more immune against major cryptanalysis attacks. The problem is that randomness requires high computational complexity and it has no fixed points of reconstruction procedure [34]. So, instead of random construction, we should prefer to adopt complex mathematical structures for better resistance against algebraic attacks. For example, Hussain et al. [38] presented linear fractional transformation based S-box construction.

  5. 5)

    Sometimes S-boxes are generated manually or through simple mathematical operation but this type of construction is suitable for small S-boxes and even though this method could be lossy and uncompact for the construction of large S-boxes. Particularly, initial configuration must be improved by using novel approaches as discussed in Section 2.1.

5 Conclusion

In connection with the digital transactions, most of the new and interesting S-box security analytics are reviewed in this study. An S-box is the sole nonlinear component of most of the cryptographic algorithms that provides a complicated connection between plaintext and ciphertext and makes the digital transaction secure. Finally, we have conducted linear and differential analysis to see if the discussed S-box construction methodology has good statistical features which help in the resistance of algebraic attacks on digital communication. A comparison of the new and existing S-boxes reveals that the linear fractional transformation-based S-boxes are more robust in order to secure digital transactions and can be used in significant ciphers. However, some mathematical based construction techniques are better in terms of performance criteria but some are worse in the context of reliability and lower nonlinearity like chaos-based construction methods. Such S-boxes should be used with caution. For example, if we replace the Rijndael S-box in AES with some chaos-based S-box with overall lower nonlinearity, the resulting cipher will be weaker in terms of resistance to algebraic attacks. We have also compared the results of well-known S-boxes which are constructed from linear, differential transformations and with some permutations. The result shows that linear fractional transformation-based S-boxes are better than fractional order chaotic systems. Furthermore, all the discussed techniques enhance the security of digital transactions by introducing strong S-boxes instead of fixed one component of block ciphers. Furthermore, all of the discussed S-boxes have successfully cleared the standard security tests [53] as recommended by National Institute of Standard and Technology for secure digital communication.