1 Introduction

Energy-efficient, power-saving implementation of decoder systems in telecommunication standards [1, 2] is a major concern, in line with the concept of green communications [3, 4] and in parallel with computational methods in the field of mechanics [5,6,7] and Big Data analytics [8]. Some researchers have proposed implementations of low-complexity decoders [9, 10], while others have evaluated the computational complexity of hardware implementations of decoding algorithms [11].

The search for low-complexity decoding algorithms is of paramount importance in communication standards, mainly because reducing the computational complexity yields a corresponding reduction in the power usage of electronic devices. Most complexity analyses are performed on Digital Signal Processing hardware and measured in terms of CPU cycles [12,13,14]. In this work, we propose a computational complexity analysis at the level of binary logical operations by considering three different sets of Turbo decoding equations. The aim behind using binary logical operations is to have a standardized level at which the computational complexity is measured. The different Turbo decoding methods do not use the same sets of mathematical operations (addition, subtraction, multiplication, etc.), and different mathematical operations are not necessarily equal in complexity when implemented at the hardware level. However, once the different mathematical operations are converted into the number of binary logical operations they require, a fair computational complexity analysis can be performed. Additionally, the Turbo decoding mechanisms are equipped with Regression-based and SDR-based extrinsic information scaling and stopping techniques. A comparative analysis is performed in terms of the error performance as well as the number of binary logical operations required.

The paper is organized as follows. Section 2 highlights the related works. Section 3 provides the complexity analysis in terms of binary logical operations for each of the decoding approaches of the binary Turbo codes. Section 4 gives an overview of the different binary Turbo Decoding algorithms and their corresponding complexities in terms of logical operations. Section 5 discusses the performance and complexity analysis. Finally, the work is concluded in Sect. 6.

2 Related Works

Several research works have been initiated and conducted in the quest for low-complexity decoding algorithms having an acceptable trade-off with respect to the corresponding error performances so as to be deployed in relevant communication standards. For example, in [15], the authors have proposed a decoding algorithm for block Turbo codes with low complexity. The algorithm operates on an adaptive application of two different estimation rules and the results demonstrate that the reduction in computational complexity of the proposed algorithm has no significant loss in error performance compared to the conventional one. The authors of [16, 17] have proposed an alternate soft-output decoding mechanism with low complexity for polar codes whose error performance is improved in addition to a significant reduction in terms of storage and processing. The authors of [18] have proposed a normalized Log-MAP (Nor-Log-MAP) decoding algorithm in which the function max* is approximated by using a fixed normalized factor multiplied by the max function. Simulation results show that the proposed algorithm helps in achieving a saving of around 2.1% in terms of logic resources as compared to the conventional one. Gains of the order of 0.25–0.5 dB in error performance are also realised.

Evaluations of computational complexity have also been performed in the literature. For example, in [19], the authors analyzed the computational complexity of several turbo decoding algorithms in terms of mathematical operations. The algorithms were also implemented on a Digital Signal Processor, and measurements of the number of CPU cycles per decoded bit were taken and analyzed. The results demonstrated that the Max Log MAP algorithm yields a lower computational complexity than the Viterbi algorithm with no significant loss in error performance. The authors of [20] proposed two mechanisms which reduce the decoding complexity of Turbo Product Codes using extended Hamming codes as component codes. This reduction in computational complexity is achieved in terms of the Hard Decision Decoding, whereby a single component code is required with the proposed algorithm. An additional early termination technique is also proposed for un-decodable blocks, which aids in the complexity reduction. The simulation results show that the error performance remains fairly unchanged compared to the conventional decoding algorithms, alongside the significant reduction in overall computational complexity. In [21], the performance and computational complexity of two different decoding mechanisms were compared, with the complexity measured using the number of clock cycles needed to complete the different decoding algorithms. Based on the results, Turbo codes were recommended for moderate code-rates and LDPC codes for high code-rates. Furthermore, the work of [22] investigated three different and efficient error control codes. The total number of operations used by the different algorithms was derived, and the results obtained were compared with benchmarks of state-of-the-art SDR platforms.

3 Logical Complexity Analysis

The assumptions made in the derivation of the total number of computations in this work are as follows:

  1. One bitwise logical operation would be either a shift (left or right), or a Boolean operation (OR, NOR, AND, NAND, XOR, XNOR, and NOT).

  2. Each value computed would be represented by K bits in general on the binary scale.

The computations for the different operations are shown next.

3.1 Derivation of Complexity in Terms of Logical Operations

In this section, a detailed breakdown of the different mathematical operations in terms of logical operations is given.

3.1.1 Addition

The electronic circuit of a Half-Adder [23] which takes as input two bits (BIT 1 and BIT 2) and outputs a SUM BIT and a CARRY BIT is shown in Fig. 1.

Fig. 1 Half-Adder representation with Binary Logic Gates

One Half-Addition therefore requires 2 logical operations (1 XOR and 1 AND).

The electronic circuit of a Full-Adder [23] which takes as input three bits (BIT 1, BIT 2, and INPUT CARRY BIT) and outputs a SUM BIT and a CARRY BIT is shown in Fig. 2.

Fig. 2 Full-Adder representation with Binary Logic Gates

One Full-Addition requires 5 logical operations (2 XOR, 2 AND, and 1 OR). The addition of a K-bit stream to another K-bit stream requires 1 Half-Adder and (K − 1) Full-Adders.

The circuit for performing the addition in parallel is shown in Fig. 3.

Fig. 3 Representation of Addition in parallel

The total number of logical operations required is: \(\left( {1 \times 2} \right) + \left( {\left( {{\text{K}} - 1} \right) \times 5} \right) = 2 + 5{\text{K}} - 5\) and can be represented as:

$$T_{L}^{Add} = 5{\text{K}} - 3$$
(1)
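The count in (1) can be checked with a small gate-level simulation. The sketch below is illustrative only: the `GateCounter` class and the function names are hypothetical, the adders follow the gate tallies stated above (2 gates per Half-Adder, 5 per Full-Adder), and the Python shifts used to pick out individual bits are bookkeeping rather than counted operations.

```python
class GateCounter:
    """Counts every two-input gate evaluation (XOR, AND, OR)."""

    def __init__(self):
        self.count = 0

    def xor(self, a, b):
        self.count += 1
        return a ^ b

    def and_(self, a, b):
        self.count += 1
        return a & b

    def or_(self, a, b):
        self.count += 1
        return a | b


def half_adder(a, b, gates):
    # 1 XOR for the sum bit, 1 AND for the carry bit.
    return gates.xor(a, b), gates.and_(a, b)


def full_adder(a, b, carry_in, gates):
    # 2 XOR, 2 AND and 1 OR, matching the tally given for Fig. 2.
    partial = gates.xor(a, b)
    total = gates.xor(partial, carry_in)
    carry_out = gates.or_(gates.and_(a, b), gates.and_(partial, carry_in))
    return total, carry_out


def ripple_add(x, y, k):
    """Add two K-bit values; return (result, number of gate evaluations)."""
    gates = GateCounter()
    sum_bit, carry = half_adder(x & 1, y & 1, gates)
    result = sum_bit
    for i in range(1, k):
        sum_bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry, gates)
        result |= sum_bit << i
    result |= carry << k  # final carry becomes bit K of the result
    return result, gates.count
```

For K = 8, `ripple_add(13, 7, 8)` uses one Half-Adder and seven Full-Adders, i.e. 2 + 35 = 37 = 5K − 3 gate evaluations.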

3.1.2 Subtraction

The algorithm for the subtraction of bit streams (STREAM_1 − STREAM_2) is as follows [23]:

  1. Perform the 2′s complement of STREAM_2.

  2. Add the 2′s complement of STREAM_2 to STREAM_1.

The 2′s complement operation requires K NOT gates, (K − 1) Full-Adders, and 1 Half-Adder. The total number of logical operations in this case is: K + 5K − 5 + 2 = 6K − 3. The addition operation requires (5K − 3) logical operations. The circuit for subtraction in parallel would be very similar to that for addition, the only difference being that there would be two levels (one for performing the 2′s complement and the other for the addition of the two streams). The total number of logical operations required for the subtraction is: (6K − 3) + (5K − 3) and can be represented as:

$$T_{L}^{Sub} = 11{\text{K}} - 6$$
(2a)
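As a rough illustration of the two-level structure, the following sketch tallies the operations for each level and performs the subtraction on K-bit words. The function names are hypothetical, and Python's integer arithmetic stands in for the adder circuits; only the operation tallies follow the text.

```python
def add_ops(k):
    # 1 Half-Adder (2 gates) plus (K - 1) Full-Adders (5 gates each).
    return 2 + 5 * (k - 1)


def twos_complement_ops(k):
    # K NOT gates followed by an increment built from the same adder chain.
    return k + add_ops(k)


def sub_ops(k):
    # Level 1: 2's complement of STREAM_2; level 2: addition to STREAM_1.
    return twos_complement_ops(k) + add_ops(k)


def subtract(stream_1, stream_2, k):
    """Return (stream_1 - stream_2) modulo 2**k via 2's complement."""
    mask = (1 << k) - 1
    negated = ((~stream_2 & mask) + 1) & mask
    return (stream_1 + negated) & mask
```

With K = 16, `sub_ops(16)` gives 170, which agrees with 11K − 6. A negative difference appears in wrapped form, for example `subtract(7, 13, 8)` returns 250, the 8-bit 2′s complement encoding of −6.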

3.1.3 Comparison

A comparison operation can be performed in terms of subtraction operations as demonstrated above, whereby the overflow bit determines the decision of the comparison (whether greater or smaller than). The total number of logical operations required for the comparison operation can be represented as:

$$T_{L}^{Comp} = 11{\text{K}} - 6$$
(2b)

3.1.4 Multiplication

The algorithm for binary Multiplication of a multiplicand with a multiplier is as follows:

  1. Fix the multiplicand.

  2. For each bit in the multiplier.

    a. Shift the multiplicand one bit to the left.

  3. End For Loop.

  4. Sum all the shifted versions of the multiplicand to obtain the result of the multiplication.
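The steps above can be sketched as a conventional shift-and-add routine. One assumption is made explicit here: in the usual form of this algorithm, a shifted copy of the multiplicand enters the final sum only when the corresponding multiplier bit is 1, which the listing does not spell out. The function name is hypothetical.

```python
def shift_add_multiply(multiplicand, multiplier, k):
    """Multiply two unsigned K-bit values by shifting and adding."""
    partial_products = []
    shifted = multiplicand
    for i in range(k):
        # Assumption: keep the shifted copy only where the multiplier bit is set.
        if (multiplier >> i) & 1:
            partial_products.append(shifted)
        shifted <<= 1  # one left shift per multiplier bit
    return sum(partial_products)
```

For example, `shift_add_multiply(13, 11, 4)` sums 13, 26 and 104 (multiplier bits 0, 1 and 3) to give 143.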

The multiplication of bit streams (STREAM_1 and STREAM_2) requires the following operations [23]:

  • (K − 1) left shifts and (K − 1) total additions of (2K − 2) bits.

  • (K − 1) additions of (2K − 2) bits => (K − 1) Half-Adders and ((2K − 2) × (K − 1)) Full-Adders.

    • (K − 1) Half-Adders

    • (2K² − 4K + 2) Full-Adders

Total number of operations => (K − 1) + (2 × (K − 1)) + (5 × (2K² − 4K + 2)) = K − 1 + 2K − 2 + 10K² − 20K + 10 and can be represented as:

$$T_{L}^{Mult} = 10{\text{K}}^{2} { } + { }23{\text{K }} + { }7$$
(3)

3.1.5 Division

The algorithm for the binary Division of a dividend with a divisor is as follows:

  1. Set quotient to 0

  2. Align leftmost digits in dividend and divisor

  3. Repeat

    a. If that portion of the dividend above the divisor is greater than or equal to the divisor

      i. Then subtract divisor from that portion of the dividend and

      ii. Concatenate 1 to the right hand end of the quotient

      iii. Else concatenate 0 to the right hand end of the quotient

    b. Shift the divisor one place right

  4. Until dividend is less than the divisor

  5. Quotient is correct, dividend is remainder

  6. STOP
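A minimal sketch of shift-and-subtract division is given below. It uses the common restoring variant that shifts dividend bits into a running remainder instead of shifting the divisor right; the two formulations produce the same quotient and remainder, but this choice, like the function name, is an assumption and not taken from [24].

```python
def shift_subtract_divide(dividend, divisor, k):
    """Divide an unsigned K-bit dividend; return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    quotient = 0
    remainder = 0
    for i in range(k - 1, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor  # subtract and append 1 to the quotient
            quotient |= 1
        # otherwise a 0 has already been appended by the shift
    return quotient, remainder
```

The loop performs one shift per dividend bit, in line with the K shifts counted in the analysis; subtractions occur only on the iterations where the comparison succeeds.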

Assuming that the dividend is a K-bit stream, the divisor is an m-bit stream and the binary division by shift and subtract algorithm is used [24], the following number of operations are required:

  (i) K shifts

  (ii) \(\left\lfloor {\frac{{\text{K}}}{{{\text{m}} + 1}}} \right\rfloor\) subtractions, each on (m + 1) bits

Total number of operations => \({\text{K}} + \left\lfloor {\frac{{\text{K}}}{{{\text{m}} + 1}}} \right\rfloor \times \left( {11\left( {{\text{m}} + 1} \right) - 6} \right) = {\text{K}} + \left\lfloor {\frac{{\text{K}}}{{{\text{m}} + 1}}} \right\rfloor \times \left( {11{\text{m}} + 5} \right)\).

Taking the upper bound, where m is minimum and is taken to be equal to 1, the total number of operations can be represented as:

$$T_{L}^{Div} = {\text{K}} + \left( {17 \times \left\lfloor {\frac{{\text{K}}}{2}} \right\rfloor } \right)$$
(4)

3.1.6 Logarithm

A logarithm can be bounded based on its properties [25]. Consider the basic inequality:

$$\frac{x - 1}{x} \le \ln \left( x \right) \le x - 1,\quad {\text{for}}\quad x > 0$$
(5)
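The inequality in (5) is easy to check numerically. The short sketch below evaluates both bounds alongside the natural logarithm; the function name is hypothetical.

```python
import math


def log_bounds(x):
    """Return (lower bound, ln(x), upper bound) from inequality (5)."""
    if x <= 0:
        raise ValueError("the inequality holds for x > 0 only")
    return (x - 1) / x, math.log(x), x - 1
```

The three values coincide at x = 1 and the bounds loosen as x moves away from 1; at x = 10, for instance, the upper bound is 9 while ln(10) is roughly 2.3.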

Assuming the upper bound is computed for the natural logarithm, the total number of computations would be:

  (i) K NOT gates => K logical operations

  (ii) (K − 1) Full-Adders => (5K − 5) logical operations

  (iii) 1 Half-Adder => 2 logical operations

  (iv) x − 1 operation => (K − 1) Full-Adders and 1 Half-Adder => (5K − 3) logical operations.

The total number of logical operations would be: K + 5K − 5 + 2 + 5K − 3 and can be represented as:

$$T_{L}^{Log} = 11{\text{K }}{-}{ }6$$
(6)

3.1.7 Exponential

An exponential \(e^{\tau }\) can be considered as multiplications of the constant value, \(e\) by itself \(\tau\) times, such that:

$$e^{\tau } = e \times e \times e \times \cdots \times e$$
(7)

With \(e\) and \(\tau\) each represented by K bits at the binary level, the exponential would consist of \(\left( {2^{{\text{K}}} - 1} \right)\) multiplications of \(e\) (K bits). Therefore, the total number of computations would be represented as:

$$T_{L}^{Exp} = \left( {2^{{\text{K}}} {-}1} \right) \times \left( {10{\text{K}}^{2} + 23{\text{K}} + 7} \right)$$
(8)

3.1.8 Maximum Operation

The selection of the maximum of two bit streams (STREAM_1 and STREAM_2) requires the following operations [23]:

Treating STREAM_1 and STREAM_2 as signed K-bit integers, then

  1. Invert STREAM_2 to its −STREAM_2 representation;

  2. Sum STREAM_1 and −STREAM_2;

  3. Use the sign of the result as the selector variable of a 2-input, K-bit multiplexer.

STREAM_2 is converted to −STREAM_2 by performing the 2′s complement. The logical operations required are:

  (i) K NOT gates => K logical operations

  (ii) (K − 1) Full-Adders => 5 × (K − 1) logical operations

  (iii) 1 Half-Adder => 2 logical operations

The addition of STREAM_1 to −STREAM_2 would require the following operations:

  (i) (K − 1) Full-Adders => 5 × (K − 1) logical operations

  (ii) 1 Half-Adder => 2 logical operations.

Total number of logical operations => K + 5K − 5 + 2 + 5K − 3 = 11K − 6.

The Selector or Multiplexor would require a separate digital circuit system. Consider a 2:1 MUX as shown in Fig. 4.

Fig. 4 2:1 MUX representation with Binary Logic Gates

Extending the above logic, passing two K-bit inputs through a 2:1 MUX would require K × K binary AND operations, K × (K − 1) binary OR operations, and K × \(log_{2} \left( {\text{K}} \right)\) NOT operations. The number of maximum operations would also be impacted by the number of symbols in non-binary Turbo codes. To generalize, the total number of operations required with a maximum of \(M_{s}\) states would be represented as:

$$T_{L}^{Max} = \left( {M_{s} - 1} \right) \times \left( {2{\text{K}}^{2} + 10{\text{K}}{-}6 + \left( {{\text{K }} \times log_{2} \left( {\text{K}} \right)} \right)} \right)$$
(9)
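The bracketed term in (9) can be reconstructed by adding the counts stated above for the sign computation and for the multiplexer. As a worked check, using only those quantities:

$$\left( {11{\text{K}} - 6} \right) + {\text{K}}^{2} + {\text{K}}\left( {{\text{K}} - 1} \right) + {\text{K}}log_{2} \left( {\text{K}} \right) = 2{\text{K}}^{2} + 10{\text{K}} - 6 + {\text{K}}log_{2} \left( {\text{K}} \right)$$

For K = 16, for instance, this evaluates to 512 + 160 − 6 + 64 = 730 logical operations per pairwise maximum, which is then multiplied by \(\left( {M_{s} - 1} \right)\).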

4 Logical Complexity of Binary LTE Turbo Codes

The different decoding techniques and logical complexities for binary LTE Turbo codes are shown in the following sub-sections. The equations of the total computational complexities in terms of binary logical operations for the different decoding approaches of Binary LTE Turbo codes are obtained based on the analysis in Sect. 3.

4.1 Background: Binary Turbo Codes

Figure 5 depicts a classic framework for Binary Turbo decoding. The decoding process is described subsequently. The Turbo decoder is made up of an interleaver together with two decoders. Decoder 1 takes \(r_{0}\) and \(r_{1}\), which correspond to the corrupted forms of S0 and P1 as received at the receiver end. Decoder 2 accepts \(\overline{{r_{0} }}\), which is the interleaved version of \(r_{0}\), and \(r_{2}\), which is the noisy version of P2. \(\varLambda_{1e}\) and \(\varLambda_{2e}\) are the extrinsic information generated by the decoders. \(\varLambda_{2}\) is the Log Likelihood Ratio (LLR) output from Decoder 2 and \(\varLambda_{t}\) is the final hard output obtained after the hard-limiting operation. Standards like CDMA-2000 and Long Term Evolution (LTE) have adopted Turbo codes with the aim of attaining higher data rates.

Fig. 5 Generic Turbo decoding structure. Source: [26]

Several equations have been proposed for the diverse Turbo decoding algorithms. As such, the existing Maximum Logarithmic Maximum A-posteriori Probability (Max Log-MAP) binary Turbo decoding systems employing different sets of equations are presented in the following sub-sections. The decoding mechanisms are explained in detail in [27].

To enhance the iterative decoding performance in terms of Bit Error Rate (BER), extrinsic information scaling mechanisms such as the Sign Difference Ratio (SDR) [28] and Regression-based [29] techniques have been developed. In addition to improving the error performance, these works also demonstrate techniques for early-stopping of the decoding mechanisms by employing the computed scale factors. The concept of early-stopping helps in reducing the computational complexity without extensive trade-offs in terms of the error performance. These improvements are however obtained at the expense of additional computations included in the decoding process for computing the scale factor, scaling the extrinsic information and performing comparisons with a set threshold to halt further unnecessary iterations. The decoding frameworks with SDR and Regression-based scaling and stopping are depicted in Figs. 6 and 7 respectively. The detailed operating principles of these algorithms can be obtained from [28, 29] respectively.

Fig. 6 Turbo decoding structure with SDR scaling and stopping mechanism. Source: [28]

Fig. 7 Turbo decoding structure with Regression-based scaling and stopping mechanism. Source: [29]

4.2 Logical Complexity with Decoding Methods

This section details the complexity breakdowns for each Turbo decoder using each of the Turbo decoding algorithms. The complexity analysis in this sub-section does not involve the incorporation of extrinsic information scaling and early stopping mechanisms.

4.2.1 Logical Complexity with Decoding Method 1

The computational complexity breakdown for each Turbo decoder using Method 1 with packet length of N is explained in [27] and presented in Table 1.

Table 1 Computational breakdown for each decoder with Method 1

The equations for the number of computations per mathematical operation for Method 1 are as follows [27]:

$$C_{{M1_{log} }}^{binary} = 16{\text{N}}$$
(10)
$$C_{{M1_{exp} }}^{binary} = 3{\text{N}}$$
(11)
$$C_{{M1_{max} }}^{binary} = \left( {8 + 8 + 2} \right){\text{N}}$$
(12)
$$C_{{M1_{add} }}^{binary} = \left( {16 + 16 + 16 + 32 + 2} \right){\text{N}}$$
(13)
$$C_{{M1_{sub} }}^{binary} = \left( {48 + 1 + 2} \right){\text{N}}$$
(14)
$$C_{{M1_{mult} }}^{binary} = 32{\text{N}}$$
(15)
$$C_{{M1_{div} }}^{binary} = 2{\text{N}}$$
(16)

where,

  • \(C_{{M1_{log} }}^{binary}\) is the number of computations required for logarithm operations for Method 1,

  • \(C_{{M1_{exp} }}^{binary}\) is the amount of computations required for exponential operations for Method 1,

  • \(C_{{M1_{max} }}^{binary}\) is the number of computations required for Maximum operations for Method 1,

  • \(C_{{M1_{add} }}^{binary}\) is the amount of computations required for addition operations for Method 1,

  • \(C_{{M1_{sub} }}^{binary}\) is the number of computations required for subtraction operations for Method 1,

  • \(C_{{M1_{mult} }}^{binary}\) is the number of computations required for multiplication operations for Method 1,

  • \(C_{{M1_{div} }}^{binary}\) is the number of computations required for division operations for Method 1.

The equations for the computational complexities of the decoding Method 1 for binary LTE Turbo codes in terms of binary logical operations are as follows:

$$C_{{M1_{log} }}^{binary} = 16{\text{N}}\left( {11{\text{K}} - 6} \right)$$
(17)
$$C_{{M1_{exp} }}^{binary} = 3{\text{N}}\left( {2^{{\text{K}}} - 1} \right)\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right)$$
(18)
$$C_{{M1_{max} }}^{binary} = 18{\text{N}}\left( {M_{s} - 1} \right)\left( {2{\text{K}}^{2} + 10{\text{K}} + {\text{K}}log_{2} \left( {\text{K}} \right) - 6} \right)$$
(19)
$$C_{{M1_{add} }}^{binary} = 82{\text{N}}\left( {5{\text{K}} - 3} \right)$$
(20)
$$C_{{M1_{sub} }}^{binary} = 51{\text{N}}\left( {11{\text{K}} - 6} \right)$$
(21)
$$C_{{M1_{mult} }}^{binary} = 32{\text{N}}\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right)$$
(22)
$$C_{{M1_{div} }}^{binary} = 2{\text{N}}\left( {{\text{K}} + \left( {17 \times \left\lfloor {\frac{{\text{K}}}{2}} \right\rfloor } \right)} \right)$$
(23)
$$\begin{aligned} C_{{M1_{total} }}^{binary} & = 16{\text{N}}\left( {11{\text{K}} - 6} \right) + 3{\text{N}}\left( {\left( {2^{{\text{K}}} {-}1} \right)\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right)} \right) \\ & \quad + 18{\text{N}}\left( {\left( {2{\text{K}}^{2} + 10{\text{K}} + {\text{K}}log_{2} \left( {\text{K}} \right){-}6} \right)} \right)\left( {M_{s} - 1} \right) + 82{\text{N}}\left( {5{\text{K}} - 3} \right) \\ & \quad + 51{\text{N}}\left( {11{\text{K}} - 6} \right) + 32{\text{N}}\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right) + 2{\text{N}}\left( {{\text{K }} + { }\left( {17 \times \left\lfloor {\frac{{\text{K}}}{2}} \right\rfloor } \right)} \right) \\ \end{aligned}$$
(24)
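To give a feel for the relative size of the terms in (24), the sketch below evaluates (17) to (23) exactly as printed and sums them. The function name and the dictionary keys are hypothetical, and no particular value of \(M_{s}\) is implied by the source; it is left as a parameter.

```python
import math


def method1_logical_ops(n, k, m_s):
    """Evaluate Eqs. (17)-(24) as printed for packet length n and word size k."""
    mult = 10 * k**2 + 22 * k + 8  # multiplication term as printed in (18), (22)
    per_max = 2 * k**2 + 10 * k + k * math.log2(k) - 6
    parts = {
        "log": 16 * n * (11 * k - 6),
        "exp": 3 * n * (2**k - 1) * mult,
        "max": 18 * n * (m_s - 1) * per_max,
        "add": 82 * n * (5 * k - 3),
        "sub": 51 * n * (11 * k - 6),
        "mult": 32 * n * mult,
        "div": 2 * n * (k + 17 * (k // 2)),
    }
    parts["total"] = sum(parts.values())
    return parts
```

Evaluating `method1_logical_ops(1, 16, 2)` shows that the exponential term, which carries the factor \(2^{{\text{K}}} - 1\), is several orders of magnitude larger than every other term at K = 16.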

4.2.2 Logical Complexity with Decoding Method 2

The computational complexity breakdown for each Turbo decoder using Method 2 with packet length of N is explained in [27] and presented in Table 2.

Table 2 Computational breakdown for each decoder with Method 2

The equations for the number of computations per mathematical operation for Method 2 are as follows [27]:

$$C_{{M2_{max} }}^{binary} = \left( {8 + 8 + 2} \right){\text{N}}$$
(25)
$$C_{{M2_{add} }}^{binary} = \left( {8 + 32 + 32 + 16} \right){\text{N}}$$
(26)
$$C_{{M2_{sub} }}^{binary} = 1{\text{N}}$$
(27)

where,

  • \(C_{{M2_{max} }}^{binary}\) is the number of computations required for Maximum operations for Method 2,

  • \(C_{{M2_{add} }}^{binary}\) is the number of computations required for addition operations for Method 2,

  • \(C_{{M2_{sub} }}^{binary}\) is the number of computations required for subtraction operations for Method 2.

The equations for the computational complexities of the decoding Method 2 for binary LTE Turbo codes in terms of binary logical operations are as follows:

$$C_{{M2_{max} }}^{binary} = 18{\text{N}}\left( {M_{s} - 1} \right)\left( {2{\text{K}}^{2} + 10{\text{K}} + {\text{K}}log_{2} \left( {\text{K}} \right) - 6} \right)$$
(28)
$$C_{{M2_{add} }}^{binary} = 88{\text{N}}\left( {5{\text{K}} - 3} \right)$$
(29)
$$C_{{M2_{sub} }}^{binary} = 1{\text{N}}\left( {11{\text{K}} - 6} \right)$$
(30)
$$C_{{M2_{total} }}^{binary} = 18{\text{N}}\left( {\left( {2{\text{K}}^{2} + 10{\text{K}} + {\text{K}}log_{2} \left( {\text{K}} \right){-}6} \right)} \right)\left( {M_{s} - 1} \right) + 88{\text{N}}\left( {5{\text{K}} - 3} \right) + {\text{N}}\left( {11{\text{K}} - 6} \right)$$
(31)

4.2.3 Logical Complexity with Decoding Method 3

The computational complexity breakdown for each Turbo decoder using Method 3 with packet length of N is explained in [27] and presented in Table 3.

Table 3 Computational breakdown for each decoder with Method 3

The equations for the number of computations per mathematical operation for Method 3 are as follows:

$$C_{{M3_{exp} }}^{binary} = 3{\text{N}}$$
(32)
$$C_{{M3_{max} }}^{binary} = \left( {8 + 8 + 2} \right){\text{N}}$$
(33)
$$C_{{M3_{add} }}^{binary} = \left( {32 + 16 + 16 + 32 + 7} \right){\text{N}}$$
(34)
$$C_{{M3_{sub} }}^{binary} = \left( {1 + 2} \right){\text{N}}$$
(35)
$$C_{{M3_{mult} }}^{binary} = 48{\text{N}}$$
(36)

where,

  • \(C_{{M3_{exp} }}^{binary}\) is the number of computations needed for exponential operations for Method 3,

  • \(C_{{M3_{max} }}^{binary}\) is the number of computations needed for Maximum operations for Method 3,

  • \(C_{{M3_{add} }}^{binary}\) is the number of computations needed for addition operations for Method 3,

  • \(C_{{M3_{sub} }}^{binary}\) is the number of computations needed for subtraction operations for Method 3,

  • \(C_{{M3_{mult} }}^{binary}\) is the number of computations needed for multiplication operations for Method 3.

The equations for the computational complexities of the decoding Method 3 for binary LTE Turbo codes in terms of binary logical operations are as follows:

$$C_{{M3_{exp} }}^{binary} = 3{\text{N}}\left( {2^{{\text{K}}} - 1} \right)\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right)$$
(37)
$$C_{{M3_{max} }}^{binary} = 18{\text{N}}\left( {M_{s} - 1} \right)\left( {2{\text{K}}^{2} + 10{\text{K}} + {\text{K}}log_{2} \left( {\text{K}} \right) - 6} \right)$$
(38)
$$C_{{M3_{add} }}^{binary} = 103{\text{N}}\left( {5{\text{K}} - 3} \right)$$
(39)
$$C_{{M3_{sub} }}^{binary} = 3{\text{N}}\left( {11{\text{K}} - 6} \right)$$
(40)
$$C_{{M3_{mult} }}^{binary} = 48{\text{N}}\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right)$$
(41)
$$\begin{aligned} C_{{M3_{total} }}^{binary} & = 3{\text{N}}\left( {\left( {2^{{\text{K}}} {-}1} \right)\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right)} \right) \\ & \quad + 18{\text{N}}\left( {\left( {2{\text{K}}^{2} + 10{\text{K}} + {\text{K}}log_{2} \left( {\text{K}} \right){-}6} \right)} \right)\left( {M_{s} - 1} \right) \\ & \quad + 103{\text{N}}\left( {5{\text{K}} - 3} \right) + 3{\text{N}}\left( {11{\text{K}} - 6} \right) + 48{\text{N}}\left( {10{\text{K}}^{2} + 22{\text{K}} + 8} \right) \\ \end{aligned}$$
(42)
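The three totals (24), (31) and (42) can be placed side by side with a short script. This is only a sketch that evaluates the expressions as printed; the function name is hypothetical and \(M_{s}\) is again a free parameter.

```python
import math


def total_logical_ops(n, k, m_s):
    """Evaluate the totals in Eqs. (24), (31) and (42) as printed."""
    add = 5 * k - 3
    sub = 11 * k - 6
    log = 11 * k - 6
    mult = 10 * k**2 + 22 * k + 8
    exp = (2**k - 1) * mult
    div = k + 17 * (k // 2)
    mx = (m_s - 1) * (2 * k**2 + 10 * k + k * math.log2(k) - 6)
    method_1 = n * (16 * log + 3 * exp + 18 * mx + 82 * add
                    + 51 * sub + 32 * mult + 2 * div)
    method_2 = n * (18 * mx + 88 * add + sub)
    method_3 = n * (3 * exp + 18 * mx + 103 * add + 3 * sub + 48 * mult)
    return {"Method 1": method_1, "Method 2": method_2, "Method 3": method_3}
```

Because Method 2 involves no exponential or multiplication terms, its total grows polynomially in K, whereas the totals for Methods 1 and 3 are dominated by the \(2^{{\text{K}}}\) factor of the exponential.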

4.3 Logical Complexity with SDR Scaling and Stopping

The scaling parameter at iteration \(n\) for each decoder \(d\) is calculated as:

$$S_{dn} = \frac{1}{{\text{N}}}\mathop \sum \limits_{t = 1}^{{\text{N}}} f\left( { \wedge_{de}^{\left( n \right)} , \wedge_{d}^{\left( n \right)} } \right)$$
(43)

where,

  • \(f\left( { \wedge_{de}^{\left( n \right)} , \wedge_{d}^{\left( n \right)} } \right) = 1\) if \(\wedge_{de}^{\left( n \right)}\) and \(\wedge_{d}^{\left( n \right)}\) have the same sign, otherwise \(f\left( { \wedge_{de}^{\left( n \right)} , \wedge_{d}^{\left( n \right)} } \right) = 0\),

  • \({\text{N}}\) represents the size of the frame in bits.
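Read literally, (43) is the fraction of positions in the frame at which the extrinsic and a-posteriori LLRs agree in sign. A minimal sketch follows; the function name is hypothetical, and treating an LLR of exactly zero as positive is an assumption not addressed in the text.

```python
def sdr_scale_factor(extrinsic_llrs, a_posteriori_llrs):
    """Fraction of positions where both LLR sequences share the same sign."""
    if len(extrinsic_llrs) != len(a_posteriori_llrs) or not extrinsic_llrs:
        raise ValueError("both LLR sequences must be non-empty and equal in length")
    same_sign = sum(
        1
        for ext, app in zip(extrinsic_llrs, a_posteriori_llrs)
        if (ext >= 0) == (app >= 0)  # assumption: zero counts as positive
    )
    return same_sign / len(extrinsic_llrs)
```

The N sign comparisons and the accumulation over the frame correspond to the comparison and addition counts listed for the SDR-based scale factor.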

Table 4 presents the breakdown related to the SDR based scaling parameter at each half-iteration, the details of which can be obtained in the work of [27].

Table 4 Complexity breakdown for one SDR-based scale factor

The equations for the number of computations per mathematical operation for SDR scaling and stopping in terms of binary logical operations are as follows:

$$C_{{SDR_{comp} }}^{binary} = \left( {{\text{N}} + 1} \right)\left( {11{\text{K}} - 6} \right)$$
(44)
$$C_{{SDR_{add} }}^{binary} = \left( {{\text{N}} - 1} \right)\left( {5{\text{K}} - 3} \right)$$
(45)
$$C_{{SDR_{div} }}^{binary} = {\text{K}} + \left( {17 \times \left\lfloor {\frac{{\text{K}}}{2}} \right\rfloor } \right)$$
(46)
$$C_{{SDR_{mult} }}^{binary} = 10{\text{K}}^{2} { } + { }23{\text{K }} + { }7$$
(47)

where,

  • \(C_{{SDR_{comp} }}^{binary}\) is the number of computations required for comparison operations for SDR scaling and stopping,

  • \(C_{{SDR_{add} }}^{binary}\) is the number of computations required for addition operations for SDR scaling and stopping,

  • \(C_{{SDR_{div} }}^{binary}\) is the number of computations required for division operations for SDR scaling and stopping,

  • \(C_{{SDR_{mult} }}^{binary}\) is the number of computations required for multiplication operations for SDR scaling and stopping.

4.4 Logical Complexity with Regression Scaling and Stopping

The scaling parameter based on regression, \(r_{d}^{2\left( n \right)}\), is presented in (47).

$$r_{d}^{2\left( n \right)} = \left( {\frac{{\sum\nolimits_{t = 1}^{{\text{N}}} {\left( { \wedge_{d}^{\left( n \right)} \left( t \right) - \widehat{{ \wedge_{d}^{\left( n \right)} }}} \right) \times \left( { \wedge_{de}^{\left( n \right)} \left( t \right) - \widehat{{ \wedge_{de}^{\left( n \right)} }}} \right)} }}{{\sqrt {\sum\nolimits_{t = 1}^{{\text{N}}} {\left( { \wedge_{d}^{\left( n \right)} \left( t \right) - \widehat{{ \wedge_{d}^{\left( n \right)} }}} \right)^{2} } \times \sum\nolimits_{t = 1}^{{\text{N}}} {\left( { \wedge_{de}^{\left( n \right)} \left( t \right) - \widehat{{ \wedge_{de}^{\left( n \right)} }}} \right)^{2} } } }}} \right)^{2}$$
(47)

where,

  • \(d\) = {1, 2} which is the decoder number,

  • \({\text{N}}\) is the length of the packet and is set to 6144 in this simulation,

  • \(r_{d}^{2\left( n \right)}\) is the scaling parameter at iteration \(n\) for decoder \(d\),

  • \(\wedge_{d}^{\left( n \right)} \left( t \right)\) is the a-posteriori LLR of decoder \(d\) at iteration \(n\) and time \(t\),

  • \(\widehat{{ \wedge_{d}^{\left( n \right)} }}\) is the mean a-posteriori LLR at iteration \(n\) for decoder \(d\),

  • \(\wedge_{de}^{\left( n \right)} \left( t \right)\) is the tth extrinsic LLR at iteration \(n\) for decoder \(d\),

  • \(\widehat{{ \wedge_{de}^{\left( n \right)} }}\) is the mean extrinsic LLR at iteration \(n\) for decoder \(d\),

  • \(n\) takes values ½, 1, …, I (the maximum number of iterations).

Intuitively, a correlation value of 1.0 between the a-posteriori LLR and extrinsic values yields a stopping criterion. However, based on simulations carried out and as explained in [29], a threshold of 0.98 can be used for the stopping mechanism.
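The regression-based scale factor is the squared sample correlation between the a-posteriori and extrinsic LLRs of one frame. The sketch below computes it directly from the definition and applies the 0.98 threshold; the function names are hypothetical, and stopping when the value is greater than or equal to the threshold (as opposed to strictly greater) is an assumption.

```python
import math


def regression_scale_factor(a_posteriori_llrs, extrinsic_llrs):
    """Squared correlation between a-posteriori and extrinsic LLRs."""
    n = len(a_posteriori_llrs)
    if n == 0 or n != len(extrinsic_llrs):
        raise ValueError("both LLR sequences must be non-empty and equal in length")
    mean_app = sum(a_posteriori_llrs) / n
    mean_ext = sum(extrinsic_llrs) / n
    cross = sum((a - mean_app) * (e - mean_ext)
                for a, e in zip(a_posteriori_llrs, extrinsic_llrs))
    var_app = sum((a - mean_app) ** 2 for a in a_posteriori_llrs)
    var_ext = sum((e - mean_ext) ** 2 for e in extrinsic_llrs)
    if var_app == 0 or var_ext == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (cross / math.sqrt(var_app * var_ext)) ** 2


def should_stop(scale_factor, threshold=0.98):
    # Assumption: iterations halt once the factor reaches the threshold.
    return scale_factor >= threshold
```

Each frame requires two passes over the LLRs in this form: one to obtain the means and one to accumulate the three sums.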

The breakdown for the Regression based scaling and stopping at every half-iteration is presented in Table 5.

Table 5 Complexity breakdown: one Regression-based scaling and stopping

The equations for the number of computations per mathematical operation for Regression scaling and stopping in terms of binary logical operations are as follows:

$$C_{{Reg_{sub} }}^{binary} = 4{\text{N}}\left( {11{\text{K}} - 6} \right)$$
(48)
$$C_{{Reg_{mult} }}^{binary} = \left( {3{\text{N}} + 2} \right)\left( {10{\text{K}}^{2} { } + { }23{\text{K }} + { }7} \right)$$
(49)
$$C_{{Reg_{add} }}^{binary} = \left( {3{\text{N}} - 3} \right)\left( {5{\text{K}} - 3} \right)$$
(50)
$$C_{{Reg_{div} }}^{binary} = {\text{K}} + \left( {17 \times \left\lfloor {\frac{{\text{K}}}{2}} \right\rfloor } \right)$$
(51)
$$C_{{Reg_{comp} }}^{binary} = \left( {11{\text{K}} - 6} \right)$$
(52)

where,

  • \(C_{{Reg_{sub} }}^{binary}\) is the number of calculations necessary for subtraction operations for Regression scaling and stopping,

  • \(C_{{Reg_{mult} }}^{binary}\) is the number of calculations necessary for multiplication operations for Regression scaling and stopping,

  • \(C_{{Reg_{add} }}^{binary}\) is the number of calculations necessary for addition operations for Regression scaling and stopping,

  • \(C_{{Reg_{div} }}^{binary}\) is the number of calculations necessary for division operations for Regression scaling and stopping,

  • \(C_{{Reg_{comp} }}^{binary}\) is the number of calculations necessary for comparison operations for Regression scaling and stopping.
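The overhead of the two scaling and stopping schemes can be compared by summing (44) to (47) and (48) to (52). The sketch below does this per scale-factor computation; the function names are hypothetical, and the division term uses the floor form of (4).

```python
def sdr_overhead(n, k):
    """Sum of Eqs. (44)-(47) for one SDR-based scale factor."""
    comp = (n + 1) * (11 * k - 6)
    add = (n - 1) * (5 * k - 3)
    div = k + 17 * (k // 2)
    mult = 10 * k**2 + 23 * k + 7
    return comp + add + div + mult


def regression_overhead(n, k):
    """Sum of Eqs. (48)-(52) for one Regression-based scale factor."""
    sub = 4 * n * (11 * k - 6)
    mult = (3 * n + 2) * (10 * k**2 + 23 * k + 7)
    add = (3 * n - 3) * (5 * k - 3)
    div = k + 17 * (k // 2)
    comp = 11 * k - 6
    return sub + mult + add + div + comp
```

Since the regression-based factor needs about 3N multiplications per evaluation while the SDR-based factor needs only one, the regression overhead is considerably larger for the same N and K.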

5 Performance and Complexity Analysis

The performance and complexity analysis of the different Binary LTE Turbo decoding algorithms has been performed in this section.

5.1 Performance Analysis

In this sub-section, the graphs pertaining to the error performances and iteration profiles are plotted and analysed. For each of the Turbo decoding methods, simulations have been performed using the following system criteria:

  • Packet size: 6144 bits

  • Number of packets: 200

  • Code-rate: 1/3

  • Maximum number of iterations: 12

  • Modulation: Binary Phase Shift Keying (BPSK)

  • Channel Noise: Additive white Gaussian Noise (AWGN)
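For readers wishing to set up a comparable channel, BPSK over AWGN at code-rate R is commonly simulated with unit-energy symbols and a noise standard deviation of \(\sqrt{1/(2R \cdot E_b/N_0)}\). The sketch below follows that convention. The mapping of bit 0 to +1 and bit 1 to −1, the fixed random seed and the function names are assumptions; the source does not describe its simulator at this level of detail.

```python
import math
import random


def awgn_sigma(ebn0_db, code_rate):
    """Noise standard deviation for unit-energy BPSK symbols."""
    ebn0 = 10 ** (ebn0_db / 10)
    return math.sqrt(1.0 / (2.0 * code_rate * ebn0))


def bpsk_awgn(bits, ebn0_db, code_rate=1 / 3, rng=None):
    """Map coded bits to BPSK symbols and add white Gaussian noise."""
    rng = rng or random.Random(0)  # fixed seed for repeatability (assumption)
    sigma = awgn_sigma(ebn0_db, code_rate)
    # Assumed mapping: bit 0 -> +1.0, bit 1 -> -1.0.
    return [(1.0 - 2.0 * b) + rng.gauss(0.0, sigma) for b in bits]
```

At code-rate 1/3 and \(E_b/N_0\) = 0 dB, this gives a noise variance of 1.5 per received symbol.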

Simulations have been performed for the following 3 schemes for each of the Turbo decoding methods:

  • Decoding without scaling or stopping

  • Decoding with SDR-based scaling and stopping

  • Decoding with Regression-based scaling and stopping

The BER performance graph is presented in Fig. 8.

Fig. 8 BER performance graph for Binary LTE Turbo Decoding methods

When comparing the conventional decoding methods without any scaling or stopping mechanisms in Fig. 8, it can be observed that Method 2 performs better than both Methods 1 and 3, with gains of 0.2 dB and 0.1 dB respectively at a BER of \(10^{-7}\). When SDR scaling and stopping is applied to the 3 decoding methods, there is no visible improvement in error performance compared to the conventional decoding mechanisms. The Regression-based scaling and stopping mechanism significantly improves the error performance of decoding Methods 1 and 3, which provide gains of 0.35 and 0.3 dB respectively over the conventional decoding methods already at a BER of \(10^{-3}\). Decoding Method 2 with Regression-based scaling and stopping does not perform well after reaching the water-fall region: the performance degrades at a BER of \(10^{-3}\) and an error-floor is observed at a BER of around \(10^{-4}\). The iterations profile is shown in Fig. 9.

Fig. 9 Iterations profile for Binary LTE Turbo Decoding methods

The number of iterations for the conventional decoding methods without scaling and stopping mechanisms remains constant at 12, which is the specified maximum number of iterations. The decoding methods with SDR-based scaling and stopping mechanisms show a continuous reduction in the number of iterations, a significant improvement that compensates for the near absence of BER performance gain over the conventional decoding methods. The 3 decoding methods with SDR scaling and stopping provide gains of 3 and 4.5 iterations at Eb/N0 values of 0.6 and 0.7 dB respectively compared to the conventional decoding schemes. With the incorporation of regression-based stopping and scaling, the gains in terms of the iterations profile are even more significant than those of the conventional decoding methods and the ones using SDR-based scaling and stopping mechanisms. Methods 1 and 3 with regression-based stopping and scaling have already reached a BER of \(10^{-7}\) at an Eb/N0 of 0.6 dB, and at this same Eb/N0 they provide gains of nearly 6 and 3 iterations compared to the conventional decoding methods and those using SDR scaling and stopping respectively. Method 2 with regression-based scaling and stopping outperforms the conventional and SDR-based schemes in terms of iterations throughout the Eb/N0 range. It also uses fewer iterations than regression-based Methods 1 and 3 above an Eb/N0 of 0.75 dB. However, this improved iterations profile does not outweigh the degraded BER performance: the error-floor observed in the BER performance of Method 2 using regression-based scaling and stopping means that this combination does not offer a good trade-off between performance and complexity reduction.
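The iteration gains quoted above can be read as a share of the decoding iterations avoided. The sketch below does that arithmetic, assuming every iteration costs the same and ignoring the extra operations that the scaling and stopping checks themselves require; the function name is illustrative.

```python
MAX_ITERATIONS = 12  # maximum number of iterations used in the simulations


def fraction_saved(iterations_gained, max_iterations=MAX_ITERATIONS):
    """Share of decoding iterations avoided relative to the fixed maximum."""
    if not 0 <= iterations_gained <= max_iterations:
        raise ValueError("gain must lie between 0 and max_iterations")
    return iterations_gained / max_iterations


# Iteration gains over conventional decoding, as quoted in the text.
quoted_gains = [
    ("SDR at 0.6 dB", 3),
    ("SDR at 0.7 dB", 4.5),
    ("Regression, Methods 1 and 3, at 0.6 dB (nearly 6)", 6),
]
for label, gain in quoted_gains:
    print(f"{label}: {fraction_saved(gain):.1%} of iterations avoided")
```

Under these assumptions, a gain of 3 iterations out of 12 corresponds to a quarter of the iterations, and a gain of 6 to half of them.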

5.2 Total Computational Complexity without Scaling/Stopping

The total computational complexity analysis for binary LTE Turbo codes has been performed in this sub-section for different values of K. Tables 6, 7, and 8 depict the total number of binary logical operations for values of K = 16, 32, and 64 respectively.

Table 6 Total number of binary logical operations with K = 16
Table 7 Total number of binary logical operations with K = 32
Table 8 Total number of binary logical operations with K = 64
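To illustrate why the totals grow with K, the sketch below counts logic gates for two arithmetic blocks as the word length increases. It is not the set of expressions derived in Section 3. It assumes that K denotes the word length in bits, that a full adder costs 5 gates (2 XOR, 2 AND, 1 OR), and that a multiplier is built from K·K AND gates plus K − 1 ripple-carry additions. These are textbook simplifications chosen only to show the scaling.

```python
GATES_PER_FULL_ADDER = 5  # assumed: 2 XOR + 2 AND + 1 OR


def adder_gates(k):
    """Gate count of a k-bit ripple-carry adder (one full adder per bit)."""
    return GATES_PER_FULL_ADDER * k


def multiplier_gates(k):
    """Gate count of a k-bit shift-and-add multiplier.

    Assumes k*k AND gates for the partial products and (k - 1) additions,
    each done with a k-bit ripple-carry adder.
    """
    return k * k + (k - 1) * adder_gates(k)


for k in (16, 32, 64):
    print(k, adder_gates(k), multiplier_gates(k))
```

In this simplified model the adder cost grows linearly with K while the multiplier cost grows quadratically, which is one reason why methods that rely on multiplications become relatively more expensive at larger word lengths.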

The bar charts for Logarithm, Exponential, Maximum, Addition, Subtraction, Multiplication, and Division operations in the Binary LTE Turbo codes are shown in Figs. 10, 11, 12, 13, 14, 15 and 16 respectively. The bar chart for the total number of binary operations is shown in Fig. 17.

Fig. 10 Bar chart for Logarithm operations in Binary LTE Turbo codes

Fig. 11 Bar chart for Exponential operations in Binary LTE Turbo codes: a K = 16, b K = 32, c K = 64

Fig. 12 Bar chart for Maximum operations in Binary LTE Turbo codes

Fig. 13 Bar chart for Addition operations in Binary LTE Turbo codes

Fig. 14 Bar chart for Subtraction operations in Binary LTE Turbo codes

Fig. 15 Bar chart for Multiplication operations in Binary LTE Turbo codes

Fig. 16 Bar chart for Division operations in Binary LTE Turbo codes

Figure 11 is broken down into Figs. 11a–c for more clarity.

Figures 11a–c show that exponential operations are used only in Methods 1 and 3.

Figure 12 shows that all 3 methods use the same number of Maximum operations.

Figure 13 shows that Method 3 uses more addition operations than Method 2 which in turn uses more addition operations than Method 1.

Figure 14 shows that Method 1 uses more subtraction operations than Method 3, which in turn uses more subtraction operations than Method 2.

Figure 15 shows that only Methods 1 and 3 use multiplication operations. Method 3 uses more multiplication operations than Method 1.

Figure 17 is broken down into Figs. 17a–c for more clarity. All 3 figures show that Methods 1 and 3 use more operations than Method 2. The total number of binary logical operations used by Method 2 is significantly lower than that of Methods 1 and 3 in the context of binary Turbo codes. Considering K = 16, Methods 1 and 3 use approximately 25,590 times more binary logical operations in total than Method 2 at each half-iteration. Using Method 2 with binary Turbo codes therefore significantly reduces the complexity, which in turn leads to a more energy efficient/power saving implementation. This reduction costs little in error performance: without any extrinsic information scaling and stopping technique, the BER graphs of the 3 conventional Max-Log MAP decoding methods almost overlap over the whole Eb/N0 region.

Fig. 17 Bar chart for Total operations in Binary LTE Turbo codes: a K = 16, b K = 32, c K = 64

5.3 Total Computational Complexity with Scaling/Stopping

The total computational complexity of binary LTE Turbo codes with SDR and Regression-based scaling and stopping mechanisms is analysed in this sub-section for K = 16 and N = 6144.

Table 9 shows the total number of binary logical operations for the 3 decoding approaches with SDR-based scaling and stopping. The same is shown in Table 10 for the 3 decoding methods with Regression-based stopping and scaling incorporated.

Table 9 Total number of binary logical operations for binary LTE Turbo codes with SDR-based stopping and scaling with K = 16 and N = 6144
Table 10 Total number of binary logical operations for binary LTE Turbo codes with Regression-based scaling and stopping with K = 16 and N = 6144

5.4 Comparative Analysis of Computational Complexity

The total number of logical computations with K = 16 and N = 6144 is given in Table 11.

Table 11 Total number of binary logical operations for binary LTE Turbo codes

The bar chart representation of the values in the above table is shown in Fig. 18.

Fig. 18 Bar chart for total number of binary logical operations for binary LTE Turbo codes

The bars pertaining to Methods 1 and 3 reveal that SDR scaling and stopping does not increase the number of computations significantly, in contrast to Regression-based stopping and scaling. The order of the number of computations remains at \(10^{12}\) when incorporating SDR-based stopping and scaling, while it rises to \(10^{17}\) when using Regression-based stopping and scaling. This rise in the number of computations is compensated by the significant improvement in error performance and the reduced number of iterations over the Eb/N0 range. Figure 19 isolates Method 2 so that its trend can be analysed more clearly.

Fig. 19 Bar chart for total number of binary logical operations for binary LTE Turbo codes (Method 2 only)

The same trend is observed with Method 2. The order of the number of computations remains at \(10^{8}\) when incorporating SDR-based stopping and scaling, while it rises to \(10^{9}\) when using Regression-based stopping and scaling. Although the order of the total number of computations is lower with Method 2, the error performance graph reveals that incorporating Regression-based extrinsic information scaling and stopping with Method 2 causes an error-floor at a BER of around \(10^{-4}\). SDR scaling and stopping, on the contrary, is a good candidate for use with Method 2 when considering the error performance, the iterations profile and the trade-off pertaining to the increase in the total number of computations.

6 Conclusion

The work presented in this paper is essentially the derivation and analysis of the computational complexity of different decoding algorithms for binary LTE Turbo codes. The computational complexity is evaluated in terms of binary logical operations for generalisation. Results demonstrate the variation in computational complexity when using different algorithms for Turbo decoding. When considering streams of 16 bits, Method 3 uses 0.0065% more operations in total than Method 1. Furthermore, Method 2 uses only 0.0035% of the total logical complexity needed with Method 1. The incorporation of SDR and Regression-based scaling and stopping with the Turbo decoding methods has also been simulated and analysed. When considering Methods 1 and 3, the order of the computational complexity remains at \(10^{12}\) when using SDR, which provides an average gain of 0.1 dB in error performance and a gain of 3 iterations at an Eb/N0 value of 0.6 dB. The average gain in error performance is 0.3 dB when using regression-based stopping and scaling. In terms of the iterations profile, gains of above 6 iterations are obtained at an Eb/N0 value of 0.7 dB over the conventional decoding methods.

This work is important since using a decoding method with fewer binary logical operations significantly reduces the computational complexity, which in turn leads to a more energy efficient/power saving hardware implementation. Also, evaluating the computational complexity in terms of the number of binary logical operations provides a more generalised mechanism for comparing different decoding algorithms for error control codes, in contrast to complexity measures that are tied to the type of hardware used.

Several directions for future work follow from this study. A straightforward one is a comparative analysis of the computational complexity, in terms of the number of binary logical operations, between Turbo codes and other error control codes such as non-binary Turbo codes and Low Density Parity Check (LDPC) codes. Another is to formulate the exact mathematical expression for the Logarithm operation instead of assuming the maximum bound.