1 Introduction

Turbo codes [1] have been widely deployed in various state-of-the-art transmission systems such as 3G, 4G long term evolution (LTE), and LTE-Advanced [2, 3]. CDMA 2000 [4] and HomePlug Green PHY [5] also employ turbo codes to improve transmission over hostile communication channels. However, the complexity of iterative decoding algorithms still needs to be reduced in order to meet the requirements of applications with stringent delay constraints. As such, various schemes have been proposed to enhance conventional turbo decoding algorithms. An overview of these schemes is given next.

The authors of [6] have presented a novel irregular interleaver design based on the sphere bound of a d-dimensional turbo code together with a new normalized dispersion measurement definition for random and irregular interleavers. In [7], a low-complexity maximum likelihood decoding algorithm for convolutional tailbiting codes based on the Viterbi algorithm has been proposed. Results under realistic transmission conditions demonstrate the lower memory requirements of the proposed solution. With a view to designing turbo codes with a good distance spectrum, a method for searching interleavers within a certain class has been proposed in [8]. The proposed method is applied to the quadratic permutation polynomial (QPP) interleaver used in the LTE standard. In comparison to previous methods, the search complexity and search time are significantly reduced, which allows longer interleavers to be considered.

In [9], the authors have observed that the extrinsic information output by Max-Log-maximum a posteriori probability (MAP) turbo decoders had too much negative impact on the a posteriori probabilities and therefore proposed a scaling factor with a value of less than 1 to attenuate its effect. The typical value which has been reported to achieve good performance is 0.75 [9] when using the IMT-2000/3GPP parameters. Simulations of this proposed scheme have demonstrated that gains of about 0.2–0.4 dB in \( E_b/N_0 \) (the ratio of the bit energy, \( E_b \), to the noise power spectral density, \( N_0 \)) could be obtained over the conventional Max-Log-MAP decoding algorithm. In [10], a variant of the scheme developed in [9] was adapted for both the MAP algorithm and the soft output Viterbi algorithm (SOVA). The empirical values of the extrinsic information scale factors used with both the MAP algorithm and SOVA enhance the error performance in both AWGN and fading channels. Research has also been conducted on schemes which adaptively determine an extrinsic information scaling factor at every half-iteration of the turbo decoding process. As such, the sign difference ratio (SDR) scheme of [11] was modified by the authors in [12] to dynamically generate the scale factors for each iteration. Compared to other existing stopping schemes, the proposed SDR technique gives no degradation in terms of error performance together with a reduced complexity in the amount of data storage required. In [13], the adaptive SDR-based scaling technique of [11] has been combined with a modified asymmetric LTE turbo code and reliability-based hybrid automatic repeat request (RB-HARQ) to enhance the performance. The simulation results have demonstrated that the proposed scheme of [14] could achieve a gain of 1 dB in \( E_b/N_0 \) over a conventional LTE turbo code with RB-HARQ.
Moreover, in [13] and [15], the adaptive SDR-based extrinsic information scaling mechanism was combined with JSCD and prioritized quadrature amplitude modulation (QAM) constellation mapping to enhance the performance of the turbo code. Recently, in [16], two different online methods for finding the scaling factors providing the optimum performance for both the symbol and the bit level are proposed. It has been shown in this study that near-optimum turbo decoding, regardless of the initial noise variance mismatch, can be obtained with online log likelihood ratio (LLR) scaling, without knowledge about the noise mismatch statistics in advance. In [17], the authors have tackled a major challenge by targeting a very high-throughput turbo decoder without losing in terms of the excellent error performance. An LTE turbo code decoder with a throughput of 2.15 GBit/s, which is compliant with LTE advanced, is proposed. In [18], the properties of the mutual information between the extrinsic LLR at the output of the two constituent decoders are analyzed with application to turbo codes. Two online methods for the estimation of optimal scaling factor applied to the extrinsic LLR have been derived.

The gains obtained with scaling schemes come with a trade-off in terms of computational complexity. To reduce the decoding complexity of turbo codes, early stopping or iterative detection schemes are used to prevent unnecessary additional iterations in the decoding process. For example, in [19], early stopping techniques which employ a predicted decoding threshold were proposed. These mechanisms have been divided into two categories. The first one is soft-bit decision based, such as absolute LLR measurements [20] and cross-entropy (CE) [21]. The second class is hard-bit decision based, such as sign change ratio [22] and SDR [11]. These mechanisms aid in reducing the computational complexity since no additional unnecessary iterations are performed. However, most of these proposed stopping criteria only consider the solvable decoding aspect. In [19], a new technique which employs the cross-correlation as a measure of information to predict the decoding threshold was proposed. Furthermore, in [19], two early termination (ET) mechanisms (ET-I and ET-II) which operate using the predicted decoding threshold were developed. The iterative turbo decoding process could halt in either high signal-to-noise ratio (SNR) regions where there is a high reliability of the decoded bits (solvable decoding) or low SNR regions in which it is difficult for the decoder to estimate the transmitted information (unsolvable decoding). The results have demonstrated that no degradation in error performance is observed as a consequence of the reduced iterations with the ET-I scheme. Also, the reduction in iterations due to the ET-II scheme did not incur any performance loss. The main difference between these two schemes is that ET-I is based on the infinite-iteration decoding threshold, obtained from the extrinsic information transfer (EXIT) chart, while ET-II is based on the fixed-iteration decoding threshold. Iterative detection schemes have also been developed for duo-binary turbo codes.
For example, in [23], emphasis was laid on designing a stopping criterion for duo-binary turbo codes. The proposed solutions were tested by employing the architecture of HomePlug AV, a new-generation technology developed to distribute both video and audio data over existing power-line wiring at home. In [24], a new stopping criterion was developed for duo-binary turbo codes. The technique is based on the correlation between the bit-level extrinsic information and has been proposed while taking the symbol-level decoding of duo-binary turbo codes into consideration. The simulation results demonstrate that the proposed scheme could decrease the average number of decoding iterations together with an enhanced error performance as compared to the conventional criteria. The scheme proposed in [25] has the capability of detecting and tagging convergent windows by using a threshold, which reduces the number of computations for convergent windows in further iterations to detect the a posteriori values of each window. The simulation results with the WiMAX convolutional turbo code show that the stopping scheme is able to reduce 47 % of the convergent windows at an \( E_b/N_0 \) of 1.4 dB with a small loss in coding gain.

This paper proposes a novel extrinsic information scaling mechanism which not only improves the bit error rate (BER) performance but also provides an early stopping mechanism, thereby reducing the computational complexity in both low and high \( E_b/N_0 \) regions. The scaling factor is obtained by computing the Pearson’s correlation coefficient between the extrinsic and a posteriori LLR at every half-iteration. Additionally, two stopping criteria based on the regression angle and the Pearson’s correlation coefficient for the low and high \( E_b/N_0 \) regions, respectively, are proposed. The regression angle is obtained by computing the slope of the line of best fit from the plot of the Pearson’s correlation coefficient values at each half-iteration. The proposed scheme demonstrates significant gains in terms of error performance and computational complexity. For example, simulation results with 16-QAM and a code rate of 1/3 show an average gain of 0.3 dB in \( E_b/N_0 \) for BERs below \( 10^{-3} \), with a minimum average gain of 0.9 iterations per packet at an \( E_b/N_0 \) of 2.1 dB and an average gain of 7 iterations per packet for \( E_b/N_0 \) values less than 1 dB over the conventional LTE turbo code.

The organization of this paper is as follows. A background study is presented in Sect. 2. A detailed description of the proposed scheme is given in Sect. 3. The simulation results with an in-depth analysis are presented in Sect. 4. Section 5 concludes the paper.

2 Background

This section gives an overview of turbo decoding when techniques such as extrinsic information scaling and iterative detection have been incorporated in the decoding process. A review of the regression analysis is also presented.

2.1 Turbo code with extrinsic information scaling and iterative detection

A generic turbo decoding architecture with iterative detection and scaling is shown in Fig. 1. A conventional turbo decoder does not have any of the blocks A and B.

Fig. 1 Generic turbo decoding architecture receiver system

The conventional turbo decoding equations based on the Max-Log-MAP algorithm [26, 27] are presented next. The branch transition probability for decoder 1 which starts in state l′ and ends in state l at time instant t with an input bit i (i = 0 or 1) is given as:

$$ {\gamma}_t^{1(i)}\left({l}^{\prime },l\right)={ \log}_e\left[{p}_t^1(i) \exp \left(-\frac{{\left[r{0}_t - {S}_{0_t}\right]}^2 + {\left[r{1}_t - {P}_{1_t}\right]}^2}{2{\sigma}^2}\right)\right]={ \log}_e\left[{p}_t^1(i)\right] - \left(\frac{{\left[r{0}_t - {S}_{0_t}\right]}^2 + {\left[r{1}_t - {P}_{1_t}\right]}^2}{2{\sigma}^2}\right) $$
(1)

where \( { \log}_e\left({p}_t^1(i)\right) \) is the natural logarithm of the a priori probability of the input bit i, which is computed from the channel extrinsic information and fed to the first decoder; \( r{0}_t \) and \( r{1}_t \) are the demapped soft-bits, which correspond to the bipolar equivalents of the transmitted systematic bits \( {S}_{0_t} \) and first parity bits \( {P}_{1_t} \), respectively; and \( {\sigma}^2 \) is the noise variance.
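As a minimal sketch of Eq. (1), the branch metric can be evaluated for one trellis transition as shown here. The function name and the numerical values are hypothetical and chosen only for illustration; the expected bipolar symbols `s0` and `p1` for the transition are assumed to be known from the trellis.

```python
import math

def branch_metric(p_apriori, r0, r1, s0, p1, sigma2):
    """Eq. (1): log a priori probability minus the scaled squared distance
    between the received soft-bits (r0, r1) and the bipolar symbols (s0, p1)
    expected on the transition."""
    distance = (r0 - s0) ** 2 + (r1 - p1) ** 2
    return math.log(p_apriori) - distance / (2.0 * sigma2)

# Hypothetical values: equiprobable input bit, noisy observation of (+1, -1).
gamma = branch_metric(p_apriori=0.5, r0=0.9, r1=-1.2, s0=1.0, p1=-1.0, sigma2=0.5)
print(gamma)
```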

The LLR \( {\wedge}_1^{(n)}(t) \) at time t and nth iteration for the first decoder is given as:

$$ {\wedge}_1^{(n)}(t)= \max \left({\alpha}_{t-1}^1\left({l}^{\prime}\right)+{\gamma}_t^{1(1)}\left({l}^{\prime },l\right) + {\beta}_t^1(l)\right)- \max \left({\alpha}_{t-1}^1\left({l}^{\prime}\right)+{\gamma}_t^{1(0)}\left({l}^{\prime },l\right) + {\beta}_t^1(l)\right)\ \mathrm{f}\mathrm{o}\mathrm{r}\ 0\le {l}^{\prime}\le {\mathrm{M}}_S-1 $$
(2)

where \( {\alpha}_t^1(l) \) is the forward recursive variable at state l and time t and \( {\beta}_t^1(l) \) is the backward recursive variable at state l and time t.

The extrinsic information \( {\wedge}_{1e}^{(n)}(t) \) at time t and iteration n for the first decoder is given as:

$$ {\wedge}_{1e}^{(n)}(t) = {\wedge}_1^{(n)}(t)-\frac{2}{\sigma^2}r{0}_t-{\overline{\wedge}}_{2e}^{\left(n-1\right)}(t) $$
(3)

where \( {\overline{\wedge}}_{2e}^{\left(n-1\right)}(t) \) is the deinterleaved extrinsic information at time t and the (n − 1)th iteration for the second decoder.
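The subtraction in Eq. (3) can be illustrated with a short sketch. The function name and the sample LLR values below are assumptions made purely for illustration.

```python
def extrinsic_llr(llr, r0, prev_extrinsic, sigma2):
    """Eq. (3): remove the channel term (2/sigma^2)*r0 and the deinterleaved
    extrinsic term of the other decoder from the a posteriori LLR."""
    return [a - (2.0 / sigma2) * r - e
            for a, r, e in zip(llr, r0, prev_extrinsic)]

# Hypothetical three-bit packet with unit noise variance.
ext = extrinsic_llr(llr=[4.0, -3.0, 1.5],
                    r0=[0.8, -0.6, 0.1],
                    prev_extrinsic=[0.5, -0.4, 0.9],
                    sigma2=1.0)
print(ext)
```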

The same decoding operations are performed by decoder 2 using the inputs \( \overline{r{0}_t},\;r{2}_t \), and \( {p}_t^2(i) \), where \( {p}_t^2(i) \) is the a priori probability of the input bit i, which is computed from the channel extrinsic information and fed to the second decoder, and \( {\overline{r0}}_t \) and \( r{2}_t \) are the demapped soft-bits, which correspond to the bipolar equivalents of the interleaved systematic bits \( \overline{S_{0_t}} \) and second parity bits \( {P}_{2_t} \), respectively.

When only iterative detection is considered, block A only is incorporated in the system. Block A contains the unit which takes as input the extrinsic information and a posteriori LLR output from each decoder at each iteration in order to compute a value for the stopping threshold. When the stopping condition is met, a signal is sent to the corresponding switches T1 and T2 which are opened and the iterative process is stopped.

With the SDR algorithm for iterative detection [11], the SDR value is computed as follows [26, 28]:

$$ SD{R}_d^{(n)} = \frac{1}{N}\ {f}_{\mathrm{diff}}\left({\wedge}_{de}^{(n)},\kern0.5em {\wedge}_d^{(n)}\right),\kern0.5em \mathrm{f}\mathrm{o}\mathrm{r}\ n\ge 1 $$
(4)

where \( SD{R}_d^{(n)} \) is the SDR value for decoder d at iteration n; \( {f}_{\mathrm{diff}}\left({\wedge}_{de}^{(n)}, {\wedge}_d^{(n)}\right) \) is the function computing the number of differences in sign between the a posteriori LLR \( {\wedge}_d^{(n)} \) and the extrinsic information \( {\wedge}_{de}^{(n)} \) for decoder d at iteration n; and N is the length of the packet.

The decoding process is halted when \( SD{R}_d^{(n)} \) drops below a threshold chosen in the range 0.001 ≤ threshold ≤ 0.01, which has been experimentally determined in [11].
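A minimal sketch of the SDR test of Eq. (4) follows. The function names and sample values are hypothetical, and treating an LLR of exactly zero as non-negative is an assumption, since the sign convention at zero is not specified here.

```python
def sign_difference_ratio(extrinsic, posterior):
    """Eq. (4): fraction of positions where the extrinsic and a posteriori
    LLRs disagree in sign (zero is treated as non-negative, an assumption)."""
    diffs = sum(1 for e, a in zip(extrinsic, posterior) if (e >= 0) != (a >= 0))
    return diffs / len(posterior)

def sdr_should_stop(extrinsic, posterior, threshold=0.01):
    """Stop when the SDR drops below the chosen threshold."""
    return sign_difference_ratio(extrinsic, posterior) < threshold

# Hypothetical four-bit packet: only the second position differs in sign.
sdr = sign_difference_ratio([1.2, -0.4, 0.3, -2.0], [2.0, 0.5, 0.8, -3.1])
print(sdr)
```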

The iterative detection based on the sign change ratio (SCR) algorithm [22] takes as input only the extrinsic information from each decoder and computes the threshold based on the number of changes in sign between the current and previous extrinsic information. The value for the SCR is computed as follows [26, 28]:

$$ SC{R}_d^{(n)} = \frac{1}{N}\ {f}_{\mathrm{change}}\left({\wedge}_{de}^{(n)},\kern0.5em {\wedge}_{de}^{\left(n-1\right)}\right),\kern0.5em \mathrm{f}\mathrm{o}\mathrm{r}\ n>1 $$
(5)

where \( SC{R}_d^{(n)} \) is the SCR value for decoder d at iteration n; \( {f}_{\mathrm{change}}\left({\wedge}_{de}^{(n)}, {\wedge}_{de}^{\left(n-1\right)}\right) \) is the function computing the number of changes in sign between the current and previous extrinsic information for decoder d; and N is the length of the packet.

The decoding process is stopped when \( SC{R}_d^{(n)} \) drops below a threshold chosen in the range 0.005 ≤ threshold ≤ 0.03, which has been experimentally determined in [22].

The Hard Decision Aided (HDA) [19] criterion terminates the iterative process if the hard decision which is output from the decoder is unchanged for two successive iterations. This criterion can be used only when n > 1 similar to the SCR-based stopping mechanism.
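The SCR and HDA criteria can be sketched together, since both compare the current iteration with the previous one. The helper names are hypothetical, and mapping a non-negative LLR to bit 1 is an assumed convention.

```python
def hard_bits(llrs):
    """Hard decision: non-negative LLR -> 1, negative -> 0 (assumed mapping)."""
    return [1 if x >= 0 else 0 for x in llrs]

def sign_change_ratio(ext_now, ext_prev):
    """Eq. (5): fraction of extrinsic values whose sign changed since the
    previous iteration."""
    changes = sum(a != b for a, b in zip(hard_bits(ext_now), hard_bits(ext_prev)))
    return changes / len(ext_now)

def hda_should_stop(llr_now, llr_prev):
    """HDA: stop when the hard decisions of two successive iterations agree."""
    return hard_bits(llr_now) == hard_bits(llr_prev)

# Hypothetical values: one sign change out of four positions.
scr = sign_change_ratio([0.9, -1.1, 0.2, -0.7], [0.5, -0.8, -0.1, -0.3])
print(scr, hda_should_stop([2.0, -1.0], [1.5, -0.2]))
```

Both checks need a previous iteration to compare against, which is why they apply only for n > 1.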

The ET-I scheme is based on the infinite-iteration decoding threshold, obtained from the extrinsic information transfer (EXIT) chart, while the ET-II scheme is based on a fixed-iteration decoding threshold [19].

When iterative detection together with extrinsic information scaling is considered, both blocks A and B are incorporated in the system. Block A consists of the generator for the extrinsic information scaling factor and stopping mechanisms. Block B contains the multiplier which scales the extrinsic information with the scale factor generated from block A. In [12], the stopping threshold used was 1.0, which implies that there is no sign difference between the a posteriori LLR and extrinsic information. When the scale factor generated from block A has a value less than 1.0, it is used as a scaling factor, and when the value of the scale factor is equal to 1.0, a stopping signal is sent to the switches to terminate the iterative process [12, 14, 15]. The scale factor for each decoder d at iteration n is computed as follows:

$$ {S}_{dn} = \frac{1}{N}{\displaystyle {\sum}_{t=1}^Nf\left({\wedge}_{de}^{(n)},{\wedge}_d^{(n)}\right)} $$
(6)

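where \( f\left({\wedge}_{de}^{(n)},{\wedge}_d^{(n)}\right) = 1 \) if \( {\wedge}_{de}^{(n)} \) and \( {\wedge}_d^{(n)} \) have the same sign, and \( f\left({\wedge}_{de}^{(n)},{\wedge}_d^{(n)}\right) = 0 \) otherwise; N is the frame size in bits.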
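The dual role of the scale factor in Eq. (6), as a multiplier when it is below 1.0 and as a stopping signal when it equals 1.0, can be sketched as follows. The function names and the sample values are hypothetical.

```python
def sdr_scale_factor(extrinsic, posterior):
    """Eq. (6): fraction of positions where the extrinsic and a posteriori
    LLRs share the same sign."""
    same = sum(1 for e, a in zip(extrinsic, posterior) if (e >= 0) == (a >= 0))
    return same / len(posterior)

def scale_or_stop(extrinsic, posterior):
    """Return (scaled extrinsic values, stop flag) for one half-iteration."""
    s = sdr_scale_factor(extrinsic, posterior)
    if s == 1.0:
        return extrinsic, True              # full sign agreement: stop iterating
    return [s * e for e in extrinsic], False

# Hypothetical values: three of four positions agree in sign.
scaled, stop = scale_or_stop([1.0, -2.0, 0.5, -0.5], [2.0, -1.0, -0.3, -1.0])
print(scaled, stop)
```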

When only extrinsic information scaling is considered, block B only is incorporated in the system where a constant scale factor of 0.75 is used as per the schemes in [9]. The value of the constant scale factor of 0.75 has been generalized for turbo codes in [29] and [30].

2.2 Regression analysis

The process of laying out a straight line which best fits the average relationship between two variables X and Y is in fact the measurement of linear correlation between these two variables [31]. This line is called the line of regression or line of best fit. The sum of the squared vertical distances from the data coordinates to the line of regression needs to be the least possible for “best fit” [31]. The slope or gradient of the line then corresponds to the direction of correlation. The estimated regression function is \( {Y}_{idx} = {b}_0 + {b}_1{X}_{idx} \). The computation of \( {b}_0 \) and \( {b}_1 \) is done using the method of least squares [31]. \( {b}_0 \) is the point at which the line crosses the Y-axis, and \( {b}_1 \) is the gradient of the line. The equations used for the determination of the constants \( {b}_0 \) and \( {b}_1 \) are as follows:

$$ {b}_1 = \frac{S{C}_{XY}}{S{S}_X} = \frac{{\displaystyle \sum}\left({X}_{idx}-\overline{X}\right)\left({Y}_{idx}-\overline{Y}\right)}{{\displaystyle \sum }{\left({X}_{idx}-\overline{X}\right)}^2} $$
(7)
$$ {b}_0=\overline{Y} - {b}_1\overline{X} $$
(8)

where \( S{C}_{XY} \) is the sum of co-deviates for paired values of \( {X}_{idx} \) and \( {Y}_{idx} \); \( S{S}_X \) is the sum of squared deviates for the \( {X}_{idx} \) values; \( \overline{Y} \) is the mean of all the \( {Y}_{idx} \) values; and \( \overline{X} \) is the mean of all the \( {X}_{idx} \) values.
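Equations (7) and (8) translate directly into a few lines of code. The sketch below uses a hypothetical function name and a small made-up data set lying exactly on the line Y = 1 + 2X.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) from Eqs. (7) and (8)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sc_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # co-deviates
    ss_x = sum((x - x_bar) ** 2 for x in xs)                        # squared deviates
    b1 = sc_xy / ss_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```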

The primary measure of linear correlation is the Pearson’s product–moment correlation (r) which lies between −1 and +1 [31]. The Pearson’s product–moment correlation coefficient (r) is defined as:

$$ r=\frac{{\displaystyle \sum}\left({X}_{idx} - \overline{X}\right) \times \left({Y}_{idx} - \overline{Y}\right)}{\sqrt{{\displaystyle \sum }{\left({X}_{idx} - \overline{X}\right)}^2 \times {\displaystyle \sum }{\left({Y}_{idx} - \overline{Y}\right)}^2}} $$
(9)

The categories of correlation are:

  1. Positive correlation: the other variable also increases upon increasing one variable.

  2. Negative correlation: the second variable decreases when the first one is increased.

  3. No correlation: the second variable neither increases nor decreases when the first one is increased.

Different conclusions can be made based on the categories of correlation:

  1. A positive linear correlation is denoted by positive values.

  2. A negative linear correlation is denoted by negative values.

  3. No linear correlation is denoted by a value of zero.

  4. Strong linear correlation exists when the values are close to +1 or −1.

Pearson’s correlation coefficient (\( {r}^2 \)), the square of r in Eq. (9), is defined as:

$$ {r}^2={\left(\frac{{\displaystyle \sum}\left({X}_{idx}-\overline{X}\right) \times \left({Y}_{idx} - \overline{Y}\right)}{\sqrt{{\displaystyle \sum }{\left({X}_{idx} - \overline{X}\right)}^2 \times {\displaystyle \sum }{\left({Y}_{idx} - \overline{Y}\right)}^2}}\right)}^2 $$
(10)

The statistical measure of the strength of a linear relationship between paired data is called the Pearson’s correlation coefficient (\( {r}^2 \)) [31].
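To make the difference between Eqs. (9) and (10) concrete, the sketch below computes r for two small hypothetical data sets. The function name is an assumption. Squaring r discards the direction of the correlation and keeps only its strength.

```python
import math

def pearson_r(xs, ys):
    """Eq. (9): Pearson's product-moment correlation coefficient."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sc_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return sc_xy / math.sqrt(ss_x * ss_y)

r_neg = pearson_r([1, 2, 3], [6, 4, 2])    # perfectly negative linear relation
r_none = pearson_r([1, 2, 3], [1, 0, 1])   # no linear relation
print(r_neg, r_neg ** 2, r_none)
```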

3 Proposed system model

This section describes the transmitter and receiver of the proposed system. Essentially, it is the receiver that departs from previous scaling and iterative detection schemes by using regression analysis.

3.1 The proposed decoding method

The complete transmitter and receiver system is shown in Fig. 2. The transmitter system consists of a rate 1/3 LTE turbo encoder [3] with generator polynomial [1, 15/13] in octal. A set of random bits S0 is encoded, sub-block interleaved, interlaced, multiplexed, and modulated with QPSK modulation or 16-QAM. Then, the symbols are sent over a complex additive white Gaussian noise (AWGN) channel and the corrupted information is intercepted at the receiver side. The received signal \( {R}_t \) is demapped, deinterlaced, and sub-block deinterleaved into \( P{0}_t \), \( P{1}_t \), and \( P{2}_t \), which are, respectively, the soft information corresponding to the systematic and parity bits. These are then sent to the elementary decoders (DEC1 and DEC2) of the turbo decoder. The regression analyzer outputs the scaling factor \( {r}_d^{2(n)} \), which is the measure of the linear correlation between the a posteriori LLR \( {\wedge}_d^{(n)}(t) \) and the extrinsic LLR \( {\wedge}_{de}^{(n)}(t) \) as shown in Eq. (11).

Fig. 2 Complete transmitter and receiver system

A simplified flowchart of the complete transmitter and receiver system of Fig. 2 is shown in Fig. 3.

$$ {r}_d^{2(n)}={\left(\frac{{\displaystyle {\sum}_{t=1}^N}\left({\wedge}_d^{(n)}(t) - \widehat{\wedge_d^{(n)}}\right)\times \left({\wedge}_{de}^{(n)}(t) - \widehat{\wedge_{de}^{(n)}}\right)}{\sqrt{{\displaystyle {\sum}_{t=1}^N}{\left({\wedge}_d^{(n)}(t) - \widehat{\wedge_d^{(n)}}\right)}^2\times {\displaystyle {\sum}_{t=1}^N}{\left({\wedge}_{de}^{(n)}(t) - \widehat{\wedge_{de}^{(n)}}\right)}^2}}\right)}^2 $$
(11)

where d = {1, 2} is the decoder number; N is the packet length, which is 6144 [3] in this simulation; \( {r}_d^{2(n)} \) is the scaling factor for decoder d at iteration n; \( {\wedge}_d^{(n)}(t) \) is the a posteriori LLR of decoder d at iteration n and time t; and \( \widehat{\wedge_d^{(n)}} \) is the mean a posteriori LLR of decoder d at iteration n and is computed as:

$$ \widehat{\wedge_d^{(n)}} = \frac{1}{N}{\displaystyle {\sum}_{t=1}^N{\wedge}_d^{(n)}(t)} $$
(12)

\( {\wedge}_{de}^{(n)}(t) \) is the extrinsic LLR of decoder d at iteration n and time t, and \( \widehat{\wedge_{de}^{(n)}} \) is the mean extrinsic LLR of decoder d at iteration n and is computed as:

$$ \widehat{\wedge_{de}^{(n)}} = \frac{1}{N}{\displaystyle {\sum}_{t=1}^N{\wedge}_{de}^{(n)}(t)} $$
(13)

n is the iteration index, counted in half-iteration steps and taking values in {1/2, 1, …, I}, where I is the maximum number of iterations.
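A minimal sketch of Eqs. (11) to (13), followed by the multiplication of the extrinsic LLRs by the resulting factor, is given here. The function name and the four-value packet are hypothetical; a real packet would contain N = 6144 values.

```python
def r_squared_scale(posterior, extrinsic):
    """Eq. (11): squared correlation between a posteriori and extrinsic LLRs,
    using the means of Eqs. (12) and (13)."""
    n = len(posterior)
    mean_a = sum(posterior) / n                      # Eq. (12)
    mean_e = sum(extrinsic) / n                      # Eq. (13)
    co = sum((a - mean_a) * (e - mean_e) for a, e in zip(posterior, extrinsic))
    ss_a = sum((a - mean_a) ** 2 for a in posterior)
    ss_e = sum((e - mean_e) ** 2 for e in extrinsic)
    return co * co / (ss_a * ss_e)

posterior = [-3.0, -1.0, 1.0, 3.0]
extrinsic = [-3.0, 1.0, -1.0, 3.0]
scale = r_squared_scale(posterior, extrinsic)
scaled_extrinsic = [scale * e for e in extrinsic]    # passed on to the next decoder
print(scale, scaled_extrinsic)
```

Because the factor lies between 0 and 1, it attenuates the extrinsic information most strongly when the two LLR sequences are poorly correlated.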

Fig. 3 Simplified flowchart of complete transmitter and receiver system

\( {r}_d^{2(n)} \) is employed as a scaling factor for all \( E_b/N_0 \) values. However, it can also be used as a stopping criterion at high \( E_b/N_0 \) values. When \( {r}_d^{2(n)} = 1 \), it implies perfect correlation between the a posteriori and the extrinsic LLR. Hence, this criterion can intuitively be used as a stopping condition. However, to derive a more accurate threshold or value of \( {r}_d^{2(n)} \) at which the iterative process can be stopped, plots of \( {r}_d^{2(n)} \) against the number of iterations at different \( E_b/N_0 \) values for QPSK modulation and 16-QAM were analyzed as depicted in Figs. 4, 5, 6, and 7.

Fig. 4 Plots for regression values with QPSK modulation and code rate = 1/3

Fig. 5 Plots for regression values with QPSK modulation and code rate = 1/2

Fig. 6 Plots for regression values with 16-QAM and code rate = 1/3

Fig. 7 Plots for regression values with 16-QAM and code rate = 1/2

The plots of Figs. 4 and 6 reveal that for high \( E_b/N_0 \), as n increases, \( {r}_d^{2(n)} \) also increases and reaches a maximum value of Th = 0.98 with both QPSK and 16-QAM at a code rate of 1/3, while the plots of Figs. 5 and 7 reveal that Th = 0.95 with both QPSK and 16-QAM at a code rate of 1/2. Hence, these experimental values are used as thresholds. However, for low \( E_b/N_0 \) values, \( {r}_d^{2(n)} \) does not change or changes very little as n increases. Hence, a different criterion based on the regression angle \( {\theta}_n \) is used as a stopping rule for low \( E_b/N_0 \) values. The regression angle \( {\theta}_n \) is obtained from the gradient of the regression line fitted to the graph of \( {r}_d^{2(n)} \) against n for a particular \( E_b/N_0 \) and is computed in the regression angle module shown in Fig. 2. To obtain this regression line, the decoders are allowed to perform up to four iterations and the values of \( {r}_d^{2(n)} \) are stored in a buffer. From four iterations onward, the gradient of the regression line is computed from the last seven \( {r}_d^{2(n)} \) values stored in the buffer. The window B is defined in terms of the total number of half-iterations and is fixed to seven in this case. The regression angle is obtained as follows:

$$ {\theta}_n^B = { \tan}^{-1}\left(\frac{{\displaystyle {\sum}_{j=2n-\left(B-1\right)}^{2n}}\left({r}_d^{2(n)}(j)-\widehat{r_d^{2(n)}}\right)\times \left(idx(j)-\widehat{idx(j)}\right)}{{\displaystyle {\sum}_{j=2n-\left(B-1\right)}^{2n}}{\left({r}_d^{2(n)}(j)-\widehat{r_d^{2(n)}}\right)}^2}\right) $$
(14)

where idx(j) is the index of the jth half-iteration and idx ∈ {1, 2, 3, …, 22, 23, 24}.

For example, consider the regression values with 16-QAM and code rate = 1/2 at \( E_b/N_0 \) = 8.0 dB and the fourth iteration.

\( {r}_d^{2(n)}(j) \) = {0.6721, 0.7683, 0.8341, 0.8745, 0.9002, 0.9207, 0.9295, 0.9407}

j = {1, 2, 3, 4, 5, 6, 7, 8}

idx = {1, 2, 3, 4, 5, 6, 7, 8}

The corresponding plot is shown in Fig. 8. The range of scale factors and iteration indices over which the regression angle is computed is also shown. As per Eq. (14), the regression angle \( {\theta}_n^B \) is obtained as 1.5405 radians for the above example.
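The worked example can be checked numerically with the sketch below. It evaluates Eq. (14) as written, with the co-deviates in the numerator and the squared deviates of the \( {r}_d^{2(n)} \) values in the denominator, over the last B = 7 values (j = 2 to 8). The function name and the zero-variance fallback are assumptions added for illustration.

```python
import math

def regression_angle(r2_window, idx_window):
    """Eq. (14): arctangent of the ratio of co-deviates to the squared
    deviates of the r^2 values inside the window."""
    n = len(r2_window)
    r_bar = sum(r2_window) / n
    i_bar = sum(idx_window) / n
    co_dev = sum((r - r_bar) * (i - i_bar) for r, i in zip(r2_window, idx_window))
    sq_dev = sum((r - r_bar) ** 2 for r in r2_window)
    if sq_dev == 0.0:
        return 0.0   # assumption: a perfectly flat window is treated as angle 0
    return math.atan(co_dev / sq_dev)

r2_history = [0.6721, 0.7683, 0.8341, 0.8745, 0.9002, 0.9207, 0.9295, 0.9407]
B = 7
window = r2_history[-B:]                                          # last seven values
idx = list(range(len(r2_history) - B + 1, len(r2_history) + 1))   # 2, 3, ..., 8
theta = regression_angle(window, idx)
print(theta)
```

With these inputs the result is about 1.54 radians, in line with the value quoted for the example.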

Fig. 8 Plot for regression values with 16-QAM and code rate = 1/2 at fourth iteration and \( E_b/N_0 \) = 8.0 dB

The acute regression angles can be positive, negative, or equal to zero as shown in Fig. 9.

Fig. 9 Regression angles

It can be clearly seen from Figs. 4, 5, and 6 that negative regression angles are obtained for a small window B at low \( E_b/N_0 \) values. Thus, a general condition for stopping the iterative decoding at any \( E_b/N_0 \) value and any window size B would be to check that the condition \( {\theta}_n^B \le 0 \) is met.

The threshold detector block in Fig. 2 checks for the following conditions: \( {\theta}_n^B \le 0 \) or \( {r}_d^{2(n)} \ge Th \). When either is satisfied, the switches T1 or T2 are opened and the iterative decoding operation is terminated. \( {\wedge}_d^{(n)}(t) \) is then used for the final hard decision.

The algorithm for regression analysis based extrinsic information scaling and early stopping can be formalized as follows:

  1. Set the total number of iterations I to 12.

  2. Initialize to zero the counter for the number of iterations, \( {n}_{\mathrm{iters}} = 0 \).

  3. For each iteration.

  4. Compute \( {\gamma}_t^{1(i)}\left({l}^{\prime },l\right) \) for decoder 1 as per Eq. (1).

  5. Compute \( {\alpha}_t^1(l) \) for decoder 1.

  6. Compute \( {\beta}_t^1(l) \) for decoder 1.

  7. Compute \( {\wedge}_1^{(n)}(t) \) for decoder 1 as per Eq. (2).

  8. Compute the extrinsic information LLR, \( {\wedge}_{1e}^{(n)}(t) \), as per Eq. (3).

  9. Compute the regression-based scale factor, \( {r}_1^{2(n)} \), as per Eq. (11).

  10. Increment \( {n}_{\mathrm{iters}} \) in steps of 0.5: \( {n}_{\mathrm{iters}} = {n}_{\mathrm{iters}} + 0.5 \).

  11. Set the regression window B to 7.

  12. If \( {n}_{\mathrm{iters}} \ge 4 \).

  13. Compute the regression angle, \( {\theta}_n^B \), as per Eq. (14).

  14. If \( {\theta}_n^B \le 0 \) or \( {r}_1^{2(n)} \ge Th \).

  15. Perform hard decision on \( {\wedge}_1^{(n)}(t) \).

  16. Go to step 33.

  17. End if.

  18. End if.

  19. Compute \( {\gamma}_t^{2(i)}\left({l}^{\prime },l\right) \) for decoder 2.

  20. Compute \( {\alpha}_t^2(l) \) for decoder 2.

  21. Compute \( {\beta}_t^2(l) \) for decoder 2.

  22. Compute \( {\wedge}_2^{(n)}(t) \) for decoder 2.

  23. Compute the extrinsic information LLR, \( {\wedge}_{2e}^{(n)}(t) \).

  24. Compute the regression-based scale factor, \( {r}_2^{2(n)} \), as per Eq. (11).

  25. Increment \( {n}_{\mathrm{iters}} \) in steps of 0.5: \( {n}_{\mathrm{iters}} = {n}_{\mathrm{iters}} + 0.5 \).

  26. If \( {n}_{\mathrm{iters}} \ge 4 \).

  27. Compute the regression angle, \( {\theta}_n^B \), as per Eq. (14).

  28. If \( {\theta}_n^B \le 0 \) or \( {r}_2^{2(n)} \ge Th \).

  29. Perform hard decision on \( {\wedge}_2^{(n)}(t) \).

  30. Go to step 33.

  31. End if.

  32. End if.

  33. End for loop.
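The stopping test applied after each half-iteration (steps 12 to 14 and 26 to 28) can be sketched as a single predicate. The function name, argument names, and sample values are hypothetical; the regression angle and scale factor are assumed to have been computed beforehand as per Eqs. (14) and (11).

```python
def should_stop(n_iters, theta, r2, threshold, min_iters=4.0):
    """Return True when decoding may terminate after the current half-iteration.

    n_iters   : iterations completed so far, counted in steps of 0.5
    theta     : regression angle from Eq. (14)
    r2        : regression-based scale factor from Eq. (11)
    threshold : Th (0.98 for code rate 1/3, 0.95 for code rate 1/2)
    """
    if n_iters < min_iters:
        return False                        # buffer of r^2 values not yet filled
    return theta <= 0.0 or r2 >= threshold  # low or high Eb/N0 stopping rule

# Hypothetical checks at code rate 1/2 (Th = 0.95).
print(should_stop(4.0, 1.54, 0.9407, 0.95))   # still converging
print(should_stop(5.0, 1.20, 0.96, 0.95))     # r^2 reached the threshold
```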

3.2 Computational complexity analysis

The computational complexity analysis is performed by comparing the average number of computations per packet. The number of computations per time unit for a convolutional code with k inputs and v memory elements with Max-Log MAP decoding algorithm is given as follows [27]:

$$ \mathrm{Number}\ \mathrm{of}\ \mathrm{computations}\ \mathrm{per}\ \mathrm{time}\ \mathrm{unit}=\left(\left(4\times {2}^k\times {2}^v\right)+8\right)+\left(\left(2\times {2}^k\right)\times {2}^v\right)+\left(\left(4\times {2}^v\right)-2\right) $$
(15)

For the conventional LTE turbo code with k = 1 and v = 3, the number of computations at each iteration is given as:

$$ {C}_{\mathrm{LTE}}=N \times \left[\left(64+8\right)+(32)+\left(32-2\right)\right]=134N $$
(16)

where C LTE is the number of computations for each packet at each iteration for the conventional LTE turbo decoder.

When SDR scaling is used, the additional number of computations for each iteration as per Eq. (6) is as shown in Table 1.

Table 1 Table for number of additional computations per iteration

The total number of computations at each iteration for LTE turbo codes with SDR scaling is given as:

$$ {C}_{\mathrm{SDR}}={C}_{\mathrm{LTE}}+6N+2=134N+6N+2=140N+2 $$
(17)

where C SDR is the number of computations for each packet at each iteration for the LTE turbo decoder with SDR scaling and stopping mechanism.

When regression-based scaling and early stopping are used, the additional number of computations for each iteration as per Eqs. (11) and (14) are as shown in Table 1. The number of computations in Table 1 is based on the assumption that the regression angle is computed over each iteration. The total number of computations at each iteration for LTE turbo codes with regression-based scaling and early stopping is given as:

$$ {C}_{\mathrm{Reg}}={C}_{\mathrm{LTE}}+20N+2+100=134N+20N+102=154N+102 $$
(18)

where C Reg is the number of computations for each packet at each iteration for the LTE turbo decoder with regression-based scaling and early stopping mechanism.
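The per-iteration counts of Eqs. (15) to (18) can be reproduced with a few lines of arithmetic. The function and variable names below are hypothetical; the constants are those given in the equations, with N = 6144.

```python
def computations_per_time_unit(k, v):
    """Eq. (15): Max-Log-MAP computations per time unit for a code with
    k inputs and v memory elements."""
    return (4 * 2**k * 2**v + 8) + (2 * 2**k * 2**v) + (4 * 2**v - 2)

N = 6144
per_unit = computations_per_time_unit(k=1, v=3)   # 134 for the LTE turbo code
c_lte = N * per_unit                              # Eq. (16)
c_sdr = c_lte + 6 * N + 2                         # Eq. (17)
c_reg = c_lte + 20 * N + 2 + 100                  # Eq. (18)
print(per_unit, c_lte, c_sdr, c_reg)
```

These are per-iteration figures, so the total cost per packet also depends on how many iterations each scheme needs before it stops.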

The table for the computational complexity analysis for an interleaver size N = 6144 is as given in Table 2.

Table 2 Computational complexity analysis table

4 Simulation results

A comparative analysis of the following schemes in terms of BER performance, average number of decoding iterations per packet, and average number of computations per packet [27] is made:

  • Scheme 1: conventional turbo coding as used in LTE.

  • Scheme 2: this scheme employs SDR-based scaling and stopping mechanisms [12].

  • Scheme 3: this is the proposed scheme, which uses regression scaling and the stopping conditions \( {\theta}_n^B \le 0 \) or \( {r}^2 \ge Th \).

Simulations were performed using a turbo code with the following parameters: generator: G = [1, 15/13] in octal, QPP interleaver with depth of 6144, 200 packets, and code rates of 1/3 and 1/2.

4.1 Results with QPSK modulation and a code rate of 1/3

The graph of BER as a function of \( E_b/N_0 \) has been plotted for QPSK modulation and a code rate of 1/3 as shown in Fig. 10. It is observed that compared to scheme 1, scheme 3 provides a gain of 0.35 dB on average for BERs below \( 10^{-3} \). Scheme 3 also provides a gain of 0.1 dB on average for BERs below \( 10^{-3} \) compared to scheme 2. The proposed scheme demonstrates that it is possible to achieve gains in terms of BER performance over both the conventional decoding scheme and the one employing the SDR-based extrinsic information scaling mechanism throughout the whole \( E_b/N_0 \) range.

Fig. 10 BER performance with QPSK modulation and code rate = 1/3

The EXIT charts for schemes 1, 2, and 3 with QPSK modulation and a code rate of 1/3 at an \( E_b/N_0 \) of 0.3 dB are shown in Fig. 11. Due to the missing notion of iterations, the study of the convergence of product and generalized concatenated block codes using the EXIT chart technique becomes difficult when extrinsic information scaling is being considered [32]. The mutual information for the a priori LLR (IA) is plotted against the mutual information for the extrinsic LLR (IE) for each decoder based on the works of [18] and [32]. It can be observed from the charts that when extrinsic information scaling is incorporated in the decoding process, the initial output mutual information is higher for input mutual information of zero. Also, the tunnel path for scheme 3 is wider, which allows for better convergence.

Fig. 11. EXIT chart for schemes with QPSK modulation and code rate = 1/3 and E b/N 0 of 0.3 dB
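The mutual information values IA and IE plotted in an EXIT chart are commonly estimated from simulated LLR samples. The Python sketch below uses the widely used time-average estimator I ≈ 1 − mean(log2(1 + exp(−x·L))), where x ∈ {+1, −1} is the transmitted bit and L its LLR. It is not necessarily the procedure followed in [18] or [32]: the estimator assumes consistent (symmetric) LLRs, an assumption that scaled extrinsic information may not satisfy, and the function name is hypothetical.

```python
import math


def mutual_information(bits, llrs):
    """Estimate I(X; L) from bipolar bits (+1/-1) and their LLRs.

    Uses I ~= 1 - mean(log2(1 + exp(-x * L))), with the convention
    L = ln(P(x = +1) / P(x = -1)).
    """
    if len(bits) != len(llrs) or len(bits) == 0:
        raise ValueError("bits and llrs must be non-empty and of equal length")
    total = 0.0
    for x, llr in zip(bits, llrs):
        z = -x * llr
        # Numerically stable log(1 + exp(z)), avoiding overflow for large z.
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return 1.0 - total / (len(bits) * math.log(2.0))


# Uninformative LLRs give zero mutual information, confident correct ones
# approach one bit.
print(mutual_information([1, -1, 1, -1], [0.0, 0.0, 0.0, 0.0]))
print(mutual_information([1, -1, 1, -1], [30.0, -30.0, 30.0, -30.0]))
```

Evaluating this estimate on the a priori and extrinsic LLRs of a constituent decoder yields one (IA, IE) point of its transfer curve.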

The average number of iterations as a function of E b/N 0 for QPSK modulation and a code rate of 1/3 is shown in Fig. 12. Compared to scheme 1, scheme 3 provides a minimum gain of two iterations at an E b/N 0 value of approximately 0.2 dB. Scheme 3 also provides a minimum gain of one iteration beyond an E b/N 0 value of 0.2 dB. The proposed scheme thus achieves gains in both BER performance and average number of iterations over both the conventional decoding technique and the one employing an SDR-based extrinsic information scaling mechanism throughout the E b/N 0 range. It can also be observed that the proposed scheme allows for early termination in the low E b/N 0 range when further iterations do not improve the BER performance.

Fig. 12. Iterations performance with QPSK modulation and code rate = 1/3

The total number of computations as a function of E b/N 0 for QPSK modulation and a code rate of 1/3 is shown in Fig. 13. Scheme 3 uses fewer computations than schemes 1 and 2 in the range E b/N 0 ≤ 0.2 dB. In the range E b/N 0 ≥ 0.3 dB, scheme 3 requires, on average, the same number of computations as scheme 2, which decreases as E b/N 0 increases, while still providing better BER performance than both schemes 1 and 2, as depicted in Fig. 10.

Fig. 13. Average number of computations per packet with QPSK modulation and code rate = 1/3

4.2 Results with QPSK modulation and a code rate of 1/2

The BER as a function of E b/N 0 for QPSK modulation and a code rate of 1/2 is shown in Fig. 14. Compared to scheme 1, scheme 3 provides an average gain of 0.2 dB at BERs below 10^−3. Scheme 3 also provides an average gain of 0.4 dB at BERs below 10^−5 compared to scheme 2. The proposed scheme thus achieves BER gains over the conventional decoding scheme throughout the whole E b/N 0 range, and over the one employing an SDR-based extrinsic information scaling mechanism for E b/N 0 > 4.0 dB.

Fig. 14. BER performance with QPSK modulation and code rate = 1/2

The EXIT charts for the schemes 1, 2, and 3 with QPSK modulation and a code rate of 1/2 at an E b/N 0 of 1.0 dB are shown in Fig. 15. It can be observed from the charts that when extrinsic information scaling is incorporated in the decoding process, the initial output mutual information is higher for input mutual information of zero. Also, the tunnel path for scheme 3 is wider, which allows for better convergence.

Fig. 15. EXIT chart for schemes with QPSK modulation and code rate = 1/2 and E b/N 0 of 1.0 dB

The average number of iterations as a function of E b/N 0 for QPSK modulation and a code rate of 1/2 is shown in Fig. 16. Compared to scheme 1, scheme 3 provides a minimum gain of 1.2 iterations on average at an E b/N 0 value of approximately 3.5 dB. Scheme 3 also provides a minimum gain of 0.4 iterations on average at the same E b/N 0 value of 3.5 dB. The proposed scheme thus achieves gains in both BER performance and average number of iterations over both the conventional decoding technique and the one employing an SDR-based extrinsic information scaling mechanism throughout the E b/N 0 range. It can also be observed that the proposed scheme allows for early termination in the low E b/N 0 range when further iterations do not result in BER performance improvement.

Fig. 16. Iteration performance with QPSK modulation and code rate = 1/2

The total number of computations as a function of E b/N 0 for QPSK modulation and a code rate of 1/2 is shown in Fig. 17. Scheme 3 uses fewer computations than schemes 1 and 2 in the range E b/N 0 ≤ 3 dB. The peak number of computations for scheme 3, which is higher than the values for both schemes 1 and 2, occurs at E b/N 0 = 3.5 dB. In the range E b/N 0 ≥ 4.0 dB, scheme 3 uses fewer computations than scheme 2 on average while still providing better BER performance than both schemes 1 and 2, as depicted in Fig. 14.

Fig. 17. Average number of computations per packet with QPSK modulation and code rate = 1/2

4.3 Results with 16-QAM and a code rate of 1/3

The BER as a function of E b/N 0 for 16-QAM and a code rate of 1/3 is shown in Fig. 18. Compared to scheme 1, scheme 3 provides an average gain of 0.3 dB at BERs below 10^−3. Scheme 3 also provides an average gain of 0.1 dB at BERs below 10^−3 compared to scheme 2. The proposed scheme thus achieves BER gains over both the conventional decoding technique and the one employing an SDR-based extrinsic information scaling mechanism throughout the whole E b/N 0 range.

Fig. 18. BER performance with 16-QAM and code rate = 1/3

The EXIT charts for the schemes 1, 2, and 3 with 16-QAM and code rate of 1/3 at an E b/N 0 of 2.0 dB are shown in Fig. 19. It can be observed from the charts that when extrinsic information scaling is incorporated in the decoding process, the initial output mutual information is higher for input mutual information of zero. Also, the tunnel path for scheme 3 is wider, which allows for better convergence.

Fig. 19. EXIT chart for schemes with 16-QAM and code rate = 1/3 and E b/N 0 of 2.0 dB

The average number of iterations as a function of E b/N 0 for 16-QAM and a code rate of 1/3 is shown in Fig. 20. Compared to scheme 1, scheme 3 provides a gain of 0.9 iterations at an E b/N 0 value of approximately 2.1 dB. Scheme 3 also provides a striking average gain of seven iterations for E b/N 0 values less than 1 dB. The proposed scheme thus achieves gains in both BER performance and average number of iterations over both the conventional decoding technique and the one employing an SDR-based extrinsic information scaling mechanism throughout the E b/N 0 range. It can also be observed that the proposed scheme allows for early termination in the low E b/N 0 range when further iterations do not result in BER performance improvement. Given the gain in error performance obtained with the proposed scheme, it can be inferred that the regression analysis-based scaling factor reduces the over-optimism of the extrinsic LLRs better than the SDR-based scaling factors.

Fig. 20. Iteration performance with 16-QAM and code rate = 1/3

The total number of computations as a function of E b/N 0 for 16-QAM and a code rate of 1/3 is shown in Fig. 21. Scheme 3 uses fewer computations than schemes 1 and 2 in the range E b/N 0 ≤ 2 dB. In the range 2.1 dB ≤ E b/N 0 ≤ 2.8 dB, scheme 3 uses more computations than scheme 2 on average, in exchange for better BER performance than both schemes 1 and 2, as depicted in Fig. 18. It can also be noted that for E b/N 0 ≥ 2.9 dB, the proposed scheme performs fewer computations than scheme 2.

Fig. 21. Total number of computations with 16-QAM and code rate = 1/3

4.4 Results with 16-QAM and a code rate of 1/2

The BER as a function of E b/N 0 for 16-QAM and a code rate of 1/2 is shown in Fig. 22. Compared to scheme 1, scheme 3 provides an average gain of 0.4 dB at BERs below 10^−3. Scheme 3 also provides an average gain of 0.3 dB at BERs below 10^−3 compared to scheme 2. The proposed scheme thus achieves BER gains over both the conventional decoding technique and the one employing an SDR-based extrinsic information scaling mechanism throughout the whole E b/N 0 range.

Fig. 22. BER performance with 16-QAM and code rate = 1/2

The EXIT charts for schemes 1, 2, and 3 with 16-QAM and a code rate of 1/2 at an E b/N 0 of 3.0 dB are shown in Fig. 23. It can be observed from the charts that when extrinsic information scaling is incorporated in the decoding process, the initial output mutual information is higher for input mutual information of zero. Also, the tunnel path for scheme 3 is wider, which allows for better convergence.

Fig. 23. EXIT chart for schemes with 16-QAM and code rate = 1/2 and E b/N 0 of 3.0 dB

The average number of iterations as a function of E b/N 0 for 16-QAM and a code rate of 1/2 is shown in Fig. 24. Compared to scheme 1, scheme 3 provides a gain of 0.6 iterations on average at an E b/N 0 value of approximately 6.5 dB. Scheme 3 also provides a striking average gain of six iterations for E b/N 0 values less than 5 dB. The proposed scheme thus achieves gains in both BER performance and average number of iterations over both the conventional decoding technique and the one employing an SDR-based extrinsic information scaling mechanism throughout the E b/N 0 range. It can also be observed that the proposed scheme allows for early termination in the low E b/N 0 range when further iterations do not result in BER performance improvement. Given the gain in error performance obtained with the proposed scheme, it can be inferred that the regression analysis-based scaling factor reduces the over-optimism of the extrinsic LLRs better than the SDR-based scaling factors.

Fig. 24. Iteration performance with 16-QAM and code rate = 1/2

The total number of computations as a function of E b/N 0 for 16-QAM and a code rate of 1/2 is shown in Fig. 25. Scheme 3 uses fewer computations than schemes 1 and 2 in the range E b/N 0 ≤ 6 dB. The peak number of computations for scheme 3, which is higher than the values for both schemes 1 and 2, occurs at E b/N 0 = 6.5 dB. In the range E b/N 0 ≥ 7.0 dB, scheme 3 uses fewer computations than scheme 2 on average while still providing better BER performance than both schemes 1 and 2, as depicted in Fig. 22.

Fig. 25. Average number of computations per packet with 16-QAM and code rate = 1/2

5 Conclusion and future works

This paper proposes a novel regression analysis-based extrinsic information scaling scheme together with an early stopping mechanism for the LTE turbo code. The scheme uses Pearson’s correlation between (Λ (n) d ) and (Λ (n) de ) of each decoder at each iteration to adaptively compute an extrinsic information scaling factor, which reduces the over-optimism of the extrinsic LLRs more effectively than the SDR-based scheme and thereby further enhances the BER performance. Based on the trends in the regression values of r 2(n) d , two stopping conditions have been proposed for both the low and high E b/N 0 regions. The results show that the proposed regression analysis-based scaling and early stopping mechanism achieves better BER performance than the SDR-based extrinsic information scaling and stopping mechanism. Also, complexity in terms of decoding iterations is significantly reduced throughout the whole E b/N 0 range with both QPSK and 16-QAM. Thus, the proposed scheme is promising, as it departs completely from conventional scaling techniques and opens up new avenues for using regression analysis in turbo decoding. Interesting future work would be to enhance the existing EXIT chart schemes to cater for convergence with different adaptive extrinsic information scaling schemes, to analyze the performance of the proposed system with higher-order QAM, and to develop analytical expressions for determining the threshold value used in the stopping mechanism.