1 Introduction

Predictability is a central topic in time series analysis (Lorenz 1996; Boffetta et al. 2002). Because predictability has been used to characterize the complexity of a time series' dynamics (Boffetta et al. 2002), the two notions are closely related. Currently, the structural complexity (Li and Fu 2014; Dakos and Soler-Toscano 2017) and prediction accuracy (Bauer et al. 2015; Dietze 2017; Babu and Reddy 2014) of time series attract great attention in fields such as climate, ecology, economics, and social services. However, there is no definite conclusion about how the two are associated. It is generally thought (Garland et al. 2014) that stochastic processes possess higher structural complexity, whereas deterministic processes such as chaotic outputs have lower complexity. Different structural patterns in a given series may influence its prediction accuracy. Previous studies indeed show that ordinal pattern information, such as stronger long-term memory in stochastic processes, can enhance prediction accuracy (Franzke and Woollings 2011; Yuan et al. 2018), and that increased nonlinearity in chaotic series can improve the prediction accuracy of deterministic processes (Ye and Hsieh 2008). Some studies also conjectured that well-defined structures hidden in real-world series can induce different prediction accuracies (Patil et al. 2001; Yuan et al. 2013; Fu et al. 2019; Molgedey and Ebeling 2000), but whether higher prediction accuracy is indeed induced by strengthening or weakening such well-defined structures has not been examined further. Real-world time series contain different kinds of well-defined structures, both stochastic and deterministic. Do different types of well-defined structures contribute differently to prediction accuracy? Does each specific well-defined structure occupy its own regime, or phase, in the predictability versus prediction accuracy plot? Conclusive answers to these questions will contribute greatly to the understanding and prediction of complex time series.

As a measure of the highest realizable degree of prediction, the intrinsic predictability of a time series (Lorenz 1996) also directly reflects its complexity (Boffetta et al. 2002). Both can be quantified by permutation entropy (PE) or weighted permutation entropy (WPE) (Garland et al. 2014; Bandt and Pompe 2002; Fadlallah et al. 2013). Previous studies conjectured that a monotonic relation exists between WPE and prediction accuracy for certain data, and this relation was recommended as a guide for predicting complex time series (Garland et al. 2014; Pennekamp et al. 2018). However, those studies considered only a limited set of well-defined structures, and they provided no definitive results about the regime or phase occupied by series with a specific well-defined structure in the predictability versus prediction accuracy plot.

In the present work, we first make clear what level of complexity corresponds to time series with known, distinct types of well-defined structures, which has not been clearly revealed in the literature. From these detailed studies, the regime or phase of each specific well-defined structure can be located in the predictability versus prediction accuracy plot. To this end, we consider theoretically modeled time series with well-defined structures that commonly appear across fields: short-term memory, long-term memory, multifractal patterns, and nonlinearity in chaotic series (Graves et al. 2017; Schmitt et al. 2000; Sugihara et al. 2012). Two types of prediction strategies, linear and nonlinear modeling, are then used to forecast these series and to determine what level of prediction accuracy each well-defined structure supports. Finally, we identify the regime or phase occupied by series with each specific well-defined structure in the predictability versus prediction accuracy plot, and we test this result on three climate series to validate its guidance for prediction modeling of real-world time series.

In the following, the methodology is introduced in Section 2, Section 3 reveals the influence of four kinds of well-defined structures on predictability and prediction accuracy, and Section 4 closes with conclusions and discussion.

2 Methodology

2.1 Synthetic time series with well-defined structures

2.1.1 Short-term memory

The first well-defined ordinal pattern considered is short-term correlation, or short-term memory, which is common in real-world time series. The autocorrelation function of a short-term correlated series decays exponentially with time delay (Höll and Kantz 2015), so correlation exists only between neighboring data points. We employ a first-order autoregressive process (AR(1)), \( {x}_i=a{x}_{i-1}+{\varepsilon}_i \), to simulate time series with this kind of structure. Here, a represents the strength of the short-term memory and can be raised from 0 to 1, and {εi} is Gaussian white noise with zero mean and unit standard deviation.
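As a minimal sketch (in Python/NumPy; the function name and burn-in length are our own choices, the latter added to discard the initial transient), such a series can be generated as follows:

```python
import numpy as np

def ar1_series(n, a, burn=500, seed=None):
    """Simulate AR(1): x_i = a * x_{i-1} + eps_i, with eps_i ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn)
    x = np.empty(n + burn)
    x[0] = eps[0]
    for i in range(1, n + burn):
        x[i] = a * x[i - 1] + eps[i]
    return x[burn:]  # drop the transient so the series is stationary

# e.g., S = ar1_series(10_000, a=0.9)
```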

2.1.2 Long-term memory

For a long-term correlated time series, the autocorrelation function decays as a power law with time delay (Höll and Kantz 2015). Here, such series are simulated with an autoregressive fractionally integrated moving average process (ARFIMA(p, d, q); Granger and Joyeux 1980; Massah and Kantz 2016), where p and q are the orders of the autoregressive and moving average parts, respectively. Since only long-term memory is needed here, the ARFIMA(0, d, 0) model is adopted, and the long-term memory intensity is controlled by the parameter d. The Hurst exponent (Graves et al. 2017) follows as H = d + 0.5, and for positive long-term correlation H lies between 0.5 and 1.
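One common way to simulate ARFIMA(0, d, 0) is through its truncated MA(∞) representation with coefficients \( {\psi}_k=\frac{\Gamma \left(k+d\right)}{\Gamma (d)\Gamma \left(k+1\right)} \); the sketch below assumes this generator (the paper does not specify its own), and the truncation and burn-in lengths are our choices:

```python
import numpy as np

def arfima_0d0(n, d, burn=1000, seed=None):
    """ARFIMA(0, d, 0) via x_t = sum_k psi_k * eps_{t-k} (truncated),
    with Hurst exponent H = d + 0.5 for 0 < d < 0.5."""
    rng = np.random.default_rng(seed)
    m = n + burn
    # psi_0 = 1 and the recursion psi_k = psi_{k-1} * (k - 1 + d) / k
    psi = np.ones(m)
    for k in range(1, m):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    eps = rng.standard_normal(m)
    x = np.convolve(eps, psi)[:m]  # x_t = sum_{k<=t} psi_k * eps_{t-k}
    return x[burn:]

# e.g., L = arfima_0d0(10_000, d=0.4)  # H = 0.9
```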

2.1.3 Multifractal patterns

In multifractal time series, autocorrelation intensities differ across magnitude levels, and the multifractal strength can be quantified by the width of the singularity spectrum (Kantelhardt et al. 2002). We simulated multifractal time series with the binomial multifractal model \( {x}_i={a}^{B_{i-1}}{\left(1-a\right)}^{\log_2N-{B}_{i-1}} \), where N is the total number of data points and, for the ith data point xi, Bi−1 is the number of digits equal to 1 in the binary representation of the index i−1 (i.e., the index i−1 is first transformed into binary digits). The parameter a modulates the multifractal strength of the modeled series (a can be varied from 0.5 to 1), since the strength \( \Delta\alpha \) is related to a by \( \Delta\alpha =\frac{\ln a-\ln \left(1-a\right)}{\ln 2} \). This model has been widely employed to simulate the characteristics of financial, turbulence, precipitation, and runoff data (Kantelhardt et al. 2002; Rybski et al. 2011; Nian and Fu 2019).
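The binomial multifractal model is deterministic and translates directly into code; a minimal sketch (the function name is ours):

```python
import numpy as np

def binomial_multifractal(k, a):
    """Binomial multifractal series of length N = 2**k:
    x_i = a**B * (1 - a)**(k - B), where B is the number of 1-bits
    in the binary representation of the index i - 1."""
    N = 2 ** k
    B = np.array([bin(i).count("1") for i in range(N)])
    return a ** B * (1.0 - a) ** (k - B)

# e.g., M = binomial_multifractal(k=13, a=0.75)  # N = 8192 points
```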

2.1.4 Nonlinearity in chaotic series

For a chaotic system, we take the Lorenz63 system (Lorenz 1963) as an example; the system reads

$$ \begin{aligned} dx/dt &= -ax + ay \\ dy/dt &= bx - y - xz \\ dz/dt &= xy - cz \end{aligned} $$
(1)

The nonlinearity of the chaotic series output by this system can be controlled, and the overall chaotic regime varies with this control (Ye and Hsieh 2008; Basu and Foufoula-Georgiou 2002; Elsner and Tsonis 1992; Ing and Wei 2003; Provenzale et al. 1992). Previous studies demonstrated that the output time series behave differently for different choices of the parameters (a, b, c) in Eq. (1) (Basu and Foufoula-Georgiou 2002; Elsner and Tsonis 1992; Ing and Wei 2003). Among these findings, the most important is that increased nonlinearity can enhance the predictability of the output time series (Ye and Hsieh 2008).

We first solve Eq. (1) numerically with the fourth-order Runge-Kutta method to obtain time series of the variables X, Y, and Z (initial values (2.85, −4.77, 30.85) for X, Y, and Z, respectively; time step 0.01). We then compute the ratio of the nonlinear to the linear terms in the second and third equations of Eq. (1) as \( {\beta}_y=\frac{\left\langle | xz|\right\rangle }{\left\langle | bx|+|y|\right\rangle } \) and \( {\beta}_z=\frac{\left\langle | xy|\right\rangle }{\left\langle | cz|\right\rangle } \). Both βy and βz represent the degree of nonlinearity of the Lorenz63 system (“⟨ ⟩” and “| |” denote the temporal average and absolute value, respectively). The parameters and the computed nonlinearity degrees are listed in Table 1. Five cases for both βy and βz ensure that the influence of nonlinearity on both predictability and prediction accuracy can be quantified.
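A sketch of the integration and the nonlinearity ratios follows; the classical parameter values used in the usage line are placeholders only (the cases actually analyzed are those in Table 1):

```python
import numpy as np

def lorenz_rhs(s, a, b, c):
    x, y, z = s
    return np.array([-a * x + a * y,      # dx/dt
                     b * x - y - x * z,   # dy/dt
                     x * y - c * z])      # dz/dt

def lorenz63(a, b, c, s0=(2.85, -4.77, 30.85), dt=0.01, n=10_000):
    """Fourth-order Runge-Kutta integration of Eq. (1)."""
    s = np.array(s0, dtype=float)
    traj = np.empty((n, 3))
    for i in range(n):
        k1 = lorenz_rhs(s, a, b, c)
        k2 = lorenz_rhs(s + 0.5 * dt * k1, a, b, c)
        k3 = lorenz_rhs(s + 0.5 * dt * k2, a, b, c)
        k4 = lorenz_rhs(s + dt * k3, a, b, c)
        s = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        traj[i] = s
    return traj

def nonlinearity_degrees(traj, b, c):
    """beta_y = <|xz|> / <|bx| + |y|> and beta_z = <|xy|> / <|cz|>."""
    x, y, z = traj.T
    beta_y = np.mean(np.abs(x * z)) / np.mean(np.abs(b * x) + np.abs(y))
    beta_z = np.mean(np.abs(x * y)) / np.mean(np.abs(c * z))
    return beta_y, beta_z

# e.g., traj = lorenz63(10.0, 28.0, 8.0 / 3.0); Z = traj[:, 2]
```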

Table 1 Detailed information of chaotic time series

Up to now, we have constructed the required time series for the present study; we denote the short-term memory series S(t), the long-term memory series L(t), the multifractal series M(t), and the chaotic series Z(t) (because only the variable Z of the Lorenz63 output is used). The analyzed length is 10,000 points for all cases, with the first 8000 points taken as the training series and the last 2000 points as the testing series.

2.2 Model-free predictability

2.2.1 Permutation entropy

Permutation entropy (Bandt and Pompe 2002) is widely employed to quantify the complexity of a time series {xt, t = 1, 2, …, T}. First, the phase space of the time series is reconstructed, with subvectors \( {X}_j^{m,\tau }=\left\{{x}_j,{x}_{j+\tau },...,{x}_{j+\left(m-1\right)\tau}\right\} \), where m and τ denote the embedding dimension and time delay, and the subscript j = 1, 2, ..., T − (m − 1)τ indexes the subvectors. Each subvector realizes one of the m! possible permutations of its elements, every permutation being denoted πi. The permutation entropy is then defined as

$$ p\left({\pi}_i^{m,\tau}\right)=\frac{\sum_{j\le N}{I}_{u:\mathrm{type}(u)={\pi}_i}\left({X}_j^{m,\tau}\right)}{\sum_{j\le N}{I}_{u:\mathrm{type}(u)=\prod}\left({X}_j^{m,\tau}\right)},\mathrm{PE}\left(m,\tau \right)=-\sum \limits_{i:{\pi}_i^{m,\tau}\in \varPi }p\left({\pi}_i^{m,\tau}\right)\ln p\left({\pi}_i^{m,\tau}\right) $$
(2)
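A compact sketch of Eq. (2) in Python; the normalization by ln(m!), which maps PE into [0, 1] so that 1 − PE can serve as intrinsic predictability, is our assumption:

```python
import numpy as np
from math import factorial

def permutation_entropy(x, m=5, tau=1):
    """Permutation entropy (Eq. 2), normalized to [0, 1] by ln(m!)."""
    x = np.asarray(x)
    n = len(x) - (m - 1) * tau
    # ordinal pattern (rank order) of each embedded subvector X_j^{m,tau}
    patterns = np.array([tuple(np.argsort(x[j:j + m * tau:tau]))
                         for j in range(n)])
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p)) / np.log(factorial(m))
```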

2.2.2 Weighted permutation entropy

The complexity of a time series sometimes depends not only on the permutations but also on the amplitudes. In this case, weighted permutation entropy (WPE) (Fadlallah et al. 2013) performs better in quantifying complexity, since it takes the amplitude information into account. The WPE algorithm follows that of PE, except that after reconstructing the phase space each permutation is weighted in advance by the variance of its subvector, computed via \( {\overline{X}}_j^{m,\tau }=\frac{1}{m}\sum \limits_{k=1}^m{x}_{j+\left(k-1\right)\tau } \) and \( {w}_j=\frac{1}{m}\sum \limits_{k=1}^m{\left[{x}_{j+\left(k-1\right)\tau }-{\overline{X}}_j^{m,\tau}\right]}^2 \); the weights wj then enter the calculation of WPE

$$ {p}_w\left({\pi}_i^{m,\tau}\right)=\frac{\sum_{j\le N}{I}_{u:\mathrm{type}(u)={\pi}_i}\left({X}_j^{m,\tau}\right){w}_j}{\sum_{j\le N}{I}_{u:\mathrm{type}(u)=\prod}\left({X}_j^{m,\tau}\right){w}_j},\mathrm{WPE}\left(m,\tau \right)=-\sum \limits_{i:{\pi}_i^{m,\tau}\in \varPi }{p}_w\left({\pi}_i^{m,\tau}\right)\ln {p}_w\left({\pi}_i^{m,\tau}\right) $$
(3)
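WPE follows the same pattern as PE, with each subvector's contribution weighted by its variance wj (Eq. 3); a minimal sketch under the same normalization assumption:

```python
import numpy as np
from math import factorial
from collections import defaultdict

def weighted_permutation_entropy(x, m=5, tau=1):
    """Weighted permutation entropy (Eq. 3), normalized by ln(m!)."""
    x = np.asarray(x)
    n = len(x) - (m - 1) * tau
    weight_sum = defaultdict(float)
    for j in range(n):
        sub = x[j:j + m * tau:tau]
        w = np.mean((sub - sub.mean()) ** 2)   # w_j: variance of the subvector
        weight_sum[tuple(np.argsort(sub))] += w
    p = np.array(list(weight_sum.values()))
    p /= p.sum()
    return -np.sum(p * np.log(p)) / np.log(factorial(m))
```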

To avoid finite-size effects in the PE/WPE analysis, the data length should exceed 10m! for the analyzed time series (Riedl et al. 2013). In the present work, the data lengths of the underlying time series are 15,000, 10,000, and 4000; using the same m for all of them, m = 5 works, since 10 × 5! = 1200 is below the shortest length. The delay τ is set to 1, which has been suggested as suitable for quantifying permutation complexity (Bandt 2005; Riedl et al. 2013; Pennekamp et al. 2018).

2.3 Prediction model

Since both linear and nonlinear ordinal patterns are hidden in the series generated by the four theoretical models, both linear and nonlinear methods should be chosen to evaluate prediction accuracy. At the same time, the main objective of this study is not to find the model that minimizes predictive error, but to show that increasing the strength of predictive structure in a series improves both predictability and prediction accuracy, which provides insight for choosing a suitable prediction model. Therefore, only one linear strategy and one nonlinear strategy are considered.

2.3.1 Linear prediction strategy

As a representative linear model (Ing and Wei 2003), a fourth-order autoregressive (AR(4)) model is employed to fit a hyperplane to the given points and then make predictions. Using \( {x}_{i+1}={k}_0+{k}_1{x}_i+{k}_2{x}_{i-1}+{k}_3{x}_{i-2}+{k}_4{x}_{i-3} \), we first fit the training series to obtain the model parameters (k0, k1, k2, k3, k4) by least squares and then make one-step-ahead predictions on the testing series.
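A least-squares sketch of the AR(4) fit and its one-step-ahead prediction (the helper names are ours):

```python
import numpy as np

def design_matrix(x):
    """Rows [1, x_i, x_{i-1}, x_{i-2}, x_{i-3}] aligned with targets x_{i+1}."""
    return np.column_stack([np.ones(len(x) - 4),
                            x[3:-1], x[2:-2], x[1:-3], x[:-4]])

def fit_ar4(train):
    """Least-squares estimate of (k0, k1, k2, k3, k4)."""
    k, *_ = np.linalg.lstsq(design_matrix(train), train[4:], rcond=None)
    return k

def predict_ar4(series, k):
    """One-step-ahead predictions of series[4:] from the preceding 4 values."""
    return design_matrix(series) @ k
```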

2.3.2 Nonlinear prediction strategy

For nonlinear and low-dimensional time series (Provenzale et al. 1992; Lorenz 1969), nonlinear prediction strategies can outperform linear ones. Here, we employ a classical nonlinear method, the Lorenz method of analogues (LMA) (Elsner and Tsonis 1992; Lorenz 1969; Fraser and Swinney 1986). The method first reconstructs the phase space of the training time series, yielding subvectors \( {X}_j^{m,\tau }=\left\{{x}_j,{x}_{j+\tau },...,{x}_{j+\left(m-1\right)\tau}\right\} \). The choices of the embedding dimension and time delay require special handling, which can be found in Fraser and Swinney (1986) and Kennel (1992). One-step-ahead prediction for the testing time series is then carried out in the reconstructed phase space: for the current subvector (a point in the phase space), the closest m + 1 points are chosen (with distances measured as Euclidean distances between the points/vectors), and the weighted mean of the vectors following them (weights determined by the distances) is taken as the forecast of the next step.
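A sketch of a single LMA forecast step; the exponential form of the distance weighting is our assumption, since the text specifies only that the weights are derived from the distances:

```python
import numpy as np

def lma_one_step(train, query, m=3, tau=1):
    """Forecast the value following `query` (an m-dimensional delay vector)
    from its m + 1 nearest analogues in the training phase space."""
    n = len(train) - (m - 1) * tau - 1        # vectors that have a successor
    lib = np.array([train[j:j + m * tau:tau] for j in range(n)])
    succ = train[(m - 1) * tau + 1:(m - 1) * tau + 1 + n]  # next value of each vector
    d = np.linalg.norm(lib - query, axis=1)   # Euclidean distances
    idx = np.argsort(d)[:m + 1]               # the m + 1 closest analogues
    w = np.exp(-d[idx] / (d[idx].min() + 1e-12))  # distance-based weights (assumed form)
    return np.sum(w * succ[idx]) / np.sum(w)
```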

It should be noted that, before fitting and prediction, the original time series are all normalized via xi = (xi − ⟨{xi}⟩)/std({xi}) (“⟨ ⟩” and “std” denote the temporal average and standard deviation, respectively), and the lengths of the training and testing time series in this work are 8000 and 2000, respectively.

2.4 Prediction accuracy

To quantify prediction accuracy and realizable predictability, we employ two metrics that characterize the prediction residuals. The first is the forecast error (FE), which quantifies the variance of the residual series relative to the variance of normalized white noise (Hyndman and Koehler 2006). FE is defined as

$$ \mathrm{FE}=\frac{\sum \limits_{i=1}^n{\left({p}_i-{x}_i\right)}^2}{\sum \limits_{i=1}^n{\varepsilon_i}^2}, $$
(4)

where pi denotes the predicted value, xi the true value in the testing time series, and {εi} Gaussian white noise. The smaller the FE, the better the prediction accuracy; when FE is less than 1, the forecast skill is acceptable.

The second metric is the mean absolute scaled error (MASE) between the true and predicted data (Hyndman and Koehler 2006), which evaluates how well the model matches the time series. MASE scales the residual by the one-step variability (OSV) of the training series, so the prediction accuracy can be compared with a random walk forecast based on the training data. MASE is defined as

$$ \mathrm{MASE}=\sum \limits_{j={N}_{\mathrm{tr}}+1}^{N_{\mathrm{tr}}+{N}_{\mathrm{te}}}\frac{\mid {p}_j-{x}_j\mid }{\frac{N_{\mathrm{te}}}{N_{\mathrm{tr}}}{\sum}_{i=2}^{N_{\mathrm{tr}}}\mid {x}_i-{x}_{i-1}\mid }, $$
(5)

where Nte and Ntr represent the lengths of the testing and training time series, respectively. MASE > 1 means that on average the prediction model performs worse than a random walk forecast based on the training data, while MASE < 1 means it performs better.

In addition to these two metrics, the averaged relative OSV can be used to evaluate the variability between neighboring data points, which shows intuitively how the time series changes when its ordinal structures are strengthened or weakened. The averaged relative OSV is defined as

$$ \mathrm{OSV}=\frac{\sum \limits_{i=2}^n\mid {x}_i-{x}_{i-1}\mid }{\sum \limits_{i=2}^n\mid {\varepsilon}_i-{\varepsilon}_{i-1}\mid }, $$
(6)

where {εi} is the Gaussian white noise with zero mean and unit standard deviation.
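For concreteness, a sketch of the three metrics as written in Eqs. (4)-(6); drawing the white-noise reference series {εi} once per evaluation is our reading of the definitions:

```python
import numpy as np

def forecast_error(pred, true, seed=None):
    """FE (Eq. 4): residual power relative to unit-variance white noise."""
    eps = np.random.default_rng(seed).standard_normal(len(true))
    return np.sum((pred - true) ** 2) / np.sum(eps ** 2)

def mase(pred, test, train):
    """MASE (Eq. 5): test error scaled by the training one-step variability."""
    scale = (len(test) / len(train)) * np.sum(np.abs(np.diff(train)))
    return np.sum(np.abs(pred - test)) / scale

def osv(x, seed=None):
    """Averaged relative one-step variability (Eq. 6)."""
    eps = np.random.default_rng(seed).standard_normal(len(x))
    return np.sum(np.abs(np.diff(x))) / np.sum(np.abs(np.diff(eps)))
```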

3 Results

3.1 Influence of enhanced ordinal structures on predictability

For complex time series, four kinds of well-defined ordinal patterns are common: short-term memory, long-term memory, multifractal patterns, and nonlinearity in chaotic series. The first two are linear ordinal structures; the latter two are nonlinear. Increasing the strength of either linear or nonlinear ordinal structures lowers the time series' complexity (PE/WPE) and thereby enhances its intrinsic predictability, which, as suggested by Garland et al. (2014), can be quantified by 1 − WPE or 1 − PE (see Fig. 1 for details). Beyond this uniform monotonic association between intrinsic predictability and the strength of ordinal structures, more can be revealed: linear and nonlinear ordinal structures play differential roles in adjusting the intrinsic predictability, and they occupy different regimes, or phases, in the intrinsic predictability versus ordinal structure plot. Time series with different types of ordinal structures admit different levels of WPE: linear stochastic structures, such as short-term and long-term memory, have higher WPE (see Fig. 1a and b), whereas deterministic nonlinear structures, such as multifractal patterns and chaotic attractors, have lower WPE (see Fig. 1c and d). These well-separated regimes can be taken as an indicator for preselecting a suitable model to describe and predict the underlying series (Garland et al. 2014). Lastly, we point out that PE may not work well for all kinds of time series: for multifractal series, it cannot differentiate different multifractal strengths (see Fig. 1c), likely because the multifractal structure is induced by amplitude differences rather than temporal correlations.

Fig. 1
figure 1

Scatter plots of the monotonic relation between PE/WPE (hollow blue/solid black dots) and the strengths of ordinal structures with (a) short-term memory, (b) long-term memory, (c) multifractal patterns, and (d) chaotic attractors, respectively. The error bars in (a) and (b) are intervals of 2.5 standard deviations

3.2 Influence of enhanced ordinal structures on prediction accuracy

As mentioned in the previous sections, the main objective of this study is not to find the model that minimizes predictive error, but to show that increasing the strength of predictive structure in a series improves prediction accuracy, thereby providing insight for choosing a suitable prediction model. In this subsection, therefore, only results from AR(4) and LMA are compared.

3.2.1 Linear structures

First, let us demonstrate how different strengths of ordinal structure influence the practical predictions of the AR and LMA methods for time series with short-term memory. Figure 2 compares the testing series with the series predicted by AR and by LMA for three typical cases, a = 0.2, a = 0.55, and a = 0.9. The most important common finding is that the match between the testing and predicted series improves as the strength of the ordinal structure increases: the more predictive patterns there are, the better the prediction model performs. Another finding is that AR and LMA perform almost equally well for the short-term correlated series; when the ordinal structure is weak (e.g., a = 0.2), neither method captures the extreme variations in the testing series (see Fig. 2a), but for stronger ordinal structure (e.g., a = 0.9), both methods capture the detailed variations (see Fig. 2c). This is consistent with previous studies, since the LMA predictor is applicable to both linear and nonlinear series (Garland et al. 2014).

Fig. 2
figure 2

Comparison between predicted time series (black lines from AR(4) and green lines from LMA) and testing time series with different short-term memory strengths (gray lines): (a) 0.2, (b) 0.55, and (c) 0.9

More quantitative results are provided by the two prediction accuracy metrics. For all strengths of ordinal structure, the AR and LMA methods reach the same results. The mean FE is never larger than 1 and decreases monotonically to 0.05 as the short-term memory is strengthened toward a = 1.0, which means there is forecast skill and the prediction accuracy keeps improving; the standard deviations of FE from both methods also decrease with increasing structural strength (see Fig. 3a). MASE shows the same behavior for both methods. Since MASE compares the match between predicted and testing series against a random walk forecast, the criterion is whether it is less than 1. For both AR and LMA, MASE is less than 1, and it increases with increasing structural strength (see Fig. 3b), while its standard deviation decreases (see Fig. 3b). Note that the behavior of MASE is contrary to that of the OSV of the time series itself: Fig. 3a shows that the averaged OSV weakens as the short-term memory is enhanced, which coincides with the results from WPE.

Fig. 3
figure 3

Evaluation of predicted time series with short-term memory: (a) FE and (b) MASE. The solid blue lines present AR predictions (blue shadows behind are intervals of 2.5 standard deviations) and hollow black dots represent LMA predictions (error bars are intervals of 2.5 standard deviations). The solid green up-triangles represent the variation of one-step variability in the time series

Since series with short-term memory and series with long-term memory are both linear stochastic series, the results are similar for the two kinds (see Fig. 2 and Fig. 4). When the long-term memory is strengthened, local patterns such as trends become more persistent (Franzke and Woollings 2011), and the averaged OSV in Fig. 5a decreases. Still, minor differences can be found between the two kinds of series (see Fig. 3 and Fig. 5). The first is that the ranges of OSV, FE, and MASE are narrower for series with long-term memory than for series with short-term memory. In addition, the standard deviations of both FE and MASE remain almost unchanged across different strengths of long-term memory, a feature entirely different from the short-term memory case.

Fig. 4
figure 4

Comparison between predicted time series (black lines from AR(4) and green lines from LMA) and testing time series with different long-term memory (gray lines): (a) 0.6, (b) 0.9, and (c) 0.99

Fig. 5
figure 5

Evaluation of predicted time series with different long-term memory: (a) FE and (b) MASE. The solid blue lines present AR predictions (blue shadows behind are intervals of 2.5 standard deviations) and hollow black dots represent LMA predictions (error bars are intervals of 2.5 standard deviations). The solid green up-triangles represent the variation of one-step variability in the time series

Lastly, we stress that the computational cost of LMA is much higher than that of AR; if the WPE information for a given series already tells us that LMA and AR perform equally, the LMA computation need not be repeated.

3.2.2 Nonlinear structures

The aforementioned results concern linear structures in the time series; will enhancing nonlinear structures produce the same responses in FE, MASE, and OSV, or will more specific features be revealed? Among nonlinear structures, multifractal patterns and chaotic attractors are the two typical ones in real-world time series. Multifractal and chaotic series share some features, such as fractal dimensions. However, they also behave differently: for example, chaotic series lack the marked magnitude differences with sharp transitions commonly found in multifractal series (see Fig. 6). These magnitude differences may result in distinct predictability and prediction accuracy.

Fig. 6
figure 6

Comparison between predicted time series (black lines from AR(4) and green lines from LMA) and testing multifractal time series (gray lines) with different multifractal strengths: (a) 0.58, (b) 1.78, and (c) 3.17

First, we can see from the multifractal series that peaks with sharp transitions become more dominant as the multifractal strength increases (see Fig. 6). At the same time, the temporal distribution of the peaks becomes more uniform, and the differences between large and small fluctuations are magnified. Nevertheless, the averaged OSV decreases as the multifractal structure is strengthened (Fig. 7a), which coincides with the WPE results (Fig. 1c). These features differ markedly from those of the linear series discussed above. The most marked difference is that AR and LMA now perform entirely differently (compare Fig. 6 with Fig. 2 and Fig. 4): AR cannot capture the detailed variations in the testing series, especially those related to the larger magnitudes, but LMA can. With increasing multifractal strength, the performance of LMA becomes nearly perfect, as quantified by the two metrics in Fig. 7. For most cases, FE from LMA is much smaller than that from AR and stays below 0.125 for all cases, and MASE from LMA increases slowly while staying below 0.75. In contrast, MASE from AR increases slowly while staying above 1.25 for all cases, which indicates that a linear prediction strategy is unsuitable for multifractal series.

Fig. 7
figure 7

Evaluation of predicted multifractal time series under different multifractal strengths: (a) FE and (b) MASE. The solid blue dots represent AR predictions and hollow black dots represent LMA predictions. The solid green up-triangles represent the variation of one-step variability in the time series

For the chaotic series with different nonlinearity strengths, the results differ somewhat from those for the multifractal series: LMA works well but AR fails for all cases (Fig. 8). The variations in the chaotic series are smooth, without sharp-transition peaks, which brings the averaged OSV below the minimum value found for the multifractal series in all cases but one (see Fig. 9a). Quantitatively, both FE and MASE from LMA are the lowest among the four kinds of well-defined ordinal structures: FE from LMA nearly collapses to 0 (Fig. 9a), and MASE from LMA is below 0.03, two orders of magnitude smaller than that from AR (Fig. 9b); for AR, MASE is larger than 3 in all cases. These results show that a nonlinear model such as LMA predicts chaotic time series very well, while a linear model such as AR performs far worse. The reason is that the trajectory becomes denser as the nonlinearity in the chaotic series is enhanced (Ye and Hsieh 2008; Sugihara et al. 2012; Elsner and Tsonis 1992; Ing and Wei 2003; Provenzale et al. 1992), so the variability between neighboring points decreases; both facts make the variations in the chaotic series more ordered, with the lowest WPE (see Fig. 1).

Fig. 8
figure 8

Comparison between predicted time series (black lines from AR(4) and green lines from LMA) and testing chaotic time series (gray lines) with different nonlinearity: (a) 0.26, (b) 0.27, and (c) 0.40

Fig. 9
figure 9

Evaluation of predicted chaotic time series under different nonlinearity: (a) FE and (b) MASE. The solid blue dots represent AR predictions and hollow black dots represent LMA predictions. The solid green up-triangles represent the variation of one-step variability in the chaotic time series

3.3 Association between prediction accuracy and predictability

Previous studies have conjectured a close association between the intrinsic predictability (WPE/PE) and the realizable predictability or prediction accuracy (FE/MASE) of a given series (Garland et al. 2014), and this conjecture has been validated for several series (Fu et al. 2019; Pennekamp et al. 2018). However, the deeper structure of this association has not been explored. The four kinds of theoretical series studied above give us the chance to do so. Plotting 1 − WPE against MASE for these series reveals a much clearer association between the intrinsic predictability and the realizable predictability or prediction accuracy (see Fig. 10): there are distinct regimes, or phases, for linear and nonlinear time series in the (1 − WPE)-MASE plot. The regime with the highest 1 − WPE and the lowest MASE corresponds to the chaotic series, the middle regime to the multifractal series, and the regime with the lowest 1 − WPE and the highest MASE to the linear series, with a distinct separation between the linear and nonlinear series. These regimes in the (1 − WPE)-MASE plot can be taken as a benchmark to guide the choice of prediction strategy for any given real-world series: since 1 − WPE is easy to compute, comparing its estimate with the results in Fig. 10 tells us whether a linear or a nonlinear strategy should be chosen to model or predict the series.

Fig. 10
figure 10

Scatter plot of MASE (LMA) versus 1 − WPE for series with different ordinal structures (solid black squares for short-term memory, hollow red dots for long-term memory, solid blue up-triangles for multifractal patterns, and hollow green down-triangles for chaotic attractors). There are different regimes for series with different ordinal patterns. The solid blue dot denotes state A(0.73, 0.21) from the daily AMOC index, the solid red dot denotes state E(0.43, 0.86) from the daily ENSO index, and the solid green dot denotes state T(0.18, 0.98) from the daily air temperature anomaly

3.4 Application in predicting real-world time series

To illustrate the power of the regimes revealed in the (1 − WPE)-MASE plot for guiding the choice of a modeling or prediction strategy for real-world complex time series, three climatic records are studied here: the daily air temperature anomaly at Valkenburg (TEM) from 1976 to 2017, daily indices of the El Niño-Southern Oscillation (ENSO) from 1980 to 2017, and daily indices of the Atlantic Meridional Overturning Circulation (AMOC) from 2004 to 2017, all downloaded from https://climexp.knmi.nl/start.cgi. First, we compute the intrinsic predictability (1 − WPE) of each series: 0.18 for TEM, 0.43 for ENSO, and 0.73 for AMOC; detailed results are summarized in Table 2.

Table 2 Details for real-world time series

Comparing these 1 − WPE results with the regimes shown in the (1 − WPE)-MASE plot (Fig. 10), the suggested modeling or prediction strategy differs across the three series. First, 1 − WPE = 0.18 indicates that the daily air temperature anomaly at Valkenburg can be modeled and predicted by a linear model such as AR, and a nonlinear method such as LMA will reach similar results. The prediction accuracy quantified by MASE is 0.98 for LMA and 0.96 for AR (see Table 2), both below 1, indicating that both methods work well; the state (0.18, 0.98) corresponds well to the short-term memory regime (see the green dot T(0.18, 0.98) in Fig. 10), and the good match between testing and predicted series can be seen in Fig. 11a and d. Second, for the ENSO index, 1 − WPE = 0.43, which lies between the multifractal regime and the short-term memory regime but much closer to the former (see the red dot E(0.43, 0.86) in Fig. 10). Nonlinear methods should therefore be adopted to model and predict the daily ENSO index variations; further computation confirms that LMA indeed performs better (MASE = 0.86, below 1) than AR (MASE = 2.19, meaning the AR model fails to capture the detailed ENSO index variations; see Fig. 11b and e). Lastly, for the AMOC index, 1 − WPE = 0.73, so the intrinsic predictability is very high. Its state in the (1 − WPE)-MASE plot belongs to the nonlinear regime between the chaotic and multifractal regimes (see the blue dot A(0.73, 0.21) in Fig. 10), which indicates that a linear model cannot describe this series well (MASE = 2.79, meaning the AR model fails to capture the detailed AMOC index variations; see Fig. 11c and f) and that a nonlinear method must be chosen. In fact, the daily AMOC index series exhibits regime shifts (see Fig. 11f) much like those found in the chaotic series (see Fig. 8c).

Fig. 11
figure 11

Comparison between predicted time series (black lines from AR(4) and green lines from LMA) and testing real-world time series: (a) TEM, (b) ENSO, and (c) AMOC. (d)–(f) Locally enlarged version for (a)–(c), respectively

4 Conclusion and discussion

This article reveals that predictability is enhanced by increasing the strength of ordinal structures that commonly exist in real-world time series, such as short-term memory, long-term memory, multifractal patterns, and chaotic attractors. Since the complexity and one-step variability of a time series are reduced as these well-defined structures strengthen, prediction models and methods can describe and forecast the temporal variations better. Detailed studies of the intrinsic predictability (quantified by 1 − WPE) and prediction accuracy (by FE or MASE) for the four kinds of theoretical series with known ordinal structures show that each kind occupies its own specific regime, or phase, in intrinsic predictability and prediction accuracy: deterministic nonlinear series exhibit higher intrinsic predictability and prediction accuracy (lower forecast error), whereas linear stochastic series have lower intrinsic predictability and prediction accuracy (higher forecast error).

The close correspondence between intrinsic predictability and prediction accuracy for each series with its own ordinal structures means that there is a specific regime in the (1 − WPE)-MASE plot for each kind of ordinal structure. From the estimated 1 − WPE alone, one can determine which regime a series under analysis belongs to, which guides the preselection and optimization of a suitable model or method for a real-world series with unknown ordinal structures. With this insight, we analyzed three climate series: the daily air temperature anomaly at Valkenburg (TEM) from 1976 to 2017, daily indices of the El Niño-Southern Oscillation (ENSO) from 1980 to 2017, and daily indices of the Atlantic Meridional Overturning Circulation (AMOC) from 2004 to 2017. From the estimated 1 − WPE for these series (0.18 for the temperature anomaly, 0.43 for the ENSO index, and 0.73 for the AMOC index), we can readily classify the daily air temperature anomaly at Valkenburg as a series with short-term memory and the daily ENSO and AMOC indices as nonlinear series. Further prediction studies confirm that the AR model is sufficient for the daily air temperature anomaly, consistent with previous findings that higher-frequency daily surface temperature fluctuations can be well modeled by an AR model after proper detrending (von Storch and Zwiers 1999; Bartos and Janosi 2005). However, the AR model fails to model and predict the daily ENSO and AMOC indices. In particular, the AMOC index exhibits substantial chaos-like variability on short time scales of a few days (Balan Sarojini et al. 2011; Cunningham et al. 2007), so a model or method suited to chaotic series is required, while the high-frequency ENSO index shows more complicated features, with multiple periods, strong memory, and no single scaling (Petroni and Ausloos 2008), which are certainly different from those of linear stochastic processes. While a nonlinear strategy like LMA is computationally expensive, estimating 1 − WPE for a given series is simple and cheap; from this estimate alone, we can decide in advance which modeling or prediction strategy to choose.

It should be pointed out that there are many other methods to infer a time series' intrinsic predictability, such as the mean prediction time (Salvino et al. 1995), fractal dimension (Rangarajan and Sant 1997), memory or persistence (Franzke and Woollings 2011), and Lyapunov exponents and their refinements (Patil et al. 2001; Ding et al. 2010; Ding et al. 2011). WPE was chosen here (Garland et al. 2014; Fu et al. 2019; Pennekamp et al. 2018) for its sensitivity to different structures and its robustness to different transformations. In addition, although one-step prediction is investigated here, the results are qualitatively similar for multistep prediction, though the differences between linear and nonlinear methods may become more marked for series with nonlinear ordinal patterns, since nonlinear behaviors dominate in multistep prediction (Sugihara 1990).