1 Introduction

Assessing measures of risk for financial returns plays an important role in investment decisions, portfolio analysis and regulation [1, 2]. The importance of accurately estimating the risk of asset returns has been discussed extensively, particularly after the recent financial crisis [2, 3]. Measures of risk often need to be analyzed for more than one financial asset, since most investment decisions are based on a selected portfolio of multiple assets, where the investor aims to diversify risk [4].

For a single series of financial returns, it is well documented that the associated risk changes substantially over time, which is denoted by time-varying conditional volatility in asset returns [5]. Methods which avoid strong distributional assumptions prove to be useful for estimating such time-varying volatility [6–10]. For multiple financial returns, an additional important feature in portfolio management is the correlation between the returns of different assets, where the portfolio risk depends on the correlation of the returns from each asset forming the portfolio [11]. For example, if the portfolio is based on two negatively correlated stocks, the portfolio is said to be ‘well-diversified’ with small risk [4]. Thus an accurate risk calculation, e.g. for a portfolio of financial assets, requires an accurate calculation of the assets’ correlation at the given decision time.

Earlier research has shown that correlations between financial returns, particularly across international markets, change over time, e.g. during major financial crises [12]. Different methodologies, with different distributional assumptions for correlation, have been proposed to calculate time-varying correlations for returns. Parametric models estimate correlations between returns jointly with other model parameters [13]. Alternative methodologies are based on moving window correlation estimates, where the time-varying correlation at a given time is approximated by a proxy, namely the sample correlation over a selected time window. Moving window estimates have the advantage of avoiding strong distributional assumptions and are shown to perform well, particularly in forecasting [14]. However, these estimates are also shown to be sensitive to the choice of window size, and there is a natural trade-off between capturing time variation in correlations and obtaining an accurate proxy for correlation at a given time. If the selected window size is too large, the proxy, i.e. the sample correlation, approaches the long-run mean of correlation and time variation in correlation cannot be captured. Conversely, if the selected window size is too small, time variation in correlation is captured, but each proxy has high uncertainty and the correlation estimate is not accurate [15].

In this paper, we propose the first probabilistic fuzzy system (PFS) for modeling time-varying correlations of financial returns and for improving moving window correlation estimates. The method captures time-varying correlation and conditional volatility without an underlying restricted statistical model for the correlations. PFS has previously been shown to perform well for conditional volatility and risk estimation [7, 9, 16]. The proposed model differs from the earlier PFS models: the antecedents and the consequents of the system are based on proxies, i.e. approximations of time-varying correlations, instead of observed data. The use of approximations of correlations instead of the actual values leads to measurement errors. We show that the PFS model takes into account the imprecision resulting from these measurement errors through the use of fuzzy sets. In addition, the trade-off between capturing time variation in correlation and obtaining accurate correlation estimates is mitigated using the proposed PFS model. These features are illustrated using simulated data and a real data application.

2 Probabilistic Fuzzy Systems

Probabilistic fuzzy systems combine two different types of uncertainty, namely fuzziness or linguistic vagueness, and probabilistic uncertainty. A probabilistic fuzzy system follows an idea similar to [17, 18], where the different concepts [19–22] of fuzzy sets and probabilities are complementary [20]. In this work we consider that probabilistic uncertainty relates to aleatoric variability, while fuzzy sets are used to represent gradualness, epistemic uncertainty or bipolarity [21, 23].

The PFS consists of a set of rules whose antecedents are fuzzy conditions and whose consequents are probability distributions. Assuming that the input space is a subset of \(\mathbb {R}^n\) and that the rule consequents are defined on a finite domain \(Y \subseteq \mathbb {R}\), a probabilistic fuzzy system consists of a system of rules \(R_q\), \(q = 1, \ldots , Q\), of the type

$$\begin{aligned} R_q : \hbox {If } {\mathbf x} \hbox { is } A_q \hbox { then } f(y) \hbox { is } f(y|A_q) \, , \end{aligned}$$
(1)

where \({\mathbf x} \in \mathbb R^n\) is an input vector, \(A_q: X \longrightarrow [0,1]\) is a fuzzy set defined on X and \(f(y|A_q)\) is the conditional pdf of the stochastic output variable \(\underline{y}\) given the fuzzy event \(A_q\). The interpretation is as follows: if the fuzzy antecedent \(A_q\) is fully valid (\({\mathbf x} \in \text{ core }(A_q)\)), then y is a sample value from the probability distribution with conditional pdf \(f(y|A_q)\).

A PFS has been described with two possible and equivalent reasoning mechanisms, namely the fuzzy histogram approach and the probabilistic fuzzy output approach [24]. In this work we focus on the fuzzy histogram approach since the pdf obtained from the approach can be used to assess the precision in correlation estimates. We replace in each rule of (1) the true pdf \(f(y|A_q)\) by its fuzzy approximation (fuzzy histogram) \(\hat{f}(y|A_q)\) yielding the rule set \(\hat{R}_q\), \(q = 1, \ldots , Q\) defined as

$$\begin{aligned} \hat{R}_q : \hbox {If } {\mathbf x} \hbox { is } A_q \hbox { then } f(y) \hbox { is } \hat{f}(y|A_q) \, . \end{aligned}$$
(2)

The fuzzy histogram \(\hat{f}(y|A_q)\) for each rule is obtained from a fuzzy partition of the compact output space Y with \(j=1,\ldots ,J\) fuzzy classes \(C_j|A_q\) with probability estimates \(\hat{\Pr }(C_j|A_q)\) and the corresponding membership function \(u_{C_j}(y)\) [25]

$$\begin{aligned} {\hat{f}}(y|A_q) = \sum _{j=1}^J \frac{\hat{\Pr }(C_j|A_q) {u_{C_j}(y)}}{\int _{-\infty }^{\infty } u_{C_j}(y)dy}, \end{aligned}$$
(3)

where the probability estimates \(\hat{\Pr }(C_j|A_q)\) satisfy the conditions \(\hat{\Pr }(C_j|A_q) \ge 0\) and \(\sum _{j=1}^J \hat{\Pr }(C_j|A_q)=1\), and they can be calculated using the maximum likelihood method [7]. In this paper we do not assume any particular algebraic structure for the conditional probability of fuzzy events. There are several examples of definitions of conditional probabilities of fuzzy events that satisfy the classical axioms of conditional probabilities, such as [26].

The interpretation of this type of reasoning is as follows. Given the occurrence of a (multidimensional) antecedent fuzzy event \(A_q\), which is a conjunction of the fuzzy conditions defined on input variables, an estimate of the conditional probability density function based on a fuzzy histogram \(\hat{f}(y|A_q)\) is calculated.

Given an input vector \(\mathbf {x}\), the output of a probabilistic fuzzy system is a conditional density function which can be computed as

$$\begin{aligned} \hat{f}(y|\mathbf {x}) =\sum _{j=1}^J \sum _{q=1}^Q \beta _{q}(\mathbf {x}) \hat{\Pr }(C_j|A_q) \frac{u_{C_j}(y)}{\int _{-\infty }^{\infty }u_{C_j}(y) dy}, \end{aligned}$$
(4)

where \( \beta _{q}(\mathbf {x}) = u_{A_q}(\mathbf {x}) / \sum _{q'=1}^Q u_{A_{q'}}(\mathbf {x})\) is the normalised degree of fulfillment of rule \(R_q\) and \(u_{A_q}\) is the degree of fulfillment of rule \(R_q\). When \(\mathbf {x}\) is n-dimensional, \(u_{A_q}\) is determined as a conjunction of the individual memberships in the antecedents computed by a suitable t-norm, i.e., \( u_{A_q}(\mathbf {x}) = u_{A_{q_1}}(x_1) \circ \cdots \circ u_{A_{q_n}}(x_n)\), where \(x_i, i=1,\ldots ,n\) is the i-th component of \(\mathbf {x}\) and \(\circ \) denotes a t-norm.

It can be shown [24] that the conditional density output \(\hat{f}(y|\mathbf {x})\) of a PFS is a proper probability density function, i.e. \(\int _{-\infty }^{\infty } \hat{f}(y|{\mathbf x}) dy = 1 \,\), and that the expected value \(\hat{\mathrm E}(\underline{y}|{\mathbf x})\) and the second moment \(\hat{\mathrm E}(\underline{y}^2|{\mathbf x})\) exist for the given partitioning of the output space, since the output membership values satisfy \( \sum _{j=1}^J u_{C_j}(y) = 1, \forall y \in Y, y < \infty .\) Under these conditions, a crisp output using the expected value can be calculated as

$$\begin{aligned} \hat{\mu }_{y|\mathbf {x}}=\hat{E}(y|\mathbf {x})=\int _{-\infty }^{\infty } y \hat{f}(y|\mathbf {x})dy = \sum _{q=1}^Q \sum _{j=1}^J \beta _q({\mathbf x})\hat{\Pr }(C_j|A_q) z_{1,j}, \end{aligned}$$
(5)

where \(z_{1,j} = \int _{-\infty }^{\infty } y u_{C_j}(y) dy/\int _{-\infty }^{\infty } u_{C_j}(y) dy\) is the centroid of the jth output fuzzy set.
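As a minimal sketch of how (4) and (5) can be evaluated numerically, the following Python code assumes Gaussian antecedent memberships and a triangular output partition that is shouldered at the edges. Both shapes, the toy parameter values and all function names are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def trapezoid(f, y):
    """Trapezoidal rule along the last axis."""
    return np.sum((f[..., 1:] + f[..., :-1]) * np.diff(y) / 2.0, axis=-1)

def normalised_fulfilment(x, centers, widths):
    """beta_q(x): Gaussian antecedent memberships (assumed shape), normalised over rules."""
    u = np.exp(-0.5 * ((x - centers) / widths) ** 2)
    return u / u.sum()

def output_memberships(y_grid, out_centers):
    """Triangular partition u_{C_j}(y), shouldered at the edges, with sum_j u_{C_j}(y) = 1."""
    eye = np.eye(len(out_centers))
    return np.stack([np.interp(y_grid, out_centers, eye[j])
                     for j in range(len(out_centers))])

def pfs_density(x, y_grid, centers, widths, out_centers, P):
    """Eq. (4) on a grid; P[q, j] = Pr(C_j | A_q), each row summing to 1."""
    beta = normalised_fulfilment(x, centers, widths)   # shape (Q,)
    U = output_memberships(y_grid, out_centers)         # shape (J, len(y_grid))
    U_norm = U / trapezoid(U, y_grid)[:, None]          # each u_{C_j} scaled to integrate to 1
    return (beta @ P) @ U_norm

def pfs_expected_value(x, y_grid, centers, widths, out_centers, P):
    """Eq. (5): sum_q sum_j beta_q(x) Pr(C_j|A_q) z_{1,j}, with numerical centroids."""
    beta = normalised_fulfilment(x, centers, widths)
    U = output_memberships(y_grid, out_centers)
    z = trapezoid(U * y_grid, y_grid) / trapezoid(U, y_grid)   # centroids z_{1,j}
    return float(beta @ P @ z)

# Toy system (hypothetical values): Q = 2 rules, J = 3 output classes on Y = [-1, 1]
y_grid = np.linspace(-1.0, 1.0, 401)
centers, widths = np.array([-0.5, 0.5]), np.array([0.4, 0.4])
out_centers = np.array([-1.0, 0.0, 1.0])
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])
f = pfs_density(0.2, y_grid, centers, widths, out_centers, P)
mu = pfs_expected_value(0.2, y_grid, centers, widths, out_centers, P)
```

In this sketch the density integrates to one on the grid because every normalised output membership integrates to one and the weights \(\beta _q(\mathbf {x}) \hat{\Pr }(C_j|A_q)\) sum to one over q and j.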

3 Correlation Estimation Using PFS

In this paper we consider a model for two returns \(y_t = (y_{1,t}, y_{2,t})'\):

$$\begin{aligned} y_t = H^{1/2}_t z_t \end{aligned}$$
(6)

where \(t=1,\ldots , T\) indicates the time period, \(z_t = (z_{1,t}, z_{2,t})'\) is such that \(z_{i,t}\) for \(i=1,2\) are random variables with mean 0 and variance 1, \(H_t\) is a \(2\times 2 \) positive definite matrix and \(H_t^{1/2}\) denotes the Cholesky decomposition of \(H_t\). In most models, e.g. in multivariate GARCH models, the distribution of \(z_{i,t}\) is defined as a standard normal distribution. We focus on two assets for illustration purposes, but the model and the applications can be generalized to any number of assets.

The covariance of the two returns in (6) is \(\mathrm {Var}(y_t) = H^{1/2}_t H^{\prime 1/2}_t = H_t\), i.e. the matrix \(H_t\) represents the time-varying variance-covariances of \(y_t\), which by construction are not observable. Different models have been proposed to model the time-varying conditional variance-covariance matrix \(H_t\) in (6) [13]. A common feature of these models is the dependency of the current covariances \(H_t\) on past covariances \(H_{t-1},\ldots , H_{t-p}\). Similar to univariate GARCH models, such a dependency on past values ensures smooth changes in the variance-covariance structure over time. In addition, any modeling approach for \(H_t\) should ensure that this matrix is positive definite at each time period. This necessary condition may lead to additional parameter restrictions in models [13].

The following decomposition of the variance-covariance matrix is often used to identify variances and correlation coefficients [13]:

$$\begin{aligned} H_t = D_t R_t D_t = \begin{pmatrix} h_{1,1,t}^{1/2} &{}0 \\ 0 &{} h_{2,2,t}^{1/2} \end{pmatrix} \begin{pmatrix} 1 &{}\rho _{t} \\ \rho _{t} &{} 1 \end{pmatrix} \begin{pmatrix} h_{1,1,t}^{1/2} &{}0 \\ 0 &{} h_{2,2,t}^{1/2} \end{pmatrix} \end{aligned}$$
(7)

where \(D_t\) is the diagonal matrix with the standard deviations \(h_{i,i,t}^{1/2}\) of each series on the diagonal and the matrix \(R_t\) contains the correlation \(\rho _{t}\) of the two series. Using this decomposition, \(H_t\) is a positive definite matrix as long as the diagonal elements of \(D_t\) are positive and \(\rho _{t}\in (-1,1)\) for all t. The advantage of the decomposition in (7) is that the diagonal elements of the matrix \(D_t\) can be estimated using a given conditional volatility model, for example using [7] or [10], for each series \(y_1 = (y_{1,1},\ldots , y_{1,T})\) and \(y_2 = (y_{2,1},\ldots , y_{2,T})\). This estimation can be performed independently of the estimation of the correlation coefficients in \(R_t\), since \(D_t\) contains only the variance of each individual series at time t, which by definition does not depend on the correlations in \(R_t\).
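The decomposition in (7) and the return model in (6) can be illustrated for a single time period with the following Python sketch. The numerical values and the function name are hypothetical, and the standard normal draw for \(z_t\) is only one possible choice.

```python
import numpy as np

def covariance_from_decomposition(h11, h22, rho):
    """Build H = D R D from two variances and one correlation, as in Eq. (7)."""
    D = np.diag([np.sqrt(h11), np.sqrt(h22)])   # standard deviations on the diagonal
    R = np.array([[1.0, rho], [rho, 1.0]])      # correlation matrix
    return D @ R @ D

# Hypothetical values for one time period
H = covariance_from_decomposition(h11=4.0, h22=1.0, rho=-0.5)

# Cholesky factor: lower triangular, with L @ L.T equal to H
L = np.linalg.cholesky(H)

# One draw of y_t = H_t^{1/2} z_t as in Eq. (6), with z_t standard normal (an assumption)
rng = np.random.default_rng(0)
y = L @ rng.standard_normal(2)

# The correlation can be recovered from H by dividing out the standard deviations
d = np.sqrt(np.diag(H))
rho_back = H[0, 1] / (d[0] * d[1])
```

The Cholesky factorisation succeeds only for a positive definite \(H_t\), which is the case here because both variances are positive and the correlation lies in \((-1,1)\).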

For the two series in (6), moving-window (MW) correlation estimates \(\hat{\rho }_{t}^{(m)}\) using window length m can be calculated using Pearson’s linear correlation coefficient:

$$\begin{aligned} \hat{\mu }_{i,t} = \frac{\sum _{t' = t-m+1}^{t} y_{i,t'}}{m},\, \text{ for } i=1,2 \end{aligned}$$
(8)

$$\begin{aligned} \hat{\sigma }^2_{i,t} = \frac{\sum _{t' = t-m+1}^{t} (y_{i,t'} - \hat{\mu }_{i,t})^2}{m-1},\, \text{ for } i=1,2 \end{aligned}$$
(9)

$$\begin{aligned} \hat{\rho }_{t}^{(m)} = \sum _{t' = t-m+1}^{t} \frac{ (y_{1,t'}-\hat{\mu }_{1,t})(y_{2,t'}-\hat{\mu }_{2,t})}{(m-1)\hat{\sigma }_{1,t}\hat{\sigma }_{2,t}}. \end{aligned}$$
(10)

The correlation estimate in (10) has an asymptotic normal distribution with variance \((1 - (\hat{\rho }_{t}^{(m)})^2) / (m - 2)\). However, for small m, this asymptotic property does not necessarily hold, hence asymptotic variances do not reflect the actual variance of the correlation estimate. In all following examples we report the estimation uncertainty in \(\hat{\rho }\) in (10) using bootstrap [27] results based on 1000 bootstrap samples of size m / 2. When the purpose is to forecast future correlations between assets, the common method is to use past information as follows [15]:

$$\begin{aligned} E\left( \hat{\rho }_{t+1}^{(m)}| y_1,\ldots , y_t \right) = \hat{\rho }_{t}^{(m)}, \end{aligned}$$
(11)

where \(\hat{\rho }_{t}^{(m)}\) is obtained from (10).
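A minimal Python sketch of the moving-window estimate in (8)–(10) and the one-step forecast in (11) is given below. The function name, the toy data and the choice to mark the first m - 1 periods as missing are our assumptions for illustration.

```python
import numpy as np

def mw_correlation(y1, y2, m):
    """Moving-window Pearson correlation following Eqs. (8)-(10).

    Entry t uses observations t-m+1, ..., t; the first m-1 entries are NaN
    because a full window is not yet available (an implementation choice).
    """
    y1, y2 = np.asarray(y1, dtype=float), np.asarray(y2, dtype=float)
    rho = np.full(len(y1), np.nan)
    for t in range(m - 1, len(y1)):
        a, b = y1[t - m + 1 : t + 1], y2[t - m + 1 : t + 1]
        da, db = a - a.mean(), b - b.mean()        # deviations from window means, Eq. (8)
        s1 = np.sqrt(np.sum(da ** 2) / (m - 1))    # window standard deviations, Eq. (9)
        s2 = np.sqrt(np.sum(db ** 2) / (m - 1))
        rho[t] = np.sum(da * db) / ((m - 1) * s1 * s2)   # Eq. (10)
    return rho

# Toy data (hypothetical): two positively related series
rng = np.random.default_rng(0)
y1 = rng.standard_normal(50)
y2 = 0.5 * y1 + rng.standard_normal(50)
m = 10

rho_hat = mw_correlation(y1, y2, m)

# Eq. (11): the forecast for period t+1 is the estimate available at period t
rho_forecast = np.concatenate(([np.nan], rho_hat[:-1]))
```

For any full window the result coincides with the Pearson coefficient returned by `np.corrcoef` on the same observations, which provides a simple check of the implementation.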

The PFS for correlation modeling has the following rules for \(q = 1, \ldots , Q\):

$$\begin{aligned} \hat{R}_q : \hbox {If } \hat{\rho }^{(m)}_{t-1} \hbox { is } A_q \hbox { then } f(\hat{\rho }^{(m)}_t) \hbox { is } \hat{f}(\hat{\rho }^{(m)}_t|A_q) \, , \end{aligned}$$
(12)

where both the antecedent (past correlation) and the consequent (current correlation) are estimated using (10) with a pre-selected window size m, and \(\hat{f}(\hat{\rho }^{(m)}_t|A_q)\) is a fuzzy histogram described as [25]

$$\begin{aligned} {\hat{f}}(\hat{\rho }^{(m)}_t|A_q) = \sum _{j=1}^J \frac{\hat{\Pr }(C_j|A_q) {u_{C_j}(\hat{\rho }^{(m)}_t)}}{\int _{-\infty }^{\infty } u_{C_j}(\hat{\rho }^{(m)}_t)d\hat{\rho }^{(m)}_t}\, , \end{aligned}$$
(13)

i.e. both the antecedent and the consequent variables are only approximations of the real variable of interest, correlations.

The parameters of the probabilistic fuzzy system are estimated using a procedure similar to [7], which we briefly summarize here. Following the distinction between input and output present in the rule structure of (2), the optimization problem is divided into two parts. First, we obtain the input membership parameters using a fuzzy clustering heuristic based on the fuzzy c-means algorithm. Next, we set the output membership functions to be Gaussian, shouldered at the edges. Finally, we optimize the probability parameters \(\hat{\Pr }(C_{j}|A_{q})\) using maximum likelihood estimation.
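To make the role of the probability parameters concrete, the sketch below computes a simple weighted relative-frequency estimate of \(\hat{\Pr }(C_j|A_q)\) in Python. This is a stand-in and not the maximum likelihood procedure of [7]. The fixed input centres (which would come from fuzzy c-means in the procedure above), the triangular output partition, the toy data and the function name are all assumptions made for illustration.

```python
import numpy as np

def estimate_probabilities(beta, U):
    """Weighted relative-frequency estimate of Pr(C_j | A_q).

    beta: (T, Q) normalised degrees of fulfilment beta_q(x_t)
    U:    (T, J) output memberships u_{C_j}(y_t), each row summing to 1
    Returns P with P[q, j] = sum_t beta[t, q] * U[t, j] / sum_t beta[t, q].
    """
    return (beta.T @ U) / beta.sum(axis=0)[:, None]

# Toy data (hypothetical): x plays the role of the lagged proxy, y the current proxy
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=200)
y = np.clip(0.8 * x + 0.1 * rng.standard_normal(200), -1.0, 1.0)

# Antecedents: Gaussian memberships around assumed centres, normalised over rules
in_centers = np.array([-0.5, 0.5])
u_in = np.exp(-0.5 * ((x[:, None] - in_centers) / 0.4) ** 2)
beta = u_in / u_in.sum(axis=1, keepdims=True)

# Consequents: triangular partition of [-1, 1] whose memberships sum to 1
out_centers = np.array([-1.0, 0.0, 1.0])
eye = np.eye(len(out_centers))
U = np.stack([np.interp(y, out_centers, eye[j])
              for j in range(len(out_centers))], axis=1)

P = estimate_probabilities(beta, U)
```

Each row of `P` is non-negative and sums to one because the output memberships form a partition of unity, so the constraints stated after (3) hold by construction in this sketch.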

4 Simulated Data with Time-Varying Correlation

In this section we illustrate the performance of the PFS model using simulated data and compare the results with MW estimates of correlation, which are often used as proxies for correlation [15]. In the described PFS, both the input and the output of PFS are approximations of actual (unobserved) correlation. We use simulation experiments to study the effect of these approximated inputs and outputs in PFS on the approximation capability of PFS, particularly in comparison to MW approximation. In addition, for the simulation studies, actual correlation is known. We can therefore compare obtained results from the two methods, MW and PFS, with actual correlation values. Such a comparison is not possible using real data, unless a loss function is defined [15].

As an example, we simulate \(T=500\) observations \(y_t = (y_{1,t}, y_{2,t})'\) for \(t=1,\ldots , T\) from a model with highly persistent time-varying correlations following an auto-regressive process, described by:

$$\begin{aligned} \begin{array}{rcl} y_t &{}\sim &{}N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 &{}\rho _t \\ \rho _t &{}1 \end{pmatrix} \right) \\ \rho _t &{} = &{} \max \left( -1 + \epsilon , \min \left( 1 - \epsilon , 0.1 + 0.8 \rho _{t-1} + \eta _t\right) \right) \end{array} \end{aligned}$$
(14)

where \(\epsilon = 10^{-5}\), \(\eta _t \sim NID(0, 0.005)\) and the restriction \(\rho _t\in (-1,1)\) of the covariance decomposition (7) is satisfied. As shown in Fig. 1a, the model presented in (14) has time-varying correlations between the time series \(y_{1,t}\) and \(y_{2,t}\), and furthermore, in periods of high correlation, the series \(y_{1,t}\) and \(y_{2,t}\) have common upward or downward movements.
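The data generating process in (14) can be sketched in Python as follows. Two details are not specified in the text and are assumptions here: the value 0.005 in \(NID(0, 0.005)\) is read as a variance, and the recursion is started at 0.5, the stationary mean \(0.1/(1-0.8)\) of the unclipped process. The function name is hypothetical.

```python
import numpy as np

def simulate_returns(T=500, eps=1e-5, eta_var=0.005, rho_start=0.5, seed=0):
    """Simulate y_t and rho_t from Eq. (14).

    Assumptions: eta_var is the variance of eta_t, and rho_start is the value
    used in place of rho_0.
    """
    rng = np.random.default_rng(seed)
    rho = np.empty(T)
    y = np.empty((T, 2))
    prev = rho_start
    for t in range(T):
        eta = rng.normal(0.0, np.sqrt(eta_var))
        # Autoregressive correlation, clipped to stay strictly inside (-1, 1)
        rho[t] = max(-1.0 + eps, min(1.0 - eps, 0.1 + 0.8 * prev + eta))
        # Bivariate normal draw with unit variances via the Cholesky factor
        z1, z2 = rng.standard_normal(2)
        y[t, 0] = z1
        y[t, 1] = rho[t] * z1 + np.sqrt(1.0 - rho[t] ** 2) * z2
        prev = rho[t]
    return y, rho

y_sim, rho_sim = simulate_returns()
```

Drawing \(y_t\) through the Cholesky factor of the correlation matrix mirrors the structure of (6) with unit variances.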

4.1 PFS Application with True Consequents and Proxies for Antecedents

We first consider a PFS model of the form (12) where the inputs are correlation estimates calculated from (10) and the output of the PFS is the true correlation (given by the simulated parameters). We again note that this application is not realistic since correlations are unobserved in reality, but it serves to document the effect of using approximations of correlation as input variables on the PFS results, isolating this effect from the approximation of the output variable.

Each PFS rule in (12) is adjusted so that the output of the PFS is the actual correlation at time t, \(\rho _t\):

$$\begin{aligned} \hat{R}_q : \hbox {If } \hat{\rho }^{(m)}_{t-1} \hbox { is } A_q \hbox { then } f(\rho _t) \hbox { is } \hat{f}(\rho _t|A_q) \, . \end{aligned}$$
(15)

We first consider a window length of \(m=10\) for obtaining the correlation approximations in (10), both for the moving window forecasts and for the input variable of the PFS. Results of the MW correlation estimates and those from PFS with 4 antecedents and 9 consequents are shown in Fig. 1b and c. MW estimates of correlation in Fig. 1b change substantially over time, capturing changing correlation levels. In addition, the obtained 99 % intervals around these estimates are often too wide, covering all values within \((-1,1)\), indicating that the uncertainty around the estimated values is high. Figure 1b also shows that the peaks of the MW estimates mostly occur after the peaks in correlation. That is, MW estimates are often late and inaccurate in capturing correlation changes. PFS estimates in Fig. 1c, however, follow the increases and decreases of the actual correlation smoothly, with tighter confidence intervals.

Fig. 1. MW and PFS estimates of time-varying correlation for simulated data. (Color figure online)

Fig. 2. MAE comparisons of MW and PFS for simulated data. (Color figure online)

We next compare the accuracy of MW and PFS estimates of time-varying correlation using the mean absolute error (MAE) between estimated and actual correlations. We emphasize that such a comparison is only possible if actual correlation is known, i.e. in a simulation setting. The MAE for the two methods is calculated as follows:

$$\begin{aligned} \text{ MAE }^{(m)} = \frac{1}{T}\sum _{t=1}^T \left| \hat{\rho }_{t}^{(m)} - \rho _t\right| \end{aligned}$$
(16)

where PFS estimates of \(\hat{\rho }_{t}^{(m)}\) are obtained from (5), MW estimates of \(\hat{\rho }_{t}^{(m)}\) are obtained from (11), and \(\rho _t\) is the simulated value of correlation at time t. The MAE from MW and PFS using different window sizes is shown in Fig. 2a. The MAE of PFS is smaller than that of MW estimation for all window sizes. Regardless of the window size, MAE from PFS is around 0.3 %, while MW estimates lead to very high MAE, especially with small window sizes. It is particularly interesting that the PFS estimates with a small window size are still accurate. This result follows from the probability parameters in the PFS model, which are estimated using the full sample information. Even though the antecedent is calculated inaccurately, e.g. with a window size that is too small, the PFS parameters incorporate full sample information and regulate the correlation estimates through the use of fuzzy sets. Finally, the variance of the mean absolute errors from MW estimates in Fig. 2a is 0.02 while that of PFS is approximately 1.7e-5, i.e. PFS estimates are clearly less sensitive to the window size selection compared to MW estimates.
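The accuracy measure in (16) and the sensitivity measure used above (the variance of the MAE across window sizes) can be sketched in Python as follows. Skipping periods without an estimate and using the population variance are our assumptions, since the text does not specify either choice. The function names and the toy values are hypothetical.

```python
import numpy as np

def mae(rho_hat, rho_true):
    """Mean absolute error as in Eq. (16), skipping periods without an estimate."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    rho_true = np.asarray(rho_true, dtype=float)
    ok = ~np.isnan(rho_hat)   # e.g. the first m-1 periods of a moving window
    return float(np.mean(np.abs(rho_hat[ok] - rho_true[ok])))

def mae_sensitivity(mae_by_window):
    """Variance of the MAE across window sizes (population variance, an assumption)."""
    return float(np.var(list(mae_by_window.values())))

# Toy values (hypothetical), only to show the calling convention
rho_true = np.array([0.0, 0.3, 0.7])
rho_hat = np.array([0.1, np.nan, 0.5])
error = mae(rho_hat, rho_true)
```

A dictionary that maps each window size m to the corresponding MAE can be passed to `mae_sensitivity` to obtain a single number summarising how strongly a method depends on the window size.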

4.2 PFS Application with Proxies for Antecedents and Consequents

In this section, we consider a more realistic PFS set-up compared to Sect. 4.1, where both the antecedents and consequents in PFS are obtained from MW estimation. Specifically, both the antecedents and the consequents of PFS are obtained from the MW estimation in (10) with a pre-selected window size m, as defined in (12)–(13).

The estimated time-varying correlations and the 99 % intervals for MW and PFS applications for a single simulation study and window size \(m=10\) are shown in Fig. 1d. The PFS application is based on 4 antecedents and 9 consequents. We compare the results to those obtained by MW estimation, reported in Fig. 1b. Correlation estimates and the 99 % intervals are smoother when the PFS model is used compared to the MW results, even though both the antecedent and consequent of PFS are based on approximations of correlation instead of actual correlation. Similarly, the uncertainty in the time-varying correlation \(\rho _t\), illustrated by the 99 % interval, is much smaller using PFS. Figure 2b presents the MAE obtained from MW estimation and PFS using different window sizes. MAE from PFS is between 0.2 and 0.3, regardless of the window size, while MAE from MW estimation varies substantially with the window size. In addition, the variance of the MAE from MW estimates in Fig. 2b is 0.04 while that of PFS is approximately 0.02. Hence PFS estimates are less sensitive to the choice of the window length used for correlation estimates, while MW performs particularly poorly when the window length is small. Furthermore, we note that the number of antecedents and consequents in PFS has a small effect on the obtained MAE values; in all cases PFS models provide good approximations to actual correlations.

5 Real Data Application

For the real data illustration, we use 1463 daily percentage returns for the Hong Kong Hang Seng Index (HSI) and NASDAQ index between 04 January 2006 and 15 December 2011, where returns \(r_t\) at time t are calculated as \( r_t = 100\times ( \ln (p_t) - \ln (p_{t-1})) \) where \(p_t\) is the closing price of the index at time t. We select this period of stock returns to ensure that the recent financial crisis is included in the data period, and thus we can analyze potential changes in the correlation between the two indexes during the crisis. Daily returns between 04/01/2006 and 18/11/2009 are used as the estimation sample, and the remaining 500 returns after 18/11/2009 are taken as the forecast sample. We note that the days for which the stock market is closed are slightly different for the two stock markets. We analyze the data during days where both stock markets were open. Percentage returns for both indexes are shown in Fig. 3a.
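The return calculation and the restriction to common trading days can be sketched in Python with pandas as follows. The prices below are hypothetical placeholders rather than actual index data, the function name is ours, and computing returns after aligning the two calendars is an assumption about the order of the two steps.

```python
import numpy as np
import pandas as pd

def aligned_percentage_returns(prices_a, prices_b):
    """Keep dates on which both markets were open, then compute
    r_t = 100 * (ln p_t - ln p_{t-1}) for each index."""
    prices = pd.concat([prices_a, prices_b], axis=1, join="inner")
    return (100.0 * np.log(prices).diff()).dropna()

# Hypothetical closing prices; the second date is missing for the first index
dates = pd.to_datetime(["2006-01-04", "2006-01-05", "2006-01-06", "2006-01-09"])
hsi = pd.Series([100.0, 110.0, 121.0], index=dates[[0, 2, 3]], name="hsi")
nasdaq = pd.Series([200.0, 202.0, 204.0, 206.0], index=dates, name="nasdaq")

returns = aligned_percentage_returns(hsi, nasdaq)
```

The inner join drops any date that is missing in either series, so each row of `returns` refers to a day on which both markets were open.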

Fig. 3. HSI returns, NASDAQ returns and MW correlation estimates. (Color figure online)

Fig. 4. PFS results for HSI and NASDAQ returns. (Color figure online)

We first employ the covariance decomposition in (7) on the returns, and obtain the MW estimates of correlation using (10) with a window size of \(m=10\). Since the purpose is to obtain correlation estimates of these stock returns, we do not estimate or report the variance matrix \(D_t\) in (7). Given the MW estimates of correlation, we apply the PFS model in (12) with 4 antecedents and 9 consequents, using MW estimates of the previous day as the input variable, and MW estimates of the current day as the output variable for PFS.

MW estimates of correlation are provided in Fig. 3b for the estimation and forecast samples. PFS estimates, on the other hand, are reported in Fig. 4a for the estimation sample and in Fig. 4b for the forecast sample. The reported PFS estimate corresponds to the expected value of the PFS output (5). The general findings confirm the simulation study in Sect. 4. MW estimates of correlation are very volatile, and such extreme variation in correlation is counter-intuitive for daily stock prices. In addition, the uncertainty around the MW estimates, represented by the 99 % intervals in Fig. 3b, is very high in all periods. We conclude that moving window estimation is unlikely to provide accurate estimates of correlation. PFS estimates of correlation in Fig. 4a, on the other hand, are more stable, with smaller 99 % intervals compared to MW estimation. We conclude that, even with this relatively small window length, the results obtained from PFS capture time-varying correlation with substantially less estimation uncertainty. Hence the trade-off between capturing time variation in correlation and obtaining accurate correlation estimates is mitigated using PFS.

6 Conclusions

In this paper we show that a PFS can be used to model unobserved time-varying correlation between financial returns. The proposed method avoids strong distributional assumptions on the correlation process and uses the conventional approximation of time-varying correlation, namely sample correlations from moving windows, as antecedents and consequents. The method is applied to simulated and real data where we show that the PFS application improves over the conventional moving window approximation of time-varying correlation in terms of decreasing the sensitivity to the selection of the window length. In future work, we plan to apply the PFS to intra-day correlation between different stock prices where accurate estimation depends heavily on moving window estimates and the sizes of the moving windows. In addition, we plan to analyze the theoretical foundations and study the interpretability of the rules of the proposed methodology.