Abstract
In a recent work we proposed the corrected transfer entropy (CTE), which reduces the bias in the estimation of transfer entropy (TE), a measure of Granger causality for bivariate time series making use of the conditional mutual information. An extension of TE to account for the presence of other time series is the partial TE (PTE). Here, we propose the correction of PTE, termed Corrected PTE (CPTE), in a similar way to CTE: time shifted surrogates are used in order to quantify and correct the bias, and the estimation of the involved entropies of high-dimensional variables is made with the method of k-nearest neighbors. CPTE is evaluated on coupled stochastic systems with both linear and nonlinear interactions. Finally, we apply CPTE to economic data and investigate whether we can detect the direct causal effects among economic variables.
18.1 Introduction
The leading concept of Granger causality has been widely used to study the dynamic relationships between economic time series [4]. In practice, only a subset of the variables of the original multivariate system may be observed, and the omission of important variables can lead to spurious causalities between the observed variables. The problem of spurious causality therefore has to be addressed. Moreover, for a better understanding of the causal structure of a multivariate system, it is important to discriminate between direct and indirect causal effects.
Transfer entropy (TE) is an information theoretic measure that quantifies the statistical dependence of two variables (or subsystems) evolving in time. Although TE can effectively detect causal relationships and asymmetry in the interaction of two variables, it does not distinguish between direct and indirect relationships in the presence of other variables. Partial transfer entropy (PTE) is an extension of TE that conditions on the ensemble of the remaining variables and can therefore detect the direct causal effects [20]. As reported in [13], using the nearest neighbor estimate, PTE can effectively detect direct coupling even in moderately high dimensions. The corrected transfer entropy (CTE) was proposed as a correction to TE [12], aiming at reducing the estimation bias of TE. For its estimation, instead of making a formal surrogate data test, the surrogates were used within the estimation procedure of the measure, and CTE was estimated based on correlation sums.
We introduce here the corrected partial transfer entropy (CPTE), which combines PTE and CTE and reduces the bias in the estimation of PTE, so that the measure goes to the zero level when there is no causal effect. Similarly to CTE, the surrogates are used within the estimation procedure of CPTE, instead of performing a significance test for PTE. Further, for the estimation of CPTE, the nearest neighbor estimate is implemented, since it has been shown to be robust to the time series length and to its free parameter (number of neighbors) and efficient in high dimensional data (e.g., see [21]).
The paper is organized as follows. In Sect. 18.2, the information causality measures, transfer entropy and partial transfer entropy are introduced and the suggested measure, corrected partial transfer entropy (CPTE) is presented. In Sect. 18.3, CPTE is evaluated on a simulation study using coupled stochastic systems with linear and nonlinear causal effects. As an example of a real application, the direct causal effects among economic variables are investigated in Sect. 18.4. Finally, in Sect. 18.5, the results from the simulation study and the application are discussed, while the usefulness and the limitations of the nonparametric causality test are addressed.
18.2 Methodology
In this section, we introduce the information causality measures transfer entropy (TE) and partial transfer entropy (PTE), and define the corrected partial transfer entropy (CPTE), a measure able to detect direct causal effects in multivariate systems. Transfer entropy (TE) is a nonlinear measure that quantifies the amount of information explained in Y at h time steps ahead from the state of X, accounting for the concurrent state of Y [19]. Let \(x_{t}\), \(y_{t}\) be two time series and \(\mathbf{x}_{t} = (x_{t},x_{t-\tau },\ldots,x_{t-(m-1)\tau })^{{\prime}}\) and \(\mathbf{y}_{t} = (y_{t},y_{t-\tau },\ldots,y_{t-(m-1)\tau })^{{\prime}}\) the reconstructed vectors of the state space of each system, where τ is the delay time and m is the embedding dimension. TE from X to Y is defined as
$$\mathrm{TE}_{X\rightarrow Y} = I(y_{t+h};\mathbf{x}_{t}\mid \mathbf{y}_{t}) = H(\mathbf{x}_{t},\mathbf{y}_{t}) - H(y_{t+h},\mathbf{x}_{t},\mathbf{y}_{t}) + H(y_{t+h},\mathbf{y}_{t}) - H(\mathbf{y}_{t}),$$
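where \(H(X)\) is the Shannon entropy of the variable X. For a discrete variable X, the Shannon entropy is defined as \(H(X) = -\sum p(x_{i})\log p(x_{i})\), where \(p(x_{i})\) is the probability mass function of the outcome \(x_{i}\), typically estimated by the relative frequency of \(x_{i}\). The partial transfer entropy (PTE) is the extension of TE that accounts for the causal effect on the response Y of the other observed variables of a multivariate system besides the driving X; let us denote them Z. PTE is defined as
$$\mathrm{PTE}_{X\rightarrow Y\mid Z} = I(y_{t+h};\mathbf{x}_{t}\mid \mathbf{y}_{t},\mathbf{z}_{t}) = H(\mathbf{x}_{t},\mathbf{y}_{t},\mathbf{z}_{t}) - H(y_{t+h},\mathbf{x}_{t},\mathbf{y}_{t},\mathbf{z}_{t}) + H(y_{t+h},\mathbf{y}_{t},\mathbf{z}_{t}) - H(\mathbf{y}_{t},\mathbf{z}_{t}),$$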
where \(\mathbf{z}_{t}\) is the stacked vector of the reconstructed points for the variables in Z.
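To make the entropy form of TE and PTE concrete, the following is a minimal sketch that evaluates the conditional mutual information \(I(y_{t+h};\mathbf{x}_{t}\mid \mathbf{y}_{t},\mathbf{z}_{t})\) with plug-in (relative frequency) Shannon entropies for already discretized series. It is an illustration only and not the estimator used in this work; the function names `entropy`, `embed`, and `partial_transfer_entropy` are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def embed(series, t, m, tau):
    """Delay vector (s_t, s_{t-tau}, ..., s_{t-(m-1)tau}) as a tuple."""
    return tuple(series[t - j * tau] for j in range(m))


def partial_transfer_entropy(x, y, zs=(), m=1, tau=1, h=1):
    """Plug-in PTE from X to Y given the series in zs; with zs empty this is TE."""
    fut, drv, cond = [], [], []
    for t in range((m - 1) * tau, len(y) - h):
        fut.append(y[t + h])
        drv.append(embed(x, t, m, tau))
        # conditioning vector: past of Y stacked with the past of every Z
        cond.append(embed(y, t, m, tau)
                    + tuple(v for z in zs for v in embed(z, t, m, tau)))
    return (entropy(list(zip(drv, cond)))
            - entropy(list(zip(fut, drv, cond)))
            + entropy(list(zip(fut, cond)))
            - entropy(cond))


# Toy chain X -> Z -> Y on binary symbols (an assumed example, not from the paper)
random.seed(0)
x = [random.randint(0, 1) for _ in range(2000)]
z = [0] + x[:-1]        # z_t = x_{t-1}
y = [0, 0] + x[:-2]     # y_t = z_{t-1} = x_{t-2}
te_xy = partial_transfer_entropy(x, y, m=2)             # indirect effect is picked up
pte_xy_z = partial_transfer_entropy(x, y, zs=[z], m=2)  # vanishes once Z is conditioned on
```

In this toy chain, TE from X to Y is close to log 2 because the past of X contains the future of Y, whereas PTE conditioned on Z is zero because Z already carries that information. This is the distinction between indirect and direct effects that PTE is meant to capture.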
The information measure PTE is more general than partial correlation since it is not restricted to linear inter-dependence and relates present and past (vectors \(\mathbf{x}_{t},\mathbf{y}_{t},\mathbf{z}_{t}\)) with future (\(y_{t+h}\)). Following the definition of Shannon entropy for discrete variables, one would discretize the data of X, Y, and Z first, but such a binning estimate is inappropriate for high dimensional variables (m > 1). Instead, we consider here the nearest neighbor estimate. The joint and marginal densities are approximated at each point using the k-nearest neighbors and their distances from the point (for details see [6]). The k-nearest neighbor estimate has been found to be very robust to the time series length, insensitive to its free parameter k, and particularly useful for high dimensional data [11, 21].
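As an illustration of the nearest neighbor approach, the sketch below implements one common k-nearest neighbor estimator of conditional mutual information, in the style of the mutual information estimator of [6], using the maximum norm and brute-force neighbor search. It is an assumed variant for illustration and may differ in detail from the implementation used for the results reported here. The names `digamma_int`, `max_norm`, `knn_conditional_mi`, and `knn_pte` are hypothetical.

```python
import random


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -0.5772156649015329 + sum(1.0 / j for j in range(1, n))


def max_norm(u, v):
    """Maximum-norm distance between two equal-length tuples."""
    return max(abs(p - q) for p, q in zip(u, v))


def knn_conditional_mi(a, b, c, k=10):
    """k-nearest-neighbour estimate (in nats) of I(A; B | C).

    a, b, c are equal-length lists of tuples, one tuple per time point.
    """
    n = len(a)
    total = 0.0
    for i in range(n):
        da = [max_norm(a[i], a[j]) for j in range(n)]
        db = [max_norm(b[i], b[j]) for j in range(n)]
        dc = [max_norm(c[i], c[j]) for j in range(n)]
        # distance to the k-th neighbour in the joint space (point i excluded)
        joint = sorted(max(da[j], db[j], dc[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # neighbours strictly closer than eps in each projected subspace
        n_ac = sum(1 for j in range(n) if j != i and max(da[j], dc[j]) < eps)
        n_bc = sum(1 for j in range(n) if j != i and max(db[j], dc[j]) < eps)
        n_c = sum(1 for j in range(n) if j != i and dc[j] < eps)
        total += digamma_int(n_c + 1) - digamma_int(n_ac + 1) - digamma_int(n_bc + 1)
    return digamma_int(k) + total / n


def knn_pte(x, y, zs=(), m=1, tau=1, h=1, k=10):
    """PTE from X to Y given the series in zs, as I(y_{t+h}; x_t | y_t, z_t)."""
    def emb(s, t):
        return tuple(s[t - j * tau] for j in range(m))
    ts = range((m - 1) * tau, len(y) - h)
    fut = [(y[t + h],) for t in ts]
    drv = [emb(x, t) for t in ts]
    cond = [emb(y, t) + tuple(v for z in zs for v in emb(z, t)) for t in ts]
    return knn_conditional_mi(fut, drv, cond, k)
```

The brute-force search costs O(n²) distance evaluations per estimate, which is acceptable for a demonstration but would normally be replaced by a tree-based neighbor search for the time series lengths and surrogate counts considered here.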
Asymptotic properties for TE and PTE are mainly known for their binning estimate, which stem from the asymptotic properties of the estimates of entropy and mutual information for discrete variables (e.g., see [5, 10, 17]). Thus parametric significance testing for TE and PTE is possible assuming the binning estimate, but it was found to be less accurate than resampling testing making use of appropriate surrogates [7]. The nearest neighbor estimates of TE and PTE do not have parametric approximate distributions, and we employ resampling techniques in this study.
Theoretically, both PTE and TE should be zero when there is no driving-response effect (X → Y). However, any entropy estimate gives positive TE and PTE at a level depending on the system, the embedding parameters, and the estimation method. We introduce the Corrected Partial Transfer Entropy (CPTE), designed to give zero values in case of no causal effects and positive values otherwise. In order to define \(\mathrm{CPTE}_{X\rightarrow Y\mid Z}\), we compute M surrogate PTE values by randomizing the driving time series X using time shifted surrogates [15]. These M values form the null distribution of PTE for a significance test. We denote by \(q_{0}\) the PTE value on the original set of time series and by \(q(1-\alpha)\) the \((1-\alpha)\)-quantile of the M surrogate PTE values, where α corresponds to the significance level of a one-sided test. \(\mathrm{CPTE}_{X\rightarrow Y\mid Z}\) is defined as follows:
$$\mathrm{CPTE}_{X\rightarrow Y\mid Z} = \begin{cases} q_{0} - q(1-\alpha), & \text{if } q_{0} > q(1-\alpha),\\ 0, & \text{otherwise.}\end{cases}$$
In essence, we correct for the bias given by \(q(1-\alpha)\) and obtain either a positive value, if the null hypothesis of no direct causal effect is rejected, or a zero value, if PTE is found statistically insignificant.
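The correction step can be sketched as follows. The sketch assumes that time shifted surrogates are generated by a random circular shift of the driving series and that the quantile is obtained by linear interpolation between order statistics; the default M = 100, the minimum shift of n/20, and the names `time_shifted_surrogate`, `quantile`, and `corrected_measure` are illustrative assumptions, not values or names taken from this work. Any PTE estimator can be passed in as the callable `pte`.

```python
import random


def time_shifted_surrogate(x, rng, min_shift=None):
    """Circularly shift x by a random lag, breaking its alignment with the other series."""
    x = list(x)
    n = len(x)
    lo = min_shift if min_shift is not None else max(1, n // 20)  # assumed minimum shift
    s = rng.randint(lo, n - lo)
    return x[s:] + x[:s]


def quantile(values, p):
    """Empirical p-quantile by linear interpolation between order statistics."""
    v = sorted(values)
    pos = p * (len(v) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (pos - lo) * (v[hi] - v[lo])


def corrected_measure(pte, x, y, zs, M=100, alpha=0.05, seed=0):
    """Return max(q0 - q(1 - alpha), 0), where pte(x, y, zs) is any PTE estimator."""
    rng = random.Random(seed)
    q0 = pte(x, y, zs)
    # only the driving series is randomized; the response and conditioning series are kept
    surrogate_values = [pte(time_shifted_surrogate(x, rng), y, zs) for _ in range(M)]
    return max(q0 - quantile(surrogate_values, 1 - alpha), 0.0)
```

Because only X is shifted, the dynamics of each individual series and the dependence between Y and Z are preserved, while any coupling from X to Y is destroyed, which is what the null hypothesis requires.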
18.3 Evaluation of CPTE on Simulated Systems
CPTE is evaluated on Monte Carlo simulations on different multivariate stochastic coupled systems with linear and nonlinear causal effects. In this section, we present the simulation systems we used and display the results from the simulation study.
18.3.1 Simulation Setup
CPTE is computed on 100 realizations of the following coupled systems, for all pairs of variables conditioned on the rest of the variables and for all directions.
1. A VAR(1) model with three variables, where \(X_{1}\) drives \(X_{2}\) and \(X_{2}\) drives \(X_{3}\)
$$\displaystyle\begin{array}{rcl} x_{1,t}& =& \theta _{t} \\ x_{2,t}& =& x_{1,t-1} +\eta _{t} \\ x_{3,t}& =& 0.5x_{3,t-1} + x_{2,t-1} +\epsilon _{t}, \\ \end{array}$$
where \(\theta_{t}\), \(\eta_{t}\), \(\epsilon_{t}\) are Gaussian white noise with zero mean, diagonal covariance matrix, and standard deviations 1, 0.2, and 0.3, respectively.
2. A VAR(5) model with four variables, where \(X_{1}\) drives \(X_{3}\), \(X_{2}\) drives \(X_{1}\), \(X_{2}\) drives \(X_{3}\), and \(X_{4}\) drives \(X_{2}\) [22, Eq. 12]
$$\displaystyle\begin{array}{rcl} x_{1,t}& =& 0.8x_{1,t-1} + 0.65x_{2,t-4} +\epsilon _{1,t} \\ x_{2,t}& =& 0.6x_{2,t-1} + 0.6x_{4,t-5} +\epsilon _{2,t} \\ x_{3,t}& =& 0.5x_{3,t-3} - 0.6x_{1,t-1} + 0.4x_{2,t-4} +\epsilon _{3,t} \\ x_{4,t}& =& 1.2x_{4,t-1} - 0.7x_{4,t-2} +\epsilon _{4,t} \\ \end{array}$$
3. A VAR(4) model with five variables, where \(X_{1}\) drives \(X_{2}\), \(X_{1}\) drives \(X_{4}\), \(X_{2}\) drives \(X_{4}\), \(X_{4}\) drives \(X_{5}\), \(X_{5}\) drives \(X_{1}\), \(X_{5}\) drives \(X_{2}\), and \(X_{5}\) drives \(X_{3}\) [18]
$$\displaystyle\begin{array}{rcl} x_{1,t}& =& 0.4x_{1,t-1} - 0.5x_{1,t-2} + 0.4x_{5,t-1} +\epsilon _{1,t} \\ x_{2,t}& =& 0.4x_{2,t-1} - 0.3x_{1,t-4} + 0.4x_{5,t-2} +\epsilon _{2,t} \\ x_{3,t}& =& 0.5x_{3,t-1} - 0.7x_{3,t-2} - 0.3x_{5,t-3} +\epsilon _{3,t} \\ x_{4,t}& =& 0.8x_{4,t-3} + 0.4x_{1,t-2} + 0.3x_{2,t-3} +\epsilon _{4,t} \\ x_{5,t}& =& 0.7x_{5,t-1} - 0.5x_{5,t-2} - 0.4x_{4,t-1} +\epsilon _{5,t} \\ \end{array}$$
4. A coupled system of three variables with linear and nonlinear causal effects, where \(X_{1}\) drives \(X_{2}\), \(X_{2}\) drives \(X_{3}\), and \(X_{1}\) drives \(X_{3}\) [3, Model 7]
$$\displaystyle\begin{array}{rcl} x_{1,t}& =& 3.4x_{1,t-1}(1 - x_{1,t-1})^{2}\exp (-x_{1,t-1}^{2}) + 0.4\epsilon _{1,t} \\ x_{2,t}& =& 3.4x_{2,t-1}(1 - x_{2,t-1})^{2}\exp (-x_{2,t-1}^{2}) + 0.5x_{1,t-1}x_{2,t-1} + 0.4\epsilon _{2,t} \\ x_{3,t}& =& 3.4x_{3,t-1}(1 - x_{3,t-1})^{2}\exp (-x_{3,t-1}^{2}) + 0.3x_{2,t-1} + 0.5x_{1,t-1}^{2} + 0.4\epsilon _{3,t} \\ \end{array}$$
The first three simulation systems are stochastic systems with only linear causal effects, while the fourth one has both linear and nonlinear causal effects. For all simulation systems, the time step h for the estimation of CPTE is set either to one (as originally defined for TE in [19]) or to m. The embedding dimension m is adapted to the system complexity, the delay time τ is set to one, and we use α = 0.05. The number of neighbors k is set to 10; we note that the choice of k has been found not to be crucial in the implementation of TE or PTE (e.g., see [6, 11, 13]). We consider the time series lengths n = 512 and n = 2,048 in order to examine the performance of the measure for both short and long time series.
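As an example of how a realization of the simulation systems may be generated, the following sketch draws one realization of the first system (the VAR(1) model with three variables). The burn-in of 100 samples, the zero initial conditions, and the function name `simulate_system1` are assumptions made for illustration; they are not specified in the setup above.

```python
import random


def simulate_system1(n, seed=0, burn_in=100):
    """One realization of length n of the VAR(1) system with X1 -> X2 -> X3."""
    rng = random.Random(seed)
    x1, x2, x3 = [0.0], [0.0], [0.0]          # assumed zero initial conditions
    for t in range(1, n + burn_in):
        x1.append(rng.gauss(0, 1.0))                                # x1_t = theta_t
        x2.append(x1[t - 1] + rng.gauss(0, 0.2))                    # x2_t = x1_{t-1} + eta_t
        x3.append(0.5 * x3[t - 1] + x2[t - 1] + rng.gauss(0, 0.3))  # x3_t = 0.5 x3_{t-1} + x2_{t-1} + eps_t
    # discard the burn-in so that the influence of the initial conditions has died out
    return x1[burn_in:], x2[burn_in:], x3[burn_in:]


x1, x2, x3 = simulate_system1(512)
```

Repeating the call with 100 different seeds would give the 100 realizations on which the percentages of significant CPTE values are computed.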
18.3.2 Results from Simulation Study
In order to evaluate the performance of CPTE, we display the percentages of rejection of the null hypothesis of no causal effect from the 100 realizations of the coupled systems.
For the first simulation system, if we set h = 1 and m = 1, the percentages of statistically significant CPTE in the directions of the direct causal effects \(X_{1} \rightarrow X_{2}\) and \(X_{2} \rightarrow X_{3}\) are 100 %, while for the other directions, where there are no causal effects, the percentages vary from 2 % to 11 % (see Table 18.1). The choice h = 1 and m = 1 is well suited for this system and only direct causal effects are found significant. For different h or m values, indirect effects are detected by CPTE. For example, if we set h = 1 and m = 2, the indirect causal effect \(X_{1} \rightarrow X_{3}\) is detected by CPTE. In this case, however, this effect is indeed direct if two time lags are considered, since the expression of \(x_{3,t}\) after substituting \(x_{2,t-1}\) becomes \(x_{3,t} = 0.5x_{3,t-1} + x_{1,t-2} +\epsilon _{t} +\eta _{t-1}\). The same holds for h = 2 and m = 1, and here the direct causal effect \(X_{1} \rightarrow X_{2}\) cannot be detected, as the expression of \(x_{2,t}\) for two steps ahead is \(x_{2,t} =\theta _{t-1} +\eta _{t}\), which does not involve \(x_{1,t-2}\).
Concerning the second system, the largest lag in the equations is 5, and therefore by setting h = 1 and m = 5, CPTE correctly detects the direct causal effects \(X_{1} \rightarrow X_{3}\), \(X_{2} \rightarrow X_{1}\), and \(X_{4} \rightarrow X_{2}\). For the true direct effect \(X_{2} \rightarrow X_{3}\), which is weak in this system, the percentages of significant CPTE values increase with n, indicating that longer time series are required to detect this interaction (see Table 18.2). By increasing h, indirect effects become statistically significant. For example, for h = 5, CPTE again correctly detects all the direct interactions, even for short time series, but it also indicates the indirect driving of \(X_{4}\) on \(X_{1}\) (with percentage 50 % for n = 512 and 100 % for n = 2,048) and of \(X_{4}\) on \(X_{3}\) (35 % for n = 512 and 74 % for n = 2,048).
The third simulation system has five variables and the largest lag is 4, so we set m = 4. For h = 1, CPTE correctly detects all the direct causal effects with a confidence increasing with n, e.g., the percentage of detection changes from 34 % for n = 512 to 96 % for n = 2,048 for the weakest direct causal effect \(X_{2} \rightarrow X_{4}\). However, for larger n, CPTE also indicates the indirect driving \(X_{5} \rightarrow X_{4}\) with percentage 52 % (see Table 18.3). For h = 4, the performance of CPTE worsens and it fails to detect some direct causal effects. For example, the percentages of significant CPTE values in the direction \(X_{1} \rightarrow X_{4}\) are 11 % and 24 % for n = 512 and 2,048, respectively. For other couplings, the improvement of the detection from n = 512 to n = 2,048 is larger: 17 % to 53 % for \(X_{2} \rightarrow X_{4}\), 18 % to 47 % for \(X_{5} \rightarrow X_{2}\), and 45 % to 98 % for \(X_{4} \rightarrow X_{5}\).
The last simulation system involves linear interactions (\(X_{2} \rightarrow X_{3}\)) and nonlinear interactions (\(X_{1} \rightarrow X_{2}\) and \(X_{1} \rightarrow X_{3}\)), all at lag one. For h = 1 and m = 2, CPTE correctly detects these causal effects for both short and long time series, while the percentage of detection remains low in the absence of coupling, as shown in Table 18.4. Again, if h is larger than 1, false detections are observed. However, increasing n enhances the performance of CPTE, and for h = 2 and n = 4,096 the percentages of significant CPTE for \(X_{1} \rightarrow X_{2}\), \(X_{2} \rightarrow X_{3}\), and \(X_{1} \rightarrow X_{3}\) are 97 %, 100 %, and 77 %, respectively. Therefore, the effect of the selection of the free parameters h and m on CPTE gets larger for shorter time series.
18.4 Application on Economic Data
As a real application, we investigate the causal effects among economic time series. Specifically, the goal of this section is to investigate the impact of monetary policy on financial uncertainty and the long-term rate by taking the direct effects of this relationship into account. The data are daily measurements from 05/01/2007 up to 18/5/2012. They consist of the 3-month Treasury Bill returns as a monetary policy tool, denoted as \(X_{1}\), the 10-year Treasury Note returns to represent long-term behavior, denoted as \(X_{2}\), and the option-implied expected volatility on the S&P 500 returns index (VIX), denoted as \(X_{3}\), in order to take financial uncertainty into consideration.
In similar studies, changes in monetary policy are represented by the evolution of the Fed Funds rate, which is directly controlled by the Fed, rather than by the 3-month TBill. However, as pointed out in [1, 8], the 3-month TBill rate can adequately reflect the Fed Funds movements.
An in-depth investigation of the interrelations among the three variables starts by estimating CPTE for all pairs of variables conditioned on the third variable. In order to remove any linear interdependence from the returns series, CPTE is applied to the VAR-filtered variables. As shown in [2], information theoretic quantities such as transfer entropy perform better when VAR residuals are used. CPTE indicates the nonlinear driving of \(X_{1}\) on \(X_{2}\) (\(\mathrm{CPTE}_{X_{1}\rightarrow X_{2}} = 0.0024\)) for h = 1, m = 1, τ = 1, and k = 10. The “stability” of the results is expected to be lost as the embedding dimension m increases; indeed, CPTE for larger m values does not indicate any causal effect.
In order to further analyze the directions of those causal effects, PTE values from the VAR-filtered returns are also calculated. The statistical significance of PTE is assessed with a surrogate data test. The respective p-values of the two-sided surrogate test are obtained by means of time shifted surrogates. If the original PTE value lies in the tail of the empirical distribution of the surrogate PTE values, then the hypothesis of no causal effect is rejected. It is worth noting that the two-sided surrogate test for PTE indicates the same causal effect as CPTE, namely \(X_{1} \rightarrow X_{2}\) (p-value = 0.03). The PTE values for this direction are much larger than those for the remaining relationships.
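For illustration, a surrogate p-value of this kind can be computed from the rank of the original PTE value among the surrogate values. The sketch below uses one common convention, counting the original value as a member of the sample and doubling the smaller tail probability for the two-sided case. This convention and the name `surrogate_p_value` are assumptions; the exact rank formula used for the reported p-values is not specified here.

```python
def surrogate_p_value(q0, surrogate_values, two_sided=True):
    """Empirical p-value of q0 against the surrogate distribution."""
    m = len(surrogate_values)
    # tail probabilities, with q0 itself counted as one member of the sample
    p_upper = (1 + sum(1 for q in surrogate_values if q >= q0)) / (m + 1)
    p_lower = (1 + sum(1 for q in surrogate_values if q <= q0)) / (m + 1)
    if two_sided:
        return min(1.0, 2 * min(p_upper, p_lower))
    return p_upper  # one-sided test: large PTE values indicate coupling


# Hypothetical numbers: 100 surrogate PTE values and an original value beyond all of them
surrogates = [0.001 * i for i in range(100)]
p = surrogate_p_value(0.5, surrogates)
```

With M surrogates the smallest attainable two-sided p-value under this convention is 2/(M + 1), so the number of surrogates bounds the resolution of the test.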
18.5 Conclusions
Corrected Partial Transfer Entropy (CPTE) is a nonparametric causality measure able to detect only the direct causal effects among the components (variables) of a multivariate system. CPTE is defined exploiting the concept of surrogate data in order to reduce the bias in Partial Transfer Entropy (PTE), giving zero values in case of no causal effects and otherwise positive values.
CPTE correctly detected the direct causal effects for all tested stochastic simulation systems, but only for a suitable selection of the free parameters. CPTE is sensitive to the selection of the free parameters h and m, especially for short time series. The selection of the step ahead h = 1 turns out to be more appropriate than h = m in all cases. A suitable selection of the free parameters seems to be crucial in most cases in order to avoid spurious detections of causal effects. The more complicated a system is, the longer the time series that are needed.
In the real application, CPTE indicated the direct driving of the 3-month TBill returns on the 10-year TNote returns, without, however, excluding the presence of indirect dependencies among these interest rate variables and the VIX. Identifying the 3-month TBill as the “node” variable of our 3-dimensional system highlights the interest in examining its underlying dynamics jointly with the transmission mechanisms of monetary policy. Although the transfer entropy (TE) method has recently been applied to financial data, the partial transfer entropy is a relatively new technique in this field. TE is estimated on the returns of the economic variables (log-returns) and does not rely upon cointegration aspects (e.g., see [9, 14, 16]). Given the well-documented long-term comovement between the 3-month TBill and the 10-year TNote, the impact of non-stationarity on the performance of the above tests is an important issue meriting further investigation. This point reveals new insights about the informational content of Granger-causality type tests. The results from real data should be handled with care due to their high degree of sensitivity to the specific properties of the variables under study.
References
1. Garfinkel, M.R., Thornton, D.L.: The information content of the federal funds rate: Is it unique? J. Money Credit Bank. 27, 838–847 (1995)
2. Gomez-Herrero, G.: Brain connectivity analysis with EEG. PhD thesis, Tampere University of Technology (2010)
3. Gourévitch, B., Le Bouquin-Jeannès, R., Faucon, G.: Linear and nonlinear causality between signals: Methods, examples and neurophysiological applications. Biol. Cybern. 95, 349–369 (2006)
4. Granger, C.W.J.: Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37, 424–438 (1969)
5. Grassberger, P.: Finite sample corrections to entropy and dimension estimates. Phys. Lett. A 128(6–7), 369–373 (1988)
6. Kraskov, A., Stögbauer, H., Grassberger, P.: Estimating mutual information. Phys. Rev. E 69(6), 066138 (2004)
7. Kugiumtzis, D.: Partial transfer entropy on rank vectors. Eur. Phys. J. Spec. Top. 222, 401–420 (2013)
8. Kyrtsou, C., Vorlow, C.: Modelling non-linear comovements between time series. J. Macroecon. 31(1), 200–211 (2009)
9. Marschinski, M., Kantz, H.: Analysing the information flow between financial time series: An improved estimator for transfer entropy. Eur. Phys. J. B 30, 275–281 (2002)
10. Miller, G.A.: Note on the Bias of Information Estimates. The Free Press, Monticello (1955)
11. Papana, A., Kugiumtzis, D.: Evaluation of mutual information estimators for time series. Int. J. Bifurcat. Chaos 19(12), 4197–4215 (2009)
12. Papana, A., Kugiumtzis, D., Larsson, P.G.: Reducing the bias of causality measures. Phys. Rev. E 83(3), 036207 (2011)
13. Papana, A., Kugiumtzis, D., Larsson, P.G.: Detection of direct causal effects and application to epileptic electroencephalogram analysis. Int. J. Bifurcat. Chaos 22(9), 1250222 (2012)
14. Peter, F.J.: Where is the market? Three econometric approaches to measure contributions to price discovery. PhD thesis, Eberhard-Karls-Universität Tübingen (2011)
15. Quian Quiroga, R., Kraskov, A., Kreuz, T., Grassberger, P.: Performance of different synchronization measures in real data: A case study on electroencephalographic signals. Phys. Rev. E 65(4), 041903 (2002)
16. Reddy, Y.V., Sebastin, A.: Interaction between forex and stock markets in India: An entropy approach. In: VIKALPA, vol. 33, No. 4 (2008)
17. Roulston, M.S.: Estimating the errors on measured entropy and mutual information. Physica D 125, 285–294 (1999)
18. Schelter, B., Winterhalder, M., Hellwig, B., Guschlbauer, B., Lücking, C.H., Timmer, J.: Direct or indirect? Graphical models for neural oscillators. J. Physiol. 99, 37–46 (2006)
19. Schreiber, T.: Measuring information transfer. Phys. Rev. Lett. 85(2), 461–464 (2000)
20. Vakorin, V.A., Krakovska, O.A., McIntosh, A.R.: Confounding effects of indirect connections on causality estimation. J. Neurosci. Methods 184, 152–160 (2009)
21. Vlachos, I., Kugiumtzis, D.: Non-uniform state space reconstruction and coupling detection. Phys. Rev. E 82, 016207 (2010)
22. Winterhalder, M., Schelter, B., Hesse, W., Schwab, K., Leistritz, L., Klan, D., Bauer, R., Timmer, J., Witte, H.: Comparison of linear signal processing techniques to infer directed interactions in multivariate neural systems. Signal Process. 85, 2137–2160 (2005)
Acknowledgements
The research project is implemented within the framework of the Action “Supporting Postdoctoral Researchers” of the Operational Program “Education and Lifelong Learning” (Action’s Beneficiary: General Secretariat for Research and Technology), and is co-financed by the European Social Fund (ESF) and the Greek State.
Copyright information
© 2014 Springer Science+Business Media New York
Cite this paper
Papana, A., Kugiumtzis, D., Kyrtsou, C. (2014). A Nonparametric Causality Test: Detection of Direct Causal Effects in Multivariate Systems Using Corrected Partial Transfer Entropy. In: Akritas, M., Lahiri, S., Politis, D. (eds) Topics in Nonparametric Statistics. Springer Proceedings in Mathematics & Statistics, vol 74. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0569-0_18
DOI: https://doi.org/10.1007/978-1-4939-0569-0_18
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-0568-3
Online ISBN: 978-1-4939-0569-0
eBook Packages: Mathematics and Statistics (R0)