Abstract
We present a short selective review of causal inference from observational data, with particular emphasis on the high-dimensional scenario where the number of measured variables may be much larger than the sample size. Although major identifiability problems make causal inference from observational data very ill-posed, we outline a methodology that provides useful bounds for causal effects. Furthermore, we discuss open problems in optimization, non-linear estimation and the assignment of statistical measures of uncertainty, and we illustrate the benefits and limitations of high-dimensional causal inference for biological applications.
Cite this article
Bühlmann, P. Causal statistical inference in high dimensions. Math Meth Oper Res 77, 357–370 (2013). https://doi.org/10.1007/s00186-012-0404-7
DOI: https://doi.org/10.1007/s00186-012-0404-7