1 Introduction

Several prominent scientists, scientific institutions, and philosophers have argued that science, by virtue of its methods, is committed to methodological naturalism (MN), the view that science cannot consider or evaluate hypotheses that refer to supernatural entities, even if such entities do in fact exist.Footnote 1 An earlier attempt to separate the domains of science and religion was made by the late evolutionary paleontologist Stephen Jay Gould, who claimed that science and religion occupy two independent, non-overlapping magisteria (NOMA): the magisterium of science covers the empirical realm (what the universe is made of and how it works), whereas the magisterium of religion deals with questions of ultimate meaning and value (Gould 1997). Since the domains of science and religion do not overlap, there can be no conflict between them.Footnote 2

MN can be viewed as an extension of Gould’s effort towards reconciling science and religion, but via a different route. It asserts that the methods of science are inherently naturalistic, and hence incapable of examining supernatural entities or phenomena. For instance, philosopher Robert Pennock writes:

[S]cience does not have a special rule just to keep out divine interventions, but rather a general rule that it does not handle any supernatural agents or powers. (Pennock 1999, 283–284)

Philosopher Michael Ruse reaffirms this position:

[S]cience [makes no] reference to extra or supernatural forces like God. Whether there are such forces or beings is another matter entirely, and simply not addressed by methodological naturalism … [T]he methodological naturalist insists that, inasmuch as one is doing science, one avoid all theological or other religious references. (Ruse 2005)Footnote 3

Thus, according to MN, science is necessarily mute on the question of whether or not supernatural phenomena exist, because, due to its inherently naturalistic methods, it simply does not have the tools to investigate the supernatural.

These views are not merely academic or idle, as they have the potential to directly impact policies concerning science education. For instance, in their publication Teaching About Evolution and the Nature of Science, the National Academy of Sciences (NAS) writes:

Because science is limited to explaining the natural world by means of natural processes, it cannot use supernatural causation in its explanations … Explanations employing nonnaturalistic or supernatural events, whether or not explicit reference is made to a supernatural being, are outside the realm of science and not part of a valid science curriculum. (NAS 1998)

Similarly, according to Eugenie Scott, director of the National Center for Science Education (NCSE), “Science is a way of knowing that attempts to explain the natural world using natural causes. It is agnostic toward the supernatural—it neither confirms nor rejects it” (Scott 1999; see also Scott 2008). The principle of MN was also adopted by Judge Jones in the Kitzmiller v. Dover trial concerning the teaching of intelligent design (ID) in science classrooms (Jones 2005).

We have previously argued that both NOMA and MN are mistaken, and that science can test, both in principle and in practice, at least some supernatural claims (Boudry et al. 2010; Fishman 2009).Footnote 4 To be sure, scientific evidence may ultimately support a naturalistic worldview and hence justify the adoption of a pragmatic form of MN, which tends to discount supernatural explanations as a general methodological guideline, on the grounds that they are extremely unlikely given the consistent failure of supernatural hypotheses in the past. We have termed this view Provisional MN, and distinguish it from Intrinsic MN, which considers supernatural explanations to be off-limits to science in principle (Boudry et al. 2010; see also Edis 1998, 2002; Fales 2009). Thus, in contrast with Intrinsic MN, Provisional MN reflects a contingent outcome of scientific investigation and does not presuppose an a priori commitment to natural causes and explanations. Some authors have argued that Intrinsic MN is a ploy to insulate religion from critical examination and to cater to certain political aims, such as neutralizing religious opposition to evolution.Footnote 5 Indeed, to exclude the supernatural a priori would seem to validate the persistent claim of ID adherents and other creationists that science is dogmatically committed to naturalism and refuses to even entertain supernatural explanations (Johnson 1999; Nagel 2008; for further references, see Boudry et al. 2012). While we do not take issue with Provisional MN, we have maintained that Intrinsic MN imposes artificial constraints on science which are antithetical to its fundamental goal: to pursue the truth about the nature of reality on the basis of the evidence, wherever it may lead (Boudry et al. 2010; Fishman 2009).

2 Ontological Naturalism as a Presupposition of Science

Recently, biologist and philosopher Martin Mahner (2012) has advanced the discussion by challenging both MN (in either of its forms) and the notion that supernatural claims are amenable to scientific evaluation. According to Mahner, the reason why science cannot consider supernatural claims is not that it presupposes MN, but rather that it presupposes Metaphysical or Ontological Naturalism (ON), the stronger view that supernatural entities, such as gods, ghosts, and spirits, do not exist (exactly the position that proponents of Intrinsic MN were trying to avoid). Mahner claims that ON is an essential part of the metaphysical framework of science, in the sense that if it were rejected, science would be impossible. Indeed, in order to reliably employ the empirical methods of science in the first place, scientists must presuppose ON as an a priori principle and thus be “committed to the ‘presumption of nonexistence’…with regard to God’s existence” (p. 1456). Accordingly, “if ON is a metaphysical presupposition of science, science should be unable to deal with anything supernatural” (p. 1448).

It is important to be clear about Mahner’s position here. He is not claiming that ON is indefeasible, or that it should be dogmatically held irrespective of the evidence for or against it (indeed, he refers to his position as ‘provisional ON’, a qualification we discuss later on). Rather, he argues that the successful application of scientific methodology and the evaluation of evidence require that we presuppose that ON is true; indeed, its truth explains the success of science. Conversely, given that ON is an intrinsic feature of science, if ON were false, science would fail as an enterprise. Consequently, science cannot test the supernatural, because “the empirical operations employed to produce such evidence [for or against the supernatural] presuppose the nonexistence of the very entities whose existence is supposed to be confirmed by this evidence” (p. 1449). To summarize, Mahner is advancing two main claims: (1) science is not metaphysically neutral, but relies for its success on the presupposition that ON is true; and (2) scientists cannot test the supernatural, as doing so leads to a paradox, given their commitment to ON.

In addition to ON, which he refers to as the ‘no-supernature’ principle (item f below), Mahner mentions several other general metaphysical principles that he claims must also be presupposed before any reliable empirical science can be done. These include the following:

  (a) ontological realism

  (b) the (ontological) lawfulness principle

  (c) the ex-nihilo-nihil-fit principle

  (d) the antecedence principle and an ontological conception of causation

  (e) the no-psi principle

  (f) the no-supernature principle

Here we will argue that science presupposes neither naturalism (in any of its forms), nor any of the other principles cited by Mahner as constituting “necessary metaphysical presuppositions” of the scientific enterprise. Concurring with other authors, we maintain that ON is not an a priori presupposition of science, but rather a defeasible conclusion of science, based upon the available evidence to date.Footnote 6 We will also argue that there is nothing inherent to so-called ‘supernatural’ claims which would prevent them from being amenable to scientific investigation, at least in principle. Indeed, science can evaluate (and in fact has already evaluated) supernatural claims according to the same explanatory criteria used to assess any other ‘non-supernatural’ claim. These criteria include explanatory virtues such as explanatory power (goodness of fit to the evidence), simplicity/parsimony (data compression, unification), and non-ad-hoc-ness (avoidance of unsupported auxiliary hypotheses introduced merely to save a hypothesis from disconfirmation). To the extent that a supernatural explanation satisfies these explanatory criteria better than rival naturalistic explanations, it should be provisionally favored. To the extent that it does not, it should be provisionally rejected. Finally, we will show that it is in fact Mahner’s position which runs into a paradox: insofar as his metaphysical principles are presupposed by science, they are not provisional and defeasible by the evidence, and insofar as they are provisional and defeasible, they are not presupposed by science.
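One common way of making explanatory criteria of this kind precise is in Bayesian terms (this is our illustrative gloss, not a formalism the present argument depends on): explanatory power corresponds to the likelihood of the evidence under a hypothesis, while parsimony and non-ad-hoc-ness are reflected in the prior plausibility of the hypothesis.

```latex
% Illustrative Bayesian gloss on the explanatory criteria (an assumption
% for exposition, not a formalism adopted in the text):
%
% posterior plausibility of hypothesis H given evidence E
P(H \mid E) \;=\; \frac{P(E \mid H)\, P(H)}{P(E)}
%
% - explanatory power (goodness of fit): the likelihood term P(E \mid H)
% - simplicity/parsimony and non-ad-hoc-ness: the prior P(H); conjoining H
%   with an unsupported auxiliary hypothesis A can only lower the prior,
%   since  P(H \wedge A) \le P(H)
```

On this reading, a hypothesis rescued from disconfirmation by piling on unsupported auxiliaries pays a price in prior probability, which is one way of cashing out why ad hoc explanations should be provisionally disfavored.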

Before going further, it is important to clarify the meaning of the terms ‘natural’ and ‘supernatural’, which have been notoriously difficult to pin down. By the end of this paper, we will express doubts about whether any epistemic significance should be attached to them. However, in order to evaluate the claim that supernatural hypotheses are scientifically untestable, or that science presupposes ON, we have no choice but to engage with these terms. Thus, for the sake of argument, we will adopt a working ‘umbrella’ definition of ‘supernatural’ as referring to entities or phenomena that possess one or more of the following characteristics: (1) They operate in ways that fundamentally violate our current understanding of how the world works, (2) they exist outside the spatiotemporal realm of our universe (though they may still causally interact with our universe), and (3) they suggest that reality is at bottom purposeful and mind-like, particularly in a sense that implies a central role for humanity and human affairs in the cosmic scheme. We neither expect that this definition will encompass all uses of the term, nor do we expect complete agreement on the characteristics we have included under it. However, it is both reasonably close to Mahner’s own characterization (see below) and captures the colloquial sense of the term, i.e., what most religious people and believers in the paranormal have in mind when they refer to extraordinary entities, such as gods, ghosts, spirits, and psi phenomena, as ‘supernatural’. This is important, because (as discussed further below) if we depart too much from the folk notion of the supernatural, questions about the testability of “supernatural” claims can be decided by analytic definition and become wholly uninteresting.

3 Evaluation of Metaphysical Presuppositions of Science

As Mahner focuses his discussion primarily on ON, we will similarly concentrate our discussion on the ‘no-supernature principle’, and only briefly comment on the other alleged ‘metaphysical presuppositions’ of science.

Let us first consider (a), the presupposition of ontological realism. Mahner argues that scientists must presuppose that their experiments involve “real entities in a real world, not just objects existing in our mind. That is, we work on the basis of ontological realism, which helps to explain not only the success but in particular the failure of scientific theories” (p. 1440). However, we maintain that ontological realism, while it may partly explain the success of science, is a defeasible conclusion of science—one that is arrived at by consideration of the evidence. What makes something ‘real’, and not just a figment of our imagination or a social construction, is that it exhibits a consistent pattern irrespective of (or indeed in spite of) our subjective beliefs, thoughts, biases, or desires. Whether or not there are phenomena that fulfill this criterion is empirically discoverable through science. Ontological realism about the entities described by science is the conclusion of an inference to the best explanation on the basis of the available evidence, not a presupposition of science (e.g., Niiniluoto 1999).

Concerning the ontological lawfulness principle (b), Mahner argues that “a lawful world is not a piece of empirical knowledge: it is a necessary condition of cognition. Without things behaving regularly due to their lawful properties, no organism would be able to learn much about the world” (p. 1440). There is some truth to this claim about human cognition; however, we dispute the notion that general lawfulness must be presupposed by science. Science attempts to identify patterns and regularities in the world, so as to provide empirically adequate and parsimonious explanations of phenomena. However, scientists need not presuppose that there is lawfulness or regularity across the board; rather, they observe and interrogate nature to see if, and to what extent, regularity is to be found. In a general sense, scientists do act as though nature is not going to deceive us: that emeralds are ‘green’ and not ‘grue’,Footnote 7 and that the fundamental laws of nature are constant over space and time. However, for any given regularity, the assumption that it will persist is always defeasible.

According to evolutionary epistemologists, evolution has endowed us with an extraordinary ability to detect patterns and to assume that the world follows a lawful and regular course (De Cruz et al. 2011; Papineau 2000). These mental heuristics allowed our ancestors to exploit regularities in the environment so as to successfully navigate through the world. Science is a more refined and systematic way of identifying these regularities. In science, as in everyday reasoning, we provisionally assume that the laws of nature are not going to change spontaneously or capriciously. Also, if we are dealing with as yet unknown phenomena, it is a safe bet to assume that they will exhibit some form of regularity. This pragmatic default assumption is provisionally adopted in order to further our scientific investigations, and to maximize our chances of capturing true regularities in nature, if they exist.Footnote 8

While we agree with Mahner that no organism would be able to thrive, and cognition would be impossible, without there being at least some regularity in the environment, this is simply a consequence of the weak ‘anthropic’ principle, according to which the reality in which we find ourselves must be consistent with our existence (see Stenger 2007b). Indeed, without some minimal form of regularity in the world to support cognition, we wouldn’t be here to do science in the first place. So, in the rather trivial anthropic sense, given that science requires the existence of observers, some degree of lawfulness is required before science can be possible. But beyond this minimal anthropic requirement, a blanket presupposition of regularity is not needed if we want to find out something about the world. Similarly, it is rather uncontroversial that nature must display some stable patterns if science is to be a worthwhile endeavor. However, this does not mean that scientists must presuppose that nature is in fact lawful. Lawfulness is not an a priori presupposition, but rather an a posteriori conclusion of scientific investigation.

To illustrate the ex-nihilo-nihil-fit principle (c) (the notion that ‘something’ cannot appear from ‘nothing’), Mahner describes a standard high school science experiment involving the production of oxygen gas from yeast and hydrochloric acid. Upon failing to observe the gas, Mahner argues, “no scientist would seriously entertain the idea that somewhere in the experimental setup the gas has literally dissolved into nothing. Conversely, no scientist would assume that we can produce gas out of nothing” (p. 1441). Contra Mahner, we maintain that the ex-nihilo-nihil-fit principle is not a metaphysical presupposition of science, but rather an a posteriori conclusion of scientific inquiry. In principle, science could be led to the extraordinary conclusion that the law of conservation of energy (that mass-energy cannot be created or destroyed) is violated after repeatedly observing ‘something’ coming from ‘nothing’ and after thoroughly investigating and ruling out alternative explanations. Thus, scientists need not presuppose a priori the impossibility of creatio ex nihilo.

Concerning (d), Mahner writes:

[W]e must assume that causation is for real and hence an ontological category, as well as that there is a principle of antecedence: causes precede their effects in time, so that the present is determined causally or stochastically by the past, but not conversely. In other words, we need to assume not only that the experimental setup (or the world in general) is real, but also that we can interact with it and that our actions can trigger orderly chains of events. (p. 1441)

Again, we submit that there is no need for scientists to presuppose this principle a priori. At most, it is a tacitly held default assumption. In principle, it is possible to test empirically whether we can interact with the experimental setup and whether causes precede effects (see the Bem study on precognition below). Indeed, certain puzzling phenomena at the microscopic level of reality described by quantum mechanics may be best explained by backward causality (Price 1997; Stenger 2000). Hence, the principle of antecedence need not be considered a metaphysical presupposition of science.

Analogously, before the advent of relativity theory, the notions of absolute (Euclidean) space and time held sway in physics. Specifically, it was assumed that events which appear as simultaneous in one reference frame must also appear as simultaneous in another. However, Einstein’s theory of relativity challenged this assumption, along with other intuitive or ‘common-sense’ notions of space and time. Now this new relativization of time and space threatened an influential argument in Immanuel Kant’s Critique of Pure Reason, according to which Euclidean space and absolute time are the a priori forms of knowledge, without which physical science would be impossible. Although some followers of Kant resisted the new theory of relativity on a priori grounds, most Neo-Kantians tried to incorporate these new findings in their transcendental philosophy (Ryckman 2006). It is instructive to consider the detrimental effects on scientific progress that would have ensued if physicists had clung to the philosophical a priori forms as defined by Kant, and considered the notions of absolute time and Euclidean space to be inviolable metaphysical (or transcendental) presuppositions of science.

Let us now consider the ‘no-psi’ principle (e). Mahner thinks that scientists must presuppose the non-existence of psi (e.g., extrasensory perception, telepathy, telekinesis) in order to “exclude the possibility that the experimental setup can be causally influenced in a direct way solely by our thoughts or wishes” (p. 1441) and to ensure that “[n]either humans nor little green aliens from another galaxy … [are] able to meddle, just by thinking alone, with empirical methods or our perceptional and conceptual processing of their results” (p. 1442).

Mahner argues that the ‘no-psi’ principle applies even more so when considering ‘supernatural’ entities with exceptional powers:

… we must stipulate…that no supernatural entity manipulates either the experimental setup or our mental (neuronal) processes or both…If we admit the supernatural, there is no reason to exclude a priori the existence of a malicious entity that could meddle with the world including our cognitive processes. So we need to start with the postulate that no such entities exist. (p. 1442)

Thus, scientists must also presuppose the ‘no-supernature’ principle (f). Perhaps what Mahner means by scientists having to ‘presuppose’ the principles of ‘no-psi’ and ‘no-supernature’ is that scientific methods are reliable only if we live in a certain kind of world, namely one without psi and the supernatural. On this charitable reading of Mahner, the non-existence of psi and the supernatural is a logical precondition for the success of science. However, even if this is the case (and we shall argue that it is not), it does not follow that scientists must “stipulate” that psi and the supernatural do not exist and, accordingly, are prohibited from ever considering paranormal or supernatural explanations when evaluating the results of their experiments. Thus, Mahner’s statements above suggest two distinct interpretations of the term ‘presuppose’: According to the first interpretation, science could not work if certain metaphysical principles were false, whereas according to the second interpretation, scientists are forbidden to consider explanations that would imply the falsity of these metaphysical principles.

In line with the first interpretation, one may concede that in order to be a successful enterprise, science requires that Mahner’s principles are true to some extent (for instance, that the world displays some lawful behavior). However, this interpretation does not entail that scientists must presuppose or “stipulate” the truth of these principles a priori (the second interpretation). Contrary to the second interpretation, we maintain that, while scientists may infer from the empirical success of their theories that (most probably) no supernatural forces have interfered with their work, they need not (and should not) presuppose from the outset that the supernatural does not exist. Psi hypotheses can be held up to the evidence by testing whether our experimental results can be influenced solely by our thoughts or wishes. Indeed, science has already investigated psi phenomena using methods employed in standard scientific practice (e.g., Alcock 2003). For instance, a recent peer-reviewed paper by psychologist Daryl Bem (2011) reported results of parapsychological experiments which appeared to support the existence of ‘precognition’—the awareness of future events prior to their occurrence. While the results of this study and their interpretation have been challenged (Alcock 2011; Wagenmakers et al. 2011), methodologically sound attempts by independent investigators to replicate the findings have been carried out, and have failed (Ritchie et al. 2012). This shows that, in principle, science could demonstrate the existence of psi, provided that the methodology of the studies is sound. Moreover, it is not immediately clear why the scientific enterprise would collapse as soon as one unmistakable piece of evidence for psi is found. What if we had good empirical reasons to believe that the effects of psi are limited in range and power? (In Sect. 6 we discuss the same problem with respect to divine interventions.) If the ‘no-psi’ principle is ‘presupposed’ by science, however, in the sense that scientists are forbidden to consider paranormal explanations, then how could we make sense of experiments such as Bem’s?

In what follows, we examine in greater detail the rationale behind Mahner’s ‘no-supernature’ principle (f).

4 Defining ‘Natural’ and ‘Supernatural’

According to Mahner, naturalism is the view that “all that exists is our lawful spatiotemporal world”; conversely, supernaturalism is the view that there is another non-spatiotemporal world transcending the natural one, “whose inhabitants—usually assumed to be intentional beings—are not subject to natural laws” (p. 1437). This characterization echoes clause (2) of our working definition of the supernatural, while the reference to intentionality roughly corresponds to clause (3). There is still some ambiguity about what ‘natural laws’ amount to, however, and there may even be a concern about circularity (defining ‘supernatural’ in terms of ‘natural’).Footnote 9

In our previous publications (Boudry et al. 2010; Fishman 2009), we argued that there is no inherent barrier preventing science from evaluating supernatural claims.Footnote 10 To make things more concrete, we provide a (non-exhaustive) list of observations which, according to both our umbrella definition and Mahner’s definition, would constitute at least tentative evidence for supernatural entities or phenomena:

  1. Intercessory prayer can heal the sickFootnote 11 or re-grow amputated limbs.Footnote 12

  2. Only Catholic intercessory prayers are effective.

  3. Anyone who speaks the Prophet Mohammed’s name in vain is immediately struck down by lightning, and those who pray to Allah five times a day are free from disease and misfortune.

  4. Gross inconsistencies are found in the fossil record, and independent dating techniques suggest that the earth is less than 10,000 years old—thereby confirming the biblical account and casting doubt upon Darwinian evolution and contemporary scientific accounts of geology and cosmology.

  5. Specific information or prophecies claimed to be acquired during near-death experiences or via divine revelation are later confirmed, assuming that conventional means of obtaining this information have been effectively ruled out.

  6. Extrasensory perception or other paranormal phenomena are scientifically demonstrated (e.g., psychics routinely win the lottery).

  7. Mental faculties persist despite destruction of the physical brain, thus supporting the existence of a soul that can survive bodily death.

  8. Stars align in the heavens to spell the phrase, “I Exist—God”.

The aforementioned observations would not prove conclusively that the supernatural exists, since it always seems possible that a natural explanation will ultimately be found to account for them.Footnote 13 The question, of course, is whether we can always adjudicate between what counts as a ‘natural’ and what counts as a ‘supernatural’ explanation for such phenomena. The reason why we have offered only a rough-and-ready definition of the supernatural is that any distinction between natural and supernatural is bound to break down in various scenarios. Robert Pennock hints at the treacherous nature of this distinction:

This is not to say … that things we now think of as supernatural necessarily are so. It could turn out, for example, that ghosts exist but that unlike our fictional view of them, they are subject to natural law. In such a case we would have learned something new about the natural world … and would not have truly found anything supernatural. (Pennock 1999, 389)

But what exactly would it mean for ghosts to be subject to “natural law”? That they consist of matter and energy as traditionally conceived? That their behavior follows a lawful course, or that they cannot violate physical conservation laws? All of these possibilities leave room for ambiguity.

Let us consider the first item on our list. A ‘natural’ explanation for the efficacy of intercessory prayer still seems logically possible. For instance, perhaps extraterrestrials are fooling around with us and mocking our silly religious beliefs. Suppose that their prayer pranks are accomplished by means of some advanced technology that appears miraculous to us. In fact, these aliens live in our universe and are composed of the same physical stuff as us. This explanation does not sound supernatural, at least not by any reasonable stretch. But what if these aliens live in a different dimension, or have mastered the art of psychokinesis? Should they be considered supernatural?

Mahner himself seems to have trouble maintaining consistency in his characterization of the supernatural. He writes that telepathy could be a ‘natural’ alternative explanation for the efficacy of prayer. But this suggestion would directly contradict his ‘no-psi’ principle (e). Indeed, how do we decide whether telepathy would be a truly supernatural phenomenon? Is there some special psi-particle involved? Does it interact with other particles, or is it something radically different? If it turned out to be an irreducibly mind-like process, that would overthrow our whole framework of physics. But if telepathy were reducible to a mindless physical force, the label ‘supernatural’ would arguably not be appropriate (see Carrier 2007).

Let us put some more pressure on our own working definition (and Mahner’s). An omniscient and omnipotent creator who is not part of our spatiotemporal universe is generally considered ‘supernatural’. But what if this creator is a cosmic programmer who resides in a separate universe bounded by its own laws? What if he is composed of physical stuff and is not an irreducibly mind-like entity? Thus, it seems that our intuitions about what is and what is not supernatural fail us here. In our view, it does not really matter how we judge every conceivable scenario. At most, the concept of the supernatural is a convenient term to capture a rough-and-ready distinction between mundane stuff and extraordinary, transcendent, mind-like stuff. But the distinction between ‘natural’ and ‘supernatural’ is far too hazy to expect it to do any substantial philosophical work.

Pennock has suggested an analytic definition to solve this problem: “[I]f we could apply natural knowledge to understand supernatural powers, then, by definition, they would not be supernatural” (Pennock 1999, 290). As we have argued before (Boudry et al. 2010), however, this analytic definition appears to beg the question, and reduces the issue of whether science can test the supernatural to triviality—as scientific testability becomes simply a matter of how one chooses to label particular entities or phenomena, rather than being related to their ontological and epistemic status. Pennock’s definition reduces to a play on words without substance, and without any contact with the colloquial sense of ‘supernatural’. This can be clearly appreciated by considering his claim that “supernatural beings and powers are not controllable by humans” (Pennock 1999, 290). This claim clearly contradicts what most religious people believe, as well as what is written in many scriptural texts, with regard to the efficacy of intercessory prayer.Footnote 14

In short, we submit that the terms ‘natural’ and ‘supernatural’ cannot be defined or differentiated in a crisp and non-trivial way. More specifically, no one has yet offered a definition of the term ‘supernatural’ which (1) captures standard religious miracle claims about gods, ghosts, and spirits, (2) is non-circular and independent from the definition of science, and (3) entails that supernatural claims can never be tested by science. Clearly, if the term cannot be sharply and coherently defined, then policies aimed at excluding the supernatural from scientific consideration are misguided. If gods, ghosts, and spirits are able to interact with the spatiotemporal world, such that we could directly or indirectly detect their effects—as is implied by standard religious miracle claims—then it doesn’t matter much whether these entities are labeled as ‘natural’ or ‘supernatural’. Whether the entities or phenomena posited by claim X are defined as ‘natural’ or ‘supernatural’ is irrelevant to its scientific status (Fishman 2009).

In any event, given our working definition, we believe that our list of observations would constitute powerful, albeit defeasible, evidence for the supernatural. Conversely, the consistent absence of this evidence counts as evidence against the supernatural (Fales 2009; Fishman 2009; Monton 2006; Stenger 2007a).Footnote 15 One might object that these observations cannot provide evidence for the supernatural since one must first specify details of the particular supernatural hypotheses under consideration and the extent to which they would predict such observations (we will discuss and critique this objection later on). However, for any given observation X, one may ask, what is the most plausible explanation for X? Whereas all logically consistent explanations are possible, some will be grossly implausible relative to other rival explanations—and natural explanations for the observations listed above seem highly implausible relative to supernatural ones. Indeed, what could plausibly explain the effects of intercessory prayer other than the actions of a supernatural entity—or at least a ‘functionally equivalent’ non-supernatural entity with comparable powers?

5 Overnatural Versus Transnatural

In order to resolve the controversy over the testability of supernatural claims, Mahner suggests that the term ‘supernatural’ has been applied to two different categories of phenomena, which he calls the ‘overnatural’ and the ‘transnatural’ (following Spiegelberg 1951; see also Tanona 2010). Overnatural entities are effectively super-powered beings with quasi-natural properties, whereas transnatural entities are categorically different from ‘natural’ ones, so much so that their properties are essentially mysterious, ineffable, and incomprehensible.

Thus, while the overnatural seems to be somewhat intelligible by analogy with known natural properties (e.g., an anthropomorphic God), the transnatural is ‘wholly other’ and therefore entirely enigmatic and unknowable (e.g., an abstract, incomprehensible God). According to Mahner, those who maintain that the supernatural is testable seem to conceive of the supernatural as merely overnatural, whereas those who argue that the supernatural is inherently untestable regard the supernatural as transnatural, and hence both inaccessible and unintelligible.

We agree that the term ‘supernatural’ has often been indiscriminately applied in referring to these two kinds of entities. In fact, it is conceivable that ‘transnatural’ notions of God have been constructed by theologians in an attempt to evade evidence which is difficult to reconcile with the more intelligible ‘overnatural’ or anthropomorphic conceptions of God. However, if the transnatural refers to entities that are ineffable and unintelligible, then it is trivially true that science cannot examine transnatural-supernatural entities, for the obvious reason that science cannot evaluate unintelligible hypotheses. As Mahner himself notes, “[t]here neither is an ontological theory proper of the transnatural nor could there be, because there can be no theory of the unintelligible” (p. 1447). Thus, scientists need not presuppose the non-existence of the transnatural-supernatural, but may reject it on the grounds that it has no real content to begin with, just as they would reject any other unintelligible concept, natural or otherwise. If, as Mahner claims, “the notions of omnipotence and omniscience are incoherent” (p. 1446), then science should reject entities bearing these characteristics on that basis, not because they are supernatural per se.

However, the only supernatural entities or phenomena which most religious believers care about or can relate to are overnatural ones.Footnote 16 As philosopher Evan Fales writes:

To show that supernatural beings are in principle unobservable, one would have to show that they cannot have any causal impact upon our experience, either directly or through changes they produce in the material world. But no theist will accept that causal restriction upon her favored supernatural beings. Perhaps it could be shown that a disembodied spirit, lacking mass, spatial location, and arguably even temporality, could in principle not causally interact with our physical world. But no one, to my knowledge, has ever demonstrated such a thing; and if it could be demonstrated, theism would immediately become an uninteresting religious position. (Fales 2009)

Hence, the category of transnatural-supernatural (unintelligible) concepts is empty, whereas the category of conceivable overnatural-supernatural concepts is non-empty; and we have seen how science could investigate (and, indeed, already has investigated) the supernatural, even according to Mahner’s own definition of the term. Thus, we conclude that the ‘no-supernature’ principle is unfounded.

6 Is the Supernatural a “Science Stopper”?

Mahner sets up a slippery slope that is often appealed to by defenders of MN. If we entertain the possibility of a supernatural presence in this world, so the argument goes, then such a being would be “able to interfere with the lawful course of natural events, hence also with our brain functions” (p. 1446). But this interference would undermine our scientific efforts insofar as we could no longer trust the results of any of our experiments (or, indeed, any of our daily inferences). For all we know, supernatural entities could be wreaking havoc and trying to sabotage both our science and our mental processes. Thus, as the existence of the supernatural would be a ‘science stopper’, scientists must presuppose the ‘no-supernature’ principle.

But even if we agree that supernatural entities have the power to suspend natural laws, intervene at will, mess with our test tubes, etc., why assume that they would maximally take advantage of these possibilities? For no apparent reason, Mahner assumes that God (or any other supernatural being) would never take half measures: either he would not intervene at all, or he would mischievously interfere in all our experiments. According to Mahner, to restrict the intervention of such entities would be ad hoc. But the problem of ad hoc-ness cuts both ways. Until we have evidence for God, any assumption about his intervention policy is as arbitrary as the next one.

We can agree that, if tomorrow all regularities in the world were suspended, then science, as we know it, would be impossible (also in light of the ‘anthropic principle’ discussed earlier). Indeed, the absence of meddling supernatural entities may very well help to “explain why science works and succeeds in studying and explaining the world” (p. 1438). But this is not an all-or-none matter. If God exists, nothing dictates that he must intervene constantly in the world, or that he must do so in a capricious way. Any god that intervenes at least once in the universe, performing miracles or revealing himself, may properly be called an ‘interventionist’ god. If such a deity decides to intervene at most occasionally (as most theists believe), allowing nature to unfold of its own accord the rest of the time, this still leaves much regularity in the world. Suppose that God’s last special intervention in the world was the resurrection of Jesus. Even if God could intervene at any time, because he is omnipotent, this does not necessitate that he will. It remains possible that after two millennia of a laissez-faire policy he will intervene again, and perhaps violate the lawful regularities on which science depends. However, this concern is simply a variant of the old problem of induction, which even the metaphysical naturalist must contend with.

By the same token, suppose that we discovered aliens visiting us here on Earth or living on another planet. If these aliens were sufficiently technologically advanced, they could intervene in our affairs or tamper with our experimental results in a manner that would be inscrutable to us. Would this discovery necessarily spell the end of science? We do not think so. Indeed, there is no reason to suppose that these aliens will, in fact, intervene in our lives and manipulate the results of our experiments; and even if they have intervened at some point, as suggested by the directed panspermia hypothesis of Francis Crick and Leslie Orgel, there is no reason to assume that they will do so in the future. Finally, it is possible that a supernatural cosmic programmer is responsible for generating all of the regularities and patterns that we observe and which our scientific theories attempt to explain. In this case, science would work perfectly well even if the supernatural exists.

Mahner is also concerned that admitting even a single supernatural entity would put us “on a slippery slope to admitting as many as we fancy” (p. 1451). If we admit the supernatural, then no holds are barred. But this concern really stems from the rashness with which many religious believers jump to supernatural explanations, and not from any intrinsic difficulty with supernatural hypotheses. True enough, sloppy reasoning puts us on a slippery slope to more sloppy reasoning. If we accept the miracle stories in the Bible on the flimsiest of grounds, then we can no longer withhold belief in miracles coming from other holy books, and indeed we open the gates to all sorts of supernatural claims. But this is different from accepting a supernatural claim only after rigorous scientific investigation. Moreover, the discovery of evidence for the supernatural, e.g., positive effects of intercessory prayer, would not necessarily be a ‘science stopper’. On the contrary, one could imagine that scientists would be greatly excited about such a groundbreaking discovery, and would rush to do further research.

To summarize, Mahner’s thesis that science must presuppose ON as a metaphysical principle rests in part on the claim that if the supernatural exists, then our cognitive processes and scientific results cannot be trusted. We have argued that this claim is questionable.

7 Methodological Versus Ontological Naturalism Revisited

Mahner considers the views of several proponents of MN (namely Forrest, Pennock, and Ruse), and concludes that upon closer examination MN collapses into provisional ON, the view he endorses. Specifically, when proponents of MN hold that the application of scientific methods requires that scientists assume the non-existence of the supernatural, they are not merely referring to the methodological or epistemological requirements of science. In fact, they are making an ontological commitment to the presupposition that the supernatural does not exist. Hence, MN is not neutral with regard to God’s existence and is actually equivalent to provisional ON.

While MN is not the main focus of this paper, we are sympathetic to Mahner’s critique of MN and its artificial attempt to separate methodology and ontology (see Boudry et al. 2012 for a similar argument to that effect). However, Mahner’s endorsement of provisional ON creates a paradox. According to Mahner, in order to reliably employ scientific methods, we must assume (that is, presuppose) that ON is true. On the other hand, Mahner claims that ON should be regarded as a provisional “metaphysical null hypothesis, stating that a supernature does not exist” (p. 1444). As commonly understood in science, a ‘null’ hypothesis is one that can be rejected, given sufficiently strong evidence against it. Indeed, Mahner admits that “there could be evidence at most for the more or less anthropomorphically defined overnatural” (p. 1449), from which it follows that ON can indeed be scientifically disconfirmed. But the notion that the overnatural-supernatural could be confirmed via the same scientific methods which, according to Mahner, presuppose its non-existence, is logically inconsistent. Thus, there is an incompatibility between the position that science is committed to ON a priori and the position that ON is amenable to scientific testing, that is, subject to confirmation or disconfirmation by the evidence. It appears, then, that Mahner, like Forrest and Pennock before him (Boudry et al. 2010), wants to have it both ways, alternately affirming and denying the defeasibility of naturalism in science.

Now, it seems that Mahner anticipates this objection when he writes that “ON is … provisional as a metaphysical principle, but it is not provisional as an essential feature (an ontological presupposition) of science” (p. 1457). Unlike normal null hypotheses, Mahner claims, metaphysical null hypotheses such as ON are “unfalsifiable by direct empirical evidence”. Rather they can be disconfirmed “indirectly”, for example, if ON turned out to be “incompatible with scientific practice” in the sense that “science could fail as a cognitive enterprise, either in its entirety or in some particular area, so that we might have to reconsider ON” (p. 1444). That is, if science were found to ‘break down’ in some way, then we could infer that ON is false. It is important to note that the apparent failure of science would not necessarily mean that ON in particular should be reconsidered, as this scenario is also compatible with, say, technologically advanced aliens messing with our experiments. Alternatively, the break-down of science may simply indicate that the patterns and regularities we have hitherto observed in the world are not spatio-temporally invariant.

In any event, it is unclear how one could establish the breakdown of science. Presumably, this would require empirical evidence of some sort. But how could one evaluate such evidence without the application of scientific methods? Thus, we submit that the paradox in Mahner’s thesis remains. For as long as scientists must presuppose ON, ‘lawfulness’, ‘no-psi’, etc. they are forbidden from ever considering the possibility that science has indeed broken down or “failed as a cognitive enterprise”. For instance, according to Mahner’s thesis, even if the world were in fact chaotic, scientists must continue to presuppose that the world is lawful and thus could never conclude that the ‘lawfulness’ principle is false. Hence, ON and the other alleged metaphysical presuppositions of science cannot be “indirectly” disconfirmed and are therefore not defeasible after all.

8 Bayesian Confirmation Theory, Ad Hoc Accommodation, and Ockham’s Razor

We have previously discussed ways in which (at least ‘overnatural’) supernatural hypotheses could be scientifically tested, that is, confirmed or disconfirmed by the evidence. To further illustrate how the plausibility of supernatural hypotheses may be evaluated and to consider other points raised by Mahner concerning their testability, here we provide a brief overview of Bayesian confirmation theory, which is widely considered to provide an accurate, albeit idealized, account of how rational agents both do and should update their beliefs on the basis of new evidence.Footnote 17 Central to Bayesian confirmation theory is Bayes’ theorem (named after its originator, Rev. Thomas Bayes), which can be readily derived from the axioms of probability theory (Fig. 1). According to Bayesian confirmation theory, our initial degree of confidence in the truth of a hypothesis, H, represented by its prior probability, p(H), is revised on the basis of new evidence, E, which may either confirm or disconfirm H by raising or lowering, respectively, its posterior probability, p(H|E), relative to p(H). The process by which a rational agent’s degree of belief changes in the light of the evidence is constrained by the mathematical framework of the probability calculus.Footnote 18

Fig. 1 Two equivalent forms of Bayes’ theorem. Using Bayes’ theorem, an agent’s initial degree of belief in a hypothesis or prior probability, p(H), is updated on the basis of new evidence, E, via the likelihood, p(E|H), to yield a posterior probability, p(H|E). The term p(E) in the denominator (left formula) represents the probability of the evidence under all mutually-exclusive and exhaustive hypotheses. In its expanded form (right formula), p(E) equals the product of the likelihood and prior of each hypothesis, summed over all hypotheses.
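For reference, the two equivalent forms of Bayes’ theorem described in the caption of Fig. 1 can be written out as:

```latex
p(H \mid E) = \frac{p(E \mid H)\, p(H)}{p(E)}
\qquad \text{and} \qquad
p(H_i \mid E) = \frac{p(E \mid H_i)\, p(H_i)}{\sum_{j} p(E \mid H_j)\, p(H_j)}
```

where the sum in the denominator of the expanded form ranges over the full set of mutually exclusive and exhaustive hypotheses.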

In essence, Bayes’ theorem is a mapping of prior probability, p(H), to posterior probability, p(H|E), via the likelihood, p(E|H), the probability of the evidence given that the hypothesis is true (Li and Vitányi 1997; see Fig. 1). The likelihood is therefore a measure of how well the hypothesis predicts or fits the evidence. Importantly, the posterior probability of a particular hypothesis depends not only on its prior and likelihood with respect to the evidence, but is inversely proportional to the prior and likelihood of the competing alternative hypotheses, as represented in the denominator of the expanded version of Bayes’ theorem (Fig. 1, right). Thus, a high likelihood is not sufficient for a hypothesis to be confirmed by the evidence; it must also be higher than that of competing hypotheses. Consequently, Bayesian confirmation of hypotheses by the evidence is inherently contrastive, i.e., relative to the set of rival hypotheses.

Given this background, it is easy to see how the existence of a specific supernatural entity, such as the all-powerful, all-knowing, and all-good God of Christianity, is disconfirmed by the available evidence. Many observations are highly unexpected given that the God of Christianity exists. These include the enormous amount of apparently gratuitous suffering in the world (the problem of evil), the flawed or inefficient construction of organisms (i.e., their ‘unintelligent design’), the vast spatio-temporal scale of the universe which renders human life comparatively insignificant in the cosmic scheme, the lack of verifiable evidence for the effects of intercessory prayer, etc. Thus, the Christian God hypothesis (G) has a low likelihood in relation to this evidence (E), compared with Naturalism (N), on which this evidence is quite expected: p(E|G) ≪ p(E|N). This evidence therefore disconfirms the existence of the Christian God—that is, the posterior probability of the Christian God’s existence, given the evidence, is less than its prior probability: p(G|E) < p(G). This result does not necessarily imply that p(G|E) < p(N|E), however, as the posterior probabilities of G and N depend also on their priors.Footnote 19
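As a toy illustration of this update (all numbers below are our own, purely hypothetical stipulations, not figures drawn from the literature), suppose G and N are treated as the only rival hypotheses and, for the sake of argument, are given equal priors:

```python
# Illustrative Bayesian update (all numbers hypothetical, for exposition only).
# G = "the Christian God exists", N = "naturalism"; assume these are the only rivals.
p_G, p_N = 0.5, 0.5          # equal priors, chosen charitably for G
p_E_given_G = 0.01           # evidence E (e.g., gratuitous suffering) unexpected under G
p_E_given_N = 0.5            # evidence E quite expected under N

# Expanded form of Bayes' theorem: p(E) sums likelihood x prior over all hypotheses.
p_E = p_E_given_G * p_G + p_E_given_N * p_N
p_G_given_E = p_E_given_G * p_G / p_E

print(round(p_G_given_E, 4))  # 0.0196 -- far below the 0.5 prior, so E disconfirms G
```

Even with a generous prior, the low likelihood of E under G drags the posterior well below the prior, which is exactly the Bayesian sense in which the evidence disconfirms G.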

8.1 Accommodation

However, it is always possible to save a hypothesis from disconfirmation by accommodation, that is, by tacking on additional elements or auxiliary assumptions to the original hypothesis so that it fits the observed data (which would otherwise disconfirm it). Accordingly, philosopher John Shook (2010) describes the history of the debate between theists and atheists over the existence of God as resembling an arms race: atheists come up with a reason to reject a particular theistic hypothesis, and theists respond by modifying the hypothesis so that it escapes the criticism. As Mahner notes, “it is the sole purpose of theological apologetics to come up with protective ad hoc hypotheses to obviate such falsifications” (p. 1449). A particularly extreme example of such an ad hoc “rescue” is the Omphalos hypothesis, which asserts that God created the world according to the biblical account, but deliberately planted evidence giving the appearance of a deep age of the universe and of the process of biological evolution on earth (e.g., the geologic and fossil records) in order to test our faith. Another well-known example from the philosophy of religion is theodicy: the attempt to reconcile, by the introduction of elaborate auxiliary assumptions, the apparently gratuitous suffering observed in the world with the existence of an all-powerful and all-good God (whose existence is otherwise disconfirmed by the evidence). Thus, in reaction to the disconfirming evidence, supernaturalists often increase the complexity of their hypotheses by adding ad hoc auxiliary assumptions that immunize the core hypothesis against falsification. What makes the added assumptions ad hoc is that they do not entail additional observations or predictions that could independently confirm them, or that they do not achieve sufficient unification to offset the increase in complexity (Boudry and Leuridan 2011). 
They are simply made up out of whole cloth in order to save the core hypothesis from disconfirmation.

It is important to note that the capacity for accommodation is not unique to supernatural hypotheses. Indeed, any hypothesis considered in science or in everyday life, regardless of whether it deals with ‘natural’ or ‘supernatural’ phenomena, can be saved from disconfirmation by accommodation. However, a consequence of such rescue maneuvers is that the overall theory must be made more complex in order to accommodate all of the available data. And as we shall see, there is a probabilistic cost associated with increasing the complexity of a hypothesis. Indeed, a penalty for complexity is required if we are to avoid the conclusion that all rival hypotheses which entail the data are equally plausible (as in the example of Green versus Grue mentioned earlier). There is nothing inherently bad about accommodation, provided that the probabilistic cost of increased complexity does not outweigh the probabilistic advantage gained by the increased goodness of fit to the data. However, an accommodating theory which includes an ad hoc auxiliary assumption that is either unsupported by independent evidence or that increases complexity without a compensatory increase in goodness of fit can be considered implausible on Bayesian grounds. Accordingly, by unnecessarily increasing hypothesis complexity, ad hoc accommodation violates the principle of Ockham’s razor, which embodies the explanatory virtue of simplicity (or parsimony)—often considered one of the hallmarks of ‘good’ scientific explanations.
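The probabilistic cost of an ad hoc rescue can be sketched using the conjunction rule of probability (again, every number below is a hypothetical stipulation of ours, chosen only to make the structure of the argument visible):

```python
# Cost of ad hoc accommodation via the conjunction rule (numbers hypothetical).
# G is a hypothesis disconfirmed by evidence E; an auxiliary assumption A
# (e.g., "the creator deliberately planted misleading evidence") is tacked on
# so that the conjunction G & A entails E.
p_G = 0.1
p_A_given_G = 0.01           # A has no independent support, hence a low probability
p_E_given_GA = 1.0           # the conjunction now "predicts" E perfectly

p_GA = p_G * p_A_given_G     # conjunction rule: p(G & A) = p(G) * p(A|G) = 0.001
weight_GA = p_E_given_GA * p_GA   # unnormalized posterior weight of G & A

# A rival hypothesis that fits E reasonably well without any rescue maneuver:
p_N, p_E_given_N = 0.2, 0.5
weight_N = p_E_given_N * p_N      # 0.1 -- a hundred times larger

# The perfect fit bought by A is swamped by the collapse in prior probability.
```

The rescue raises the likelihood to 1, but the prior of the conjunction shrinks multiplicatively, so the accommodating hypothesis can still end up far less plausible than a rival that fit the evidence without auxiliary patches.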

8.2 Ockham’s Razor

The principle of Ockham’s razor urges us not to multiply entities in our explanations unnecessarily—that is, to choose the simplest explanation (the one with the shortest description or fewest assumptions) consistent with the evidence. Intuitively, we feel that, all else being equal, a simple explanation for a given set of data is more probable than a complex one. Indeed, one of the primary goals of science is data compression—to discover simple models or “laws,” usually expressed mathematically, that provide the most compact description of the available data. And it is generally believed that simple laws are more probable a priori than complex laws that fit the data equally well.Footnote 20 However, why should simpler hypotheses be more probable than complex ones? Can there be an objective, rather than merely pragmatic or aesthetic, basis for Ockham’s razor?

From Bayesian confirmation theory and information theory it can be shown that increasing the complexity of a hypothesis, without there being a compensatory increase in its goodness of fit to the data, yields a reduced posterior probability relative to a simpler hypothesis that fits the data equally well. This reduction in posterior probability may be due to a decreased likelihood, a consequence of adding adjustable parameters to the hypothesis, or to a lowered prior probability. Thus, Ockham’s razor is neither simply a pragmatic or aesthetic principle, nor a prohibition against considering supernatural explanations, but can be justified on probabilistic grounds.Footnote 21 We briefly consider these Bayesian and information-theoretic approaches to simplicity below.

8.2.1 Simplicity Reflected in the Likelihood

Introducing additional entities into a theory has an effect similar to that of introducing additional parameters into an equation. The values of these parameters can be adjusted so that the new theory accommodates the data better than the original theory. Accommodating the data with the modified theory is analogous to fitting a set of data points using a complex mathematical model with one or more parameters whose values are unspecified. From a Bayesian perspective, a complex model with many adjustable parameters whose values can vary over a wide range (e.g., a high-order polynomial equation) is naturally and automatically penalized for its complexity by having a lower likelihood in relation to the data that are actually observed, compared with a simpler model (e.g., a linear equation) that fits the data equally well, but with fewer free parameters. The reason for this is straightforward: a simple model can accommodate fewer possible sets of observations than a complex model. Thus, the simple model makes specific or “sharp” predictions that can be more easily falsified. In contrast, the complex model is more flexible and makes “vague” predictions. As probabilities must sum to 1, given the complex model, the probability mass for the observations is spread out over many possible outcomes (most of which will never be observed). Consequently, the probability of each possible outcome, including the one that is actually observed, is lower than the probability of the outcome that is predicted more sharply by the simpler model. Accordingly, the likelihood of the simpler model that predicts the observed data is higher than that of the complex model (see Fig. 2). This automatic penalty for complex models has been referred to as the “Bayesian Ockham’s razor” (for details see Huemer 2009; Jefferys and Berger 1992; MacKay 2003; Myung and Pitt 1997). 
Thus, whenever additional adjustable parameters are introduced into a theory to enable it to accommodate evidence that would otherwise disconfirm it, there is a probabilistic price to pay for this increase in complexity.

Fig. 2 The Bayesian Ockham’s razor. Schematic illustration of how simple models, theories, or hypotheses (H) tend to have a higher likelihood, p(D|H), with respect to the data (D) observed (in this case, outcome 3) than more complex models or theories that can fit the data equally well (figure adapted from Huemer 2009). Note how the simple theory makes a specific or ‘sharp’ prediction, consistent with only a single outcome (3), whereas the complex theory makes vague predictions consistent with all possible outcomes 1–4. This likelihood advantage of simpler models (those with fewer adjustable parameters) has been called the “Bayesian Ockham’s razor”. Note also that the simpler theory is more easily ‘falsified’ (disconfirmed) than the complex theory, as the likelihood for the simple theory is zero for outcomes 1, 2, and 4, whereas it is non-zero for the complex theory.
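The contrast depicted in Fig. 2 can be sketched numerically; the four outcomes and the probability assignments below are hypothetical stand-ins for the schematic:

```python
# Sketch of the "Bayesian Ockham's razor" (probability assignments hypothetical).
# Four possible outcomes; outcome 3 is the one actually observed.
simple_theory  = {1: 0.0,  2: 0.0,  3: 1.0,  4: 0.0}   # sharp: predicts only outcome 3
complex_theory = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}  # vague: spreads mass over all four

observed = 3
# Both theories are consistent with outcome 3, but their likelihoods differ:
print(simple_theory[observed], complex_theory[observed])   # 1.0 0.25

# With equal priors, the posterior odds favor the simple theory four to one:
odds = simple_theory[observed] / complex_theory[observed]  # 4.0
```

Because probabilities must sum to 1, the complex theory's flexibility dilutes the probability it assigns to any particular outcome, including the one that actually occurs; the penalty for complexity is automatic, not an added stipulation.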

8.2.2 Simplicity Reflected in the Prior

Another way to accommodate the data is to construct a modified theory with fully specified parameter values which enable it to fit the data as precisely as a rival theory. As the modified theory now makes specific or “sharp” predictions, its likelihood may be comparable to that of the rival theory. For instance, for any given set of data points, one can draw a specific curve that fits them exactly; indeed, there are an infinite number of such curves. Similarly, any number of elaborate and outlandish conspiracy theories can be constructed to precisely fit our observations. Thus, a likelihood equal to 1 may be achieved by merely “hard coding” the data into the body of the modified theory such that it entails them. However, in order to accommodate the data in this manner, the complexity of the theory must increase, which in turn decreases its prior probability.

A probabilistic basis for Ockham’s razor at the level of the priors is provided by combining ideas from Claude Shannon’s information theory and from algorithmic information theory, developed independently by Ray Solomonoff, Andrey Kolmogorov, and Gregory Chaitin.Footnote 22 These information-theoretic approaches provide a principled way to assign initial prior probabilitiesFootnote 23 to hypotheses based on their degree of complexity, where the complexity of the hypothesis is defined as the length of its shortest description, or Kolmogorov complexity.Footnote 24 Essentially, the prior probability of a hypothesis is inversely related to its complexity, such that simple hypotheses (those with a short description) have a higher prior probability than complex hypotheses (those with a long description).Footnote 25 Intuitively, the more independent assumptions an explanation contains, i.e., the longer its description, the more ways it can be wrong—and hence the less probable it is (all else being equal). Specifically, for every additional binary digit (bit) of information or detail added to the description of the hypothesis, the probability of the hypothesis is correspondingly halved (multiplied by ½, the probability of each bit). Thus, in the absence of relevant background information, the initial prior probability may be assigned a value equal to 2^(−L), where L is the length (in bits) of the hypothesis (e.g., Kirchherr et al. 1997; Wallace 2005).Footnote 26 A similar result follows from the conjunction rule of probability: the conjunction of two independent events A and B—e.g., the outcome of two coin-flips—has a probability that is less than or equal to that of each event alone. A hypothesis that merely restates the data explicitly—i.e., by “hard coding” the data into the description of the hypothesis—provides no compression and has a Kolmogorov complexity that is equal to or greater than that of the data themselves.
It is therefore less probable than a hypothesis that compresses the data in the form of a short description or algorithm.
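A minimal sketch of such description-length priors, assuming the 2^(−L) assignment just described (the function name `prior` and the bit counts are ours, for illustration only; Kolmogorov complexity itself is uncomputable, so practical systems use computable proxies such as compressed file length):

```python
# Toy description-length prior (a sketch of the 2**(-L) assignment, not an algorithm
# for computing Kolmogorov complexity, which is uncomputable in general).
def prior(length_in_bits: int) -> float:
    """Each additional bit of description halves the prior: p = 2**(-L)."""
    return 2.0 ** (-length_in_bits)

# A hypothesis that compresses 1000 bits of data into a 50-bit description is
# astronomically more probable a priori than one that hard-codes the data verbatim:
compressed = prior(50)
hard_coded = prior(1000)
ratio = compressed / hard_coded   # 2**950 in favor of the compressing hypothesis
```

The halving-per-bit rule makes the intuition behind Ockham’s razor quantitative: each independent assumption added to an explanation multiplies its prior by ½, so verbose, data-restating hypotheses start from a vanishingly small prior.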

With this brief technical background behind us, we can now address several points raised by Mahner concerning the scientific status of supernatural hypotheses. First, Mahner comments that supernatural hypotheses are underdetermined by the data, in that observations that might be considered as evidence for the supernatural are compatible with alternative causes or explanations. For instance, Mahner writes, “… if there were reproducible evidence that intercessory prayer works, there would be several alternative natural hypotheses compatible with the evidence, such as a telepathic mechanism causing the healing or a superior alien civilization playing a prank on us” (p. 1450). [As noted earlier, by invoking such a telepathic mechanism, Mahner contradicts his ‘no-psi’ principle]. However, philosophers of science have long appreciated that theories are underdetermined by observations. Empirical evidence can always be accounted for by a potentially infinite number of explanations—just as a set of data points can be fit by a potentially infinite number of curves. The problem of underdetermination, therefore, is not unique to the evaluation of supernatural hypotheses, but applies to any inductive or abductive inference, whether it deals with entities or phenomena that are characterized as ‘natural’ or ‘supernatural’. Moreover, some of the alternative explanations suggested by Mahner (e.g., telepathy) are effectively indistinguishable from phenomena and entities that are typically associated with the label ‘supernatural’. So, we submit that Mahner’s objection here is simply based on an argument over terminology, i.e., the definition of ‘supernatural’ (see earlier discussion). Thus, if the problem of underdetermination doesn’t deter us in normal scientific practice, then there is no reason why it would do so when it comes to a special and nebulously defined class of supernatural explanations. 
And we have seen how Bayesian confirmation theory can help to solve the underdetermination problem based on a consideration of hypothesis complexity.

Mahner has rightly identified an important problem with some supernatural explanations: they are omni-explanatory; and an explanation that explains everything explains nothing at all. This follows directly from the Bayesian considerations discussed above: the likelihood of a hypothesis that makes “vague” predictions (i.e., that is omni-explanatory) is lower than that of a rival hypothesis which makes specific or “sharp” predictions that correspond with the data observed (as illustrated in Fig. 2). Thus, we agree with Mahner that omni-explanatory hypotheses are deficient. However, not all supernatural hypotheses are omni-explanatory. As noted earlier, specific gods with particular attributes do imply specific predictions about what we should observe in the world, and these predictions can be empirically tested (e.g., Stenger 2007a). Nonetheless, it is important to recognize that the rejection of omni-explanatory supernatural entities in science is not based on an a priori commitment to a metaphysical presupposition of ‘no-supernature’. Rather, it is based on considerations of parsimony, and hence, probability. Therefore, while some definitions of God may be too vague to be useful (and indeed, ‘transnatural’ definitions of God may be incoherent), this is a separate issue from the question of whether science presupposes the non-existence of the supernatural.

Similarly, the generic explanation “God did it” can be rejected on the grounds that it offers no compression of the data and provides no information at all about what we should expect to observe in the world (Boudry and Leuridan 2011). Hence, in order to account for the data, we would have to specify this detailed information explicitly, by “hard coding” it into the hypothesis. Indeed, the length of the hypothesis would be equal to or greater than that of the description of the world! Accordingly, given the information-theoretic considerations discussed above, the explanation “God did it” is inconceivably complex and has a correspondingly low posterior probability.Footnote 27 Thus, the doctrine of occasionalism, according to which every event in the world is directly caused by God’s will, is in fact extremely improbable a priori, contrary to what Mahner suggests. Importantly, the explanation “God did it” is not rejected because it violates a metaphysical presupposition of science, but rather because it is vastly less probable than almost any alternative naturalistic explanation for the universe. Indeed, naturalistic explanations for the universe and its operation will be considerably more probable insofar as they are expressed using simple and compact mathematical descriptions that provide good compressions of the data (for instance, Newton’s law of gravitation). Similarly, we can discount the unparsimonious hypothesis that we are just “brains in a vat” being manipulated by scientists living in a different dimension purely on probabilistic grounds, without having to presuppose (as perhaps Mahner would do) that this, or similar hypotheses relating to Cartesian skepticism, are false.

Mahner comments that, in addition to being omni-explanatory, many supernatural explanations can be rejected because they are pseudo-explanatory, as we know nothing about the laws or mechanisms of the supernatural. Thus, to invoke a supernatural explanation for a given observation is simply to replace one mystery with another. However, even if we lack an understanding of the laws or mechanisms by which the supernatural operates, does this mean that we should never consider a supernatural explanation for one of the observations mentioned earlier, such as the effects of intercessory prayer? Indeed, what if intercessory Catholic prayers were shown to reliably heal the sick or re-grow amputated limbs? Should we reject out of hand the possibility that Catholicism has some merit, simply because we have no understanding of how prayer works? Similarly, should we reject the possibility of telepathy a priori, simply because we cannot imagine a mechanism by which psi phenomena might operate? If we were to follow this rationale for rejecting hypotheses a priori, we might also reject the prevailing view amongst physicists that quantum phenomena, such as radioactive decay, are inherently indeterministic, a view which obviates the possibility of discovering an underlying explanation for quantum events.

Finally, Mahner argues that supernatural explanations are deficient in that we have no empirical means of deciding among competing supernatural explanations for a particular observation. For instance, he notes that the origin of a complex organ such as the vertebrate eye may be explained by reference to some creative intervention by a god, devil, angel, demon, or other entity. First, once again Mahner is interpreting the flaws of creationism and ID (e.g., see Young and Edis 2004) as indicating a defect intrinsic to supernatural hypotheses. The inability of ID adherents to discriminate between different supernatural entities is a consequence of the minimalist character of their design inference. In principle, however, nothing prevents design proponents from trying to infer the attributes and intentions of the designer on the basis of his creations. If God really designed the bacterial flagellum, wouldn’t we expect some telltale signs of his activity (some divine signature or moral message)? The reluctance of ID advocates to move beyond the ‘bare design’ hypothesis doesn’t alter the fact that it is possible to flesh out supernatural hypotheses to the point where they entail specific predictions about the world. Second, although vagueness should be avoided as much as possible, Mahner’s objection is not decisive. It is possible that we can infer that some intelligent designer was at work, without being able to tell whether it is an extraterrestrial, a demon, an angel, a coalition of gods, etc. Even if a design argument cannot discriminate between different kinds of designers, these entities still share the common property of being intentional agents with the creative intelligence sufficient to design the vertebrate eye.

Recall that the Bayesian evaluation of a hypothesis is inherently contrastive, i.e., relative to a set of mutually exclusive rival hypotheses. Hence, even if the probability of the evidence given the ‘generic’ design hypothesis (its likelihood) is very low, owing to the fact that it does not make ‘sharp’ predictions about what we should expect to observe (as discussed above), the likelihood of design might still be higher than that of rival non-design hypotheses. Hence, the evidence (e.g., a complex biological structure with an apparent adaptive function, such as the human eye), might still support a generic design hypothesis over non-design alternatives (Fales 2009; McGrew 2004). The inference to design does not require that we have detailed information about the intentions and abilities of the putative designer; a relative likelihood for the design hypothesis can still be reasonably assigned based on our past experience of the types of artifacts intelligent agents simpliciter are known to create.Footnote 28 Indeed, prior to the discovery of evolution by natural selection, even Darwin considered the argument for intelligent design as propounded by William Paley (1802) to be “conclusive” (Darwin 1876). Of course, it is now clear that complex biological ‘machines’ such as eyes (Nilsson 2009) and bacterial flagella (Pallen and Matzke 2006) can indeed arise via unguided evolution by natural selection, a fact that indirectly disconfirms contemporary versions of ID (Fales 2009). But the mere vagueness of the generic design hypothesis does not render it in principle unconfirmable by the evidence. Whether or not it is confirmed by the evidence depends also on the probability of the evidence on the rival hypotheses (see McGrew 2004 for further discussion of the issue of testability of design hypotheses).
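The contrastive character of Bayesian evaluation can be sketched with invented numbers. The priors and likelihoods below are entirely hypothetical; the point is only that a low absolute likelihood is compatible with strong relative support:

```python
# Hedged sketch: even a hypothesis whose likelihood is low in absolute
# terms can dominate the posterior if its rivals fit the evidence worse.

def posterior(prior, likelihoods, hyp):
    """P(H | E) over a mutually exclusive, exhaustive set of hypotheses."""
    evidence = sum(prior[h] * likelihoods[h] for h in prior)
    return prior[hyp] * likelihoods[hyp] / evidence

prior = {"design": 0.5, "chance": 0.5}            # equal priors (assumption)
likelihoods = {"design": 0.01, "chance": 0.0001}  # both low, but unequal

print(posterior(prior, likelihoods, "design"))  # ~0.99
```

Both likelihoods are small, yet the evidence favors “design” over “chance” by a factor of 100, so the posterior overwhelmingly favors it; what matters is the likelihood ratio, not the absolute value.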

9 Criteria of Good Explanations: Goodness of Fit and Simplicity

Supernatural hypotheses are routinely discounted by scientists not because they violate some tacit metaphysical presuppositions of science, but rather because they either make specific predictions which are at odds with our observations, or they fail to fulfill criteria of good explanations. The problems plaguing some supernatural hypotheses are not unique to ‘supernatural’ hypotheses, but may apply also to (now discredited) ‘natural’ hypotheses, such as the Ptolemaic model of the solar system and phlogiston theory.

According to philosopher David Harker (2008), a good explanation is one that explains much (it has high explanatory power) while assuming little (it is simple or parsimonious). Hence, the plausibility of a hypothesis depends not so much on when it was proposed relative to observing the data (that is, whether the data are predicted or accommodated by the hypothesis), but rather on (1) whether it fits the observed data well, i.e., both accurately and sharply, and (2) the number of independent assumptions that it requires in order to do so. These ideas are naturally and formally captured by the Bayesian and information-theoretic approaches described earlier. Specifically, the preferred explanation for a set of observations has both a comparatively high likelihood (it makes specific and accurate predictions or postdictions, giving it high explanatory power), and a high prior probability (it is parsimonious and consistent with our background knowledge). In general, there is a tradeoff between goodness of fit and simplicity. An increase in hypothesis complexity is justified only if it results in a correspondingly greater increase in goodness of fit (e.g., MacKay 2003; Wallace 2005). For a given set of observations there will be an optimal balance between hypothesis complexity and goodness of fit to the data (Myung and Pitt 1997; Twardy et al. 2005). An explanation that achieves this optimal balance satisfies our informal criteria for “good” explanations.
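The tradeoff between goodness of fit and simplicity can be illustrated with a standard penalized-likelihood score. A BIC-style criterion is used below purely for concreteness (the literature cited above develops Bayesian and MDL formulations rather than this exact formula), and the data and models are invented:

```python
import math

# Sketch: compare a 1-parameter constant model with a 2-parameter
# linear model on data that roughly follow y = 2x, using a BIC-style
# score n*log(SSE/n) + k*log(n); lower scores are better.

xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9, 12.2, 13.8]
n = len(xs)

def bic(sse, n, k):
    return n * math.log(sse / n) + k * math.log(n)

# Constant model (k=1): predict the mean of ys.
mean_y = sum(ys) / n
sse_const = sum((y - mean_y) ** 2 for y in ys)

# Linear model (k=2): ordinary least squares.
mean_x = sum(xs) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
sse_lin = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# The linear model pays a complexity penalty (k=2 vs k=1) but gains far
# more in goodness of fit, so its total score is better (lower).
print(bic(sse_const, n, 1) > bic(sse_lin, n, 2))  # True
```

The extra parameter is justified here because it buys a large improvement in fit; a still more complex model (say, a degree-7 polynomial threading every point) would fit the noise as well as the signal and lose on the penalty term, which is the “optimal balance” idea in miniature.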

Thus, the fundamental problem with a theory which merely accommodates the data by the addition of ad hoc auxiliary assumptions, without implying new observations apart from the ones it was designed to explain, is that its complexity is inflated without providing the potential for a compensatory increase in its explanatory power.Footnote 29

As philosopher Paul Herrick writes:

If two potential explanations equally explain the same phenomena, if we adopt the more complicated explanation, we postulate additional entities with no gain in explanatory power. As a result, we take on an increased risk of error (because we might be wrong about the extra entities) with no compensating gain in explanatory power. Unnecessary elements in a hypothesis thus increase the possibility of falsehood with no balancing gain in explanatory power. If we follow the principle of economy (Occam’s razor) and eliminate unnecessary explanatory elements, in the long run we will minimize the risk of error with no loss in explanatory power. (Herrick 2000)

All else being equal, increasing the complexity of a hypothesis in order to accommodate the evidence increases its description length, and correspondingly, decreases its prior (and, hence, its posterior probability). However, if the added assumptions lead to predictions that can be empirically tested, then the accommodating hypothesis may receive additional confirmatory support that will compensate for its increased complexity.Footnote 30
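The link between description length and prior probability can be sketched under the common MDL-style convention that a hypothesis’s prior is proportional to 2^−L(H), where L(H) is its description length in bits (this convention is assumed here for illustration):

```python
# Sketch of a description-length prior: each additional bit of ad hoc
# assumptions halves the prior probability of the hypothesis.

def mdl_prior(length_bits):
    """Unnormalized prior P(H) proportional to 2 ** -L(H)."""
    return 2.0 ** -length_bits

base = mdl_prior(20)      # a hypothesis describable in 20 bits
extended = mdl_prior(30)  # the same hypothesis plus 10 bits of ad hoc assumptions

print(extended / base)    # 2**-10, roughly a 1000-fold drop in prior
```

Ten bits of unmotivated extra assumptions cost the hypothesis roughly a thousand-fold drop in prior probability, which is why such additions must be compensated by testable predictions that the evidence can confirm.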

10 Implications for Science Education

Our examination of the scientific testability of supernatural hypotheses and, more generally, of the issue of whether or not science presupposes ON has direct implications for science education policies. If, as we have argued, the scientific enterprise does not require an a priori commitment to methodological or metaphysical presuppositions, in particular Mahner’s ‘no-supernature’ principle, then scientists and science educators should not reject supernatural explanations out of hand.Footnote 31 Rather, they should be rejected on the grounds that they fail to satisfy general criteria of good explanations in science. For instance, Evan Fales writes:

The reason that ID is not good science is not because it invokes a supernatural creator. ID is not good science because the empirical arguments it provides fail on their merits—e.g., because the criteria for irreducible (or “specified”) complexity are defective, question-begging, or not demonstrably applicable to any known organism. (Fales 2009)

Thus, ID should not be taught in science classes as an alternative to Darwinian evolution not because it may make reference to a supernatural designer,Footnote 32 but rather because its claims do not meet the standards of good explanations (see also Clark 2009; Laudan 1982). In agreement with previous authors (Martin 1994), we believe that teaching science students how to think critically and how to evaluate hypotheses according to the criteria of good scientific explanations (perhaps using the Bayesian and information-theoretic frameworks outlined above) is as important as teaching them what to think. Accordingly, except for the purpose of teaching critical thinking skills and the history of scientific thought, science educators need not waste their (and their students’) time considering discredited theories such as old earth creationism, phlogiston, disease as due to demonic possession, dowsing, psychic surgery, spiritualism, psi, flat earth theory, homeopathy, astrology, phrenology, or Ptolemaic astronomy. Again, rejection of these theories is not based on a priori methodological or metaphysical presuppositions of science, but on the grounds that they make predictions that conflict with the available evidence or they are unparsimonious.

While the approach of following the evidence wherever it leads, and evaluating claims according to the degree to which they fulfill the aforementioned explanatory criteria (and potentially others) is one that characterizes standard scientific practice, adopting this philosophy in science education makes matters both easier and more difficult for science teachers. On the one hand, given a commitment to an open, unbiased evaluation of claims, science educators need not answer vexing (or unanswerable) questions, such as whether a given claim is religious or supernatural, in order to ensure that the content of their classes conforms to legal restrictions on public education or to a set of alleged methodological or metaphysical presuppositions concerning the nature of science. For instance, if the topic of creationism/ID comes up in a science class, a teacher need not dismiss it on the grounds that it is a religious concept involving the supernatural; rather, it may be dismissed based on a dispassionate scientific evaluation of its explanatory merits. On the other hand, this approach clearly has the potential to offend students who come from religious backgrounds and may therefore impede science education. By defining the supernatural out of science, the position outlined by the NAS and NCSE allows teachers to put the matter of creationism/ID swiftly aside, which can be seen as an advantage. In our approach, by contrast, they can no longer avail themselves of this convenient shortcut.

In principle, biology teachers need not broach the subject of ID at all, just as physics teachers need not be bothered with astrology or dowsing. However, students often challenge their biology teachers with familiar creationist arguments, and indeed are encouraged to do so by the Discovery Institute.Footnote 33 For many students with a religious upbringing, the conflict between evolutionary science and religion is a live issue, and they are well acquainted with the theory of ID in any case. In such circumstances, it would be ill-advised to ignore the topic altogether. If students raise the subject of ID, teachers should point out why ID fails to meet the standards of good scientific explanations and why evolution by natural selection wins hands down (e.g., Coyne 2009; Young and Edis 2004). Naturally, this requires more effort and patience on the part of teachers than the policy recommended by the NCSE. However, we think it is the only intellectually responsible position. Not only is the view of science promoted by the NCSE untenable, but it provides succor to the suspicion, often voiced by ID advocates, that science is dogmatically attached to naturalism.

It is important that students appreciate the open-ended character of science. The current view promoted by the NCSE suggests that science closes up research avenues in advance and refuses to deal with certain types of explanations (Nagel 2008). As Hugh Gauch writes:

Science is worldview independent as regards its presuppositions and methods, but scientific evidence, or empirical evidence in general, can have worldview import … human presuppositions have no power to dictate or control reality … Precisely because science does not presuppose worldview-distinctive beliefs, such beliefs retain eligibility to become conclusions of science if admissible and relevant evidence is available. (Gauch 2009)

Thus, science holds no prejudice against religion, but due to its open-ended character it may arrive at conclusions that are profoundly at odds with religious worldviews. Science educators therefore confront the challenge of endorsing open and unbiased scientific inquiry while maintaining the receptivity of students to potentially controversial material that may pose a threat to their worldview. These educational challenges are not exclusive to our position, however. Even ‘accommodationist’ teachers who wish to assure students of the compatibility of science and religion must still confront the stark conflicts between scientific theories and fundamentalist religious teachings, e.g., evolution vs. the six-day creation as described in Genesis (e.g., see Matthews 2009a, b).

11 Conclusions

If the world were chaotic and displayed no regularities at all, or if supernatural (or other) entities were routinely tampering with or sabotaging our experiments and cognitive faculties, then science would be impossible. This relatively uncontroversial thesis is a far cry from the more radical claim (as promoted by Mahner) that scientists must presuppose or stipulate a priori that the world is in fact lawful and that supernatural entities do not exist (ON).

We have argued that science does not presuppose ON a priori, but may support ON a posteriori by mustering evidence for natural explanations and against supernatural claims. Conversely, in principle science could unearth evidence for supernaturalism (e.g., positive effects of intercessory prayer). Furthermore, we maintain that the discovery of such evidence would not necessarily undermine the reliability of science. In the end, whether the entities referred to by a hypothesis are labeled as ‘natural’ or ‘supernatural’ is irrelevant to its epistemic status. Science can evaluate supernatural hypotheses according to the same explanatory criteria used to evaluate any other factual claim. These criteria include explanatory power (goodness of fit to the evidence) and simplicity (data compression, unification, parsimony). Because of the consistent failure of supernaturalism, however, science encourages us to keep looking for natural explanations when dealing with strange phenomena before hastily considering supernatural ones. But provisionally adopting this methodological guideline does not mean that science presupposes naturalism. Similarly, we have argued that science does not presuppose any of the other alleged metaphysical principles of science cited by Mahner, but could confirm or (partly) disconfirm them on the basis of the evidence. We maintain that imposing artificial restrictions on science based on arbitrary definitions and classifications is antithetical to the goals of open and unbiased scientific inquiry. Instead, the scientific legitimacy of hypotheses should be judged according to the extent to which they satisfy the aforementioned explanatory criteria. It is important that students appreciate this open-ended character of science, even if doing so brings to light conclusions that conflict with their religious worldview. 
The positions adopted by the NAS and NCSE and the extensions proposed by Mahner violate this principle of open-endedness in science and artificially restrict its scope of investigation. Science does not rule things out by fiat, and students should not be led to believe that it does.