1 Introduction

Causal eliminativists (henceforth “eliminativists”) deny that reality has a causal structure. This is not just a rejection of causal primitivism—the view that the causal relation is an irreducible part of the furniture of reality—but the stronger position that denies that anything answering to our causal talk can be found in the world.Footnote 1 A version of this view, asserting that the “law of causality” is false and has no place in science, was famously defended by Russell (1913). As I am understanding it here, eliminativism is committed to the ontological claim that a singular causal relation is not actually instantiated.Footnote 2 I will, however, focus not on the positive case for eliminativism but on the most significant objection faced by its proponents. The contemporary consensus is that eliminativism is false.Footnote 3 This is in no small part because, as with other similar projects in philosophy, eliminativists about causation are vulnerable to an indispensability objection that supports ineliminativism—the philosophical view that consists in a rejection of eliminativism. I will focus on an argument that suggests that the causal relation is indispensable to the distinctions (which we must draw) between effective and ineffective agential strategies. I will conclude that, initial appearances notwithstanding, this argument does not give rise to a decisive objection against eliminativism.

In Sect. 2, I introduce eliminativism and show how some of the more straightforward pitfalls of the view can be avoided. Section 3 develops the indispensability objection to eliminativism. Sections 4 to 7 suggest a response to that objection. Sections 8 and 9 answer a potential challenge to this response.

2 Dialectic

One version of Russell’s central argument goes as follows: it is essential to our concept of causation that only some of an event’s antecedents count as causal.Footnote 4 Hartry Field puts the point as follows:

[I]t would be a big deal if we had to conclude that if \(c_1\) and \(c_2\) are both in the past light cone of e then there is no way of regarding one of them as any more a cause of e than the other: then Sam’s praying that the fire would go out would be no less a cause than Sara’s aiming the water-hose at it, and the notion of causation would lose its whole point. (Field, 2003, p. 439)

In other words, our concept of causation is discriminating. It follows that this causal concept could succeed in picking out a relation only if that relation holds uniquely among a select class of relata. Yet there are good reasons to think putatively causal relations cannot be selective in this way. The gist of the argument is that causes must determine their effects, that causes could do so only given a propitious background, that the insurmountable problem of selection means that these background requirements must also be causes, and that, therefore, given the nature of our physical world, innumerably many things will count as causes.Footnote 5 Our concept of causation therefore picks out a relation which cannot be instantiated because it is physically or even logically incoherent. Given an Aristotelian view according to which properties exist only when instantiated, there can be no such thing as a causal relation in our world or nearby possible worlds.Footnote 6 Of course, the eliminativist does not deny that events can be correlated in various complex ways that are manifested in patterns of conditional probabilities. Her contention is just that there is no privileged relation that exists above and beyond these various patterns of interdependence.

In making these claims, the eliminativist relies on a particular interpretation of fundamental science and in particular of physics. Following Russell, she contends that causal concepts are not required to formulate the fundamental laws of physics.Footnote 7 This interpretation removes what would be a devastating objection from physical indispensability. Nonetheless, other versions of the indispensability objection remain a threat on account of the central role causal notions play in our agential life, both in planning and in the evaluation of action. The canonical version of this idea was presented by Nancy Cartwright. It is worth emphasizing just how important this objection is: it is the reference point to which philosophers allude when they set eliminativism aside (Field, 2003, pp. 440–443; Hitchcock, 2007, p. 59; Hitchcock, 2013, p. 139; Price, 2007, pp. 284–288; Woodward, 2007, p. 73). The goal of the next section is to outline Cartwright’s argument. Subsequently, I show how an eliminativist might respond.

3 Causal decision theory

According to Cartwright “causal laws cannot be done away with, for they are needed to ground the distinction between effective strategies and ineffective ones” (Cartwright, 1979, p. 420). One of the incontrovertible facts of our world is that there are more and less effective ways of accomplishing certain ends. If it is raining, opening an umbrella is typically a more effective way of remaining dry than not doing so. Some people believe that avoiding vaccinations will be a more effective way of promoting their health over their lifetime than being vaccinated; if we disagree with them, it will be because we think that they are mistaken about the most effective course of action given their goals. Everyone, eliminativists included, should therefore accept:

Thesis: There is a distinction to be drawn between more and less effective strategies.

The goal of the argument from indispensability is to show that this commitment requires that a causal relation is actually instantiated and that some causal claims are true.

The critical stage of the ineliminativist’s case is a defense of a causal decision theory, which can ground the notion of effectiveness required by Thesis. The advocates of causal decision theory claim that it is to be favored because it delivers the right verdict in a number of cases that serve as counterexamples to rival analyses. For present purposes, I will assume that these verdicts are correct. Later, I will argue that there are non-causal decision theories available that are consistent with this (concessive) assumption. Consider, then, one of the cases motivating causal decision theory. Suppose that the holders of life insurance policies live longer than non-policyholders without the policy being the cause of their longevity. (Perhaps the kinds of people who purchase life insurance also visit the doctor more frequently, take regular exercise, eat a balanced diet, and have disposable income to invest in healthcare.) Purchasing such policies is not, then, an effective means of promoting a long life, even though those who have such policies do in fact live longer. In other words, it is not enough to make a strategy effective that adopting that strategy is merely correlated with the desired outcome (Cartwright, 1979, pp. 429–430). Examples like these give us a preliminary basis for moving towards a causal account of effectiveness.

Table 1 Insurance

The problem for eliminativists goes deeper still: it turns out that a strategy might increase the probability of a desired outcome relative to the cells of a partition while failing to do so overall (or vice versa). This phenomenon is known as “Simpson’s paradox”. To see how this might arise, suppose again that I am deciding whether to take out a life insurance policy, and imagine that I am either asthmatic or not. I am interested in whether I will make it to 85 years of age. Now suppose that this time I learn that the probability that a policyholder survives to 85 is just 0.58, but the probability that a non-policyholder survives to 85 is 0.76. This makes taking out the policy look foolhardy. But imagine now that, consistent with the aforementioned fact, my chances of survival are as set out in Table 1 below. (This distribution of chances is possible if policyholders are more likely to be asthmatics. In particular, in this case it must be that in the general population of non-policyholders, just one in five people is asthmatic, but four out of every five policyholders have asthma.) It now looks like I should be a policyholder after all. Call this case Insurance.Footnote 8
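To check that these figures hang together, here is one assignment of cell probabilities that satisfies all of the constraints just stated; the specific numbers are illustrative assumptions, and the entries in Table 1 may differ. Suppose the probability of surviving to 85 is 0.50 for asthmatic policyholders, 0.90 for non-asthmatic policyholders, 0.40 for asthmatic non-policyholders, and 0.85 for non-asthmatic non-policyholders. By the law of total probability:

$$\begin{aligned} Pr(85 \mid \text {policy})&= 0.8 \cdot 0.50 + 0.2 \cdot 0.90 = 0.58\\ Pr(85 \mid \text {no policy})&= 0.2 \cdot 0.40 + 0.8 \cdot 0.85 = 0.76 \end{aligned}$$

Within each cell of the asthma partition the policyholder fares better (0.50 > 0.40 and 0.90 > 0.85), yet policyholders fare worse overall; this reversal is exactly Simpson’s paradox.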

Simpson’s paradox shows us that an effective strategy can sometimes even be negatively correlated with the desired outcome. While policyholders are less likely to survive until 85 in Insurance this is not because the policy has a negative effect on their life prospects, but because it is an indicator of a health condition (asthma) that negatively affects survival. The partition in Table 1 is a more informative basis for decision than the initial probabilities because it holds fixed relevant background factors over which we exert no causal influence. Insurance thus suggests that we should not accept a decision theory according to which whether an action is rational is a function of how probable desired outcomes are, conditional on the agent performing that action. Theories of this kind are known as “Evidential Decision Theory” (EDT). Cases like Insurance motivate the replacement of EDT with a Causal Decision Theory (CDT).

To see the differences between these views, it will help to express them formally. To do so, we must introduce some notation. Let the \(S_i\) partition the space of possible states and take A to be some act whose rationality is in question; outcomes are then given by conjunctions of actions and states \((S_i \wedge A)\). Now let \(V(\_)\) be a valuation function assigning a value to action-state conjunctions, and let \(Cr(\_)\) be the agent’s credence function. Then according to EDT, an action’s decision-theoretic value function, \(Val_{\text {EDT}}(\_)\), is as follows:

$$\begin{aligned} Val_{\text {EDT}}(A) = \sum _{i=1}^n V(S_i \wedge A) \cdot Cr(S_i \mid A) \end{aligned}$$
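Applied to Insurance, EDT goes by the unpartitioned conditional probabilities. Suppose, purely for illustration, that surviving to 85 has value 1 and not surviving has value 0. Then:

$$\begin{aligned} Val_{\text {EDT}}(\text {policy})&= 1 \cdot 0.58 + 0 \cdot 0.42 = 0.58\\ Val_{\text {EDT}}(\text {no policy})&= 1 \cdot 0.76 + 0 \cdot 0.24 = 0.76 \end{aligned}$$

Since EDT favors the option with the higher value, it advises against the policy on the strength of its “news value”, even though, within each cell of the asthma partition, holding the policy makes survival more likely.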

An action is rational according to EDT to the extent that it realizes a higher decision-theoretic value than the alternatives. By contrast, proponents of CDT hold that to decide what we ought to do we should find the product of the value of an action-state pairing and our credence in the causal hypothesis that the action in question will causally bring about the relevant outcome. Standardly “causal hypothesis” here is cashed out in counterfactual terms, with “\(\square \!\!\rightarrow \)” standing for a counterfactual connective.Footnote 9 Let us call this version of the theory “CDT\(_{\text {O}}\)” (for orthodox CDT).Footnote 10 CDT\(_{\text {O}}\) evaluates actions as follows:

$$\begin{aligned} Val_{\text {CDT}_{\text {O}}}(A) = \sum _{i=1}^n V(S_i \wedge A) \cdot Cr(A \,\square \!\!\rightarrow S_i) \end{aligned}$$

As with EDT, a strategy’s effectiveness is a matter of its decision-theoretic value relative to other strategies.

While CDT\(_{\text {O}}\) requires that in evaluating what they should do agents must form credences about putatively causal claims, this doesn’t in itself require that there be any true causal propositions. One important limitation of the indispensability argument in the present context, then, is that causal notions only feature in causal decision theory within the scope of the credence function. While this observation does draw our attention to a potential weak point in the ineliminativist’s case, it need not wholly undermine their argument. Eliminativists maintain that all causal claims are false. So, by their own lights, eliminativists should accord all causal claims credence 0. The point here is not that the rational credence for all agents in causal statements is 0, nor is it that the truth of eliminativism would require all rational agents to assign credence 0 to causal claims (it might very well not, unless eliminativists believe that the evidence supporting eliminativism is wholly a priori). Rather, the claim is that eliminativists themselves are required to assign credence 0 to causal claims in light of their other doxastic commitments. But then they would no longer be in a position to draw a distinction between effective and ineffective strategies: for all actions and strategies would have the same decision-theoretic value according to their implementation of CDT\(_{\text {O}}\)—namely 0. But, the argument continues, it is not enough to satisfy Thesis that persons who are not eliminativists can continue to draw the distinction between effective and ineffective strategies. Eliminativists are committed to thinking that they themselves can successfully draw such distinctions. Since it seems that they cannot do so, the indispensability argument claims to falsify eliminativism.

4 Eliminativist alternatives to CDT\(_{\text {O}}\)

I will suggest that the eliminativist can respond by introducing a decision theory that, like CDT\(_{\text {O}}\), is consistent with the cases developed above to motivate the move away from EDT, but which has no causal commitments (cf. Hall, 2004, pp. 268–269; Hitchcock, 2013, pp. 138–139). Decision theories typically incorporate a dependency condition, offsetting the valuation of an outcome by some measure of an act’s tendency to realize this value. To avoid the argument of Sect. 3, the eliminativist must identify some credential state, Cr(X), capturing some kind of action-outcome dependency such that i) \(Val(A) = \sum _{i=1}^n V(S_i \wedge A) \cdot Cr(X)\) for a given act A and state \(S_i\), and ii) X comprises no causally committal claims or concepts. i) immediately imposes an important limitation. In virtue of a triviality result proved in Lewis (1986d) (see Sect. 6 below), Cr(X) cannot be a conditional credence. Instead, I will begin by examining whether the eliminativist can construe X as an acausal counterfactual.

Since this strategy may seem counterintuitive, it is worth proceeding carefully. In the next two paragraphs, I will assume for the sake of argument that a causal relation is instantiated. It turns out that, even granted this concession, the dependency condition on which CDT\(_{\text {O}}\) relies cannot be fully causal if the theory is to get the right results. To see this, consider preemption cases in which causal and counterfactual dependence come apart: if I fatally shoot a man who has just been injected with a deadly poison by an assassin, I cause his death, preempting the process initiated by the assassin. But since the poison would have killed him anyway, it is false that if I hadn’t shot the man, he wouldn’t have died. So we have a causal relationship, but no counterfactual dependence. A similar phenomenon occurs in cases of overdetermination.Footnote 11 Strikingly, where causation and counterfactual dependence come apart, our evaluative judgments seem to track counterfactual rather than causal dependencies (Hitchcock, 2013, pp. 138–139). If I am considering whether to strike the ball on the tee with my club, and if doing so would, in the process, deflect away Ada’s stroke, which would also have hit the ball, then if I swing, my swinging is the cause of the ball’s moving off the tee, but it is nonetheless false that had I not swung, the ball would not have moved. If S is the act of swinging and M is the event of the ball’s moving, then \(Cr(\lnot S \,\square \!\!\rightarrow M)\) should be near 1 (so long as I am rational). But if “\(C(\lnot S,M)\)” represents that \(\lnot S\) causes M then \(Cr( C(\lnot S,M))\) should be near 0. If all I care about is that the ball move from the tee and we are using a causal hypothesis as our dependency condition then \(Val(S) \approx V(S \wedge M) > Val(\lnot S) \approx 0\), whereas if we are using a counterfactual dependency condition \(Val(S) \approx V(S \wedge M) \approx Val(\lnot S) \). A fully causal decision theory would thus require that I swing, while a counterfactual decision theory implies that both swinging and refraining are rationally permissible.
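To make the contrast concrete, suppose, for illustration only, that the only outcome I value is the ball’s moving, so that \(V(S \wedge M) = V(\lnot S \wedge M) = 1\) and any outcome in which the ball stays put has value 0. Plugging in the credences just described:

$$\begin{aligned} \text {causal condition:}&\quad Val(S) \approx Cr(C(S,M)) \approx 1, \quad Val(\lnot S) \approx Cr(C(\lnot S,M)) \approx 0\\ \text {counterfactual condition:}&\quad Val(S) \approx Cr(S \,\square \!\!\rightarrow M) \approx 1, \quad Val(\lnot S) \approx Cr(\lnot S \,\square \!\!\rightarrow M) \approx 1 \end{aligned}$$

Only the counterfactual condition treats swinging and refraining as on a par.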

Fully causal decision theory seems, however, to err here since I can just as well get what I want if I do not swing, given that in this case the process that Ada will initiate no matter what I do would serve to move the ball. Moreover, it isn’t clear why I should care about preempting Ada: it is almost fetishistic to insist that my action should be the cause of what I care about happening when it would definitely have happened anyway.Footnote 12 (There may be further moral conditions if, for example, there are agent-relative duties not to act even when your refraining makes no consequential difference. How and how far these can be integrated into the decision-theoretic apparatus is a substantial question in moral philosophy that I will not take up here.) This suggests that it is no coincidence that CDT\(_{\text {O}}\) is implemented with a counterfactual dependency condition. For, if decision theory is to get the right results in these cases, then what it is rational to do must go by the counterfactual rather than the causal relations that hold (cf. Dorr, 2016, pp. 275–276). In this sense, initial appearances notwithstanding, CDT\(_{\text {O}}\) seems to be misnamed.

These remarks show how the eliminativist might respond to the indispensability argument: she can defend a counterfactual decision theory (CFDT), which, she can argue, provides the best interpretation of the formalism preferred by advocates of CDT\(_{\text {O}}\). Since CFDT reinterprets the proposal of CDT\(_{\text {O}}\) it is relatively straightforward to show that the theories agree in their recommendations. The cost of this maneuver is that establishing CFDT’s non-causal credentials becomes harder. The next three sections seek to show how this burden could nevertheless be discharged.

5 The entropy theory of counterfactuals

For CDT\(_{\text {O}}\) to return the right results in cases like Insurance, we must restrict the class of counterfactual judgments involved in decision, setting aside “backtracking counterfactuals”—claims of the form “\(A \,\square \!\!\rightarrow C\)” that rely on the following kind of reasoning: “if A then B would have had to have already been the case, in which instance C would have followed”.Footnote 13 To single out the relevant class of judgments, we interpret the counterfactual connective that appears in the formalization of CDT\(_{\text {O}}\) in such a way that, according to the official theory, only the right kind of counterfactual judgments are salient to the evaluation of action. The classic Lewis–Stalnaker semantics (Lewis, 1973; Stalnaker, 1968) provides the requisite interpretation, according to which (roughly speaking) “\(A \,\square \!\!\rightarrow B\)” is true just in case “B” is true in all those possible worlds which are such that i) “A” is true at that world, and ii) that world is among the worlds that are most similar to the actual world. When supplemented with an appropriate account of what it is for two worlds to be similar (Lewis, 1986c), this proposal seems to rule out backtracking interpretations.Footnote 14

Significantly, it turns out that the counterexamples to EDT can be avoided so long as our decision theory eschews a backtracking dependency condition. To see this, consider again Insurance. EDT seems to go wrong there because it looks irrational to opt for an action on the basis of its “news value”. We avoid this irrationality by offsetting our valuations of action-state conjunctions by a condition capturing the forward-looking connection between action and state, while screening off connections prior to the moment of action. To move beyond EDT we need to find a temporally asymmetric condition capturing this idea. The eliminativist claims that such asymmetries need not be causal; her task is now to show how.

One option is simply to adopt the semantics underpinning CDT\(_{\text {O}}\) lock, stock, and barrel. The ineliminativist, however, can argue against this approach. Counterfactuals can have backtracking readings in a number of contexts; the counterfactual connective that features in CDT\(_{\text {O}}\) is non-backtracking because it is intended as a “causal counterfactual” (Lewis, 1986a, p. 326). This requires imposing a number of restrictions on its interpretation. If CFDT is to track the recommendations of CDT\(_{\text {O}}\), its proponents should be able to give a non-arbitrary account of why they use “\(\square \!\!\rightarrow \)” to denote only non-backtracking counterfactuals. But, given her rejection of causation, the eliminativist cannot argue that she does so on the grounds that this allows “\(\square \!\!\rightarrow \)” to pick out a causal relation. The ineliminativist might then argue that there are no other resources available to eliminativists that could motivate similar restrictions and provide an adequate basis for a non-backtracking reading.

Such claims would be misguided. David Albert and Barry Loewer have developed a physicalistically respectable account of counterfactuals that does not assume any causal notions but supports a non-backtracking reading.Footnote 15 Their proposal draws on the way thermodynamic asymmetries are explained in statistical mechanics. One way to characterize the second law of thermodynamics is as a rule to the effect that the entropy of any energetically isolated system never decreases. The best explanation of this regularity currently relies on what is called the “Past Hypothesis” (PH), which posits a low entropy initial state of the universe at its inception. This hypothesis turns out to be well-supported by contemporary cosmology. We can use PH to formulate (PROB), a probability distribution over the possible initial conditions at some time t that are compatible with PH. If L is a proposition describing the dynamical laws, we can then calculate a statistical mechanical probability function over possible macrostates of the universe—\(Pr_{SM} (\_) = \text {PROB}(\_ \mid L \wedge \text {PH} )\). If A is a decision taken at some time t, \(Pr(\_)\) is a probability function capturing objective chances, and \(M_t\) is the macrostate of the universe at t, Loewer’s proposal is then that:Footnote 16

$$\begin{aligned} (\text {E})\quad A \,\square \!\!\rightarrow (Pr(B) = x) \text { is true iff } Pr_{SM} (B \mid A \wedge M_t ) = x \end{aligned}$$

Call this “the entropy theory of counterfactuals”.

While this is a departure from the Lewis–Stalnaker semantics, it turns out to return almost identical results in the cases where they both apply. According to Lewis, a world’s similarity to the actual world is a function of similarity in its dynamical laws (and the instances of “miracles”—violations of said laws) and in the size of the region within which the fundamental facts differ from those holding in the actual world. \(Pr_{SM}\) is conditional on the dynamical laws holding. States with positive statistical mechanical probability thus realize similarity in this regard to a maximum degree. (This is possible because the macrophysical dynamics turns out to be indeterministic even assuming determinism at the microphysical level.) Moreover, because PH is an asymmetric boundary condition, \(Pr_{SM}\) is temporally asymmetric in the sense that a decision at t makes a significant difference to the probability of macro events after t, but no difference to the probability of macro events before t. (E) thus avoids backtracking since for any macro event B that did not actually occur prior to the decision A taken at time t, \(Pr_{SM} (B \mid A \wedge M_t )=0\) and so for any such A, we should think that things could not have been macroscopically different before t, if A were (counterfactually) to come to pass.Footnote 17

6 Extending the theory

Unfortunately, (E) is insufficient as a general theory of counterfactuals since it is limited in application to a narrow range of cases. In the present section, I consider which extensions are required if it is to serve the eliminativist’s purposes. I will focus first on a restriction that limits the entropy theory to counterfactuals with a particular kind of antecedent and second on a limitation that restricts the theory’s applicability to counterfactuals with a particular kind of consequent. I will argue that the first limitation need not be relaxed, and that while things are different in the second case, there are viable extensions that are consistent with eliminativism.

Loewer’s account assumes that the antecedents of the relevant counterfactuals are decisions. Clearly though it is possible to assess counterfactuals with different kinds of antecedents. “Had Bradman hit the ball, he would not have been out.” True, although the antecedent does not describe a decision. We might therefore wonder whether the entropy-theoretic story can apply to counterfactuals like these. Loewer suggests that it can, although certain restrictions remain. Arguably, though, this is moot, since such extensions are not required for our purposes. This is important: Dorothy Edgington has argued that it may not be possible to formulate a theory of counterfactuals without causal resources (Edgington, 2004). Her argument is motivated by counterfactuals with antecedents that concern how things were in the past. Edgington’s analysis is plausible. But her proposal need not disturb the eliminativist. For the counterfactuals that feature in decision theory can feasibly be restricted to cases where the antecedent consists of a future decision. Since Edgington’s examples do not fall within this class, they do not undermine the entropy-theoretic account of the counterfactual connective as it is understood in CFDT. Cases like those discussed by Edgington might cause problems later for an eliminativist trying to construct a full theory of the world. However, so long as we can develop an eliminativistically acceptable decision theory, the prospects seem good for addressing such problems as they emerge by appealing to what would be an effective means for a hypothetical agent.

A different limitation arises because, in the first instance, Loewer’s account handles only counterfactuals whose consequent is a proposition corresponding to the state of affairs of some event B having probability x. But the counterfactuals that appear in CFDT are of the form: “\(A \,\square \!\!\rightarrow S_i\)”, where \(S_i\) is a non-probabilistic state of affairs. The account must therefore be expanded if it is to serve the eliminativist’s purposes. One way to generalize it would be to take the ideal credence in “\(A \,\square \!\!\rightarrow B\)” to be x, where Loewer’s account predicts the truth of “\(A \,\square \!\!\rightarrow (Pr(B) = x)\)” (cf. Kutach, 2002). Given (E) this proposal would make \(Cr(A \,\square \!\!\rightarrow B) = Pr_{SM} (B \mid A \wedge M_t )\). However, as mentioned in Sect. 4, this equality cannot hold generally. Lewis (1986d) proves an important triviality result which suggests that, given certain minimal conditions that are satisfied in the present case, there is no conditional connective “\(\rightarrow \)” such that \(Pr(A \rightarrow B) = Pr(B \mid A)\) for all A, B. In these cases \(Pr(B \mid A)\) seems to track the assertibility (following the terminology of Jackson, 1981) of the conditional “\(A \rightarrow B\)”. Lewis’s result thus suggests that probability of truth and assertibility may come apart for conditionals. The Principal Principle implies that credence should follow probability of truth, in which case it should track \(Pr(A \rightarrow B)\) rather than \(Pr(B \mid A)\) (Lewis, 1986e).
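For readers who want the shape of the result, here is a compressed version of Lewis’s argument, on the assumption that the equality \(Pr(A \rightarrow B) = Pr(B \mid A)\) is supposed to hold for every probability function in a class closed under conditionalization. By the law of total probability:

$$\begin{aligned} Pr(A \rightarrow B)&= Pr(A \rightarrow B \mid B) \cdot Pr(B) + Pr(A \rightarrow B \mid \lnot B) \cdot Pr(\lnot B)\\&= Pr(B \mid A \wedge B) \cdot Pr(B) + Pr(B \mid A \wedge \lnot B) \cdot Pr(\lnot B)\\&= 1 \cdot Pr(B) + 0 \cdot Pr(\lnot B) = Pr(B) \end{aligned}$$

The second line applies the supposed equality to the conditionalized functions \(Pr(\_ \mid B)\) and \(Pr(\_ \mid \lnot B)\). It would follow that \(Pr(B \mid A) = Pr(B)\) wherever the relevant terms are defined: every pair of compatible propositions would be probabilistically independent, which is absurd.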

To extend Loewer’s account to counterfactuals with non-probabilistic consequents, we must therefore amend (E). Lewis (1986d) shows that while \(Pr(A \,\square \!\!\rightarrow B)\) need not equal \( Pr(B \mid A)\), there is a way of defining a probability function that is equivalent to \(Pr(A \,\square \!\!\rightarrow B)\). To do so, we must introduce the technique of imaging. Imagine that we have a credence function defined over a space of possible worlds, such that \(Cr_t(w_i) \) reflects our degree of belief at t in the proposition that \(w_i\) is actual. Let “w(X)” denote a world where X holds. To conditionalize on X, we suppose that at \(t_{(i+1)}\) we know that X—that is, that our world makes X true. This rules out all worlds where X is false. Conditionalizing on X therefore implies that \(Cr_{t_{(i+1)}} (w(\lnot X) ) = 0\) for all \(w(\lnot X)\). We then redistribute whatever credence we had previously assigned to worlds where X is false. To do so, we divide \(\sum _w Cr_{t_i} ( w (\lnot X ) )\) among the worlds where X is true, in proportion to the credence each such world already had, producing our credence function for \(t_{(i+1)}\). The imaging rule is similar. Again, we suppose that we know that X and eliminate all worlds w such that \(w(\lnot X)\). This time though, instead of redistributing \(\sum _w Cr_{t_i} ( w (\lnot X ) )\) proportionally among all the remaining X worlds, we transfer the credence assigned to each \(\lnot X\) world (which we are now assigning credence 0) to those X worlds that are most similar to it.
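To make the difference between the two update rules concrete, here is a minimal sketch in Python. The four-world space, the prior credences, and the similarity facts are all hypothetical choices made for illustration; nothing in the sketch is specific to Lewis’s own examples.

```python
# Contrast conditionalization with Lewis-style imaging on a toy world space.

worlds = ["w1", "w2", "w3", "w4"]
cr = {"w1": 0.4, "w2": 0.3, "w3": 0.2, "w4": 0.1}  # prior credences, sum to 1
X = {"w1", "w3"}                                   # worlds where X is true

def conditionalize(cr, X):
    """Zero out the not-X worlds and redistribute their credence among the
    X worlds in proportion to the credence each X world already had."""
    total = sum(p for w, p in cr.items() if w in X)
    return {w: (p / total if w in X else 0.0) for w, p in cr.items()}

def image(cr, X, closest):
    """Move each not-X world's credence to the X world(s) most similar to it."""
    new = {w: (p if w in X else 0.0) for w, p in cr.items()}
    for w, p in cr.items():
        if w not in X:
            for target in closest[w]:
                new[target] += p / len(closest[w])
    return new

# Hypothetical similarity facts: w3 is the X-world closest to both w2 and w4.
closest = {"w2": ["w3"], "w4": ["w3"]}

print(conditionalize(cr, X))  # {'w1': 0.67, 'w3': 0.33, ...} (approximately)
print(image(cr, X, closest))  # {'w1': 0.40, 'w3': 0.60, ...}
```

The two rules disagree even on this tiny space: conditionalization preserves the ratios among the X worlds, while imaging lets the similarity facts decide where the displaced credence lands.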

This suggests a way to refine (E). Let’s introduce the notation “\(\square \!\!\Rightarrow \)” to stand for the counterfactual connective as it is understood by eliminativists and define “\({\overline{Pr}}^{A}(X \mid Y)\)” as the probability function that results from \(Pr(X \mid Y)\) after imaging on A. To perform this operation, we need to introduce a similarity metric, but we need not give precise details here. (The account in Lewis (1986c) is necessarily more finicky and complex, but this is because Lewis’s account is unlike ours in relying on the similarity metric to rule out backtracking.) We could then adopt:

$$\begin{aligned} (\text {E}^{\prime })\quad Cr(A \,\square \!\!\Rightarrow B) \text { is perfectly accurate iff } Cr(A \,\square \!\!\Rightarrow B) = {\overline{Pr}}_{SM}^{A} (B \mid M_t ) \end{aligned}$$

(Recall that t is the time of decision and \(M_t\) is the macrostate at t.) Replacing (E) with \(\text {(E}^{\prime })\) removes the second limitation on the entropy theory, allowing it to handle counterfactuals with non-probabilistic consequents.

While (E\(^{\prime }\)) bears a superficial resemblance to (E), the differences between them are important: where (E) is a truth-functional proposition, for instance, (E\(^{\prime }\)) is an equation of real-valued quantities. We can better understand (E\(^{\prime }\)) by introducing the notion of accuracy, understood as a graded analog of truth. Accuracy replaces truth as the success condition for variables representing some quantitative state of affairs. (E\(^{\prime }\)) specifies when a credence in a counterfactual of the relevant kind is perfectly accurate. Just as Lewis takes the Principal Principle to define the theoretical role of chance as the thing that guides credences (Lewis, 1994, p. 489), so (E\(^{\prime }\)) can be understood as picking out “\(A \,\square \!\!\Rightarrow B\)” as the proposition credence in which is accurate when it agrees with the relevant statistical mechanical probability function. Given the way we have defined \({\overline{Pr}}_{SM}^{A} (B\mid M_t)\), (E\(^{\prime }\)) requires that \(Cr(A \,\square \!\!\Rightarrow B)\) is accurate to the extent that it equals the probability of a Lewis–Stalnaker conditional. This is mathematically convenient, as we’ll see, but comes at the expense of semantic applicability: since counterfactuals seem to be unlike probability functions in being strictly bivalent—either true or false simpliciter—(E\(^{\prime }\)) doesn’t give us general semantic conditions for counterfactuals. It turns out, though, that there is broad consensus both that the probabilities of conditionals are interestingly patterned and that these probabilities do not mesh in obvious ways with their truth conditions (e.g. Kaufmann, 2022). Giving a semantics for conditional propositions that makes sense of their probabilities is thus a deep problem in the philosophy of conditionals with which eliminativists need not concern themselves. Their task was to motivate a non-backtracking reading of the counterfactual without resort to causal notions. They accomplished this task by showing that the non-backtracking reading corresponded to the predictions of a physicalistically respectable probability function that tracks entropic asymmetries, and defining “\(\square \!\!\Rightarrow \)” in terms of this probability function.

The eliminativist can now give their theory of decision:

$$\begin{aligned} Val_{\text {CFDT}}(A) = \sum _{i=1}^n V(S_i \wedge A) \cdot Cr(A \,\square \!\!\Rightarrow S_i) \end{aligned}$$

CFDT gives the eliminativist exactly what she wanted: a theory comprising non-backtracking counterfactuals that makes no resort to causal notions. Since it gives the counterfactual connective a non-backtracking reading, CFDT will agree with CDT\(_{\text {O}}\) in the cases that motivated causal over evidential decision theory. For the characteristic feature of such cases is that what it is rational to do seems to be responsive to a temporally asymmetric connection between action and outcome, and a non-backtracking counterfactual condition realizes just such a connection. Moreover, there is no arbitrariness in the resulting view, since the proposal has a compelling scientific rationale (Loewer, 2007, p. 320). Counterfactuals can be given a backtracking reading, but there is a class of non-backtracking counterfactuals corresponding to an important physical regularity; it is perfectly plausible that this class might be of particular relevance to rational decision. Thus, eliminativists contend that it is CFDT which grounds our judgments about the effectiveness of strategies.

Crucially, agents can apply CFDT without any special knowledge of statistical mechanics; from the perspective of a would-be user of the theory, “\(\square \!\!\Rightarrow \)” is simply a non-backtracking counterfactual. And while (E\(^{\prime }\)) provides a rationale for focusing on non-backtracking readings, it does not give rise to an epistemic standard of evaluation intended to supplement or supplant familiar evidentialist criteria. Rational agents should assign their credences in accordance with their evidence. (E\(^{\prime }\)) tells us when such credences would be accurate, but we may always receive misleading evidence that makes it rational to have inaccurate credences. What matters for deliberative purposes, though, is not that our credences are accurate, but that our choices align with the predictions of a decision theory that takes specific credences as inputs. (E\(^{\prime }\)) delineates the relevant class of inputs, without adjudicating their rationality.

Consider now a classic Newcomb case (Nozick, 1969). You are offered a choice between two boxes: your options are to pick either just the first box or to pick both boxes. An almost perfectly reliable predictor has predicted what choice you will make. Based on this prediction, the predictor has performed the following action: if they predicted that you will take just a single box, they placed $1 million in the first box and $1000 in the second box; if they predicted that you will take two boxes, they placed $0 in the first box and $1000 in the second box. Your valuation function is assumed to be directly correlated with your income in dollars. Causal decision theorists argue that the rational thing to do here is to choose both boxes: your decision can make no causal impact on how much money there is in the first box and no matter how much money there is in that box, you do better to take both boxes. CFDT should be able to return this verdict and indeed it does. Consider your choices: “O” or “T”, for choosing just one or two boxes respectively. There are two possible states of affairs: in \(S_1 \) the first box contains $1 million, in \(S_2\) it contains $0. Thus:

$$\begin{aligned} Val_{\text {CFDT}}(O)&= V(S_1 \wedge O) \cdot Cr(O \,\square \!\!\Rightarrow S_1) + V(S_2 \wedge O) \cdot Cr(O \,\square \!\!\Rightarrow S_2)\\ Val_{\text {CFDT}}(T)&= V(S_1 \wedge T) \cdot Cr(T \,\square \!\!\Rightarrow S_1) + V(S_2 \wedge T) \cdot Cr(T \,\square \!\!\Rightarrow S_2) \end{aligned}$$

You do not know whether \(S_1\) or \(S_2\) is the case. But given (E\(^{\prime }\)) the counterfactual judgments of relevance to CFDT are non-backtracking. Since the predictor’s action precedes your decision, the only counterfactual connections between their action and your decision are backtracking, and so the actual state is counterfactually independent of your choice in the relevant sense of “counterfactual”. The rational thing to do is therefore to set your credence in the dependency condition equal to your unconditional credence in \(S_1\) or \(S_2\). Suppose you think that there is a 99% likelihood that \(S_2\) obtains and just a 1% likelihood that \(S_1\) is actual (this is a plausible assessment given the description of the case). Then, CFDT predicts:

$$\begin{aligned}Val_{\text {CFDT}}(O) = 1,000,000\cdot 0.01 + 0 \cdot 0.99=10,000\\Val_{\text {CFDT}}(T) = 1,001,000 \cdot 0.01 + 1000\cdot 0.99 = 11,000 \end{aligned}$$

Since \(Val_{\text {CFDT}}(T)>Val_{\text {CFDT}}(O)\), CFDT recommends taking both boxes.
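The computation generalizes straightforwardly. Here is a small sketch of the CFDT value function as defined above, with the Newcomb numbers plugged in; the function and variable names are mine, not part of the theory.

```python
# Val_CFDT(A) = sum over states of V(S & A) * Cr(A []-> S).

def val_cfdt(values, cf_credences):
    """values[s]: V(S & A); cf_credences[s]: Cr(A []-> S) for each state s."""
    return sum(values[s] * cf_credences[s] for s in values)

# States: S1 = $1M in the first box, S2 = $0 in the first box. Since the
# relevant counterfactuals are non-backtracking, the states are
# counterfactually independent of the choice, so both acts use the same
# credences.
cred = {"S1": 0.01, "S2": 0.99}

one_box = val_cfdt({"S1": 1_000_000, "S2": 0}, cred)     # 10000.0
two_box = val_cfdt({"S1": 1_001_000, "S2": 1000}, cred)  # 11000.0
assert two_box > one_box  # CFDT recommends taking both boxes
```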

7 Non-causal counterfactuals

While it seems plausible that causation and counterfactual dependence sometimes come apart, the ineliminativist can argue that a’s counterfactually depending on b is sufficient for b to count as a cause of a.Footnote 18 In order to settle the issue, we need to say something about what it would take for a relation to count as causal. In Sect. 2, I argued for one requirement—that causation must be discriminating in the sense that not all of an event’s temporal antecedents can count as causes—and assumed a second—that causes must determine their effects. In what follows, I’ll give two arguments against the thesis that counterfactual dependence suffices for causation. The first argument is straightforward but relies on the further assumption that causation must be transitive, while the second argument dispenses with this assumption.

Consider then a family of examples developed by Ned Hall, cases of “double prevention” (Hall, 2004, pp. 241–248). These are cases in which one event forestalls a second that would in turn have blocked a third event that actually came to pass. In Hall’s version of the case, Suzy is on a bombing mission, but an enemy is sure to shoot her out of the sky, preventing her from bombing the intended target (B). But Billy, who is piloting a different airplane, intervenes (I), attacking the enemy and preventing him from intercepting Suzy. Suzy’s bombing counterfactually depends on Billy’s intervention, since had Billy not intervened, the enemy would have frustrated her mission. That is: \(\lnot I \,\square \!\!\rightarrow \lnot B\). If counterfactual dependence suffices for causation, then I causes B. In and of itself, this result seems perfectly plausible. Consider though the thesis that causation is transitive in the sense that if x causes y and y causes z, then x ipso facto counts as a cause of z. Ordinary reasoning about causation frequently makes tacit appeal to this thesis: Johnson became President because Lincoln died, Lincoln died because Booth shot him, therefore Booth’s shooting Lincoln caused Johnson to become President. Yet if counterfactual dependence suffices for causation and causation is transitive, then absurd conclusions follow. Imagine the enemy is commanded by his superior to intercept Suzy once her incursion is detected (C). Had this command not been issued Billy would not have shot down the enemy, so \(\lnot C \,\square \!\!\rightarrow \lnot I\). If counterfactual dependence suffices for causation, C causes I and by the reasoning above I causes B. Thus by transitivity, C causes B. But this, Hall shows, is absurd: C initiates a self-canceling threat to B, for if no command is given then the enemy will not begin his sortie and so cannot obstruct Suzy, so \(\lnot C \,\square \!\!\rightarrow B\). C should not therefore count as a cause. Thus, if causation is transitive, then by reductio counterfactual dependence cannot suffice for causation.

Still the suspicion remains that there are at least some cases where counterfactual dependence is enough for causation. Let’s relax then the assumption that causation must be transitive and focus on the two other requirements I mentioned above. I suggested first that causes must discriminate. This claim is supported by an example of Field’s which I discussed in Sect. 2. The intuition is that, conceptually speaking, causation earns its keep as a way of distinguishing among an event’s temporal antecedents; some of these antecedents make a special contribution to the event’s coming to pass. If our concept of causation were indiscriminate then it would serve no useful function, but it is a necessary condition of concept-possession that the relevant concept should play some cognitive role. Hence causation must be discriminating.

Eliminativists now argue, however, that counterfactual dependence is not discriminating and so cannot be causal. The argument for this claim, developed in Latham (1987), is that (even assuming determinism) it is not possible to give a full specification of the counterfactual conditions of some event e except by specifying every parameter in some slice of the back light cone of e. In other words, for any region in e’s back light cone there are possible values that would block e’s occurrence. e therefore depends counterfactually on every region in any given slice of its back light cone not having any of the e-inhibiting values.Footnote 19 On this basis, the eliminativist can claim that counterfactual dependence and temporal precedence are extensionally equivalent—any event in the back light cone of e is equally a temporal antecedent and a potential counterfactual condition of e. It follows that this relation is not discriminating in the requisite sense. Thus, counterfactual dependence does not suffice for causal dependence because it is a weaker relation in virtue of being indiscriminate.

In giving this response, the eliminativist must proceed with caution. For if she is to recover effectiveness judgments she must argue that an agent’s available actions can differ in terms of their counterfactual connection with desired outcomes. But this seems, on the face of it, to require that counterfactual dependence can be discriminating after all. That is, we might think that discrimination is too stringent a requirement to impose on a causal relation since if eliminativist worlds are wholly undiscriminating, then there could be no notion of differential effectiveness in such worlds. Worries of this ilk misunderstand what is meant by “discrimination”. To count as undiscriminating, it is not required that a relation R should hold equally between any relata, but only that for any \(x, y\), if x precedes y then \(R(x,y)\). The argument of the previous paragraph shows that this is true of counterfactual dependence. That conclusion though is compatible with thinking that eliminativist worlds could have sufficient structure to support differential judgments of effectiveness if the counterfactual connections between distinct events can vary in their strength. Significantly, the ineliminativist cannot make an analogous move because causation is a categorical rather than a graded relation: that is, it does not come in degrees but is either “on” or “off” (cf. Kaiserman, 2016 and Sartorio, 2020 here). This helps to make sense of why the indispensability argument does not succeed: differential judgments of effectiveness require only that our world be discriminating in a graded sense, while causal structure requires something stronger, namely that the world is discriminating in a categorical sense.

So far, I have said nothing about the second requirement, that causes should determine. The rationale here is straightforward: it is implicit in much of our everyday causal reasoning that if a set of causes \(\{c_1,\ldots ,c_n\}\) does not suffice to bring about an event e, then there must be some further cause of e—\(c_{n+1}\)—missing from our initial set.Footnote 20 Although we do not need to rely on the assumption that causes determine to show that counterfactual dependence is not causal, this requirement is nonetheless worth mentioning because it helps us to understand how the resources required for CFDT to have application fall short of those needed for the world to have causal structure. The eliminativist’s idea is that discrimination and determination impose conflicting requirements on a relation: a set of discriminating conditions must be exclusive, but the determinants of an event must include all potential defeaters. Not only are judgments of strategy effectiveness not discriminating in a categorical sense; it is also not in general true that effective strategies must necessitate the desired outcome. The structure needed for decision theory to get a grip is thus doubly weaker than that required for the truth of causal claims.

An objector might worry that this picture demands too much of a relation if it is to count as causal. Determination and discrimination seem to me to be relatively minimal requirements; still, it is worth seeing what the prospects for ineliminativism look like if we relax these demands. One way to do so is to adopt a sort of functionalism. According to this kind of view, the success of causal talk in explanation and in decision requires that such talk must be tracking something; whatever that thing is will count as causal (cf. Hall, 2004, p. 256). A possible development of this idea is suggested by the “perspectivalist” view, exemplified in Huw Price’s work. Price’s idea is that causation is a conceptual upshot of our agential perspective. He proposes that c causes e just in case \(Pr(e \mid c) > Pr(e) \), where Pr is a probability function calculated from an agent’s point of view. In making such calculations, an agent supposes that any action available to them is uncaused, originating in their own free decision (Price, 1991, 2007, p. 281; Menzies & Price, 1993, pp. 190–191).

From the eliminativist perspective, however, perspectivalism looks like a kind of fictionalism in the sense that perspectivalist judgments are embedded under a condition that we know to be false—namely the supposition that our own actions are independent of whatever diachronic structure subsumes or determines other events. To eliminativists, this is no surprise: notwithstanding their success, our agential and explanatory practices often seem to rest on unrealistic idealizations that belie the inference from the relative success of some practice to the conclusion that the world works the way the relevant practice represents it as working. This counts strongly against perspectivalist and other quasi-functionalist views. More generally, eliminativists see our causal concept as tainted: they suspect that causal talk is an anthropocentric projection, a legacy of our proto-scientific image of the world. Reasons of conceptual hygiene thus favor the elimination of causal vocabulary.

8 Indispensability revived

Is this sufficient to answer the indispensability argument? It might seem so. But there is an important objection that suggests otherwise: Andy Egan has introduced several cases that give rise to a new version of the argument. Egan presents his cases as objections to CDT\(_{\text {O}}\); however, in light of the discussion above, they would seem to be just as accurately described as counterexamples to CFDT (Edgington, 2011, p. 80). This suggests a way to resurrect the indispensability argument: so far, I have argued that causal notions are not indispensable to a theory of decision because CFDT is both non-causal and explanatorily adequate. Egan’s cases threaten to falsify the second conjunct by showing that a fully causal decision theory is required. If my eliminativist response is to succeed, it must be able to head off such cases.

Consider one of Egan’s examples: Paul must choose whether or not to press a button that would kill all psychopaths. Paul has a low credence in the proposition that he is a psychopath, and according to his valuation function, it would certainly be better to eliminate psychopathic persons. But Paul believes that no one who was not a psychopath would press the button, and he vastly prefers living in a world of psychopaths to being eliminated himself. Paul, we are inclined to think, should not press the button (Egan, 2007, p. 97). Call this case Psychopathy. A plausible specification of Paul’s valuation of outcomes is set out in Table 2, where \(S_1\) is the state in which Paul is a psychopath and \(S_2\) the state in which he is not. Let’s stipulate that Paul’s counterfactual credences are as follows: \(Cr(P \,\square \!\!\Rightarrow S_1) < 0.5\), \(Cr(P \,\square \!\!\Rightarrow S_2) > 0.5\), \(Cr(\lnot P \,\square \!\!\Rightarrow S_1) < 0.5\), \(Cr(\lnot P \,\square \!\!\Rightarrow S_2) > 0.5\). These stipulations are justified by the description of the case: Paul does not think he is a psychopath, so presumably he should not think he would become one if he presses the button. Hence all the counterfactuals with \(S_1\) in the consequent receive a credence less than 0.5 and all the counterfactuals with \(S_2\) in the consequent receive a credence greater than 0.5. It follows that \(10> Val_{\text {CFDT}}(P) > -20 \) but that \( -36> Val_{\text {CFDT}}(\lnot P) > -44 \). So \( Val_{\text {CFDT}}(P) > Val_{\text {CFDT}}(\lnot P) \) and so CFDT recommends pressing the button. But that would seem to be the wrong result; there is good reason to think that pressing the button would bring about Paul’s death, which is, by his own lights, highly undesirable.
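To see how these bounds arise, consider one assignment of payoffs consistent with them; the numbers are illustrative assumptions, and the entries in Table 2 may differ: \(V(P \wedge S_1) = -50\), \(V(P \wedge S_2) = 10\), \(V(\lnot P \wedge S_1) = -28\), \(V(\lnot P \wedge S_2) = -44\). Writing \(x = Cr(P \,\square \!\!\Rightarrow S_1)\) and \(y = Cr(\lnot P \,\square \!\!\Rightarrow S_1)\), with \(x, y \in (0, 0.5)\):

$$\begin{aligned} Val_{\text {CFDT}}(P)&= -50x + 10(1-x) = 10 - 60x \in (-20, 10)\\ Val_{\text {CFDT}}(\lnot P)&= -28y - 44(1-y) = -44 + 16y \in (-44, -36) \end{aligned}$$

Whatever the precise credences, then, \(Val_{\text {CFDT}}(P) > Val_{\text {CFDT}}(\lnot P)\).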

Importantly, thinking that Paul would die requires a backtracking reading of this counterfactual. For, otherwise, since Paul doesn’t initially believe he is a psychopath, and since he shouldn’t believe that merely pressing a button can induce psychopathy, he shouldn’t believe that pressing the button would kill him. Egan’s idea seems to be that our pre-theoretical intuitions support a backtracking reading of this counterfactual and thus that we pretheoretically judge that Paul ought not to press the button. CFDT, which places a moratorium on backtracking reasoning, must deliver the verdict that Paul would not die were he to press the button, and thus endorses button-pressing. But this conflicts with our pre-theoretical intuitions, which are, the thought goes, strong enough to support the conclusion that CFDT thereby delivers the wrong verdict.

Table 2 Psychopathy

Edgington suggests that we learn from examples like these that:

[T]he ban on backtracking is a bad idea. We want all the evidence we can get about what the causal situation will be, on the assumption that I do [act] A. In the counterexamples to [EDT] what the backtracking evidence reveals is that there is no causal connection...In the counterexamples to Counterfactual Decision Theory, the backtracking evidence reveals that there is a causal connection. All the examples reinforce that causation is central...(Edgington, 2011, pp. 83–84)

One solution is to move to a fully causal decision theory. Instead of credences in counterfactuals we could rely on credences in propositions like: “doing A will cause outcome O” (Edgington, 2011, p. 84). Let’s call the resulting view “Edgington Causal Decision Theory” (CDT\(_\mathrm{(E)}\)).Footnote 21 Recall that we are using “C(x, y)” to indicate that x caused y. CDT\(_\mathrm{(E)}\) is then the view that:

$$\begin{aligned} {Val_{\text {CDT}_\mathrm{(E)}}}(A) = \sum _{i=1}^n V(S_i \wedge A) \cdot Cr(C(A,S_i) \mid A) \end{aligned}$$

CDT\(_\mathrm{(E)}\) not only avoids problems in Psychopathy but can also address the problem posed by Insurance. Since having insurance doesn’t cause one to live a shorter life, but is merely correlated with a condition that does, conditional on being a policyholder one should have a low credence in the proposition that being a policyholder would reduce one’s life expectancy. By contrast, since the policy improves the chances of a long life for asthmatics and non-asthmatics alike, conditional on being a policyholder one should think it likely that being a policyholder contributes to one’s longevity. Here, then, we have a way to augment the indispensability argument: Egan’s examples show that distinguishing between effective and ineffective strategies requires CDT\(_\mathrm{(E)}\), rather than CFDT or CDT\(_{\text {O}}\). CDT\(_\mathrm{(E)}\), though, would require eliminativists to assign credences to causal claims that are inconsistent with their view. So eliminativism seems to be falsified.

9 Defending CFDT

One option for the eliminativist would be to reject Egan’s account of cases like Psychopathy: Paul, she might say, should press the button after all. In support of this, the eliminativist could argue that the case is really a twist on a classic Newcomb problem: if it is rational to take both boxes in Newcomb cases, then it should also be rational for Paul to press the button. This is not, though, the best route for the eliminativist. While Psychopathy bears some resemblance to a Newcomb case it presents a different kind of problem. In typical Newcomb cases, the choice preferred by causal decision theorists is supported by dominance reasoning: no matter how the world is, it is better to take both boxes. Things are otherwise in Psychopathy: whether P or \(\lnot P\) is to be preferred depends on whether \(S_1\) or \(S_2\) obtains (cf. Joyce, 2012, pp. 129–130).

Structurally speaking, Newcomb cases provide examples of accidental correlation: some background condition B makes one of the agent’s options A and an outcome O covariant, such that the agent’s conditional credence \(Cr(O \mid A)\) should be relatively high, but B in fact screens off any correlation between O and A. Suppose that O is undesirable; nonetheless, intuitively, this need not give you a reason against A. Egan’s case adds a twist to this structure: B now induces a correlation between A and C, where C is some condition that, so long as A is performed, in turn makes O more probable. In this structure, the correlation between O and A is not screened off by B. Put differently, the problem in Psychopathy isn’t that Paul is concerned about pressing the button because that would suggest he might be a psychopath; it is that pressing would suggest that he might be a psychopath and so might kill him. Now the badness of O does seem to provide a reason against A.
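One way to make the structural contrast precise, on the simplifying assumption that in Psychopathy the bad outcome obtains just in case the agent both presses and is a psychopath (\(O \equiv A \wedge C\)):

$$\begin{aligned} \text {Newcomb:}&\quad Cr(O \mid A \wedge B) = Cr(O \mid B)\\ \text {Egan:}&\quad Cr(O \mid A \wedge B) = Cr(C \mid A \wedge B) > 0 = Cr(O \mid \lnot A \wedge B) \end{aligned}$$

In the Newcomb structure, conditioning on B screens off O from A; in the Egan structure it does not, which is why the badness of O now tells against A.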

The eliminativist should instead offer a more concessive response to Egan’s cases. Decision-theoretic rationality, she can claim, is equivocal: there are conceptions that recommend against pressing the button and other conceptions that recommend pressing. Our evaluative practices do not discriminate between these conceptions. To respond to the indispensability objection, eliminativists should be able to show that they can recover a version of the distinction between effective and ineffective strategies that matches the shape of our practices. But this does not require that they recover every version of this distinction. Egan’s cases are thus to be corralled to a conception of effectiveness distinct from that which the eliminativist is trying to justify.

In pursuing this approach, the eliminativist can follow the analysis of Egan’s cases given in Joyce (2012). Joyce’s response starts with the observation that the more your decision favors a certain course of action in this case, the more you should come to believe that it would not be the best thing to do. The choice of P, for example, provides evidence that were the agent to perform P, it would have been better to do something other than P. Since you are an agent facing a decision problem you must think of yourself as responsive to whatever considerations make an option better than its rivals and so your preference for P gives you some reason to revise that very preference. This kind of instability is known as a failure of ratifiability: an action is not ratifiable when choosing that action provides evidence that an alternative option would be the better choice. Neither option in Psychopathy is ratifiable (Joyce, 2012, p. 125).Footnote 22 It is controversial how and why failures of ratifiability should count against performing a certain action, but one important point seems to be that such failures convey significant information that hadn’t previously featured in your evaluation of the options (Joyce, 2012, pp. 138–142; Arntzenius, 2008, pp. 278–280, 290–295).

Joyce (2012) suggests that CDT\(_{\text {O}}\) (and so by extension CFDT) permits agents to act only when they are using evaluations that respond to all the relevant information that can be costlessly acquired. Certainly, there does seem to be a sense in which someone acts irrationally if they choose to act while ignoring information that was easily available and could have been assessed at no cost (when, for example, there is no urgent time pressure). That’s not to deny that there is perhaps another sense in which you do something that’s rational when you act on your initial, evidentially unsupplemented, evaluation. But the eliminativist can reasonably deny that this is the version of the effectiveness-ineffectiveness distinction that she is trying to ground. The version to which she is committed is a version which carries with it informational requirements.

This last claim may seem somewhat weak until we can see how CFDT could satisfy these informational requirements. For if neither P nor \(\lnot P\) is ratifiable then it’s hard to see what an agent could do that would be better, and it doesn’t seem plausible that there could be decision-theoretic requirements that are in principle impossible to satisfy. Return then to Psychopathy. Joyce’s suggestion is that we should imagine the agent performing a series of sequential evaluations, updating her credences at each stage. This process can be iterated indefinitely until it reaches a fixed point at which subsequent iterations will not induce any further evaluational changes. A unique equilibrium point of this kind exists in Egan’s cases (Joyce, 2012, p. 133). Let \(t_e\) denote the time at which the equilibrium point is reached; CFDT can then allow that agents should perform whichever option has the highest valuation according to \(Val_{\text {CFDT}^{t_e}}(\_)\).

Paul must choose whether to press the button. Since any changes to his credence in his performing either P or \(\lnot P\) will affect his valuation thereof, it must be that his credence at \(t_e\) that he will perform P is equal to his conditional credence at \(t_e\) that he will perform P given the decision-theoretic evaluation of P at \(t_e\)—that is, his credence that he will perform P is unchanged by the information disclosed by the value of \(Val_{\text {CFDT}^{t_e}}(P)\). The same equation must also hold in the case of \(\lnot P\). It turns out that this is possible at the equilibrium point only if he evaluates both P and \(\lnot P\) equally. Both are therefore rationally permissible. We now have an explanation of how the informational requirements on CFDT can be satisfied even when neither alternative is ratifiable. The equilibrium valuation incorporates all the information that is revealed by the failures of ratifiability. Even if you act in accordance with the valuations arrived at from the equilibrium point, you can anticipate regret once you irrevocably commit to an alternative (since you should then increase your credence in your performing that act to 1). Hence neither option is ratifiable. But once you have reached a decision at equilibrium, the information revealed by failures of ratifiability has already been incorporated into your evaluation. To further revise your decision on these grounds would be a kind of “double counting” (Joyce, 2012, pp. 135–142).
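A toy sketch may help to show how such a sequence of re-evaluations can settle into an equilibrium. The payoffs below reuse the illustrative numbers from the worked example in Sect. 8, and the rule linking Paul’s credence p that he will press to his credence in \(S_1\) is a hypothetical stand-in for his evidential updating; none of this is Joyce’s own model.

```python
# Joyce-style sequential re-evaluation iterated to a fixed point.

V = {("P", "S1"): -50, ("P", "S2"): 10,     # press
     ("NP", "S1"): -28, ("NP", "S2"): -44}  # don't press

def val(act, c1):
    """Val(A) = V(S1 & A)*Cr(A []-> S1) + V(S2 & A)*Cr(A []-> S2)."""
    return V[(act, "S1")] * c1 + V[(act, "S2")] * (1 - c1)

def equilibrium(p=0.5, rate=0.0005, tol=1e-9, max_iter=100_000):
    """Nudge the credence of pressing toward the better-looking act until
    re-evaluation induces no further change."""
    for _ in range(max_iter):
        c1 = p                              # toy updating rule: Cr(S1) = p
        gap = val("P", c1) - val("NP", c1)  # how much better pressing looks
        new_p = min(1.0, max(0.0, p + rate * gap))
        if abs(new_p - p) < tol:
            break
        p = new_p
    return p

p_star = equilibrium()
print(p_star, val("P", p_star), val("NP", p_star))
# At the fixed point (p* ~ 0.71 here) the two valuations coincide (~ -32.6),
# so both options come out rationally permissible, as in Joyce's analysis.
```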

The eliminativist can now finalize her response. CFDT is not subject to Egan-style counterexamples for two reasons: first, because these pertain to a different conception of effectiveness from that which she is concerned to explicate, and second, because (arguably) the ineliminativist was wrong about what a decision theory should say in such cases. This also explains why eliminativists are not committed to Edgington’s genuinely causal decision theory: CFDT and CDT\(_\mathrm{(E)}\) deliver different verdicts about Egan’s cases. But there is at least an argument to the effect that it is CFDT that gets things right. CFDT and CDT\(_\mathrm{(E)}\) are then, at the very worst, on a par. But in that case, CDT\(_\mathrm{(E)}\) is not indispensable, since there is a perfectly adequate alternative.