Introduction

Probability abounds in the natural and social sciences. Yet, science strives for objectivity. Scientists are not pleased when told that probability is just opinion and there is no more sense to it. They are prone to believe in objective probabilities or chances. This is an essay about how to understand them.

Indeed, it is my first serious attempt in English\(^{1}\) to come to terms with the notion of chance or objective probability. I cannot help feeling that this is a presumptuous enterprise. Many great minds have penetrated the topic. Each feasible position has been ably defended. No philosophically relevant theorem remains to be discovered. What else should there be to say? Yet, the issue is not settled. Even though all pieces are on the table, no one missing, how to compose the jigsaw puzzle is still not entirely clear. Philosophical uneasiness continues. Everybody has to try anew to put the puzzle together. So, here is my attempt to do so.

Let me lay my cards on the table right away. An event, or a state of affairs, is chancy iff it is partially determined by its past, to some specific degree; some might call this an Aristotelian conception of chance. Chance laws, then, generalize over such singular partial determinations. Likewise, a state of affairs is necessary (in the sense not of metaphysical, but of natural necessity) iff it is fully determined (i.e., sufficiently caused) by its past.\(^{2}\) Deterministic laws generalize over such singular full determinations, so that we may conversely say that a state of affairs is necessary iff it is entailed by the laws and its past. This parallel will become important later on.

There is determination. There are deterministic laws, or so we believed at least for ages. And there are chance laws and hence chancy events, as modern physics tells us. Objective probabilities may thus be conceived as single-case propensities of a radical kind: propensities of the entire world as it has developed up to now to realize not only this or that current state of affairs, but in effect this or that entire future evolution.\(^{3}\) The localization of propensities is a secondary, though, of course, important issue. The primary and really vexed issue is how at all to understand partial and full determination.

Given that there is partial determination, subjectivism or eliminativism concerning objective probabilities, a position associated with Bruno de Finetti and his positivistic predilections, is out of place. (Still, the most basic truths lie in his insights, and this essay will end up as hardly more than a projectivistic reinterpretation of de Finetti’s views.)

Reductionism concerning objective probabilities seems ill-guided, too, whether in the analytical form trying to define chances in non-probabilistic terms as, e.g. (hypothetical) frequentism does or in the weaker ontological form as displayed in the doctrine of Humean Supervenience championed by David Lewis. Indeed, the failure of Humean Supervenience is nowhere clearer, I find, than in the case of chances.

Hence, realism without reductionism is perhaps what we should be heading for. I am indeed attracted by the picture as sketched, e.g., by Black (1998, pp. 371f.) who argues against Lewis that the world is more than “a vast mosaic of local matters of particular fact” (Lewis 1986, p. ix), more, as it were, than a pattern of inert colors; it is also a pattern of pushes, hard deterministic as well as soft chancy ones. Maybe we should accept realism about primitive laws, dispositions, capacities, propensities, etc. (or their categorical bases), as has been vigorously defended by Armstrong (1983, in particular Chapter 9, and 1997, Chapter 15) and in quite a different way by Cartwright (1989).

Yet I share the widespread epistemological concerns about Australian realism that are as old as Hume’s criticism of necessary connexion or determination. What we need to get explained, at least, is the theoretical web within which chances get their role to play.\(^{4}\) However, the explanations given by propensity theorists are generally not in good shape. And so I appear to be torn by my various dissatisfactions, finding no place to rest.

No other than David Hume has suggested a position possibly comforting everyone, with his doctrine that causation is an idea of reflexion and that necessary connexion is nothing but determination or customary transition in thought. The doctrine has received its most extraordinary shape in Kant’s transcendental idealism. Nowadays, it is rather called projectivism and defended by Simon Blackburn under the label ‘quasi-realism’ and summarized thus:

Suppose we honor the first great projectivist by calling ‘Humean Projection’ the mechanism whereby what starts life as a non-descriptive psychological state ends up expressed, thought about, and considered in propositional form. Then there is not only the interest of knowing how far Humean Projection gets us. There is also a problem generated even if the mechanism gets us everywhere we could want. If truth, knowledge, and the rest are a proper upshot of Humean Projection, where is it legitimate to invoke that mechanism? Perhaps everywhere, drawing us to idealism, or nowhere, or just somewhere, such as the theory of value or modality. (1993, p. 5)

This ‘mechanism’, I shall argue, is operative at least in the case of chance and natural necessity. It is thus no accident that I am referring twice to Hume within one page. The move from Humean Supervenience to Humean Projection will be our move in this paper. (Indeed, I find that the latter is much better anchored in Hume’s writings than the former.)

The crux of projectivism, though, is that it may sound attractive as a general strategy, while remaining poor in constructive detail. Thus it is not likely to satisfy the probability community. Indeed, if one looks at recent surveys such as Gillies (2000), projectivism does not figure there under its own or any other name. This is the point where I hope to add a bit to the present discussion.\(^{5}\)

As the reader may have guessed, this paper is largely an argument with David Lewis’ philosophy of probability. This has a personal motive. I well recall how enthralled I was by Lewis (1980) – and how bewildered by the continuation in Lewis (1986, Introduction and Postscripts to 1980, and 1994). I just had to come to grips with his work. There is also a substantial reason. Lewis’ account is peculiarly ambiguous. He starts by inquiring into the epistemology of chance and ends up investigating its ontological grounds. Thus, I find it most instructive to follow his line of thought and to search for the point of departure for a more adequate account.

There is a third reason. The parallel between deterministic and chance laws is obvious; it would be awkward to account for them in a wholly disparate manner. Lewis expressly pursues this parallel; after apparent success in the deterministic case, his strategy just had to carry over to chance laws, as elaborated in his (1994). Therefore, Lewis is the natural point of contact on this score, too, and however I diverge from Lewis’ account of chance, the divergence must work for deterministic laws as well. In fact, I see how it will. Contrary to appearances, natural necessity or full determination or lawlikeness is still less understood than partial determination; even the appropriate analytical means were missing. The theory of ranking functions will bring progress here. This remark, though, will only be briefly outlined and can be more easily grasped, I hope, after treating the actually more familiar probabilistic case.\(^{6}\)

The paper will proceed in the following way: We shall start in the section “Chance-Credence Principles” with recapitulating the central role the Principal Principle has, according to Lewis, for understanding chance. Lewis gives substance to this principle by claiming admissibility, as he calls it, for historical and chance information; this will be discussed and simplified in the section “The Admissibility of Historic and Chance Information”. The admissibility of chance information drives him into a contradiction, though, with the doctrine of Humean Supervenience. Lewis proposes to reform the Principal Principle, but I shall argue in the section “The Admissibility of Chance Information and Humean Supervenience” that it is rather Humean Supervenience that has to go. This provokes a closer look at that doctrine, and we shall see in the section “Humean Supervenience” that it is inherently questionable. So, this will be the point where a projectivistic reconstruction of the notion of partial determination is likely to deliver a more coherent account. The reconstruction will be carried out in the section “Projection Turns the Principal Principle into a Special Case of the Reflection Principle”, via the observation that the Principal Principle may be taken, in a precise way, as a special case of the Reflection Principle propagated by van Fraassen (1984); this is no deep formal insight, but of some conceptual interest. The section “Humean Projection” will sum up the projectivistic doctrine and argue that it can meet familiar objections and serve the purposes for which Lewis had invoked Humean Supervenience. As explained, the whole line of reasoning must somehow carry over from chance to natural necessity or from partial to full determination. The appendix will indicate how this might go.

Chance-Credence Principles

Let us approach our topic, objective probability, via the Principal Principle, which seems to its baptizer “to capture all we know about chance” (Lewis 1980, p. 266, my emphasis) – a proper starting point, if this claim were true. There is in fact not only one principle relating chance and credence; subsequent literature has discerned a whole family of principles, which we do well to survey. So, let us start in a purely descriptive mood; we shall become involved in debate soon enough.

The basic idea relating chance and credence is very old and familiar; it is simply that if I know nothing about some proposition A but its chance, then my credence in A should equal this chance. This is the Minimal Principle (as Vranas 2004 calls it):

$$(\mathrm{MP})\quad C(A\vert P(A) = x) = x.$$

Here, C stands for subjective probability or credence (the association with Carnap’s ‘confirmation’ is certainly appropriate), and P for objective probability or chance (or propensity, if you like). The subject having the credence remains unspecified, since (MP) is, as it were, a generic imperative; (MP), like the subsequent principles, is a rationality postulate telling us how any credence function should reasonably behave.

(MP) is the starting point of the sophisticated considerations in Lewis (1980). (MP) is also called “Miller’s Principle”, because Miller (1966) had launched a surprising early attack on it. However, (MP) is not an invention of the recent philosophical debate. It has long been known also under the label “direct inference”.\(^{7}\) In fact, it is implicit in each application of Bayes’ theorem to statistical hypotheses; there the ‘inverse’ posterior probabilities of the hypotheses given some evidence are calculated on the basis of their prior (subjective) probabilities and the ‘direct’ probabilities or likelihoods of the evidence under the hypotheses; and these ‘direct’ probabilities hide an implicit use of (MP). The merit of the recent discussion pushed by Lewis (1980) and others lies rather in scrutinizing variants of (MP).
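
The implicit use of (MP) in Bayesian inference can be made explicit in a minimal sketch. Assume, purely for illustration, finitely many statistical hypotheses \(H_1, \ldots, H_n\), where \(H_i\) states that the chance measure is \(P_i\), and some evidence \(E\); these symbols are introduced here and are not part of the original notation. Bayes’ theorem then reads:

$$C(H_i \vert E) = \frac{C(E \vert H_i)\, C(H_i)}{\sum_{j} C(E \vert H_j)\, C(H_j)} = \frac{P_i(E)\, C(H_i)}{\sum_{j} P_j(E)\, C(H_j)}.$$

The second equality replaces each likelihood \(C(E \vert H_j)\) by the chance \(P_j(E)\). This replacement is the direct inference in question; it is licensed by (MP) only on the assumption that \(H_j\) tells nothing about \(E\) beyond its chance.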

Before proceeding to them there are, however, various things to clarify. Philosophy first, I propose. If Lewis is right that principles like (MP) “capture all we know about chance”, then the philosophical interest of these principles is evident. Lewis does not really argue for this claim. In fact, he does not make it; it only seems true to him. Indeed, he cannot strictly believe it himself. When, as we shall see later on, he claims chances to Humeanly supervene on particular facts, then he clearly transcends the Principal Principle. And I shall end up agreeing with Arntzenius and Hall (2003) that there must be more we know about chance.

The point should rather be seen as a challenge. For, what is true is Lewis’ assertion “that the Principal Principle is indeed informative, being rich in consequences that are central to our ordinary ways of thinking about chance” (1980, p. 288), as is amply proved in his paper. For instance, it follows that chance conforms to the mathematical axioms of probability. The challenge then is what else there might be to say about chance. In default of an explicit definition of chance we seek for an implicit characterization, and it seems that we have already gone most of our way with the extremely neat Minimal Principle (which, as we shall see, is hardly strengthened by the other principles still to come). The more we are captivated by the Principal Principle, the harder the challenge.

The harder, though, also the philosophical puzzle posed by chances. It is strange that chances that are supposed to somehow reflect objective features of the external world should be basically related to our beliefs in a normative way. Our implicit characterizations of other theoretical magnitudes do not look that way. And the more weight is given to this relation, the more puzzling it is. How should we understand the peculiar normative power of that objective magnitude to direct our beliefs? If, indeed, the Principal Principle is all we know about chance, that power turns into sheer miracle. Why should we be guided by something the only known function of which is to guide us? Preachers may tell such things, but not philosophers.\(^{8}\) One of Lewis’ motives for the doctrine of Humean Supervenience is, we shall see, to solve this puzzle; indeed, he claims it to be the only solution. We need not take a stance right now, but we must always be aware in the subsequent discussion of the basic merits and problems of the Principal Principle. We are dealing here with high philosophical stakes.

At the moment, though, we must be a bit more precise about (MP). First, we must be clear about the domains of the functions involved. The chance measure P takes propositions as arguments, I said. We should not start, though, philosophizing about propositions. Let us simply assume that propositions are subsets of a given universal set W of possible worlds.

Is any kind of proposition in the domain of P? This is an open question. It is debatable which propositions may be chancy or partially or fully determined and which not. There may be entirely undetermined propositions and there may be propositions for which the issue makes no sense. Let us leave the matter undecided and grant, in a liberal mood, that at least all matters of particular fact, and hence all propositions algebraically generated by these facts, have some degree of determinateness, i.e., chance. Lewis (1994, pp. 474f.) has an elaborate view on what particular facts are; here we may be content with an intuitive understanding.

In any case, a proposition saying that some factual proposition has some chance is not a particular fact in turn. This does not exclude that such a chance proposition is algebraically generated by particular facts, but neither does it entail it; it is crucial for this paper not to presuppose from the start that chance propositions are factual in the same way as particular facts. So, let us more specifically assume that each \(w \in W\) is a complete possible course of particular facts. Whether we should be more liberal concerning the domain of chance will be an issue later on.

Credence is not only about particular facts, but also about possible chances; this is explicit in (MP) itself. Thus, if \(\mathcal{P}\) denotes the set of possible chance measures for W, then \(W \times \mathcal{P}\) is the possibility space over which the credence C spreads.

Moreover, I shall be silent about the precise algebraic structure of the set of propositions\(^{9}\) and just assume that each \(P \in \mathcal{P}\) is defined on some algebra over W and C on some algebra over \(W \times \mathcal{P}\). Accordingly, I shall be silent about whether the measures we are considering are finitely or σ-additive. This sloppiness will have costs, but rigorous formalization would have costs, too. I am just following the practice usually found in the literature I am mostly referring to.

For instance, one consequence of sloppiness is that (MP) does not make strict sense, since the condition will usually have probability 0. Lewis says that we should move then to non-standard analysis and hyperfinite probability theory where the condition in (MP) may be assumed to have infinitesimal probability. More easily said than done. Within standard probability theory one may circumvent the problem by stating (MP) in the more general form:

$$(\mathrm{MPI})\quad C(A\vert P(A) \in I) \in I\ \mathrm{for\ any\ open\ interval}\ I.^{11} $$

This issue will return, and all the principles I am going to discuss should be restated accordingly.

There is another reason why (MP) will not do as it stands. C may not be any credence function. If C is already well informed about A, for instance by being based on the observation of A or of some effects of A, (MP) is obviously inadequate. As Lewis (1980, pp. 267f.) explains, this concern is excluded for sure only if C is an initial or a priori credence function, as conceived as the target of further rationality constraints also by Carnap in his inductive logic. To indicate this, I shall denote an a priori credence by \(C_0\) (0 being a fictitious first time).

Finally, in order to present Lewis’ ideas we must note that chance evolves in time; this is particularly clear when chance is conceived as partial determination. Even full determination evolves in time, unless determinism holds and everything is fully determined at all times. Moreover, chance is world dependent; how chance evolves in time may vary from world to world. In order to make these dependences explicit we must replace P by \(P_{wt}\), the chance in w at t. Thus we arrive at a slightly more explicit version of the Minimal Principle:

$$(\mathrm{MP}^{\ast})\quad C_{0}(A\vert P_{wt}(A) = x) = x.^{11}$$

Having said all this, let us return to our descriptive path through the family of chance-credence principles (cf. also the useful overview in Vranas 2004). A first minor step proceeds to a conditional version of (MP) introduced by van Fraassen (1980, pp. 106f.), the Conditional Principle:

$$(\mathrm{CP})\quad {C}_{0}(A\vert B\ \&\ {P}_{wt}(A\vert B) = x) = x,$$

saying that, if you know nothing about A but its chance conditional on B, your conditional credence in A given B should equal this chance. (CP) is certainly as evident as (MP). We shall soon see that (CP) is hardly stronger than (MP).

David Lewis has taken a different, apparently bigger step. After retreating to the a priori credence \(C_0\) in (MP) that is guaranteed to contain no information overriding the conditional chance information, Lewis poses the natural question which information may be added without disturbing the chance-credence relation stated in (MP). He calls such additional information admissible, and thus arrives at what he calls the Principal Principle:

$$(\mathrm{PP})\quad {C}_{0}(A\vert E\ \&\ {P}_{wt}(A) = x) = x\ \mathrm{for\ each\ admissible\ proposition}\ E.$$

But what precisely is admissible information? The answer is surprisingly uncertain; the literature (cf. e.g., Strevens 1995 and Vranas 2004) strangely vacillates between defining admissibility and making claims about it. I think it is best to start with a clear definition, which is obvious, often intimated (e.g., by Vranas 2004, Footnote 5), but rarely endorsed in the literature (e.g., by Rosenthal 2004, p. 174):

$$\begin{array}{rcl} & & (\mathrm{DefAd})\quad E\ \mathrm{is}\ \mathit{admissible}\ \mathit{wrt}\ A\ \mathit{given}\ D\ \mathrm{iff}\ {C}_{0}(A\vert E \cap D) = {C}_{0}(A\vert D).\ \mathrm{Specifically}, \\ & & \qquad \qquad \ \ E\ \mathrm{is}\ \mathit{admissible}\ \mathit{wrt}\ A\ \mathit{in}\ w\ \mathit{at}\ \mathit{t}\ \mathrm{iff}\ E\ \mathrm{is}\ \mathit{admissible}\ \mathit{wrt}\ A\ \mathrm{given}\ {P}_{\mathit{wt}}\ (A) = x.\end{array}$$

The first general part says that E is admissible wrt A given D iff E does not tell anything about A going beyond D according to the a priori credence \(C_0\). Admissibility is nothing but conditional independence. The second part gives the specification intended and used in (PP).

Obviously, the definiens states at least a necessary condition of admissibility; any admissible E not satisfying this condition would directly falsify (PP). I propose to consider the necessary condition to be sufficient as well. This strategy trivializes (PP); with (DefAd), (PP) reduces to nothing more than (MP), and the issue of admissibility is simply deferred. Still, I find the detour via (DefAd) helpful. It clearly separates the meaning of admissibility from the substantial issue which propositions E should be taken to satisfy (DefAd). This issue is our task in the next section.

One may still wonder why one should take the necessary condition for admissibility to be also sufficient. We may have stronger intuitions concerning admissibility. We may, for instance, think that two pieces of information admissible individually should also be jointly admissible, a feature not deducible from (DefAd). Or we may think that any E admissible wrt A in w at t should be so for general reasons pertaining to w and t and not to idiosyncratic reasons pertaining to A. And so on. However, the theoretical tactics is always to start with the weakest notion, which is (DefAd) in the case at hand. The substantial claims about admissibility will then also take a weak form, but we are always free to strengthen them. The point is that we would not have the reverse freedom when starting with a stronger notion right away.\(^{12}\) A further worry may be that (DefAd) lets a priori credence \(C_0\) decide about admissibility. However, we should read (DefAd) the other way around; whatever the substantial claims about admissibility, they are constraints on \(C_0\) via (DefAd).
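
That joint admissibility is not deducible from (DefAd) can be seen from a toy case, offered only as an illustration under assumed numbers. Suppose that, given D, the credence \(C_0\) treats two propositions \(E_1\) and \(E_2\) as independent with probability 1/2 each, and let A be the proposition that exactly one of \(E_1\) and \(E_2\) holds. Then:

$$C_{0}(A\vert D) = C_{0}(A\vert E_{1} \cap D) = C_{0}(A\vert E_{2} \cap D) = \tfrac{1}{2},\quad \mathrm{but}\quad C_{0}(A\vert E_{1} \cap E_{2} \cap D) = 0.$$

So \(E_1\) and \(E_2\) are each admissible wrt A given D in the sense of (DefAd), while their conjunction is not. Any closure of admissibility under conjunction would thus be a further substantial claim.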

The Admissibility of Historic and Chance Information

Lewis (1980) makes two substantial claims about admissibility. The first is that each piece of historic information is admissible. If I know the chance \(P_{wt}(A)\) that A has in w at t, this knowledge cannot be improved by any information about what happened in w up to t; \(P_{wt}(A)\) summarizes, as it were, all there is to know in w up to t. Let us denote the history of the world w up to time t by \(H_{wt}\). \(H_{wt} = \{v \in W \vert H_{vt} = H_{wt}\}\) is a proposition. Moreover, let us say that the proposition E is only about history up to t, or t-historical, for short, iff for each world w either \(H_{wt} \cap E = \emptyset\) or \(H_{wt} \subseteq E\). Then Lewis’ claim is:

$$(\mathrm{AdH})\quad \mathit{If }\ E\ \mathrm{is}\ t\mbox{ -}\mathrm{historical},\ \mathrm{then}\ E\ \mathrm{is}\ \mathrm{admissible}\ \mathrm{wrt}\ A\ \mathrm{in}\ w\ \mathrm{at}\ t.$$

Note that reference to A is empty; only the relation of E to t is relevant to (AdH).

This claim is almost universally accepted. Lewis (1980, p. 274) himself raises a doubt about (AdH). Could there not be a crystal ball that foretells me for sure whether or not A happens, even if A is chancy? I shall explain later why I am not worried by this alleged possibility. Here, I simply accept (AdH). Thus, the Principal Principle starts unfolding some strength. Let me add three remarks that deepen the understanding of the point.

First, Lewis (1980) presents the case as if the admissibility of historical information were specifying (PP) and thus rationally constraining credence. So it does, but the core of the matter is thereby obscured in my view. (PP) and (AdH) immediately entail what might be called the Determination Principle:

$$(\mathrm{DP})\quad {P}_{wt}({H}_{wt}) = 1.$$

(This follows by replacing A as well as E in (PP) by \(H_{wt}\).) (DP) simply says that history is no longer chancy. This consequence is, of course, intended. However, it is not about credence, but only about chance. In fact, it is an analytic truth about (partial) determination: what is past is fully determined.\(^{13}\) Hence, it is more illuminating to realize that only this analytic truth needs to be added to (CP) to entail (AdH). The matter will further simplify later in this section.
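
The step from (PP) and (AdH) to (DP) may be spelled out as follows. This is only a sketch; it assumes that the conditioning proposition has positive credence (or is treated along the lines of (MPI)). Since \(H_{wt}\) is itself t-historical, (AdH) declares it admissible, so that (PP) with \(A = E = H_{wt}\) gives:

$$x = C_{0}(H_{wt}\vert H_{wt}\ \&\ P_{wt}(H_{wt}) = x) = 1.$$

The right-hand equality holds because the condition entails \(H_{wt}\). Hence \(C_0\) can give positive credence to \(H_{wt}\ \&\ P_{wt}(H_{wt}) = x\) only for \(x = 1\), which is what (DP) asserts.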

The second point is one I have not seen emphasized in the Principal Principle literature, though it seems important to me: In the prolonged efforts of understanding objective probabilities, whether as frequencies or propensities, the so-called reference class problem stood out as central and embarrassing.\(^{14}\) The probability of a particular event seemed to depend on the reference class within which it was considered. Thus, that event could be assigned an objective probability only if one could distinguish the objectively correct reference class, apparently a dubious matter. The general recommendation was to rely on the narrowest reference class (or on the broadest reference class equivalent to the narrowest one); see also Hempel’s so-called criterion of maximal specificity. This may be taken as the narrowest available reference class; but availability imports epistemic relativity. Or one may engage in the difficult business of Salmon (1984, pp. 60ff.) of distinguishing objectively homogeneous reference classes. If Lewis is to explain objective probabilities to us, he must respond to this problem.

He does so implicitly; his response is (AdH). For the objective probabilities at t the whole history, \(H_{wt}\), is the objectively narrowest reference class; there could not be any more specific one. Indeed, it is hardly a class; it has only one member, actually unrepeatable and only counterfactually repeatable. Hence, this is at best a trivial and purely conceptual solution of the reference class problem; it does not even touch the real and deep methodological problem of specifying sound and manageable reference classes. However, this is a problem the philosopher must leave to the scientist; the philosopher can only say what the ultimate standard is with which to compare all actually considered reference classes.

The third remark is related. If all the history up to t is admissible wrt some proposition A in w at t, this means that up to t there is absolutely nothing more to know about A than its chance. You can learn absolutely everything up to t and you will be none the wiser concerning A. If you do not even know the chance of A, you are even more in the dark; the chance of A at t is the best you can know about A at t. This is the core intuition about partial determination: if A is in some way partially determined at t, there is nothing before t that would determine A in any other way. And knowledge before t can at best equal determination at t.

This point is reflected in many accounts of probability, for instance in the old idea that genuine random processes cannot be outfoxed by a gambling system. The same thought is found in von Mises’ (1919) definition of a collective as a sequence for which no place selection results in a subsequence with a deviating limit of relative frequencies and in the subsequent explications of this approach with recursion and complexity theoretic means (cf. Church 1940; Chaitin 1966). Salmon (1984, pp. 60ff.) realizes the same basic idea in terms of his objectively homogeneous reference classes, though in a different way. It is important to see all this connected with (AdH).

Let us turn to the second kind of admissible information acknowledged by Lewis (1980): information about the chances themselves, not only about the actual ones, but also about the ones as they would have been at various times. If I know the actual chance of A, how could information about how other chances would have been tell me more about A? It cannot, as Lewis (1980, pp. 274ff.) argues.

To state this more precisely: Let \(T_w\) be the complete theory of chance holding at the world w, i.e., according to Lewis, the conjunction of all conditionals true at w having the form: “if the history of w up to t had been \(H_{vt}\), then \(P_{vt}\) had been the chances in w at t.” Lewis assumes \(T_w\) to be a proposition over W; this is a controversial assumption to be discussed later. Is it at all a proposition over \(W \times \mathcal{P}\) (or \(W \times {\mathcal{P}}^{T}\) – cf. footnote 12)? Prima facie not, since the counterfactual conditional is not among the Boolean operations. Still, we may take \(T_w\) to be in the domain of \(C_0\); the issue will be cleared up on the next page. Furthermore define E to be a chance proposition iff for each world w either \(T_w \cap E = \emptyset\) or \(T_w \subseteq E\). Then Lewis’ second admissibility postulate is:

$$(\mathrm{AdP})\quad \mathrm{If}\ E\ \mathrm{is}\ \mathrm{a}\ \mathrm{chance}\ \mathrm{proposition},\ \mathrm{then}\ E\ \mathrm{is}\ \mathrm{admissible\ wrt}\ A\ \mathrm{in}\ w\ \mathrm{at}\ t.$$

Note again that the reference to A and even to t is empty; all that matters is that E is a chance proposition.

Lewis assumed that separate admissibility of historic and chance information entails their joint admissibility. This does not follow, however, on the basis of (DefAd). Hence, we should better read (AdH) or (AdP) as saying for each that its kind of information is admissible given the other kind of information; this entails its unconditional admissibility.\(^{15}\)

At this point, we can easily see that the Conditional Principle (CP) is hardly stronger than the Minimal Principle (MP). If \(P_{wt}(A \cap B) = y\) is admissible wrt B in w at t and \(P_{wt}(B) = z\) is admissible wrt \(A \cap B\) in w at t, then (PP) yields \(C_{0}(A \cap B\vert P_{wt}(A \cap B) = y\ \&\ P_{wt}(B) = z) = y\) and \(C_{0}(B\vert P_{wt}(A \cap B) = y\ \&\ P_{wt}(B) = z) = z\), and both together yield (CP). In other words: We have to add to (MP) only the admissibility of a tiny bit of chance information in order to get (CP).
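
The calculation behind this claim can be sketched as follows, assuming \(z > 0\). Let K abbreviate the chance information \(P_{wt}(A \cap B) = y\ \&\ P_{wt}(B) = z\); the letter K is introduced here only for brevity. The two instances of (PP) then combine by the definition of conditional probability:

$$C_{0}(A\vert B\ \&\ K) = \frac{C_{0}(A \cap B\vert K)}{C_{0}(B\vert K)} = \frac{y}{z},$$

where \(y/z\) is just the value that K assigns to \(P_{wt}(A\vert B)\). Since this holds for every pair y, z with \(y/z = x\), averaging over all such pairs gives (CP), provided \(C_0\) behaves well under such mixtures.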

(PP) + (AdH) + (AdP) may finally be combined to what Lewis (1980) called the Principal Principle reformulated and was later called the Old Principle, since it is not yet the end of the story.

$$(\mathrm{OP})\quad {C}_{0}(A\vert {H}_{wt}\ \&\ {T}_{w}) = {P}_{wt}(A).$$

This follows from (PP) because \(H_{wt}\ \&\ T_w\) is admissible according to (AdH) and (AdP), \(T_w\) contains “if \(H_{wt}\), then \(P_{wt}(A)\) is the chance of A in w at t”, and thus \(H_{wt}\) and \(T_w\) entail what \(P_{wt}(A)\) is. Conversely, (OP) entails (PP) + (AdH) + (AdP). So, (OP) is a very elegant summary of the foregoing discussion.

The story can be further simplified, though. Let us look at \(T_w\) again. It is not quite clear why it has to take the specific complicated form, perhaps because \(T_w\) is to allow that some histories leave some events not even partially determined. However, we wanted to ignore such complications and assumed that all matters of particular fact are partially determined (or almost fully determined via chance 1). Hence, \(T_w\) claims for each possible history \(H_{vt}\) a full chance measure \(P_{vt}\) for W. Then, however, we may condense the whole theory \(T_w\) into one big chance measure \(P_w\) such that the time-dependent chance \(P_{w,vt}\) derives from \(P_w\) through conditioning by \(H_{vt}\).\(^{16}\) We thus simply replace the conditionals with probabilistic consequents by conditional probabilities. That it is possible to so condense \(T_w\) is indeed a consistency requirement for \(T_w\), which becomes explicit also in Lewis (1980, pp. 280f.) in his discussion of the kinematics of chance.\(^{17}\) \(P_w\) thus is the time-independent chance law or scheme of partial determination as it holds in w for all propositions over W, and \(T_w\) simply says that \(P_w\) is as it is.\(^{18}\) Hence, we have arrived at the following reduction of Lewis’ terminology:

$$\begin{array}{rcl} (\mathrm{RED})\quad T_{w}& =& \{P_{w}\}\ (\mathrm{or\ rather} = W \times \{P_{w}\} \subseteq W \times \mathcal{P}), \\ \mathrm{and}\ P_{w,vt}(A)& =& P_{w}(A\vert H_{vt})\ \mathrm{for\ all}\ A,\ v,\ \mathrm{and}\ t.\end{array}$$

This shows at the same time that \(T_w\) is indeed a proposition over \(W \times \mathcal{P}\) (indeed over \(\mathcal{P}\) alone) and is thus in the domain of \(C_0\) as we have originally conceived it.

(RED) makes clear that all the considerations about time-dependent chance are perhaps intuitively helpful and perhaps required for more general chance theories, but merely a conceptual detour within our frame. (RED) also explains why the above Determination Principle is analytic; (DP) follows from the definition (RED). And (RED) reinforces the redundancy of (AdH); given (RED) and (AdP), (OP) is just an application of the Conditional Principle (CP). However, we just saw that (CP) is entailed by (MP) and (a small part of) (AdP). So, the latter two are the only basic assumptions we need. (RED) finally helps us to express the Old Principle still more simply:

$$(\mathrm{{OP}}^{{_\ast}})\quad {C}_{ 0}(.\vert {T}_{w}) = {P}_{w}.$$

Indeed, (OP*) looks like the Minimal Principle itself; the only difference is that (MP) refers to the chance of a single proposition, whereas (OP*) refers to a whole chance measure. It is only from the restricted perspective of (MP) that (OP) appears to additionally assume the admissibility of chance information. Initially, I suppose, intuition would have been indifferent between (MP) and (OP).

The Admissibility of Chance Information and Humean Supervenience

So far, so good. We might be happy with (OP) and start discussing its philosophical significance. Alas, the story takes a most unexpected turn, for which it is important that we have discerned (AdP) as an additional assumption in (OP). (OP) thus becomes the starting point of considerable confusion. The source of the trouble is that Lewis not only takes chance-credence principles like (MP) to provide the most basic understanding of chance, but also maintains the ontological doctrine of so-called Humean Supervenience – because this is an attractive metaphysical doctrine, and because such chance-credence principles seem to require it. The trouble is real, and therefore we shall have to scrutinize both grounds of Humean Supervenience. But let us first have a formal look at what the trouble is.

With respect to chance, Humean Supervenience consists in the claim:

$$(\mathrm{HS})\quad {T}_{w}\ \mathrm{supervenes}\ \mathrm{on}\ \mathrm{the}\ \mathrm{totality}\ \mathrm{of}\ \mathrm{particular}\ \mathrm{facts}\ \mathrm{in}\ w.$$

With our reduction (RED) of \(T_w\) and our understanding of the worlds in W as mere totalities of particular facts, we might as well express this claim thus:

$$(\mathrm{{HS}}^{{_\ast}})\quad {P}_{ w}\ \mathrm{supervenes}\ \mathrm{on}\ w.$$

It is not quite clear for which worlds w (HS) is to hold. Certainly for the actual world we live in. One may think that (HS) applies to all worlds and is thus a necessary truth. Lewis (1994) sees it only as a contingent truth; (HS) is to hold only for worlds like ours, certainly a more modest and a more mysterious view. We do not have to take a stance here.

Since we are a bit sloppy concerning the algebra of propositions, we may say that (HS) amounts to the claim that \(T_w\) is identical to a proposition over W.19 (HS) thus says there are not two possibility spaces, one for possible facts (forming the domain of chances) and one for possible chances (jointly forming the domain of credences). The latter rather reduces to the former; there is only the space of possible facts. Chance propositions are in effect factual propositions – and thus in the domain not only of credences, but of chance measures themselves.

Now, however, we are caught in paradox. Imagine that our world w, after having started with \(H_{wt}\), continues with some possible future \(F_{wt}\). \(F_{wt}\) should have at least a tiny chance of coming about; so \(P_{wt}(F_{wt}) > 0\) and, according to (OP), \(C_0(F_{wt}\vert H_{wt}\ \&\ T_w) > 0\). On the other hand, \(F_{wt}\) may be an undermining future in the sense that \(H_{wt} \cap F_{wt}\) (\(= \{w\}\)) is not in the supervenience base of \(T_w\), i.e., according to \(H_{wt}\) and \(F_{wt}\), w would be governed by some chance law different from \(T_w\). Then \(F_{wt}\) is impossible given \(H_{wt}\ \&\ T_w\), i.e., \(C_0(F_{wt}\vert H_{wt}\ \&\ T_w) = 0\). To put the case very briefly with reference to (OP): Consider the factual proposition \({\overline{T}}_{w}\) that \(T_w\) is false. Clearly, \({P}_{w}({\overline{T}}_{w}) = {C}_{0}({\overline{T}}_{w}\vert {T}_{w}) = 0\). However, if w is genuinely chancy, we should have \({P}_{w}({\overline{T}}_{w}) >0\). Somewhere, we have made a mistake.
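The clash can be made concrete in a toy model. The following is only a sketch under assumptions that are not in the text: a world is a sequence of four coin flips, t is the very beginning (so the history is empty), and the chance law is assumed to supervene on the world in the crudest way, as the world's own relative frequency of heads. The names `chance_law` and `chance_of_own_theory` are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 4  # number of coin flips making up a toy "world" (illustrative assumption)

def chance_law(world):
    """Toy supervenience map: the chance of heads is the world's own relative frequency."""
    return Fraction(sum(world), len(world))

def chance_of_world(p, world):
    """Probability that independent flips with bias p produce exactly `world`."""
    heads = sum(world)
    return p ** heads * (1 - p) ** (len(world) - heads)

def chance_of_own_theory(world):
    """P_w(T_w): the chance, by w's own law, of the worlds that share w's law."""
    p = chance_law(world)
    worlds = product((0, 1), repeat=len(world))
    return sum(chance_of_world(p, v) for v in worlds if chance_law(v) == p)

w = (1, 1, 0, 0)                # actual world: two heads in four flips
p_T = chance_of_own_theory(w)   # P_w(T_w)
p_not_T = 1 - p_T               # total chance of the undermining worlds
print(p_T, p_not_T)             # prints: 3/8 5/8
```

In this toy world the undermining futures jointly have chance 5/8 by the world's own law, whereas any coherent credence conditional on that law must give them 0. Real best-system supervenience would be far less crude, so the size of the gap is an artifact of the model; only its existence matters.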

It seems clear where. Given (HS), not all chance information can be admissible, since information about the future may well be inadmissible and since chance information is information about the future according to (HS).20 Indeed, we should conclude that most chance information is inadmissible, though it is hard to be more precise because it is not so clear how supervenience exactly works, in which complex of particular facts \(T_w\) exactly consists.

However, as Lewis (1994) argues, most chance information is at least nearly admissible, and (PP) and (OP) work approximately well even under the assumption of (AdP); the mistakes we incur are below noticeability. Still, the question is: if (OP) is only approximately valid, what is the standard it approximates? Following Thau (1994), Lewis (1994) proposes that this standard is provided by the New Principle:

$$(\mathrm{NP})\quad {C}_{0}(A\vert {H}_{wt}\ \&\ {T}_{w}) = {P}_{wt}(A\vert {T}_{w}),$$

or in our reduced form:

$$({\mathrm{NP}}^{{_\ast}})\quad {C}_{ 0}(.\vert {T}_{w}) = {P}_{w}(.\vert {T}_{w}).$$

This appears to solve our problem. The derivation of the paradox of undermining futures is blocked when we use (NP) instead of (OP), and the approximate validity of (OP) is explained by the fact that the difference between \(P_{wt}(A)\) and \(P_{wt}(A\vert T_w)\) is mostly below noticeability.
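How small the difference can become may be illustrated numerically. The sketch below again assumes, purely for illustration, that a world consists of n coin flips and that its chance law is its relative frequency of heads k/n; A is the proposition that the first two flips land heads. The function names `op_value` and `np_value` are hypothetical.

```python
from fractions import Fraction
from math import comb

def op_value(n, k):
    """P_w(A) for A = 'the first two flips land heads', under bias k/n."""
    p = Fraction(k, n)
    return p * p

def np_value(n, k):
    """P_w(A | T_w): the same A, conditional on exactly k heads in n flips."""
    # Given exactly k heads, all C(n, k) arrangements are equally likely,
    # and C(n-2, k-2) of them begin with two heads (requires k >= 2).
    return Fraction(comb(n - 2, k - 2), comb(n, k))

for n in (4, 100, 10_000):
    k = n // 2
    gap = op_value(n, k) - np_value(n, k)
    print(n, float(op_value(n, k)), float(np_value(n, k)), float(gap))
```

In this model the gap is 1/12 for four flips and 1/396 for a hundred, and it keeps shrinking as the world grows, which is the sense in which (OP) is approximately right for large worlds.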

Is this an ad hoc solution? No. As Hall (1994, p. 511) and Strevens (1995, p. 557) observe and Hall (2004, pp. 104f.) insists, (NP) is a consequence of (CP) and (DP) which are uncontested.21 Moreover, the admissibility of chance information that drove (OP) into paradox is guaranteed for (NP); \(T_w\), and hence any weaker chance information, is trivially admissible wrt A given \(T_w\ \&\ P_{wt}(A\vert T_w) = x\). Hence, (NP) appears to be the right way to reconcile Humean Supervenience with (PP), the admissibility of historic information and the general inadmissibility of chance information.

However, Lewis (1994) is not entirely satisfied. He says:

A feature of Reality deserves the name of chance to the extent that it occupies the definitive role of chance; and occupying the role means obeying the old Principle, applied as if information about present chances, and the complete theory of chance, were perfectly admissible. Because of undermining, nothing perfectly occupies the role, so nothing perfectly deserves the name. But near enough is good enough. (p. 489)

And thus Lewis acquiesces in chances obeying (OP) not quite perfectly.

This remark provokes the final twist of the story. As Arntzenius and Hall (2003) point out, (NP) entails that there is a magnitude occupying the definitive role of chance perfectly, i.e. satisfying (OP) strictly. Suppose that the world w determines the chance theory \(T_w\); according to (HS) it does so in some particular manner. And suppose that \(T_w\) allows for undermining futures so that (OP) does not apply. Now, define \(P'_w = P_w(.\vert T_w) \neq P_w\) and \(T'_w = \{P'_w\}\). So, \(T_w\) and \(T'_w\) obviously are incompatible chance theories – in one sense. However, change also the supervenience bases for chances; say for each w that it is not \(P_w\), but rather \(P'_w\) that is determined by (the facts of) w. So, in another sense, \(T_w\) is a factual proposition over W according to the initial way of determination, and \(T'_w\) is so, too, according to the modified way of determination. And in this sense, they are not incompatible. On the contrary, \(T_w\) entails \(T'_w\), since whenever \(T_v = T_w\) according to the initial way of determination, \(T'_v = T'_w\) according to the modified one (though not necessarily vice versa). Moreover, \(T'_w\) cannot be threatened by undermining futures. And, this is the upshot, if \(P_w\), \(T_w\) satisfy (NP), i.e., if \(C_0(.\vert T_w) = P_w(.\vert T_w)\), then \(P'_w\), \(T'_w\) satisfy (OP), i.e., \(C_0(.\vert T'_w) = P'_w\).22
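The construction can be checked mechanically in a small model. The sketch below assumes, as an illustration only, four-flip worlds whose initial law is their relative frequency of heads; the modified law is the initial law conditioned on its own theory, and the modified theory collects the worlds that receive the same modified law. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 4
WORLDS = list(product((0, 1), repeat=N))

def freq(world):
    """Relative frequency of heads; the toy supervenience base."""
    return Fraction(sum(world), N)

def P(w):
    """Initial law of w as a dict world -> chance (independent flips with bias freq(w))."""
    p = freq(w)
    return {v: p ** sum(v) * (1 - p) ** (N - sum(v)) for v in WORLDS}

def T(w):
    """Initial theory of w as a set of worlds: those assigned the same initial law."""
    return {v for v in WORLDS if freq(v) == freq(w)}

def P_prime(w):
    """Modified law: the initial law conditioned on the initial theory."""
    law, theory = P(w), T(w)
    total = sum(law[v] for v in theory)
    return {v: (law[v] / total if v in theory else Fraction(0)) for v in WORLDS}

def T_prime(w):
    """Modified theory: the worlds assigned the same modified law as w."""
    target = P_prime(w)
    return {v for v in WORLDS if P_prime(v) == target}

w = (1, 1, 0, 0)
mass_old = sum(P(w)[v] for v in T(w))              # below 1: undermining is possible
mass_new = sum(P_prime(w)[v] for v in T_prime(w))  # exactly 1: no undermining
print(mass_old, mass_new, T(w) <= T_prime(w))      # prints: 3/8 1 True
```

The initial law gives its own theory only chance 3/8, whereas the modified law gives its own theory chance 1, and the initial theory is contained in the modified one, as the argument requires.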

Hence, if the old principle is definitive of the chance role, as Lewis says, then \(P'_w\), rather than \(P_w\), should be the chance law governing the world w. If we tend to say \(P_w\) is determined by the particular facts, we should say it is rather \(P'_w\) that is determined by those facts. Thus, we face a new paradox, at least if we think that true chance theories must allow for undermining futures. And even if we deny this and rather attempt to choose \(P_w\) right away so that \(P_w = P'_w\), then Arntzenius and Hall (2003) complete their argument by showing that chances then behave in an unacceptable way.

Schaffer (2003) tries to escape by claiming vagueness. Chance may be given by \(P_w\) or by \(P'_w\), and disambiguation is of little importance, since the difference is small, anyway. However, it is not chance that is vague, I think, only our thinking about it is not clear enough. My conclusion is that we are in deep trouble and have not found any stable position concerning the admissibility of chance information and the possibility of undermining futures. What got us there? It was, of course, the assumption of Humean Supervenience unquestioned so far. It is high time to attend to it more closely.

(HS) assumed that the chance proposition \(T_w\) over \(\mathcal{P}\) is supervenient upon, or, with sloppy algebra, identical with some factual proposition over W. As a consequence, we had to consider chance propositions as being in the domain of \(P_w\) and \(P_{wt}\), at least under a liberal, though not exceptional conception of this domain, and hence we had to consider such chances as \(P_w(T_w)\) or whether or not \(P_{wt}(A) = P_{wt}(A\vert T_w)\). By contrast, if we give up (HS), we are free to reject such expressions and in particular the New Principle as meaningless. If we do so, the admissibility of chance information is rescued from paradox and perfectly acceptable. Indeed, at so many places philosophy ran into trouble in the past decades with iterating (the same kind of) modality. We should have been warned.

I raised the point in my (1999, p. 170). But it is underrated in the literature. The worst Lewis (1994) and Hall (1994) say about (NP) is that it is messy and user-hostile. Arntzenius and Hall (2003, p. 175) only say that the non-reductionist rejecting (HS) is free to assume \(P_{wt}(T_w) = 1\) and to thus eliminate the discrepancy between (OP) and (NP). Hall (2004, p. 99) insists that this stipulation is harmless. Formally, this is correct, but the non-reductionist need not even take this step. And he should not; the harm done consists in blurring the issue. It creates the impression that the issue between the reductionist and the non-reductionist would be whether \(P_{wt}(T_w)\) is equal to or smaller than 1; it creates the delusion of there at all being a meaningful issue. It simply makes no sense to say that there is some chance that our world is governed by this scheme of partial determination rather than that or that this atom has (at t) a propensity of .4 of having (at t) a propensity of .2 of decaying (within the next hour).

Hoefer (1997, p. 328) expressly agrees by saying:

The laws are what they are because of the pattern of events in history, and not what they are “by law”. This is just a restatement of the core idea of Humean analyses of law. For just the same reason, the chances are not what they are “by chance”, and the quantity \(P_{tw}(T_w)\) should be regarded by a Humean as an amusing bit of nonsense.

However, his argument is a different one. He doubts that all particular facts (and their Boolean combinations) are in the domain of the chance function. Hence, even if the chance of chancy facts supervenes on particular facts, the supervenience base will usually not be in the domain of the chance function. (NP) would only be guaranteed to make sense for the Humean supervenientist if all particular facts were chancy. By contrast, I am granting the latter and arguing that (NP) still does not make sense.

Vranas (2004, p. 373) tries to save the “arguably dubious” assumption that chance propositions are in the domain of \(P_{wt}\). He notices the potentially vicious circularity in such expressions as \(P_{wt}(T_w)\), which is indeed a point of worry for the non-reductionist, but not for the reductionist, and he proposes to make sense of such expressions within reflexive situation theory (cf. Barwise and Etchemendy 1987) and thus ultimately within set theory without the foundation axiom. But why at all should the non-reductionist try to overcome his worry and take recourse to such remote means? For the non-reductionist particular facts and Boolean combinations thereof are chancy and what lies outside this domain is not. It is up to the reductionist to give an argument for conceiving the domain more broadly, and the argument must not presuppose (HS), as one would if one praises the apparent progress from (OP) to (NP).

Humean Supervenience

So, let us squarely face Humean Supervenience itself. I propose first to look at how Lewis thinks it is feasible. Once we shall have seen the doubtfulness of Lewis’ construction I can proceed with an alternative account and then with a brief discussion of Lewis’ reasons for taking (HS) to be without good alternative.

The thesis of Humean Supervenience says, according to Lewis (1994, p. 474)

that in a world like ours, the fundamental relations are exactly the spatiotemporal relations and that in a world like ours, the fundamental properties are local qualities. Therefore it says that all else supervenes on the spatiotemporal arrangement of local qualities throughout all of history, past and present and future.

Because this holds only for worlds like ours, (HS) is contingent. Should alien qualities in Lewis’ sense play a role – irreducible chance would be such an alien quality –, the case may be different.

The bite of this claim emerges when we consider all the things that are extremely thorny for philosophers: laws, counterfactuals, causation – and objective probabilities. All this must be determined by the totality of particular facts, according to (HS). How? The crucial link is constituted by what Lewis calls the best-system analysis of law, which he takes over from F. P. Ramsey:

Take all deductive systems whose theorems are true. Some are simpler, better systematized than others. Some are stronger, more informative, than others. These virtues compete: an uninformative system can be very simple, an unsystematized compendium of miscellaneous information can be very informative. The best system is one that strikes as good a balance as truth will allow between simplicity and strength. How good a balance that is will depend on how kind nature is. A regularity is a law iff it is a theorem of the best system. (Lewis 1994, p. 478)

So far this applies only to deterministic laws. But Lewis suggests expanding the best-system analysis to cover chance laws as well, and he makes clear that the inclusion of chance laws in the best system is primarily governed by relative frequency and symmetry. Some say that Lewis’ position thereby basically reduces to frequentism, others say that it essentially transcends frequentism. We need not decide. We may well accept the best-system analysis for the time being. It is plausible, as far as it goes; it is, to echo Lewis, simple, but uninformative.

There are two critical points, though. The first is that the team of the best-system analysis and the Principal Principle introduces not only an ontological, but also an epistemological double standard. We have already seen the ontological double standard. The best-system analysis somehow establishes \(T_w\) as the chance theory true of w, whereas (OP) rather requires \(T'_w\) to be determined by w. In addition, we now face an epistemological double standard. On the one hand, our beliefs aim at the best system guided by standards of simplicity and strength and their balance. On the other hand, one should think that all these standards are encoded in the a priori credence function \(C_0\) that we seek to constrain by (OP) and other rationality postulates. I do not see an incoherence here, but neither do I see how the two standards go together or what results from the circular procedure of letting \(C_0\) decide about the best system and feeding the decision into condition (OP) on \(C_0\). These are unresolved frictions, to say the least.23

The second critical point is, of course, whether the best-system analysis can at all bolster up Humean Supervenience about laws and chance. Prima facie, it cannot. On the contrary, according to this analysis deterministic and probabilistic laws supervene not only on the totality of particular facts, but also on the measures for simplicity, for strength, and for the goodness of balance; and these measures are something we add (at least as far as simplicity and balance are concerned; strength has at least an objective partial order).

Surely, Lewis cannot be on good terms with this apparent consequence of the best-system analysis. He shies away from any idealistic tendency like the devil from the holy water, also in order to maintain Humean Supervenience. However, he sees a way out: Perhaps nature is kind to us, and “if nature is kind, the best system will be robustly best – so far ahead of its rivals that it will come out first under any standards of simplicity and strength and balance” (Lewis 1994, p. 479 – his italics). If so, laws and chances do not depend on our inductive standards.

Yet, can there be a system that is robustly best under any standards? I guess even the kindest world is susceptible to transmogrification under gruesome standards. We may refer to factual human standards, but even there we find a lot of madness. Presumably, Lewis intends to quantify only over all reasonable inductive standards, and perhaps nature then has a better chance to be kind. Look, though, how wide the disagreement about reasonableness is, e.g., from the optimistic middle Carnap who had hoped for the inductive logic to the pessimistic subjectivists who plead for coherence and nothing more. It is quite obscure what a kind world is and how many of them there are.

In any case, Humean Supervenience turns out doubly constrained. It is ontologically restricted to worlds like ours devoid of alien matters, and it is epistemologically restricted to kind worlds free of indeterminateness concerning the best system. The two restrictions appear to be independent, and together they turn Humean Supervenience into an uncomfortable doctrine. I think that the problems in the last section about undermining futures constitute a telling objection. On the whole, the doctrine seems in need of getting straightened out.

As to the second critical point concerning objectivity and independence of our standards, Lewis had envisaged another solution:

I used to think rigidification came to the rescue: in talking about what the laws would be if we changed our thinking, we use not our hypothetical new standards of simplicity and strength and balance, but rather our actual and present standards. (Lewis 1994, p. 479)

Yes, precisely. Rigidification is one salient strategy of objectification.24 Alas, Lewis continues:

But now I think that is a cosmetic remedy only. It doesn’t make the problem go away, it only makes it harder to state. (Lewis 1994, p. 479)

I did not understand this remark, so I asked him for clarification. Since I did not find the point explained elsewhere in his writings, let me quote extensively from his personal communication of February 13, 1996:

Let me answer not your question but a generalization of it. The problem is that a certain analysis says that X (in this case, lawhood) depends on Y (in this case, our standards of simplicity, etc.) and yet we would ordinarily think this wasn’t so. If Y were different, X would be just the same – or so we offhand think. A proposed answer is that ‘X’ is a rigidified designator of the actual value of something that depends on Y, and of course it’s not true that the actual value would be different if Y were different. That’s supposed to explain our opinion that there’s no dependence. Well, if that’s so – I’d think that it well might be so under at least some legitimate disambiguation – let ‘†X’ be a derigidification of the rigidified term ‘X’. Maybe there’s some nice ordinary-language reading of the derigidifying modifier; or maybe not, but in any case we can introduce it into our language by a suitable semantic explanation (as is done, for instance, in Stalnaker’s paper ‘Assertion’, Syntax and Semantics 9).25 Then it might turn out that our original opinion that X doesn’t depend on Y survives in modified form: as the opinion that even †X doesn’t depend on Y. If so, the alleged rigidification of X ends up making no difference. I think that’s what does happen in the case of lawhood and our standards of simplicity etc. And that’s why the hypothesis of rigidification, even if true, doesn’t make the problem of counter-intuitive dependence go away. It makes it harder to state, because to state it you must first introduce the notion of derigidification.

He did not further explain, however, why the intuition that lawhood is independent of our standards should be maintained under derigidification. Projectivism, which I am going to recommend, does not share this intuition. The projectivist rigidifies the result of his projection and thus legitimately claims objectivity for this result. But he is content with so much objectivity. He would immediately grant that derigidification brings the process of projection back into focus and thus displays the dependence on the cognitive subject. However, there is no need to decide the dispute about intuitions. The point rather is that Lewis’ idea, which was not good enough for himself, helps projectivism to some arguably sufficient notion of objectivity while allowing us to admit, in another sense, the dependence of lawhood on our inductive standards.26

Projection Turns the Principal Principle into a Special Case of the Reflection Principle

The last remark puts the cart before the horse. We still do not know what the projectivistic understanding of chances is actually supposed to be. In order to explain it, let us follow the Lewisian track of the best-system analysis, but let us avoid, contra Lewis, giving it an ontological turn; let us rather keep it within its epistemological home. This will lead us onto well-trodden paths, but I said right at the beginning that there are no new discoveries to be made.27

The best system is, first of all, based on complete experience, on complete knowledge of particular observable facts. If these should be only finitely many, then all statistical methodology tells us that they do not allow for guaranteed conclusions with respect to objective probabilities; to force a decision, for whatever reasons, is simply unjustified. This conclusion certainly remains true when we include the broader inductive considerations relevant to best systems. If the set of particular facts should be infinite, the situation is not really different. If a die is actually cast infinitely many times, the propensities of the throws will change, simply because the die will physically change, and then the limit of relative frequency does not help us to a definite conclusion concerning the propensities. This is our epistemic situation vis à vis a small die, and I do not see why it should be different with respect to large worlds. In the strictest sense, nothing is repeatable. In saying this I flatly deny Humean Supervenience, of course.

Hence, it is actually unfeasible to precisely detect chances, even given complete knowledge of particular facts. The detectibility is rather merely counterfactual. Suppose we could run our world over and over again, indeed infinitely many times, suppose that all repetitions were governed by the same objective chance mechanism, and suppose we could learn all particular facts within not only one, but all repetitions. Then we would finally have established the chance law \(P_w\) of w, at least with probabilistic certainty. The last proviso is essential. If we live in a chancy world, we know a priori that there is a chance for misleading evidence, and we know a priori that even counterfactually ideal evidence cannot close the gap; the difference between probability 0 and impossibility is ineliminable.

If we want to describe this ideal detectibility of chances more formally, we obviously have to consider \(W_0 \times W^{\infty}\), i.e., not only the original space \(W = W_0\) of worlds of particular facts, but besides the space \(W^{\infty}\) of infinitely many possible counterfactual runs of the actual world; each \(w^{\infty} \in W^{\infty}\) thus is an infinite sequence of possible worlds, each being a complete course of particular facts. (The term “\(W_0\)” is introduced only in order to distinguish that copy of W from its infinitely many counterfactual repetitions.) And we have to extend our probabilistic notions to \(W_0 \times W^{\infty}\). If the actual world w is governed by the chance law \({P}_{w} \in \mathcal{P}\) defined for propositions over \(W_0\), then these infinite sequences are governed by the product (or Bernoulli) measure \({P}_{w}^{\infty }\in {\mathcal{P}}^{\infty }\) which is the infinite product of \(P_w\) with itself and which is defined for propositions over \(W^{\infty}\). According to \(P_w^{\infty}\) the individual runs are governed by the same chance law \(P_w\), and they are stochastically independent from one another; thus are our counterfactual suppositions for the ideal detectibility of \(P_w\). Finally, we have to assume an a priori credence \(C_0^{\infty}\) also defined for propositions over \(W_0 \times W^{\infty}\). \(C_0^{\infty}\) is not concerned with chances; it only captures our a priori expectations about all the particular facts in \(W_0 \times W^{\infty}\). Of course, it extends the factual part of \(C_0\); i.e., for each proposition \(A \subseteq W_0\) we have \(C_0^{\infty}(A) = C_0(A)\). I shall soon say a bit more about \(C_0^{\infty}\).

What we just said about the counterfactual detectibility of chances then condenses into what I would like to call the Knowability Principle:

$$(\mathrm{KP})\ {C}_{0}^{\infty }(A\vert {w}^{\infty }) = {P}_{ w}(A)\ {P}_{w}^{\infty }\mbox{ -}\mathrm{almost}\ \mathrm{surely}\ \mathrm{for}\ \mathrm{all}\ {P}_{ w}\ \mathrm{and}\ \mathrm{all}\ A \subseteq {W}_{0}.$$

The left-hand side is indeed a random variable with \(w^{\infty}\) as random argument. That the equation holds \(P_w^{\infty}\)-almost surely is to say that the set of \(w^{\infty}\) for which the equation holds has \(P_w^{\infty}\)-probability 1. The expression “\(C_0^{\infty}(A\vert w^{\infty})\)” is once more sloppy mathematics; it is short for the limit of the conditional credence of A when the condition infinitely grows into \(w^{\infty}\).

Instead of an ontologically conceived Humean Supervenience of chances on the actual particular facts, we thus have (KP) asserting the counterfactual knowability on the basis of counterfactual particular facts. We shall soon see how (KP) reduces to still more basic rationality constraints on \(C_0^{\infty}\). “Knowability” is perhaps too strong a word; strictly speaking, we can never know the chances, we can only be almost sure of them. However, (KP) captures all that (counterfactual) particular facts can tell us about chances; even counterfactually there is no more to know; (KP) is our best approximation to knowability.

I introduced (KP) only as our epistemological substitute for the misguided ontological Humean Supervenience of chances. In fact, (KP) follows from standard principles. So far, we have not yet explicitly considered relative frequencies. This is easily done, though. Let \(\mathit{rf}(A)(w^{\infty})\) stand for the limit (if it exists) of the relative frequency of the realization of the proposition A in the infinite random sequence \(w^{\infty}\). Then two further principles hold, namely the (strong) Law of Large Numbers:

$$(\mathrm{LLN})\quad \mathit{rf }(A)({w}^{\infty }) = {P}_{ w}(A)\ {P}_{w}^{\infty }\mbox{ -}\mathrm{almost}\ \mathrm{surely}\ \mathrm{for}\ \mathrm{all}\ {P}_{ w}\ \mathrm{and}\ \mathrm{all}\ A \subseteq {W}_{0},$$

and the so-called Reichenbach Axiom (recommended by Hilary Putnam to Carnap in 1953; cf. Carnap 1980, p. 120):

$$(\mathrm{RA})\quad {C}_{0}^{\infty }(A\vert {w}^{\infty }) = \mathit{rf }(A)({w}^{\infty })\ \mathrm{for}\ \mathrm{all}\ {w}^{\infty }\ \mathrm{and}\ \mathrm{all}\ A \subseteq {W}_{ 0},$$

which says that our beliefs should increasingly and in the limit perfectly align with the observed relative frequencies, whatever they are. (KP), (LLN), and (RA) form a triangle connecting credence, chance, and relative frequency. Among the three, (LLN) and (RA) are the more basic ones. (LLN) is not a rationality postulate, but a mathematical theorem. Moreover, given (LLN), (KP) obviously follows from (RA), but not vice versa, because the equality of (KP) holds only almost surely.
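The triangle can be watched at work in a simulation. The following is a minimal sketch under assumptions of my own choosing: the counterfactual runs are modelled as independent trials in which A is realized with a fixed chance of 0.3, and credence is given by Laplace's rule of succession, which is just one simple credence function that aligns with relative frequency in the limit. The function name `run` and the seed are hypothetical.

```python
import random

def run(p_true, n, seed=0):
    """Simulate n independent repetitions; return (relative frequency, credence)."""
    rng = random.Random(seed)
    hits = sum(rng.random() < p_true for _ in range(n))
    rel_freq = hits / n
    # Laplace's rule of succession: credence that A is realized in a further run
    # after observing `hits` realizations in n runs.
    credence = (hits + 1) / (n + 2)
    return rel_freq, credence

for n in (10, 1_000, 100_000):
    rf, c = run(p_true=0.3, n=n)
    print(n, round(rf, 4), round(c, 4))
```

As the number of runs grows, relative frequency approaches the chance (LLN), credence approaches relative frequency (RA), and hence credence approaches the chance (KP). A finite simulation can of course only suggest the limit behaviour, not establish it.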

Indeed, I find that de Finetti’s representation theorem fits perfectly to my counterfactual set-up, thus providing further insight into the Reichenbach Axiom. This is why I have emphasized at the beginning of this paper that I do hardly more than rearrange de Finetti’s philosophy of probability. The a priori credence \(C_0^{\infty}\) should be a symmetric measure over the product space, i.e., the event that n given propositions realize in the first n repetitions has the same credence as the event that these propositions realize in any other n repetitions. This seems even more compelling in our counterfactual set-up, where all repetitions are equal by fiat, than in any factual set-up. De Finetti’s representation theorem tells that all and only symmetric measures are mixtures of product or Bernoulli measures, indeed unique mixtures. Hence, symmetry entails the principle of non-negative instantial relevance (cf. Humburg 1971, p. 228). Moreover, given symmetry, (RA) is equivalent to the assumption that the support or carrier of the mixture is the space of all product measures. This in turn makes clear that, given symmetry, (RA) entails the principle of positive instantial relevance (cf. Humburg 1971, p. 233). This may suffice as a brief reminder of the familiar epistemological home of the Reichenbach Axiom and thus of the epistemological grounds of the Knowability Principle.
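Both symmetry and positive instantial relevance can be verified exactly for one particular mixture. The sketch below assumes, for illustration only, that each repetition either realizes A (1) or not (0) and that the mixture is uniform over all Bernoulli measures; the function names `credence` and `predictive` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def credence(seq):
    """Credence of an exact 0/1 sequence under a uniform mixture of Bernoulli measures."""
    n, k = len(seq), sum(seq)
    # The integral of p^k (1-p)^(n-k) over p in [0, 1] equals k!(n-k)!/(n+1)!.
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

def predictive(seq):
    """Credence that the next repetition realizes A, given the observed sequence."""
    return credence(seq + (1,)) / credence(seq)

evidence = (1, 0, 0)
# Symmetry: every reordering of the evidence receives the same credence.
symmetric = len({credence(s) for s in permutations(evidence)}) == 1
# Positive instantial relevance: one more positive instance raises the prediction.
before = predictive(evidence)
after = predictive(evidence + (1,))
print(symmetric, before, after)  # prints: True 2/5 1/2
```

The uniform mixture has the space of all Bernoulli measures as its support, which is the case in which, according to the text, symmetry yields (RA) and positive instantial relevance.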

My next point is that (KP) entitles us to project the credence \(C_0^{\infty}\) for \(W_0 \times W^{\infty}\), i.e., for the actual world and its infinitely many counterfactual repetitions, onto the credence \(C_0\) for \({W}_{0} \times \mathcal{P}\), i.e., for the actual world and its chance measure. The Projection Rule tells for each proposition \(A \subseteq W_0\) and each set \(\mathcal{Q}\subseteq \mathcal{P}\) of chance measures for W that:

$$(\mathrm{PROJ})\quad {C}_{0}(A \times \mathcal{Q}) = {C}_{0}^{\infty }(A \times \{{w}^{\infty }\vert {C}_{ 0}^{\infty }(.\vert {w}^{\infty }) \in \mathcal{Q}\}).$$

The Projection Rule thus says that a priori our credence that the true chance measure is in \(\mathcal{Q}\) (and that some factual proposition A holds) is the same as our credence (that A holds and) that the counterfactual infinite evidence \(w^{\infty}\) moves us into some state in \(\mathcal{Q}\).

Why is (PROJ) legitimate? (KP) says that for each possible \(P_w\) the set of \(w^{\infty}\) making \(C_0^{\infty}\) diverge from \(P_w\) is a \(P_w^{\infty}\)-null set. Due to its symmetry, however, \(C_0^{\infty}\) is a mixture of all the \(P_w^{\infty}\). Hence, the set of \(w^{\infty}\) making \(C_0^{\infty}\) diverge from all measures in \(\mathcal{Q}\) is also a \(C_0^{\infty}\)-null set, because its \(C_0^{\infty}\)-probability is a mixture of all the \(P_w^{\infty}\)-null sets involved. Note, again, that (PROJ) is not an ontological thesis reducing chance to counterfactual infinite sequences of factual worlds. The ontological slack between truth and evidence is ineliminable. However, the ontological slack has not the slightest epistemological weight and cannot surface in the epistemological rule (PROJ); it is a genuine ‘don’t care’.

The upshot of these considerations is that the Minimal Principle is an immediate consequence of the Projection Rule. Take \(\mathcal{Q} = \{{P}_{w}\vert {P}_{w}(A) = x\}\). Then (PROJ) specializes to

$${C}_{0}(A\vert {P}_{w}(A) = x) = {C}_{0}^{\infty }(A\vert \{{w}^{\infty }\vert {C}_{ 0}^{\infty }(A\vert {w}^{\infty }) = x\}) = x.$$

And this is nothing but (MP), which we have seen is all we need together with (RED) (and (AdP)) to duplicate Lewis’ account. Thus, the replacement of the ontological doctrine of Humean supervenience by the epistemological Knowability Principle (which backed up the Projection Rule) at the same time replaces the conflict with (OP) by a confirmation of (OP).28

I find it illuminating to cast the point into a somewhat different form. For this purpose, we have to introduce the final player of my scenario, van Fraassen’s so-called Reflection Principle. It is entirely about subjective probability. There we have static rationality postulates like Coherence or the axioms of mathematical probability, Regularity, Symmetry, etc., and we have dynamic rationality postulates, the best known of which is, of course, the Rule of Conditionalization. Perhaps the most basic of these dynamic postulates is the Reflection Principle:29

$$(\mathrm{RP})\quad {C}_{t}(A\vert {C}_{{t}^{{\prime}}}(A) = x) = x.$$

Here, \(C_t\) is the subject’s credence or subjective probability at time t, and it is understood that t′ is later than t. In other words, \(C_t\) specifies the prior and \({C}_{{t}^{{\prime}}}\) the posterior probabilities of the subject. The Reflection Principle thus says: Given the condition that my future probability for some proposition is x, my present probability for it is also x. In short: I trust now what I assume to be my future belief.

It is clear why (RP) is called an auto-epistemic principle; it assumes that my future beliefs are the objects of my present beliefs (even if only as a supposition). If one accepts the richer auto-epistemic framework, then (RP) proves to be a most general dynamic doxastic law entailing conditionalization and its generalizations; it is even amenable to a Dutch Book justification (cf. Gaifman 1988; Hild 1998b). It is also obvious that (RP) is a rationality postulate of restricted validity. For instance, I should not now trust the future beliefs I will have when drunk, and when now reading the newspaper I should believe (within limits) what I have read even given that tomorrow I will have forgotten what I have read. Hence, I should reasonably trust only those of my future beliefs that I have acquired in a reasonable fashion and that I entertain from a superior point of view, which is certainly provided by experience (and maybe in other ways as well).

The similarity between the Minimal and the Reflection Principle strikes the eye, though they are about different subject matters. However, the similarity is easily turned into entailment. Take (RP), replace \(C_t\) by the ‘first’ a priori credence \(C_0^\infty\) and \({C}_{{t}^{{\prime}}}\) by the ‘last’ credence \(C_0^\infty(. \mid w^\infty)\), counterfactually completely informed. (RP) thus specializes to

$$(\mathrm{{RP}}^{\infty })\quad {C}_{ 0}^{\infty }(A\vert {C}_{ 0}^{\infty }(A\vert {w}^{\infty }) = x) = x.$$

Note that \((\mathrm{RP}^{\infty})\) is in fact a theorem, not merely a rationality postulate. As above, (PROJ) finally turns \((\mathrm{RP}^{\infty})\) into (MP).30 To summarize: in the counterfactual ‘future’ we are completely informed about the counterfactual manifestations of the propensities in w of particular facts; thus completely informed, we can infer the chances in w; and hence (MP) turns out to be a special case of (RP).
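That a principle of this form is a theorem rather than a postulate, once the later credence arises from the earlier one by conditionalization, can be checked in a toy model. The sketch below is only an illustration under assumptions not found in the text: a finite set of worlds, a strictly positive prior, and a later credence obtained by conditioning on the cell of a given evidence partition. The names `posterior_of` and `reflection_holds` are hypothetical. It verifies with exact arithmetic that the prior credence in A, conditional on the later credence in A being x, equals x.

```python
from fractions import Fraction

def posterior_of(prior, partition, A):
    """Map each world to the credence in A after conditioning on that world's evidence cell."""
    post = {}
    for cell in partition:
        p_cell = sum(prior[w] for w in cell)
        p_A_in_cell = sum(prior[w] for w in cell if w in A)
        for w in cell:
            post[w] = Fraction(p_A_in_cell) / p_cell
    return post

def reflection_holds(prior, partition, A):
    """Check prior(A | later credence in A equals x) == x for every attainable value x."""
    post = posterior_of(prior, partition, A)
    for x in set(post.values()):
        # The proposition "my later credence in A is x" is a union of evidence cells.
        event = [w for w in prior if post[w] == x]
        p_event = sum(prior[w] for w in event)
        p_A_and_event = sum(prior[w] for w in event if w in A)
        if Fraction(p_A_and_event) / p_event != x:
            return False
    return True

if __name__ == "__main__":
    # Hypothetical four-world example with a non-uniform prior.
    prior = {1: Fraction(1, 10), 2: Fraction(2, 10), 3: Fraction(3, 10), 4: Fraction(4, 10)}
    partition = [{1, 2}, {3, 4}]
    print(reflection_holds(prior, partition, A={1, 3}))
```

The check succeeds for any choice of prior, partition and proposition in this setting, because the conditioning event is a union of cells on each of which A has the same conditional credence x. The substantive step in the text is therefore not the reflection property itself but the move, via (PROJ), from completely informed credence to chance.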

Humean Projection

What is the significance of these mathematically trivial transformations? If projectivism is the doctrine that some objective traits of the world can only be understood as objectified projections of human attitudes, how does the previous section support projectivism concerning chances? To recapitulate, the story is as follows: We postulate chances, and we know that they are different from our subjective probabilities. Yet, we also know the rational shape of our credences and we know how we change and improve them. We know according to (KP) that we cannot say anything better than that the chances are what our credences would be after that infinite counterfactual information, not by necessity, but with probability 1. And we know according to (PROJ) that we may identify our credences about chances with our credences about that counterfactual information and what we learn from it. We are aware of the ontological gap between chance and credence, but our epistemological bridge over it leaves nothing to be desired. In this sense I take chance to be a projection from credence.

Jeffrey (1965, Section 12.7) discusses the general idea that objective probabilities are objectified subjective ones, and in (2004, p. 19) he says, referring to Hume, that “chances are simply projections of robust features of judgmental probabilities from our minds out into the world” (his emphasis). Maybe he had the same picture in mind as the one developed here. However, objectification as he describes it in Jeffrey (1965, Section 12.7) is admittedly not very objective; it just means conditioning subjective probabilities wrt the true member of some partition of the possibility space considered (or the limit of these conditionings wrt a sequence of the true members of ever finer partitions). Of course, the result depends on the initial subjective probabilities as well as on the chosen partition. Jeffrey argues that this latitude has some advantages, but it seems clear that the general idea needs refinement.

Lewis (1980, pp. 278f.) is pleased that his account may be understood as offering such a refinement. According to (OP), it is the history-chance partition, as he calls it, which is the correct objectifying partition, and according to (OP) it is more simply the chance partition consisting of all \(T_w\) themselves. Skyrms (1980, Section IA4) makes, in effect, the same proposal, though he opts for more pragmatic flexibility than Lewis and rather hides the chance nature of his conditioning partition. It is a matter of taste whether one should call this a confirmation or a trivialization of Jeffrey’s general idea. In any case, Jeffrey (2004, p. 20) reminds us that “on the Humean view it is ordinary conditions, making no use of the word ‘chance,’ that appear” in the condition of (MP) or in the conditioning partition (my emphasis). Jeffrey insists on the point because otherwise his objectification idea has no prospect of offering an analysis of chance, a prospect Lewis (1980, pp. 288ff.) explicitly denies.

So, how does Jeffrey’s general idea fare with Humean Projection as construed here? According to (PROJ) it is indeed the partition consisting of all \(T_w\) which is invoked in objectification; it is, however, to be conceived as the partition into all \(\{w^\infty \mid C_0^\infty(. \mid w^\infty) = P_w\}\). Hence, we have obeyed Jeffrey’s reminder; we have used ordinary conditions making no use of the word “chance”. Still, I am not sure whether Jeffrey would be satisfied. His examples always use partitions of the original possibility space W of particular facts, whereas I move to a partition of the possibility space \(W^\infty\) of infinite counterfactual repetitions of W. Only there can particular facts get as close to objective chances as they can get; and if this is so, then Jeffrey’s objectification within the space W can at best reach pragmatically weakened forms. The detour via \(W^\infty\) appears unavoidable to me.

My continuous massive invocation of counterfactuality may have raised, however, suspicions from the outset. Skyrms (1980, p. 31) has already warned that “attempts to construe propensities as modalized relative frequencies only make things worse in this regard” (his emphasis), the regard being the use of the law of large numbers as an analysis of propensity. Skyrms is right. We have seen that chances do not ontologically reduce even to propositions over the counterfactual space \(W^\infty\); the slack is ineliminable. However, \(W^\infty\) serves here only epistemological, not ontological purposes.

For the same reason I am not worried by Lewis (1994, p. 477), when he says “I think that’s a blind alley”, thereby referring to “thinking of frequencies not in our actual world, but rather in counterfactual situations” (in order to deal with his puzzling case of unobtainium). Within his set-up he is indeed right. There, relative frequencies in counterfactual situations can inform us about the actual world, only if we have ascertained beforehand that the counterfactual situation is governed by the same chance law as the actual world. Thus, we would have to solve, according to Lewis, the supervenience issue for the counterfactual situation in order to solve it for the actual world; and this merely defers the issue. However, this is not our problem. We do not have the telescope view onto counterfactual situations, to use Kripke’s terms; it was rather part of our counterfactual stipulation that all repetitions of W be governed by the same chance law \(P_w\); there is no need to ascertain the chance law of the repetitions. I do not see why this counterfactual stipulation should be illegitimate. We always think about counterfactual situations and what we would believe given this or that situation, and in order to get Humean Projection running in our way, we only consider extreme cases of this kind. Specializing, or extending, (RP) to \((\mathrm{RP}^{\infty})\), in order to derive (MP), is not a misuse of the Reflection Principle; it is an extreme, though legitimate use.

Well, it may be legitimate; still it hardly helps. Given the extreme counterfactual evidence we may be as certain about chances as we can. Our actual evidence, however, is infinitely poorer. Indeed doubly so; we can inquire into only a tiny part of our actual world and never into the counterfactual repetitions. The counterfactual construction may, and should, I think, satisfy philosophers, but it is of no use for scientists and statisticians who cannot do better than gather actual evidence and draw conclusions from this insufficient basis. This, however, is something to acknowledge, not to deplore. The philosophical account provides the ideal standard, and it then is a methodological issue how best to approximate the ideal within our factual limits. Statisticians have developed most sophisticated test methods, of which randomization is an important part. But there are also more general preconceptions:

In principle, the scheme of partial determination governing our world may be any chance measure whatsoever. In principle, the whole world has the propensity to move into this or that state, and propensities may vary from here to there and from now to then. In our counterfactual scenario we could discover any wild distribution of chances, but in the actual world we want to understand the ‘mechanics’ of partial determination. The ground rule is: equal causes, equal effects; or rather, equal conditions, equal propensities – which gets bite only by restricting “equal”. The relevant conditions should be few, not many. If we are lucky, we have kept constant all relevant conditions during a row of some thousand throws of some die, and then we may take the actual row as approximating the counterfactual sequence. The relevant conditions should be local, or contiguous, to use Hume’s term. Non-locality is one of the mysteries created by quantum mechanics. Crystal balls are miraculous for the same reason. I find it incoherent to say that a given type of event is only partially determined, but can be unfailingly foreseen with a certain crystal ball. Rather, I would then take these events as fully determined – but would not understand how determination, i.e., the crystal ball, works in these cases. If we are lucky, we shall be able to construe the chance law governing our world as a Markov process. If we develop different ideas about space and time, we have to adapt our preconceptions of the ‘mechanics’ of determination. And so forth.

If we do not succeed with our preconceptions, it is unclear how we would respond. In the extreme case, the idea of partial (or full) determination would dissolve. Thus it seems obvious to me that there is more to the notion of chance than just the Principal Principle. There are also all these preconceptions connecting chance with space and time, simplicity, orderliness, and whatnot. Such things are mentioned by Arntzenius and Hall (2003, pp. 177f.) when they arrive at the same conclusion. These preconceptions are modifiable, but only within limits; beyond these limits the notion of chance will crumble.

Do such considerations reintroduce the epistemological double standard of which I have accused Lewis in the section Humean Supervenience? No. With regard to the ideal counterfactual evidence we can simply stick to de Finetti’s story of the symmetric a priori credence satisfying the Reichenbach axiom and thus converging almost surely to the true chance measure, whatever it may be. Here, we do not need help from the additional considerations just mentioned. We have to rely on them when and because we try to make sense of our very restricted evidence. Thus, the second epistemological story that I have just indicated does not interfere with, but rather complements, the account I have extensively presented.

Since we have sacrificed Humean Supervenience, we also have avoided the ontological double standard and the resulting conflict between (OP) and (NP). We can, and do, simply stick to (OP) and reject (NP) as nonsense.

However, if we sacrifice (HS), we cannot do so without considering Lewis’ two main reasons for it. The one consists in his ontological preferences. Without doubt, if (HS) were true, the resulting ontological picture would be most elegant and satisfying. Those rejecting Humean supervenience have different preferences and acknowledge irreducible dispositions, capacities, causes, necessities, or propensities. The projectivist, in particular, has a special story to tell about these matters that explains them ultimately with our subjective condition without diminishing their objectivity. I do not think that this ontological dispute can be resolved with general arguments. It is a matter of details, and there we have at least seen that Lewis had difficulty maintaining his prima facie elegance.

His second reason, though, is more pertinent and more urgent. It is best put in Lewis (1986, pp. xvf.):

I could admit that the chances do not supervene on the arrangement of qualities. Why not? I am not moved just by loyalty to my previous opinions. That answer works no better than the others. Here again the unHumean candidate for the job turns out to be unfit for its work. The distinctive thing about chances is their place in the ‘Principal Principle,’ which compellingly demands that we conform our credences about outcomes to our credences about chances. I haven’t the faintest notion how it might be rational to conform my credences about outcomes to my credences about some mysterious unHumean magnitude. Don’t try to take the mystery away by saying that this unHumean magnitude is none other than chance! I say that I haven’t the faintest notion how an unHumean magnitude can possibly do what it must do to deserve the name – namely, fit into the principle about rationality of credences – so don’t just stipulate that it bears that name. Don’t say: here is chance, now is it Humean or not? Ask: Is there any way that an unHumean magnitude could [fill the chance-role]? the answer is ‘no’

He repeats the point in Lewis (1994, pp. 484f.) with more confidence, having been shown a way out of the paradox of undermining futures generated by (HS).

His own response to this challenge is (HS). It is no mystery how particular facts constrain credence; and if chance supervenes on particular facts, it is in principle no mystery how chance constrains credence. And thus he sets out to remove paradox by modifying (OP). Right at the beginning of the section Chance-Credence Principles I indicated that this is the basic puzzle affecting the Principal Principle. The quotation indeed suggests that Lewis thinks that (HS) is the only solution of the puzzle (even though his challenge is directed foremost to the position of David Armstrong). How may the projectivist respond?

For the projectivist the puzzle has a straightforward solution.31 This is clear from his general strategy. For him, chances are not alien features cognitive access to which is bound to be mysterious; they are of our own breeding. We need not speak figuratively, though; we have prepared a precise answer. Lewis is right; there is no mystery how particular facts constrain credence. However, van Fraassen is also right; there is in principle no mystery how future credence can constrain present credence. And we have seen that according to the projectivistic reconstrual the Principal Principle is nothing but an extreme application of the Reflection Principle. This was the whole point of my construction in the previous section. To be sure, in that application a priori credence is constrained by an extremely counterfactual ‘future’ credence. However, it is mostly counterfactual future credence to which (RP) applies, and we should certainly not bother about being more or less extreme. In this way, the projectivist is able to remove the puzzling air from the Principal Principle. Chance, being almost surely identical to projected credence objectified, must constrain a priori credence precisely in the way summarized in (OP).

Appendix on Ranking Functions and Deterministic Laws: The Same All Over Again

The whole of this paper immediately and perfectly carries over to full determination or natural necessity and deterministic laws. Lewis tells the same story, this story meets the same criticism, and I have a precise projectivistic substitute story. Indeed, all this is more or less a matter of routine; I do not have to write a twin paper. Let me just indicate the basic points.

A very common, and also Lewis’, assumption is that laws are regularities which in turn are mere generalizations expressed by universally quantified sentences. However, not all regularities are laws; we have to be selective. Lewis offers his best-system analysis of laws in order to discriminate them from mere regularities. He thinks laws Humeanly supervene on particular facts, and he constrains the supervenience of laws in the same way as that of chance. The only point missing is that the Principal Principle and the ensuing discussion have no explicit deterministic counterpart.

The problems remain. (HS) is again ontologically as well as epistemologically constrained. Carroll (1994, Chapter 3) and Ward (2002, Section 3) attempt to specify examples of two worlds in which the same facts, but different laws obtain. Black (1998, p. 376) suggests “that laws can undermine themselves, in that the laws of the universe might allow that the laws of the universe could have been otherwise.” Hence, it looks like we are running into the same kind of problems with deterministic laws as we have extensively discussed here with respect to chance. Lewis’ reasons for sticking to (HS) are also the same. Without (HS) we could not understand the idea of necessitation. Hence, the dialectic situation is as before. What, though, could be the constructive alternative? This is indeed much less clear than in the probabilistic case where subjective and objective probabilities and their delicate relation are perhaps not fully understood, but familiar for a very long time.

I think the basic mistake lies already in the common assumption that laws are (a special kind of) regularities. In this respect, laws are much more deceptive than chances. One immediately sees that chances are modalities; they take propositions as arguments and somehow assign numbers to them. By contrast, laws appear to be mere propositions, and modality is prima facie not involved. Any subsequent mounting of modality is then bound to create mysteries. The alternative, though, is not to start with a primitive necessitation operator, as Armstrong does in his analysis of lawhood. This is no less mysterious. Also, it will not do to conceive of deterministic laws as a limiting case of chance laws, not only for the reason that a chance of 1 is not quite necessitation. This is not the place, however, to go through all the various accounts of lawhood. Let me just say that I believe that the alternative must be somehow to tell the same kind of story as we did in the probabilistic case. But how?

The answer is: with the help of ranking functions (first presented in Spohn (1983, 1988), where I called them ordinal conditional functions). As in probability, we must start with the subjective side, with the representation of belief. This is what a ranking function does. A ranking function κ for a given possibility space W is a function from W into the set of nonnegative integers such that κ(w) = 0 for some \(w \in W\). The ranking is extended to propositions \(A \subseteq W\) by defining \(\kappa(A) = \min\{\kappa(w) \mid w \in A\}\). And conditional ranks are defined by \(\kappa (B\vert A) = \kappa (B \cap A) - \kappa (A)\).

Ranks are degrees of disbelief. κ(A) = 0 says that A is not disbelieved at all; κ(A) = n > 0 says that A is disbelieved to degree n. Hence, \(\kappa (\overline{A}) >0\) expresses that \(\overline{A}\) is disbelieved (to some degree) and hence that A is believed (to the same degree). Thus, ranking functions, unlike probability measures, represent belief (acceptance, holding to be true). This is their most distinctive feature due to which they can be related to deterministic as opposed to probabilistic laws. Unlike doxastic logic, or even AGM belief revision theory, ranking functions can also account for a full dynamics of belief; this means at the same time that they embody a full inductive logic. Basically, this dynamics consists in conditionalization, just as in probability theory.32 The reason why this works perfectly is that conditional ranks as defined above behave almost exactly like conditional probabilities. Indeed, the parallel extends much farther. Practically all virtues of Bayesian epistemology can be carried over to ranking functions. (For a fuller explanation of these claims see Spohn 1988, 2009.)
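The definitions of ranks, conditional ranks and belief are easily made concrete. The following Python sketch is merely illustrative: the three-world space, the particular ranks, and the helper names `rank`, `conditional_rank` and `believes` are hypothetical, and assigning rank infinity to the empty proposition is a convention assumed here, not one stated in the text.

```python
import math

def rank(kappa, A):
    """kappa(A) = min{kappa(w) | w in A}; the empty proposition gets infinity (assumed convention)."""
    return min((kappa[w] for w in A), default=math.inf)

def conditional_rank(kappa, B, A):
    """kappa(B | A) = kappa(B ∩ A) - kappa(A), defined only when kappa(A) is finite."""
    r_A = rank(kappa, A)
    if math.isinf(r_A):
        raise ValueError("conditional rank undefined: the condition has infinite rank")
    return rank(kappa, set(B) & set(A)) - r_A

def believes(kappa, A):
    """A is believed iff its complement is disbelieved, i.e. has positive rank."""
    complement = set(kappa) - set(A)
    return rank(kappa, complement) > 0

# Hypothetical ranking function over three worlds; some world must have rank 0.
kappa = {"w1": 0, "w2": 1, "w3": 3}
assert min(kappa.values()) == 0

if __name__ == "__main__":
    print(rank(kappa, {"w2", "w3"}))                      # rank of a proposition
    print(conditional_rank(kappa, {"w3"}, {"w2", "w3"}))  # rank of w3 given "w2 or w3"
    print(believes(kappa, {"w1", "w2"}))                  # complement {w3} is disbelieved
```

In this example the proposition {w1, w2} is believed because its complement {w3} has positive rank, whereas {w2, w3} is not believed, since its complement contains the rank-0 world w1. Conditioning on {w2, w3} shifts ranks down by κ({w2, w3}), which is the sense in which conditional ranks mirror conditional probabilities.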

One thing we can now do, for instance, is to state the Reflection Principle in ranking terms:

$$(\mathrm{RP}\kappa )\quad {\kappa }_{t}(A\vert {\kappa }_{{t}^{{\prime}}}(A) = n) = n,$$

which says that given you disbelieve A tomorrow to degree n, you so disbelieve it already today. (RPκ) is indeed a strengthening of Binkley’s principle.33 All the remarks about the probabilistic version (RP) apply here as well.

I have emphasized that ranking functions must be interpreted as representing doxastic states. They represent what a subject takes to be true or false, but they are not true or false themselves. However, to some extent they can be objectified so that it makes sense to apply truth and falsity to them, just as to propositions. How this objectification works is a somewhat tricky story elaborated in Spohn (1993). According to this story, most ranking functions cannot be objectified. This appears to be different with chances. Any credence measure for W could, it seems, also serve as a chance measure for W. But maybe not. We have seen above that there is more to chance and that one might, for instance, suggest that only probability measures representing a Markov process can be chance measures.

Anyway, what I have proposed in Spohn (1993) is that causal laws or, in the present terms, schemes of full determination are just such objectifiable ranking functions, a view I have philosophically more thoroughly explained in Spohn (2002). The crucial point is that the inductive behavior is thus directly built into laws and not subsequently imposed on something propositional. Moreover, for laws so conceived we can tell de Finetti’s complete story as shown in detail in Spohn (2005): If the ranking function κ is such a scheme of full determination for W, we can again form the infinite product space \(W^\infty\) and the product ranking function \(\kappa^\infty\) independently repeating κ infinitely many times. Any symmetric ranking function over \(W^\infty\) is then a unique mixture of such product ranking functions, which will converge to the true law ( = product ranking function) with increasing evidence. Hereby, the role relative frequencies have in the probabilistic case is taken over by the number of exceptions in the deterministic case.

In sum, we have here all the ingredients for telling exactly the parallel story about necessitation or full determination as we have told about partial determination. Deterministic laws are, in the way explained, projections of ranking functions, i.e., of subjective states representing beliefs and their dynamics.

Notes

1 I have written a minor note, Spohn (1987), which foreshadows the general line of thought, and a German paper Spohn (1999), of which the present paper is a substantial elaboration.

2 There presumably are deep connections between metaphysical and natural necessity. Still, the two kinds of necessity must at first be kept apart. Metaphysical necessity is tied up with identity and existence, natural necessity is not, prima facie. Here, I shall deal only with the latter without worrying about its connection to the former.

3 Similar phrasings may be found in Popper (1990, pp. 18f.) and Miller (1995, p. 138).

4 For instance, Fetzer (2002) shares realism about propensities, but responds to such concerns by embedding propensities into an embracive account of explanation and abductive inference. While I am in sympathy with his general approach, I do not want to explicitly enter the topic of explanation. Of course, that topic is tightly interwoven with our present one, but it has its own intricacies, in particular, when it comes to saying what ‘the best explanation’ might be. As far as I can see, we shall be able to side-step these intricacies here without loss.

5 My reference book is Rosenthal (2004), which offers forceful criticisms of prominent variants of the propensity interpretation.

6 Logue (1995) apparently pursues the same goal. However, he insists on having only one notion of probability, a personalistic one, and he does not present an explicit projectivistic construal of objective probability. The only further probability book where the idea is taken up is Rosenthal (2004, pp. 199ff.). In fact, the challenge of understanding objective probability as it is built up in this book in a most pressing way provoked me to elaborate my (1999) into the present paper.

7 Concerning deterministic laws, Ward (2002) also claims to give a projectivistic account which he extends to chance in Ward (2005). However, while I agree with his critical diagnosis, our constructive approaches widely diverge, as will become clear at the end of this paper.

8 Often, direct inference is more narrowly understood as the more contested ‘straight rule’ that recommends credence to equal observed relative frequency.

9 The puzzle is vividly elaborated by Rosenthal (2004, Section 6.3).

10 I shall even prefer sentential over set theoretical representations of propositions.

11 This is Constraint 2 of Skyrms (1980, pp. 163–165), applied to degrees of belief and propensities.

12 Even at the risk of appearing pedantic, let me at least once note what the correct set-theoretic representation of (MP*) is. There, credence is not about facts and chance, but rather about facts and evolutions of chance, i.e., about \(W \times \mathcal{P}^{T}\), where T is the set of points of time. (Only at the end of the next section shall we be able to return to our initial simpler conception of credence.) (MP) then says that \({C}_{0}(A\vert \{\pi \in \mathcal{P}^{T}\vert \pi (t)(A) = x\}) = x\), where the condition consists of all those evolutions of chance according to which the chance of A at t is x.

13 Hall (2004, p. 103) arrives at another definition of admissibility. However, it is clear that his move from his (3.12) to his (3.13) offers another sufficient condition the necessity of which is not argued for. In any case, his definiens entails mine, but not vice versa. It is unfortunate that my paper was essentially finished before I could become aware of this paper of Ned Hall, which covers much of the same ground as mine, though with different twists and conclusions. Thus, my comparative remarks will be confined to some footnotes.

14 Of course, I am presupposing classical time throughout. I do not venture speculating about the consequences of relativistic time for our topic.

15 Proof: According to (CP) we have \({C}_{0}(A\vert {H}_{\mathit{wt}}\ \&\ {P}_{\mathit{wt}}(A\vert {H}_{\mathit{wt}}) = x) = x\). According to (DP), \({P}_{\mathit{wt}}(A\vert {H}_{\mathit{wt}}) = x\) expresses the same proposition as \({P}_{\mathit{wt}}(A) = x\). Hence, we also have \({C}_{0}(A\vert {H}_{\mathit{wt}}\ \&\ {P}_{\mathit{wt}}(A) = x) = x\). This just says that \({H}_{\mathit{wt}}\) is admissible wrt A in w at t. Since any t-historical E is the disjoint union of some \({H}_{\mathit{wt}}\), E is admissible, too, wrt A in w at t.

16 For a recent reinforcement of the problem see Hájek (2007).

17 This follows from the graphoid axioms for conditional probabilistic independence; cf., e.g., Spohn (1978, pp. 102f.).

18 Why is P now double-indexed? Because we have to say for the world w not only what the chances are in w at t (\(= {P}_{\mathit{wt}}\)), but also what the chances would have been if \({H}_{\mathit{vt}}\) had been its history up to t (\(= {P}_{w,\mathit{vt}}\)).

19 This is different from the identification of probabilities of conditionals with conditional probabilities, of which Lewis (1976) has warned us.

20 The satisfiability of the consistency requirement is obvious in the case of discrete time with a first point of time. In the other cases one has to allude to convergence theorems for descending martingales; cf., e.g., Bauer (1968, p. 281).

21 Hall (2004, p. 96) undertakes the same reduction. \(P_w\) is what he calls ur-chance.

22 If this algebra were a complete one, this translation of (HS) would indeed be correct.

23 We might, of course, strengthen (HS) to the effect that chances at t supervene on no more than factual history up to t; then chance information is only about the past, and the paradox cannot arise. Lewis (1980, pp. 291f.) already mentions this option before clearly seeing the paradox. In 1986 (p. 131) he even expresses a preference for it after recognizing the paradox. In 1994 (Section 6) he finally rejects it, rightly in my view.

24 Proof: Take (CP), specialize B to \({H}_{\mathit{wt}}\ \&\ {T}_{w}\), and apply (DP) for omitting \({H}_{\mathit{wt}}\) from the condition of \({P}_{\mathit{wt}}\). Then you get \({C}_{0}(A\vert {H}_{\mathit{wt}}\ \&\ {T}_{w}\ \&\ {P}_{\mathit{wt}}(A\vert {T}_{w}) = x) = x\), which is nothing but (NP), since \({H}_{\mathit{wt}}\ \&\ {T}_{w}\) entail \({P}_{\mathit{wt}}(A\vert {T}_{w}) = x\).

25 Cf. Arntzenius and Hall (2003, pp. 176f.) for a fuller explanation of the point.

26 Sturgeon (1998) argues that the restrictions put on (HS) are indeed incoherent, however they are specified. Hall (2004, pp. 108ff.) also critically discusses how the inference of chances from facts is supposed to go.

27 Loewer (1996, pp. 114f.) also discusses the point and recommends rigidification.

28 This refers to Stalnaker (1978).

29 The point is indeed one of deep and general importance. It applies, I believe, to objecthood in general, certainly a most fundamental matter. We cut up the world into pieces, we constitute objects by saying which properties or kinds of properties are essential or constitutive for them. (This allows for the case that we fix only a space of possible essential properties of an object and leave it to the actual world to fix the actual essential properties.) Still, the objects thus constituted are objects independent of us, their objectivity is in no way impaired by our constituting them, in particular because we constitute objects in such a way that our constituting is not essential for them. The point extends to properties. In two-dimensional semantics each predicate expresses a (derigidified) concept and denotes a (rigidified) property, and while most concepts are, as I say, a priori relational, only few properties are necessarily relational – two notions of relationality that are particularly relevant vis à vis color predicates; cf. my (1997, pp. 367ff.). This footnote indicates the direction into which this paper would need most to be further thought through.

30 This section elaborates the core of the predecessor paper Spohn (1999).

31 Hall (2004, pp. 108f.) envisages the same kind of argument, also with reference to de Finetti’s representation theorem, though without actually endorsing it. He ascribes the argument to a position he calls ‘primitivist hypothetical frequentism’, which, however, is not mine. As he describes it, this kind of frequentist equates chance with limiting hypothetical relative frequency and considers it to be a brute metaphysical fact that this equation is correct. By contrast, I emphasized the almost unnoticeable epistemological-ontological gap, and I do not see the necessity to close it by fiat.

32 The Reflection Principle is explicitly stated in van Fraassen (1984); there its deep philosophical relevance was fully recognized. He returns to it at length in van Fraassen (1995). Other references are Goldstein (1983) and Spohn (1978, pp. 162f.) where I stated an equivalent principle (called the Iteration Principle by Hild 1998a, p. 329) within an auto-epistemic or reflexive decision-theoretic setting and under the restrictions usually accepted nowadays. Penetrating discussions may in particular be found in Hild (1998a, b).

33 Skyrms (1980, Appendix 2) already observed that there is a common form to such principles that is open to various interpretations. Following Gaifman (1988), the common form might be called ‘expert principle’, since it describes trust in some kind of expert. For this unified view see in particular Hall (2004) and Hájek (2007). However, it is only our Projection Rule which establishes an entailment between the expert principles considered here, i.e., (RP) and (MP).

34 As mentioned in footnote 31, Hall (2004, p. 109) also envisages the solution defended here (with some doubts concerning its general feasibility). However, he envisages it only as a possibility in order to prove the point he is after in his paper, viz., that the reductionist claiming (HS) need not have an advantage over the non-reductionist vis à vis this issue. For him (cf. p. 107), a no less acceptable response seems to be to declare the Principal Principle analytic and to reject any further justificatory demands. As I have explained in the section on Chance-Credence Principles, this will not do. We have a real challenge here which requires some substantial response.

35 The idea that belief is just probability 1 is not only intuitively unsatisfactory, but also theoretically defective, because conditionalization does not work for extreme probabilities and beliefs could then only be accumulated and never revised. (Popper measures solve this problem only halfway, as does AGM belief revision; see Spohn 1986.) This is the essential reason why it does not work to correspondingly conceive deterministic laws as limiting cases of chance laws.
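
The difficulty can be made explicit with the standard ratio definition of conditional probability, which is assumed here as background; the symbols \(P\), \(B\) and \(E\) are illustrative and not taken from the text. Suppose belief in \(B\) is identified with \(P(B) = 1\), so that \(P(\neg B) = 0\). Then, for any evidence \(E\) with \(P(E) > 0\):

\[
P(B\vert E) = \frac{P(B\ \&\ E)}{P(E)} = \frac{P(E)}{P(E)} = 1,
\qquad
P(\,\cdot\,\vert \neg B) = \frac{P(\,\cdot\ \&\ \neg B)}{P(\neg B)} \ \text{ is undefined, since } P(\neg B) = 0.
\]

The first equation holds because \(P(\neg B\ \&\ E) \leq P(\neg B) = 0\). So no evidence of positive probability can lower the belief in \(B\), and evidence contradicting \(B\) cannot be conditionalized on at all, which is the sense in which such beliefs can only be accumulated.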

36 It says that if I believe now that I shall believe tomorrow that p, I should already now believe that p. Binkley (1968) introduced it in relation to the surprise examination paradox.