Press and Dyson (2012) showed that in the two-player Iterated Prisoner's Dilemma (IPD), there exists a strategy by which a player X can outscore her evolutionary opponent Y, who, lacking a theory of mind about X, can only accede to X's extortion. Nevertheless, Press and Dyson stressed that having a theory of mind can help Y resolve X's strategy and turn the game into an ultimatum game. Therefore, given that both players have a theory of mind about the other, there exists no definite value-maximizing strategy for the IPD. In this paper, I shall argue that the same holds for the original Prisoner's Dilemma (PD): logic in itself does not suggest a value-maximizing strategy for the PD—defection—as many have believed.

Players of a PD game are no doubt in an embarrassing or 'dilemmatic' situation, and decision theorists have certainly done a good job analyzing it. Yet the modest goal of the present paper remains: to show that, even in a one-off PD game, logic does not tell us to defect.

1 The Prisoner’s Dilemma

An interesting fact about the Prisoner's Dilemma is that while many researchers, such as Robert TriversFootnote 1 and Robert Axelrod,Footnote 2 have drawn our attention to the far-reaching applicability of the IPD game and have explored all sorts of possible strategies for it, they seem to agree on one thing—the strategy for the original one-off PD is straightforward: defect, period. For example, commenting on Press and Dyson (2012), Stewart and Plotkin (2012, p. 10134) have this to say:

If the Prisoner’s Dilemma is played only once, it always pays to defect — even though both players would benefit by both cooperating.

And in Nowak and Highfield (2012, p. 29), Nowak holds this view as well:

In the single shot game, the one that I analyzed earlier in the discussion of the payoff matrix of the Prisoner’s Dilemma, it was logical to defect. (italics mine)

I think both of them are wrong in thinking that it is logical to defect in the PD game, and I will illustrate that it is indeed not easy to formalize the Prisoner’s Dilemma as a (logical) dilemma.

According to the Shorter Oxford English Dictionary (2007), a dilemma is

1. In RHETORIC, a form of argument involving an opponent in choice between two (or more) alternatives, both equally unfavourable. In LOGIC, a syllogism with two conditional major premisses and a disjunctive minor premiss.

On the face of it, PD fits both of these criteria, being of the form, “I cooperate or defect; if I cooperate then my deed is inconsistent with the fact that defection is always a better choice; if I defect then my deed is inconsistent with the fact that both players cooperating is a better choice; therefore a contradiction is inevitable.” But in fact it does not. Clearly, the ‘argument’ and ‘syllogism’ in the quoted entry are meant to be sound arguments, but, as we will see soon, the argument associated with the first horn of PD, i.e. the argument for defection, is not sound at all.

Let a1 and a2 be two persons involved in a PD situation, and let (R, S, T, P) be the rewards for a1 when the joint action of a1 and a2 is C1C2, C1D2, D1C2, and D1D2 respectively, where Ci and Di denote the cooperation and the defection of ai respectively, with the assumption that T > R > P > S and 2R > T + S. Presumably, the following two horns are involved in PD:

• Horn 1. C2 or D2; if C2 then N(D1); if D2 then N(D1); therefore N(D1). Similarly, there is an argument for N(D2).

• Horn 2. Yet the values associated with C1C2 are higher than those associated with D1D2.

Here N(D1) stands for "the rational being a1 should defect".
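Before moving on, here is a minimal sketch (my own illustration, not part of the original argument) encoding the payoff structure just described, with hypothetical values R = 3, S = 0, T = 5, P = 1 satisfying the stated constraints:

```python
# Hypothetical PD payoffs for a1, indexed by (a1's move, a2's move).
REWARDS = {
    ("C", "C"): 3,   # R: mutual cooperation
    ("C", "D"): 0,   # S: a1 cooperates while a2 defects
    ("D", "C"): 5,   # T: a1 defects while a2 cooperates (temptation)
    ("D", "D"): 1,   # P: mutual defection
}

R, S = REWARDS[("C", "C")], REWARDS[("C", "D")]
T, P = REWARDS[("D", "C")], REWARDS[("D", "D")]

assert T > R > P > S   # the ordering assumed in the text
assert 2 * R > T + S   # mutual cooperation beats alternating exploitation
```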

Now, rationality is indeed a highly praised value in itself, but to establish Nowak's claim that it was logical to defect, we need to formulate it in logical terms. What can "the rational being a1 should defect" mean? The best reading that I can think of is that a rational agent is one who will do whatever logic suggests she do, and N(D1) then simply means that D1 is a logical consequence of the antecedent, provided that agent a1 seeks to maximize her value and is aware of the antecedent.Footnote 3 So, in ideal cases such as the Prisoner's Dilemma in question, we can, for simplicity, assume that the agent a1 is rational, and reformulate the first horn of the Prisoner's Dilemma as follows, with no normative mode in sight.

Horn 1. We have a sound argument for the conclusion D1:

 

P0   C2 or D2
P1   If C2 then D1
P2   If D2 then D1
     ------------------------------   [Main]
     ∴ D1

The argument [Main] of Horn 1 is, on the face of it, an instance of Disjunction EliminationFootnote 4 and the premisses P0–P2 seem all true as well. So the Prisoner's Dilemma is prima facie a dilemma—it advises a1 to defect (by Horn 1) and not to defect (by Horn 2) at the same time.

However, if we look at [Main] more closely, we will see that the premisses P1 and P2 are actually supported by the following hidden arguments:

 

B1   If C2 then v(D1) > v(C1)
G    If v(D1) > v(C1) then D1
     ------------------------------   [Supportive 1]
     ∴ (P1) If C2 then D1

and

B2   If D2 then v(D1) > v(C1)
G    If v(D1) > v(C1) then D1
     ------------------------------   [Supportive 2]
     ∴ (P2) If D2 then D1

Here v(X1) stands for the reward that a1 receives when she adopts the strategy X.Footnote 5

How do we interpret the sentences in these arguments, especially the conditionals B1, B2 and G? The soundness of these arguments no doubt depends on how we interpret these conditionals. However, before we spell them out in detail in Sect. 3, let us briefly review, in the next section, the Kripkean possible world semantics and Ramsey’s Test that shall play an essential role in our analysis of the arguments.

2 The Possible World Semantics and Ramsey’s Test

Despite its great explanatory power, the Kripkean possible world semantics is often criticized on the ground that the so-called 'possible worlds' are unrealistic and difficult to pin down or grasp. However, for a description of the PD game—the subject of the present paper—the possible world semantics turns out to be a perfect tool, and all possible worlds can be explicitly described. There are altogether four possible worlds, C1C2, C1D2, D1C2, and D1D2, and the binary accessibility relation R on the universal set W of possible worlds is such that every world is accessible from every world (itself included). In other words, we are now in a position to analyze the PD in an S5 setting. The fact that S5 is by far the simplest modal system will help us spot a problem more easily, should there really be one.
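As a minimal sketch of this setting (mine, with the obvious encoding of worlds as pairs of actions), the S5 structure can be spelled out as follows:

```python
from itertools import product

# The four possible worlds of the PD, encoded as strings "XY" where X is
# a1's action and Y is a2's action. Under universal accessibility, "box"
# (necessity) quantifies over all four worlds.
WORLDS = [a + b for a, b in product("CD", repeat=2)]   # ['CC','CD','DC','DD']

def holds(prop, world):
    """prop = (action, agent): e.g. ('D', 1) reads 'a1 defects'."""
    action, agent = prop
    return world[agent - 1] == action

def box(prop):       # true iff prop holds at every accessible world
    return all(holds(prop, w) for w in WORLDS)

def diamond(prop):   # true iff prop holds at some accessible world
    return any(holds(prop, w) for w in WORLDS)

assert diamond(("D", 1)) and not box(("D", 1))   # a1's defection: possible, not necessary
```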

Another useful tool that we shall be referring to is Ramsey’s Test. Ramsey famously says the following about conditionals,

If two people are arguing ‘If p will q?’ and are both in doubt as to p, they are adding p hypothetically to their stock of knowledge and arguing on that basis about q; so that in a sense ‘If p, q’ and ‘If p, -q’ are contradictories.

— Ramsey (1990), p. 155, footnote 1

Here are two important observations relevant to the task of this paper.

First, in terms of the possible world semantics, when someone is assessing "if p, q" at world w, the adding of p (hypothetically) to her stock of knowledge amounts to a modification—more specifically, a shrinking—of the set of possible worlds accessible from w, so that it includes only those possible worlds in which p holds. This way of assessing a conditional has two consequences: (1) in evaluating a conditional, what we actually resort to is a set of possible worlds rather than a single possible world; (2) the set of possible worlds in question can vary from conditional to conditional.
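A small sketch (again mine, under the same encoding of worlds) of both consequences: the Ramsey evaluation of "if p, q" restricts the accessible set to the p-worlds, and the restricted set varies with the antecedent:

```python
# Ramsey's Test in the four-world PD model: evaluate "if p, q" by shrinking
# the accessible set to the worlds where p holds, then checking q throughout.
WORLDS = ["CC", "CD", "DC", "DD"]   # (a1's action, a2's action)

def ramsey_if(p, q, accessible=None):
    accessible = set(WORLDS) if accessible is None else accessible
    restricted = {w for w in accessible if p(w)}   # add p "hypothetically"
    return all(q(w) for w in restricted)

C2 = lambda w: w[1] == "C"   # a2 cooperates
D2 = lambda w: w[1] == "D"   # a2 defects
D1 = lambda w: w[0] == "D"   # a1 defects

# Different antecedents select different sets of worlds:
assert {w for w in WORLDS if C2(w)} != {w for w in WORLDS if D2(w)}
print(ramsey_if(C2, D1))   # False: among the C2-worlds, a1 cooperates at CC
```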

Second, when a person is asserting a conditional “if p, q” for which the consequent q itself involves the self-reflexive indexical “I”, things can become complicated.Footnote 6 She has two options—after adding p hypothetically to her stock of knowledge, she either supposes that the “I” in q has p in her stock of knowledge or she does not. These two options will be characterized as ‘subjective’ and ‘objective’ interpretations of such a conditional respectively.Footnote 7 The following pair of conditionals

S: If there is a bomb in this room, I will leave the room in no time,

O: If there is a bomb in this room, I will be blown into pieces,

can be true according to different interpretations. A subjective interpretation would make sentence S true, while an objective interpretation renders sentence O true. As a matter of fact, the subjective interpretation seems defeasible, because once challenged by “but, you might not know that there was a bomb!” one would normally withdraw his former assertion. Nonetheless, it is still an interpretation that we often adopt in our daily conversation.

More importantly, this phenomenon may happen in cases not involving a self-reflexive indexical as well. For example, concerning P1, we may ask ourselves: after C2 is added to our stock of knowledge, does the person a1 know about C2 too, so that she can make the decision to defect? For a subjective interpretation, the answer is "yes", but for an objective interpretation, the answer would be "no".Footnote 8

3 Is the Argument for Defection Sound?

Recall that the first horn of PD consists of three arguments [Main], [Supportive 1], and [Supportive 2]. Now, whether these arguments are sound hinges on how we interpret the conditionals.

The naive reading

To begin with, if we simply read the conditional “if p, q” as the material implication, and every sentence is supposed to be evaluated against a possible world (or, more properly, a truth assignment), then the three arguments become:

 

P0   C2 ∨ D2
P1   C2 ⊃ D1
P2   D2 ⊃ D1
     ------------------------------   [Main]
     ∴ D1

B1   C2 ⊃ v(D1) > v(C1)
G    v(D1) > v(C1) ⊃ D1
     ------------------------------   [Supportive 1]
     ∴ (P1) C2 ⊃ D1

B2   D2 ⊃ v(D1) > v(C1)
G    v(D1) > v(C1) ⊃ D1
     ------------------------------   [Supportive 2]
     ∴ (P2) D2 ⊃ D1

Apparently, all three arguments above are valid. However, on closer inspection we find that the formula v(D1) > v(C1), which plays a key role in the two supportive arguments, simply makes no sense at a possible world w. Given any w, only one of v(D1) and v(C1) is applicable, because in w the agent a1 either defects or cooperates, but not both. Therefore, the premisses P1 and P2 of [Main] are not supported by meaningful, let alone sound, arguments, and hence we find no reason to accept the conclusion D1 of the [Main] argument. Affixing the operator □ to all the material conditionals, that is, turning all material implications into strict implications, would not help either, because the mismatch between C2 and v(D1) > v(C1)—statements concerning different kinds of entities—remains.

Some may disagree with my claim that C2 ⊃ v(D1) > v(C1) makes no sense at a possible world w, because, apparently, given that C2 holds at w, in evaluating v(D1) > v(C1) we can simply resort to the set Rw of all possible worlds accessible from w, and v(D1) > v(C1) is true on Rw if the v(D1) of any world in Rw is greater than the v(C1) of any other world in Rw, whenever v(X) makes sense. However, this objection does not work. Recall that here we are interpreting the conditional B1 as a material conditional. So, in obtaining the Rw for the evaluation of v(D1) > v(C1), we have no way to impose the constraint that the possible worlds in Rw that we are interested in are only those at which the antecedent holds. Therefore, as we have seen in the previous section, Rw will consist of all four possible worlds C1C2, C1D2, D1C2, and D1D2, and clearly v(D1) > v(C1) would not hold—indeed, would not even make sense—on Rw.
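The point can be made concrete with a small sketch (mine, using the hypothetical payoffs from Sect. 1): at any single world the valuation v is partial, so the comparison v(D1) > v(C1) never has both of its terms.

```python
# At a single world, a1 performs exactly one action, so at most one of
# v(D1) and v(C1) is defined there.
REWARDS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}  # hypothetical R,S,T,P

def v(action, world):
    """a1's reward at `world` for playing `action`; None if a1 does not
    actually play `action` at that world."""
    return REWARDS[world] if world[0] == action else None

for w in REWARDS:
    vD1, vC1 = v("D", w), v("C", w)
    assert (vD1 is None) != (vC1 is None)   # exactly one term is ever defined
```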

In sum, the naive attempt—interpreting the conditional as either a material conditional or a strict conditional—fails in the following way, and we cannot reach the conclusion of defection.

 

                 Validity    Truth of all premisses
[Main]              ✓                  ×
[Supportive 1]      ✓                  ×
[Supportive 2]      ✓                  ×

3.1 The Ramsey–KSL Reading

To capture our intuition that, in evaluating B1, we only consider v(D1) > v(C1) for the worlds at which C2 holds, we have to incorporate Ramsey's idea into the Kripkean possible world semantics, and consider a Kripke–Stalnaker–Lewis (in short, KSL) style reading of the conditional. As a matter of fact, the Ramsey–KSL way of understanding the conditional B1 is quite natural, and we make such statements all the time. For example, the Gibbard–Harper style Causal Decision Theory can reckon "if C2 then v(D1) > v(C1)" meaningful on the ground that, given that the other player cooperates, my utility in the nearest world where I defect and he cooperates would be greater than my utility in the nearest world where we both cooperate. Alternatively, we can say that the sentence B1 is true at w if we search the set of all those worlds accessible from w in which a2 cooperates, compare the v(D1) of any world with the v(C1) of any other world in it, and find that the former is always higher than the latter. Either way, the antecedent sets a condition on the set of accessible possible worlds associated with a possible world, in contrast to setting a condition on the possible world itself. We can even say that the antecedent of B1 is essentially concerned with a possible set of (accessible) possible worlds rather than with a possible world. This is analogous to the following: "John's friends all know each other" is a condition on the set of John's friends, while "John is tall" is a condition on John himself.Footnote 9

Specifically, in the case of PD, our preferred truth condition for B1 is this: B1 is true at a possible set U of possible worlds if and only if, whenever U is a subset of the extension ||C2||, the v(D1) of D1C2 is higher than the v(C1) of C1C2, provided both values obtain. According to this reading, both B1 and B2 are surely true.
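A sketch of this truth condition (my own rendering, with the hypothetical payoffs): checking B1 against every nonempty set U of worlds, it comes out true, vacuously so whenever U ⊄ ||C2||:

```python
from itertools import chain, combinations

# B1 at a set U: if U ⊆ ||C2||, every defection-value for a1 in U must
# exceed every cooperation-value for a1 in U.
REWARDS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}  # hypothetical
WORLDS = list(REWARDS)

def B1_holds_at(U):
    if not all(w[1] == "C" for w in U):    # U ⊄ ||C2||: condition holds vacuously
        return True
    d_vals = [REWARDS[w] for w in U if w[0] == "D"]   # v(D1) where applicable
    c_vals = [REWARDS[w] for w in U if w[0] == "C"]   # v(C1) where applicable
    return all(d > c for d in d_vals for c in c_vals)

subsets = chain.from_iterable(combinations(WORLDS, n) for n in range(1, 5))
assert all(B1_holds_at(set(U)) for U in subsets)   # B1 is true, as claimed
```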

Now how about the premiss G, “if v(D1) > v(C1) then D1”? According to Ramsey’s Test, we should add v(D1) > v(C1) into our stock of knowledge and see whether we would accept D1. The adding of the antecedent into our stock of knowledge amounts to restricting our consideration to the nearest world whose set of accessible worlds is such that v(D1) > v(C1) holds. But, there are a couple of problems here.

First, in the Kripkean scheme, the set of accessible possible worlds for each possible world is the same, namely W. So, v(D1) > v(C1) does not hold at any possible world. We can either regard such a counter-possible conditional as automatically true, or regard it as meaningless. However, as we would certainly be reluctant to also regard "if v(D1) > v(C1) then C1" as true, the second option seems preferable. To make sense of the conditional G, we simply cannot stick to the default set of accessible worlds determined by the accessibility relation of Kripke's semantics. The natural choice would be, as we did for B1, to regard v(D1) > v(C1) as imposing a restriction on the set of possible worlds accessible from a possible world, so that G is to be evaluated against possible sets of accessible possible worlds rather than against possible worlds. Then, G is true with respect to a set of possible sets of accessible possible worlds if, for every possible set U of possible worlds on which v(D1) > v(C1) holds, D1 holds at the world from which the possible worlds in U are accessible. This is intuitively plausible, but we should bear in mind that, strictly speaking, the usual Stalnaker/Lewis style account of conditionals does not accommodate talk of this sort. A possible world paired with a restricted/modified set of accessible possible worlds seems to be what we should have at hand in order to check whether "not v(D1) > v(C1) or D1" holds, and G is true with respect to a set of such pairs if and only if "not v(D1) > v(C1) or D1" holds for every pair. In other words, G asserts that for every possible world whose modified set of accessible possible worlds is such that v(D1) > v(C1) holds, D1 holds. Again, as in the case of B1, while the traditional Stalnaker/Lewis style conditional restricts our attention to possible worlds at which the antecedent holds, G restricts our attention to those possible sets of accessible possible worlds for which the antecedent holds.

Granted that we can charitably interpret the conditional G so that our attention is restricted to pairs (U, w)—where w is a possible world and U is a possible set of possible worlds accessible from w—such that v(D1) > v(C1) holds at U, would it always be the case that D1 holds at w? This leads to a second concern, which involves the subtle distinction that we mentioned near the end of Sect. 2. We need to distinguish between an objective interpretation of G and a subjective interpretation of G, and whether a1 has privileged access to our knowledge of v(D1) > v(C1) will make all the difference.Footnote 10 The objective-minded do not assume that a1 has the antecedent, namely v(D1) > v(C1), in her stock of knowledge, yet the subjective-minded do. We are thus divided between the following two interpretations of G.

3.1.1 Objectively Conceived a1

Even if v(D1) > v(C1) holds for some set U of possible worlds, there is no guarantee that a1 knows this fact and would consequently defect. The assumption that a1 is a completely rational being does not help by itself, because a1's decision needs to be grounded in knowledge of whether v(D1) > v(C1) holds for U. Even if we assume that a1 always knows whether v(D1) > v(C1) holds for the set V of possible worlds that she has in mind, a1's action is independent of the state of U—after all, U and V are distinct sets—unless the state of U is a sort of public knowledge/regulation that everyone, in particular a1, is aware of. Therefore, G is in general false. As a result, while both [Supportive 1] and [Supportive 2] are valid arguments, they are unsound because they both contain a false premiss, G. In the same vein, the argument [Main] is unsound, as it contains two unwarranted premisses, P1 and P2. The problem with [Main] is worse than that, because it is not a valid argument to start with; we will come to that shortly.

In sum, according to this interpretation, we have

 

                 Validity    Truth of all premisses
[Main]              ×                  ×
[Supportive 1]      ✓                  ×
[Supportive 2]      ✓                  ×

3.1.2 Subjectively Conceived a1

If we grant G the subjective interpretation, so that a1 has an unrealistic, privileged access to the fact that v(D1) > v(C1),Footnote 11 and the set V mentioned earlier becomes the same as U, then G can indeed be accepted as true. In this case, while P0 is a truism by stipulation, P1 and P2 are both supported by sound arguments—both [Supportive 1] and [Supportive 2] are valid and all the premisses B1, B2 and G are true. Therefore, the premisses of [Main] are now all true indeed.

Nevertheless, the argument [Main] is still unsound, as it is not a valid argument in the first place. Under the present reading, P1 and P2 tell us that Rw ⊆ |D1| whenever Rw ⊆ |C2| and whenever Rw ⊆ |D2| respectively, while P0 only gives us Rw ⊆ |C2 ∨ D2|; and these together do not lead us to Rw ⊆ |D1|, for Rw may contain both C2-worlds and D2-worlds, in which case neither antecedent is triggered: the property of a set is not exhausted by the properties of its constituent subsets. So, according to this interpretation, we have

 

                 Validity    Truth of all premisses
[Main]              ×                  ✓
[Supportive 1]      ✓                  ✓
[Supportive 2]      ✓                  ✓

Therefore, a1 still has no logical reason to defect.
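A concrete countermodel (my own sketch) makes the invalidity of [Main] under this reading vivid: take a "mixed" accessible set Rw containing one C2-world and one D2-world, so that the premisses hold while the conclusion fails.

```python
# Countermodel to [Main] under the subjective reading: premisses P0-P2 hold
# of the mixed set Rw, yet Rw is not included in |D1|.
D1 = lambda w: w[0] == "D"
C2 = lambda w: w[1] == "C"
D2 = lambda w: w[1] == "D"

Rw = {("C", "C"), ("D", "D")}   # one C2-world, one D2-world

P0 = all(C2(w) or D2(w) for w in Rw)                          # Rw ⊆ |C2 ∨ D2|
P1 = (not all(C2(w) for w in Rw)) or all(D1(w) for w in Rw)   # if Rw ⊆ |C2| then Rw ⊆ |D1|
P2 = (not all(D2(w) for w in Rw)) or all(D1(w) for w in Rw)   # if Rw ⊆ |D2| then Rw ⊆ |D1|
conclusion = all(D1(w) for w in Rw)                           # Rw ⊆ |D1|

assert P0 and P1 and P2 and not conclusion   # true premisses, false conclusion
```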

By looking closely at [Main], an argument that has allegedly shown that defection is the logical choice in the PD, we have found that the argument is not sound after all. Apparently, there is no simple way in which Horn 1 can be construed as a sound argument in propositional modal logic, and it is reasonable to suspect that the PD is not a logical dilemma at all, unless someone can put Horn 1 another way and prove that it is indeed sound.

3.2 The Hi-World Reading

In the Ramsey–KSL reading, even though we can charitably interpret the argument [Main] and the supportive arguments in the spirit of the Stalnaker/Lewis conditional and discover that [Main] is indeed not a sound argument, some of the conditionals involved in the arguments still cannot be expressed in traditional modal-logical terms. This is because the conditionals involve a mixture of modalities of different levels, and the Kripkean semantics simply does not provide us with a tool to analyze them; for instance, it does not allow us to consider a possible world and a possible set of possible worlds at the same time. Incidentally, the recently developed hi-world semantics, which can in effect take care of any mixture of iterated modalities at one go, turns out to provide a new way of interpreting modal formulas, so that the arguments discussed in the preceding subsection can be properly formulated in terms of the usual modal formulas and receive an interpretation that captures our intuition concerning the Ramsey–KSL reading. In this subsection, we will formulate the arguments in question in terms of hi-world semantics and see from another angle why the argument [Main] is unsound.

In Becker (1952), the German logician O. Becker proposes that we should be clear about whether a sentence is to be evaluated at a case (a possible world) or a case class (a set of possible worlds, or an iterated set of possible worlds). In particular, a conditional can be concerned with a possible world w or a set U1 of possible worlds. A material implication α ⊃ β concerns a possible world, while a strict implication α → β ≡ □(α ⊃ β) concerns a set of possible worlds. Failing to make such a distinction can lead us to wrongly accept the validity of "Obama is not here. /∴ If Obama is here, he will buy everybody a drink." Clearly, the premiss here is concerned with a world, the actual world, while the conclusion is concerned with a set of possible worlds.

However, things can be more complicated than that. As we may encounter sentences such as □(□α ⊃ β), Becker's idea of separating w and U1 proves too naive: there can be all sorts of other possibilities. One needs a more comprehensive semantic scheme for the task of analyzing them, and the hi-world semantics introduced in Tsai (2012) serves this purpose perfectly. The reader is referred to the "Appendix" for an outline of the semantics. Basically, a hi-world s takes the form (U0, U1, U2, …), where U0 is simply a possible world w0, U1 is a set of possible worlds, and U2 is a set of sets of possible worlds. A hi-world t is a sub-hi-world of s provided that πi(t) ∈ πi+1(s) for all i, where πi is the projection onto the ith component. Every sentence is concerned with some suitable portion(s) of a hi-world. For example, the sentence □α ⊃ β is true at a hi-world s = (w0, U1, U2, …) provided that U1 ∩ I(α)ᶜ is nonempty or w0 ∈ I(β), and the truth of □(□α ⊃ β) at s amounts to the claim that for every sub-hi-world t of s, □α ⊃ β holds at t.Footnote 12 Furthermore, so far as the present paper is concerned, we can impose a mild condition on our universal set of hi-worlds: every hi-world is its own sub-hi-world. This is equivalent to the reflexivity of the accessibility relation of a Kripkean model, and it allows us to obtain α from □α.
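Here is a minimal sketch (mine, truncating hi-worlds at two levels) of the data structure and of the truth clause for □α ⊃ β just stated:

```python
from dataclasses import dataclass

# A two-level hi-world s = (w0, U1, U2): a world, a set of worlds, and a
# set of sets of worlds. t is a sub-hi-world of s when t.w0 ∈ s.U1 and t.U1 ∈ s.U2.
@dataclass(frozen=True)
class HiWorld:
    w0: str
    U1: frozenset
    U2: frozenset

def is_sub_hi_world(t, s):
    return t.w0 in s.U1 and t.U1 in s.U2

def box_alpha_implies_beta(s, I_alpha, I_beta):
    """□α ⊃ β at s: either U1 ∩ I(α)^c is nonempty, or w0 ∈ I(β)."""
    return not s.U1 <= I_alpha or s.w0 in I_beta

# Example: U1 mixes C2- and D2-worlds, so □C2 fails and □C2 ⊃ D1 holds
# vacuously at s, even though a1 cooperates at w0.
U1 = frozenset({"CC", "CD"})
s = HiWorld(w0="CC", U1=U1, U2=frozenset({U1}))
assert is_sub_hi_world(s, s)   # the mild condition imposed in the text
assert box_alpha_implies_beta(s, I_alpha=frozenset({"CC", "DC"}),   # I(C2)
                              I_beta=frozenset({"DC", "DD"}))       # I(D1)
```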

Now let us see how we can formalize Ramsey's conditional in terms of the hi-world semantics. According to Ramsey, to accept a conditional "if α then β" is to add α into our stock of knowledge and, on that basis, arrive at β. In terms of hi-worlds, one's stock of knowledge amounts to a set of subsets of hi-worlds, and the intersection of these subsets is the set of hi-worlds that she has in mind. Adding α into her stock of knowledge amounts to shrinking the set of hi-worlds she has in mind by taking its intersection with the extension ||α||M of α.Footnote 13 This phenomenon can be illustrated in terms of the Ui's. Consider a conditional "if p then q" where both p and q are non-modal sentences. To add p into our stock of knowledge and then consider q amounts to shrinking U1 to U1 ∩ I(p) and seeing whether the resulting U1 ∩ I(p) lies within I(q). So, "if p then q" can be translated into □(p ⊃ q), as the truth condition for the latter is U1 ⊆ I(p)ᶜ ∪ I(q), which is equivalent to U1 ∩ I(p) ⊆ I(q).Footnote 14 The situation for the Prisoner's Dilemma is more complicated, as we will see soon, but the basic ideas are the same.

To find a more plausible interpretation of the argument(s) in question, we need to fix quite a number of problems. Let us begin with B1 and B2. Without loss of generality, I shall be concerned with B1 only. As we have seen, the original B1, namely "if C2 then v(D1) > v(C1)", seems outright true, yet if we translate it as "C2 ⊃ v(D1) > v(C1)", then the fact that the antecedent and consequent concern different levels of worlds, namely w0 and U1, immediately turns it into a false statement. What is wrong with the translation, and how can we fix it? The following remarks will guide us to the right translation of B1.

First, as we explained earlier, when we say "if C2 then v(D1) > v(C1)", we are not asserting a specific connection between w0 and U1. Rather, we are claiming that given that a set U1 is such that the agent a2 cooperates, v(D1) > v(C1) holds for that U1. In other words, both the antecedent and the consequent of B1 concern the same level of a hi-world, namely U1. As a consequence, we should arrive at some modified translation of B1 which contains □C2 ⊃ v(D1) > v(C1) as a proper part. In everyday language, B1 can be put in the following way: "if a set of possible worlds is such that agent a2 always cooperates, then v(D1) > v(C1) holds for that set".

Second, the remark in the last section concerning the formalization of a Ramsey conditional applies to B1 as well. In other words, in evaluating B1, we first add □C2 into our stock of knowledge, which amounts to taking the intersection of the set ŝ of all sub-hi-worlds of s with ||□C2||M, and then decide whether the resulting set is a subset of ||v(D1) > v(C1)||M. So the correct translation of B1 should be □(□C2 ⊃ v(D1) > v(C1)) instead.Footnote 15

Strictly speaking, when we are unsure whether C2 or D2 holds necessarily, the expression v(D1) > v(C1) makes no sense at all, because each of v(D1) and v(C1) may then take two distinct values. However, charitably speaking, by v(D1) > v(C1) we could mean that for any world w of U1 for which v(D1) is applicable, and for any world w′ of U1 for which v(C1) is applicable, the value v(D1) is greater than the value v(C1). The present remark is important in the sense that without such an interpretation, the antecedent of G would be meaningless for most U1's.

Now, concerning G, we should note the following. First, in contrast to B1 and B2, for which the antecedents C2 and D2 are elevated into □C2 and □D2 in hi-world semantics so that the antecedents and consequents are about the same level of worlds, the consequent D1 of G is really about the plain world w0, and the conditional presumes that agent a1 is rational.

Second, according to Ramsey's Test, to decide whether to accept G, we need to add the antecedent v(D1) > v(C1) into our stock of knowledge and, on that basis, decide whether a1 always defects. Recall that v(D1) > v(C1) is true with respect to U1 provided that the value for a1 in those possible worlds of U1 in which she defects is always higher than that in those possible worlds of U1 in which she cooperates. Now, adding the antecedent v(D1) > v(C1) into our stock of knowledge amounts to shrinking U2 so that v(D1) > v(C1) holds for all the elements V1 of U2, and seeing whether a1 always defects amounts to seeing whether D1 holds at every element w of U1. So, in terms of hi-world semantics, G can be translated into □(v(D1) > v(C1) ⊃ D1), and it is true at s if and only if ŝ ⊆ ||v(D1) > v(C1) ⊃ D1||M, where, again, ŝ stands for the set of all sub-hi-worlds of s.

Finally, given that B1, B2 and G are translated as above, P1 and P2 can, unsurprisingly, be pinned down as □(□C2 ⊃ D1) and □(□D2 ⊃ D1) respectively, and P0, being primarily concerned with U1 rather than w0, should be translated into □(C2 ∨ D2). So the three arguments that we are concerned with are expressed as follows, where, again, α → β stands for the strict conditional □(α ⊃ β).

 

P0   □(C2 ∨ D2)
P1   □C2 → D1
P2   □D2 → D1
     ------------------------------   [Main]
     ∴ D1

B1   □C2 → v(D1) > v(C1)
G    v(D1) > v(C1) → D1
     ------------------------------   [Supportive 1]
     ∴ (P1) □C2 → D1

B2   □D2 → v(D1) > v(C1)
G    v(D1) > v(C1) → D1
     ------------------------------   [Supportive 2]
     ∴ (P2) □D2 → D1

In terms of hi-world semantics, an argument is valid provided that for all possible hi-worlds s = (w0, U1, U2, …), if the premisses are all true at s then the conclusion is true at s, and the argument is sound provided that the premisses are all true with respect to the actual hi-world as well. Undoubtedly, [Supportive 1] and [Supportive 2] are both valid arguments. However, granted that a1 is a rational being, do we want to accept that G is true? It depends on how we read a1. If she is rational but not omniscient, then even if v(D1) > v(C1) holds for U1 she may not know it, so G is not true and [Supportive 1] and [Supportive 2] are unsound. If, on the other hand, we grant a1 the mental power of knowing that v(D1) > v(C1) holds for U1 whenever it holds, then G is true, both of the supportive arguments are sound, and, as a consequence, the argument [Main] has three true premisses. Do we then finally arrive at a sound argument in support of the first horn of PD? By no means. Clearly, □(C2 ∨ D2), □C2 → D1, and □D2 → D1 cannot lead us to D1—given that a hi-world s = (w0, U1, U2, …) is such that U1 ⊆ I(C2 ∨ D2), "for any w′ ∈ U1 and V1 ∈ U2, V1 ⊄ I(C2) or w′ ∈ I(D1)", and "for any w′ ∈ U1 and V1 ∈ U2, V1 ⊄ I(D2) or w′ ∈ I(D1)" all hold, we still cannot conclude that w0 ∈ I(D1), even if we presuppose that every hi-world is its own sub-hi-world, in particular, that w0 ∈ U1 and U1 ∈ U2.
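The countermodel just described can be checked mechanically; the following sketch (mine) enumerates the relevant truth clauses for a hi-world with w0 = C1C2 and U1 mixing C2- and D2-worlds:

```python
# A hi-world at which □(C2 ∨ D2), □C2 → D1 and □D2 → D1 are all true while
# D1 fails at w0, under the truth conditions given in the text.
I = {   # extensions over the four worlds "a1's action + a2's action"
    "C2": {"CC", "DC"}, "D2": {"CD", "DD"},
    "D1": {"DC", "DD"}, "C2vD2": {"CC", "CD", "DC", "DD"},
}

w0, U1 = "CC", {"CC", "CD"}   # a1 cooperates at w0; U1 is mixed in a2's action
U2 = [U1]                     # U1 ∈ U2, so the hi-world is its own sub-hi-world

P0 = U1 <= I["C2vD2"]
# □(□X ⊃ D1): for every w' ∈ U1 and V1 ∈ U2, V1 ⊄ I(X) or w' ∈ I(D1)
P1 = all(not V1 <= I["C2"] or wp in I["D1"] for wp in U1 for V1 in U2)
P2 = all(not V1 <= I["D2"] or wp in I["D1"] for wp in U1 for V1 in U2)
conclusion = w0 in I["D1"]

assert P0 and P1 and P2 and not conclusion   # [Main] is invalid on this reading
```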

In sum, the arguments that seem to support the first horn of PD involve a mixture of modalities of different levels. These arguments can either be dealt with in terms of (1) the usual KSL-style account of conditionals, or be reformulated in terms of (2) the hi-world semantics, which can in effect take care of any mixture of iterated modalities at one go. Furthermore, the rational agent a1 can be assumed either (i) to have or (ii) not to have privileged access to the truth of the antecedent. However, for each of the four possible interpretations, 1(i), 1(ii), 2(i), and 2(ii), we find that the argument [Main] is never a sound argument. So the first horn of PD remains in want of a sound argument to support it.Footnote 16

4 A Meta-argument for the Non-existence of a Logical Argument for Defection

A reviewer for an earlier version of this paper has objected that I have not exhausted all possible formulations of the arguments in question, in particular that I have not considered formulating PD in terms of quantified modal logic, and so I have not ruled out the possibility that there is a more complicated argument which proves that defection is the logical consequence of a PD game. A short answer to this objection would be that it is the responsibility of those who claim that it is logical to defect in a PD game to come up with an explicit sound argument for their claim. However, it would be much better if I could find a meta-argument showing that, in general, no such sound argument exists, and this is what I will attempt to do in this section.

An anonymous reviewer for this journal reminds me that the notion of a game involves not only players and available strategies, but also a complex system of knowledge, preferences and beliefs, and that without capturing the interaction between the players, my treatment of the PD game would not be complete. Indeed, modeling the interaction between players in a game is very important. However, given that in this paper what we are interested in is the one-off PD game rather than the iterated PD game, the "interaction" between the two players can at best amount to envisaging the opponent's thoughts and trying to outwit them. Yet, as one's opponent can be any kind of player (rational, emotional, religious, criminal, one's twin, etc.), without knowing in advance whom one is playing with, it is indeed difficult, if not impossible, to formulate a uniform argument for defection that captures what is going on in one's thoughts about the opponent's thoughts in a PD game. Luckily, the modest goal of this paper is merely to show that logic alone does not tell us to defect in a PD game, and it turns out that a reductio ad absurdum meta-argument concerning two ideally rational players suffices to achieve this goal. Moreover, in the process of presenting the meta-argument, we will have, in effect, taken into account the "interaction" between the two players—represented as a repeated reflection on each player's logical actions.

To be more specific, in this section, instead of trying to imagine what the underlying argument(s) are when one claims that it is logical to defect (LTD) and spilling much ink on them, I will, for the sake of argument, simply assume that there indeed exists, as many authors have believed, a sound argument that allows them to reach the LTD conclusion. If there exists such an argument for someone to always defect, it would work for the special case where the opponent is a perfectly rational being as well. I then show that this leads to the paradoxical result that it is logical for the two rational players in a PD game to cooperate as well. By reductio ad absurdum, we will then have proved that there cannot be such an argument for defection, contrary to what the other authors have believed.

Let me assume, without explicitly spelling out the argument, that we have an argument [*] in support of D1,

 

𝒦
------------------------------   [*]
∴ D1

where 𝒦 denotes the set consisting of all the premisses known to both players. These premisses are either rules of the PD game or are logically deducible from these rules and/or other known facts. For example, we can imagine that among the premisses in 𝒦 there is one that says that both agents are perfectly rational beings who abide by logical rules.

Next, I will show that [*] leads to a "paradoxical" result, namely that agent a1 would cooperate as well.

By stipulation, agent a2 is a rational being and the public information contained in 𝒦 is available to agent a2 as well, so we have the following sound argument [*′] too.

 

𝒦
------------------------------   [*′]
∴ D2

In general, if there exists a purely logical argument for an agent in a PD game to take a particular action X, then, by symmetry, the other agent would be forced by logic to take the same action.

Now, as a consequence of the soundness of [*] and [*′], we have the truth of D1 ∧ D2. But then consider the following argument,

 

P1   D1 ∧ D2
P2   If D1 ∧ D2 then ((C1 ∧ C2) ∨ (D1 ∧ D2))
P3   If ((C1 ∧ C2) ∨ (D1 ∧ D2)) then v(C1) > v(D1)
P4   If v(C1) > v(D1) then C1
     ------------------------------   [Paradox, Naïve]
     ∴ C1

 

P1 is true by the hypothetical soundness of [*] (and [*′]). P2 is true on most, if not all, interpretations of conditionals. P3 is true by the specification of the game. Finally, P4 seems true by the fact that agent a1 is a rational being who would maximize her value whenever possible. An analogous argument [Paradox, Naïve′] would then give us C2 as well. So, on the face of it, we have obtained a paradoxical result, namely that if there is a sound argumentFootnote 17 in support of D1 ∧ D2 then there exists a sound argument in support of C1 ∧ C2 as well.

Recall, however, that we have a similar argument earlier,

C2 ∨ D2

If C2 then v(D1) > v(C1)

If D2 then v(D1) > v(C1)

If v(D1) > v(C1) then D1

- - - - - - - - - - - - - - - - - - - -

∴ D1

We showed in Sect. 3 that this is an unsound argument. One of the reasons it fails to be sound is that even if v(D1) > v(C1) holds, agent a1 may not know it, and so may not defect accordingly to maximize her value. If I am correct in maintaining that this is indeed a problem, then the P4 of [Paradox, Naïve] may not be true either. In other words, [Paradox, Naïve] is in need of modification. The actions of both agents are guided not only by their rationality but also by their knowledge. So it is necessary to distinguish between a proposition A and the corresponding proposition K(A), where the latter says that both agents know A.Footnote 18

As a result, we obtain the following modified argument

 

P1   K(D1 ∧ D2)
P2   If K(D1 ∧ D2) then K((C1 ∧ C2) ∨ (D1 ∧ D2))
P3   If K((C1 ∧ C2) ∨ (D1 ∧ D2)) then K(v(C1) > v(D1))
P4   If K(v(C1) > v(D1)) then C1
     ------------------------------   [Paradox]
     ∴ C1

This is a sound argument which, together with its counterpart argument [Paradox′] for C2, entails C1 ∧ C2. Surely, this is an unwelcome, "paradoxical" result: given that it is logical to defect in a PD game, it is logical to cooperate in a PD game as well. What is the problem after all? The answer is simple: there is simply no sound argument in support of defection to begin with! Insofar as we do not hypothesize the existence of such an argument, we will not arrive at this paradoxical result in the first place.

Given that it is not logical to defect in a PD game, one might suspect that perhaps it is logical to cooperate in a PD game instead. But is it so? Recall that if two twins play a PD game and, by definition, their final actions (cooperation or defection) must agree with each other, then it is logical for them to cooperate. Now, for a pair of perfectly rational beings who are not twins, playing a PD game with each other, would it be logical for them to cooperate as well? Adopting the strategy we employed earlier, we can assume that there is a sound argument in support of cooperation. By symmetry we would obtain the truth of C1 ∧ C2. Then the following argument [Paradox~]

 

P1   K(C1 ∧ C2)
P2   If K(C1 ∧ C2) then K((C1 ∧ C2) ∨ (D1 ∧ C2))
P3   If K((C1 ∧ C2) ∨ (D1 ∧ C2)) then K(v(D1) > v(C1))
P4   If K(v(D1) > v(C1)) then D1
     ------------------------------   [Paradox~]
     ∴ D1

and its counterpart argument [Paradox~′] for D2 would lead us to D1 ∧ D2. So, again we obtain a “paradoxical” result: given that it is logical to cooperate in a PD game, it is logical to defect in a PD game as well. Again, the “paradox” can be easily avoided by dropping the assumption that it is logical to cooperate.

So, the conclusion here is that for a pair of perfectly rational beings playing a one-off PD game with each other, logic itself does not tell them to defect, nor does it tell them to cooperate. Their actions can at best be influenced by other factors or concerns. This is not a strange result at all. To illustrate the point, consider the following simple example. Two perfectly rational beings are playing a ⊕–⊗ game with each other. Each player can either play ⊕ or play ⊗. If the two players produce the same sign then they both get one point. If the signs they produce are different then they both get zero points. Does logic tell them to play ⊕? Or does logic tell them to play ⊗? Of course not. Evidently, each player knows that the best result would come when they produce the same sign, but they simply have no means of knowing beforehand what sign the other player will produce. So, despite the fact that both players are perfectly rational beings who abide by logical laws, there is no logical argument to support the statement that ⊕1 if and only if ⊕2, nor is there a logical argument for the statement that ⊗1 if and only if ⊗2. As a result, logic can offer them no help at all.Footnote 19
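A minimal sketch of the ⊕–⊗ game (mine, writing '+' and 'x' for the two signs) confirms that neither sign dominates:

```python
# The coordination game: matching signs score (1, 1), mismatched signs (0, 0).
def payoff(sign1, sign2):
    return (1, 1) if sign1 == sign2 else (0, 0)

for my_sign in ("+", "x"):
    # Whatever sign I fix on, the outcome is decided by the other player:
    outcomes = {payoff(my_sign, other)[0] for other in ("+", "x")}
    assert outcomes == {0, 1}   # no sign guarantees the good outcome

# Hence no dominance argument is available, and logic alone selects no sign.
```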

Now, an interesting question to ask concerning the PD game is this: would perfectly rational twins, knowingly playing with their twin, accept the soundness of [Paradox~]? After all, P1 would be true for them. I think the answer is no, and the problematic premiss is P3, because v(D1) would be without reference to begin with. As a result, we still would not be bothered by a paradox.

In sum, for players who are ideal twins, there are indeed extra-logical factors such as gene composition or mystical connections that help them come to the logical decision of cooperation, but for non-twins, logic itself neither instructs them to defect nor instructs them to cooperate. If one mistakenly thought that logic does instruct the players to opt for one option, one would be forced by logic to admit the paradoxical result that the players would opt for the other option as well. But, as we have stressed repeatedly, we should not have that false impression in the first place; in particular, logic does not tell us to defect at all.

5 Some Final Remarks

A reviewer for an earlier version of the paper has suggested that by resorting to expected utility, one finds a perfect formulation of the argument underlying the horn of the Prisoner’s Dilemma that we have been discussing in Sect. 1. The basic idea is that the expected utilities EU(C1) and EU(D1) associated with the agent’s cooperation and defection can be given as

$$
\begin{aligned}
EU(C_1) &= P(C_2)\,V(C_1|C_2) + P(D_2)\,V(C_1|D_2),\\
EU(D_1) &= P(C_2)\,V(D_1|C_2) + P(D_2)\,V(D_1|D_2)
\end{aligned}
$$

respectively, where P(X) is the probability of X, and V(X|Y) is the expected value for the agent when her action X is paired with the action Y of the opponent. Evidently, EU(D1) is greater than EU(C1) independently of the probabilities, so, the argument goes, it is rational for the agent to defect.
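For concreteness, a small sketch (mine, with the hypothetical payoffs T = 5, R = 3, P = 1, S = 0) verifies the dominance claim for every probability of the opponent's cooperation:

```python
# Expected utilities of a1's two actions as a function of p = P(C2).
T, R, P_pun, S = 5, 3, 1, 0   # hypothetical payoffs with T > R > P > S

def EU(action, p):
    if action == "C":
        return p * R + (1 - p) * S       # P(C2)V(C1|C2) + P(D2)V(C1|D2)
    return p * T + (1 - p) * P_pun       # P(C2)V(D1|C2) + P(D2)V(D1|D2)

# EU(D1) > EU(C1) for every p, since T > R and P > S:
assert all(EU("D", p / 100) > EU("C", p / 100) for p in range(101))
```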

Once again, this is precisely what the present paper sets out to argue against. Logic itself does not tell us to defect; impaired rationality does. To see that the expected utility argument above does not work for the case that we are concerned with here, namely the one-off Prisoner's Dilemma, observe the following:

1. It is not a spelt-out, sound argument to start with. In particular, why does EU(D1) > EU(C1) imply D1? In my presentation in the preceding sections, so long as we know D2 or know C2, v(D1) > v(C1) makes sense and, since both v(D1) and v(C1) are definite real values that the agent cares about, v(D1) > v(C1) → D1 is readily acceptable. However, in case we are unsure whether D2 or C2, v(D1) and v(C1) become meaningless. In contrast, EU(D1) and EU(C1) make sense even if we are unsure whether D2 or C2. But then the problem becomes: why should the agent make decisions based on the relative value of these two expected utilities? The proponents of the expected utility account are responsible for providing the missing link that explains an agent's concern for expected utility. In particular, why should the agent care only about the expected utility for a particular move rather than about some (weighted) sum of expected utilities for all possible subsequent moves that are about to come?Footnote 20 After all, we are concerned with a one-off PD game, and shouldn't we take into consideration, in advance, all possible effects before we make the one and only move?

2. Note that the formulas for the expected utilities EU(D1) and EU(C1) resemble those for the expected fitnesses of the two strategies C (always cooperate) and D (always defect) in a social evolutionary context.Footnote 21 Specifically,

$$
\begin{aligned}
W(C) &= Pr(C|C)\,V(C|C) + Pr(D|C)\,V(C|D),\\
W(D) &= Pr(C|D)\,V(D|C) + Pr(D|D)\,V(D|D),
\end{aligned}
$$

where Pr(Y|X) is the conditional probability of an X-player interacting with a Y-player. Assuming that the chance of meeting a co-operator or a defector is independent of the strategy that one adopts, and that the frequency of individuals adopting strategy C is p, we have

$$
\begin{aligned}
W(C) &= p\,V(C|C) + (1-p)\,V(C|D),\\
W(D) &= p\,V(D|C) + (1-p)\,V(D|D).
\end{aligned}
$$

Now, clearly, W(D) > W(C) independently of p, since V(D|C) > V(C|C) and V(D|D) > V(C|D). So selection favours strategy D, and D is an evolutionarily stable strategy, which actually makes the value W(D) decrease from generation to generation. In other words, the system would eventually reach an evolutionary dead end. However, again, why should the one-off PD player a1 care about evolutionary group fitness in the first place? The evolutionary dead end of all-out defection evidently suggests that we should have second thoughts about it.

3. Recall that in social evolution, the fitness of a strategy at the present generation affects the frequency of the strategy at the next generation, which in turn affects the fitness of the strategy at the next generation.Footnote 22 I claim that if our agent a1 is rational enough to compare $W'_{\mathrm{D}}(\mathrm{D})$ and $W'_{\mathrm{C}}(\mathrm{C})$—regarding them as more relevant to her benefit—rather than comparing W(D) and W(C) as a blind evolutionary system does, then she will not reach the conclusion that defection is the rational choice. Here $W'_{\mathrm{X}}(\mathrm{Y})$ stands for the fitness of strategy Y after a1, as a particular individual, has chosen to perform X. An example suffices to illustrate this point.

Let the payoff matrix (a1's rewards, with a1's action on the rows and a2's action on the columns) be

         C        D
  C    10000      0
  D    10001      1

And, for simplicity, assume that the action of a1 affects the frequency p of co-operators by 1/100—the action (either cooperation or defection) of agent a1, being herself an individual in the population, would either increase or decrease the population's overall chance of meeting a co-operator. Then we have W(D) − W(C) = 1, regardless of p, but

$$
\begin{aligned}
W'_{C}(C) &= (p + 1/100)\,V(C|C) + (1 - p - 1/100)\,V(C|D) = W(C) + 100,\\
W'_{D}(D) &= (p - 1/100)\,V(D|C) + (1 - p + 1/100)\,V(D|D) = W(D) - 100.
\end{aligned}
$$

Therefore, $W'_{C}(C) - W'_{D}(D) = (W(C) - W(D)) + 200 = 199 > 0$. In other words, even if the agent is concerned primarily with expected utility, the revised expected utility would tell her that there is no rational ground for defection.
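The arithmetic of this example can be replayed in a short sketch (mine, with the payoff matrix above and the stipulated 1/100 shift in p):

```python
# Naive fitness W versus revised fitness W' when a1's own move shifts the
# frequency p of co-operators by 1/100.
V = {("C", "C"): 10000, ("C", "D"): 0, ("D", "C"): 10001, ("D", "D"): 1}

def W(strategy, p):
    return p * V[(strategy, "C")] + (1 - p) * V[(strategy, "D")]

def W_revised(strategy, p, shift=0.01):
    dp = shift if strategy == "C" else -shift   # cooperating raises p, defecting lowers it
    return W(strategy, p + dp)

for p in (0.2, 0.5, 0.8):
    assert round(W("D", p) - W("C", p)) == 1                      # naively, D wins by 1
    assert round(W_revised("C", p) - W_revised("D", p)) == 199    # on reflection, C wins by 199
```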