The importance of working memory updating in the Prisoner’s dilemma

Soutschek, Alexander; Schubert, Torsten

doi:10.1007/s00426-015-0651-3

Original Article
Published: 18 February 2015

Volume 80, pages 172–180, (2016)

Psychological Research

Alexander Soutschek^1,2 & Torsten Schubert^1

Abstract

Successful cooperation requires that humans can flexibly adjust choices to their partner’s behaviour. This, in turn, presupposes a representation of a partner’s past decisions in working memory. The aim of the current study was to investigate the role of working memory processes in cooperation. For that purpose, we tested the effects of working memory updating (Experiment 1) and working memory maintenance demands (Experiments 2 and 3) on cooperative behaviour in the Prisoner’s dilemma game. We found that demands on updating, but not maintenance, of working memory contents impaired strategy use in the Prisoner’s dilemma. Thus, our data show that updating a partner’s past behaviour in working memory represents an important precondition for strategy use in cooperation.

Introduction

Evolutionary accounts of altruism assume that the existence of social cooperation in human primates presupposes the evolution of specific cognitive abilities. Among other things, humans must be able to reliably discriminate between cooperative partners and selfish free-riders to protect themselves from being exploited (Axelrod & Hamilton, 1981; Trivers, 1971). An effective way to avoid exploitation by free-riders is to use a tit-for-tat strategy: an individual should reciprocate cooperation only if her partner cooperates, whereas she should defect if confronted with a defecting partner. Although a large number of theoretical and empirical studies have examined strategic decision-making in social cooperation (Axelrod & Hamilton, 1981; Nowak & Sigmund, 1993a, b; Trivers, 1971), it remains unclear which cognitive abilities are involved in strategic decision-making in social interactions. Because playing tit-for-tat requires representing the interaction partner’s last choice in order to adjust one’s own strategy to it, it is reasonable to assume that working memory processes are involved in social cooperation. The goal of the current study was to provide evidence for the importance of working memory processes in cooperative behaviour.

In game theory, social cooperation is often examined by analysing behaviour in the Prisoner’s dilemma game (PDG): In the PDG, a player A chooses whether to cooperate (C) or defect (D) with a second player B. The payoffs in the PDG are arranged such that, independently of player B’s decision, player A can maximise her outcome if she defects: If player B cooperates, then player A’s outcome is higher if she defects (unreciprocated defection; DC) than if she cooperates, too (mutual cooperation; CC). Importantly, also in the case that player B defects, player A’s payoff is higher if she defects (mutual defection; DD) than if she cooperates (unreciprocated cooperation; CD). As a consequence of this arrangement of payoffs, mutual defection is the Nash equilibrium in the PDG because changing the strategy (i.e., starting to cooperate) leads to worse outcomes under the condition that the partner continues with defection. Importantly, however, both partners’ payoffs would be higher in the case of mutual cooperation compared to mutual defection (i.e., CC is Pareto-superior). If the PDG is played iteratively, this discrepancy between the Nash equilibrium (mutual defection) and the Pareto-superior solution (mutual cooperation) results in the following dilemma: on the one hand, cooperation leads to a better long-term outcome than defection because CC > DD. On the other hand, mutual cooperation is not a stable solution of the dilemma (not a Nash equilibrium) because DC > CC. The use of a tit-for-tat strategy allows resolving this dilemma (Axelrod & Hamilton, 1981; Trivers, 1971): a player should cooperate if her partner has cooperated in the preceding round in order to establish mutual cooperation. However, if a partner has defected in the previous round, then a player should respond with defection to avoid being exploited by a free-rider.
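The payoff ordering described above can be checked mechanically. The following is a minimal sketch, assuming the cent values reported later in the Methods of Experiment 1 (CC = 2, CD = -1, DD = 0, DC = 3); the names `PAYOFF_A`, `best_response`, and `is_prisoners_dilemma` are illustrative and do not come from the article.

```python
# Payoffs to player A in cents, keyed by (A's choice, B's choice).
# Values are those reported in the Methods of Experiment 1.
PAYOFF_A = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}


def best_response(partner_choice):
    """Return the choice that maximises A's payoff for a fixed choice of B."""
    return max("CD", key=lambda own: PAYOFF_A[(own, partner_choice)])


def is_prisoners_dilemma(payoff):
    """Check the ordering DC > CC > DD > CD implied by the text."""
    return (payoff[("D", "C")] > payoff[("C", "C")]
            > payoff[("D", "D")] > payoff[("C", "D")])


# Defection is the best response whatever B does (Nash equilibrium at DD) ...
defection_dominates = all(best_response(b) == "D" for b in "CD")
# ... yet both players earn more under CC than under DD (Pareto-superior).
mutual_cooperation_better = PAYOFF_A[("C", "C")] > PAYOFF_A[("D", "D")]
```

Both flags evaluate to true for these values, which is exactly the tension between the Nash equilibrium and the Pareto-superior outcome that the paragraph describes.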

From a cognitive perspective, playing tit-for-tat in the PDG requires working memory (WM) processes: a player must encode the partner’s last decision in WM in order to be able to adjust her own decisions accordingly. In line with this assumption, a previous study showed that memory demands impair strategic decision-making in the PDG (Milinski & Wedekind, 1998): while playing the PDG, participants performed a supplementary memory task in which they had to search for identical pairs among a set of 32 cards. After each PDG choice, participants uncovered two cards: if these were identical, they were removed from the set of cards; if they showed different pictures, they were returned. Importantly, this memory task required two dissociable memory processes, namely the updating as well as the maintenance of WM contents (Braver et al., 1997; Morris & Jones, 1990). While WM updating processes include monitoring for task-relevant new information and replacing old, irrelevant WM contents with new, relevant ones (e.g., the motifs of the currently uncovered cards), demands on WM maintenance depend on the number of items which are currently stored in WM (e.g., the motifs of all cards that had been uncovered in previous rounds). Consequently, it remains an open question whether the observed impairment of strategy use in the PDG was caused by demands on WM updating or on WM maintenance. Moreover, since PDG performance in the memory group was compared with a control group playing only the PDG, it is also possible that not WM processes per se but the demands of performing two tasks simultaneously (the PDG and the memory task) are responsible for the observed effects.

The current study addressed this issue and tested how WM updating, WM maintenance, and dual-task processing demands affect decision-making in the PDG. Similar to the study of Milinski and Wedekind (1998), we applied a dual-task approach in which participants played the PDG and simultaneously performed a WM task imposing selective demands on updating or maintenance processes. Subjects played the PDG against a computer partner which used a tit-for-tat strategy (see below for details), allowing us to test how participants strategically adjust their choices to the partner’s behaviour in an experimentally controlled way. Although subjects appear to show quite similar behaviour when playing the PDG against a computer or a human partner (Rilling et al., 2002), we are aware that playing the PDG with a human partner may involve further social processes which cannot be measured with a computer variant of the PDG. However, both real “social” and computer-based PDGs require adjusting one’s own strategy to the partner’s behaviour, such that the current paradigm allowed us to assess the impact of WM demands on strategic decision-making in the PDG in general.

Experiment 1 examined the effects of WM updating and dual-task processing on the PDG, whereas Experiments 2 and 3 tested whether high demands on WM maintenance impair strategy use in the PDG.

Experiment 1

Experiment 1 tested the involvement of WM updating processes in cooperation in the PDG. For that purpose, participants played an iterative PDG and, simultaneously, performed a variant of the n-back task in which a stream of letters was sequentially presented to the participants. The n-back task allows manipulating the demands on the updating and monitoring of WM contents in an experimentally controlled way (Braver et al., 1997). We administered the n-back task at two difficulty levels: In the 1-back condition, participants had to decide whether the currently presented letter was identical to the letter presented in the previous trial, whereas in the 0-back condition they were instructed to respond only to a pre-defined letter, without the need to update WM contents on every trial. While only the 1-back condition required updating WM contents, both the 0-back and the 1-back task required maintaining one item in WM (i.e., the pre-defined target letter in the case of the 0-back condition and the most recently presented letter in the 1-back condition). Thus, the n-back task allowed us to manipulate updating demands while controlling for load on WM maintenance. We hypothesized that, if playing tit-for-tat requires WM updating in order to adjust one’s decisions to the partner’s behaviour, then the demands of the 1-back task on updating should impair participants’ ability to use a tit-for-tat strategy. In particular, we expected that participants would use a tit-for-tat strategy less often in the 1-back than in the 0-back condition.

In addition, we also administered a control condition in which participants played the PDG without a supplementary n-back task. This manipulation allowed us to test for effects of dual-task processing on the PDG: if the demands of performing two tasks simultaneously interfere with strategy use in the PDG, then this should result in different PDG choices between the control and the 1-back condition.

Methods

Participants

Nineteen right-handed volunteers (14 female; M_age = 23.7, SD_age = 4.6) who were recruited at the Humboldt-Universität participated in Experiment 1 after having given informed consent. All participants were naïve to the purpose of the study and were paid 8 euro per hour plus a performance-dependent bonus (see below).

Experimental design and procedure

Participants performed two tasks simultaneously: an iterated PDG and an n-back task. In the PDG, participants had to choose between cooperating and defecting with a virtual opponent. As a cover story, we told participants that, for organisational reasons, they would play the PDG against a computer instead of a human partner, and asked them to make their decisions as if playing against a human partner. We also stressed that the computer would simulate the behaviour of real partners and that the computer’s decisions to cooperate or defect would partly depend on their own choices, as is the case for human partners (however, no details about the algorithms used by the computer were specified). In fact, the computer played a tit-for-tat strategy and cooperated or defected with a probability of 80 % depending on whether the player had cooperated or defected, respectively, in the preceding trial (Rilling et al., 2002; Rilling, Sanfey, Aronson, Nystrom, & Cohen, 2004a). We used the following payoff matrix: Participants received two cents in case of mutual cooperation (CC), whereas they lost one cent in case of unreciprocated cooperation (CD). Mutual defection (DD) and unreciprocated defection (DC) were rewarded with zero and three cents, respectively. We informed participants that they would receive the cumulative outcomes in addition to their basic payment.
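The computer partner and payoff matrix described above can be sketched as a small simulation. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the partner's choice on the very first trial of a block (when there is no preceding player choice) is not specified in the text, so a random choice is assumed here.

```python
import random

# Participant payoffs in cents, keyed by (participant's choice, computer's choice).
PAYOFF = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}


def partner_choice(player_previous, rng, p_reciprocate=0.8):
    """Probabilistic tit-for-tat: repeat the player's previous choice with
    probability p_reciprocate, otherwise choose the opposite."""
    if player_previous is None:
        # Assumption: first-trial behaviour is not given in the article.
        return rng.choice("CD")
    if rng.random() < p_reciprocate:
        return player_previous
    return "D" if player_previous == "C" else "C"


def play_block(player_choices, seed=0):
    """Play one block and return the partner's moves and the summed payoff."""
    rng = random.Random(seed)
    previous, total, partner_moves = None, 0, []
    for own in player_choices:
        other = partner_choice(previous, rng)
        partner_moves.append(other)
        total += PAYOFF[(own, other)]
        previous = own
    return partner_moves, total
```

For example, `play_block(["C"] * 15)` simulates a participant who always cooperates over one 15-trial block; the exact total depends on the random seed.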

The second task was a letter version of the n-back paradigm in which white letters were presented at the centre of the screen between two rounds of the PDG (Braver et al., 1997; Soutschek, Strobach, & Schubert, 2013). We used phonologically similar letters in German (B, D, P, T, and W) to increase task difficulty. In the 0-back condition, we instructed participants to press the left shift key with the left index finger if a specific, pre-defined letter (e.g., B) was presented. The target letter in the 0-back condition was defined in the instruction before the start of a 0-back block. In the 1-back condition, participants had to respond only if the currently presented letter was identical to the letter presented in the preceding trial (e.g., if the letter “D” was presented in two subsequent trials). Thus, while both the 0-back and the 1-back condition required the maintenance of one item in WM, only the 1-back task demanded, in addition, WM updating processes.
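The difference between the two n-back rules can be made concrete with a short sketch. The function name `nback_targets` and its parameters are hypothetical; the rule itself follows the description above (0-back: respond to a fixed pre-defined letter; 1-back: respond when the current letter repeats the preceding one).

```python
def nback_targets(letters, n, zero_back_target=None):
    """Mark which positions in a letter stream are targets.

    n == 0: a letter is a target if it equals the pre-defined target letter,
            so the remembered item never changes within a block.
    n >= 1: a letter is a target if it equals the letter n trials earlier,
            so the remembered item must be replaced on every trial.
    """
    targets = []
    for i, letter in enumerate(letters):
        if n == 0:
            targets.append(letter == zero_back_target)
        else:
            targets.append(i >= n and letter == letters[i - n])
    return targets


stream = list("BDDPT")
one_back = nback_targets(stream, n=1)                          # third letter repeats "D"
zero_back = nback_targets(stream, n=0, zero_back_target="B")   # first letter is "B"
```

In both conditions exactly one letter is held in memory at a time, but only in the 1-back case does the comparison letter change from trial to trial, which is the updating demand the experiment manipulates.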

On each trial, subjects first performed the n-back task and then played one round of the PDG. Every trial started with the presentation of a letter for the n-back task for 1000 ms, followed by a fixation cross (1500 ms). If the n-back letter was a target stimulus, then the response had to be executed while the letter or the fixation cross was presented. Next, participants were asked whether they would like to cooperate or to defect in the PDG (1500 ms). During this interval, participants had to indicate their decision by pressing the keys “N” (for cooperation) or “M” (for defection) with the right index or middle finger, respectively, on a QWERTZ keyboard. After an interval of 500 ms, participants received visual feedback on their own and their opponent’s outcome in the current trial (1000 ms). Following an inter-trial interval of 500 ms, the next trial started, again with the presentation of a letter for the n-back task (Fig. 1).

The experimental design included three different task conditions: control, 0-back, and 1-back condition. In the control condition, participants were instructed to play only the PDG and to ignore the letters presented for the n-back task. In contrast, in the 0-back and the 1-back condition we instructed participants to play the PDG and also to perform the 0-back or the 1-back task, respectively. Three blocks were administered for every task condition, resulting in a total of nine blocks which were presented in randomised order. Every block contained a total of 15 trials. In the 0-back and 1-back conditions, three target stimuli were presented per block.
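The trial timing and block structure reported in the procedure imply a fixed trial length and a fixed number of observations per condition. The arithmetic can be sketched as follows; the variable names are illustrative, and the total duration assumes no pauses or instruction screens between blocks, which the text does not specify.

```python
# Event durations within one trial, in milliseconds, as given in the procedure.
TRIAL_EVENTS_MS = [
    ("n-back letter", 1000),
    ("fixation cross", 1500),
    ("PDG decision window", 1500),
    ("pre-feedback interval", 500),
    ("outcome feedback", 1000),
    ("inter-trial interval", 500),
]

CONDITIONS = ["control", "0-back", "1-back"]
BLOCKS_PER_CONDITION = 3
TRIALS_PER_BLOCK = 15
TARGETS_PER_BLOCK = 3  # applies to the 0-back and 1-back blocks only

trial_ms = sum(duration for _, duration in TRIAL_EVENTS_MS)
total_blocks = len(CONDITIONS) * BLOCKS_PER_CONDITION
total_trials = total_blocks * TRIALS_PER_BLOCK
trials_per_condition = BLOCKS_PER_CONDITION * TRIALS_PER_BLOCK
targets_per_condition = BLOCKS_PER_CONDITION * TARGETS_PER_BLOCK

# Assumption: blocks run back to back with no breaks.
total_minutes = total_trials * trial_ms / 60000
```

Under these figures each trial lasts 6 s, each condition contributes 45 PDG rounds, and each n-back condition contains nine targets, so the hit rates reported in the Results rest on nine targets per participant and condition.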

Statistical analysis

We analysed cooperation rates (i.e., number of cooperation decisions/number of cooperation and defection decisions) in the PDG and hit rates (number of correctly detected targets/number of targets) in the n-back task. For tests of significance, we calculated ANOVAs and planned comparisons with a significance threshold of 5 %.
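The two dependent measures defined above can be sketched in a few lines. This is a hypothetical illustration rather than the authors' analysis code: the function names are invented, the split by the partner's preceding decision anticipates the Previous decision factor used in the Results, and the handling of missed PDG responses (skipping them) is an assumption.

```python
def cooperation_rate_by_partner_history(player_choices, partner_choices):
    """Cooperation rate split by the partner's decision in the preceding round.

    Returns a dict mapping the partner's previous choice ("C" or "D") to
    cooperations / (cooperations + defections), or None if there are no data.
    """
    counts = {"C": [0, 0], "D": [0, 0]}  # [cooperations, valid decisions]
    for t in range(1, len(player_choices)):
        own, previous_partner = player_choices[t], partner_choices[t - 1]
        if own not in ("C", "D"):
            continue  # assumption: rounds without a valid response are skipped
        counts[previous_partner][1] += 1
        counts[previous_partner][0] += own == "C"
    return {k: (c / n if n else None) for k, (c, n) in counts.items()}


def hit_rate(is_target, responded):
    """Correctly detected targets divided by the number of targets."""
    n_targets = sum(is_target)
    hits = sum(1 for t, r in zip(is_target, responded) if t and r)
    return hits / n_targets if n_targets else None


# A participant who copies the partner's previous move (perfect tit-for-tat).
rates = cooperation_rate_by_partner_history(list("CCDCD"), list("CDCDC"))
```

A large gap between the two rates, as in this toy example, indicates tit-for-tat play; equal rates indicate that choices do not track the partner's last decision.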

Results

PDG

We analysed cooperation rates in the PDG with a repeated-measures ANOVA including the factors Previous decision (partner cooperated vs. partner defected) and n-back (control vs. 0-back vs. 1-back). While the factors Previous decision and n-back showed no significant effects, F(1,18)s < 3.06, ps > 0.098, η_p²s < 0.145, we found a significant Previous decision × n-back interaction, F(1,18) = 3.30, p < 0.05, η_p² = 0.155, suggesting that participants’ responses to the behaviour of their partners differed between the n-back conditions. To examine this effect in more detail, we tested whether participants played tit-for-tat, i.e. cooperated more often when the partner had cooperated vs. defected in the preceding trial, in the different n-back conditions. We found higher cooperation rates when the partner had cooperated compared to when the partner had defected in the control and the 0-back condition, t(18)s > 2.17, ps < 0.05, but not in the 1-back condition, t < 1, p > 0.87. This suggests that participants adjusted their decisions to their partner’s behaviour (i.e., played tit-for-tat) in the control and the 0-back but not in the 1-back condition. While cooperation rates did not differ between the n-back conditions when the partner had cooperated in the preceding round, ts < 1, ps > 0.52, cooperation rates were significantly higher in the 1-back condition than in the control and the 0-back condition when the partner had defected in the preceding round, t(18)s > 2.10, ps < 0.05 (Fig. 2).
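For readers who want to relate the reported effect sizes to the F values, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity below. This is a worked check with the values as reported in the text, not an additional analysis.

```latex
\eta_p^2 = \frac{F \cdot df_1}{F \cdot df_1 + df_2}
\qquad\Rightarrow\qquad
\eta_p^2 = \frac{3.30 \cdot 1}{3.30 \cdot 1 + 18} = \frac{3.30}{21.30} \approx 0.155
```

The same identity applied to the upper bound reported for the main effects, F(1,18) = 3.06, gives 3.06 / 21.06 ≈ 0.145, which matches the stated bound.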

N-back task

We compared hit rates (i.e., correct responses to n-back targets) with a paired-samples t test. We found significantly reduced n-back performance in the 1-back (77 %) compared to the 0-back (88 %) condition, t(18) > 2.10, p < 0.05.

Discussion

The aim of Experiment 1 was to test the role of WM updating in the PDG. We found that no tit-for-tat strategy was used in the PDG when participants simultaneously performed the 1-back task, whereas participants played tit-for-tat in the control and the 0-back condition. Since playing tit-for-tat is considered to be an effective strategy in the PDG (Axelrod & Hamilton, 1981), these results show that WM updating is an important cognitive precondition for successful cooperation. Cooperation requires the ability to flexibly adjust one’s behaviour to the partner’s decision in order to avoid being exploited by free-riders. Our data suggest that the flexible adjustment of behaviour, in turn, presupposes the updating of the partner’s last decision in WM.

In addition, the results of Experiment 1 provide no evidence for an effect of dual-task processing demands per se on choices in the PDG because no significant differences occurred between the control and the 0-back condition. This suggests that the demands on WM updating, and not on dual-task processing per se, impaired the use of tit-for-tat strategies in the 1-back condition. We would like to note that the comparison between the control condition and the 0-back condition does not allow drawing conclusions regarding the impact of WM maintenance demands on the PDG because the maintenance demands in the 0-back condition (maintain the target letter in WM) are confounded with task-switching processes (switching between PDG and 0-back task). Therefore, we conducted a further experiment in order to test the impact of WM maintenance demands on choices in the PDG.

Experiment 2

Experiment 2 examined whether demands on WM maintenance affect decision-making in the PDG. Participants played the PDG together with a memory task which required maintaining either one or six items in WM (Soutschek et al., 2013). We presented participants with one or six numbers before the start of a PDG block, and they had to reproduce the numbers after the block. Thus, this memory task required only the maintenance but not the updating of WM contents during the PDG, allowing us to test whether high demands on WM maintenance, similar to demands on WM updating, interfere with playing tit-for-tat in the PDG. We would like to note that maintaining items in WM may involve some kind of “refreshing” process which directs attention to the items stored in WM (Vergauwe & Cowan, 2014). Such a “refreshing” of contents stored in WM may, at first glance, appear to be conceptually similar to the updating of WM contents. However, while updating involves monitoring for new relevant information and replacing old (irrelevant) WM contents with new relevant ones, “refreshing” only operates on items already maintained in WM. Thus, contrary to the n-back task in Experiment 1, the memory task of Experiment 2 did not require WM updating processes.