Introduction

Living systems do not operate in isolation. Interactions between entities define the structure and function of the living world. These interactions can be broadly categorized as antagonistic or complementary, that is, as competitive or cooperative behavior. Cooperative behavior among individuals is a defining characteristic of complex organismal societies, and Darwin (1859) considered it a particular difficulty, “…actually fatal to my theory…”, for his theory of evolution by natural selection. The difficulty arises from a distinct definition of the word cooperation that is more properly termed altruism. Altruistic behavior is defined by a cooperative act that imposes a fitness penalty on the actor, and a concomitant fitness benefit for the recipient of the act. For Darwin, and many other evolutionary biologists (Hamilton 1964; Doebeli and Hauert 2005), this type of behavior should not evolve by natural selection. And yet it has. Why?

There are two primary explanations for the evolution of cooperative behavior. One explanation for this type of behavior is based on genetic relatedness. In this formulation, costly behavior will benefit the genes of the actor if there is a sufficiently high statistical probability those genes are present in the recipient. This is formally called kin selection and was developed by Hamilton (1964). But this doesn’t help explain the observation of altruistic behavior, particularly in human societies, among unrelated individuals. Another explanation for altruistic behavior is based on the frequency of interactions between individuals, regardless of relatedness, in a society. When the opportunity for cooperative, or altruistic, behavior is iterated by repeated interactions between individuals who are likely to reciprocate acts of cooperation, such behavior is more likely to evolve (Trivers 1971).

One obstacle to the evolution of reciprocal cooperation is that selfish individuals benefit within a cooperative group, because the act of cooperation carries an intrinsic cost (Romano and Balliet 2017). If there were no cost, there would be no benefit to avoiding cooperation. Therefore, for cooperation to evolve in any setting, the short-term costs of cooperation must be more than recovered in the long term (Kurzban et al. 2015).

The Prisoner’s Dilemma is an elegant game theory model that addresses the inherent stress selfish individuals place on cooperative group behavior (Axelrod 1980). Calling a selfish non-cooperator a defector, the model is defined by the following payoff conditions: the payoff to a defector interacting with a cooperator (T) is greater than the payoff to a cooperator interacting with another cooperator (R), which is greater than the payoff to a defector interacting with another defector (P), which is greater than the payoff to a cooperator interacting with a defector (S); in addition, 2R is greater than T + S. That is, T > R > P > S and 2R > T + S. The model encapsulates the dilemma of cooperation because, in any single interaction, individual selection would favor defection from cooperation. But since the game is symmetrical, any two individuals would benefit more from collective cooperation than from collective defection (Killingback and Doebeli 2002).
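The payoff conditions above can be checked mechanically. The following Python sketch is illustrative only: the values T = 5, R = 3, P = 1, S = 0 are conventional textbook values and are an assumption here, not values taken from this study, and the function names are hypothetical.

```python
# Illustrative payoffs (assumed conventional values; not taken from this study).
T, R, P, S = 5, 3, 1, 0


def is_prisoners_dilemma(t, r, p, s):
    """Return True when the payoffs satisfy T > R > P > S and 2R > T + S."""
    return t > r > p > s and 2 * r > t + s


def payoff(own, partner, t=T, r=R, p=P, s=S):
    """Payoff to the focal player; moves are "C" (cooperate) or "D" (defect)."""
    table = {("D", "C"): t, ("C", "C"): r, ("D", "D"): p, ("C", "D"): s}
    return table[(own, partner)]


if __name__ == "__main__":
    print(is_prisoners_dilemma(T, R, P, S))
    # Defection pays more than cooperation against either partner move...
    print(payoff("D", "C") > payoff("C", "C"), payoff("D", "D") > payoff("C", "D"))
    # ...yet the pair earns more from mutual cooperation than from mutual defection.
    print(2 * payoff("C", "C") > 2 * payoff("D", "D"))
```

The second condition, 2R > T + S, ensures that two players cannot do better by taking turns exploiting each other than by cooperating consistently.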

We have taken advantage of these properties of The Prisoner’s Dilemma (PD) game to develop a web application that was used to test the likelihood of cooperative behavior in different group settings and the effect game play would have on student comprehension of instructional material related to the evolution of cooperation. Hypotheses regarding cooperative behavior were largely derived from Trivers’ (1971) development of the evolution of reciprocal altruism, principally that cooperative behavior was more likely to evolve in smaller distinct group settings that we termed “Non-random”, when compared to larger group settings that we termed “Random”. The hypothesis of cooperation evolving in selective group environments was tested by distinct versions of the web application. Student comprehension of instructional material related to the evolution of reciprocal altruism was assessed with pre- and post-game play assessments. The primary learning goals associated with implementation of the game were to promote student engagement and demonstrate comprehension of content related to evolution and cooperation. Qualitative attitudinal assessments addressed student appreciation of, and engagement in PD game playing.

Methods

An online interactive game with user and administrative functions was developed for use in the classroom. Two versions of the game were developed to test hypotheses related to reciprocal altruism (Trivers 1971). One version, called Random, allowed all users in a class to access the game environment with an equal probability of encountering other online users in dyadic interactions defined by the rules of the Prisoner’s Dilemma game (Rapoport and Chammah 1965). The other version, called Non-random, confined interactions to groups of four players defined by the administrator under encoded rules that limited the composition of each group according to the players’ established online identities. Online identities were established at player registration and consisted of a username and password, one of three randomly assigned colors (Blue, Red, or Yellow), and a number generated by the application. After a player logs in with their username and password, they are shown to other players currently online as a color and number combination, such as Red-23 (Fig. 1). Each online player can see the other online players in this color-number format. The four-player Non-random version of the game grouped three players of different colors with a fourth player matching the color of one of the three. The intended purpose of the game was to collect individual decisions and scores based on the rules of the Prisoner’s Dilemma game and to engage students in the concepts related to the evolution of cooperative behavior as taught in the evolution course BIOL 3000.
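The composition rule for Non-random groups can be expressed as a simple check. This is a minimal sketch and not the application's actual code: the function names are hypothetical, and only the color-number label format (for example, Red-23) and the rule of three different colors plus one repeat come from the description above.

```python
from collections import Counter

COLORS = ("Blue", "Red", "Yellow")


def player_label(color, number):
    """Build the color-number identity shown to other players, e.g. "Red-23"."""
    return f"{color}-{number}"


def is_valid_nonrandom_group(labels):
    """Check the four-player rule: all three colors present, so one color repeats."""
    colors = [label.split("-")[0] for label in labels]
    counts = Counter(colors)
    return len(labels) == 4 and set(counts) == set(COLORS)


if __name__ == "__main__":
    group = [player_label(c, n) for c, n in
             [("Red", 23), ("Blue", 7), ("Yellow", 12), ("Red", 41)]]
    print(group, is_valid_nonrandom_group(group))
```

With four players and all three colors represented, exactly one color appears twice, which matches the grouping described in the text.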

Fig. 1

The Georgia Gwinnett College Prisoner’s Dilemma Game Play screen is shown above. Player identities are defined by color and number. Decision buttons allow players to cooperate or defect. Previous round decisions are made visible for both players, and scores, based on a payoff matrix, for game play are shown (upper left)

Georgia Gwinnett College (GGC) is an undergraduate degree-granting liberal arts college in the University System of Georgia. GGC is an open-access minority-serving institution, and recently certified as a Hispanic-serving institution, in the Atlanta metropolitan area. Class sizes in the School of Science and Technology are relatively small, with laboratory classes capped at 24 students, lecture classes capped at 28 students. The Evolution course (BIOL 3000) is a lecture class with outcome goals related to a broad survey of evolutionary concepts, and is a required course for General Biology concentration majors.

Georgia Gwinnett College students in the Evolution course, BIOL 3000, played the online Prisoner’s Dilemma (PD) game over multiple semesters, with different students playing each semester. A total of 228 players across the semesters, 137 in the Random treatment and 91 in the Non-random treatment, played the online game throughout their semester. Students were introduced to the rules of the game and the payoff matrix (Fig. 2) in class and through online resources at the beginning of each semester. Students were not aware of the applied treatment effects. Students were incentivized to adopt “winning” strategies by the offer of extra credit points at the end of each semester for the student(s) with the highest score. Students were not penalized for not participating or for low scores.

Fig. 2

The payoff matrix for Prisoner’s Dilemma game play. The matrix represents scoring for the prisoner (rows) given the playing partner’s decision (columns). Students in Georgia Gwinnett College’s BIOL 3000 (Evolution) class were given the above matrix prior to game play

Game-play consisted of weekly, or bi-weekly, time periods at the beginning of class lasting approximately five minutes, in which students would log into the online PD game server and play a series of rounds, each consisting of three to five iterations of the game. Each game was a one-on-one (dyadic) encounter between students randomly assigned according to treatment effects and consisted of iterated decisions to cooperate or defect from cooperation. Students were aware of the color/number of the other player but were encouraged to keep online identities confidential. Though some loss of confidentiality may have been unavoidable in the small class settings, the loss of anonymity was not necessarily a hindrance to the successful application of the game. Prior knowledge is an important component in the evolution of cooperation (Trivers 1971). After each iteration, scores and decisions were shown to players, and cumulative scores across rounds were shown to students individually. The number of rounds per session was determined by the time period and rate of play. The number of iterations per game was determined by the administrator.

Assessment of instructional content related to the evolution of cooperative behavior and game theory was administered prior to game play at the beginning of each semester and at the end of each semester after game play ceased (Pre- and Post-game play). Content assessment consisted of five questions with four multiple choice answers each (Supp. Material 1). Pre- and post-game play assessment scores were analyzed using Welch two sample t-test in R statistical language (R Core Team 2020). Qualitative attitudinal assessments of the PD Game were conducted at the end of each semester using a Likert scale for scoring. All assessments were voluntary and anonymous.

Player score information was collected and stored in the server which could be exported as an Excel file for analysis at the end of each semester. Percent cooperation was determined as a fraction of the maximum collective score (six points per iteration) weighted by the number of game iterations per round. Random and Non-random treatments were analyzed for significant differences in overall percent cooperation using R statistical language and a Welch two sample t-test. Strategic game play, defined as Tit-For-Tat (TFT) or Pavlov (Milinski 1993; Wedekind and Milinski 1996), was determined for the subset of games for which strategies could be determined (games one through three). The determination protocol for both strategies was based on the probability of cooperation, Pc, given previous game conditions such that Tit-For-Tat Pc = 1 and Pavlov Pc = 0 after previous defection with partner cooperation, and Tit-For-Tat Pc = 0 and Pavlov Pc = 1 for previous defection and partner defection. Strategic differences between treatments within rounds of play, and within treatments between rounds of play were analyzed using a chi-squared test for equality of proportions (prop.test function) in R statistical language (R Core Team 2020).
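The strategy determination protocol can be sketched in Python. This is an illustration under stated assumptions rather than the authors' analysis code (which was written in R): the function names and the record layout are hypothetical. It relies on the point made above that the two strategies prescribe opposite moves only after the focal player defected in the previous iteration, so other histories are treated as uninformative.

```python
def classify_decision(prev_own, prev_partner, current):
    """Label one decision as "TFT" or "Pavlov", or None when uninformative.

    Moves are "C" (cooperate) or "D" (defect). Only decisions that follow the
    player's own defection distinguish the two strategies.
    """
    if prev_own != "D":
        return None
    tft_move = prev_partner                              # reciprocate partner's last move
    return "TFT" if current == tft_move else "Pavlov"    # Pavlov makes the opposite move


def tft_frequency(records):
    """Fraction of informative decisions consistent with Tit-For-Tat."""
    labels = [classify_decision(*record) for record in records]
    informative = [label for label in labels if label is not None]
    if not informative:
        return None
    return informative.count("TFT") / len(informative)


if __name__ == "__main__":
    # Hypothetical records: (previous own move, previous partner move, current move)
    records = [("D", "C", "C"), ("D", "D", "C"), ("C", "C", "C"), ("D", "D", "D")]
    print(tft_frequency(records))
```

Because only two labels are possible for each informative decision, the Pavlov frequency is simply one minus the Tit-For-Tat frequency.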

Results

The 137 students in the Random treatment played a total of 2248 rounds, 963 of which were five-game rounds and 1285 were three-game rounds, for a total of 8670 cooperate/defect decisions made by students. The 91 students in the Non-random treatment played a total of 1251 rounds, 445 of which were five-game rounds, 677 were four-game rounds and 129 were three-game rounds, for a total of 5320 cooperate/defect decisions made by students.

Average percent cooperation for all students was 37.0 (s = 35.9). Treatment effects appeared to be a predictor of cooperative behavior. Significant differences for percent cooperation were observed between the two treatments (t = 10.66, df = 2192.5, p-value < 2.2e-16) with higher average percent cooperation for Non-random treatment students, 45.9 (s = 39.6) than for Random treatment students, 31.9 (s = 32.5) (Fig. 3).
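As a rough consistency check, the Welch statistic can be recomputed from the summary values reported here. The sketch below is in Python rather than the R used in the study, and it assumes that the unit of analysis was the round, with 1251 Non-random and 2248 Random rounds; that choice of sample size is an assumption, and the inputs are rounded, so the output can only approximate the reported values.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1          # squared standard error, sample 1
    v2 = sd2 ** 2 / n2          # squared standard error, sample 2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


if __name__ == "__main__":
    # Non-random: mean 45.9, s = 39.6; Random: mean 31.9, s = 32.5.
    # Sample sizes are assumed to be the numbers of rounds per treatment.
    t_stat, df = welch_t(45.9, 39.6, 1251, 31.9, 32.5, 2248)
    print(round(t_stat, 2), round(df, 1))
```

Under that assumption the recomputed statistic falls close to the reported t = 10.66 and df = 2192.5, with the small difference in degrees of freedom attributable to rounding of the means and standard deviations.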

Fig. 3

Percent cooperation (+/- 1 SE) among Georgia Gwinnett College students in BIOL 3000 (Evolution) class while playing the Prisoner’s Dilemma game. The Random treatment player environment was a class of students where playing partners were chosen randomly. The Non-random player environment was a pre-determined set of four players. Significant differences across treatments were observed (p < 0.001)

Decision-making behavior of students in both treatments conformed more frequently to the strategy referred to as Tit-For-Tat (TFT) than to the Pavlov strategy. Strategic decision-making was based on previous knowledge, so the information from the first round was used as previous knowledge for the second round. Significantly higher levels of TFT were observed for both treatments across all rounds of play (Fig. 4), with a significant increase in TFT decision-making between rounds two, three, and four (χ² = 250, df = 3, p < 0.001). Non-random players were more likely to initiate a TFT decision strategy (round 2: χ² = 6.53, df = 1, p < 0.02) compared to Random players, and a significant increase was observed between rounds one and two (χ² = 35.65, df = 3, p < 0.001).

Fig. 4

The frequency of Tit-for-Tat decision strategies employed by students in Random and Non-random class treatments is shown above. Decision strategies were determined for game rounds two through five. Frequencies are relative to the alternative decision strategy called Pavlov, such that Pavlov decision strategy frequency is 1 – Tit-for-Tat decision strategy frequency

Significant differences were observed for pre- and post-game play instructional content assessment (t = 5.6505, df = 213.89, p-value = 5.07e-08; Fig. 5). Average assessment score for pre-game play was 29.6 (s = 23.0); average assessment score for post-game play was 47.3 (s = 26.1). Student responses to post-content qualitative assessment suggested the PD game was easy to use, enjoyable, and improved student understanding and confidence regarding instructional content (Fig. 6).

Fig. 5

Average instructional content assessment scores (+/- 1 SE) are shown for pre-game play and post-game play for Georgia Gwinnett College students in BIOL 3000 playing the Prisoner’s Dilemma online game. Significant differences across treatments are indicated with asterisks (p < 0.001)

Fig. 6

Likert scale mean values (+/- 1 SE) for the assessment questions listed are shown above. Assessment questions were administered to Georgia Gwinnett College BIOL 3000 (Evolution) classes at the end of each semester

Discussion

Smaller social environments, as those modeled by the Non-random treatment, increase the opportunity for specific interactions between individuals and allow for greater prior knowledge in regard to decision-making. These are considered preconditions for the evolution of cooperation (Trivers 1971). Smaller social groupings are considered necessary prerequisites for increasing the probability of the evolution of cooperation (Ale et al. 2013). In general, the observed treatment effects, greater percentage of cooperation in the smaller group setting (Non-random), conforms with expectation regarding the evolution of cooperation.

Observed cooperation was low in both treatments (31.9-45.9%), but not unexpected given game constraints regarding restrictions on the number of rounds players would interact. Though the number of interaction rounds varied from three to five, at the discretion of the game administrator, for any given class period the number of interaction rounds was set. This allowed students to develop end-game strategies that incentivized non-cooperation. For example, pairwise cooperation in early rounds created an environment in which the first player to choose non-cooperation (defect) would be rewarded (Selten and Stoecker 1986). When there is a known finite number of rounds non-cooperation is considered an evolutionarily stable strategy (ESS) (Axelrod and Hamilton 1981). Our overall percent cooperation results align with previous findings with multiplayer iterated Prisoner’s Dilemma game play (Grujic et al. 2012) where percent cooperation was approximately 30%, similar to random treatment observed percent cooperation, and to findings with global competition, a reward scenario similar to the one used in the present study, in which 44% cooperation was observed (West et al. 2006), similar to non-random treatment results.

Decision-making was observed in the context of two well-described PD game strategies: Tit-for-Tat (TFT) and Pavlov (Axelrod 1980; Axelrod and Hamilton 1981; Nowak and Sigmund 1993). Both strategies use prior knowledge to inform subsequent decisions, but do so in different ways. TFT starts with cooperation and then replies in kind to the previous decision of the partner, such that previous partner cooperation results in subsequent cooperation, and previous non-cooperation results in subsequent non-cooperation. Pavlov is considered a “win-stay, lose-shift” strategy (Nowak and Sigmund 1993) in which prior knowledge is used to maximize payoffs, such that the winning decision for cooperation (when the partner also cooperates) or non-cooperation (when the partner cooperates) results in the same behavioral decision the following round (win-stay). Any other combination, which results in the lowest point value for that decision, results in a behavioral shift the following round (lose-shift). While TFT is based on reciprocation, Pavlov is based on reward.
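The two decision rules can be written as short functions. The Python sketch below is illustrative, with hypothetical function names; the opening move for Pavlov is not specified in the text and is assumed here to be cooperation.

```python
def tit_for_tat(prev_own, prev_partner):
    """Cooperate first, then copy the partner's previous move."""
    if prev_partner is None:
        return "C"
    return prev_partner


def pavlov(prev_own, prev_partner):
    """Win-stay, lose-shift: repeat a move that met cooperation, else switch."""
    if prev_own is None:
        return "C"                      # assumption: open with cooperation
    if prev_partner == "C":
        return prev_own                 # win-stay
    return "D" if prev_own == "C" else "C"  # lose-shift


def play(strategy_a, strategy_b, iterations):
    """Return the list of (move_a, move_b) pairs for an iterated game."""
    history = []
    prev_a = prev_b = None
    for _ in range(iterations):
        move_a = strategy_a(prev_a, prev_b)
        move_b = strategy_b(prev_b, prev_a)
        history.append((move_a, move_b))
        prev_a, prev_b = move_a, move_b
    return history


if __name__ == "__main__":
    print(play(tit_for_tat, pavlov, 5))
```

The rules differ only after a player's own defection: following mutual defection TFT defects again while Pavlov returns to cooperation, and following a successful defection against a cooperator TFT cooperates while Pavlov defects again.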

Both treatments suggested student players were much more likely to employ TFT-like decision making behavior, rather than Pavlov (Fig. 4). TFT is considered an ESS when prior knowledge plays a significant role in decision-making (Ale et al. 2013). In our game-play environment the Non-random treatment is assumed to have greater prior knowledge, as is the higher round number. Consistent with expectation higher round numbers resulted in higher frequencies of TFT (Fig. 4). However, the Non-random treatment did not result in higher TFT frequency compared to the random treatment. The Non-random treatment TFT plateau reached in round 3 may reflect a limit to reciprocal altruism in the face of an alternative Pavlov strategy. It should be noted that strategic game play was not discussed when the game was initiated in the classes. Students were left to discover decision-making strategies on their own.

Instructional content assessment scores, though low overall, showed significant post-game play gains (Fig. 5). The largest contributor to the low post-course scores was question one, regarding the definition of game theory (Supp. Material 1). Only 25% of the class answered this question correctly. All other questions were answered correctly by 50% or more of the students, with the highest percentages for questions 4 and 5, regarding the evolution of cooperative behavior and the theoretical foundation of altruistic behavior evolution, respectively. Upon reflection, the question specifically based on game theory was less informative in terms of course outcome goals, which are specifically about evolutionary theory. Though game theory can be applied to evolution (indeed, that was the point of the PD Game in this application), undergraduate education in evolution at GGC is focused on broader concepts in evolution. The definition of game theory might be more applicable for education in the fields of computer science and economics, where modeling plays a more important role. Given the results, future iterations of the PD game assessment might remove this question or replace it with one that is more directly related to evolution and cooperative behavior. Though low, the highest scores for questions directly related to course content on the evolution of cooperative behavior supported the contention that the active learning strategy exemplified by the PD game favorably complemented the instructional content.

Another potential explanation for the overall low post-course assessment scores could be that the content devoted explicitly to the evolution of cooperative behavior was not the primary focus of the course. The course, BIOL 3000, offers a broad survey of concepts in evolution. Though material regarding the assessment questions was presented in the course, it was not the primary focus and students may have “forgotten” previously taught material. There is a trend among undergraduate students of forgetting previously taught material, based on anecdotal evidence from comments made by several GGC professors. Since the post-course content assessment was administered at the end of the semester and the content on the evolution of cooperation was taught midway through the semester, there may have been some learning decay. A more integrated approach, where discussion and debriefing sessions throughout the course more thoroughly integrate the PD game with core course concepts, may work better in future applications.

The point-scoring system in the Prisoner’s Dilemma was designed to mimic the recognized cost of cooperation and reward of selfish behavior observed in various settings (Axelrod and Hamilton 1981). Our choice to use an extra credit reward for the students with the highest scores was intended to incentivize the students to develop decision-making strategies that would mimic natural selection on organisms that are forced to repeatedly interact with others, where there are clear costs and benefits for both cooperative and selfish behavior (Wilkinson 1984; Rotics and Clutton-Brock 2021). By incentivizing students with extra credit, the learning goals associated with student engagement and with content related to how cooperative behavior evolves were promoted. Regardless of game-play score, learning outcomes related to student understanding of instructional content were achieved, as assessment question responses from students indicated (Fig. 6). These findings support the contention that competition-based learning is not based on score, but on active engagement in the competition (Burguillo 2010). Competition-based application of the Prisoner’s Dilemma game has been previously used in high school (Gracia-Lazaro et al. 2012), economics (Grujic et al. 2012), biology (West et al. 2006), and computer science (Burguillo 2010) courses.

Student responses to a qualitative assessment of the PD game were positive regarding ease of use, enjoyment, and application (Fig. 6). Using a Likert scale, students agreed with the statements “I think the game is easy to use”, and “I enjoy playing the game”. Students also found the game was applicable to their comprehension of instructional content, in general, and to the evolution of cooperation specifically. These results are not surprising given the wealth of data supporting active learning, in general, as a useful tool for undergraduate engagement and comprehension (Michael 2006; Russell et al. 2015), and the PD game, specifically, in terms of education related to more advanced concepts related to conflicts of interest (Dennis 2015; Bruno et al. 2018).

The PD game is an active practice for students that can illustrate the conditions for the evolution of cooperation, namely prior knowledge and a considered response to cooperation or non-cooperation. Given the limitations of the game initiated at Georgia Gwinnett College, several future developments may be useful for further analysis of cooperative decision-making behavior, and education in evolution. Due to the end-game strategy option mentioned previously, which encourages non-cooperation (Selten and Stoecker 1986), increasing the number of interaction rounds per game may be useful for gauging the effect of end-game strategies on the evolution of cooperation. Another development, which would involve a significant modification to the game, would be to include a punishment option in which players could choose, at a cost, to punish non-cooperative partners with point deductions, similar to the punishment options Leighton (2014) describes for public goods games. Punishment options provide the opportunity to disincentivize the temptation to not cooperate (Burguillo 2010). For a more complete analysis of the pedagogical efficacy of the PD game, a control treatment with the same instructional content, and preferably the same instructor, but without the PD game could be used as a comparison to the experimental treatment. It may also be useful in future iterations of the PD game to incorporate periodic debriefing sessions (Bruno et al. 2018) in which students are encouraged to discuss applications of the game and strategic game-play options.

The results of the content and attitudinal assessments suggest students gained a better understanding of the concepts related to the evolution of cooperation (Fig. 5), enjoyed playing the game, and found the game was useful for achieving learning goals (Fig. 6). Optional additional comments from the attitudinal assessment, such as: “I believe PD and The Selfish Gene helped my understanding throughout this course.”, “It’s was a really important game to better understand the concept” and “People act in their own self-interest” supported the contention that the administration of the game can engage students in important concepts related to cooperative behavior.

Conclusions

The PD game serves as an excellent opportunity for students of evolution to actively participate in decision-making behavior that highlights the conflicts associated with the evolution of cooperation. Although this game can be played at any level of education, the complexity of iterative decision-making, prior knowledge, and strategic game play seems to make it more suitable for upper-level high school and undergraduate students, in our opinion. When combined with instructional content related to the evolution of cooperation (Trivers 1971; Hamilton 1964), a more comprehensive curriculum that incorporates active learning can be developed. The PD game platform used in the present study was intended to be a first step in the development of an educational tool that increases student engagement in content related to the evolution of cooperation.

Our dual purpose in developing the online game was to: (1) test hypotheses related to the evolution of cooperation and (2) test the efficacy of the game in an undergraduate evolution course. Student decision-making behavior collected during the course of game play allows administrators of the PD Game to analyze the two game environments, random and non-random, for cooperative behavior. Evolutionary theory regarding cooperation suggests that smaller group environments with increased opportunities for interaction should promote the evolution of cooperative behavior. Relative to the game’s random environment, the non-random environment provides an experimental treatment to test hypotheses related to these conditions. The increased cooperation in the non-random treatment reported herein seems to support existing theory.

Incorporating the PD Game in curriculum related to the evolution of cooperation is straightforward. Student registration and game play can take place on a laptop or cell phone with wireless access. Multiple rounds can be played within a very short space of time since a typical round takes less than one minute. Existing literature on the game as a tool for understanding the evolution of cooperation provides excellent supplementary material for instructional content on evolution, behavior, and social psychology (Trivers 1971; Axelrod and Hamilton 1981; Kurzban et al. 2015). Results from the initial offering of the game suggest students enjoyed playing the game, found it applicable to the study of evolution, and increased their understanding of instructional content related to the evolution of cooperation.

The results from the game can be shared with students at the end of the semester as an educational component in itself, reviewing human decision-making behavior and strategies for interaction. To this extent the game can be applied in many different educational, or other, contexts. Strategic PD game play has been used in many different settings for different learning goals that have the concept of cooperative behavior as a common theme. Our game play results provide one example of how environmental setting can impact cooperative behavior, and how the game can be used to evaluate strategic behavior related to cooperation, such as the Tit-for-Tat and Pavlov strategies evaluated in this study. The game play results were not shared with the class as administered in BIOL 3000 at GGC. This may have been a shortcoming of the administration of the game. Routine debriefing sessions and discussion of overall results, including strategic behavior analysis, would likely increase student learning outcomes (Bruno et al. 2018) and would therefore be recommended practice in future applications of the game.