1 Introduction

One interesting class of social phenomena is collective dilemmas, such as the Tragedy of the Commons [1]. Although often discussed in rather abstract terms, collective dilemmas exist in every shared kitchen, every community project, every collective endeavour or service. They are of interest because, in the real world, they often seem to be resolved despite their dilemma structure. This parallels the finding that real people cooperate at high rates in experimental Prisoner’s Dilemma games, contrary to the predictions of game theory. As with that empirical incongruity, several approaches have been proposed to analyse collective dilemmas, such as invoking institutions [2], norms of fairness [3, 4], collective identity and group belonging [5] or collective reasoning [6, 7].

This plethora of solutions suggests that there is no one-size-fits-all account of human decision-making. A meta-framework systematising this variety of decision-making is the Contextual Action Framework for Computational Agents (CAFCA). CAFCA is a two-dimensional framework of contexts in which each dimension has three elements: the social dimension is constituted by the individual, the social and the collective, and the reasoning dimension consists of automatic, strategic and normative reasoning [8].

2 Collective Strategies in a Prisoner’s Dilemma Tournament

In the 1950s the idea of a Prisoner’s Dilemma (PD) was developed to discuss decision situations in which the outcome for one actor depends on the choices of another actor. More specifically, each actor chooses either to defect or to cooperate with the other actor. In its simplest form the PD is a game described by the payoff matrix in Table 1.

Table 1 General Prisoner’s Dilemma pay-off matrix (T = temptation pay-off, R = reward pay-off, P = punishment pay-off, and S = sucker pay-off), satisfying T > R > P > S
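The structure described by the caption above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used for the model: the function name `make_payoff_matrix` and the dictionary layout are assumptions, and the numeric values are the ones reported for the experiments later in this section (T=5, R=3, P=1, S=0).

```python
from typing import Dict, Tuple

COOPERATE, DEFECT = "C", "D"


def make_payoff_matrix(T: float, R: float, P: float, S: float) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Return a two-player PD payoff matrix keyed by (row move, column move).

    Each value is (row player's payoff, column player's payoff).
    The layout is a hypothetical choice made for this sketch.
    """
    if not (T > R > P > S):
        raise ValueError("a Prisoner's Dilemma requires T > R > P > S")
    return {
        (COOPERATE, COOPERATE): (R, R),  # mutual cooperation: reward
        (COOPERATE, DEFECT): (S, T),     # row is the sucker, column is tempted
        (DEFECT, COOPERATE): (T, S),     # row is tempted, column is the sucker
        (DEFECT, DEFECT): (P, P),        # mutual defection: punishment
    }


# Pay-off values reported for the experiments in this section.
payoffs = make_payoff_matrix(T=5, R=3, P=1, S=0)
```

Looking up `payoffs[(DEFECT, COOPERATE)]` gives the pair of payoffs for a defector meeting a cooperator; the ordering check makes the dilemma condition explicit rather than implicit in the numbers.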

In a multiplayer Prisoner’s Dilemma setting, a collectivist strategy can be understood in different ways. Our starting point for determining the switch from an individualistic to a collective mode of decision-making is the size of the coalition k (where coalition means the subjectively experienced group of peers). If an agent thinks the coalition is big enough to make cooperation worthwhile for the collective, it will start cooperating. We concentrate on three collective strategies for making decisions based on a collectivity value that is updated after each round of the game. These strategies are labelled the individual strategy, the memory strategy, and the neighbourhood strategy respectively.

Initially, the agents are scattered randomly on a grid and the collectivity value (the relative collective-mindedness of the agent, a real number between 0 and 1) is distributed randomly across the agent population. A threshold for unconditional cooperation is determined, against which the collectivity value is compared. Three modes of behaviour change are implemented:

  1. Individual Payoff: k is estimated from the agent’s own last payoff. There is a choice between global and local dynamics for updating the collectivity level (and, where memory is relevant, the memory is updated as well). In the local case the average payoff within the game’s radius is considered. In the global case it is the agent’s own last-round payoff. The payoffs are also used to generate dynamics for changing the collectivity of the agents. If an agent defected in the last round and the respective payoff is lower than the reward payoff, its collectivity goes up by 0.01. If an agent cooperated and the respective payoff is higher than the reward payoff, or lower than or equal to the punishment payoff, its collectivity goes down by 0.01.

    (a) If collectivity ≤ cooperation threshold, the agent defects.

    (b) If collectivity > cooperation threshold, the agent

      • cooperates if its last-round payoff is ≥ the sucker payoff;

      • defects otherwise.

  2. Memory: k is estimated from an agent’s memory; experiencing defection above a threshold makes agents defect. Memory is a list consisting of 1s and 0s. In every round the last item in the list is deleted and a 1 or 0 is added at the front, depending on whether the experience was positive (1) or negative (0). A positive experience is one in which the last-round payoff is ≥ the reward payoff. Memory can be constructed globally or locally, as with the collectivity dynamic.

    (a) If collectivity > cooperation threshold, the agent cooperates.

    (b) If collectivity ≤ cooperation threshold, the agent

      • cooperates if the number of positive interactions is ≥ the memory threshold;

      • defects otherwise.

  3. Neighbourhood Evaluation: k is estimated from the average neighbourhood payoff.

    (a) If collectivity > cooperation threshold, the agent

      • cooperates if the neighbourhood payoff is > the punishment payoff;

      • defects otherwise.

    (b) If collectivity ≤ cooperation threshold, the agent defects.
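The three decision rules and the two update dynamics listed above can be written down compactly. The following Python sketch is one possible reading of those rules and is not the Repast implementation: the `Agent` class, all function names, and the clamping of collectivity to [0, 1] are assumptions introduced here for illustration, and the pay-off constants are the values reported for the experiments.

```python
from dataclasses import dataclass, field
from typing import List

COOPERATE, DEFECT = "C", "D"
T, R, P, S = 5, 3, 1, 0   # pay-off values reported for the experiments
STEP = 0.01               # collectivity increment given in the text


@dataclass
class Agent:  # hypothetical container, not taken from the original model
    collectivity: float
    last_payoff: float = R
    defected_last_round: bool = False
    memory: List[int] = field(default_factory=list)


def update_collectivity(agent: Agent) -> None:
    """Raise or lower collectivity by STEP according to the last round."""
    if agent.defected_last_round and agent.last_payoff < R:
        agent.collectivity += STEP
    elif not agent.defected_last_round and (agent.last_payoff > R or agent.last_payoff <= P):
        agent.collectivity -= STEP
    # Assumption: keep the value inside [0, 1]; the text does not say how bounds are handled.
    agent.collectivity = min(1.0, max(0.0, agent.collectivity))


def update_memory(agent: Agent, payoff: float) -> None:
    """Drop the oldest entry and put the newest experience at the front."""
    experience = 1 if payoff >= R else 0
    agent.memory = [experience] + agent.memory[:-1]


def decide_individual(agent: Agent, threshold: float) -> str:
    if agent.collectivity <= threshold:
        return DEFECT
    # Rule as stated in the text; note that every payoff in the matrix is >= S.
    return COOPERATE if agent.last_payoff >= S else DEFECT


def decide_memory(agent: Agent, threshold: float, memory_threshold: int) -> str:
    if agent.collectivity > threshold:
        return COOPERATE
    return COOPERATE if sum(agent.memory) >= memory_threshold else DEFECT


def decide_neighbourhood(agent: Agent, threshold: float, neighbourhood_payoff: float) -> str:
    if agent.collectivity <= threshold:
        return DEFECT
    return COOPERATE if neighbourhood_payoff > P else DEFECT
```

Keeping the decision functions separate from the update functions mirrors the distinction in the text between how an agent chooses a move and how its collectivity and memory change between rounds; whether payoffs are gathered locally or globally would be decided by the caller.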

The model was implemented in Repast and pits a set of decision-making strategies against each other, comparing the average score achieved in each round. The strategies compared are Tit-for-Tat (TfT), Always Defect, Always Cooperate and Random Choice as examples of individualistic strategies, and Individual, Memory and Neighbourhood as three implementations of collective decision-making. The individualistic strategies were compared to the NetLogo implementation of the iterated PD [9] and displayed the same behaviour. Four experiments were performed to compare the collective strategies against TfT.

Simulations ran for 200 steps and we averaged over 4 runs (we ran 5 runs and removed the outlier, as in the original Axelrod tournament [10, 11]). We also used the same pay-off values (T=5, R=3, P=1 and S=0). The main result was that, whilst some collective strategies (Individual and Memory) perform worse than TfT, the neighbourhood-based collective strategy performs similarly to or outperforms TfT. Table 2 below shows that the results for the different strategies are relatively similar and that the combined model resembles the results of the TfT model.
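The averaging procedure (five runs, one outlier removed, mean of the remaining four) can be illustrated as follows. The text does not state how the outlier was identified, so this sketch assumes it is the run whose score lies farthest from the median; the function name `average_without_outlier` and the example scores are made up for illustration and are not results from the model.

```python
from statistics import mean, median
from typing import Sequence


def average_without_outlier(run_scores: Sequence[float]) -> float:
    """Mean of the runs after dropping the single run farthest from the median.

    Assumption: "outlier" means largest absolute distance from the median;
    the original criterion is not given in the text.
    """
    if len(run_scores) < 2:
        raise ValueError("need at least two runs to remove an outlier")
    centre = median(run_scores)
    outlier_index = max(
        range(len(run_scores)),
        key=lambda i: abs(run_scores[i] - centre),
    )
    kept = [s for i, s in enumerate(run_scores) if i != outlier_index]
    return mean(kept)


# Purely illustrative scores for five runs, not data from the experiments.
example_scores = [10, 11, 9, 10, 30]
example_average = average_without_outlier(example_scores)
```

With a different outlier criterion (for example, distance from the mean) the dropped run could differ, so the choice should be stated alongside any reported averages.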

Table 2 Average scores in iterated Prisoner’s Dilemma situations for three collective strategies in comparison to TfT

3 Team Reasoning on the Commons

Team reasoning is an extension to game theory that keeps the idea of utility maximisation but changes the agent to which the utility calculation is applied, from the individual to the group or collective. Team reasoning explicitly allows for both individual and collective utility, and the main question is when agents switch from one mode to the other. The simplest theory of switching is that of Bacharach [6]. According to Bacharach, people automatically switch from individual to collective reasoning when the collective solution to the situation (or “game”) is strongly Pareto dominant. On this view, people look for Pareto optimality rather than a Nash equilibrium.

In [12] a model of the Tragedy of the Commons is presented which implements a variety of “psychological dispositions” such as cooperativeness, fairness, reciprocity, conformity, and risk aversion. These can be seen as implementations of various normative decision mechanisms. Due to space restrictions we do not discuss the results in detail here but rather present an operationalisation of this switch between individual and collective utility maximisation.

When Team Reasoning is switched on, an agent in the model compares the expected utility from a selfish and a cooperative action and if the payoff of the latter is greater than the former, the action reward is set to −1, making it less likely for the agent to add a cow to the pasture. The model was run over the parameter space in Table 3 below.
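The switch described above amounts to a single comparison. The sketch below shows that comparison in Python under stated assumptions: the function name `action_reward`, its parameters, and the `default_reward` returned when the switch does not fire are hypothetical, since the text only specifies that the reward is set to −1 when the cooperative payoff exceeds the selfish one.

```python
def action_reward(
    selfish_utility: float,
    cooperative_utility: float,
    team_reasoning: bool,
    default_reward: float = 1.0,  # assumption: the text gives no value for this case
) -> float:
    """Reward attached to the action of adding a cow to the pasture.

    With team reasoning on, a cooperative utility that exceeds the selfish
    utility sets the reward to -1, making the agent less likely to add a cow.
    """
    if team_reasoning and cooperative_utility > selfish_utility:
        return -1.0
    return default_reward


# Team reasoning on and cooperation pays more: adding a cow is discouraged.
discouraged = action_reward(selfish_utility=2.0, cooperative_utility=3.0, team_reasoning=True)
# Team reasoning off: the comparison is ignored.
unchanged = action_reward(selfish_utility=2.0, cooperative_utility=3.0, team_reasoning=False)
```

How the reward then feeds into the probability of adding a cow depends on the underlying model in [12] and is not reproduced here.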

Table 3 Parameter space for experimentation

Most of the variables were kept constant. We investigated the influence of team reasoning on levels of sustainability, inequality and efficiency. Sustainability was assessed by the number of runs until the system runs out of grass. Inequality was measured by the Gini coefficient [13] plus an absolute count of herdsmen with two cows or fewer (an arbitrary minimum level). The Gini coefficient is the most commonly used measure of inequality in a population, expressing the statistical dispersion of resources on a scale from 0 (absolute equality) to 1 (absolute inequality). Efficiency was assessed by comparing the number of cows and the levels of grass. The variables that were varied were selfishness and cooperativeness. Each combination was run ten times.
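For readers unfamiliar with the two inequality measures, the following Python sketch computes a Gini coefficient from a list of herd sizes, together with the count of herdsmen at or below the minimum level. It uses a standard closed-form expression over sorted values and is not taken from the model code; the function names and the example herd sizes are illustrative assumptions.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient of non-negative values (0 = absolute equality).

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n with x sorted
    ascending and i starting at 1. For a finite population the maximum
    is (n - 1) / n, which approaches 1 as n grows.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0  # convention chosen for this sketch
    ordered = sorted(values)
    weighted = sum((i + 1) * v for i, v in enumerate(ordered))
    return 2 * weighted / (n * total) - (n + 1) / n


def count_at_or_below_minimum(cows: Sequence[int], minimum: int = 2) -> int:
    """Number of herdsmen owning `minimum` cows or fewer."""
    return sum(1 for c in cows if c <= minimum)


# Illustrative herd sizes, not simulation output.
equal_herds = [4, 4, 4, 4]
unequal_herds = [0, 0, 0, 10]
```

Here `gini(equal_herds)` is 0 and `gini(unequal_herds)` is 0.75, the finite-population maximum for four herdsmen; the absolute count complements the Gini coefficient because a low coefficient can also arise when everyone is equally poor.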

The results in [12] show that selfish scenarios are not sustainable; [12] also explores several psychological amendments to the payoff function and shows that some of them lead to sustainable outcomes. The selfish scenario compares the financial benefit of adding a cow with the cost of adding the cow. The cooperative scenario compares the group’s average financial benefit with the cost of this action relative to the current situation.

We interpret the cooperative scenario as utility maximisation for the group, equivalent to the individual utility maximisation of the selfish case. Team reasoning is implemented simply by comparing the individual utility and the cooperative utility. Varying selfishness and cooperativeness determines the extent to which the final decision is informed by each calculation.

The first result is that team reasoning simulations are sustainable even if the levels of selfishness are high. Figure 1 shows that in the individualistic model runs the Gini coefficient is low due to unsustainability of the Commons (top of Fig. 1). In the collectivistic case, the Commons is sustainable resulting in a higher Gini coefficient than in the individualistic case except in the bottom right corner (bottom Fig. 1). Thus team reasoning is a viable option to use as a decision-making strategy in commons dilemmas.

Fig. 1 Simulation results displaying the variation of the Gini coefficient in the individualistic (top) and collectivist (bottom) cases. X-axis = cooperativeness, Y-axis = selfishness

4 Conclusions

We presented two implementations of collective reasoning in models of social dilemmas, one of the iterated PD, the other of the Tragedy of the Commons. The main purpose of the simulations was to show how alternatives to classical rational choice decision-making can be used to model the empirical phenomenon of social dilemmas being resolved. The first model explored whether collective strategies can compete in payoff terms with Tit-for-Tat, the winning strategy in the original tournament. The answer was that collective strategies overall perform similarly to TfT and that the neighbourhood-focussed collective strategy performs similarly to or outperforms TfT. The second model explored whether an explicit implementation of team reasoning can be used to explain the resolution of collective resource dilemmas such as the Tragedy of the Commons. The experiments showed that team reasoning outperforms simple selfish decision strategies and, in addition, has positive consequences for the equity of society, without a relevant reduction in profits. Future work will fully explore the extension to the Tragedy of the Commons model and implement other versions of team reasoning which are more dependent on group features.