1 Introduction

Cooperation is a key feature of the collective organization of groups of individuals, and its role in the evolution of species has already been addressed [1,2,3,4,5]. However, how cooperation emerges and persists under fierce competition is an overarching problem, since it runs against the fundamental principles of natural selection. A common mathematical framework for addressing cooperation is evolutionary game theory. Since the seminal work on network reciprocity by Nowak and May [6], many studies of the structure of interactions among individuals have been conducted on different topologies, such as small-world networks [7,8,9,10], scale-free networks [11,12,13,14,15,16,17,18], as well as random and adaptive networks [19,20,21,22] (see [23] for a review). In addition, social factors such as reputation, age, reward, punishment, aspiration or voluntary participation have been studied in the framework of network reciprocity (see [24] for a review). Very recently, interdependent networks have been considered a proper framework for these studies, since seemingly irrelevant perturbations in one network can have catastrophic consequences in another network [25,26,27]. It has been shown that the interdependence of networks broadens the scope of mechanisms that support cooperation [28,29,30], including interdependent network reciprocity, self-organized interdependence between two layers, optimal interdependence, as well as information sharing [29,31,32,33,34].

Unlike previous works, in this paper we propose a new learning ability rule and test its performance on interdependent networks. Under this rule, an individual’s learning ability is weakened if its own payoff is not lower than that of its environment; conversely, if its payoff is below that of the environment, its learning ability is strengthened. In addition, only players whose learning ability is at or above a given threshold T are rewarded with an external link to the corresponding player in the other network. Our results show that, when the threshold T is moderate, the individuals’ learning abilities at the stationary state spontaneously organize into a two-class distribution, and the resulting species diversity supports the evolution of cooperation even if natural selection strongly favors defection.

Our manuscript is organized as follows: in Sect. 2 we present the details of the coevolutionary model, while Sect. 3 is devoted to the presentation of our main results. Finally, we summarize the main conclusions and discuss potential directions for future research.

2 Methods

Let us start with players that are staged on two disjoint square lattices of size \(L \times L\), called network A and network B, with periodic boundary conditions. Each player has four direct neighbors and is initially designated as either a cooperator (C) or a defector (D) with equal probability. The strategies are described as:

$$\begin{aligned} s_x=\begin{pmatrix} 1 \\ 0 \end{pmatrix}\qquad \text {or} \qquad s_x=\begin{pmatrix} 0 \\ 1 \end{pmatrix}. \end{aligned}$$
(1)

The accumulation of payoffs \(P_x\) and \(P_{x^{\prime }}\) in both networks occurs as follows, where \(x^{\prime }\) denotes the player corresponding to x in the other network. For easier comparison, one of the most popular game models, the weak prisoner’s dilemma game (PDG), is adopted on both networks, since it captures all the essential features of the prisoner’s dilemma. Mutual cooperation (defection) yields the reward (punishment) \(R = 1\) (\(P = 0\)); in a cooperation–defection pairwise interaction, the cooperator receives the sucker’s payoff \(S = 0\), and the defector enjoys the temptation to defect \(T = b\) [2, 24]. Elsewhere in this paper, T denotes the learning ability threshold rather than this payoff. The payoff matrix A is defined by:

$$\begin{aligned} \mathbf {A} = \left( \begin{array}{ll} 1 &amp; 0 \\ b &amp; 0 \end{array}\right) . \end{aligned}$$
(2)

The total payoff \(P_x\) of player x is calculated via the following equation:

$$\begin{aligned} P_x=\sum _{y \in \Omega _x} s^{T}_x \mathbf {A} s_y, \end{aligned}$$
(3)

where \(\Omega _x\) is the set of all neighbors of player x.
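As an illustration of Eq. (3), the following is a minimal sketch that computes the accumulated payoff of every player on one periodic square lattice. It is not the authors' code: the function name `payoffs`, the encoding of strategies as 1 (cooperator) and 0 (defector), and the use of NumPy are assumptions made here for brevity.

```python
import numpy as np


def payoffs(strategy: np.ndarray, b: float) -> np.ndarray:
    """Accumulated weak-PD payoff P_x for each site of a periodic square lattice.

    strategy[i, j] is 1 for a cooperator and 0 for a defector
    (an encoding assumed for this sketch).
    """
    # Count cooperating neighbours among the four nearest neighbours.
    # np.roll wraps around the edges, which gives periodic boundaries.
    coop_neighbors = sum(
        np.roll(strategy, shift, axis)
        for axis in (0, 1)
        for shift in (1, -1)
    )
    # Payoff matrix of Eq. (2): C vs C -> 1, C vs D -> 0, D vs C -> b, D vs D -> 0.
    return np.where(strategy == 1, 1.0 * coop_neighbors, b * coop_neighbors)
```

For example, a single defector surrounded by cooperators collects 4b, while each of its cooperating neighbors collects 3 instead of 4.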

Since our coevolution setup may lead to heterogeneous coupling, only those players who meet a certain condition (given in Eq. (6)) can benefit from the other network. The fitness of these players is given by \(F_x=P_x+0.5*P_{x^{\prime }}\); for players who cannot obtain this additional benefit, the fitness is simply \(F_x = P_x\). Given this definition of fitness, player x in network A adopts the strategy of one of its neighbors y, chosen at random, with the following probability:

$$\begin{aligned} W=\frac{w_{x}}{1+\exp {[(F_x-F_y)/K]}}. \end{aligned}$$
(4)
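Equation (4) can be evaluated directly. The sketch below is only illustrative: the function name and argument names are hypothetical, and the overflow guard is an implementation detail added here, not part of the model. The default `k=0.1` is the selection strength fixed in the text.

```python
import math


def imitation_probability(f_x: float, f_y: float, w_x: float, k: float = 0.1) -> float:
    """Eq. (4): probability that player x adopts the strategy of neighbour y."""
    z = (f_x - f_y) / k
    # math.exp overflows for arguments slightly above 709; in that regime
    # the probability is zero to machine precision anyway.
    if z > 700.0:
        return 0.0
    return w_x / (1.0 + math.exp(z))
```

With equal fitness the probability is \(w_x/2\); it approaches \(w_x\) when y is much fitter than x and approaches 0 in the opposite case, so \(w_x\) acts as an upper bound on how readily x imitates.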

Strategy adoption in network B is determined analogously. For simplicity, we fix the strength-of-selection parameter at \(K=0.1\) [35]. The factor \(w_x\) is the learning ability of player x; it evolves with the player’s performance according to the winner-weaken-loser-strengthen rule: if player x wins the competition with its environment, its learning ability is weakened, and otherwise it is strengthened. Many examples in biology seem to obey such a rule. For example, one possible explanation of the Cambrian explosion is that the rise of the oxygen level in the biosphere made a great variety of species appear on Earth within just over 20 million years, producing a flourishing scene of many coexisting species [36, 37]. Once the environment is no longer suitable for certain species, they accelerate their evolution to adapt to it. The rule reads:

$$\begin{aligned} \left\{ \begin{array}{ll} w_{x}= w_{x}-\delta &{}\quad \text {if }P_{x}\ge \overline{P}\\ w_{x}= w_{x}+\delta &{}\quad \text {if }P_{x}<\overline{P}\\ \end{array} \right. \end{aligned}$$
(5)

Fig. 1 The coevolution of cooperation and network interdependence resolves the social dilemma for \(b=1.03\) (a) and \(b=1.8\) (b). Moderate values of T provide the best environment for cooperation to spread

Fig. 2 Characteristic snapshots reveal that the promotion of cooperation is due to species diversity. From left to right, snapshots of the upper (top) and lower (bottom) network as obtained for \(T=0.5\), \(\delta =0.3\) after 0, 260, 1000, 2000, 50,000 MCS. Cooperators and defectors without an external link are shown in red and light gray, respectively; wine and gray mark cooperators and defectors that have an external link to the corresponding player in the other network. The final pure C phase is not shown

Fig. 3 In the absence of species diversity, cooperation on the two networks soon goes extinct. From left to right, snapshots of the upper network (first and third rows) and lower network (second and fourth rows) as obtained for \(T = 0\) (first two rows) and \(T = 1.1\) (last two rows) after 0, 3, 10, 20, 5000 MCS. The color code is the same as in Fig. 2

Here, \(\overline{P}\) in Eq. (5) is defined as the average payoff of the neighbors of x, and \(0 \le \delta \le 1\) is a parameter that sets the rate at which the learning ability changes. In addition, we use \(w_x=1\) as the initial value and, in order to avoid frozen states, set the minimal value of \(w_x\) to 0.1. Considering that it is in human nature to pursue fairness, we let losers benefit from the other network, whereas winners cannot. That is to say, player x is allowed to have an external link to its corresponding player in the other network only when its learning ability satisfies \(w_x \ge T\). If \(w_x<T\), the external link is terminated. Such a setup promotes equality among the players on the network. Furthermore, in the concluding section, we also make a comparison with [38], in which the converse condition is used: only the winners can benefit from the other network, while losers cannot. To make our setup clearer, we describe it by the following equation (Eq. 6):

$$\begin{aligned} \left\{ \begin{array}{ll} F_x=P_x+0.5*P_{x^{\prime }} &{}\quad \text {if }w_{x}\ge T\\ F_x=P_x &{}\quad \text {if }w_{x}<T\\ \end{array} \right. \end{aligned}$$
(6)
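Equations (5) and (6) together can be sketched as two small array operations. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, all players are updated at once for brevity (the simulations use asynchronous updating), and the upper bound of 1 on \(w_x\) is inferred from the initial value and the reported results rather than stated explicitly in the text.

```python
import numpy as np

W_MIN = 0.1  # minimal learning ability, as stated in the text
W_MAX = 1.0  # assumed cap, inferred from the initial value w_x = 1


def neighbor_mean(values: np.ndarray) -> np.ndarray:
    """Mean of the four nearest-neighbour values on a periodic lattice."""
    total = sum(np.roll(values, s, a) for a in (0, 1) for s in (1, -1))
    return total / 4.0


def update_learning_ability(w: np.ndarray, payoff: np.ndarray, delta: float) -> np.ndarray:
    """Eq. (5): winners (P_x >= mean neighbour payoff) weaken, losers strengthen."""
    p_bar = neighbor_mean(payoff)
    w_new = np.where(payoff >= p_bar, w - delta, w + delta)
    return np.clip(w_new, W_MIN, W_MAX)


def fitness(payoff: np.ndarray, partner_payoff: np.ndarray,
            w: np.ndarray, threshold: float) -> np.ndarray:
    """Eq. (6): only players with w_x >= T add half of the partner's payoff."""
    linked = w >= threshold  # True where the external link exists
    return payoff + 0.5 * partner_payoff * linked
```

Because of the floor at 0.1, any threshold below 0.1 leaves every player linked, while a threshold above the assumed cap of 1 leaves no player linked, which matches the two limiting cases discussed in the Results.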

We simulate our model by means of an asynchronous updating rule, i.e., each player on the networks has, on average, one chance to update its strategy during a full Monte Carlo step (MCS). In our simulations, the system size varies from \(L=100\) to \(L=2000\), and equilibration requires \(10^{5}\) to \(10^{6}\) steps. (These choices of simulation parameters allow us to avoid finite-size effects and obtain accurate results.)
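One full Monte Carlo step of the asynchronous scheme might be sketched as follows. Several details are assumptions made for this sketch and are not specified in the text: the corresponding player \(x^{\prime }\) is taken to sit at the same lattice position in the other network, the learning ability of the focal player is updated immediately after its strategy update, \(w_x\) is capped at 1, and all function names are hypothetical.

```python
import numpy as np

OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # four nearest neighbours


def local_payoff(s, i, j, b):
    """Weak-PD payoff of site (i, j) against its four periodic neighbours."""
    L = s.shape[0]
    n_coop = sum(int(s[(i + di) % L, (j + dj) % L]) for di, dj in OFFSETS)
    return float(n_coop) if s[i, j] == 1 else b * n_coop


def site_fitness(own, other, w_own, i, j, b, T):
    """Eq. (6), assuming the partner x' occupies the same site in the other network."""
    f = local_payoff(own, i, j, b)
    if w_own[i, j] >= T:
        f += 0.5 * local_payoff(other, i, j, b)
    return f


def monte_carlo_step(s, w, b, T, delta, K=0.1, rng=None):
    """One MCS; s and w are lists of two L x L arrays (networks A and B), updated in place."""
    rng = np.random.default_rng() if rng is None else rng
    L = s[0].shape[0]
    for _ in range(2 * L * L):  # on average one update attempt per player
        net = int(rng.integers(2))
        i, j = (int(v) for v in rng.integers(L, size=2))
        di, dj = OFFSETS[int(rng.integers(4))]
        ni, nj = (i + di) % L, (j + dj) % L
        own, other = s[net], s[1 - net]
        f_x = site_fitness(own, other, w[net], i, j, b, T)
        f_y = site_fitness(own, other, w[net], ni, nj, b, T)
        # Eq. (4); the clip only prevents floating-point overflow in exp.
        z = np.clip((f_x - f_y) / K, -700.0, 700.0)
        if rng.random() < w[net][i, j] / (1.0 + np.exp(z)):
            own[i, j] = own[ni, nj]
        # Eq. (5): compare own payoff with the mean payoff of the neighbours.
        p_x = local_payoff(own, i, j, b)
        p_bar = np.mean([local_payoff(own, (i + a) % L, (j + c) % L, b)
                         for a, c in OFFSETS])
        # Small tolerance so that exact ties count as ">=" despite rounding.
        step = -delta if p_x >= p_bar - 1e-12 else delta
        w[net][i, j] = min(1.0, max(0.1, w[net][i, j] + step))
```

A run in the spirit of the text would initialize both strategy arrays with cooperators and defectors in equal proportion, set both learning-ability arrays to 1, and call `monte_carlo_step` repeatedly while recording the fraction of cooperators. This pure-Python loop is meant for small lattices only; the system sizes reported above would need a compiled or vectorized implementation.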

3 Results

Fig. 4 Spatial evolution from a prepared initial state. From left to right, snapshots of the upper network (first row) and lower network (second row) as obtained for \(T = 0.5\) after 0, 100, 400, 2000, 4000 MCS. The color code is the same as in Figs. 2 and 3. Here, the final pure C phase is not shown

We begin by focusing on two values of b for which cooperation fares differently. For \(b = 1.03\), cooperation can survive on an isolated square lattice through the well-known phenomenon of network reciprocity. However, when \(b=1.8\), cooperation can no longer survive on the basis of network reciprocity alone, so that additional reciprocity mechanisms are required. The fraction of cooperation is reported in the color maps of Fig. 1 as a function of the threshold T (below which an external link is terminated) and the rate of change \(\delta \), for \(b = 1.03\) (Fig. 1a) and \(b=1.8\) (Fig. 1b). Let us take a closer look at Fig. 1a. The cases \(\delta =0\) and \(\delta =1\) recover traditional network reciprocity and interdependent network reciprocity, respectively. Accordingly, the cooperation rate equals 0.61 on the interdependent networks and 0.23 on a single network, as already reported in previous work [38]. In contrast, \(\delta > 0\) brings our winner-weaken-loser-strengthen rule into play. Interestingly, there exists a moderate value of the learning ability threshold T at which cooperation fares best. When T is too large or too small, the evolution of cooperation is impeded, and the cooperation rate drops abruptly to 0. When T is less than 0.1, all players reach the threshold; the evolution of cooperation thus proceeds with the support of interdependent network reciprocity and our winner-weaken-loser-strengthen rule. On the other hand, \(T = 1.1\) leaves the two populations fully independent, and the only supporting mechanisms are our winner-weaken-loser-strengthen rule and traditional network reciprocity. The same phenomenon also occurs for \(b=1.8\). In other words, our winner-weaken-loser-strengthen rule performs poorly both on a single network and on fully interdependent networks, but it performs better when the networks are optimally interdependent. In what follows, we explain why cooperation is supported when our rule is enforced together with optimal interdependence.

Fig. 5 (a) Time course of the fraction of cooperation for \(T = 0\), 0.5 and 1.1. (b) The fractions of strategies in the whole population for \(T = 0.5\), \(\delta = 0.3\): distinguished cooperative players (\(f_{Cd}\)), ordinary cooperative players (\(f_{Co}\)), distinguished defective players (\(f_{Dd}\)), and ordinary defective players (\(f_{Do}\)). Figure 5 further demonstrates the key role of wine and light gray in the evolution of cooperation

Figure 2 depicts characteristic evolution snapshots for a moderate value of T, at which species diversity appears. Since a moderate value of T leads to an optimally interdependent network, we use wine (gray) to denote cooperators (defectors) who satisfy \(w \ge T\), and red (light gray) to represent cooperators (defectors) for whom this condition is not satisfied. These snapshots show that species diversity, in association with the players’ state (whether or not they have an external link), provides powerful support for cooperators to form compact clusters and spread. The strategy change rule is not conspicuous, but we can still observe that the spreading cooperators are surrounded by light gray (third panel) on both networks, and that wine can diffuse until the full C phase is reached. This means that wine and light gray play a critical role in the evolution of cooperation. For easier comparison, the snapshots obtained with the same set of parameters for \(T = 0\) and \(T = 1.1\) are given in Fig. 3. In the absence of species diversity (each player has only one state in both cases), cooperation soon goes extinct. To gain further understanding of the strategy change rule, Fig. 4 shows a representative spatial evolution from a prepared initial state. It is clear that boundary gray changes to light gray, and then most of the light gray changes to wine, which further demonstrates that the promotion of cooperation is due to species diversity.

The next step of our study is to examine the evolution of cooperation when starting from a random initial condition. For easier comparison, the values of T are the same as above. In panel (a) of Fig. 5, we observe that cooperation soon goes extinct when T is too large or too small. For moderate T, however, there is a typical negative feedback effect on cooperation, which is the effect of species diversity in addition to the synchronization effect of interdependent networks. Figure 5b shows the fractions of the different strategies in the whole population: distinguished cooperative players (\(f_{Cd}\)), ordinary cooperative players (\(f_{Co}\)), distinguished defective players (\(f_{Dd}\)), and ordinary defective players (\(f_{Do}\)). Gray and wine first change to light gray, until cooperation and gray are blocked by light gray owing to the self-organization of the system. Later, most of the light gray changes to wine, and gray changes to light gray, until a full C phase is attained. This finding further demonstrates the key role of light gray and wine in the cooperation dynamics.

Finally, the evolution of the distribution of learning abilities is reported in Fig. 6. The parameters are the same as those in Figs. 2 and 3. A two-class society emerges: players with the larger learning ability, \(w=1\), are the majority, but they coexist with a non-negligible number of players with lower learning ability, which ensures species diversity in the system. It is worth mentioning that, in our setup, losers, who have larger learning ability, can benefit from the other network, while winners, who have lower learning ability, cannot. Such a setup reduces the gap between strong and weak players; in this sense, we enable equality on the networks and thereby secure the emergence of the full C phase even for large b (i.e., \(b=1.8\)), where cooperation cannot survive by traditional network reciprocity alone. Conversely, if this gap is widened, cooperation goes extinct at \(b=1.3\) [38].

Fig. 6 The spontaneous emergence of a two-class society, whereby only the upper class (losers) is able to benefit from the other network, while the lower class (winners) cannot. Presented results are obtained for \(T = 0.5\) and \(\delta = 0.3\). The two-class society phenomenon helps to establish species diversity, which allows cooperation to spread

4 Conclusion

In summary, we have explored the winner-weaken-loser-strengthen rule. In detail, the learning ability is decreased by \(\delta \) for winners, whose payoffs are not lower than that of their environment, and increased by the same value for losers, whose payoffs are lower than that of their environment. We have shown that, owing to species diversity, cooperation can spread to the whole network even for relatively large values of the parameter that quantifies the individual temptation to defect (i.e., \(b = 1.8\)). In contrast, the lack of species diversity drives cooperation to extinction for too small or too large values of T. Asymmetric interactions are widespread in nature and human society; however, their effects on cooperation are still unclear. If the degree of asymmetry becomes large, can we observe a more pronounced boosting effect? Our answer is no; if the gap between strong and weak players is widened, we are unable to observe the full C phase for \(b = 1.3\) [38]. Conversely, if the degree of asymmetry is reduced, we can still observe the full C phase even for \(b=1.8\). In short, we have demonstrated the key role of enabled equality in the evolution of cooperation.

Species diversity provides another clue for investigating the evolution of cooperation, from which interesting phenomena such as the self-organization of strategies, phase transitions, and rock–paper–scissors effects can be observed [39,40,41]. The presented results broaden the scope of this concept and extend it to interdependent networks. Hopefully, our results can inspire more studies focused on species diversity, especially from the viewpoint of human behavior experiments.