Keywords

1 Introduction

The problem of a long-term sustainable cooperation designing is an important issue in the theory of dynamic games and their applications in economics (see, e.g., [4, 7, 10, 18, 20, 27, 33] and the references therein). A recent solution called subgame-perfect core (S-P Core) for games in extensive form (with only terminal payoffs) that takes into account both sustainable cooperation motivation and subgame perfection property [29] was proposed in [6]. When considering possible deviations from the cooperative agreement in a subgame along the cooperative history the S-P Core focuses only on the players which still have decision nodes in the current subgame [6, 14, 15, 27] (so-called “active” players). The S-P Core consists of such distributions of the total cooperative payoff (in the corresponding terminal nodes) that no active coalition can implement a joint profitable deviation from the cooperative agreement as a game unfolds along the cooperative path. An extension of the S-P Core to specific n-person discrete-time game of climate change with current payoffs was introduced in [4].

We provide an extension of the S-P Core concept to more general class of extensive-form games, when the payoffs are defined at all nodes of the game tree (such assumption is more natural when modelling different real-life situations (see, e.g., [4, 10, 13, 27])). This extension (called \(\beta \)-S-P Core) is based on designing a certain payoff distribution procedure (PDP) \(\beta \) [10, 26, 27, 33] (now the players redistribute the current total payoff at each position in the cooperative history). It is worth noting that PDP-based approach introduced for differential games in [26] has been extended to various models of dynamic games (see, e.g., [8, 10, 21, 25, 27, 33]) since it provides a powerful tool to sustain a long-term cooperative agreement. A novel solution that incorporates both the core concept and the PDP technique was introduced in [28].

We prove that \(\beta \)-S-P Core satisfies several advantageous properties (see Prop. 1–2, Remark 3) for class of extensive games under consideration and provides a useful mechanism for the players to sustain the cooperative agreement. To choose a specific PDP from the \(\beta \)-S-P Core we adopt an optimization based approach which aims to maximize the relative benefit from cooperation of the least winning player (see, e.g., [23]). Lastly, for 2-person extensive-form game we provide an algorithm to construct this specific PDP and consider how one can apply the \(\beta \)-S-P Core concept to the analysis of an extensive-form version of known fishery-management model with asymmetric players [2, 10, 18].

The rest of the paper is organised as follows. The specification of an extensive-form game with payoffs defined at all nodes is recalled in Sect. 2. In Sect. 3 we derive the \(\beta \)-S-P Core definition which is based on the payoff distribution procedure approach. The strategic and coalitional support of the \(\beta \)-S-P Core are examined in Sect. 4. In addition, in Sect. 4, we introduce a rule for the \(\beta \)-S-P Core refinement and provide an algorithm how to construct a specific PDP from the \(\beta \)-S-P Core. We apply the suggested mechanism of sustainable cooperation in Sect. 5 to examine a two-player asymmetric extensive-form fishery management model. Section 6 is a conclusion.

2 Extensive-Form Game with Current Payoffs at Each Node

We need to remind the basic notations and assumptions that are accepted in the theory of finite extensive-form games (see, e.g., [12, 15, 16, 27] for details). Let \(N=\{1,\ldots ,n\}\) be the players’ set; K denotes the game tree with the original node (root) \(x_0\) and the set of all nodes (positions) P; S(x) denotes the set of direct successors of intermediate node x.

Let \(P_i\) denote the finite set of decision nodes of player i (where this player “moves”), \(P_i\cap P_j=\emptyset \), for all \(i,j\in N\), \(i\ne j\), while \(P_{n+1}=\{z^j\}_{j=1}^m\) is the set of terminal nodes, \(S(z^j) = \emptyset \) \(\forall z^j\in P_{n+1}\). Denote by \(\omega =(x_0,\ldots ,x_{t-1},x_t,\ldots ,x_T)\) the trajectory in the game tree (also called the path or the history), \(x_{t-1}=S^{-1}(x_t), \ 1\le t\le T\), \(x_T=z^j\in P_{n+1}\); where subscript t in \(x_t\) indicates the number of this position in the trajectory \(\omega \). Let the payoff \(h_i(x)\) of the ith player, \(i \in N\) be defined at all nodes \(x\in P\). Lastly, we consider the case of non-negative current payoffs, that is, \(h_{i}(x) \geqslant 0 \ \forall i\in N, \ x\in P\) .

In the following, we use \(G^{P}(n)\) to denote the class of extensive-form n-player games with perfect information (see, e.g., [10, 12, 27] for details), where \(\varGamma ^{x_0}\in G^{P}(n)\) is a game with original node \(x_0\). The pure strategy \(u_i(\cdot )\) of the ith player is a function that determines for every position \(x\in P_i\) the subsequent position \(u_i(x)\in S(x)\) the player i is going to choose at x. Denote by \(U_i\) the finite set of all possible strategies of the player i, while \(U=\prod _{i\in N}U_i\) denote a corresponding set of strategy profiles. Each pure strategy profile \(u=(u_1,\ldots ,u_n)\in U\) determines a unique history \(\omega (u)=(x_0,\ldots ,x_{t},x_{t+1},\ldots ,x_T)= (x_0,x_1(u), \ldots , x_t(u), x_{t+1}(u),\ldots , x_T(u))\), where \(x_{t+1}=\) \(u_j(x_t)\in S(x_t)\) if \(x_t\in P_j\), \(0\le t \le T-1\), \(x_T\in P_{n+1}\), and, hence, a set of all the players’ payoffs. For any strategy profile u the value of the player i’s payoff (objective) function is determined as follows:

$$\begin{aligned} H_i(u)= \tilde{h}_i(\omega (u))= \sum _{\tau =0}^T h_i(x_{\tau }(u)). \end{aligned}$$

According to [10, 12, 27] each intermediate position \(x_t\in P\setminus P_{n+1}\) forms a subgame \(\varGamma ^{x_t}\) with the subgame tree \(K^{x_t}\) and the subgame root \(x_t\). Let \(P_i^{x_t}\), \(i\in N\), denote the restriction of \(P_i\) on \( K^{x_t}\), while \(u_i^{x_t}\), \(i\in N\), is the restriction of the ith player’s strategy \(u_i(\cdot )\) in \(\varGamma ^{x_0}\) on the set \(P_i^{x_t}\). The strategy profile \(u^{x_t}=(u_1^{x_t},\ldots ,u_n^{x_t})\) generates the history \(\omega ^{x_t}(u^{x_t})= (x_t,x_{t+1}, \ldots , x_T)=(x_t,x_{t+1}(u^{x_t}),\ldots , x_T(u^{x_t}))\) in the subgame and, hence, a set of the player’s payoffs in \(\varGamma ^{x_t}\):

$$\begin{aligned} H_i^{x_t}(u^{x_t})=\tilde{h}_i^{x_t}(\omega ^{x_t}(u^{x_t}))=\sum _{\tau =t}^T h_i(x_{\tau }(u^{x_t})). \end{aligned}$$
(1)

Note that (1) differs from the subgame payoff definition for extensive-form games with terminal payoffs (see [12, 27] for details).

Definition 1

[24] A strategy profile \(u = (u_1, u_2, \ldots , u_n)\) constitutes a Nash Equilibrium (NE) in \(\varGamma ^{x_0}\in G^{P} (n)\) if \(\forall i\in N\) and for any strategy \(v_i\in U_i\) the following inequality holds: \(H_i(v_i, u_{-i}) \leqslant H_i(u_i, u_{-i}).\)

Definition 2

[29] A strategy profile u forms a subgame perfect equilibrium (SPE) in \(\varGamma ^{x_0}\in G^{P} (n)\) if \(\forall x\in P \setminus P_{n+1}\) the restriction of u in the subgame \(\varGamma ^x\) still constitutes a Nash equilibrium in the subgame, that is \(u^x \in NE(\varGamma ^x)\).

Note that every extensive-form game \(\varGamma ^{x_0}\in G^{P} (n)\) with payoffs defined at all nodes possesses pure strategy SPE (see, e.g., [27]).

3 Payoff Distribution Procedure and the Subgame-Perfect Core Concept

Let \(\bar{\omega }= \bar{\omega }(\bar{u}) = (x_0 = \bar{x}_0, \ldots , \bar{x}_t, \ldots , \bar{x}_T)\) be a cooperative trajectory, i.e.

$$\begin{aligned} \max \limits _{u\in U} \sum \limits _{i\in N} H_i(u) = \sum \limits _{i\in N} H_i(\bar{u}) = \sum \limits _{i\in N} \sum \limits _{\tau =0} ^{T} h_i(\bar{x}_\tau ) = \sum \limits _{i\in N} \widetilde{h}_i(\bar{\omega }). \end{aligned}$$
(2)

For the sake of simplicity we focus on the case when either there exists a unique cooperative history in \(\varGamma ^{x_0} \in G^P(n)\) or the players employ a specific approach (for instance, the PRB algorithm introduced in [16]) to choose a unique cooperative history from all the trajectories \(\bar{\omega }\) meeting (2). Note that such an approach should satisfy subgame consistency (see, e.g. [10, 16, 26, 27, 33]) that is a fragment of the cooperative path in the subgame \(\varGamma ^{\bar{x}_t}, \bar{x}_t \in \bar{\omega }\) has to remain cooperative history in this subgame. A vector \((p_1^{\bar{x}_t},\ldots , p_n^{\bar{x}_t})\) such that

$$\begin{aligned} \sum _{i\in N} p_i^{\bar{x}_t} = \sum _{i\in N} \sum _{\tau = t}^T h_i (\bar{x}_\tau ) = \sum _{i\in N} \widetilde{h}_i (\bar{\omega }^{\bar{x}_t}), \end{aligned}$$
(3)

determines a possible sharing rule of the aggregated cooperative (subgame) payoff between the players and could be interpreted as a cooperative solution recognized in the subgame \(\varGamma ^{\bar{x}_t}, \bar{x}_t \in \bar{\omega }\).

Let \(\beta = \{\beta _i(\bar{x}_\tau )\}, i = 1, \ldots , n; \tau = 0, \ldots , T; \bar{x}_\tau \in \bar{\omega }\) be the Payoff Distribution Procedure (PDP) which is designed for some cooperative solution \((p_1, \ldots , p_n)=(p_1^{x_0}, \ldots , p_n^{x_0})\) (see, e.g., [10, 13, 26, 27, 33]). The PDP approach implies that all the participants have agreed to split the aggregated cooperative payoff in \(\varGamma ^{x_0}\) among the players according to the vector \((p_1,\ldots , p_n)\) and, moreover, to allocate every player’s cooperative payoff \(p_i\) along the cooperative history \(\bar{\omega }\) in accordance with some specific payment schedule which is called PDP. Then, \(\beta _i(\bar{x}_\tau )\) specifies the actual payment that the player i has to receive at position \(\bar{x}_\tau \in \bar{\omega }\) instead of \(h_i(\bar{x}_\tau )\) when the players use PDP \(\beta \). We aim to design such a PDP \(\beta \) that all the coalitions \(S\subset N\) will have an incentive to follow a sharing rule \((p_1^{\bar{x}_t},\ldots , p_n^{\bar{x}_t})\) in any subgame \(\varGamma ^{\bar{x}_t}, \bar{x}_t \in \bar{\omega }\) along the cooperative history. Denote by \(\widetilde{\beta }_i (\bar{\omega }^{\bar{x}_t})\) the total payment which the i-th player is expected to gain in accordance with PDP \(\beta \) in the subgame \(\varGamma ^{\bar{x}_t}\), i.e.

$$\begin{aligned} \widetilde{\beta }_i (\bar{\omega }^{\bar{x}_t}) = \widetilde{\beta }_i (\bar{x}_t, \bar{x}_{t+1},\ldots , \bar{x}_T) = \sum _{\tau = t}^T \beta _i (\bar{x}_\tau ). \end{aligned}$$

Consider such PDP \(\beta \) that satisfy the following conditions (see [13, 16, 26, 27] for details):

Definition 3

[16] The PDP \(\beta \) for the payoff vector \((p_1,\ldots , p_n)\) meeting (3) satisfies the subgame efficiency property if for all \(\bar{x}_t \in \bar{\omega }= (\bar{x}_0,\ldots , \bar{x}_T)\) and for all \(i\in N\)

$$\begin{aligned} \widetilde{\beta }_i (\bar{\omega }^{\bar{x}_t}) = p_i^{\bar{x}_t}. \end{aligned}$$
(4)

Condition (4) for \(t=0\) (called the efficiency in the original game [16, 26, 27]) means that PDP \(\beta \) could be reasonably considered as a time schedule for the ith player’s payoff \(p_i\) allocation.

Definition 4

[13, 26, 27] The PDP \(\beta =\{\beta _{i}(\bar{x}_\tau )\}\) meets the strict balance condition if for every position \(\bar{x}_\tau \in \bar{\omega }, \tau = 0,\ldots , T\)

$$\begin{aligned} \sum _{i\in N}\beta _{i}(\bar{x}_\tau )=\sum _{i\in N} h_i(\bar{x}_\tau ). \end{aligned}$$
(5)

If (5) holds a rule \(\beta \) could be implemented by the players without any loans or credits.

Remark 1

Note that if the PDP \(\beta \) meets the subgame efficiency condition (4) it necessarily satisfies the strict balance constraint (5).

Lastly, we suppose that PDP \(\beta \) meets non-negativity condition, i.e.

$$\begin{aligned} \beta _i(\bar{x}_\tau ) \geqslant 0\ \ \forall i\in N \ \ \forall t = 0,\ldots , T. \end{aligned}$$
(6)

The player i is called active in \(\varGamma ^{x}\) if he/she still has a decision position in \(\varGamma ^{x}\), i.e. \(P_i \cap K^x \ne \emptyset \) (see [6, 14, 15, 27]). Following [6], we suppose that coalition \(S \subset N\) is active at x if each player \(i\in S\) is active in \(\varGamma ^{x}\). For each coalition \(S\subset N\) that is active at position x denote by \(\varGamma ^{x, S}\) the induced game, which differs from the original game \(\varGamma ^{x}\) only in that coalition S becomes a new player while \(h_S(y) = \sum \limits _{i\in S} {h_i(y)}\), \(y\in P^{x}\). Lastly, we denote by \(\gamma (S; x)\), \(x\in P \setminus P_{n+1}\), \(S\subset N\) the highest possible SPE payoff of S in the induced subgame \(\varGamma ^{x, S}\).

Consider a coalition \(S\subset N\) which follows a cooperative agreement (that is, a PDP \(\beta \) is implemented at each position in the cooperative history from the origin \(\bar{x}_0\) till some intermediate position \(\bar{x}_t \in \bar{\omega }, 1 \leqslant t \leqslant T - 1\)), but decides to break down the cooperative agreement in the subgame \(\varGamma ^{\bar{x}_t}\) (S is active in \(\varGamma ^{\bar{x}_t}\)). Then, the highest payoff a coalition S could get in the original game \(\varGamma ^{x_0}\) is equal to \(\sum \limits _{\tau = 0}^{t-1} \beta _S (\bar{x}_\tau ) + \gamma (S; \bar{x}_t).\) If we assume that \(\sum \limits _{\tau = 0}^{t-1} \beta _S (\bar{x}_\tau ) + \gamma (S; \bar{x}_t) \leqslant \sum \limits _{\tau = 0}^{t-1} \beta _S (\bar{x}_\tau ) + \sum \limits _{\tau = t}^{T} \beta _S (\bar{x}_\tau ),\) then there is no reason for any active coalition S to deviate from the cooperative agreement at \(\bar{x}_t\) . Hence, we obtain a rather simple conditions that ensure subgame-consistency of the cooperative scenario (S is active coalition at \(\bar{x}_t\)) :

$$\begin{aligned} \gamma (S; \bar{x}_t) \leqslant \sum \limits _{\tau = t}^{T} \beta _S (\bar{x}_\tau ), \ \ \bar{x}_t \in \bar{\omega }, \ \ t = 0,\ldots , T - 1. \end{aligned}$$
(7)

The concept of \(\beta \)-S-P Core was firstly introduced in [17], yet without providing the detailed analysis of its basic properties. The most important properties of the \(\beta \)-S-P Core (see Proposition 1 and 2 below) are proved in this paper.

Definition 5

[17] The set of all payoff distribution procedures \(\beta \) which satisfy (4), (3), (5), (6), and (7) is called the \(\beta \)-subgame-perfect core (\(\beta \) -S-P Core) for a game \(\varGamma ^{x_0} \in G^P(n)\) in extensive form with payoffs defined at all nodes.

The concept of subgame-perfect core introduced in [6] for games in extensive form with terminal payoffs implies that the players redistribute the highest total payoffs at corresponding terminal nodes to support the cooperative scenario. Note that the \(\beta \)-S-P Core in accordance with Def. 5 is an extension of the subgame-perfect core concept [6] to more general class of extensive games which allows the payoffs transfers (via appropriate PDP) in all nodes along the cooperative history and, hence, provides a powerful tool to sustain the cooperative agreement. Since the properties of the payoff distribution procedure \(\beta \) are incorporated in Def. 5 one may refer to the S-P Core for the class of extensive-form games \(G^P(n)\) with payoffs defined at all nodes as the \(\beta \)-S-P Core or the S-P Core that is based on the payoff distribution procedure. Note that the subgame-perfect cooperative agreements suggested in [4] for specific discrete-time dynamic game of climate change also imply payoff transfers between countries in each period.

4 Properties and Implementation of the \(\beta \)-S-P Core

If a game \(\varGamma ^{x_0} \in G^P(n)\) possesses non-empty \(\beta \)-S-P Core then any particular PDP \(\beta \) from the core generates a related extensive-form game \(\varGamma _{\beta }^{x_0}\) that differs from the original game \(\varGamma ^{x_0}\) only in that the payoffs \(h_i(\bar{x}_t), \ \ i\in N\) at every position \(\bar{x}_t \in \bar{\omega }\) along the cooperative history are replaced by \(\beta _i(\bar{x}_t)\). Note that similar approach was used earlier, in particular, in [27] to define a “regularized game” for differential or extensive-form game with payoffs at all nodes and also in [6] to introduce a “strategic transform” of an extensive-form game with only terminal payoffs.

The following proposition is an extension of the Proposition 1 from [6] to more general class of extensive games \(G^P(n)\). The difference is that now we need to adopt distinct subgame payoff definition (1) and use more general concept of the related non-cooperative game \(\varGamma _{\beta }^{x_0}\). This proposition was firstly provided (yet without proof) in [17].

Proposition 1

Let the \(\beta \)-S-P Core of a game \(\varGamma ^{x_0} \in G^P(n)\) in extensive form with payoofs defined at all nodes be non-empty, while \(\beta = \beta _i(\bar{x}_t), \ \ i\in N, \ \ \bar{x}_t \in \bar{\omega }\), is a PDP from the \(\beta \)-S-P Core. Then there exists a subgame-perfect equilibrium \(\underline{u} = (\underline{u}_1,\ldots , \underline{u}_n)\) in a related extensive-form game \(\varGamma _{\beta }^{x_0}\) that generates the same (cooperative) trajectory \(\bar{\omega }= (\bar{x}_0,\ldots , \bar{x}_T)\) with a resulting SPE players’ payoffs vector \(H_i(\underline{u}) = \sum \limits _{t = 0}^{T} \beta _i(\bar{x}_t) = \widetilde{\beta }_i(\bar{\omega }), \ \ i=1,...,n\).

Proof

Let \(\bar{\omega }= \bar{\omega }(\bar{u}) = (\bar{x}_0, \ldots , \bar{x}_t, \bar{x}_{t+1},\ldots , \bar{x}_T)\) denote a unique cooperative history in \(\varGamma ^{x_0}\) meeting (2) while \(\bar{u}_i(\bar{x}_t) = \bar{x}_{t+1}\) denote a ”correct choice” of the i-th player in her / his decision node \(\bar{x}_t \in P_i\) according to the cooperative scenario \(\bar{u} = (\bar{u}_1,\ldots ,\bar{u}_n)\).

Given payoff distribution procedure \(\beta = \{\beta _i (\bar{x}_\tau )\}, \ \ i\in N, \ \ \tau = 0,\ldots , T\) meeting (3)–(6) and (7) we employ a backwards induction procedure (see, for instance, [11, 27, 29]) to construct a SPE \(\underline{u} = (\underline{u}_1,\ldots , \underline{u}_n)\) in a related non-cooperative game \(\varGamma _\beta ^{x_0}\). Note that for any intermediate node \(x \in P \setminus P^{n+1}\) apart from the cooperative history (i.e., \(x \notin \bar{\omega }\)) the subgame \(\varGamma _\beta ^x\) and \(\varGamma ^x\) coincide since a replacing of the payoffs in \(\bar{x}_t, \bar{x}_t \in \bar{\omega }\) does not affect the payoffs in the subgame starting at x. Hence, we focus on the SPE strategies \(\underline{u}_i (\cdot )\) construction only in nodes \(\bar{x}_t\) along the cooperative history \(\bar{\omega }\).

Step 1. Consider subgames \(\varGamma _\beta ^{\bar{x}_{T-1}}\) and \(\varGamma ^{\bar{x}_{T-1}}\) of the length 1. Let \(\bar{x}_{T-1} \in P_i\). Then \(S = \{i\}\) is a coalition which is active at \(\bar{x}_{T-1}\), and (7) takes the form:

$$\begin{aligned} \beta _i(\bar{x}_{T-1}) + \beta _i(\bar{x}_{T}) \geqslant \gamma (\{i\}, \ \ \bar{x}_{T-1}). \end{aligned}$$
(8)

Since the players’ payoffs in the related subgame \(\varGamma _\beta ^{\bar{x}_{T-1}}\) at all but one terminal node \(\bar{x}_{T}\) are the same as in \(\varGamma ^{\bar{x}_{T-1}}\), and \(\gamma (\{i\}, \ \ \bar{x}_{T-1})\) denotes to the highest i-th player’s SPE payoff in \(\varGamma ^{\bar{x}_{T-1}}\), inequality (8) implies that the choice \(\underline{u}_i (\bar{x}_{T-1}) = \bar{x}_T\) corresponds to the SPE behavior of the i-th player in the related subgame \(\varGamma _\beta ^{\bar{x}_{T-1}}\).

Step t (t = 2, ..., T). Consider subgames \(\varGamma _\beta ^{\bar{x}_{T-t}}\) and \(\varGamma ^{\bar{x}_{T-t}}\) of the length t assuming that the SPE \(\underline{u} = (\underline{u}_1, \ldots , \underline{u}_n)\) has been already constructed in all the related subgames \(\varGamma _\beta ^{\bar{x}_{T-\tau }}, \ \ 1 \leqslant \tau < t\). Let \(\bar{x}_{T-t} \in P_j\). Then, applying (7) for \(S = \{j\}\) we get:

$$\begin{aligned} \beta _j(\bar{x}_{T-t}) + \beta _j(\bar{x}_{T-t+1}) + \ldots + \beta _j(\bar{x}_{T}) \geqslant \omega (\{j\}, \ \ \bar{x}_{T-t}). \end{aligned}$$
(9)

Since the players’ payoffs in the related subgame \(\varGamma _\beta ^{\bar{x}_{T-t}}\) along all but one history \(\bar{\omega }(\bar{x}_{T-t}) = (\bar{x}_{T-t}, \bar{x}_{T-t+1}, \ldots , \bar{x}_{T})\) are the same as in \(\varGamma ^{\bar{x}_{T-t}}\), and \(\gamma (\{j\}, \ \ \bar{x}_{T-t})\) is the highest SPE payoff the j-th player can achieve in \(\varGamma ^{\bar{x}_{T-t}}\), inequality (9) ensures that the choice \(\underline{u}_j (\bar{x}_{T-t}) = \bar{x}_{T-t+1}\) corresponds to the SPE behavior of player j at position \(\bar{x}_{T-t}\) of the related subgame \(\varGamma _\beta ^{\bar{x}_{T-t}}\).

Hence, we designed (via backwards induction procedure) a strategy profile \(\underline{u} = (\underline{u}_1, \ldots , \underline{u}_n)\) which constitutes a SPE in the related game \(\varGamma _\beta ^{\bar{x}_0}\). Note that for all \(j \in N\) by the construction \(\underline{u}_j (\bar{x}_t) = \bar{u}_j (\bar{x}_t)\) at each node \(\bar{x}_t \in \bar{\omega }\cap P_j\). Therefore, subgame perfect equilibria \((\underline{u}_1, \ldots , \underline{u}_n)\) for \(\varGamma _\beta ^{\bar{x}_0}\) generates cooperative trajectory \(\bar{\omega }\) and the players’ payoffs \(\widetilde{\beta }_i(\bar{\omega }) = \sum \limits _{\tau = 0}^T \beta _i(\bar{x}_\tau ), \ i \in N\).

An extensive-form game \(\varGamma \in G^P(n)\) can be converted into a coalitional (cooperative) game by defining a worth (or a characteristic) function for each coalition \(S\subset N\). We adopt in the paper the concept of \(\gamma \)-characteristic function introduced in [5] which implies that the guaranteed payoff of coalition S in \(\varGamma ^{x_0}\) is the Nash equilibria payoff which this coalition is expected to get in the induced game \(\varGamma ^{x_0, S}\).

Suppose that the players consider some cooperative solution \((p_1^{x_0},\ldots , p_n^{x_0})\) meeting (3) for \(t=0\), that is a distribution of the total cooperative payoff

$$\begin{aligned} v(N)=v^{x_0}(N)=\sum _{i\in N} \widetilde{h}_i (\bar{\omega }) \end{aligned}$$
(10)

in \(\varGamma ^{x_0} \in G^P(n)\) while \(\beta \) be any payoff distribution procedure for vector \((p_1^{x_0}, \ldots , p_n^{x_0})\).

For any coalition \(S\subset N, S \ne N \), one can calculate the value

$$\begin{aligned} v(S)=v^{x_0}(S) = \max _{\bar{x}_t\in \bar{\omega }\setminus \{\bar{x}_T\}} ( \sum _{\tau = 0}^{t-1} \beta _S(\bar{x}_\tau ) + \gamma (S; \bar{x}_t)), \end{aligned}$$
(11)

while the maximum has to be taken only over those decision positions \(\bar{x}_t\) in the cooperative history \(\bar{\omega }\) at which S is active. Note that (11) determines the highest guaranteed payoff a coalition S could receive in the original game \(\varGamma ^{x_0}\) given all possible behavior schemes with one switching from cooperative scenario to non-cooperative one (see also the paragraph above formula (7)).

Note that for any \((p_1^{\bar{x}_0},\ldots , p_n^{\bar{x}_0})\) meeting (3) and any PDP \(\beta \) function (10), (11) is well-defined for every coalition \(S\subset N\) since every coalition is active at \(x_0\). Hereafter we will consider a coalitional game (vN) with characteristic function (10), (11).

Definition 6

(see., e.g., [10, 27, 30]). The core of the coalitional game (vN) is a payoff vector \((p_1^{x_0}, \ldots , p_n^{x_0})\) meeting (3) and the inequalities:

$$\begin{aligned} \sum _{i\in S} p_i^{x_0} \geqslant v(S), \ \ S\subset N. \end{aligned}$$
(12)

The following proposition is an extension of the Proposition 2 from [6] to more broad class of extensive-form games \(G^P(n)\) with payoffs at all nodes. The difference is that now we use characteristic function of special type that takes into account the dynamics of payments in an extensive game \(\varGamma ^{x_0} \in G^P(n)\). In addition, some properties of the PDP \(\beta \) (like subgame efficiency condition) are now involved in Prop. 2.

Proposition 2

Let payoff distribution procedure \(\beta \) belong to the \(\beta \)-S-P Core of a game \(\varGamma ^{\bar{x}_0} \in G^P(n)\) in extensive form with payoffs defined at all nodes. Then the corresponding vector \((p_1^{x_0}, \ldots , p_n^{x_0})\) belongs to the core of the cooperative game (vN) with characteristic function (10), (11).

Conversely, let PDP \(\beta \) satisfy subgame efficiency condition (4) and non-negativity constraint (6) while the corresponding cooperative solution \((p_1^{x_0}, \ldots , p_n^{x_0})\) is in the core of the cooperative game (vN) with characteristic function (10), (11). Then PDP \(\beta \) belongs to the \(\beta \)-S-P Core of an extensive-form game \(\varGamma ^{\bar{x}_0}\).

Proof

If PDP \(\beta \) belongs to the \(\beta \)-S-P Core then according to (7) for any coalition \(S \subset N\) and every node \(\bar{x}_t\in \bar{\omega }\setminus \{\bar{x}_T\}\) such that S is active at \(\bar{x}_t\) the following inequality holds

$$\begin{aligned} \sum \limits _{\tau = t}^{T} \beta _S (\bar{x}_\tau ) \geqslant \gamma (S; \bar{x}_t). \end{aligned}$$

If we add \(\sum \limits _{\tau = 0}^{t-1} \beta _S (\bar{x}_\tau )\) to both sides and take maximum of the right hand side over all nodes \(\bar{x}_t\) in the cooperative history \(\bar{\omega }\) at which coalition S is active, we get in accordance with (11)

$$\begin{aligned} \sum _{i\in S} \widetilde{\beta }_i (\bar{\omega }) \geqslant v(S). \end{aligned}$$

Since \(\widetilde{\beta }_i (\bar{\omega })=p_i^{x_0}\) due to (4) for \(t=0\) the vector \((p_1^{x_0}, \ldots , p_n^{x_0})\) satisfies (12) and hence belongs to the core of (vN).

Conversely, if \((p_1^{x_0}, \ldots , p_n^{x_0})\) belongs to the core of the cooperative game (vN) then it follows from (12), (11) and (4) for \(t=0\) that for any coalition \(S\subset N\) and every node \(\bar{x}_t\in \bar{\omega }\setminus \{\bar{x}_T\}\) such that S is active at \(\bar{x}_t\) the following inequality holds:

$$\begin{aligned} \sum _{i\in S} \sum _{\tau = 0}^{T} \beta _i(\bar{x}_\tau ) \geqslant \sum _{\tau = 0}^{t-1} \beta _S(\bar{x}_\tau ) + \gamma (S; \bar{x}_t). \end{aligned}$$
(13)

If we eliminate coinciding addends in both parts of (13) we get

$$\begin{aligned} \sum _{\tau = t}^{T} \beta _S(\bar{x}_\tau ) \geqslant \gamma (S; \bar{x}_t), \end{aligned}$$

hence, PDP \(\beta \) satisfies (7).

The strict balance constraint (5) holds due to Remark 1. Since all the conditions (4), (3), (5), (6), and (7) are valid the PDP \(\beta \) belongs to the \(\beta \)-S-P Core of \(\varGamma ^{x_0}\).

Remark 2

Proposition 2 provides a coalitional support to the \(\beta \)-S-P Core, namely any PDP \(\beta \) in the \(\beta \)-subgame-perfect core corresponds to the payoff distribution \((p_1^{x_0}, \ldots , p_n^{x_0})\) from the core of certain coalitional game (vN).

Remark 3

Since the components \(\beta _i (\bar{x}_t), \ \ i \in N, \ \ t = 0, \ldots , T,\) of the PDP \(\beta \) from the \(\beta \)-S-P Core have to satisfy a finite number of non-strict linear inequalities (7), (6) and linear equations (3)–(5) a nonempty \(\beta \)-S-P Core is a convex polytope B in \(R^{n \times (T+1)}\).

Obviously, the main concern of the jth player when choosing a specific PDP \(\beta \) from \(\beta \)-S-P Core is maximization of her / his cooperative payoff \(p_j = \widetilde{\beta }_j (\bar{\omega })\). We adopt in the paper an optimization-based approach for the refinement of the \(\beta \)-S-P Core, i.e. a specific rule for choosing \(\widetilde{\beta }_j (\bar{\omega }), \ \ j \in N\). This rule takes care of the relative benefit from cooperation (RBC) of the least winning player. Denote by \(\varDelta _j\) the range of the j-th player’s payoffs in \(\varGamma ^{x_0}\). The value \(\frac{ \widetilde{\beta }_j - \gamma (\{j\}; x_0)}{\varDelta _j}\) could be considered as the relative benefit of player j that is achieved due to cooperation. Note that the relative benefit of cooperation could be also interpreted as the ”value of cooperation” (see, e.g. [9] for details). Hence, the above approach which we will refer later as the maxmin RBC rule implies the solution of the following problem

$$\begin{aligned} \max _{\beta \in B} \min _{j \in N} \frac{ \widetilde{\beta }_j - \gamma (\{j\}; x_0)}{\varDelta _j}. \end{aligned}$$
(14)

Remark 4

In a game with only 2 players optimization problem (14) takes the form

$$\begin{aligned} \frac{ \widetilde{\beta }_1 - \gamma (\{1\}; x_0)}{\varDelta _1} = \frac{ \widetilde{\beta }_2 - \gamma (\{2\}; x_0)}{\varDelta _2}. \end{aligned}$$
(15)

In a 2-player game \(\varGamma ^{x_0}\in G^P(2)\) to determine unique distribution \(\{\beta _i (\bar{x}_t)\}\), \(\bar{x}_t\in \omega \), of the total i-th player payoff \(\widetilde{\beta }_i\) (15) meeting (5), (6) and (7) one can use the following algorithm which aims to minimize the payoff transfers at each node \(\bar{x}_t\).

Algorithm:

  1. 1.

    Find a cooperative trajectory \(\bar{\omega }=(\bar{x}_0, \ldots , \bar{x}_t,\ldots , \bar{x}_T)\). If there are at least two paths meeting (2) employ PRB algorithm (see [13]) to choose a unique cooperative history.

  2. 2.

    Check whether the system of non-strict linear inequalities (6), (7) and linear equations (3)–(5) is compatible (one could use, for instance, any software package for solving Linear Programming problems). If no, then subgame-perfect cooperative agreement in \(\varGamma ^{x_0}\) does not exist, i.e. the \(\beta \)-S-P Core is empty. If \(\beta \)-S-P Core is non-empty solve (15) to calculate \(\widetilde{\beta }_1=\sum \limits _{\tau =0}^{T}\beta _1(\bar{x}_\tau )\) and \(\widetilde{\beta }_2\).

  3. 3.

    Using (5), (6) and (7) obtain the system of double inequalities for \(\widetilde{\beta }_1(\bar{\omega }^{\bar{x}_1})=\sum \limits _{\tau =1}^{T}\beta _1(\bar{x}_\tau )\), \(\widetilde{\beta }_1(\bar{\omega }^{\bar{x}_2})=\sum \limits _{\tau =2}^{T}\beta _1(\bar{x}_\tau ), \ldots , \ \widetilde{\beta }_1(\bar{\omega }^{\bar{x}_{T-1}})=\beta _1(\bar{x}_{T-1})+\beta _1(\bar{x}_{T})\) in the form:

    $$\begin{aligned} \left\{ \begin{array}{l} c_1^1\leqslant \widetilde{\beta }_1(\bar{\omega }^{\bar{x}_1})\leqslant C_1^1 \\ c_1^2\leqslant \widetilde{\beta }_1(\bar{\omega }^{\bar{x}_2})\leqslant C_1^2\\ \vdots \\ c_1^{T-1}\leqslant \beta _1(\bar{x}_{T-1})+\beta _1(\bar{x}_{T})\leqslant C_1^{T-1} \end{array} \right. , \end{aligned}$$
    (16)

    where \(0\leqslant c_1^{t}\leqslant C_1^{t}\), \(t=1, \ 2, \ldots , T-1\).

  4. 4.

    Accept \(\beta _1(\bar{x}_{T})=h_1(\bar{x}_{T})\), hence, no transfers are expected at terminal node \(\bar{x}_{T}\).

  5. 5.

    Solve (16) backwards, implying minimal payoff transfer \(|\beta _1(\bar{x}_{t})-h_1(\bar{x}_{t})|\) at each node \(\bar{x}_{t}\), \(t=T-1, T-2, \ldots , 1\), that is sufficient to satisfy (16).

  6. 6.

    Calculate \(\beta _1(\bar{x}_{0})=\widetilde{\beta }_1-\sum \limits _{\tau =1}^{T}\beta _1(\bar{x}_{\tau })\).

  7. 7.

    Calculate \(\beta _2(\bar{x}_{t})\), \(t=0, \ldots , T\), using strict balance condition (5).

5 \(\beta \)-S-P Core for Fishery-Management Extensive-Form Model

To provide an application of the \(\beta \)-S-P Core in the resource exploitation models in extensive form we explore a finite version of the original fishery-management model [18] that has been studied in [10]. Note that this game-theoretical fishery-management model belongs to a broad class of renewable resource extraction models (see, e.g., [1, 2, 18, 19, 21]).

The main sources of the players’ asymmetry in the renewable resource extraction models as well as in dynamic environmental models are discussed in [2, 3, 22]. The players may have different costs, different discount rates, they may value the residual stock differently, e.t.c. We focus in the paper on the case when the players have the same discount factor \(\delta \), and the only source for asymmetry is that the players value differently the resource residual stock (after the fishery process ends). Namely, we assume that \(K_1 > K_2\) in (18). Note that such an assumption is discussed in [3, 10, 17]. However, an algorithm for sustainable cooperation provided in the paper could be employed for extensive-form fishery-management model that takes into account other sources of the players’ asymmetry.

It is worth noting that the symmetric case of extensive-form fishery-management model was studied in [17], and we will briefly compare the results obtained.

Example. (An extensive-form fishery-management with asymmetric players).

Denote by y(t) a fish biomass (state variable) in year t, \(t=0,1,\ldots , T\), which evolves in accordance with the difference equation \(y(t+1)=a\cdot y(t)\), where \(a>1\) is the annual net growth rate. Suppose that only two players exploit the fishery and let \(u_j(t)\ge 0\) denote the catch amount of player j in year t (strategy or control variable). Given the initial state condition \(y(0)=y^0\) the system dynamics is described by the equation

$$\begin{aligned} y(t+1)=a\cdot \left( y(t)-(u_1(t)+u_2(t))\right) , \end{aligned}$$
(17)

while \(0\le u_1(t)+u_2(t)\le y(t)\). Player j aims to maximize the objective (payoff) function of the form

$$\begin{aligned} H_j(\cdot )=\sum \limits _{t=0}^{T-1} \delta _j^t\sqrt{u_j(t)}+K_j\cdot \delta _j^T\sqrt{y(T)},\ j=1, 2, \end{aligned}$$
(18)

where \(\delta _j\in [0,1)\) is a discount factor and \(K_j>0\) reflects how the jth player evaluates the worth of the fish biomass remainder (fishery’s scrap or bequest) after the fishery process ends.

To specify the fishery-management model (17)–(18) in an extensive-form framework we need to make some additional assumptions. Firstly, for the sake of simplicity assume that each player can fish out at only two levels: High (\(u_j^{H}=H_j\)) and Low (\(u_j^{L}=L_j\)) and consider two-period model, i.e., \(t=0, 1, 2\). Secondarily, to deal with a game of perfect information assume that the player 1 moves first at each year while the player 2 moves second, i.e., player 2 is aware of the first player’s decision when choosing own harvest level.

Fig. 1.
figure 1

Fishery-management model in extensive form

The resulting fishery-management model for given values of parameters: \(y(0)=10\); \(a=1.25\); \(u_j^H=H_j=3,\ u_j^L=L_j=1\); \(\delta _1=\delta _2=\delta =0.9\); \(K_1=1,\ K_2=0.5\), is presented in Fig. 1. Let \(P_1=\{x_0, x_3, x_4, x_5, x_6\}\), \(P_2=\{x_1, x_2, x_7 - x_{14}\}\), \(P_{n+1}=\{z_1, \ldots , z_{16}\}\), the right move at each decision position \(x_k\in P_j\) corresponds to \(u_j(x_k)=H_j\) (with only one exception for \(x_{14}\)) while the left alternative corresponds to \(L_j\). According to (18) the current payoffs \(\sqrt{u_j(0)}\) in \(x_1-x_6\) correspond to the first year of fishery process (hence, \(\delta ^0=1\)) whereas the current payoffs \(\sqrt{u_j(1)}\) in \(x_7-x_{14}\) and \(z_1-z_{16}\) correspond to the second year and have to be multiplied by \(\delta \). Note that if \(u_1(0)=u_2(0)=H\) then \(y(1)=5\), and if the first player again chooses \(u_1(1)=H_1\) at \(x_6\), then the player 2 can not choose \(u_2(1)=H_2\) at \(x_{14}\) (the case of over-fishing) due to constraint \(u_1(1)+u_2(1)\le y(1)\). We take it into account assuming that \(h_2(z_{16})=0\). Additionally, we admit some positive payoffs \(h_j(x_0),\ j=1, 2\), at the root \(x_0\) which could be interpreted as the players’ initial assets (for instance, we let \(h_j(x_0)=1,\ j=1, 2\), in Fig. 1).

The above fishery-management model in extensive form admits a unique SPE (the players’ equilibrium strategies for each decision positions are determined via backwards induction procedure and marked as dotted lines in Fig. 1). This SPE generates history \(\omega ^{SPE}=(x_0, x_2, x_5, x_{12}, z_{12})\) with the payoffs (5.4; 4.12).

The cooperative history \(\bar{\omega }= (x_0, x_1, x_3, x_8, z_{4})\), that is marked in bold in Fig. 1, implies maximal summary payoff \(\widetilde{h}_1(\bar{\omega })+\widetilde{h}_2(\bar{\omega })=5.37+4.47=9.84\).

Comparing these results with symmetric case of similar model studied in [17] one can notice that the SPE history is preserved, while the cooperative history has been changed. Namely, in asymmetric model the cooperative behavior implies more fishing efforts of both players at the second time period and consequently lower fish biomass remainder. The \(\beta \)-S-P Core is turned out to be non-empty for both cases. Obviously, these results rely heavily on the chosen parameters values.

Note that system (3)–(6) and (7) for the extensive-form game in Fig. 1 is compatible, and hence, the \(\beta \)-S-P Core is non-empty. To determine unique payment vector \((\widetilde{\beta }_1(\bar{\omega }), \widetilde{\beta }_2(\bar{\omega }))\) from the \(\beta \)-S-P Core one can employ the maxmin RBC rule. If we denote \(\widetilde{\beta }_1(\bar{\omega })=5.37+\varepsilon \), \(\widetilde{\beta }_2(\bar{\omega })=4.47-\varepsilon \), Eq. (15) takes the form

$$\frac{(5.37+\varepsilon )-5.4}{5.98-4.54}=\frac{(4.47-\varepsilon )-4.12}{5.14-3.37} \iff \varepsilon =0.17.$$

Hence, \(\widetilde{\beta }_1(\bar{\omega })=5.54\), \(\widetilde{\beta }_2(\bar{\omega })=4.3\). We apply the above algorithm to calculate all the components \(\beta _i(x_t)\) of this payoff distribution procedure \(\beta \).

\(\bar{\omega }\)

\(x_0\)

\(x_1\)

\(x_3\)

\(x_8\)

\(z_4\)

\(\widetilde{\beta }_j(\bar{\omega })\)

\(\beta _1(x_t)\)

1.55

0.62

0

1.56

1.81

5.54

\(\beta _2(x_t)\)

0.45

0.38

1

0

2.47

4.3

\(\beta _1(x_t)-h_1(x_t)\)

0.55

−0.38

0

0

0

0.17

The (minimal in absolute value) payoff transfers (from player 2 to player 1) at every position in the cooperative history are presented in the lowest row. Note that the cooperative history for the subgame \(\varGamma ^{x_3}\) coincides with the SPE path, hence, no transfers in nodes \(x_3\), \(x_8\) and \(z_4\) are needed to sustain cooperative agreement.

However, inequalities (7) for \(S=\{1\}\) and \(S=\{2\}\) in \(x_1\in \bar{\omega }\) take the form: \(\gamma (\{1\}; x_1) = 3.67 \leqslant \beta _1(x_1) + 3.37;\) \(\gamma (\{2\}; x_1) = 3.85 \leqslant \beta _2(x_1) + 3.47.\) Obviously, \(\beta _1(x_1) \geqslant 0.3\), \(\beta _2(x_1) \geqslant 0.38\), and the players have to redistribute current payoffs at \(x_1\) (a transfer at least 0.38 from player 1 to player 2 is needed to create an incentive to cooperate for player 2 at \(\varGamma ^{x_1}\)). Similar remark is valid for the origin \(x_0\).

Note that if we treat this extensive-form game as the game where the payoffs are defined and could be redistributed between the players in only terminal nodes the S-P Core [6] of such a game is empty.

6 Concluding Remarks

Since the \(\beta \)-S-P Core definition for a game \(\varGamma ^{x_0} \in G^P(n)\) with payoffs defined at all nodes allows the payoffs transfers at each node of the cooperative history, it provides a powerful tool to sustain cooperative scenario. We employ the maxmin RBC rule for selecting unique payment vector \((\widetilde{\beta }_1(\bar{\omega }), \widetilde{\beta }_2(\bar{\omega }))\) from the \(\beta \)-S-P Core in Example. One may consider a natural refinement of the maxmin RBC rule which implies that the players should initially implement the sequentional elimination of strictly dominated strategies procedure (see, e.g., [10, 23, 27]) and then estimate the absolute ranges \(\varDelta _j\) of their payoffs. If we apply this refinement to the fishery-management model in Example we obtain another PDP from the \(\beta \)-S-P Core, namely \(\widetilde{\beta }_1(\bar{\omega })=5.516; \widetilde{\beta }_2(\bar{\omega })=4.324\). It is surely of interest to consider other approaches for the \(\beta \)-S-P Core refinement as well as to study additional properties of this cooperative solution (for instance, time consistency [10, 26, 27] and irrational-behavior-proof condition [32, 33]).

Hopefully, similar approach could be applied to other discrete-time dynamic games of bio-resource management (see, e.g., [1, 21]) and other dynamic models of climate change (see, e.g., [20]). An open question is whether the \(\beta \)-S-P Core concept can be adapted to the analysis of the dynamic interaction between different political and religious movements (see [31] for details).