Abstract
Multi-person stopping games with players' priorities are considered. Players sequentially observe offers Y_1, Y_2, … at jump times T_1, T_2, … of a Poisson process, where Y_1, Y_2, … are independent identically distributed random variables. Each accepted offer Y_n yields a reward G_n = Y_n r(T_n), where r is a non-increasing discount function. If more than one player wants to accept an offer, then the player with the highest priority (the lowest ordering) gets the reward. We construct a Nash equilibrium in the multi-person stopping game using the solution of a multiple optimal stopping time problem with the reward structure {G_n}. We compare the rewards and stopping times of the players in the Nash equilibrium of the game with the optimal rewards and optimal stopping times in the multiple stopping time problem. It is also proved that the presented Nash equilibrium is a Pareto optimum of the game. The game is a generalization of the Elfving stopping time problem to multi-person stopping games with priorities.
Suppose that a company is going to open m new departments that will be ordered (ranked) according to their importance. Rank 1 denotes the most important department, and consequently rank m refers to the least important one. So, the firm offers m secretary positions. The applications will be considered up to a fixed deadline U. There is a selection committee that interviews candidates arriving one by one at jump times of a Poisson process. We assume that candidates’ “skills” form a sequence of i.i.d. random variables with known probability distribution. The selection process is organized similarly to the classical secretary problem but taking into account the departments’ rankings. So, a candidate who is being interviewed may be accepted for department 1; if she is rejected, she may be accepted for department 2, and so on until the first acceptance. Candidates rejected for department i cannot be considered in the future. The aim is to select candidates with maximal expected “skills”. So, one may say that each department acts as an independent player in a stopping game with priorities.
We will formulate the problem as an m-person stopping game with priorities in which random offers are presented at jump times of a homogeneous Poisson process. Such a game has been considered in Ferenstein and Krasnosielska [8]. In this paper, we propose a new solution and prove that the proposed strategy is a Nash equilibrium, which allows us to remove an assumption made in Ferenstein and Krasnosielska [8]. The difference between the solution proposed in this paper and the one in [8] will be discussed more thoroughly at the end of the paper.
The game considered is a generalization, to the case of several players, of the optimal stopping time problem formulated and solved first by Elfving [6], later considered also by Siegmund [18]. Various modifications of the structure of the reward in the Elfving problem were considered in Albright [1], David and Yechiali [3], Krasnosielska [12, 13], Gershkov and Moldovanu [10], Parlar et al. [16].
Stadje [19] considered an optimal multi-stopping time problem in the Elfving setting, in which the final reward is the sum of selected discounted random variables. Various stopping games with rewards observed at jump times of a Poisson process were considered in Dixon [4], Enns and Ferenstein [7], Saario and Sakaguchi [17], Ferenstein and Krasnosielska [9]. Stopping games were introduced in a seminal paper by Dynkin [5] as an application of optimal stopping time problems, and have since often been referred to as Dynkin games. An extensive bibliography on stochastic games can be found in Nowak and Szajowski [15].
1 Multiple Stopping Time Problem
Let us recall the multi-stopping time problem presented in Stadje [19]. Let Y_1, Y_2, … be nonnegative independent identically distributed random variables with continuous distribution function F and E(Y_1)∈(0,∞); set Y_0=0. The random variables Y_1, Y_2, … are sequentially observed at jump times 0<T_1<T_2<⋯ of a Poisson process N(s), s≥0, with intensity function p(u); set T_0=0. Moreover, assume that the sequences \(\{Y_{n}\}_{n=1}^{\infty}\) and \(\{T_{n}\}_{n=1}^{\infty}\) are independent. Let r:[0,∞)→[0,1] be a right-continuous, non-increasing function satisfying the conditions
r(0)=1, r(s)>0 for s∈[0,U), and r(s)=0 for s≥U, where U∈(0,∞). Assume that the set of points of discontinuity of r is finite. Let G_n = Y_n r(T_n), n∈ℕ_0, where ℕ_0=ℕ∪{0}, and G_∞=0.
Note that without loss of generality we can assume that p(u)≡1 because a non-homogeneous Poisson process can be reduced to a homogeneous Poisson process with intensity 1 (see [2, pp. 113–114]).
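This reduction is the standard time change: if Λ(u)=∫_0^u p(s) ds denotes the cumulative intensity, then Λ(T_1), Λ(T_2), … are the jump times of a Poisson process with intensity 1. A minimal sketch (the function names are illustrative, not from the paper):

```python
def to_unit_rate(jump_times, cumulative_intensity):
    # Map jump times of a Poisson process with cumulative intensity
    # Lambda(u) = int_0^u p(s) ds to jump times of a unit-rate process.
    return [cumulative_intensity(t) for t in jump_times]

# Example: constant intensity p(u) = 2, so Lambda(u) = 2u.
print(to_unit_rate([0.3, 0.9, 1.4], lambda u: 2.0 * u))  # [0.6, 1.8, 2.8]
```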
Let \(\mathcal{F}_{n}=\sigma(Y_{1},\ldots,Y_{n},T_{1},\ldots,{T_{n}})\), n∈ℕ, \(\mathcal{F}_{0}=\{\emptyset,\varOmega\}\), \(\mathcal{F}_{\infty}=\sigma (\bigcup_{n\in\mathbb{N}_{0}}\mathcal{F}_{n})\). Let \({\mathcal{M}}\) be the set of stopping times with respect to \(\{\mathcal{F}_{n}\}_{n=0}^{\infty}\), and \(\mathcal{M}_{k}(m)=\{(\tau^{1},\ldots,\tau^{m}):\tau^{1},\ldots ,\tau^{m}\in\mathcal{M}, k\leq\tau^{1}<\cdots<\tau^{m}\}\), k∈ℕ.
Note that the values of the stopping times τ^i are natural numbers, and τ^i = n means selecting the offer arriving at time T_n.
We are interested in finding an optimal m-stopping time for \(\{G_{n}\}_{n=1}^{\infty}\), that is, the m-stopping time \((\tau^{1,m}_{k},\ldots,\tau^{m,m}_{k})\in \mathcal{M}_{k}(m)\) such that

$$E\bigl(G_{\tau^{1,m}_{k}}+\cdots+G_{\tau^{m,m}_{k}}\bigr)=\sup_{(\tau^{1},\ldots,\tau^{m})\in\mathcal{M}_{k}(m)}E\bigl(G_{\tau^{1}}+\cdots+G_{\tau^{m}}\bigr),$$

and the optimal mean reward \(E(G_{\tau^{1,m}_{k}}+\cdots+G_{\tau^{m,m}_{k}})\).
Interpretation
The problem can be interpreted as a problem of selling m commodities of the same type, where the offers are received sequentially and must be refused or accepted immediately on arrival.
Let {s_0,…,s_l}, where 0=s_0<s_1<⋯<s_{l−1}<s_l=U, l<∞, contain all points of discontinuity of the discount function r. Let \(\bar{F}(s)=1-F(s)\) and \(H(s)=E(Y_{1}\mathbb{I}(Y_{1}>s))\), s∈ℝ.
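For a concrete distribution these quantities have closed forms; for instance, for Y_1 ~ Exp(1) (an illustrative choice, not assumed in the paper) one has F̄(s)=e^{−s} and H(s)=(s+1)e^{−s} for s≥0. A quick numerical sanity check of the latter by midpoint-rule integration of y e^{−y}:

```python
import math

def H_exp(s, grid=200_000, cap=60.0):
    # H(s) = E(Y * 1(Y > s)) for Y ~ Exp(1): integrate y * exp(-y)
    # over (max(s, 0), cap) with the midpoint rule; the tail beyond
    # cap is negligible.
    a = max(s, 0.0)
    h = (cap - a) / grid
    return sum((a + (j + 0.5) * h) * math.exp(-(a + (j + 0.5) * h)) * h
               for j in range(grid))

# Closed form at s = 1 is (1 + 1) * exp(-1):
print(abs(H_exp(1.0) - 2.0 * math.exp(-1.0)) < 1e-6)  # True
```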
In the theorem below, the functions γ^i determining the optimal expected and conditional expected total rewards are obtained.
Theorem 1
(Stadje [19])
1. There exist m unique functions γ^1(x)≥γ^2(x)≥⋯≥γ^m(x), x≥0, such that:

(i) For i∈{1,…,m}, the function γ^i is continuous and has a continuous derivative on the intervals (s_j, s_{j+1}), j∈{0,1,…,l−1}.

(ii) For x∈(s_j, s_{j+1}),

$$\frac{d}{dx}\gamma^{i}(x)= \left \{ \begin{array}{l@{\quad}l} \gamma^{1}(x)\bar{F}(\frac{\gamma^{1}(x)}{r(x)})- r(x)H(\frac {\gamma^{1}(x)}{r(x)}), & \mbox{\textit{if}}\ i=1, \\\noalign{\vspace{2pt}} \gamma^{i}(x)\bar{F}(\frac{\gamma^{i}(x)}{r(x)})- r(x)H(\frac {\gamma^{i}(x)}{r(x)}) \\\noalign{\vspace{2pt}} \quad{}+r(x)H(\frac{\gamma^{i-1}(x)}{r(x)})-\gamma^{i-1}(x)\bar {F}(\frac{\gamma^{i-1}(x)}{r(x)}),& \mbox{\textit{if}}\ i>1. \end{array} \right . $$

(iii) For each i, γ^i(x)=0 for x≥U.

2. The additional expected reward which can be obtained from selling i instead of i−1 commodities is

$$\gamma^{i}(0)= \left \{ \begin{array}{l@{\quad}l} \sup_{\tau^{1}\in\mathcal{M}_1(1)}E(G_{\tau^{1}}), & \mbox{\textit{if}}\ i=1, \\\noalign{\vspace{4pt}} \sup_{(\tau^{1},\ldots,\tau^{i})\in\mathcal {M}_1(i)}E(\sum_{j=1}^iG_{\tau^{j}}) \\\noalign{\vspace{4pt}} \quad {}-\sup_{(\tau^{1},\ldots,\tau^{i-1})\in \mathcal {M}_1(i-1)}E(\sum_{j=1}^{i-1}G_{\tau^{j}}), & \mbox{\textit{if}}\ i\geq2. \end{array} \right . $$
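The system in Theorem 1 rarely yields closed forms, but it can be integrated numerically backwards from the terminal condition γ^i(U)=0. The sketch below treats the illustrative case Y_1 ~ Exp(1) with r≡1 on [0,U) (an assumption made here for the example only), where F̄(s)=e^{−s} and H(s)=(s+1)e^{−s}, so the right-hand sides collapse to dγ^i/dx = −e^{−γ^i} + e^{−γ^{i−1}} (with the convention e^{−γ^0}:=0). For i=1 this integrates exactly to γ^1(x)=ln(1+U−x), Elfving's classical threshold.

```python
import math

def thresholds_at_zero(m, U, steps=100_000):
    # Euler integration of the ODEs of Theorem 1, backwards from
    # gamma^i(U) = 0, specialized to Y ~ Exp(1) and r = 1 on [0, U):
    #   d/dx gamma^i(x) = -exp(-gamma^i(x)) + exp(-gamma^{i-1}(x)),
    # with the convention exp(-gamma^0) = 0.
    h = U / steps
    g = [0.0] * m                          # gamma^1(U), ..., gamma^m(U)
    for _ in range(steps):
        old = g[:]                         # values at the current x
        for i in range(m):
            inflow = math.exp(-old[i - 1]) if i > 0 else 0.0
            g[i] = old[i] - h * (-math.exp(-old[i]) + inflow)
    return g                               # [gamma^1(0), ..., gamma^m(0)]

g = thresholds_at_zero(m=2, U=3.0)
# gamma^1(0) should be close to ln(1 + U) = ln 4, and the thresholds
# must be ordered: gamma^1(0) >= gamma^2(0) >= 0.
```

The ordering γ^1 ≥ γ^2 ≥ ⋯ is preserved by the integration, in agreement with part 1 of the theorem.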
For k∈ℕ and i∈{1,…,m}, define

$$\tau^{i}(k)=\inf\bigl\{n\geq k: G_{n}\geq\gamma^{i}(T_{n})\bigr\}$$

and τ^i(∞)=∞.
For k∈ℕ∪{+∞} define

$$\tau^{1,m}_{k}=\tau^{m}(k)\quad\mbox{and}\quad \tau^{i,m}_{k}=\tau^{m-i+1}\bigl(\tau^{i-1,m}_{k}+1\bigr),\quad i\in\{2,\ldots,m\}.$$
The Markov time \(\tau^{i,m}_{k}\) can be interpreted as the time of selling the ith commodity among m commodities for sale, if the selling process starts at the time of the kth observation. Note that \(\tau^{i,m}_{k}\) is the first time after the stopping time \(\tau^{i-1,m}_{k}\) at which the reward is not smaller than the optimal conditional expected reward from selling the ith commodity in the future if we have i instead of i−1 commodities for sale. Therefore, γ^i(T_n) determines the minimum acceptable offer for selling the (m−i+1)th commodity at time T_n. Hence, γ^i is a threshold below which it is not profitable to sell the (m−i+1)th commodity.
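On a given sample path the optimal m-stopping rule is a single left-to-right scan: the jth sale accepts the first offer after the (j−1)th sale whose reward clears the threshold γ^{m−j+1}(T_n). A sketch with hypothetical constant thresholds (placeholders, not solutions of the differential equations); the sample is chosen so that every threshold is eventually cleared, as happens in the paper since γ^i and r vanish after U:

```python
def multi_stop(rewards, times, gammas):
    # rewards[n-1] = G_n; gammas = [gamma^1, ..., gamma^m] are threshold
    # functions with gamma^1 >= ... >= gamma^m.  Returns the 1-based
    # indices (tau^{1,m}_1, ..., tau^{m,m}_1): the j-th sale takes the
    # first offer after the (j-1)-th sale with G_n >= gamma^{m-j+1}(T_n).
    m = len(gammas)
    stops, n = [], 0
    for j in range(1, m + 1):
        threshold = gammas[m - j]          # gamma^{m-j+1}, 0-indexed
        while rewards[n] < threshold(times[n]):
            n += 1
        stops.append(n + 1)                # 1-based index of the sale
        n += 1                             # next sale starts strictly later
    return stops

# Hypothetical data and constant thresholds gamma^1 = 2 >= gamma^2 = 1:
G = [0.5, 1.2, 0.8, 2.5, 3.0]
T = [0.2, 0.5, 0.9, 1.3, 1.8]
print(multi_stop(G, T, [lambda t: 2.0, lambda t: 1.0]))  # [2, 4]
```

Note the economic logic: with many commodities left one accepts lower offers (threshold γ^m), while the last sale is the pickiest (threshold γ^1).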
Theorem 2
(Stadje [19])
\((\tau^{1,m}_{k},\ldots,\tau^{m,m}_{k})\) is an optimal m-stopping time in \(\mathcal{M}_{k}(m)\) for the sequence \(\{G_{n}\}_{n=0}^{\infty}\).
Lemma 1
For each m,k∈ℕ we have
Proof
Note that
where the last equality follows from Kösters [11, Lemma 5] and Stadje [19, p. 233]. □
Note that for each m∈ℕ
and in particular
Note that from monotonicity of the sequence \(\{\gamma^{i}(\cdot)\}_{i=1}^{m}\) we get
and from Theorem 2 we obtain
Moreover, for k,m∈ℕ and i∈{2,…,m}, we have
Lemma 2
For k,m∈ℕ, m≥2, i∈{1,…,m−1} we have
Proof
2 The Game
Suppose that there are m>1 ordered players who sequentially observe rewards G_n at times T_n, n=1,2,…. Players' indices 1,2,…,m correspond to their ordering (ranking), called priority, so that 1 refers to the player with the highest priority and m to the lowest one. Each player is allowed to obtain one reward, at the time of its appearance, on the basis of the past and current observations and taking into account the players' priorities. More precisely, Player i, say, who has just decided to make a selection at T_n gets the reward G_n if and only if he has not obtained any reward before and no player with higher priority (lower order) has also decided to take the current reward. As soon as a player gets a reward, he quits the game. The remaining players select rewards in the same manner, their priorities remaining as before.
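The allocation rule just described (each offer goes to the highest-priority active player who claims it, after which that player leaves) can be sketched as follows; the 0–1 arrays stand in for realized values of the strategies ψ^{m,i}:

```python
def allocate(decisions):
    # decisions[i][n] == 1 means Player i+1 wants offer n+1.
    # Returns assignment[i] = 1-based index of the offer Player i+1
    # obtains (None if the player never gets one).  When several active
    # players claim the same offer, the lowest index (highest priority) wins.
    m = len(decisions)
    horizon = len(decisions[0])
    assignment = [None] * m
    active = set(range(m))
    for n in range(horizon):
        claimants = sorted(i for i in active if decisions[i][n] == 1)
        if claimants:
            winner = claimants[0]          # highest priority among claimants
            assignment[winner] = n + 1
            active.remove(winner)          # the winner quits the game
    return assignment

# Three players; Player 1 outranks Player 2 outranks Player 3.
decisions = [
    [0, 1, 0, 0],   # Player 1 wants offer 2
    [0, 1, 1, 0],   # Player 2 wants offers 2 and 3 -> loses 2, gets 3
    [1, 0, 0, 1],   # Player 3 wants offer 1 -> gets it immediately
]
print(allocate(decisions))  # [2, 3, 1]
```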
2.1 Model of the Game
In this section, we make the same assumptions and use the same notation as in Sect. 1. Moreover, let D be the set of sequences of 0–1 valued \(\{\mathcal {F}_{n}\}\)-adapted random variables. Let \(\psi^{m,i}=\{\psi^{m,i}_{n}\}_{n\in\mathbb{N}}\in D\) be a strategy of Player i in the m-person game. If \(\psi^{m,i}_{n}=1\), then, at the time of observation of the nth offer, the decision of Player i is: I want to finish the game and take the reward G_n. If \(\psi^{m,i}_{n}=0\), then his decision is: I continue the game.
We say that ψ m is the profile of the m-person game, if ψ m=(ψ m,1,…,ψ m,m), where ψ m,i∈D for i∈{1,2,…,m}. Let us recursively define \(\sigma_{m}^{i}(\psi^{m})\), i∈{1,…,m}, as follows:
Under the strategy profile ψ m, the reward of Player i, i∈{1,…,m}, is \(G_{\sigma_{m}^{i}(\psi^{m})}\) and the mean reward is
Let us recall that the strategy profile \(\varphi^{m}=(\varphi^{m,1},\ldots ,\varphi^{m,m})\in D^{m}=\underbrace{D\times\cdots\times D}_{m}\) is a Nash equilibrium strategy in D m, if for any profile ψ m=(ψ m,1,…,ψ m,m)∈D m we have
where
Let D 1⊆D be the set such that {p n } n∈ℕ∈D 1 if and only if {p n } n∈ℕ∈D and on the set \(\{ \tau_{1}^{m,m}\leq n\}\) we have p n =1, where \(\tau_{1}^{m,m}\) is given in (3). Our aim is to find a Nash equilibrium in the set \(D_{1}^{m}=\underbrace {D_{1}\times\cdots\times D_{1}}_{m}\) for the m-person game for which the reward for Player i is \(G_{\sigma_{m}^{i}(\psi^{m})}\), i∈{1,…,m}.
Using Stadje’s result from Theorem 2, the m selected rewards which maximize the expected total reward appear no later than \(\tau_{1}^{m,m}\). This motivates searching for a Nash equilibrium strategy in the set \(D_{1}^{m}\).
2.2 Construction of Nash Equilibrium
Define \(\sigma_{i}^{k}\) for k∈ℕ∪{+∞} and i∈{1,…,m} as follows:

$$\sigma_{1}^{k}=\tau^{1}(k),\qquad \sigma_{i}^{k}=\tau^{i}(k)\mathbb{I}\bigl(\tau^{i-1}(k)>\tau^{i}(k)\bigr)+\sigma_{i-1}^{\tau^{i}(k)+1}\mathbb{I}\bigl(\tau^{i-1}(k)=\tau^{i}(k)\bigr),\quad i\in\{2,\ldots,m\},$$
where \(\sigma_{i}^{+\infty}=+\infty\).
Lemma 3
\(\sigma^{k}_{i}\leq\tau_{k}^{i,i}\leq\tau^{m,m}_{k}\) for i≤m, k,m∈ℕ.
Lemma 4
\(\sigma^{k}_{i}\) for i,k∈ℕ is a stopping time with respect to \(\{\mathcal{F}_{n}\}_{n=0}^{\infty}\).
Proof
Immediately from Lemma 3 and (11). □
Lemma 5
For m∈ℕ, m≥2, i∈{1,…,m−1}, on {τ i(k)>τ m(k)} we have
Lemma 6
\(\sigma_{i}^{k}\neq\sigma_{j}^{k}\) for i≠j.
Let \(\hat{\psi}^{m}=(\hat{\psi}^{m,1},\ldots,\hat{\psi}^{m,m})\), where \(\hat{\psi}^{m,i}=\{\hat{\psi}^{m,i}_{n}\}_{n=1}^{\infty}\) and
From Lemma 4, we get that \(\{\hat {\psi}_{n}^{m,i}\}\) is a sequence of 0-1 valued \(\{\mathcal{F}_{n}\} \)-adapted random variables. Hence, from Lemma 3, we get
for i∈{1,…,m}. According to the above profile, Player i in the m-person game will behave in the same manner as Player i in the i-person game, that is,
Proposition 1
\({\sigma}^{i}_{m}(\hat{\psi}^{m})=\sigma_{i}^{1}\) for m∈ℕ, i∈{1,…,m}.
Proof
The proof uses induction on i and Lemma 6. □
It follows from Proposition 1 that, for i∈{1,…,m}, Player i stops playing in the m-person game at the same time as in the i-person game, that is, \({\sigma }^{i}_{m}(\hat {\psi}^{m})={\sigma}^{i}_{i}(\hat{\psi}^{i})\).
Lemma 7
For each k,m∈ℕ, we have
where \(\tau^{i,m}_{k}\) are defined in (3).
Theorem 3
For m∈ℕ, we have
Proof
Using (9), Proposition 1 and Lemma 7, we get
Hence, from Theorem 2 and (4) we get the assertion. □
Lemma 8
For each m∈ℕ and i∈{1,…,m}, we have
Proof
From (9) and Proposition 1, we have
Using Theorem 3 and (17), we get
Hence, \(V_{i,i}(\hat{\psi}^{i})=\gamma^{i}(0)\), which together with (17) gives the conclusion. □
Note that according to (17), the expected reward of Player i in the m-person game with profile \(\hat{\psi}^{m}\) is equal to the expected reward of Player i in the i-person game with profile \(\hat{\psi}^{i}\).
Lemma 9
For each m,k∈ℕ we have
and
Proof
Note that \(\mathcal{M}_{k}(m)\subseteq\{(\tau^{1},\ldots,\tau^{m}):\tau^{1},\ldots,\tau^{m}\in{\mathcal{M}_{k}(1)},\tau^{i}\neq\tau^{j},i\neq j\}\). Let \(\tau^{1},\ldots,\tau^{m}\in\mathcal{M}_{k}(1)\) be any pairwise distinct stopping times. Define τ^{(1)}=min{τ^1,…,τ^m} and τ^{(i)}=min{τ^j: τ^j>τ^{(i−1)}, j∈{1,…,m}}, i=2,…,m. Then \((\tau^{(1)},\ldots,\tau^{(m)})\in\mathcal{M}_{k}(m)\). □
Theorem 4
The profile \(\hat{\psi}^{m}\) is a Nash equilibrium in \(D_{1}^{m}\).
Proof
For \(\psi^{m}=(\psi^{m,1},\ldots,\psi^{m,m})\in D_{1}^{m}\) and i∈{1,…,m}, from (9) and (16), we have
Note that from (7) and (8) we have \(\sigma_{m}^{j}((\hat{\psi }^{m})^{-i},{\psi}^{m,i})\neq\sigma_{m}^{l}((\hat{\psi}^{m})^{-i},{\psi }^{m,i})\), j≠l. Hence, from the definition of D 1 and (13), we get \(\sigma_{m}^{k}((\hat{\psi }^{m})^{-i},{\psi}^{m,i})\leq\tau^{m,m}_{1}+m<\infty\). Therefore, using Lemma 9 and (4), we have
Finally, from (18), (19), and (16), we get \(V_{m,i}((\hat{\psi}^{m})^{-i},{\psi}^{m,i})\leq\gamma^{i}(0)=V_{m,i}(\hat {\psi}^{m})\). □
Let us recall that the profile \(\varphi^{m}\in D_{1}^{m}\) is Pareto-optimal in \(D^{m}_{1}\) if there does not exist a profile \({\psi}^{m}\in D^{m}_{1}\) such that for all i∈{1,…,m}
and at least one of the above inequalities is strict.
Theorem 5
\(\hat{\psi}^{m}\in D^{m}_{1}\) is Pareto-optimal in \(D^{m}_{1}\).
Proof
Assume that there exists \({\varphi^{m}}\in D_{1}^{m}\) such that \(V_{m,i}(\varphi^{m})\geq V_{m,i}(\hat{\psi}^{m})\), i=1,2,…,m and at least one of the inequalities is strict. Then, from Theorem 3 and (4),
On the other hand, from (9),
which with (20) gives a contradiction. □
In the proposition below, we will show that the players in the m-person game choose the same rewards as the optimal ones in the multiple stopping problem. However, note that the stopping time selected by Player i in the m-person game can be different from the ith stopping time in the multiple stopping problem.
Proposition 2
Define \(\sigma_{(1)}^{k}=\min\{\sigma_{1}^{k},\ldots,\sigma^{k}_{m}\}\) and \(\sigma_{(i)}^{k}=\min\{\sigma_{j}^{k}: \sigma^{k}_{j}>\sigma^{k}_{(i-1)},j\in\{ 1,\ldots ,m\}\}\) for i∈{2,…,m}. Then, for each k∈ℕ,

$$\bigl\{\sigma_{1}^{k},\ldots,\sigma_{m}^{k}\bigr\}=\bigl\{\tau^{1,m}_{k},\ldots,\tau^{m,m}_{k}\bigr\}.$$
Corollary 1
\(\sigma_{(i)}^{k}=\tau^{i,m}_{k}\) for i∈{1,…,m}.
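Proposition 2 and Corollary 1 can be checked on any fixed sample path: the players' stopping times {σ_i^1} are a permutation of the optimal multiple-stopping times {τ^{i,m}_1}. The sketch below encodes the recursions used in the proofs, with hypothetical constant thresholds γ^1 ≥ γ^2 ≥ γ^3 and a sample chosen so that every scan terminates within the horizon:

```python
def tau(i, k, G, gam):
    # tau^i(k): first n >= k (1-based) with G_n >= gamma^i.  The last
    # offer in the sample clears every threshold, so the scan is finite.
    n = k
    while G[n - 1] < gam[i - 1]:
        n += 1
    return n

def sigma(i, k, G, gam):
    # sigma_i^k: Player i takes tau^i(k) unless the player ranked i-1
    # wants the same offer; then Player i continues as Player i-1
    # restarted just after that offer.
    if i == 1:
        return tau(1, k, G, gam)
    t = tau(i, k, G, gam)
    if tau(i - 1, k, G, gam) > t:
        return t
    return sigma(i - 1, t + 1, G, gam)

def optimal_stops(m, G, gam):
    # (tau^{1,m}_1, ..., tau^{m,m}_1): the j-th stop uses gamma^{m-j+1}.
    stops, prev = [], 0
    for j in range(1, m + 1):
        prev = tau(m - j + 1, prev + 1, G, gam)
        stops.append(prev)
    return stops

gam = [2.0, 1.0, 0.5]                  # gamma^1 >= gamma^2 >= gamma^3
G = [0.7, 1.2, 0.4, 2.5, 0.9, 3.0]    # last offer clears gamma^1
players = sorted(sigma(i, 1, G, gam) for i in (1, 2, 3))
print(players, optimal_stops(3, G, gam))  # [1, 2, 4] [1, 2, 4]
```

Here Player 3 stops first (at offer 1), Player 2 second (offer 2) and Player 1 last (offer 4), yet as a set the stopping times coincide with the optimal ones.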
In Theorem 6 below, we will show that the presented Nash equilibrium is a sub-game perfect equilibrium. Recall that a sub-game perfect equilibrium is a Nash equilibrium such that, after any history, the remaining players' strategy profile is a Nash equilibrium in the remaining part of the game. Let \(V_{m,i}^{k}(\psi^{m})\) be the conditional expected reward of Player i in the m-person game at the time of the kth offer, that is, \(V_{m,i}^{k}(\psi^{m})=\mathbb {I}(\sigma_{m}^{i}(\psi^{m})\geq k)E(G_{\sigma_{m}^{i}(\psi^{m})}\mid\mathcal{F}_{k-1})\). Note that \(V_{m,i}^{1}(\psi^{m})=V_{m,i}(\psi^{m})\).
Theorem 6
The profile \(\hat{\psi}^{m}\) is a sub-game perfect equilibrium.
Summary
In Theorem 4, we have proved that the strategy profile \(\hat{\psi}^{m}\) is a Nash equilibrium. According to the strategy, Player i in the m-person game behaves as Player i in the i-person game (Eq. (14)). Moreover, their selected stopping times and expected rewards are equal (Proposition 1 and Eq. (17)). Additionally, the expected reward of Player i in the m-person game in Nash equilibrium is equal to the expected reward from selling the ith good in the future, if there are i instead of i−1 goods for sale (Lemma 8).
Example 1
Let us consider a three-person game. We show that the player with the lowest priority in the three-person game finishes the game at one of three stopping times: \(\tau_{1}^{1,3},\tau_{1}^{2,3},\tau_{1}^{3,3}\). From (6), we obtain \(\tau^{2}({\tau^{3}(1)+1})=\tau^{2,3}_{1}\). Moreover, from (10) and (6), we have
Note that \(\tau^{3}(1)=\tau_{1}^{1,3}\). Hence from (11) we have
In a similar way, we obtain that Player 2 in the three-person game will finish the game at one of two stopping times: \(\tau_{1}^{1,2},\tau_{1}^{2,2}\), that is,
Moreover, the player with the highest priority will finish the game at \(\sigma_{1}^{1}=\tau_{1}^{1,1}\).
Note that the ith player’s stopping time is different from the optimal time of selling the ith commodity, but his expected reward is equal to the optimal expected reward, which can be obtained from selling the ith commodity.
3 Discussion
In this section, we compare in detail the results obtained in this paper with those presented in Ferenstein and Krasnosielska [8]. Let us briefly present the solution obtained in [8]. There it is assumed that there exist continuous functions \(\hat{V}_{m,i}(\cdot)\) such that \(\hat{V}_{m,i}(u)\leq \hat {V}_{m,i-1}(u)\), i∈{2,…,m}, \(u\in\mathbb{R}_{0}^{+}\), where \(\hat {V}_{m,i}(u)\) is an “optimal” mean reward of Player i in the m-person game with reward structure G_n(u)=Y_n r(u+T_n). Next, the Markov times \(\hat{\tau}^{i}(u)=\inf\{n\geq1:G_{n}(u)\geq \hat {V}_{m,i}(u+T_{n})\}\), i=1,2,…,m, are defined and used to construct the following strategy:
-
Player 1 finishes the game at Markov time \(\hat{\tau}^{m}(u)\) if and only if \(\hat{\tau}^{1}(u)=\hat{\tau}^{m}(u)\); if \(\hat{\tau }^{1}(u)>\hat {\tau}^{m}(u)\), then Player 1 behaves as Player 1 in the (m−1)-person game starting at Markov time \(\hat{\tau}^{m}(u)+1\);
-
Player i, i∈{2,…,m−1}, finishes the game at Markov time \(\hat{\tau}^{m}(u)\) if and only if \(\hat{\tau}^{i-1}(u)>\hat {\tau }^{i}(u)=\hat{\tau}^{m}(u)\); if \(\hat{\tau}^{i-1}(u)=\hat{\tau }^{m}(u)\), then Player i behaves as Player i−1 in the (m−1)-person game starting at \(\hat{\tau}^{m}(u)+1\); if \(\hat{\tau}^{i}(u)>\hat{\tau}^{m}(u)\), then Player i behaves as Player i in the (m−1)-person game starting at \(\hat{\tau}^{m}(u)+1\);
-
Player m finishes the game at Markov time \(\hat{\tau}^{m}(u)\) if and only if \(\hat{\tau}^{m-1}(u)>\hat{\tau}^{m}(u)\); if \(\hat{\tau }^{m-1}(u)=\hat{\tau}^{m}(u)\), then Player m behaves as Player m−1 in the (m−1)-person game starting at \(\hat{\tau}^{m}(u)+1\).
Next, it is proved that Player i stops in the i-person game at the same time as in the m-person game and in consequence the “optimal” expected reward of Player i in the m-person game is equal to the “optimal” expected reward of Player i in the i-person game, that is, \(\hat{V}_{m,i}(u)=\hat{V}_{i,i}(u)\). To compute \(\hat{V}_{i,i}(\cdot)\), it was assumed \(\hat{V}_{m,1}(u)=\hat{V}_{1,1}(u)\) and for i≥2
where \(\hat{V}_{1,1}(u)\) is the optimal expected reward in the Elfving problem. Hence, as in the Elfving problem, for each i∈{2,…,m} the differential equations describing \(\hat{V}_{i,i}(\cdot)\) have been obtained. Next, it was stated (without proof) that the equations obtained have exactly one solution. Moreover, some arguments were given suggesting that the profile is a Nash equilibrium. Note that the assumption given by (22) is a modification of the one made in Elfving [6], and it was removed in Siegmund [18].
Note that the differential equations obtained in Ferenstein and Krasnosielska [8] are the same as those in Stadje [19], that is,
From (16), we have that the expected reward of Player i in the m-person game with profile \(\hat{\psi}^{m}\) is equal to γ^i(0). Hence, from (16), (23), and Theorem 4, we get that the profile is a Nash equilibrium.
The connections between the problem solved in Stadje [19] and the solution of the game presented in Ferenstein and Krasnosielska [8] are discussed in detail in the doctoral dissertation of Krasnosielska [14].
4 Proofs
Proof of Lemma 3
From (3), for i=1, we have \(\sigma^{k}_{1}=\tau^{1}(k)=\tau_{k}^{1,1}\). Assume that for each k∈ℕ we have \(\sigma^{k}_{i-1}\leq\tau_{k}^{i-1,i-1}\). We will show that \(\sigma^{k}_{i}\leq\tau_{k}^{i,i}\). From (11) and Lemma 2, we have
The second inequality in Lemma 3 follows from (6). □
Proof of Lemma 5
From (2), we have
for i∈{1,…,m−1} and k,n∈ℕ. From (6), we obtain \(\tau^{i}({\tau^{m}}(k)+1)\leq \tau^{m-i+1,m}_{k}<\infty\). Hence, the stopping times appearing in (24) are finite. Therefore, from (24) on {τ i(k)>τ m(k)}, we have
Consider two cases. If i=1, then from (10) and (25), we get the assertion of the lemma. Let i∈{2,…,m−1}. From (5), we get {τ i−1(k)>τ m(k)}⊇{τ i(k)>τ m(k)}. Therefore, from (25) on {τ i(k)>τ m(k)} we have τ i−1(τ m(k)+1)=τ i−1(k). Hence, from (25) on {τ i(k)>τ m(k)}, we get
□
Proof of Lemma 6
It is enough to show that \(\sigma_{i}^{k}\neq\sigma_{j}^{k}\), j<i. The proof uses induction on i. From (11), on {τ^1(k)>τ^2(k)} we have \(\sigma_{2}^{k}=\tau^{2}(k)<\tau^{1}(k)=\sigma_{1}^{k}\). Additionally, on {τ^1(k)=τ^2(k)}
Hence, \(\sigma_{2}^{k}\neq\sigma_{1}^{k}\).
Assume that \(\sigma_{i-1}^{k}\neq\sigma_{j}^{k}\) for j∈{1,2,…,i−2}, k∈ℕ.
We will prove that \(\sigma_{i}^{k}\neq\sigma_{j}^{k}\) for j∈{1,…,i−1}. From (11) on {τ i−1(k)>τ i(k)}
where we used (5) and (11). It should still be shown that \(\sigma_{i}^{k}\neq\sigma_{j}^{k}\) on {τ i−1(k)=τ i(k)}. Consider two cases. Let j=1. On {τ 1(k)=τ i(k)} we have \(\sigma_{i}^{k}=\sigma_{i-1}^{\tau^{i}(k)+1}=\sigma_{i-1}^{\tau ^{1}(k)+1}>\tau^{1}(k)=\sigma_{1}^{k}\). While on {τ 1(k)>τ i−1(k)=τ i(k)} we have \(\sigma_{i}^{k}=\sigma_{i-1}^{\tau^{i}(k)+1}\neq\sigma_{1}^{\tau ^{i}(k)+1}=\sigma_{1}^{k}\), where we used the induction assumption and Lemma 5. Hence, \(\sigma_{i}^{k}\neq\sigma_{1}^{k}\). Now, let j∈{2,…,i−1}. On {τ j−1(k)=τ i(k)} we have \(\sigma_{i}^{k}=\sigma_{i-1}^{\tau^{i}(k)+1}\neq\sigma_{j-1}^{\tau ^{i}(k)+1}=\sigma_{j-1}^{\tau^{j}(k)+1}=\sigma_{j}^{k}\). On {τ j−1(k)>τ j(k)=τ i(k)} we have \(\sigma_{i}^{k}=\sigma_{i-1}^{\tau^{i}(k)+1}>\tau^{i}(k)=\tau^{j}(k)=\sigma_{j}^{k}\). On {τ j−1(k)≥τ j(k)>τ i−1(k)=τ i(k)}, j∈{2,…,i−2}, we obtain \(\sigma_{i}^{k}=\sigma_{i-1}^{\tau^{i}(k)+1}\neq\sigma_{j}^{\tau ^{i}(k)+1}=\sigma_{j}^{k}\), where we used the induction assumption and Lemma 5. Hence, for j∈{1,…,i−1} on {τ i−1(k)=τ i(k)}, we have \(\sigma_{i}^{k}\neq\sigma_{j}^{k}\), which together with (26) completes the proof. □
Proof of Lemma 7
The proof uses induction on m. Note that for each k∈ℕ, we have \(\sigma_{1}^{k}=\tau^{1}(k)=\tau_{k}^{1,1}\). Hence, for m=1, we get (15). Now assume that for each k∈ℕ and given m≥2, we have
To prove that (15) is satisfied for k∈ℕ, we will use the equality
which will be proved later. From (27) and the induction assumption, we obtain
where the middle equality follows from Lemma 2 and \(\tau^{m}(k)=\tau^{1,m}_{k}\), which follows from (3).
Now we will prove (27). From (11), we get
where the last equality follows from (10) and (5). Adding the first and third summands appearing on the right side of the above equality, and using (5), we obtain
Additionally, adding the fourth and the last summand on the right side of (28) and using (11), we get
Now, plugging (29) and (30) into (28), and changing the index of summation in the last but one summand on the right side of (28) we get the final form of (28) as follows:
where the last equality is a consequence of Lemma 5. Hence, (27) is true. □
Proof of Proposition 2
The proof uses induction. For m=1, the assertion follows immediately from (10) and (3). Assume that \(\{\sigma_{1}^{k},\ldots,\sigma^{k}_{m-1}\}=\{\tau^{1,m-1}_{k},\ldots,\tau^{m-1,m-1}_{k}\}\). We will prove (21). Note that from (11) and Lemma 2 the right side of (21) is equal to
Note that the set \(\{\sigma_{1}^{k},\ldots,\sigma^{k}_{m}\}\) has m different elements (Lemma 6) and the set \(\{ \tau^{1,m}_{k}, \ldots,\tau^{m,m}_{k}\}\) has also m different elements. Therefore, from (31), it is enough to show that for each i∈{1,…,m} we have
We will consider three cases. Let i=m, then from (11) we have \(\sigma_{m}^{k}\in\{\tau^{m}(k),\sigma_{m-1}^{\tau ^{m}(k)+1}\}\). Let now i∈{2,…,m−1}. Then from Lemma 5 and (11), we have
Let i=1, then from Lemma 5 and (11) we have \(\sigma_{1}^{k}=\sigma_{1}^{\tau ^{m}(k)+1}\mathbb{I} (\tau^{m}(k)<\tau^{1}(k))+\tau^{1}(k)\mathbb{I}(\tau^{m}(k)=\tau^{1}(k))\in \{\sigma_{1}^{\tau^{m}(k)+1},\tau^{m}(k)\}\). □
Proof of Theorem 6
Assume that we are just before the observation of the kth offer and up to this time l≤m players remain in the game, say players numbered i 1,…,i l . Let \(\mathcal{I}=\{i_{1},\ldots,i_{l}\}\). Note that only players i 1,…,i l remain in the game up to the time of observation of the kth offer if and only if \(\sigma_{m}^{i_{j}}(\hat{\psi}^{m})\geq k\) for each j∈{1,…,l} and \(\sigma_{m}^{j}(\hat{\psi}^{m})<k\) for each \(j\in\{1,\ldots,m\} \backslash \mathcal{I}\). Let
We want to prove that for \(\psi^{m}=(\psi^{m,1},\ldots,\psi^{m,m})\in D_{1}^{m}\) such that \(\{\psi^{m,i}_{n}\}_{n=1}^{k-1}=\{\hat{\psi}^{m,i}_{n}\}_{n=1}^{k-1}\), i∈{1,…,m}, and for j∈{i 1,…,i l } we have
Note that \(A=\bigcap_{j\in\mathcal{I}} \{\sigma_{m}^{j}({\psi }^{m})\geq k\} \cap\bigcap_{j\in\{1,\ldots,m\}\backslash\mathcal{I}}\{\sigma_{m}^{j}({\psi}^{m})<k\}\) for profile ψ m defined above. Moreover, according to (11), \(\sigma_{i_{j}}^{1}=\sigma_{j}^{k}\) on A for j∈{1,…,l}. Therefore, from (9) and Proposition 1, we get for n≤l
where we used (15) and Lemma 1. As in Theorem 4, we have \(\sigma_{m}^{j}((\hat{\psi }^{m})^{-i},{\psi}^{m,i})\neq\sigma_{m}^{l}((\hat{\psi}^{m})^{-i},{\psi }^{m,i})\), j≠l, and \(\sigma_{m}^{j}((\hat{\psi}^{m})^{-i},{\psi }^{m,i})\leq\tau^{m,m}_{1}+m<\infty\). Therefore, using Lemma 9, we have
where we used the result of Kösters [11, Lemma 5] and Lemma 1. Finally, from (32), (33), and Lemma 1, we get
where in the last but one equality we used (15). □
References
Albright SCh (1974) Optimal sequential assignments with random arrival times. Manag Sci 21:60–67
Chow YS, Robbins H, Siegmund D (1971) Great expectations: the theory of optimal stopping. Houghton Mifflin, Boston
David I, Yechiali U (1985) A time-dependent stopping problem with application to live organ transplants. Oper Res 33:491–504
Dixon MT (1993) Equilibrium points for three games based on the Poisson process. J Appl Probab 30:627–638
Dynkin EB (1969) Game variant of optimal stopping. Sov Math Dokl 10:270–273
Elfving G (1967) A persistency problem connected with a point process. J Appl Probab 4:77–89
Enns EG, Ferenstein E (1990) A competitive best-choice problem with Poisson arrivals. J Appl Probab 27:239–256
Ferenstein E, Krasnosielska A (2009) Nash equilibrium in a game version of the Elfving problem. In: Bernhard P, Gaitsgory V, Pourtallier O (eds) Advances in dynamic games and their applications: analytical and numerical developments. Birkhäuser, Boston, pp 399–414
Ferenstein E, Krasnosielska A (2009) A version of the Elfving optimal stopping time problem with random horizon. In: Petrosjan L, Mazalov V (eds) Game theory and applications, vol 14. Nova Science Publishers, New York, pp 40–53
Gershkov A, Moldovanu B (2010) Efficient sequential assignment with incomplete information. Games Econom Behav 68:144–154
Kösters H (2004) A note on multiple stopping rules. Optimization 53:69–75
Krasnosielska A (2009) A version of the Elfving problem with random starting time. Stat Probab Lett 79:2429–2436
Krasnosielska A (2010) A time dependent best choice problem with costs and random lifetime in organ transplants. Appl Math 37:257–274
Krasnosielska A (2011) Issues of optimal stopping inspired by Elfving problem. Doctoral Dissertation, Warsaw University of Technology (in Polish)
Nowak AS, Szajowski K (1999) Nonzero-sum stochastic games. In: Bardi M, Raghavan TES, Parthasarathy T (eds) Stochastic and differential games: theory and numerical methods. Birkhäuser, Boston, pp 297–343
Parlar M, Perry D, Stadje W (2007) Optimal shopping when the sales are on—a Markovian full-information best-choice problem. Stoch Models 23:351–371
Saario V, Sakaguchi M (1992) Multistop best choice games related to the Poisson process. Math Jpn 37:41–51
Siegmund DO (1967) Some problems in the theory of optimal stopping. Ann Math Stat 38:1627–1640
Stadje W (1987) An optimal k-stopping problem for the Poisson process. In: Bauer BP, Konecny F, Wertz W (eds) Proceedings of the 6th Pannonian symposium on mathematical statistics. Reidel Publishing Company, Dordrecht, pp 231–244
Acknowledgements
We are grateful to Andrzej Nowak for his question concerning Pareto optimum of the profile leading to Theorem 5 and two anonymous reviewers for their comments which led to Proposition 2 and Theorem 6. The research of Anna Krasnosielska-Kobos (née Anna Krasnosielska) has been partially supported by the Polish Ministry of Science and Higher Education (Grant No. N N201 367236) and the European Union in the framework of European Social Fund through the Warsaw University of Technology Development Programme.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Krasnosielska-Kobos, A., Ferenstein, E. Construction of Nash Equilibrium in a Game Version of Elfving’s Multiple Stopping Problem. Dyn Games Appl 3, 220–235 (2013). https://doi.org/10.1007/s13235-012-0070-7