
1 Introduction

The abstract concept of game has proved to be a fruitful metaphor in theoretical computer science [1]. Several decision problems can, indeed, be encoded as path-forming games on graphs, where a player trying to achieve a certain goal, usually the verification of some property on the plays derived from the original problem, has to face an opponent whose aim is to pursue the exact opposite task. One of the most prominent instances of this connection is represented by the notion of parity game [18], a simple two-player turn-based perfect-information game played on directed graphs whose nodes are labelled with natural numbers called priorities. The goal of the first (resp., second) player, a.k.a. the even (resp., odd) player, is to force a play \(\pi \) whose maximal priority occurring infinitely often along \(\pi \) is of even (resp., odd) parity. The importance of these games is due to the numerous applications in the area of system specification, verification, and synthesis, where they are used as the algorithmic back-end of satisfiability and model-checking procedures for temporal logics [6, 8, 16], and as a core for several techniques employed in automata theory [7, 10, 15, 17]. In particular, their solution has been proved to be linear-time interreducible with the model-checking problem for the modal \(\mu \)-calculus [8] and is closely related to other games of infinite duration, such as mean payoff [5, 11], discounted payoff [24], simple stochastic [4], and energy [3] games.

Besides their practical importance, parity games are also interesting from a computational complexity point of view, since their solution problem is one of the few inhabitants of the UPTime \(\cap \) CoUPTime class [12]. That result improves the NPTime \(\cap \) CoNPTime membership [8], which easily follows from the property of memoryless determinacy [7, 18]. The question of membership in PTime is still open.

The literature on the topic is rich in algorithms for solving parity games, which can be mainly classified into two families. The first one contains the algorithms that, by employing a divide et impera approach, recursively decompose the problem into subproblems, whose solutions are then suitably assembled to obtain the desired result. In this category fall, for example, Zielonka's recursive algorithm [23] and its dominion decomposition [14] and big step [19] improvements. The second family, instead, groups together those algorithms that try to compute a winning strategy for the two players on the entire game. The principal members of this category are represented by Jurdziński's progress measure algorithm [13] and the strategy improvement approaches [20-22].
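To make the winning condition concrete, the following Python fragment (a toy sketch of our own, not taken from any of the cited solvers) determines the winner of an ultimately periodic play, represented as a finite prefix followed by a repeated cycle; the position names and priorities are made up for illustration.

```python
def winner_of_lasso(prefix, cycle, priority):
    """Winner of the ultimately periodic play prefix . cycle^omega.

    Only the priorities on the cycle occur infinitely often, so the winner
    is the parity of the maximal priority appearing on the cycle.
    """
    assert cycle, "the repeated part must be non-empty"
    return max(priority[v] for v in cycle) % 2

# Example: the play a (b c)^omega is won by player 1, since the maximal
# priority occurring infinitely often is 3, which is odd.
priority = {"a": 4, "b": 3, "c": 2}
print(winner_of_lasso(["a"], ["b", "c"], priority))  # -> 1
```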

Recently, a new divide et impera solution algorithm, called priority promotion (PP, for short), has been proposed in [2], which is fully based on the decomposition of the winning regions into dominions. The idea is to find a dominion for one of the two players and then remove it from the game, thereby allowing for a recursive solution. The important difference w.r.t. the other two approaches based on the same notion [14, 19] is that those procedures only look for dominions of a certain size in order to speed up classic Zielonka's algorithm in the worst case. Consequently, they strongly rely on this algorithm for their completeness. On the contrary, the PP procedure autonomously computes dominions of any size, by suitably composing quasi dominions, a weaker notion of dominion. Intuitively, a quasi dominion Q for player \(\alpha \in \{ 0, 1 \}\) is a set of vertices from each of which player \(\alpha \) can enforce a winning play that never leaves the set, unless one of the following two conditions holds: (i) the opponent \(\overline{\alpha }\) can escape from Q (i.e., there is an edge from a vertex of \(\overline{\alpha }\) exiting from Q) or (ii) the only choice for player \(\alpha \) itself is to exit from Q (i.e., no edge from a vertex of \(\alpha \) remains in Q). A crucial feature of quasi dominions is that they can be ordered by assigning to each of them a priority corresponding to an under-approximation of the best priority for \(\alpha \) that the opponent \(\overline{\alpha }\) can be forced to visit along any play exiting from it. Indeed, under suitable and easy to check assumptions, a higher priority quasi \(\alpha \)-dominion Q\(_{1}\) and a lower priority one Q\(_{2}\) can be merged into a single quasi \(\alpha \)-dominion of the higher priority, thus improving the approximation for Q\(_{2}\). This merging operation is called a priority promotion of Q\(_{2}\) to Q\(_{1}\). The PP solution procedure has been shown to be very effective in practice and to often significantly outperform all other solvers. Moreover, it also improves the space complexity of the best known algorithm by an exponential factor, since it only needs \(O(n \log k)\) space against the \(O(k \cdot n \cdot \log n)\) space required by Jurdziński's approach [13], where n and k are, respectively, the numbers of vertices and priorities of the game. Unfortunately, the PP algorithm also exhibits exponential behaviours on a simple family of games. This is due to the fact that, in general, promotions to higher priorities require resetting previously computed quasi dominions at lower ones.

In order to mitigate this problem, we propose in this paper a new algorithm, called RR for region recovery, which is built on top of PP and is based on a form of conservation property of quasi dominions. This property provides sufficient conditions for a subset of a quasi \(\alpha \)-dominion to still be a quasi \(\alpha \)-dominion. By exploiting this property, the RR algorithm can significantly limit the application of the resetting phase, which is now restricted to the cases in which the conservation property is not guaranteed to hold. No exponential worst case has been found yet for the resulting procedure. Experiments on randomly generated games also show that the new approach performs significantly better than PP in practice, while still preserving the same space complexity.

2 Preliminaries

Let us first briefly recall the notation and basic definitions concerning parity games that expert readers can simply skip. We refer to [1, 23] for a comprehensive presentation of the subject.

Given a partial function \(f : A \rightharpoonup B\), by \(\mathrm{dom}(f)\) and \(\mathrm{rng}(f)\) we denote the domain and range of f, respectively. In addition, \(\uplus \) denotes the completion operator that, taken f and another partial function \(g : A \rightharpoonup B\), returns the partial function \(f \uplus g : A \rightharpoonup B\), which is equal to g on its domain and assumes the same values of f on the remaining part of A.
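For illustration, representing partial functions as Python dictionaries (an encoding of our own, not prescribed by the paper), the completion operator can be sketched as follows.

```python
def completion(f, g):
    """The completion f (+) g: equal to g on its domain, to f elsewhere."""
    h = dict(f)   # start from f ...
    h.update(g)   # ... and let g override it on its own domain
    return h

f = {"a": 1, "b": 2}
g = {"b": 7, "c": 9}
print(completion(f, g))  # {'a': 1, 'b': 7, 'c': 9}
```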

A two-player turn-based arena is a tuple \(\mathcal{A} = \langle \mathrm{Ps}^{0}, \mathrm{Ps}^{1}, Mv \rangle\), with \(\mathrm{Ps}^{0} \cap \mathrm{Ps}^{1} = \emptyset\) and \(\mathrm{Ps} \triangleq \mathrm{Ps}^{0} \cup \mathrm{Ps}^{1}\), such that \(\langle \mathrm{Ps}, Mv \rangle\) is a finite directed graph. \(\mathrm{Ps}^{0}\) (resp., \(\mathrm{Ps}^{1}\)) is the set of positions of player 0 (resp., 1) and \(Mv \subseteq \mathrm{Ps} \times \mathrm{Ps}\) is a left-total relation describing all possible moves. A path in V \(\subseteq \) Ps is a finite or infinite sequence \(\pi \) of positions in V compatible with the move relation, i.e., \((\pi_{i}, \pi_{i+1}) \in Mv\), for every index i along \(\pi \). For a finite path \(\pi \), with \(\mathrm{lst}(\pi)\) we denote the last position of \(\pi \). A positional strategy for player \(\alpha \in \{ 0, 1 \}\) on V \(\subseteq \) Ps is a partial function \(\sigma_{\alpha}\), mapping each \(\alpha \)-position v in its domain to a position \(\sigma_{\alpha}(v)\) compatible with the move relation, i.e., \((v, \sigma_{\alpha}(v)) \in Mv\). With \(\mathrm{Str}^{\alpha}(\mathrm{V})\) we denote the set of all \(\alpha \)-strategies on V. A play in V \(\subseteq \) Ps from a position \(v \in \mathrm{V}\) w.r.t. a pair of strategies \((\sigma_{0}, \sigma_{1}) \in \mathrm{Str}^{0}(\mathrm{V}) \times \mathrm{Str}^{1}(\mathrm{V})\), called \(((\sigma_{0}, \sigma_{1}), v)\)-play, is a path \(\pi \) such that \(\pi_{0} = v\) and, for all indexes i along \(\pi \), if \(\pi_{i} \in \mathrm{Ps}^{0}\) then \(\pi_{i+1} = \sigma_{0}(\pi_{i})\) else \(\pi_{i+1} = \sigma_{1}(\pi_{i})\). The play function \(\mathrm{play}\) returns, for each position \(v \in \mathrm{V}\) and pair of strategies \((\sigma_{0}, \sigma_{1})\), the maximal \(((\sigma_{0}, \sigma_{1}), v)\)-play.
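The following sketch (again over a dict-based encoding of our own; compatibility of the strategies with the move relation is simply assumed) computes the maximal play induced by a pair of positional strategies from a given position.

```python
def play(owner, strategies, start):
    """Maximal play from `start` under a pair of positional strategies.

    `owner[v]` is the player controlling v and `strategies[p]` maps the
    positions of player p to their chosen successors. Positional strategies
    make the play deterministic: it is infinite (ultimately periodic) as soon
    as a position repeats, and finite when the strategy of the controlling
    player is undefined. Returns the traversed path and an infinity flag.
    """
    path, seen, v = [start], {start}, start
    while True:
        nxt = strategies[owner[v]].get(v)
        if nxt is None:            # maximal finite play
            return path, False
        path.append(nxt)
        if nxt in seen:            # a cycle is closed: the play is infinite
            return path, True
        seen.add(nxt)
        v = nxt

owner = {"a": 0, "b": 1, "c": 0}
sigma0, sigma1 = {"a": "b", "c": "a"}, {"b": "c"}
print(play(owner, {0: sigma0, 1: sigma1}, "a"))  # (['a', 'b', 'c', 'a'], True)
```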

A parity game is a tuple \(\Game = \langle \mathcal{A}, \mathrm{Pr}, \mathrm{pr} \rangle\), where \(\mathcal {A}\) is an arena, \(\mathrm{Pr} \subset \mathbb{N}\) is a finite set of priorities, and \(\mathrm{pr} : \mathrm{Ps} \to \mathrm{Pr}\) is a priority function assigning a priority to each position. We denote with PG the class of parity games. The priority function can be naturally extended to games, sets of positions, and paths as follows: \(\mathrm{pr}(\Game) \triangleq \max_{v \in \mathrm{Ps}} \mathrm{pr}(v)\); for a set of positions V \(\subseteq \) Ps, we set \(\mathrm{pr}(\mathrm{V}) \triangleq \max_{v \in \mathrm{V}} \mathrm{pr}(v)\); for a path \(\pi \), we set \(\mathrm{pr}(\pi) \triangleq \max_{i} \mathrm{pr}(\pi_{i})\), if \(\pi \) is finite, and \(\mathrm{pr}(\pi) \triangleq \limsup_{i} \mathrm{pr}(\pi_{i})\), i.e., the maximal priority occurring infinitely often along \(\pi \), otherwise. A set of positions V \(\subseteq \) Ps is an \(\alpha \)-dominion, with \(\alpha \in \{ 0, 1 \}\), if there exists an \(\alpha \)-strategy \(\sigma_{\alpha} \in \mathrm{Str}^{\alpha}(\mathrm{V})\) such that, for all \(\overline{\alpha }\)-strategies \(\sigma_{\overline{\alpha}} \in \mathrm{Str}^{\overline{\alpha}}(\mathrm{V})\) and positions \(v \in \mathrm{V}\), the induced play \(\pi \) is infinite and \(\mathrm{pr}(\pi) \equiv_{2} \alpha\). In other words, \(\sigma_{\alpha}\) only induces on V infinite plays whose maximal priority visited infinitely often has parity \(\alpha \). The maximal \(\alpha \)-dominion in a game, denoted \(\mathrm{Wn}^{\alpha}\), is called winning region of player \(\alpha \). By \(\Game \setminus \mathrm{V}\) we denote the maximal subgame of \(\Game\) with set of positions contained in \(\mathrm{Ps} \setminus \mathrm{V}\) and move relation equal to the restriction of Mv to those positions.

The \(\alpha \)-predecessor of V, in symbols \(\mathrm{pre}^{\alpha}(\mathrm{V}) \triangleq \{ v \in \mathrm{Ps}^{\alpha} : Mv(v) \cap \mathrm{V} \neq \emptyset \} \cup \{ v \in \mathrm{Ps}^{\overline{\alpha}} : Mv(v) \subseteq \mathrm{V} \}\), where \(Mv(v)\) denotes the set of successors of v, collects the positions from which player \(\alpha \) can force the game to reach some position in V with a single move. The \(\alpha \)-attractor \(\mathrm{atr}^{\alpha}(\mathrm{V})\) generalises the notion of \(\alpha \)-predecessor to an arbitrary number of moves. Thus, it corresponds to the least fix-point of that operator. When \(\mathrm{atr}^{\alpha}(\mathrm{V}) = \mathrm{V}\), we say that V is \(\alpha \)-maximal. Intuitively, V is \(\alpha \)-maximal if player \(\alpha \) cannot force any position outside V to enter this set. For such a V, the set of positions of the subgame \(\Game \setminus \mathrm{V}\) is precisely \(\mathrm{Ps} \setminus \mathrm{V}\). Finally, the \(\alpha \)-escape of V, formally \(\mathrm{esc}^{\alpha}(\mathrm{V}) \triangleq \mathrm{pre}^{\alpha}(\mathrm{Ps} \setminus \mathrm{V}) \cap \mathrm{V}\), contains the positions in V from which \(\alpha \) can leave V in one move. The dual notion of \(\alpha \)-interior, defined as \(\mathrm{int}^{\alpha}(\mathrm{V}) \triangleq (\mathrm{V} \cap \mathrm{Ps}^{\alpha}) \setminus \mathrm{esc}^{\alpha}(\mathrm{V})\), contains the \(\alpha \)-positions from which \(\alpha \) cannot escape with a single move, while the notion of \(\alpha \)-stay, defined as \(\mathrm{stay}^{\alpha}(\mathrm{V}) \triangleq \{ v \in \mathrm{V} \cap \mathrm{Ps}^{\alpha} : Mv(v) \cap \mathrm{V} \neq \emptyset \}\), denotes the \(\alpha \)-positions from which \(\alpha \) has a move to remain in V.
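These operators are easy to implement; the sketch below (on the same dict encoding, with `moves[v]` the set of successors of v, and following the one-step-forcing reading of the escape set given above) is only meant to fix the intuition.

```python
def pre(alpha, V, owner, moves):
    """Positions from which player alpha can force the game into V in one move."""
    V = set(V)
    return {v for v in moves
            if (owner[v] == alpha and moves[v] & V)
            or (owner[v] != alpha and moves[v] <= V)}

def attractor(alpha, V, owner, moves):
    """Least fixpoint of the one-step predecessor operator containing V."""
    A = set(V)
    while True:
        A_next = A | pre(alpha, A, owner, moves)
        if A_next == A:
            return A
        A = A_next

def escape(alpha, V, owner, moves):
    """Positions of V from which alpha can force to leave V in one move."""
    return pre(alpha, set(moves) - set(V), owner, moves) & set(V)

def interior(alpha, V, owner, moves):
    """alpha-positions of V from which alpha cannot leave V in one move."""
    V = set(V)
    return {v for v in V if owner[v] == alpha and moves[v] <= V}

def stay(alpha, V, owner, moves):
    """alpha-positions of V from which alpha has a move remaining in V."""
    V = set(V)
    return {v for v in V if owner[v] == alpha and moves[v] & V}

owner = {"a": 0, "b": 1, "c": 1}
moves = {"a": {"b"}, "b": {"a", "c"}, "c": {"c"}}
print(attractor(0, {"a"}, owner, moves) == {"a"})  # True: b can avoid 'a' via 'c'
```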

3 Quasi Dominion Approach

The priority promotion algorithm proposed in [2] attacks the problem of solving a parity game by computing one of its dominions D, for some player \(\alpha \in \{ 0, 1 \}\), at a time. Indeed, once the \(\alpha \)-attractor D\(^\star \) of D is removed from the game, a smaller game is obtained, whose positions are winning for one player iff they are winning for the same player in the original game. This allows for decomposing the problem of solving a parity game into that of iteratively finding its dominions [14].
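The resulting divide et impera scheme can be sketched as follows (same encoding as above; `search_dominion` is a placeholder for any procedure, such as the one developed in the rest of the paper, returning a player and one of its dominions in the current game).

```python
def attractor(alpha, V, owner, moves):
    """alpha-attractor of V, computed by a simple saturation loop."""
    A = set(V)
    changed = True
    while changed:
        changed = False
        for v in moves:
            if v in A:
                continue
            if (owner[v] == alpha and moves[v] & A) or \
               (owner[v] != alpha and moves[v] <= A):
                A.add(v)
                changed = True
    return A

def solve(owner, priority, moves, search_dominion):
    """Winning regions computed by iterated dominion removal.

    `search_dominion(owner, priority, moves)` is assumed to return a pair
    (alpha, D), where D is a dominion of player alpha in the current game.
    """
    win = {0: set(), 1: set()}
    while moves:
        alpha, D = search_dominion(owner, priority, moves)
        D_star = attractor(alpha, D, owner, moves)   # remove the whole attractor
        win[alpha] |= D_star
        owner = {v: o for v, o in owner.items() if v not in D_star}
        priority = {v: q for v, q in priority.items() if v not in D_star}
        moves = {v: ws - D_star for v, ws in moves.items() if v not in D_star}
    return win
```

Removing the attractor, rather than the dominion alone, keeps the move relation left-total on the residual game, so the recursion is well defined.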

In order to solve the dominion problem, the idea described in [2] is to introduce a much weaker notion than that of dominion, called quasi dominion, which satisfies, under suitable conditions, a composition property that eventually leads to the construction of a dominion. Intuitively, a quasi \(\alpha \)-dominion Q is a set of positions on which player \(\alpha \) has a witness strategy \(\sigma \), whose induced plays either remain inside Q forever and are winning for \(\alpha \) or can exit from Q passing through a specific set of escape positions.

Definition 1

(Quasi Dominion [2]). Let \(\Game\) be a game and \(\alpha \in \{ 0, 1 \}\) a player. A non-empty set of positions \(\mathrm{Q} \subseteq \mathrm{Ps}\) is a quasi \(\alpha \)-dominion in \(\Game\) if there exists an \(\alpha \)-strategy \(\sigma \in \mathrm{Str}^{\alpha}(\mathrm{Q})\), called \(\alpha \)-witness for Q, such that, for all \(\overline{\alpha }\)-strategies \(\overline{\sigma} \in \mathrm{Str}^{\overline{\alpha}}(\mathrm{Q})\), with \(\mathrm{int}^{\overline{\alpha}}(\mathrm{Q}) \subseteq \mathrm{dom}(\overline{\sigma})\), and positions \(v \in \mathrm{Q}\), the induced play \(\pi \) satisfies \(\mathrm{pr}(\pi) \equiv_{2} \alpha\), if \(\pi \) is infinite, and \(\mathrm{lst}(\pi) \in \mathrm{esc}^{\overline{\alpha}}(\mathrm{Q})\), otherwise.

Observe that, if all the plays induced by the witness remain in the set Q forever, this is actually an \(\alpha \)-dominion and, therefore, a subset of the winning region of \(\alpha \), its witness being the projection over Q of some \(\alpha \)-winning strategy on the entire game. In this case, the escape set of Q is empty and Q is said to be \(\alpha \)-closed. In general, however, a quasi \(\alpha \)-dominion Q that is not an \(\alpha \)-dominion, i.e., one whose escape set is not empty, need not be a subset of the winning region of \(\alpha \) and it is called \(\alpha \)-open. Indeed, in this case, some induced play may not satisfy the winning condition for that player once exited from Q, e.g., by visiting a cycle containing a position with maximal priority of parity \(\overline{\alpha }\). The set of triples composed of a player \(\alpha \), a quasi \(\alpha \)-dominion Q, and one of its \(\alpha \)-witnesses \(\sigma \) is denoted by QD and is partitioned into the two sets of open and closed quasi \(\alpha \)-dominion triples.

Similarly to the other divide et impera techniques proposed in the literature, the one reported in [2], called PP, does not make any algorithmic use of the witness strategy associated with a quasi dominion Q, as this notion is only employed in the correctness proof. In this work, instead, we strongly exploit the effective computability of such a witness in order to considerably alleviate the side effects of the reset operation required by PP to ensure the soundness of the approach, which is also responsible for its exponential worst cases. Indeed, that algorithm needs to forget previously computed partial results after each composition of two quasi dominions, since the information computed during the entire process cannot ensure that these results can still be correctly used in the search for a dominion. In particular, we exploit the following simple observation on the witness strategies, formally stated in Lemma 1, to determine which partial results can be safely preserved.

Fig. 1. Witness strategy.

In general, quasi \(\alpha \)-dominions are not closed under restriction. For example, consider the quasi 1-dominion of Fig. 1 with unique 1-witness strategy and its subset . It is quite immediate to see that Q is not a quasi 1-dominion, since , but player 1 does not have a strategy that induces infinite 1-winning plays that remain in Q. Indeed, requires player 1 to exit from Q by going from d to c. On the contrary, the subset is still a 1-dominion with 1-witness strategy , since . Hence, the restriction \(\sigma \) of to is still a well-defined strategy on Q. Inspired by this observation, we provide the following sufficient criterion for a subset Q of a quasi \(\alpha \)-dominion to be still a quasi \(\alpha \)-dominion.

Lemma 1

(Witness Strategy). Given a quasi \(\alpha \)-dominion \(\mathrm{Q}^{\star}\) having \(\sigma^{\star}\) as \(\alpha \)-witness and a subset of positions \(\mathrm{Q} \subseteq \mathrm{Q}^{\star}\), the restriction \(\sigma \) of \(\sigma^{\star}\) to Q is an \(\alpha \)-witness for Q iff \(\sigma(\mathrm{stay}^{\alpha}(\mathrm{Q})) \subseteq \mathrm{Q}\).

The proof idea behind this lemma is very simple. Any infinite play induced by the restriction of \(\sigma^{\star}\) to Q is necessarily winning for player \(\alpha \), since it is coherent with the original \(\alpha \)-witness of \(\mathrm{Q}^{\star}\) as well. Now, if the condition of the lemma holds, we are also sure that any finite play ends in the escape set of Q, as required by the definition of quasi dominion. Therefore, \(\sigma \) is an \(\alpha \)-witness for Q, which is, then, a quasi \(\alpha \)-dominion. On the other hand, if the condition fails, there exists a finite play induced by \(\sigma \) that does not terminate in the escape set. Hence, \(\sigma \) is not an \(\alpha \)-witness. In this case, we cannot ensure that Q is a quasi \(\alpha \)-dominion.
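A direct way to apply this criterion is sketched below; it assumes, consistently with the discussion of Fig. 1 and of the example in Sect. 4, that the condition of the lemma amounts to the restriction of the original witness mapping every alpha-stay position of Q back into Q (the positions, owners, and moves used in the example are made up).

```python
def restriction_is_witness(alpha, Q, sigma_star, owner, moves):
    """Check behind Lemma 1, under our reading of its condition.

    sigma_star is an alpha-witness of a larger quasi alpha-dominion and Q a
    subset of it. The restriction of sigma_star to Q is accepted as an
    alpha-witness for Q when, on every alpha-position of Q that still has a
    move inside Q (the alpha-stay positions), sigma_star remains inside Q.
    """
    Q = set(Q)
    stay = {v for v in Q if owner[v] == alpha and moves[v] & Q}
    return all(sigma_star.get(v) in Q for v in stay)

# In the spirit of Fig. 1: the 1-witness moves d outside {d}, so the
# restriction to {d} is rejected, as in the discussion above.
owner = {"c": 1, "d": 1}
moves = {"c": {"d"}, "d": {"c", "d"}}
sigma_star = {"c": "d", "d": "c"}
print(restriction_is_witness(1, {"d"}, sigma_star, owner, moves))  # False
```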

Algorithm 1. The dominion search procedure.

The priority promotion algorithm explores a partial order, whose elements, called states, record information about the open quasi dominions computed along the way. The initial state of the search is the top element of the order, where the quasi dominions are initialised to the sets of positions with the same priority. At each step, a new quasi \(\alpha \)-dominion Q, together with one of its possible \(\alpha \)-witnesses \(\sigma \), is extracted from the current state, by means of a query operator, and, if Q is open, used to compute a successor state, by means of a successor operator. If, on the other hand, it is closed, the search is over. Algorithm 1 implements this dominion search procedure. A compatibility relation connects the query and the successor operators. The relation holds between states of the partial order and the quasi dominion triples that can be extracted by the query operator. Such a relation defines the domain of the successor operator. The partial order, together with the query and successor operators and the compatibility relation, forms what is called a dominion space.
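The search loop itself is straightforward; the sketch below mirrors the description of Algorithm 1 given above, with the query operator, the successor operator, and the closure test passed in as parameters (their concrete definitions are the subject of the next section).

```python
def search_dominion(initial_state, query, successor, is_closed):
    """Dominion search over a dominion space, as described for Algorithm 1.

    `query(state)` returns a quasi-dominion triple compatible with `state`;
    `is_closed(triple)` tells whether the extracted quasi dominion is closed
    in the whole game; `successor(state, triple)` moves down the partial
    order. Well-foundedness of the order guarantees termination.
    """
    state = initial_state
    while True:
        triple = query(state)          # a quasi alpha-dominion and a witness
        if is_closed(triple):
            return triple              # a dominion of the whole game
        state = successor(state, triple)
```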

Definition 2

(Dominion Space). A dominion space for a game is a tuple , where (1) is a well-founded partial order w.r.t. with distinguished element , (2) is the compatibility relation, (3) is the query operator mapping each element to a quasi dominion triple such that, if then , and (4) is the successor operator mapping each pair to the element with .

The notion of dominion space is quite general and can be instantiated in different ways, by providing specific query and successor operators. In [2], indeed, it is shown that the search procedure is sound and complete on any dominion space \(\mathcal {D}\). In addition, its time complexity is linear in the execution depth of the dominion space, namely the length of the longest chain in the underlying partial order compatible with the successor operator, while its space complexity is only logarithmic in the space size, since only one state at a time needs to be maintained. A specific instantiation of dominion space, called PP dominion space, is the one proposed and studied in [2]. In the next section, we propose a different one, called RR dominion space, which crucially exploits Lemma 1 in order to prevent a considerable amount of useless reset operations after each quasi dominion composition, to the point that it is not obvious whether an exponential lower bound even exists for this new approach.

4 Priority Promotion with Region Recovery

In order to instantiate a dominion space, we need to define a suitable query function to compute quasi dominions and a successor operator to ensure progress in the search for a closed dominion. The priority promotion algorithm proceeds as follows. The input game is processed in descending order of priority. At each step, a subgame of the entire game, obtained by removing the quasi dominions previously computed at higher priorities, is considered. At each priority of parity \(\alpha \), a quasi \(\alpha \)-dominion Q is extracted by the query operator from the current subgame. If Q is closed in the entire game, the search stops and returns Q as result. Otherwise, a successor state in the underlying partial order is computed by the successor operator, depending on whether Q is open in the current subgame or not. In the first case, the quasi \(\alpha \)-dominion is removed from the current subgame and the search restarts on the new subgame, which can only contain positions with lower priorities. In the second case, Q is merged together with some previously computed quasi \(\alpha \)-dominion of higher priority. Since the underlying partial order is well-founded, the search is guaranteed to eventually terminate and return a closed quasi dominion. The procedure requires the solution of two crucial problems: (a) extracting a quasi dominion from a subgame and (b) merging together two quasi \(\alpha \)-dominions to obtain a bigger, possibly closed, quasi \(\alpha \)-dominion.

Fig. 2. Quasi dominions.

Solving problem (b) is not trivial, since quasi \(\alpha \)-dominions are not, in general, closed under union. Consider the example in Fig. 2. Both and are quasi 0-dominions. Indeed, and are the corresponding 0-witnesses. However, their union is not a quasi 0-dominion, since the 1-strategy forces player 0 to lose along any infinite play starting from either a or b.

A solution to both problems relies on the definition of a specific class of quasi dominions, called regions. An \(\alpha \)-region R of a game is a special form of quasi \(\alpha \)-dominion with the additional requirement that all the positions in its \(\overline{\alpha }\)-escape \(\mathrm{esc}^{\overline{\alpha}}(\mathrm{R})\) have the maximal priority p of the game. In this case, we say that the \(\alpha \)-region R has priority p. As a consequence, if the opponent \(\overline{\alpha }\) can escape from the \(\alpha \)-region R, it must visit a position with the highest priority in it, which is of parity \(\alpha \).

Definition 3

(Region [2]). A quasi \(\alpha \)-dominion R is an \(\alpha \)-region in \(\Game\) if \(\mathrm{pr}(\Game) \equiv_{2} \alpha\) and all the positions in \(\mathrm{esc}^{\overline{\alpha}}(\mathrm{R})\) have priority \(\mathrm{pr}(\Game)\), i.e., \(\mathrm{esc}^{\overline{\alpha}}(\mathrm{R}) \subseteq \mathrm{pr}^{-1}(\mathrm{pr}(\Game))\).

Observe that, in any parity game, an \(\alpha \)-region always exists, for some \(\alpha \in \{ 0, 1 \}\). In particular, the set of positions of maximal priority in the game always forms an \(\alpha \)-region, with \(\alpha \) equal to the parity of that maximal priority. In addition, the \(\alpha \)-attractor of an \(\alpha \)-region is always an (\(\alpha \)-maximal) \(\alpha \)-region. A closed \(\alpha \)-region in a game is clearly an \(\alpha \)-dominion in that game. These observations give us an easy and efficient way to extract a quasi dominion from every subgame: collect the \(\alpha \)-attractor of the positions with maximal priority p in the subgame, where \(\alpha = p \bmod 2\), and assign p as the priority of the resulting region R. This priority, called measure of R, intuitively corresponds to an under-approximation of the best priority player \(\alpha \) can force the opponent \(\overline{\alpha }\) to visit along any play exiting from R.
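The extraction step just described can be sketched as follows (same dict encoding as before; the game passed in is the current subgame).

```python
def extract_region(owner, priority, moves):
    """Extract an alpha-maximal region from the current (sub)game.

    Collect the alpha-attractor of the positions carrying the maximal
    priority p, where alpha = p mod 2, and assign p as its measure.
    """
    p = max(priority.values())
    alpha = p % 2
    region = {v for v, q in priority.items() if q == p}
    changed = True
    while changed:                      # alpha-attractor by saturation
        changed = False
        for v in moves:
            if v in region:
                continue
            if (owner[v] == alpha and moves[v] & region) or \
               (owner[v] != alpha and moves[v] <= region):
                region.add(v)
                changed = True
    return alpha, p, region

owner = {"a": 0, "b": 1}
priority = {"a": 2, "b": 1}
moves = {"a": {"b"}, "b": {"a"}}
print(extract_region(owner, priority, moves)[2] == {"a", "b"})  # True
```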

Proposition 1

(Region Extension [2]). Let \(\Game\) be a game and \(\mathrm{R} \subseteq \mathrm{Ps}\) an \(\alpha \)-region in \(\Game\). Then, \(\mathrm{atr}^{\alpha}(\mathrm{R})\) is an \(\alpha \)-maximal \(\alpha \)-region in \(\Game\).

A solution to the second problem, the merging operation, is obtained as follows. Given an \(\alpha \)-region R in some game \(\Game\) and an \(\alpha \)-dominion D in a subgame of \(\Game\) that does not contain R itself, the two sets are merged together if the only moves exiting from the \(\overline{\alpha }\)-positions of D in the entire game lead to higher priority \(\alpha \)-regions and R has the lowest priority among them. The priority of R is called the best escape priority of D for \(\overline{\alpha }\). The correctness of this merging operation is established by the following proposition.

Proposition 2

(Region Merging [2]). Let be a game, an \(\alpha \)-region, and an \(\alpha \)-dominion in the subgame . Then, is an \(\alpha \)-region in . Moreover, if both R and D are \(\alpha \)-maximal in and , respectively, then is \(\alpha \)-maximal in as well.

Fig. 3. Running example.

The merging operation is implemented by promoting all the positions of the \(\alpha \)-dominion D to the measure of R, thus improving the measure of D. For this reason, it is called a priority promotion. In [2] it is shown that, after a promotion to some measure p, the regions with measure lower than p might need to be destroyed, by resetting all the contained positions to their original priority. This necessity derives from the fact that the newly promoted region may attract positions from lower ones, thereby potentially invalidating their status as regions. Indeed, in some cases, the player that wins by remaining in the region may even change from \(\alpha \) to \(\overline{\alpha }\). As a consequence, the reset operation is, in general, unavoidable. The original priority promotion algorithm applies the reset operation to all the lower priority regions. As shown in [2], the reset operation is the main source of the exponential behaviours of the approach. We propose here a different approach that, based on the result of Lemma 1, can drastically reduce the number of resets needed.

Figure 3 illustrates the dominion search procedure on an example game. Diamond-shaped positions belong to player 0 and square-shaped ones to opponent 1. Each cell of the table contains a computed region. A downward arrow denotes that the region is open in the subgame where it is computed, while an upward arrow means that the region gets promoted to the priority in the subscript. The measure of a region corresponds to the index of the row in which the region is contained. Empty slots in the table represent empty regions, while a slot with the symbol \(=\) in it means that it contains the same region as the corresponding slot in the previous column.

Assume the dashed move \(\texttt {(g,k)}\) is not present in the game. Then, following the idea sketched above, the first region obtained is the single-position 0-region {a} at priority 8, which is open because of the two moves leading to e and i. At priority 7, the 1-region \(\texttt {\{j,k\}}\) is formed, by attracting k to j according to Proposition 1; it is open in the subgame where {a} is removed. The procedure proceeds similarly, processing all the priorities down to 1 and extracting the regions reported in the first column of the table of Fig. 3. Those are all open regions in their corresponding subgames, except for the 1-region \(\texttt {\{g,h\}}\) at priority 1, which is closed in its subgame but not in the entire game. This region has a move (\(\texttt {g,f}\)) leading to region 3 and Proposition 2 is then applied, which promotes this region to 3, obtaining a new 1-region \(\texttt {\{e,f,g,h\}}\) with measure 3. This one is again closed in its subgame and, due to move (\(\texttt {h,b}\)), triggers another application of Proposition 2, which promotes all of its positions to region 5 and resets the positions in region 4 to their original priority. The search resumes at priority 5 and the maximization of that region attracts position c as well, forming region \(\texttt {\{b,c,e,f,g,h\}}\) with measure 5. In the resulting subgame, the procedure now extracts the open 1-region {d} at priority 3. The residual game only contains position i, which forms a closed 0-region with a move leading to region 8. This triggers a new promotion that resets the positions of all the regions with measure lower than 8, namely the regions with measures 7 and 5. After maximization of the target region, positions b, c, d, h, and j are all attracted to form the 0-region in the first row of column 4. The reset of previous region 7 releases position k, which now forms an open 0-region of priority 4. Similarly, positions e and f, reset by the last promotion, form an open 1-region at priority 3. Finally, at priority 1 the closed 1-region {g} is extracted and promoted, by move (\(\texttt {g,f}\)), to region 3, forming the set \(\texttt {\{e,f,g\}}\). Since no move from a 0-position leads outside the set, this region is closed in the entire game and a 1-dominion has been found.

During the simulation above, three resets have been performed. The first one resets the 0-region with measure 4 in column 2, after the promotion of region 3 to priority 5. Indeed, the maximization of the resulting region attracts position c, leaving the set {d} with measure 4. However, according to Definition 1, this is not a quasi 0-dominion, since its 1-escape is empty and player 1 can win by remaining in the set forever. After the promotion of region {i} to 8 in column 3, both the regions in rows 7 and 5 get reset. The maximization of the target region of the promotion, i.e., the 0-region with measure 8, attracts positions b, c, d, h, and j. As a consequence, for reasons similar to those described above for position d, the residual position k at priority 7 must be reset to its original priority. Notice that, in both the considered cases, Lemma 1 does not apply. Indeed, the 0-stay set of the residual region {d} is the set itself, and the restriction of the corresponding 0-witness to {d} leads outside {d}; hence, the condition of the lemma is violated. A similar argument applies to the set {k} as well.

As opposed to this case, however, the reset of region 5 can be avoided, thanks to Lemma 1. Indeed, consider a 1-witness \(\sigma \) for that region: the residual set after the promotion and the maximization of the target region 8 is \(\texttt {\{e,f,g\}}\), and the restriction of \(\sigma \) to the 1-stay set of this residual set remains inside it, so the lemma applies. Note that avoiding the reset of the region with measure 5, containing \(\texttt {\{e,f,g\}}\) in column 4, would also avoid the computation of regions 4, 3, and 1, and the promotion of region 1 to 3 that leads to column 5. Indeed, the residual region 5 is a 1-region, according to the lemma, and is also closed in the entire game.

If, however, the dashed move were added to the game, the reset of region 5 would be necessary. The reason is that, in this case, the 0-escape set would contain position g, which can escape to position k. As a consequence, the set would not be a 1-region, as its escape set would contain a position with priority non-maximal in the subgame, contrary to what is required by Definition 3.

In summary, we can exploit Lemma 1 and Definition 3 to avoid resetting regions after a promotion whenever (i) the witness strategy of the residual region satisfies the condition of the lemma and (ii) its escape set only contains positions of maximal priority in the subgame. This is the core observation underlying the RR approach, which is formally defined in the following.

The RR Dominion Space. We can now provide the formal account of the RR dominion space. As for quasi dominions, we consider the set of region triples in a game, partitioned into the sets of open and closed region triples.

Similarly to the PP algorithm, during the search for a dominion, the computed regions, together with their current measures, are kept track of by means of an auxiliary priority function r, called region function. Given a priority p, we denote by \(r^{(\geq p)}\) (resp., \(r^{(> p)}\), \(r^{(< p)}\), and \(r^{(\neq p)}\)) the function obtained by restricting the domain of r to the positions with measure greater than or equal to p (resp., greater than, lower than, and different from p). We also consider the largest subgame obtained by removing from the whole game all the positions in the domain of \(r^{(> p)}\). In order for the RR procedure to exploit Lemma 1, it also needs to keep track of witness strategies of the computed regions. To this end, we introduce the notion of witness core. A strategy \(\sigma \) is an \(\alpha \)-witness core for an \(\alpha \)-region R if (i) it is defined on all the positions of R having priority lower than the priority of the region and (ii) it is a restriction of some \(\alpha \)-witness for R. Intuitively, a witness core only maintains the essential part of a witness and can be easily transformed into a complete witness by associating every remaining position v with an arbitrary successor in R. The result of any such completion is an actual witness, since any infinite path passing through v is forced to visit a maximal priority of parity \(\alpha \).
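The bookkeeping on the region function can be sketched as follows (r is a dict from positions to measures; the subgame construction follows the reading given above, i.e., positions with measure greater than p are removed).

```python
def restrict(r, p, rel):
    """Restriction of the region function r to measures comparing to p as `rel`."""
    tests = {">=": lambda q: q >= p, ">": lambda q: q > p,
             "<": lambda q: q < p, "!=": lambda q: q != p}
    return {v: q for v, q in r.items() if tests[rel](q)}

def subgame_at(owner, priority, moves, r, p):
    """Largest subgame obtained by removing the positions of measure > p."""
    removed = set(restrict(r, p, ">"))
    keep = set(moves) - removed
    return ({v: owner[v] for v in keep},
            {v: priority[v] for v in keep},
            {v: moves[v] & keep for v in keep})

r = {"a": 8, "b": 5, "c": 5, "d": 3}
print(restrict(r, 5, ">="))  # {'a': 8, 'b': 5, 'c': 5}
```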

Definition 4

(Region-Witness Pair). Let be a priority function, a strategy, and a priority. The pair is a region-witness pair w.r.t. p if, for all with , and , the following two conditions hold:

  1. 1.

    if , then R is an \(\alpha \)-region in the subgame with \(\alpha \)-witness core \(\sigma \);

  2. 2.

    if , there exists a quasi \(\alpha \)-dominion with \(\alpha \)-witness such that (i) , (ii) , and (iii) .

In addition, is maximal above iff, whenever , it holds that R is \(\alpha \)-maximal in as well.

As opposed to the PP approach, where a promotion to a priority p resets all the regions of measure lower than p, the RR algorithm resets lower regions only when it cannot ensure their validity. This is done one region at a time, during the descent phase. If, while processing the set recorded at a certain priority q, the conditions of Lemma 1 are not met or the escape of that region contains positions of priority lower than q, then the set is reset.

Contrary to PP, for which the set contained in r at each measure q must be an \(\alpha \)-region, RR requires such a property only for the regions with measure \(q \ge p\), as expressed by Item 1 of the previous definition. For each \(q < p\), instead, we simply require the set of positions contained in r at that measure to be a subset of some previously computed quasi dominion of the same player. This is done by requiring that the strategies recorded in \(\tau \) be subsets of witnesses of these dominions, as described in Item 2. In this way, to verify that such a set is still a quasi \(\alpha \)-dominion, RR can apply the property stated in Lemma 1.

The status of the search of a dominion is encoded by the notion of state s of the dominion space, which contains the current region-witness pair \((r, \tau)\) and the current priority p reached by the search. Initially, r coincides with the priority function pr of the entire game, \(\tau \) is the empty strategy, and p is set to the maximal priority available in the game. To each such state s, we then associate the subgame at s, obtained by removing from the whole game all the positions recorded in r with measure greater than p, which represents the portion of the original game that still has to be processed.

The following state space specifies the configurations in which the RR procedure can reside and the relative order that the successor function must satisfy.

Definition 5

(State Space). A state space is a tuple , where:

  1. 1.

    is the set of triples , called states, where (a)  is a region-witness pair w.r.t. p, (b) r is maximal above p, and (c) .

  2. 2.

    ;

  3. 3.

    for any two states , it holds that iff either (a) there exists a priority with such that (a.i)  and (a.ii) , or (b) both (b.i)  and (b.ii)  hold.

Condition 1 requires every region with measure \(q > p\) to be \(\alpha \)-maximal, where \(\alpha = q \bmod 2\). Moreover, the current priority p must be one of the measures recorded in r. Condition 2 specifies the initial state. Finally, Condition 3 defines the ordering relation among states, which the successor operation has to comply with. It asserts that a state is strictly smaller than another one if either it records, at some measure q, a region that strictly contains the corresponding one of the other state, while all the regions with measure greater than q are equal in the two states, or it is currently processing a lower priority than the other state.

As reported in Definition 2, the compatibility relation describes which regions are compatible with a state, i.e., which region triples can be returned by the query operator and used by the successor function. A region triple is compatible with a state if R is an \(\alpha \)-region in the current subgame. Moreover, if such a region is \(\alpha \)-open in that subgame, it has to be \(\alpha \)-maximal and must contain the current region of priority p in r.

Definition 6

(Compatibility Relation). An open quasi dominion triple is compatible with a state s iff (1) R is an \(\alpha \)-region in the subgame at s and (2) if R is \(\alpha \)-open in that subgame then (2.a) R is \(\alpha \)-maximal in it and (2.b) R contains the set of positions recorded in r with measure p.

Algorithm 2. The query function.

Algorithm 2 provides a possible implementation of the query function compatible with the region-recovery mechanism. Given the current state s, Line 1 simply computes the parity \(\alpha \) of the priority p to process at s. Line 3, instead, computes the attractor w.r.t. player \(\alpha \), in the subgame at s, of the region contained in r at priority p, as determined by Line 2. Observe that here we employ a version of the \(\alpha \)-attractor that, given an \(\alpha \)-witness core for that region, also computes an \(\alpha \)-witness for the result. This can easily be done by first extending the witness core with the attraction strategy on the attracted \(\alpha \)-positions and, then, by choosing, for any \(\alpha \)-position still lacking a move and having a successor in R, any one of those successors. The resulting set R is, according to Proposition 1, an \(\alpha \)-maximal \(\alpha \)-region of the subgame at s containing the positions of measure p, with \(\alpha \)-witness \(\sigma \).

Algorithm 3. The successor function.

The promotion operation is based on the notion of best escape priority mentioned above, namely the priority of the lowest \(\alpha \)-region in r that has an incident move coming from the \(\alpha \)-region, closed in the current subgame, that needs to be promoted. This concept is formally defined as follows. Let I be the interface relation between R and r, i.e., the set of \(\overline{\alpha }\)-moves exiting from R and reaching some position within a region recorded in r. Then, the best escape priority of R for \(\overline{\alpha }\) is the minimal measure of those regions that contain positions reachable by a move in I. Such a value represents the best priority associated with an \(\alpha \)-region contained in r and reachable by \(\overline{\alpha }\) when escaping from R. Note that, if R is a closed \(\alpha \)-region in the current subgame, then its best escape priority is necessarily of parity \(\alpha \) and greater than the measure p of R. This property immediately follows from the maximality of r above p. Indeed, no move of an \(\overline{\alpha }\)-position can lead to an \(\overline{\alpha }\)-maximal \(\overline{\alpha }\)-region. For instance, for the 1-region \(\texttt {\{g,h\}}\) with measure 1 in Column 1 of Fig. 3, the interface relation contains the moves (\(\texttt {g,f}\)) and (\(\texttt {h,b}\)), leading to regions 3 and 5, respectively. Hence, the best escape priority is 3.
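The computation of the best escape priority is a simple scan of the exiting opponent moves; in the sketch below, the owners and moves of the example are reconstructed from the description above only for illustration.

```python
def best_escape_priority(alpha, R, r, owner, moves):
    """Best escape priority of the alpha-region R for the opponent of alpha.

    Collect the moves of opponent positions of R that exit R and land on a
    position recorded in the region function r, and return the minimal
    measure among the regions hit by such moves.
    """
    R = set(R)
    interface = {(u, v) for u in R if owner[u] != alpha
                 for v in moves[u] if v not in R and v in r}
    return min(r[v] for _, v in interface)

# Column 1 of Fig. 3: the 1-region {g, h} exits through (g, f) and (h, b),
# reaching the regions with measures 3 and 5, so its best escape priority is 3.
owner = {"g": 0, "h": 0}
moves = {"g": {"f", "h"}, "h": {"b", "g"}}
r = {"f": 3, "b": 5}
print(best_escape_priority(1, {"g", "h"}, r, owner, moves))  # 3
```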

Algorithm 3 implements the successor function. Given the current state s and one of its compatible region triples, open in the original game, it produces a successor state. Line 1 checks whether R is open in the subgame at s as well. If this is the case, at Line 2, the next state is generated by the auxiliary next state function described below, which also applies the required resets. On the other hand, if R is closed in that subgame, the procedure performs the promotion of R, by exploiting Proposition 2. Indeed, Line 3 computes the best escape priority \(p^\star \) to which R needs to be promoted, while Line 4 sets the measure of R to \(p^\star \) and merges the strategies contained in \(\tau \) with the \(\alpha \)-witness \(\sigma \) of R. Observe that, unlike in the PP successor function, no reset operation is applied to r at this stage.
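Under the same conventions, the successor step can be sketched as follows; `is_open_in_subgame`, `next_state`, and `bep` stand for the corresponding notions of the paper and are assumptions of this sketch, as is the representation of a state as a triple (r, tau, p).

```python
def successor(state, triple, is_open_in_subgame, next_state, bep):
    """Successor operator in the style of Algorithm 3, as described above."""
    r, tau, p = state
    alpha, R, sigma = triple
    if is_open_in_subgame(state, triple):
        return next_state(state)                 # descend, possibly resetting
    p_star = bep(state, triple)                  # best escape priority of R
    r = {**r, **{v: p_star for v in R}}          # promote R to measure p_star
    tau = {**tau, **sigma}                       # merge the witness of R into tau
    return (r, tau, p_star)                      # no reset is applied here
```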

Algorithm 4. The next state function.

Finally, Algorithm 4 reports the pseudo-code of the next state function, the essential core of the RR approach. At a state s, Line 1 computes the priority \(p^\star \) of the next set of positions occurring in r below p in descending order of priority. Then, Line 2 verifies whether this set is actually a region, by computing the truth value of the formula \(\phi \) described below on the corresponding triple. If this is the case, the successor state of s is returned at Line 3. On the other hand, if the check fails, the algorithm resets, at Line 4, the positions of that set to their original priority stored in the priority function pr of the game, and deletes, at Line 5, the associated strategy contained in \(\tau \). Finally, at Line 6, the next state function is recursively applied to the newly obtained state.

To check whether the set of positions R at a certain priority \(q < p\) is an \(\alpha \)-region, with \(\alpha = q \bmod 2\), we make use of the formula \(\phi \), which verifies that (i) the associated strategy \(\sigma \) is a witness core for R and (ii) the escape of R only contains positions of maximal priority in the subgame.

Intuitively, if the first predicate holds, we are sure that the restriction of \(\sigma \) to the \(\alpha \)-stay positions of R does not lead outside R. Moreover, due to Item 2 of Definition 4, R is a subset of a quasi \(\alpha \)-dominion having a witness containing \(\sigma \). Therefore, by Lemma 1, we immediately derive that \(\sigma \) is a witness core for R. Additionally, the second predicate just checks that the second condition of the definition of region is also met.
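Putting the two checks together, the recovery test applied during the descent can be sketched as follows (the decomposition follows the prose description rather than the omitted formal definitions; `priority_in_subgame` gives the priorities of the positions of the current subgame).

```python
def can_recover(alpha, R, sigma, owner, priority_in_subgame, moves):
    """Recovery check: keep R as an alpha-region instead of resetting it.

    (i) The recorded strategy sigma must not leave R from the alpha-stay
        positions of R, which is the condition exploited through Lemma 1.
    (ii) The positions of R from which the opponent can force to exit R must
         all carry the maximal priority of the current subgame (Definition 3).
    """
    R = set(R)
    # (i) Lemma 1 condition on the recorded strategy
    stay = {v for v in R if owner[v] == alpha and moves[v] & R}
    if not all(sigma.get(v) in R for v in stay):
        return False
    # (ii) opponent escape positions must have the maximal subgame priority
    esc = {v for v in R
           if (owner[v] != alpha and moves[v] - R)
           or (owner[v] == alpha and not (moves[v] & R))}
    top = max(priority_in_subgame.values())
    return all(priority_in_subgame[v] == top for v in esc)
```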

The following theorem establishes the correctness of the RR approach.

Theorem 1

(Dominion Space). For a game \(\Game\), the structure whose state space is the one given in Definition 5, whose compatibility relation is the one of Definition 6, and whose query and successor operators are the functions computed by Algorithms 2 and 3, respectively, is a dominion space.

The RR procedure drastically reduces the number of resets needed to solve a game w.r.t. PP. In particular, the exponential worst-case family presented in [2] does not work any more, since the execution depth of the associated RR dominion space is only quadratic in the parameter of the game family. Unfortunately, at the present time, we are not able to provide an asymptotic upper bound for the time complexity better than the one known for PP.

5 Experimental Evaluation

The technique proposed in the paper has been implemented in the tool PGSolver [9], which collects implementations of several parity game solvers proposed in the literature and provides benchmarking tools that can be used to evaluate solver performance.

Fig. 4. Comparative results on 2000 random games with up to 20000 positions (from [2]).

Figure 4 compares the running times of the new algorithm RR against the original version PP and the well-known solvers Rec and Str, implementing the recursive algorithm [23] and the strategy improvement technique [22], respectively. This first pool of benchmarks is taken from [2] and involves 2000 random games of size ranging from 1000 to 20000 positions and 2 outgoing moves per position. Interestingly, random games with very few moves prove to be much more challenging for the priority promotion based approaches than those with a higher number of moves per position, and often require a much higher number of promotions. Since the behaviour of the solvers is typically highly variable, even on games of the same size and priorities, to summarise the results we took the average running time on clusters of games.

Therefore, each point in the graph shows the average time over a cluster of 100 different games of the same size: for each size value n, we chose 10 different numbers k of priorities, and 10 random games were generated for each pair of n and k. We set a time-out of 180 s (3 min). The new solver RR shows a significant improvement on all the benchmarks. All the other solvers provided in PGSolver, including the Dominion Decomposition [14] and the Big Step [19] algorithms, perform quite poorly on those games, hitting the time-out already for very small instances. Figure 4 shows only the best performing ones on the considered games, namely Rec and Str.

Fig. 5. Comparison between PP and RR on random games with 50000 positions on a logarithmic scale.

Similar experiments were also conducted on random games with a higher number of moves per position and up to 100000 positions. The resulting games turn out to be very easy to solve by all the priority promotion based approaches. The reason seems to be that the higher number of moves significantly increases the dimension of the computed regions and, consequently, also the chances to find a closed one. Indeed, the number of promotions required by PP and RR on all those games is typically zero, and the whole solution time is due exclusively to a very limited number of attractors needed to compute the few regions contained in the games. We reserve the presentation of the results for the extended version.

To further stress the RR technique in comparison with PP, we also generated a second pool of much harder benchmarks, containing more than 500 games, each with 50000 positions, 12000 priorities, and 2 moves per position. We selected as benchmarks only random games whose solution requires PP between 30 and 6000 s. The results comparing PP and RR are reported in Fig. 5 on a logarithmic scale. The figure shows that in three cases PP performs better than RR. This is due to the fact that the two algorithms may follow different solution paths within the dominion space and that following the new technique may, in some cases, defer the discovery of a closed dominion. Nonetheless, the RR algorithm does pay off significantly on the vast majority of the benchmarks, often solving a game between two and sixteen times faster than PP.

In [2] it is shown that PP solves all the known exponential worst cases for the other solvers without promotions and, clearly, the same holds for RR as well. As a consequence, RR only requires polynomial time on those games and the experimental results coincide with the ones for PP.