1 Introduction

A team of intelligent agents acting in a shared environment has to coordinate its steps in order to achieve common goals. Multiagent planning with a deterministic model describes such problems and proposes techniques to solve them. When the agents are obliged to keep information about their individual abilities private, they are not allowed to share it with other agents. Consider an application domain in which several logistic companies have to cooperate to fulfill complex transportation tasks that none of the companies can manage on its own. Although the companies have to cooperate, they still need to keep their know-how secret, be it their modes of transportation or their local routes. In such a situation, the companies, represented by planning agents, would not benefit from competitive behavior because the objective is common to all of them, but they still have to keep parts of their knowledge private because of local competition.

The concept of private knowledge is not new in multiagent planning: it factors a planning problem and can thus positively affect the complexity of the planning process [6]. Nevertheless, multiagent planners usually do not target this particular facet of the problem. Some multiagent state-space search algorithms avoid it by obfuscating or aggregating private information in interconnecting states. In contrast, our coordination-centric approach completely preserves private knowledge, as only public projections of agent plans are exchanged. Moreover, our algorithm requires less communication among the agents than state-space search algorithms.

The most widely used multiagent planning model, MA-Strips [6], prescribes which information has to stay public. Although the original motivation was not to jeopardize completeness of the planning process, required for complexity guarantees, most planners in the literature stick to this particular definition. The most notable exception is the FMAP planner [33], which allows the user to mark public information in the planning problem description. Besides the representation of local plans as totally or partially ordered sequences of actions, a compact representation of sets of local plans based on various types of finite-state machines was proposed in [12] and in our recent work [36]. In [35], we proposed the notions of external actions and public plan extensibility. When planning with external actions, agents are informed about the public actions of other agents and are therefore able to plan actions for other agents. However, external actions are stripped of private information, and thus an agent may plan an external action inappropriately. The notion of extensibility allows us to recognize plans where external actions are used correctly. In [36], we used extensibility with finite-state machines to outline a generic scheme of multiagent planners, which is further elaborated in this work. In [17], we improved an extensibility-based planner with a type system checker for process calculi, used to efficiently approximate plan extensibility.

In this article, we present an extensibility-based multiagent planning algorithm which utilizes finite-state machines to compactly represent sets of plans. We call this representation planning state machines (PSMs). Although the main idea of PSMs was briefly sketched previously [36], the formal development, including proofs of soundness and completeness, is first presented here. PSMs not only allow us to compactly represent (even infinite) sets of plans by a finite structure but, more importantly, allow us to effectively implement operations crucial for our planning algorithms. Furthermore, we combine a previous extensibility approximation [17] with a distributed relaxation heuristic used, for example, in the MADLA planner [32]. This gives rise to a multiagent planner outperforming the state-of-the-art FMAP planner.

We use classical multiagent benchmark domains from the literature to evaluate our planner. We provide a comprehensive description of the domains, and we compare our experimental results with those of other state-of-the-art planners. Finally, we analyze the benchmark domains from the point of view of different privacy classifications, which differ in the facts explicitly revealed to other agents. While other planners are usually designed with a fixed privacy classification in mind, we show that our planner can be easily adjusted to work with various privacy classifications. Hence, we give the user the freedom to choose what is public, and we can directly compare our planner with other planners. Furthermore, we show that restricting public knowledge can even improve planner performance.

2 Related work

In classical planning, plan synthesis is typically a systematic search in the space of states, the space of plans, or a combination of both. The first multiagent planner for MA-Strips, called Planning First [28], used a global plan-space search combined with local forward state-space search. FMAP [34] is a representative multiagent partial-order planner using heuristic search to locally improve efficiency and a global distributed search in the space of partial plans. Representative coordination-centric planners are distributed planning by graph merging (DPGM) [11, 30], best-response planning [21], and \(\mu \)-SATPLAN [10]. The Distoplan planner [12] and the subsequent A# planner [20] pioneered the idea of planning by means of finite-state machines (FSMs); however, the A# planner was evaluated on only one planning domain. Our approach represents agent plans as FSMs as well. Additionally, we use intersection of FSMs, effectively acting as a filter for infeasible combinations of plans of different agents. Moreover, our motivation was to provide a practical planning system; thus, we aimed at an efficient implementation and a thorough experimental evaluation of our planner.

Distributed MA-Strips multiagent planners in the literature can be roughly separated into three groups by the way they preserve privacy. Most planners follow the concept of privacy by information obfuscation [4, 27] or by information aggregation [32, 33]. With information obfuscation, agents are allowed to communicate private information to other agents as long as the information is obfuscated such that only the owning agent can understand it (e.g., the name of an action is replaced by a hash code). With information aggregation, the information is aggregated such that only the owning agent knows all the details (e.g., a summed-up cost of private actions can be sent to other agents). An exception is the GPPP planner [26], which provides full privacy by communicating only public information. Our approach also provides full privacy. In contrast to the obfuscation principle in particular, we can reduce the size of the plan space because privacy preservation acts as a natural abstraction of the problem from the perspective of each agent. Our approach thus also reduces the amount of data that has to be communicated.

Multiagent planning can also be seen as a specific form of factored planning [1]. Factored planning tries to decompose a planning problem into possibly independent subproblems. Solving these subproblems scales linearly with the size of the domain and, in the worst case, exponentially only with the size of the largest subproblem and the interactions among subproblems. Obviously, the catch is that not all planning problems can be factored well enough to benefit from such an efficiency gain. In [7], causal graphs [2] of the planning domains are used to identify when factorization is computationally beneficial. A practical algorithm based on this result and on the principle of decomposition trees [9] was proposed in [22]. This principle can also be seen as a variation on localized planning [24]. The difference between multiagent planning and factored planning is that in multiagent planning the factorization is fixed and given by agent abilities.

3 Multiagent planning

As in classical planning, we assume a planning model based on an extension of Strips [13] that compactly represents a deterministic transition system. The multiagent extension follows the principles proposed in MA-Strips by Brafman and Domshlak [6]. Agent capabilities are described by a finite repertoire of the agent’s Strips actions. The actions of an agent may affect only part of the environment, thus inducing a local planning problem of that agent. Aptly, this (partial) “separation of concerns” among the agents keeps private information local. It also helps to increase the efficiency of the planning process by hiding the parts irrelevant to other agents.

The agents are cooperative and coordinated, and they concurrently plan and execute their local plans in order to achieve a joint goal. The environment wherein the agents act is classical, with deterministic actions. The following formal preliminaries restate the MA-Strips problem [6] and define local planning problems and plan extensibility [35], required in the following sections.

3.1 Planning problem

An MA-Strips planning problem \(\varPi \) is a quadruple \(\varPi =\langle P,\{\alpha _i\}^n_{i=1},I,G\rangle \), where \(P\) is a set of facts, \(\alpha _i\) is the set of actions of ith agent, \(I\subseteq P\) is an initial state, and \(G\subseteq P\) is a set of goal facts. Selector functions \(\textsf {facts} _{}(\varPi ), \textsf {agents} (\varPi ), \textsf {init} _{}(\varPi )\), and \(\textsf {goal} (\varPi )\) are defined so that \( \varPi = \langle \textsf {facts} _{}(\varPi ),\textsf {agents} (\varPi ),\textsf {init} _{}(\varPi ),\textsf {goal} (\varPi ) \rangle \) holds for any problem \(\varPi \).

An action an agent can perform is a quadruple containing a unique action id and three subsets of \(\textsf {facts} _{}(\varPi )\) which, in turn, denote the set of preconditions, the set of add effects, and the set of delete effects. Action ids are arbitrary atomic objects, and we always consider ids to be unique within a given problem. Selector functions \(\textsf {id} (a), \textsf {pre} (a), \textsf {add} (a)\), and \(\textsf {del} (a)\) are defined so that \( a=\langle \textsf {id} (a), \textsf {pre} (a),\textsf {add} (a),\textsf {del} (a) \rangle \) holds for any action \(a\). Moreover, let \(\textsf {eff} (a) = \textsf {add} (a)\cup \textsf {del} (a)\).

An agent is identified with its capabilities; in other words, the ith agent \(\alpha _i=\{a_{1},\ldots ,a_{m}\}\) is determined by a finite set of actions it can perform in the environment. We use metavariable \(\alpha \) to range over agents from \(\varPi \). A planning state s is a finite set of facts, and we say that fact p holds in s, or that p is valid in s, iff \(p\in s\). When \(\textsf {pre} (a)\subseteq s\), the state progression function \(\gamma \) is defined classically as \(\gamma (s,a) = (s \setminus \textsf {del} (a))\cup \textsf {add} (a)\).
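As a concrete reading of these definitions, the following minimal Python sketch represents actions and implements the state progression function \(\gamma \); the field name delete is used because del is a reserved word, and all names here are illustrative choices rather than part of the formal model.

```python
from collections import namedtuple

# An action <id, pre, add, del>; facts and states are frozensets of strings.
Action = namedtuple("Action", "id pre add delete")

def gamma(state, a):
    """State progression: defined only when the preconditions hold in `state`."""
    if not a.pre <= state:
        raise ValueError(f"preconditions of {a.id} do not hold")
    return (state - a.delete) | a.add

# Flying the plane of the running example from prague to brno.
fly_pb = Action("fly(prague,brno)",
                pre=frozenset({"at(plane,prague)"}),
                add=frozenset({"at(plane,brno)"}),
                delete=frozenset({"at(plane,prague)"}))
s = gamma(frozenset({"at(plane,prague)", "in(crown,prague)"}), fly_pb)
assert s == frozenset({"at(plane,brno)", "in(crown,prague)"})
```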

Example 1

Throughout the paper, we shall use the following running example concerning a small logistic company. The company owns two transport vehicles (\(\texttt {plane} \) and \(\texttt {truck} \)) and operates three locations (\(\texttt {prague} , \texttt {brno} \), and \(\texttt {ostrava} \)). A \(\texttt {plane} \) can travel from \(\texttt {prague} \) to \(\texttt {brno} \) and back, while a \(\texttt {truck} \) provides connection between \(\texttt {brno} \) and \(\texttt {ostrava} \). The company receives a delivery job to transport the \(\texttt {crown} \) from \(\texttt {prague} \) to \(\texttt {ostrava} \). The company manager needs to plan tasks for the vehicle operators so that the delivery job is done.

This delivery problem can be expressed using MA-Strips as follows. Actions \(\texttt {fly} ({{\texttt {\textit{loc}}}}_1,{{\texttt {\textit{loc}}}}_2)\) and \(\texttt {drive} ({{\texttt {\textit{loc}}}}_1,{{\texttt {\textit{loc}}}}_2)\) describe movement of \(\texttt {plane} \) and \(\texttt {truck} \), respectively. Actions \(\texttt {load} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}})\) and \(\texttt {unload} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}})\) describe loading and unloading of \(\texttt {crown} \) by a given vehicle at a given location.

We define two agents \({{\texttt {\textit{Plane}}}}\) and \({{\texttt {\textit{Truck}}}}\). The agents are defined by sets of executable actions as follows.

$$\begin{aligned} \begin{array}{lll} {{\texttt {\textit{Plane}}}}= \{ &{} \texttt {fly} (\texttt {prague} ,\texttt {brno} ), \texttt {load} (\texttt {plane} ,\texttt {prague} ), \texttt {unload} (\texttt {plane} ,\texttt {prague} ), \\ &{} \texttt {fly} (\texttt {brno} ,\texttt {prague} ), \texttt {load} (\texttt {plane} ,\texttt {brno} ), \texttt {unload} (\texttt {plane} ,\texttt {brno} ) \ \ \} \\ {{\texttt {\textit{Truck}}}}= \{ &{} \texttt {drive} (\texttt {brno} ,\texttt {ostrava} ), \texttt {load} (\texttt {truck} ,\texttt {brno} ), \texttt {unload} (\texttt {truck} ,\texttt {brno} ), \\ &{} \texttt {drive} (\texttt {ostrava} ,\texttt {brno} ), \texttt {load} (\texttt {truck} ,\texttt {ostrava} ), \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \ \} \end{array} \end{aligned}$$

The aforementioned actions are defined using facts \(\texttt {at} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}})\) to describe possible vehicle locations and facts \(\texttt {in} (\texttt {crown} ,{{\texttt {\textit{loc}}}})\) and \(\texttt {in} (\texttt {crown} ,{{\texttt {\textit{veh}}}})\) to describe the position of \(\texttt {crown} \). We omit action ids in examples when no confusion can arise. For example, we have the following.

$$\begin{aligned} \begin{array}{ll} \texttt {fly} ({{\texttt {\textit{loc}}}}_1,{{\texttt {\textit{loc}}}}_2) = &{} \langle \{ \texttt {at} (\texttt {plane} ,{{\texttt {\textit{loc}}}}_1) \}, \{ \texttt {at} (\texttt {plane} ,{{\texttt {\textit{loc}}}}_2) \}, \{ \texttt {at} (\texttt {plane} ,{{\texttt {\textit{loc}}}}_1) \} \rangle \\ \texttt {load} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}}) = &{} \langle \{ \texttt {at} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}}), \texttt {in} (\texttt {crown} ,{{\texttt {\textit{loc}}}}) \}, \{ \texttt {in} (\texttt {crown} ,{{\texttt {\textit{veh}}}}) \}, \{ \texttt {in} (\texttt {crown} ,{{\texttt {\textit{loc}}}}) \} \rangle \end{array} \end{aligned}$$

The initial state and the goal are given as follows.

$$\begin{aligned} \begin{array}{l} I= \{ \texttt {at} (\texttt {plane} ,\texttt {prague} ), \texttt {at} (\texttt {truck} ,\texttt {brno} ), \texttt {in} (\texttt {crown} ,\texttt {prague} ) \} \\ G= \{ \texttt {in} (\texttt {crown} ,\texttt {ostrava} ) \} \end{array} \end{aligned}$$

The goal reflects the delivery requirement. \(\square \)

3.2 Privacy classification of facts and actions

In MA-Strips multiagent planning, each fact is classified either as public or as internal, out of computational or privacy concerns. MA-Strips specifies this classification as follows. A fact is public when it is mentioned by actions of at least two different agents. A fact is internal for agent \(\alpha \) when it is not public but is mentioned by some action of \(\alpha \). A fact is relevant for \(\alpha \) when it is either public or internal for \(\alpha \). Relevant facts are all the facts which agent \(\alpha \) needs to understand; the remaining facts are internal for other agents and thus do not directly concern \(\alpha \). Formal definitions and the notation used throughout the paper are presented in the upper part of Fig. 1.

Fig. 1 MA-Strips privacy classification of facts and actions

It is possible to extend the set of public facts to additionally contain some facts that would be internal by the above definition. This is important for our experimental evaluation because some multiagent planners use a different fact classification. It is an advantage of our planner that it can be used with different fact classifications: (1) we give the user the freedom to choose what is public, and (2) we can directly compare our planner with planners that use different classifications. The only requirement of our planner is that every fact shared by at least two agents is public. Furthermore, it is common in the literature [27] to require that all the goals are public. An MA-Strips problem with internal goals can be easily transformed to an equivalent problem without internal goals (see Sect. 7.3), and thus we omit internal goals in the formal presentation. Then \(\textsf {pub-facts} (\varPi )\) is defined as the minimal superset of the intersection from the definition that satisfies \(G\subseteq \textsf {pub-facts} (\varPi )\). In the rest of this paper, we suppose \(G\subseteq \textsf {pub-facts} (\varPi )\) and also adopt another simplification common in the literature [6], namely that the action sets \(\alpha _i\) are pairwise disjoint.

Example 2

In our running example, the only fact shared by the two agents is \(\texttt {in} (\texttt {crown} ,\texttt {brno} )\). As we require \(G\subseteq \textsf {pub-facts} (\varPi )\), we have the following fact classification.

$$\begin{aligned} \begin{array}{rl} \textsf {pub-facts} (\varPi )= \{ &{} \texttt {in} (\texttt {crown} ,\texttt {brno} ), \texttt {in} (\texttt {crown} ,\texttt {ostrava} ) \} \\ \textsf {int-facts} ({{\texttt {\textit{Plane}}}}) \!=\! \{ &{} \texttt {at} (\texttt {plane} ,\texttt {prague} ), \texttt {at} (\texttt {plane} ,\texttt {brno} ), \texttt {in} (\texttt {crown} ,\texttt {prague} ), \texttt {in} (\texttt {crown} ,\texttt {plane} ) \} \end{array} \end{aligned}$$

\(\square \)

MA-Strips further extends this classification from facts to actions as follows. An action is public when it has a public effect; otherwise, it is internal. Strictly speaking, MA-Strips defines an action as public whenever it mentions a public fact, even if only in a precondition (i.e., when \(\textsf {facts} _{}(a)\cap \textsf {pub-facts} (\varPi )\ne \emptyset \)). However, our method of multiagent planning does not rely on synchronization of public preconditions, and hence we can allow actions with only public preconditions to be internal. For our planner, it is enough to know that internal actions do not modify the public state. Formal definitions and notations are presented in the lower part of Fig. 1.

3.3 Local planning problems

In MA-Strips problems, agent actions manipulate a shared global state when executed. In multiagent planning with external actions, a local planning problem is constructed for every agent \(\alpha \). The local planning problem of \(\alpha \) is a classical Strips problem containing \(\alpha \)’s own actions together with information about the public actions of other agents. These local planning problems allow us to divide an MA-Strips problem into several Strips problems which can be solved separately by a classical planner. This paper describes a way to find a solution of an MA-Strips problem; it does not address the question of executing a plan in some real-world environment.

The projection \({F}\mathop {\triangleright }{\alpha }\) of a set of facts F to agent \(\alpha \) is the restriction of F to the facts relevant for \(\alpha \). Hence, the projection removes from F the facts not relevant for \(\alpha \), and thus it represents F as understood by agent \(\alpha \). The projection \({a}\mathop {\triangleright }{\alpha }\) of action a to agent \(\alpha \) removes from \(a\) the facts not relevant for \(\alpha \), again representing a as seen by \(\alpha \). The projections are formally defined as follows.

Definition 1

Given \(\varPi \), let F be an arbitrary set \(F\subseteq \textsf {facts} _{}(\varPi )\) of facts and let \(a\) be an action from \(\varPi \). The projection \({F}\mathop {\triangleright }{\alpha }\) of F to \(\alpha \in \textsf {agents} (\varPi )\) and the projection \({a}\mathop {\triangleright }{\alpha }\) of action a to \(\alpha \) are defined as follows.

$$\begin{aligned} {F}\mathop {\triangleright }{\alpha } = F\cap \textsf {rel-facts} (\alpha )\qquad {a}\mathop {\triangleright }{\alpha } = \langle \textsf {id} (a), {\textsf {pre} (a)}\mathop {\triangleright }{\alpha },\ {\textsf {add} (a)}\mathop {\triangleright }{\alpha },\ {\textsf {del} (a)}\mathop {\triangleright }{\alpha } \rangle \end{aligned}$$

The action projection is extended to sets of actions element-wise. \(\square \)

Note that \({a}\mathop {\triangleright }{\alpha }=a\) when \(a\in \alpha \). Hence, projection to \(\alpha \) alters only actions of agents other than \(\alpha \). Also note that action ids are preserved under projection.
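Definition 1 translates into two one-line set operations. The sketch below reuses the Action tuple from the earlier listing and takes the set of facts relevant for \(\alpha \) as an explicit argument, standing in for the \(\textsf {rel-facts} (\alpha )\) function of Fig. 1.

```python
def project_facts(F, rel):
    """F |> alpha: restrict F to the facts relevant for the agent."""
    return F & rel

def project_action(a, rel):
    """a |> alpha: project pre, add, and del, preserving the action id."""
    return a._replace(pre=a.pre & rel, add=a.add & rel, delete=a.delete & rel)
```

Projecting \(\texttt {load} (\texttt {truck} ,\texttt {brno} )\) with rel set to \({{\texttt {\textit{Plane}}}}\)’s relevant facts yields \(\langle \{\texttt {in} (\texttt {crown} ,\texttt {brno} )\},\emptyset ,\{\texttt {in} (\texttt {crown} ,\texttt {brno} )\} \rangle \), matching Example 3 below.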

Example 3

In our example, we have the following.

$$\begin{aligned} \begin{array}{rcl} {\texttt {fly} (\texttt {prague} ,\texttt {brno} )}\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}} &{}=&{} \texttt {fly} (\texttt {prague} ,\texttt {brno} ) \\ {\texttt {fly} (\texttt {prague} ,\texttt {brno} )}\mathop {\triangleright }{{{\texttt {\textit{Truck}}}}} &{}=&{} \langle \emptyset ,\emptyset ,\emptyset \rangle \\ {\texttt {load} (\texttt {truck} ,\texttt {brno} )}\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}} &{}=&{} \langle \{\texttt {in} (\texttt {crown} ,\texttt {brno} )\},\emptyset , \{\texttt {in} (\texttt {crown} ,\texttt {brno} )\} \rangle \\ {\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )}\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}} &{}=&{} \langle \emptyset ,\{\texttt {in} (\texttt {crown} ,\texttt {ostrava} )\},\emptyset \rangle \end{array} \end{aligned}$$

\(\square \)

In multiagent planning with external actions, every agent \(\alpha \) is equipped from the beginning with projections of the public actions of other agents. These projections, which we call external actions, describe how agent \(\alpha \) sees the effects of public actions of other agents. In a local planning problem, an agent needs external actions so that it can create a plan which also contains public actions of other agents. The set of actions in the local planning problem of agent \(\alpha \) simply contains the actions of agent \(\alpha \) together with the external actions of \(\alpha \). Now it is easy to define the local planning problem \({\varPi }\mathop {\triangleright }{\alpha }\) of agent \(\alpha \), also called the projection of \(\varPi \) to \(\alpha \), as a classical Strips problem. The set of facts \(P\) and the initial state \(I\) are restricted to the facts relevant for \(\alpha \). There is no need to restrict the goal \(G\) because all the goal facts are public and thus relevant for all agents. A formal definition follows.

Definition 2

Given an MA-Strips problem \(\varPi \), the local planning problem \({\varPi }\mathop {\triangleright }{\alpha }\) of agent \(\alpha \) is defined for every \(\alpha \in \textsf {agents} (\varPi )\) as the classical Strips problem

$$\begin{aligned} {\varPi }\mathop {\triangleright }{\alpha }= \langle {\textsf {facts} _{}(\varPi )}\mathop {\triangleright }{\alpha },\ \alpha \cup \textsf {ext-actions} (\alpha ),\ {\textsf {init} _{}(\varPi )}\mathop {\triangleright }{\alpha },\ G \rangle \end{aligned}$$

where the set \(\textsf {ext-actions} (\alpha )\) of external actions of agent \(\alpha \) is defined as follows.

$$\begin{aligned} \begin{array}{lcl} \textsf {ext-actions} (\alpha )= & {} \bigcup _{\beta \ne \alpha }({\textsf {pub-actions} (\beta )}\mathop {\triangleright }{\alpha }) \qquad (\text{ for } \text{ all } \beta \in \textsf {agents} (\varPi )) \end{array} \end{aligned}$$

\(\square \)
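A sketch of Definition 2 under the same representation: here a problem is a tuple (facts, agents, init, goal) with agents a dict mapping agent names to action sets, and rel_facts and pub_actions are assumed helpers implementing the classification of Fig. 1; project_action is the function from the previous listing.

```python
def local_problem(Pi, alpha, rel_facts, pub_actions):
    """Pi |> alpha: alpha's own actions plus projections of the other agents'
    public actions; facts and the initial state are restricted, the goal kept."""
    facts, agents, init, goal = Pi
    rel = rel_facts(alpha)
    ext = {project_action(a, rel)            # ext-actions(alpha)
           for beta in agents if beta != alpha
           for a in pub_actions(beta)}
    return (facts & rel, frozenset(agents[alpha]) | ext, init & rel, goal)
```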

Example 4

In our example, all the actions arranging vehicle movements are internal. Only the actions that handle the package at the shared locations (\(\texttt {brno} , \texttt {ostrava} \)) are public. Hence, the set \(\textsf {pub-actions} ({{\texttt {\textit{Plane}}}})\) contains only the actions \(\texttt {load} (\texttt {plane} ,\texttt {brno} )\) and \(\texttt {unload} (\texttt {plane} ,\texttt {brno} )\), while \(\textsf {pub-actions} ({{\texttt {\textit{Truck}}}})\) is as follows.

$$\begin{aligned} \{\ \texttt {load} (\texttt {truck} ,\texttt {brno} ), \texttt {unload} (\texttt {truck} ,\texttt {brno} ), \texttt {load} (\texttt {truck} ,\texttt {ostrava} ), \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \} \end{aligned}$$

Hence, \(\textsf {ext-actions} ({{\texttt {\textit{Truck}}}})\) has two actions and \(\textsf {ext-actions} ({{\texttt {\textit{Plane}}}})\) has four actions. This yields the local problem \({\varPi }\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}}\) with ten actions and the local problem \({\varPi }\mathop {\triangleright }{{{\texttt {\textit{Truck}}}}}\) with eight actions. \(\square \)

3.4 Planning with external actions

We would like to solve the agents’ local problems separately and compose the local solutions into a global solution of \(\varPi \). However, not all local solutions can be easily composed into a solution of \(\varPi \). The concepts of public plans and their extensibility help us recognize local solutions which are conducive to this aim.

A plan \(\pi \) is a sequence of actions \(\langle a_1,\ldots ,a_k \rangle \). A plan defines the order in which the actions are executed by their unique owner agents. It is supposed that independent actions can be executed in parallel. A solution of \(\varPi \) is a plan \(\pi \) whose execution transforms the initial state I into a state \(s\) such that \(G\subseteq s\). A local solution of agent \(\alpha \) is a solution of the local planning problem \({\varPi }\mathop {\triangleright }{\alpha }\). Let \(\textsf {sols} ({\varPi })\) and \(\textsf {sols} ({\varPi }\mathop {\triangleright }{\alpha })\) denote the sets of all solutions of MA-Strips problem \(\varPi \) and of all local solutions of \(\alpha \), respectively.

Example 5

Let us consider the following plans.

$$\begin{aligned} \begin{array}{ll} \pi _0 = \langle &{} \texttt {load} (\texttt {plane} ,\texttt {prague} ),\ \texttt {fly} (\texttt {prague} ,\texttt {brno} ),\ \texttt {unload} (\texttt {plane} ,\texttt {brno} ),\ \\ &{} \texttt {load} (\texttt {truck} ,\texttt {brno} ),\ \texttt {drive} (\texttt {brno} ,\texttt {ostrava} ),\ \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \rangle \\ \pi _1 = \langle &{} {\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )}\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}} \ \rangle \\ \pi _2 = \langle &{} {\texttt {unload} (\texttt {plane} ,\texttt {brno} )}\mathop {\triangleright }{{{\texttt {\textit{Truck}}}}},\ \texttt {load} (\texttt {truck} ,\texttt {brno} ),\ \\ &{} \texttt {drive} (\texttt {brno} ,\texttt {ostrava} ),\ \texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\ \rangle \end{array} \end{aligned}$$

It is easy to check that \(\pi _0\) is a solution of our example MA-Strips problem \(\varPi \). Plan \(\pi _1\) is a solution of \({\varPi }\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}}\) because projection \({\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )}\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}}\) of \({{\texttt {\textit{Truck}}}}\)’s public action simply produces the goal state out of the blue. Finally, \(\pi _2\in \textsf {sols} ({{\varPi }\mathop {\triangleright }{{{\texttt {\textit{Truck}}}}}})\). \(\square \)

A public plan \(\sigma \) is a plan that contains only public actions. A public plan can be seen as a solution outline that captures the execution order of public actions while ignoring the agents’ internal actions. A public plan can be safely sent to any agent because it contains only public information. In order to avoid confusion between the public and external versions of the same action, we formally define public plans to contain only public action ids. For a plan \(\pi \) of \(\varPi \) (or a plan of \({\varPi }\mathop {\triangleright }{\alpha }\)), we define the public projection \({\pi }\mathop {\triangleright }{\star }\) of \(\pi \) as the sequence of all public action ids from \(\pi \), preserving their order. The public projection of a plan thus removes all internal actions from \(\pi \). A formal definition follows.

Definition 3

A public plan \(\sigma \) is a sequence of public action ids. Given a plan \(\pi \) of \(\varPi \) (or of \({\varPi }\mathop {\triangleright }{\alpha }\)), the public projection \({\pi }\mathop {\triangleright }{\star }\) of \(\pi \) is defined to be the public plan \( {\pi }\mathop {\triangleright }{\star } = \langle \textsf {id} (a): a\in \pi \ \text{ and } \ a\in \textsf {pub-actions} (\varPi )\rangle \). Public projection is extended to sets of plans element-wise. \(\square \)
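Under the representation of the earlier sketches, Definition 3 is a one-line filter; pub_ids, the set of public action ids of the problem, is an assumed input.

```python
def public_projection(plan, pub_ids):
    """pi |> * : the ids of the public actions of `plan`, in original order."""
    return [a.id for a in plan if a.id in pub_ids]
```

Applied to \(\pi _0\) of Example 5, this yields the three-element public plan shown in Example 6 below.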

Example 6

In our example, we know that \(\pi _0\in \textsf {sols} ({\varPi })\) and \(\pi _1\in \textsf {sols} ({{\varPi }\mathop {\triangleright }{{{\texttt {\textit{Plane}}}}}})\) and \(\pi _2\in \textsf {sols} ({{\varPi }\mathop {\triangleright }{{{\texttt {\textit{Truck}}}}}})\). Thus, we can construct the following public solutions.

$$\begin{aligned} \begin{array}{ll} {\pi _0}\mathop {\triangleright }{\star } = \langle &{} \textsf {id} (\texttt {unload} (\texttt {plane} ,\texttt {brno} )), \textsf {id} (\texttt {load} (\texttt {truck} ,\texttt {brno} )), \textsf {id} (\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )) \rangle \\ {\pi _1}\mathop {\triangleright }{\star } = \langle &{} \textsf {id} (\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )) \rangle \\ {\pi _2}\mathop {\triangleright }{\star } = \langle &{} \textsf {id} (\texttt {unload} (\texttt {plane} ,\texttt {brno} )), \textsf {id} (\texttt {load} (\texttt {truck} ,\texttt {brno} )), \textsf {id} (\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )) \rangle \end{array} \end{aligned}$$

Note that \({\pi _0}\mathop {\triangleright }{\star } = {\pi _2}\mathop {\triangleright }{\star }\) and also note that we have omitted the projection operator (\(\triangleright \)) because ids are preserved under projection. \(\square \)

From every solution \(\pi \) of \(\varPi \) (or of \({\varPi }\mathop {\triangleright }{\alpha }\)), we can construct a uniquely determined public plan \(\sigma ={\pi }\mathop {\triangleright }{\star }\). On the other hand, for a single public plan \(\sigma \) there might be more than one solution with public projection \(\sigma \), or none at all. A public plan \(\sigma \) is called extensible when there is a solution of \(\varPi \) with public projection \(\sigma \). Similarly, when there is a solution of \({\varPi }\mathop {\triangleright }{\alpha }\) with public projection \(\sigma \), then \(\sigma \) is called \(\alpha \)-extensible. Extensible public plans give us an order of public actions which is acceptable to all the agents. Thus, extensible public plans are very close to solutions of \(\varPi \), and it is relatively easy to construct a solution of \(\varPi \) once we have an extensible public plan. Hence, our algorithms aim at finding extensible public plans. The following formally defines public plan extensibility.

Definition 4

Let \(\sigma \) be a public plan of \(\varPi \).

$$\begin{aligned} \begin{array}{rcl} \sigma \text{ is } extensible &{} \quad \text{ iff }\quad &{} \exists \pi \in \textsf {sols} ({\varPi }): {\pi }\mathop {\triangleright }{\star }=\sigma \\ \sigma \text{ is } \alpha \text{- }extensible &{} \quad \text{ iff }\quad &{} \exists \pi \in \textsf {sols} ({{\varPi }\mathop {\triangleright }{\alpha }}): {\pi }\mathop {\triangleright }{\star }=\sigma \end{array} \end{aligned}$$

\(\square \)

Example 7

In our example, we can see that \({\pi _0}\mathop {\triangleright }{\star }\) is extensible because it was constructed from the solution of \(\varPi \). For the same reason, we see that \({\pi _1}\mathop {\triangleright }{\star }\) is \({{\texttt {\textit{Plane}}}}\)-extensible and \({\pi _2}\mathop {\triangleright }{\star }\) is \({{\texttt {\textit{Truck}}}}\)-extensible. It is easy to see that \({\pi _2}\mathop {\triangleright }{\star }\) is also \({{\texttt {\textit{Plane}}}}\)-extensible. However, \({\pi _1}\mathop {\triangleright }{\star }\) is not \({{\texttt {\textit{Truck}}}}\)-extensible because \({{\texttt {\textit{Truck}}}}\) needs to execute other public actions prior to \(\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\). \(\square \)

The following proposition states the correctness of multiagent planning with external actions. It establishes the relationship between extensible and \(\alpha \)-extensible plans. Its direct consequence is that to find a solution of \(\varPi \), it is enough to find a local solution \(\pi _\alpha \in \textsf {sols} ({\varPi }\mathop {\triangleright }{\alpha })\) whose public projection is \(\beta \)-extensible for every other agent \(\beta \). A constructive proof follows.

Theorem 1

([35]) Public plan \(\sigma \) of \(\varPi \) is extensible if and only if \(\sigma \) is \(\alpha \)-extensible for every agent \(\alpha \in \textsf {agents} (\varPi )\).

Example 8

We have seen previously that \({\pi _2}\mathop {\triangleright }{\star }\) is \({{\texttt {\textit{Truck}}}}\)-extensible and also \({{\texttt {\textit{Plane}}}}\)-extensible. Hence, we know that there is some solution of \(\varPi \) even without knowing \(\pi _0\). Furthermore, the proof of Theorem 1 shows how to reconstruct the solution. On the other hand, we know that \({\pi _1}\mathop {\triangleright }{\star }\) is not \({{\texttt {\textit{Truck}}}}\)-extensible and thus \({\pi _1}\mathop {\triangleright }{\star }\) is not extensible. \(\square \)

4 Planning state machines (PSM)

The basic idea behind the multiagent planning algorithm described in this paper is based on Theorem 1 and can be briefly described as follows. Every agent \(\alpha \) keeps generating new solutions of its local planning problem \({\varPi }\mathop {\triangleright }{\alpha }\) and announces their public projections to all the other agents. Hence, the set \(\varDelta _\alpha \) of public plans generated so far by \(\alpha \) is known to all the agents. Once a single public plan \(\sigma \) has been generated by all the agents, we can stop the algorithm, yielding \(\sigma \) as a public solution of \(\varPi \). This is because every plan generated by agent \(\beta \) is automatically \(\beta \)-extensible, and hence \(\sigma \) is extensible by Theorem 1.

In this section, we utilize finite-state machines to effectively represent the sets of plans (and public plans) mentioned in the above description of the algorithm. These finite-state machines, which we call planning state machines (PSMs), are described in Sect. 4.1. PSMs allow us to effectively implement the operations crucial for our multiagent planning algorithm: (1) adding a new solution to an existing PSM (Sect. 4.2), (2) computing the public projection of a PSM (Sect. 4.3), and (3) intersecting public projections of PSMs (Sect. 4.4).

4.1 Basics of planning state machines

Finite-state machines [16] are widely used in computer science for manifold purposes. In this section, we utilize state machines to recognize and compute solutions of Strips and MA-Strips planning problems. To achieve this, we use the set of planning actions \(A\) as an alphabet, while planning states (sets of facts) become the states of our planning state machine (PSM). The PSM state-transition function \(\delta \) simply resembles the planning state progression function \(\gamma \). Hence, a PSM accepts words over \(A\), that is, plans.

For our purposes, a deterministic finite-state machine (DFS) is a tuple \(\langle \Sigma ,S,s_0,\delta ,F \rangle \) where \(\Sigma \) is a finite alphabet, S is a finite set of states, \(s_0\in S\) is an initial state, \(\delta \) is a complete state-transition function (\(\delta :S\times \Sigma \rightarrow S\)), and \(F\subseteq S\) is a set of accepting states. A non-deterministic finite-state machine (NFS) is a tuple \(\langle \Sigma ,S,s_0,\delta ,F \rangle \) much like a DFS, but the state-transition function is non-deterministic, that is, \(\delta :S\times \Sigma \rightarrow \mathcal {P}(S)\). For a DFS (for an NFS respectively), the state-transition function \(\delta \) can be naturally extended to \(\delta ^\star :S\times \Sigma ^\star \rightarrow S\) (respectively to \(\delta ^\star :S\times \Sigma ^\star \rightarrow \mathcal {P}(S)\)) where \(\Sigma ^\star \) is the set of all finite words over alphabet \(\Sigma \). A word \(w\in \Sigma ^\star \) is accepted by a DFS when \(\delta ^\star (s_0,w)\in F\). A word \(w\in \Sigma ^\star \) is accepted by an NFS when \(\delta ^\star (s_0,w)\cap F\ne \emptyset \).
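The extended transition function can be sketched for the non-deterministic case as follows; the deterministic case is the special case of singleton successor sets, and the dictionary-based transition encoding is an implementation choice, not part of the definition.

```python
def delta_star(delta, s0, word):
    """Extended transition function: the set of states reachable from s0
    by reading `word`. `delta` maps (state, symbol) -> set of successors;
    missing keys mean no transition."""
    current = {s0}
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), ())}
    return current

def accepted(delta, s0, accepting, word):
    """An NFS accepts `word` iff delta*(s0, word) hits an accepting state."""
    return bool(delta_star(delta, s0, word) & accepting)
```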

Definition 5

A planning state machine (PSM) of a Strips problem \(\varPi =\langle P,A,I,G \rangle \) is a DFS \(\varGamma =\langle \Sigma ,S,I,\delta ,F \rangle \) where

  1. (1)

    alphabet \(\Sigma \) is the set of action ids \(\Sigma =\{\textsf {id} (a):a\in A\}\),

  2. (2)

    states are sets of facts (\(S\subseteq \mathcal {P}(P)\)) with \(I\in S\),

  3. (3)

    transitions satisfy that \(\delta (s,\textsf {id} (a))=s'\) implies \(\gamma (s,a)=s'\),

  4. (4)

    and accepting states are \(F=\{s\in S: G\subseteq s \}\).

Let \(\textsf {accept} (\varGamma )\) denote the set of all plans accepted by \(\varGamma \). \(\square \)

In general, a PSM does not need to contain all possible planning states of \(\varPi \) because S is only required to be a subset of \(\mathcal {P}(P)\) which contains I. However, the following soundness result can be trivially proved for any PSM.

Lemma 1

Let \(\varGamma \) be a PSM of a Strips problem \(\varPi \). Then \( \textsf {accept} (\varGamma ) \subseteq \textsf {sols} ({\varPi })\).

Proof

Follows directly from Definition 5 and definition of \(\gamma \). \(\square \)

The opposite inclusion does not necessarily hold. However, for a given Strips problem \(\varPi \), we can easily construct a complete PSM \(\varGamma \) which accepts all the solutions of \(\varPi \). A complete PSM needs to contain all possible transitions and all (reachable) planning states. A complete PSM can be constructed by a breadth-first search starting from the initial state, adding all reachable planning states together with all possible transitions. Of course, this construction is highly inefficient, but it is used below to demonstrate the basic operations on PSMs. The following defines a complete PSM.
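The naive breadth-first construction reads as follows, reusing the Action tuple and frozenset states of the earlier sketches; it is shown only to make the definition concrete, since enumerating all reachable states is exactly the blow-up the text warns about.

```python
from collections import deque

def complete_psm(actions, init, goal):
    """Complete PSM over the reachable states (cf. Definition 6).
    Returns (states, delta, accepting); delta maps (state, action id) -> state."""
    states, delta, queue = {init}, {}, deque([init])
    while queue:
        s = queue.popleft()
        for a in actions:
            if a.pre <= s:                    # gamma(s, a) is defined
                t = (s - a.delete) | a.add    # t = gamma(s, a)
                delta[(s, a.id)] = t
                if t not in states:
                    states.add(t)
                    queue.append(t)
    return states, delta, {s for s in states if goal <= s}
```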

Definition 6

A PSM \(\varGamma =\langle \Sigma ,S,I,\delta ,F \rangle \) of \(\varPi =\langle P,A,I,G \rangle \) is complete when

  1. (1)

    \(S=\mathcal {P}(P)\), and

  2. (2)

    transitions additionally satisfy that \(\gamma (s,a)=s'\) implies \(\delta (s,\textsf {id} (a))=s'\) whenever \(\gamma (s,a)\) is defined.\(\square \)

Hence, a complete PSM accepts every solution of \(\varPi \), formally as follows.

Lemma 2

For a complete PSM \(\varGamma \) of \(\varPi \), it holds that \( \textsf {accept} (\varGamma ) = \textsf {sols} ({\varPi })\).

Proof

Follows directly from Lemma 1 and the definition of \(\gamma \). \(\square \)

4.2 Extending a PSM with solutions

The first operation we define on PSMs extends an existing PSM with a new solution \(\pi \). The operation is denoted \(\varGamma \mathop {\oplus }\pi \), and its result is an extended PSM which accepts all the plans accepted by \(\varGamma \) and additionally \(\pi \). The operation is implemented simply by traversing \(\pi \) and adding the corresponding states and transitions to \(\varGamma \). The following constructive definition suggests an implementation linear in the size of the added solution. Note that in the definition we consider \(\delta \) to be a set of triples, writing \(\langle s,a,s' \rangle \in \delta \) instead of \(s'\in \delta (s,a)\).

Fig. 2 A motivating example for computing PSM public projection. We suppose a context where \(\mathsf {p}n\) are public and \(\mathsf {i}n\) internal actions, and where \(\mathsf {a}, \mathsf {b}, \mathsf {c}\) are public and \(\mathsf {x}, \mathsf {y}\) internal facts. Accepting states are marked in bold. The initial state is \(\{\}\)

Definition 7

Let a PSM \(\varGamma =\langle \Sigma ,S,I,\delta ,F \rangle \) of \(\varPi \) and a solution \(\pi =\langle a_1,\ldots ,a_n \rangle \) of \(\varPi \) be given. Denote \(s_0=I\) and \(s_{i}=\gamma (s_{i-1},a_{i})\) for \(0< i\le n\). The PSM \(\varGamma \mathop {\oplus }\pi \) is defined as follows.

$$\begin{aligned} \varGamma \mathop {\oplus }\pi = \langle \Sigma , S\cup \{s_0,\ldots ,s_n\}, I, \delta \cup \{\langle s_{i-1},\textsf {id} (a_{i}),s_{i} \rangle : 0 < i\le n \}, F\cup \{s_n\} \rangle \end{aligned}$$

\(\square \)

The operation \(\oplus \) can extend the set of accepted plans by more than just \(\pi \). The following lemma states that the PSM \(\varGamma \mathop {\oplus }\pi \) accepts all the plans accepted by \(\varGamma \), and additionally other plans including \(\pi \). The additionally accepted plans other than \(\pi \) do not cause any problem because Lemma 1 ensures that every additionally accepted plan is a solution of \(\varPi \). What matters is that \(\textsf {accept} (\varGamma \mathop {\oplus }\pi )\subseteq \textsf {sols} ({\varPi })\) holds.
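The constructive definition translates directly into code. The sketch below reuses the (states, delta, accepting) triple of the previous listing and runs in time linear in the plan length.

```python
def extend_psm(psm, init, plan):
    """Gamma (+) pi (Definition 7): add the states and transitions visited by
    `plan` (a solution starting from `init`) and mark its final state accepting."""
    states, delta, accepting = psm
    s = init
    states.add(s)
    for a in plan:
        t = (s - a.delete) | a.add        # s_i = gamma(s_{i-1}, a_i)
        delta[(s, a.id)] = t
        states.add(t)
        s = t
    accepting.add(s)
    return psm
```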

Lemma 3

Let \(\varPi \) be a classical \(\textsc {Strips} \) problem, let \(\varGamma \) be a PSM of \(\varPi \), and let \(\pi \in \textsf {sols} ({\varPi })\). Then \(\varGamma \mathop {\oplus }\pi \) is correctly defined and \(\textsf {accept} (\varGamma )\cup \{\pi \}\subseteq \textsf {accept} (\varGamma \mathop {\oplus }\pi )\).

Proof

Follows from Definition 7. \(\square \)

4.3 Public planning state machines

The previous sections defined a planning state machine \(\varGamma \) which represents the set \(\textsf {accept} (\varGamma )\) of plans. From \(\varGamma \), we would like to compute the corresponding set of public plans, that is, the set \({\textsf {accept} (\varGamma )}\mathop {\triangleright }{\star }=\{{\pi }\mathop {\triangleright }{\star }:\pi \in \textsf {accept} (\varGamma )\}\). In this section, we achieve this by transforming the PSM \(\varGamma \) into a public planning state machine which (1) accepts exactly the aforementioned set of public plans and (2) contains only public information. We call this operation the public projection of PSM \(\varGamma \), and we denote it \({\varGamma }\mathop {\triangleright }{\star }\).

Public PSMs will be exchanged among agents during our multiagent planning algorithm. Therefore, out of privacy concerns, it is essential that public PSMs contain only public information. A first attempt to construct a public PSM from PSM \(\varGamma \) would be to treat internal actions as \(\varepsilon \)-transitions and eliminate them from \(\varGamma \) using the standard algorithm, which simply “bridges” \(\varepsilon \)-transitions with new transitions. As for the internal facts contained within states, a first attempt is simply to delete them. Let us consider the example PSM \(\varGamma _1\) from Fig. 2 (left). After eliminating internal transitions and deleting internal facts from states, we obtain the PSM \(\varGamma _2\) (Fig. 2, middle). Unfortunately, \(\varGamma _2\) also accepts the plan \(\langle \mathsf {p1}, \mathsf {p2}, \mathsf {p3}, \mathsf {p4} \rangle \), which is not a public projection of any plan accepted by \(\varGamma _1\). The problem is that two different states of \(\varGamma _1\), namely \(\{\mathsf {a},\mathsf {x}\}\) and \(\{\mathsf {a},\mathsf {y}\}\), were merged in \(\varGamma _2\) after removing the internal facts \(\mathsf {x}\) and \(\mathsf {y}\). To solve this problem, we introduce integer marks to distinguish states which would otherwise become equal after removing internal facts. This is demonstrated by PSM \(\varGamma _3\) (Fig. 2, right). It is easy to check that \(\varGamma _3\) accepts exactly the public projections of the plans accepted by \(\varGamma _1\). Also note that \(\varGamma _3\) is non-deterministic because of the non-deterministic transitions from the initial state. Hence, public projection can introduce non-determinism.

In order to formally define public PSMs, we need to define public projections of states and actions. The public projection \({F}\mathop {\triangleright }{\star }\) of a set of facts F is simply the restriction of F to public facts. The public projection \({a}\mathop {\triangleright }{\star }\) of action \(a\) restricts the facts in \(a\) to public facts, preserving the action id.

Definition 8

Let \(\varPi \) be an MA-Strips problem. Let F be an arbitrary set \(F\subseteq \textsf {facts} _{}(\varPi )\) and let \(a\) be an action from \(\varPi \). The public projection \({F}\mathop {\triangleright }{\star }\) of F and the public projection \({a}\mathop {\triangleright }{\star }\) of action a are defined as follows.

$$\begin{aligned} {F}\mathop {\triangleright }{\star } = F\cap \textsf {pub-facts} (\varPi )\qquad {a}\mathop {\triangleright }{\star } = \langle \textsf {id} (a), {\textsf {pre} (a)}\mathop {\triangleright }{\star },\ {\textsf {add} (a)}\mathop {\triangleright }{\star },\ {\textsf {del} (a)}\mathop {\triangleright }{\star } \rangle \end{aligned}$$

Public projection is extended to sets of actions element-wise. \(\square \)

The previous discussion explained why public PSMs need to contain integer-labeled states and why they need to be non-deterministic. Hence, a public PSM of an MA-Strips problem \(\varPi \) is an NFS with the following properties.

Definition 9

A public PSM of an MA-Strips problem \(\varPi \) is an NFS \(\varDelta =\langle \Sigma ,S,I_0,\delta ,F \rangle \) where

  1. (1)

    the alphabet is \(\Sigma =\{\textsf {id} (a): a\in \textsf {pub-actions} (\varPi )\}\),

  2. (2)

    states are integer-labeled sets of public facts (\(S\subseteq \mathcal {P}(\textsf {pub-facts} (\varPi ))\times \mathbb {N}\)),

  3. (3)

    the initial state is \(I_0 = \langle {\textsf {init} _{}(\varPi )}\mathop {\triangleright }{\star },0 \rangle \) and \(I_0\in S\),

  4. (4)

    transitions satisfy that \(\langle s',i' \rangle \in \delta (\langle s,i \rangle ,\textsf {id} (a))\) implies \(\gamma (s,{a}\mathop {\triangleright }{\star })=s'\),

  5. (5)

    and \(\langle s,i \rangle \in F\) implies \(\textsf {goal} (\varPi )\subseteq s\).

Let \(\textsf {accept} (\varDelta )\) denote the set of all plans accepted by \(\varDelta \). \(\square \)

Now we describe the public projection algorithm which computes \({\varGamma }\mathop {\triangleright }{\star }\) from \(\varGamma \). It is motivated by the standard \(\varepsilon \)-elimination algorithm [16, Chapter 2.5], extended with integer-mark introduction and public projection of states. For every state \(s\), the internal closure set \(\textsf {int-closure} _{\varGamma }(s)\) contains all the states reachable from \(s\) by internal transitions only. The set \(\textsf {int-closure} _{\varGamma }(s)\) can be computed by a depth-first search, and the following definition gives its semantics. We omit the index \(\varGamma \) when no confusion can arise.

Definition 10

Given PSM \(\varGamma \) of agent local problem \({\varPi }\mathop {\triangleright }{\alpha }\), an internal closure of \(s_0\), denoted \(\textsf {int-closure} _{}(s_0)\), is the least set of states such that

  1. (1)

    \(s_0\in \textsf {int-closure} _{}(s_0)\), and

  2. (2)

    whenever \(s\in \textsf {int-closure} _{}(s_0)\) for some \(s\) then for all \(a\in \textsf {int-actions} (\alpha )\) it holds that \(\delta (s,\textsf {id} (a))\in \textsf {int-closure} _{}(s_0)\).

In other words, the set \(\textsf {int-closure} _{}(s_0)\) contains \(s_0\) and all the states reachable from \(s_0\) by transitions corresponding to internal actions. \(\square \)

Algorithm 1 Public projection of a PSM

Once the internal closures are computed for every state of \(\varGamma \), the public projection proceeds as described by Algorithm 1. The role of the state renaming \(\rho \) is to translate states of \(\varGamma \) to states of the public projection. For every state \(s\) of \(\varGamma \), the renaming defines the state \(\rho (s)\) in the constructed public PSM, consisting of the public projection of \(s\) and a unique integer mark. The second foreach cycle, which starts at line 14, takes care of “bridging” internal transitions: when a state \(s'\) is reachable from s in \(\varGamma \) by (zero or more) internal transitions followed by one public transition \(\textsf {id} (a)\), then \({\varGamma }\mathop {\triangleright }{\star }\) contains a transition from \(\rho (s)\) to \(\rho (s')\) labeled with \(\textsf {id} (a)\). Finally, the condition at line 19 marks the accepting states.
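The following Python sketch mirrors the structure of Algorithm 1 under the representation used in the earlier listings; it omits the state-merging and pruning optimizations mentioned below, and the exact bookkeeping of the published algorithm may differ.

```python
def int_closure(delta, internal_ids, s0):
    """States reachable from s0 by internal transitions only (Definition 10)."""
    closure, stack = {s0}, [s0]
    while stack:
        s = stack.pop()
        for (src, aid), dst in delta.items():
            if src == s and aid in internal_ids and dst not in closure:
                closure.add(dst)
                stack.append(dst)
    return closure

def public_psm(psm, init, goal, pub_facts, internal_ids):
    """Public projection Gamma |> * : project states to public facts, add
    integer marks to keep distinct states apart, and bridge runs of internal
    transitions followed by one public transition."""
    states, delta, _ = psm
    rho = {init: (init & pub_facts, 0)}        # the initial state gets mark 0
    for mark, s in enumerate(states - {init}, start=1):
        rho[s] = (s & pub_facts, mark)         # unique mark per original state
    pub_delta = {}
    for s in states:
        for u in int_closure(delta, internal_ids, s):
            for (src, aid), dst in delta.items():
                if src == u and aid not in internal_ids:
                    pub_delta.setdefault((rho[s], aid), set()).add(rho[dst])
    # Goal facts are public and internal actions preserve public facts,
    # so a state is accepting iff its own facts contain the goal.
    accepting = {rho[s] for s in states if goal <= s}
    return set(rho.values()), pub_delta, accepting
```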

Fig. 3 Example of computing PSM public projection. We suppose a context where \(\mathsf {p}n\) are public and \(\mathsf {i}n\) internal actions, and where \(\mathsf {a}, \mathsf {b}, \mathsf {c}\) are public and \(\mathsf {x}, \mathsf {y}\) internal facts. Accepting states are marked in bold. The initial state is \(\{\}\)

The PSM public projection algorithm can alternatively be explained on the example from Fig. 3. PSM \(\varGamma _1\) (Fig. 3, left) is the input PSM. PSM \(\varGamma _2\) (Fig. 3, middle) is obtained from \(\varGamma _1\) by eliminating internal transitions. PSM \(\varGamma _3\) (Fig. 3, right) is obtained from \(\varGamma _2\) by public projection of states and by the introduction of marks. Note that \(\varGamma _3\) additionally compresses \(\varGamma _2\) by unifying states which have equal public projections and equal sets of outgoing transitions. This is an optimization implemented in our planner but omitted from the formal presentation. Another optimization is to remove states unreachable from the initial state and states from which no accepting state is reachable. None of these optimizations affects the semantics.

The following definition defines \({\varGamma }\mathop {\triangleright }{\star }\) as the result of Algorithm 1, and the subsequent lemma formally states the correctness of the algorithm.

Definition 11

Let \({\varPi }\mathop {\triangleright }{\alpha }\) be a local problem of agent \(\alpha \) and let \(\varGamma \) be a PSM of \({\varPi }\mathop {\triangleright }{\alpha }\). The public projection of \(\varGamma \), denoted \({\varGamma }\mathop {\triangleright }{\star }\), is the result of Algorithm 1. \(\square \)

Lemma 4

Let \({\varPi }\mathop {\triangleright }{\alpha }\) be a local problem of agent \(\alpha \) and let \(\varGamma \) be a PSM of \({\varPi }\mathop {\triangleright }{\alpha }\). Then \({\varGamma }\mathop {\triangleright }{\star }\) is a public PSM of \(\varPi \) and \(\textsf {accept} ({\varGamma }\mathop {\triangleright }{\star })={\textsf {accept} (\varGamma )}\mathop {\triangleright }{\star }\).

Proof

The inclusion (\(\subseteq \)) is proved by extending a public solution \(\sigma \in \textsf {accept} ({\varGamma }\mathop {\triangleright }{\star })\) with internal actions to a solution \(\pi \in \textsf {accept} (\varGamma )\). The opposite inclusion (\(\supseteq \)) is proved by simulating a plan \(\pi \in \textsf {accept} (\varGamma )\) in the state space of \(\varGamma \) and by constructing a corresponding simulation of \({\pi }\mathop {\triangleright }{\star }\) in the state space of \({\varGamma }\mathop {\triangleright }{\star }\). \(\square \)

4.4 Intersection of public PSMs

The previous section described how to compute the public projection of a PSM. Now suppose we have two public PSMs of an MA-Strips problem \(\varPi \). This section describes how to compute their intersection, which is a public PSM accepting exactly the plans accepted by both of the original PSMs.

There is a standard algorithm [16, Theorem 4.8] to compute the intersection of two arbitrary NFSs. The standard algorithm defines the intersection of NFSs \(\varDelta _1\) and \(\varDelta _2\) as a new NFS whose set of states is the Cartesian product of the states of \(\varDelta _1\) and \(\varDelta _2\). The intersection contains a transition between two states when there are corresponding transitions in both original NFSs. This standard algorithm, however, needs to be adjusted, because applied to public PSMs it would not yield a correctly defined public PSM. The reason is that the structure of states in a public PSM is fixed.

We want to compute the intersection of two public PSMs (of the same MA-Strips problem \(\varPi \)). We can take advantage of the fact that both public PSMs are defined over the same set of states. Moreover, a transition from state \(\langle s,i \rangle \) labeled by action \(a\) uniquely determines \(s'\) in the destination state \(\langle s',i' \rangle \). Hence, we do not need to define the set of states of the intersection as a Cartesian product; we can use integer-labeled public states and only adjust the integer marks appropriately. Thus, the intersection of two public PSMs is again a public PSM. To combine integer marks, we can use an arbitrary but fixed injective function \(\cdot \) from \(\mathbb {N}\times \mathbb {N}\) to \(\mathbb {N}\) such that \(0\cdot 0=0\). A classical example is the Cantor pairing function.

The following defines the intersection of public PSMs \(\varDelta _1\) and \(\varDelta _2\); Lemma 5 states its correctness. The set of states of the intersection PSM is constructed using the pairing function as follows. Whenever there is a state \(\langle s,i \rangle \) in \(\varDelta _1\) and a state \(\langle s,j \rangle \) in \(\varDelta _2\), the intersection PSM contains the state \(\langle s,i\cdot j \rangle \). Hence, every state of the intersection PSM corresponds to uniquely determined states in \(\varDelta _1\) and \(\varDelta _2\). The state-transition function of the intersection PSM emulates the transition functions of both input PSMs. A state of the intersection PSM is accepting when both corresponding states are accepting in \(\varDelta _1\) and \(\varDelta _2\). Finally, note that the Cantor pairing function is not commutative, and thus the intersection operation is commutative only up to the integer marks. Nevertheless, all possible resulting public PSMs are equal with respect to the set of accepted plans.

Definition 12

Let \(\varDelta _1=\langle \Sigma ,S_1,I,\delta _1,F_1 \rangle \) and \(\varDelta _2=\langle \Sigma ,S_2,I,\delta _2,F_2 \rangle \) be two public PSMs of an MA-Strips problem \(\varPi \). Let \(\cdot \) be the Cantor pairing function. The intersection of \(\varDelta _1\) and \(\varDelta _2\) is a public PSM \(\varDelta _0=\langle \Sigma ,S_0,I,\delta _0,F_0 \rangle \) of problem \(\varPi \) where

  1. (1)

    \(S_0 = \{ \langle s,i\cdot j \rangle : \langle s,i \rangle \in S_1\ \text{ and }\ \langle s,j \rangle \in S_2 \}\), and

  2. (2)

    \(\langle s',i'\cdot j' \rangle \in \delta _0(\langle s,i\cdot j \rangle , {id})\) iff \(\langle s',i' \rangle \in \delta _1(\langle s,i \rangle , {id})\ \text{ and } \ \langle s',j' \rangle \in \delta _2(\langle s,j \rangle , {id})\),

  3. (3)

    and \(F_0 = \{ \langle s,i\cdot j \rangle : \langle s,i \rangle \in F_1\ \text{ and }\ \langle s,j \rangle \in F_2 \}\).

The intersection of \(\varDelta _1\) and \(\varDelta _2\) is denoted \(\varDelta _1\cap \varDelta _2\). \(\square \)
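A sketch of Definition 12 under the same representation; cantor implements the pairing function (note that cantor(0, 0) == 0, so the two initial states combine into the initial state of the intersection).

```python
def cantor(i, j):
    """Cantor pairing: an injective map from N x N to N with cantor(0,0) == 0."""
    return (i + j) * (i + j + 1) // 2 + j

def intersect(psm1, psm2):
    """Intersection of two public PSMs (Definition 12). Each argument is
    (states, delta, accepting) with states of the form (public_facts, mark)
    and delta mapping (state, action id) -> set of successor states."""
    (s1, d1, f1), (s2, d2, f2) = psm1, psm2
    states = {(p, cantor(i, j)) for (p, i) in s1 for (q, j) in s2 if p == q}
    delta = {}
    for ((p, i), aid), succs1 in d1.items():
        for (q, j) in s2:
            if q != p or ((q, j), aid) not in d2:
                continue
            src = (p, cantor(i, j))
            for (p1, i1) in succs1:
                for (q1, j1) in d2[((q, j), aid)]:
                    if p1 == q1:    # the action uniquely determines p1 = q1
                        delta.setdefault((src, aid), set()).add((p1, cantor(i1, j1)))
    accepting = {(p, cantor(i, j)) for (p, i) in f1 for (q, j) in f2 if p == q}
    return states, delta, accepting
```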

Lemma 5

The intersection \(\varDelta _1\cap \varDelta _2\) of two public PSMs \(\varDelta _1\) and \(\varDelta _2\) of \(\varPi \) is a correctly defined public PSM of \(\varPi \) and the following holds.

$$\begin{aligned} \textsf {accept} (\varDelta _1\cap \varDelta _2) = \textsf {accept} (\varDelta _1)\cap \textsf {accept} (\varDelta _2) \end{aligned}$$

Proof

Follows from Definitions 9 and 12. \(\square \)

4.5 Multiagent planning with complete PSMs

The results of the previous sections now make it easy to introduce Algorithm 2, which computes all public solutions of a given MA-Strips problem \(\varPi \). For every agent \(\alpha \), the algorithm computes the complete PSM of \({\varPi }\mathop {\triangleright }{\alpha }\) and its public projection. All the public PSMs are then intersected, and the intersection is returned as the result. Theorem 2 states that the resulting intersection accepts exactly the public solutions of \(\varPi \).

Algorithm 2 PsmPlanComplete: computing all public solutions of \(\varPi \)

Theorem 2

Let \(\varPi \) be an MA-Strips problem and \(\varDelta =\mathtt {PsmPlanComplete}(\varPi )\). It holds that \(\textsf {accept} (\varDelta ) = {\textsf {sols} ({\varPi })}\mathop {\triangleright }{\star }\).

Proof

Follows from Lemmas 2, 4, 5, and Theorem 1. \(\square \)
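Putting the operations together, Algorithm 2 is a short composition of the earlier sketches; local_psm (building the complete PSM of \({\varPi }\mathop {\triangleright }{\alpha }\)) and project (its public projection) are assumed wrappers over the complete_psm and public_psm functions above.

```python
from functools import reduce

def psm_plan_complete(Pi, agents):
    """Algorithm 2: intersect the public projections of the complete PSMs of
    all local problems; the result accepts exactly sols(Pi) |> * (Theorem 2)."""
    projections = [project(local_psm(Pi, alpha)) for alpha in agents]
    return reduce(intersect, projections)
```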

Fig. 4 The complete PSM of agent \({{\texttt {\textit{Truck}}}}\) (Example 9)

Fig. 5 Public projections of the complete PSMs of the agents from Example 9

Example 9

In this example, we demonstrate complete PSMs, public projection, and PSM intersection on our running Example 1. The complete PSM of agent \({{\texttt {\textit{Truck}}}}\) is presented in Fig. 4. The black node represents the initial state, and the gray nodes are goal states. Dotted edges represent internal actions, dashed edges public actions, and solid edges external actions. To improve clarity, we shorten action labels to the first letters of the involved objects; for example, the action \(\texttt {fly} (\texttt {prague} ,\texttt {brno} )\) is shortened to “fpb”, and so on. We also remove edges outgoing from goal states, and we omit state labels, which can easily be filled in using the state progression function \(\gamma \).

The complete PSM of agent \({{\texttt {\textit{Plane}}}}\) is too large for presentation (it contains 32 states and 72 transitions). However, its public projection is shown together with the public projection of \({{\texttt {\textit{Truck}}}}\)’s complete PSM in Fig. 5. Note how public projection decreases the number of states and transitions. The intersection of both public PSMs is shown in Fig. 6. Note that the intersection PSM represents the infinite set of all possible public solutions by a finite structure. \(\square \)

Fig. 6 Intersection of public PSMs from Fig. 5 representing all possible public solutions

5 Distributed PSM planner

This section describes how to use planning state machines to effectively solve multiagent MA-Strips problems. Section 4.5 already introduced a planning algorithm which is correct and complete, and its advantage is that it computes all the public solutions of a given MA-Strips problem. However, its time and space complexity renders it unusable for more complex problems. In this section, we introduce an iterative algorithm based on the idea of PSMs which is designed to be usable in practice. The rest of this section describes the algorithm.

The basic idea can be described as follows. Every agent from the MA-Strips problem \(\varPi \) executes its own planning loop, possibly each on a different machine. Every agent \(\alpha \) starts with an empty PSM \(\varGamma _\alpha \). In every iteration, every agent generates a new plan solving its local problem \({\varPi }\mathop {\triangleright }{\alpha }\) and adds this plan to \(\varGamma _\alpha \). Then the public projection \(\varDelta _\alpha ={\varGamma _\alpha }\mathop {\triangleright }{\star }\) is computed, and the public PSMs are exchanged among the agents. This process continues until the intersection \(\varDelta = \bigcap _{\beta \in \textsf {agents} (\varPi )}\varDelta _\beta \) is non-empty (meaning \(\textsf {accept} (\varDelta )\ne \emptyset \)). A non-empty intersection of public PSMs is thus guaranteed to contain at least one public solution of \(\varPi \).

[Algorithm 3: \(\mathtt {PsmPlanDistributed}\), the distributed PSM planning loop executed by every agent]
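Before the detailed description, here is a structural sketch of one agent's loop. The names empty_psm, generate_new_plan, add_plan, public_projection, and extend_with_landmarks, as well as the comm object, are placeholders for the procedures of Sects. 5.1 and 5.2 and for the message exchange; this is a sketch of the control flow, not the actual implementation.

```python
from functools import reduce

def psm_plan_distributed(local_problem, comm, empty_psm, generate_new_plan,
                         add_plan, public_projection, extend_with_landmarks):
    gamma = empty_psm()           # Gamma_alpha: all plans generated so far
    forbidden = []                # plans excluded from future generation
    while True:
        pi = generate_new_plan(local_problem, forbidden)   # Sect. 5.1
        forbidden.append(pi)
        gamma = add_plan(gamma, pi)
        delta = public_projection(gamma)
        others = comm.exchange(delta)        # synchronization step
        intersection = reduce(intersect, [delta] + others)
        # Non-emptiness of accept() is approximated here by a non-empty
        # set of accepting states; a full check tests their reachability.
        if intersection['final']:
            return intersection
        # Optional: bias the next search by the other agents' PSMs.
        local_problem = extend_with_landmarks(local_problem, others)
```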

The multiagent planning algorithm is described in Algorithm 3. By one iteration of the algorithm, we mean one execution of the loop (lines 3–14). By a new plan in the first step inside the loop, we mean a plan that was not generated in any of the previous loop iterations. To achieve this, we propose a technique based on known diverse planning techniques. Details are provided in Sect. 5.1. The variable \(\varGamma _\alpha \) keeps a PSM representing all the plans generated so far. The new plan \(\pi _\alpha \) is added to \(\varGamma _\alpha \), and the public PSM \(\varDelta _\alpha \) is computed using Algorithm 1. Then all public PSMs are exchanged among the agents. In our implementation, this is a synchronization step where the executing agent might need to wait for other agents to finish their computations of public PSMs. However, an alternative non-blocking implementation where the executing agent only updates the public PSMs currently available is also possible. Then the intersection \(\varDelta \) of the public PSMs of all the agents is computed and possibly returned as a result. The intersection is computed by every agent to avoid a centralized component. In the last step of the loop, public PSMs of other agents are incorporated into the local planning problem \({\varPi }\mathop {\triangleright }{\alpha }\) using landmarks and action costs. This step, described in detail in Sect. 5.2, is optional and can be skipped. Theorem 3 states the soundness and completeness of the algorithm.

Theorem 3

(Completeness and soundness) Let \(\varPi \) be an MA-Strips problem such that \(\textsf {sols} ({\varPi })\ne \emptyset \). Let \(\varDelta \) be the result of \(\mathtt {PsmPlanDistributed}({\varPi }\mathop {\triangleright }{\alpha })\) for an arbitrary agent \(\alpha \). Then \(\textsf {accept} (\varDelta )\ne \emptyset \) and \( \textsf {accept} (\varDelta ) \subseteq {\textsf {sols} ({\varPi })}\mathop {\triangleright }{\star } \). Moreover, if the underlying planner (1) is complete, (2) is optimal (with respect to the length of the generated plan), and (3) allows generating different plans, then the algorithm always terminates.

Proof

Let \(\pi \) be a solution of \(\varPi \). As a result of the completeness and optimality of the underlying planner, agent \(\alpha \) will generate, in the worst case, all the solutions of \({\varPi }\mathop {\triangleright }{\alpha }\) up to the length of \(\pi \). These must include a solution with the public projection \({\pi }\mathop {\triangleright }{\star }\). As this holds for every agent \(\alpha \), the intersection of their respective public PSMs must be non-empty and the algorithm terminates. This ensures the completeness and termination of the algorithm. The soundness of the algorithm directly follows from the properties of intersection, Theorem 1, and Lemma 1. \(\square \)

5.1 Generating new plans

This section describes how to generate a plan of a classical Strips problem which differs from all plans in a set provided as input. This extension is inspired by diverse planning with homotopy class constraints [3]. In our setting, homotopy classes of plans are naturally defined by plan public projections. That is, two plans \(\pi _1\) and \(\pi _2\) belong to the same homotopy class iff \({\pi _1}\mathop {\triangleright }{\star } = {\pi _2}\mathop {\triangleright }{\star }\).

In our implementation, we have extended the FastDownward planner, but the same technique can be used to extend any planner based on state-space search. The technique is based on the idea of augmented graphs [3]. Every state is extended by a vector of numbers where each vector field corresponds to one of the forbidden plans. The ith vector field indicates whether the plan ending at this state differs from the ith forbidden plan. The value \(-1\) indicates that the current plan differs, while a nonnegative number denotes the position of the next expected action in the corresponding forbidden plan. During action application at some state, we check whether the applied action equals the expected action at the next position in the forbidden plan. The initial state corresponds to the vector of zeros. During the state search, a plan is accepted as a solution only when it ends in a state with only \(-1\) values.
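A minimal sketch of this mark-vector bookkeeping follows, with plans represented as sequences of action ids; the function names are ours, not FastDownward's.

```python
def advance_marks(marks, action_id, forbidden):
    """One transition of the augmented search state (Sect. 5.1).

    marks[i] == -1 means the current plan already differs from the i-th
    forbidden plan; otherwise marks[i] is the position of the next
    expected action in that plan."""
    new_marks = []
    for pos, plan in zip(marks, forbidden):
        if 0 <= pos < len(plan) and plan[pos] == action_id:
            new_marks.append(pos + 1)   # still following a prefix
        else:
            new_marks.append(-1)        # diverged, and stays diverged
    return new_marks

def differs_from_all(marks):
    # A goal state yields an acceptable (diverse) plan only when the
    # plan differs from every forbidden plan.
    return all(m == -1 for m in marks)
```

The search starts with marks = [0] * len(forbidden), which matches the vector of zeros mentioned above.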

Algorithm 3 starts with an empty set of forbidden plans. In every iteration, the generated plan is added to this set. This ensures that the algorithm generates a different plan in every iteration.

As an optimization, we use action costs to force the underlying planner to prefer internal actions to public actions, and public actions to external actions. These action costs correspond to the knowledge an agent has about the actions. An agent has full information about its internal actions, and as they do not affect other agents, they should be used whenever possible. The agent also has full information about its public actions, but because they can affect other agents, they should be used more carefully. Finally, the agent has only limited information about external actions of other agents because some information is stripped away by public projection. Hence, external actions are the most expensive.
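The concrete cost values are not prescribed by the algorithm. As an illustration, with made-up constants and hypothetical owner and is_public accessors into the grounded problem, the preference can be encoded as:

```python
# Illustrative constants only: any values with
# internal < public < external encode the stated preference.
INTERNAL_COST, PUBLIC_COST, EXTERNAL_COST = 1, 5, 10

def search_cost(action, agent, owner, is_public):
    # `owner` and `is_public` are hypothetical accessors; they are not
    # part of the formal model above.
    if owner(action) != agent:
        return EXTERNAL_COST
    return PUBLIC_COST if is_public(action) else INTERNAL_COST
```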

5.2 Guiding plan search using public PSMs

In every iteration of Algorithm 3, the agent receives the public PSMs of all other agents. These PSMs contain information about plans found by the other agents, which can be used to guide new plan generation so that the algorithm finds a solution faster. We incorporate information from public PSMs into the local planning problem by extending the problem with soft-landmark actions and by adjusting action costs. This problem extension influences plan search in the desired way. When agent \(\alpha \) receives a public PSM \(\varDelta _\beta \) of another agent \(\beta \), we would like the local plan generator to prefer sequences of public actions suggested by \(\varDelta _\beta \). This is because Algorithm 3 terminates only when all the agents generate the same public solution. Hence, it is preferable to find a local solution \(\pi _\alpha \in \textsf {sols} ({{\varPi }\mathop {\triangleright }{\alpha }})\) such that the public projection \({\pi _\alpha }\mathop {\triangleright }{\star }\) is contained in \(\textsf {accept} (\varDelta _\beta )\). We achieve this by extending \({\varPi }\mathop {\triangleright }{\alpha }\) with special landmark actions without affecting the set of solutions of \({\varPi }\mathop {\triangleright }{\alpha }\). These landmark actions basically duplicate actions from \({\varPi }\mathop {\triangleright }{\alpha }\) but have decreased action costs. They also have additional preconditions which ensure that landmark actions are used in the order given by \(\varDelta _\beta \).

The process of extending the local problem \({\varPi }\mathop {\triangleright }{\alpha }\) with \(\varDelta _\beta \) is sketched as follows. We extend the local problem with a set of fresh facts \(P_{ marks }\) distinct from the facts of \({\varPi }\mathop {\triangleright }{\alpha }\), where each fresh fact corresponds to a state of \(\varDelta _\beta \). Hence, we have a bijection \(\mu \) from the states of \(\varDelta _\beta \) to \(P_{ marks }\). Let \(\varDelta _\beta \) contain a transition from s to \(s'\) labeled by action \(a\). We then extend the local problem with a duplicate of \(a\) which (1) can be applied only in states where \(\mu (s)\) is valid and (2) additionally transforms fact \(\mu (s)\) to \(\mu (s')\). In this way, landmark actions can be applied only in the order given by \(\varDelta _\beta \). The following defines a landmark action.

Definition 13

Let \(a\) be an action and let \( from \) and \( to \) be two facts. The landmark action \(\textsf {lm-act} _{}(a, from , to )\) is defined as follows.

$$\begin{aligned} \textsf {lm-act} _{}(a, from , to ) = \langle \textsf {id} (a), \textsf {pre} (a)\cup \{ from \}, \textsf {add} (a)\cup \{ to \}, \textsf {del} (a)\cup \{ from \} \rangle \end{aligned}$$

For action id \( id \), the landmark action (w.r.t. agent \(\alpha \)) is defined as

$$\begin{aligned} \textsf {lm-act} _{\alpha }(id, from , to )=\textsf {lm-act} _{}(a, from , to ) \end{aligned}$$

where \(a\) is the uniquely determined action of \({\varPi }\mathop {\triangleright }{\alpha }\) with \(\textsf {id} (a)=id\). \(\square \)
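Under the action representation used in the earlier sketches (a dict with 'id', 'pre', 'add', and 'del' sets), Definition 13 translates directly:

```python
def lm_act(action, frm, to):
    # Landmark action: a copy of `action` that can fire only when the
    # mark `frm` holds, and that moves the mark from `frm` to `to`.
    return {'id': action['id'],
            'pre': action['pre'] | {frm},
            'add': action['add'] | {to},
            'del': action['del'] | {frm}}
```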

Once we have added landmark facts and actions, we just extend the initial state of the local problem with \(\mu (I_0)\) where \(I_0\) is the initial state of \(\varDelta _\beta \). A complete extension of the local problem is formally defined as follows.

Definition 14

Let \(\varPi \) be an MA-Strips problem and \(\alpha \in \textsf {agents} (\varPi )\). Let \(\varDelta =\langle \Sigma ,S,I',\delta ,F \rangle \) be a public PSM of \(\varPi \). Let \(P_{ marks }\) be a set of facts distinct from \(\textsf {facts} _{}(\varPi )\) such that \(|P_{ marks }|=|S|\) and let \(\mu \) be a bijection from S to \(P_{ marks }\). The problem \(\varPi \mathop {\otimes _\alpha }\varDelta \) is defined as the Strips problem \(\langle P_0,A_0,I_0,G \rangle \) where

  1. (1)

    \(P_0=\textsf {facts} _{}({\varPi }\mathop {\triangleright }{\alpha })\cup P_{ marks }\), and

  2. (2)

    \(A_0=\mathsf {actions}({\varPi }\mathop {\triangleright }{\alpha })\cup \{\textsf {lm-act} _{\alpha }( id ,\mu (s),\mu (s')): s'\in \delta (s,id) \}\), and

  3. (3)

    \(I_0=\textsf {init} _{}({\varPi }\mathop {\triangleright }{\alpha })\cup \{\mu (I')\}\).

The problem \(\varPi \mathop {\otimes _\alpha }\varDelta \) is called the local problem of \(\alpha \) extended with \(\varDelta \). \(\square \)

Note that the extension does not affect the goal condition. There is a straightforward relationship between solutions of \({\varPi }\mathop {\triangleright }{\alpha }\) and solutions of \(\varPi \mathop {\otimes _\alpha }\varDelta _\beta \). Clearly, every solution of \({\varPi }\mathop {\triangleright }{\alpha }\) is also a solution of the extended problem \(\varPi \mathop {\otimes _\alpha }\varDelta _\beta \). On the other hand, every solution of \(\varPi \mathop {\otimes _\alpha }\varDelta _\beta \) can be translated to a solution of \({\varPi }\mathop {\triangleright }{\alpha }\) by replacing landmark actions with the original actions of \({\varPi }\mathop {\triangleright }{\alpha }\).
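Continuing the sketches above, Definition 14 can be rendered as follows. The local problem representation (keys facts, actions indexed by action id, init, and goal) is again our assumption, and fresh mark facts are modeled as tagged tuples so they cannot clash with the problem's own facts.

```python
def extend_with_psm(local_problem, psm):
    # Bijection mu from PSM states to fresh mark facts.
    mu = {st: ('mark', st) for st in psm['states']}
    # One landmark action per PSM transition (Definition 14, item 2).
    lm_actions = [lm_act(local_problem['actions'][aid], mu[st], mu[st2])
                  for (st, aid), succs in psm['delta'].items()
                  for st2 in succs]
    return {'facts': local_problem['facts'] | set(mu.values()),
            'actions': list(local_problem['actions'].values()) + lm_actions,
            'init': local_problem['init'] | {mu[psm['init']]},
            'goal': local_problem['goal']}   # the goal is unchanged
```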

The crucial point is that landmark actions are assigned significantly decreased action costs so that the underlying planner prefers landmark actions to original actions. In our implementation, the costs are set as follows. Suppose that \({\varPi }\mathop {\triangleright }{\alpha }\) is being extended with \(\varDelta _\beta \). The cost of \(\textsf {lm-act} _{}(a, from , to )\) is set to 1 if \(a\) is owned by \(\alpha \) (the receiver of \(\varDelta _\beta \)) or if \(a\) is owned by \(\beta \) (the sender). Otherwise, the cost 10 is used.

The reasoning behind this choice is that the owner (either \(\alpha \) or \(\beta \)) of the action has full information about it and thus it is more likely to be used correctly.

Definition 14 can be easily adjusted so that it allows repeated extension of \({\varPi }\mathop {\triangleright }{\alpha }\). During planning, problem \({\varPi }\mathop {\triangleright }{\alpha }\) is extended with every \(\varDelta _{\beta _i}\) received from other agents in the previous loop iteration, one by one. Hence, the underlying planner is launched with the problem

$$\begin{aligned} \varPi \otimes _\alpha \varDelta _{\beta _1}\otimes _\alpha \varDelta _{\beta _2}\otimes _\alpha \cdots \otimes _\alpha \varDelta _{\beta _m} \end{aligned}$$

provided \(\otimes _\alpha \) is left associative. In the first iteration, the local problem \({\varPi }\mathop {\triangleright }{\alpha }\) is used without any extension.

Practical experiments revealed that the action costs and landmark actions described in this section are crucial for the practical usability of Algorithm 3. For experimental evaluation, see Sect. 8.

6 Improving PSM planner performance

In this section, we propose two methods to improve the performance of the multiagent planning Algorithm 3. The first method, which we call Psm-v, is based on plan verification and is described in Sect. 6.1. The second method, which we call Psm-r, computes a relaxed solution \(\pi \) of a given MA-Strips problem, and its public projection \({\pi }\mathop {\triangleright }{\star }\) is used as an initial landmark by all the agents. Details are provided in Sect. 6.2. Both methods can also be combined and used together; the combination is denoted Psm-rv. Section 8 evaluates the impact of the methods both when used separately and when used together.

6.1 Plan verification and analysis

The distributed PSM planner from Algorithm 3 uses public plans generated by other agents as landmarks to guide future plan search (see Sect. 5.2). However, it is desirable to use only extensible plans to guide plan search because non-extensible plans cannot lead to a non-empty intersection of public PSMs. Every generated plan should therefore be verified by the other agents in order to determine its extensibility. However, extensibility (or \(\alpha \)-extensibility) checking is expensive, and thus we propose only an approximate method of plan verification. An extension of a PSM planner with a method of plan verification has already been proposed with promising results [17]. The rest of this section describes the method.

First we describe how to approximate \(\alpha \)-extensibility of a public plan \(\sigma \). Given a public plan \(\sigma =\langle \textit{id}_1,\ldots ,\textit{id}_n \rangle \), we create a problem \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) which is solvable iff \(\sigma \) is \(\alpha \)-extensible. We extend the set of facts with fresh facts \(P_{ marks }=\{m_0,\ldots ,m_n\}\). We add the action \(\textsf {lm-act} _{}(a_i,m_{i-1},m_i)\) for all \(0<i\le n\) to the actions of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) (see Definition 13). The initial state of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) is extended with \(m_0\), but, this time, the last mark fact \(m_n\) is also added to the goal of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\). This ensures that any solution of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) contains all actions from \(\sigma \) in the right order, possibly interleaved with \(\alpha \)'s internal actions. Hence, every solution of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) can be translated to a solution of \({\varPi }\mathop {\triangleright }{\alpha }\) (but not necessarily the other way round). The following formalizes the construction of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) and proves its correctness.

Definition 15

Let \(\varPi \) be an MA-Strips problem and \(\alpha \in \textsf {agents} (\varPi )\). Let \(\sigma =\langle \textit{id}_1,\ldots ,\textit{id}_n \rangle \) be a public plan of \(\varPi \). Let \(P_{ marks }=\{m_0,\ldots ,m_n\}\) be a set of facts distinct from \(\textsf {facts} _{}(\varPi )\). The \(\alpha \)-extensibility check problem of \(\sigma \), denoted \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\), is the Strips problem \(\langle P_{ marks }\cup ({\textsf {facts} _{}(\varPi )}\mathop {\triangleright }{\alpha }),A,({\textsf {init} _{}(\varPi )}\mathop {\triangleright }{\alpha })\cup \{m_0\},\textsf {goal} (\varPi )\cup \{m_n\} \rangle \) where \(A=\textsf {int-actions} (\alpha )\cup \{\textsf {lm-act} _{\alpha }(\textit{id}_i,m_{i-1},m_i): 0<i\le n\}\). \(\square \)

Theorem 4

([35]) Let \(\varPi \) be an MA-Strips problem, let \(\alpha \) be an agent of \(\varPi \), and let \(\sigma \) be a public plan of \(\varPi \). Then \(\sigma \) is \(\alpha \)-extensible iff \(\textsf {sols} ({{\varPi }\mathop {\circledast }_{\alpha }{\sigma }})\ne \emptyset \).

Proof

Both the implications are proved similarly by replacing public actions with their respective landmark actions \((\Rightarrow )\) or the other way round \((\Leftarrow )\). \(\square \)

A previous attempt to use the above construction of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) can be found in [35]. It centrally generates public solutions \(\sigma \) and lets every agent \(\alpha \) verify that \(\sigma \) is \(\alpha \)-extensible. A roadblock of this approach is that it is relatively hard for agent \(\alpha \) to find out that \(\sigma \) is not \(\alpha \)-extensible. In that case \(\textsf {sols} ({{\varPi }\mathop {\circledast }_{\alpha }{\sigma }})=\emptyset \), and the underlying planner used to solve \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) usually has to traverse the whole search space. That is because state-of-the-art planners are optimized to find a solution of a given problem, not to determine that the problem is unsolvable. That is why we have proposed [17] an approximate method to determine problem solvability.

The previously proposed approximation of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) solvability [17] uses the generic process calculi type system scheme Poly ✶  [18, 25]. The same result can be achieved using planning graphs [14, Chapter 6], which we briefly describe here. We construct a complete relaxed planning graph of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\), that is, the planning graph of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) with action delete effects removed. A planning graph with k layers can be constructed in time polynomial in k. Then we examine the last fact layer of the constructed planning graph. Recall that \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) contains fresh mark facts \(P_{ marks }=\{m_0,\ldots ,m_n\}\). When mark \(m_i\) is valid in some planning state, it means that (1) public actions \(a_1,\ldots ,a_i\) from \(\sigma \) were already correctly used in the current plan and that (2) the next public action to be used is \(a_{i+1}\). The result of the analysis is the maximum j such that \(m_j\) is in the last fact layer of the relaxed planning graph. This result j is interpreted as follows. When \(j<n\), \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) is clearly unsolvable because \(m_n\) is a goal fact. Moreover, the result j tells us that there is no way for the agent to follow the public plan \(\sigma \) up to the point where \(a_{j+1}\) can be applied. This gives us an approximation of a valid prefix of \(\sigma \). On the other hand, \(j=n\) does not necessarily imply that \(\sigma \) is \(\alpha \)-extensible because the proposed method is only an approximation of \({\varPi }\mathop {\circledast }_{\alpha }{\sigma }\) solvability.
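The analysis amounts to a delete-relaxed reachability fixpoint followed by a scan of the mark facts. A self-contained sketch, with actions represented as dicts of 'pre' and 'add' sets:

```python
def max_reachable_mark(init_facts, actions, marks):
    """Return the largest j such that mark m_j is delete-relaxed
    reachable. A result j < len(marks) - 1 proves the extensibility
    check problem unsolvable and approximates the valid prefix of the
    public plan; `marks` is the list [m_0, ..., m_n]."""
    reachable = set(init_facts)
    changed = True
    while changed:   # fixpoint = complete relaxed planning graph
        changed = False
        for a in actions:
            if a['pre'] <= reachable and not a['add'] <= reachable:
                reachable |= a['add']
                changed = True
    # m_0 is in the initial state, so the maximum is well defined.
    return max(j for j, m in enumerate(marks) if m in reachable)
```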

The above \(\alpha \)-extensibility approximation allows the following plan verification procedure. When agent \(\alpha \) generates a new plan \(\pi _\alpha \), it sends its public projection \({\pi _\alpha }\mathop {\triangleright }{\star }\) to all the other agents. Once another agent \(\beta \) receives \({\pi _\alpha }\mathop {\triangleright }{\star }\), it runs the above \(\beta \)-extensibility check and sends its result back to agent \(\alpha \) (just after line 4 of Algorithm 3). Agent \(\alpha \) collects the analysis results from all the other agents and computes their minimum l. Plan \(\pi _\alpha \) is then stripped so that only the first l public actions remain in it. This stripped plan is then used to extend PSM \(\varGamma _\alpha \) in Algorithm 3. Hence, only the stripped plan is used as a landmark to guide future plan search. In the next iteration, a plan with a public projection different from \(\pi _\alpha \) is required to be computed. When \({\pi _\alpha }\mathop {\triangleright }{\star }=\langle \textit{id}_1,\ldots ,\textit{id}_n \rangle \), we can further speed up the convergence of Algorithm 3 by forbidding any plan with the public prefix \(\langle \textit{id}_1,\ldots ,\textit{id}_l,\textit{id}_{l+1} \rangle \) to be generated in the future. This does not affect the completeness of Algorithm 3 (Theorem 3) because only provably non-extensible plans are forbidden.
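Putting the verification round together, with the communication abstracted away, the initiating agent keeps only the prefix every agent can follow and may record the next-longer prefix as forbidden. A sketch with illustrative names:

```python
def process_verification(public_plan, reported_js):
    """Combine the other agents' analysis results (Sect. 6.1)."""
    l = min(reported_js)             # shortest prefix any agent can follow
    usable = public_plan[:l]         # kept as a landmark in Gamma_alpha
    # Forbidding the next-longer prefix is safe: it is provably
    # non-extensible, so completeness is preserved.
    forbidden_prefix = public_plan[:l + 1] if l < len(public_plan) else None
    return usable, forbidden_prefix
```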

Example 10

In this example, we demonstrate planning with plan analysis on our running Example 1. The first iteration proceeds as follows. All the agents use an optimal planner to solve their local problems, and thus agent \({{\texttt {\textit{Plane}}}}\) creates the simplest plan where the goal is reached by agent \({{\texttt {\textit{Truck}}}}\) alone. That is, \({{\texttt {\textit{Plane}}}}\) generates the plan \(\langle \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \rangle \) (see also Example 5). This plan does not pass the verification check of agent \({{\texttt {\textit{Truck}}}}\) because \({{\texttt {\textit{Truck}}}}\) needs to execute additional public actions prior to the final goal-reaching action. Meanwhile, agent \({{\texttt {\textit{Truck}}}}\) generates a plan with the following public projection.

$$\begin{aligned} \sigma = \langle \texttt {unload} (\texttt {plane} ,\texttt {brno} ),\ \texttt {load} (\texttt {truck} ,\texttt {brno} ),\ \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \rangle \end{aligned}$$

This public plan is extensible and thus passes the verification check. Landmark actions created from \(\sigma \) are added to \({{\texttt {\textit{Plane}}}}\)'s local problem. The intersection of the public PSMs after the first iteration is empty because \({{\texttt {\textit{Plane}}}}\)'s PSM is empty.

In the second iteration, \({{\texttt {\textit{Plane}}}}\) follows the landmarks from the above public plan \(\sigma \) and succeeds in generating a plan with the public projection \(\sigma \). At this point, the public PSMs of both agents contain \(\sigma \), and hence the algorithm terminates after the second iteration, independently of the plan generated by \({{\texttt {\textit{Truck}}}}\). \(\square \)

6.2 Initial relaxed plan landmark

The delete effect relaxation, where delete effects of actions are ignored, has proved its relevance both in Strips planning [15] and recently also in MA-Strips planning [32]. It is known that finding a solution of a relaxed problem is an easier task than finding a solution of the original problem. There are distributed algorithms to find a solution of a relaxed MA-Strips problem using distributed planning graphs [31]. An effective implementation using exploration queues can also be found in the literature [32]. All these algorithms respect privacy, that is, they do not reveal internal facts and actions to other agents.

We use a relaxed solution of MA-Strips problem \(\varPi \) to improve Algorithm 3 as follows. First, we compute some solution \(\pi \) of the relaxation of \(\varPi \) and its public projection \({\pi }\mathop {\triangleright }{\star }\), which is a sequence of public action ids. We use this id sequence as an initial landmark in the first loop iteration of Algorithm 3. The sequence is integrated into \({\varPi }\mathop {\triangleright }{\alpha }\) in the same way as in Definition 14 (the public projection \({\pi }\mathop {\triangleright }{\star }\) can be seen as a public PSM which contains only \({\pi }\mathop {\triangleright }{\star }\)). The same initial landmark is used by all the agents. When \({\pi }\mathop {\triangleright }{\star }\) is extensible, every agent \(\alpha \) is likely to generate a local solution \(\pi _\alpha \) such that \({\pi _\alpha }\mathop {\triangleright }{\star }={\pi }\mathop {\triangleright }{\star }\) already in the first iteration. In that case, the algorithm terminates directly in the first iteration, causing a dramatic speedup. Otherwise, the initial landmark is forgotten by all the agents and the algorithm continues with the second iteration as before. The practical impact of initial relaxed plan landmarks is experimentally evaluated in Sect. 8.

Example 11

In our running Example 1, we can find a relaxed solution and compute its public projection. The shortest solution has the following public projection.

$$\begin{aligned} \sigma = \langle \ \texttt {unload} (\texttt {plane} ,\texttt {brno} ),\ \texttt {load} (\texttt {truck} ,\texttt {brno} ),\ \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \rangle \end{aligned}$$

We can see that this public plan \(\sigma \) is extensible. When the agents use \(\sigma \) as a landmark in the first iteration, both of them succeed in generating a plan with the public projection \(\sigma \). Hence, the intersection of the public PSMs is non-empty and the algorithm terminates after the first iteration. \(\square \)

7 From theory to implementation

In this section, we describe several interesting problems encountered when implementing a PSM-based planner. Firstly, the theory is based on the MA-Strips formalism, which is in turn based on Strips. Just like Strips, MA-Strips requires a grounded problem specification, which is not appropriate for real-world problems. Section 7.1 describes how to move toward the more convenient PDDL-like planning language which allows a compact representation by introducing parametric actions. Then, in Sect. 7.2, we look at formalisms extending single-agent PDDL to multiagent planning problems, focusing mainly on the definition of privacy of agent knowledge. Finally, in Sect. 7.3, we describe how to handle internal goals without the need to publish them.

7.1 From Strips to PDDL, and back again

Strips is a formal language often used in automated planning theory, and it also provides the formal base for MA-Strips. Nevertheless, it is not very practical for real-world problems because it supports only a grounded representation. Therefore, in practice and in benchmark tests, the planning domain definition language (PDDL) is typically used. PDDL supports predicates with typed parameters which allow describing a full range of facts or actions by parametric statements. When we convert our running Example 1 into PDDL, we can have only one parametric action \(\texttt {drive} ({{\texttt {\textit{from}}}},{{\texttt {\textit{to}}}})\) instead of multiple actions for different locations. But backward grounding of this parametric action can create instances which were not in the original domain. To get rid of these instances, a new predicate \(\mathtt {isRoad({{\texttt {\textit{from}}}}, {{\texttt {\textit{to}}}})}\) can be introduced. Such a predicate is never part of action effects, and thus it is constant during the execution of any plan. When we ground a PDDL problem, we can evaluate these predicates and omit the action instances where the predicate evaluates to false because these instances can never be used. Under the same definition of public facts as described for MA-Strips, these constants would be public because they appear as preconditions of actions of different agents. Nevertheless, it is not necessary to communicate them because their evaluation never changes and thus every agent can keep its own copy.
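A sketch of this static-predicate pruning follows, with facts modeled as tuples such as ('isRoad', 'prague', 'brno'); the predicate set and the helper are illustrative.

```python
STATIC_PREDICATES = {'isRoad'}   # predicates never occurring in effects

def prune_static_instances(ground_actions, initial_state):
    """Drop ground action instances whose static preconditions are
    false in the initial state; since static facts never change, such
    instances can never be applied."""
    def static_pre(action):
        return {f for f in action['pre'] if f[0] in STATIC_PREDICATES}
    return [a for a in ground_actions
            if static_pre(a) <= initial_state]
```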

A conversion from PDDL to Strips, that is, grounding, is needed in two places. Firstly, we use it to compute the size of a problem because the number of predicates and actions of the grounded problem better describes the real complexity of the problem. More importantly, it is needed to compute the MA-Strips representation of the input problems because the Fmap problems, which we use as benchmarks (see Sect. 8), are defined in the PDDL format. Hence, an input PDDL problem needs to be grounded so that we can use the MA-Strips definition of fact privacy classification. Fmap benchmark problems define a separate domain and problem PDDL file for every agent. First, we create a single-agent problem by merging all agent domains and problems. Second, we use the grounding algorithm implemented in the FastDownward planner to ground it. Then we take all the facts used in the grounded problem and ground the original agents' problems to these facts.

7.2 Multiagent PDDL extensions

Strips and PDDL are two standard languages to describe deterministic single-agent planning problems. Nevertheless, there is no similar standard in the multiagent planning. There are several attempts to create such a standard: NADL [19], concurrent interacting actions in Strips [5], MAPL [8], Fmap [34], MA-PDDL [23], concurrent STRIPS [29], and MA-Strips [6].

In our work, we focus on PDDL, which allows describing agent actions and the state of the world more naturally. PDDL is also used as a standard in the international planning competition, whose planning problems are used as standard benchmarks in the planning community. The PDDL extensions which we have considered in our work are as follows.

  • Fmap—each agent has its own domain and problem file. The PDDL language is extended with a \(\mathtt {shared}-\mathtt {data}\) field which allows specifying which predicates are shared with which agents. However, this approach does not always allow defining fact privacy classification as defined by MA-Strips.

  • MA-PDDL—an extension of PDDL 3.1 which allows specifying the owner of each action using the field \(\mathtt {agent}\). Nevertheless, it does not allow manually specifying which facts are public or internal.

None of the existing PDDL extensions can express both the Fmap (public facts specified by a list of public predicate names) and the MA-Strips (see Sect. 3) privacy definitions. Hence, we propose our own extension which allows us to finely tune the amount of knowledge shared among the agents, down to the level of single facts.

7.3 Problems with internal goals

So far we have considered MA-Strips problems where all goal facts are public. This can always be achieved by publishing internal goal facts. Nevertheless, in some cases, agents can have different internal goals, and they might not be willing to share these facts with other agents out of privacy concerns. We still consider cooperating agents. This section describes how to transform an MA-Strips problem with internal goals to an equivalent problem where the original internal goals can be kept internal.

The transformation extends each agent \(\alpha \) with a public confirmation action which can be executed when all the goals of \(\alpha \) are satisfied. The confirmation actions can be executed only at the end of a plan. The goal of the transformed problem expresses that all the agents confirmed their goals.

More formally, let \(\varPi \) be an MA-Strips problem where some or all goals are internal. Firstly, we introduce a fresh fact \(\mathtt {planning}\) with the meaning that planning is in progress, that is, that no confirmation action (of any agent) has been used so far. Fact \(\mathtt {planning}\) is initially valid in the initial state of the transformed problem and is deleted by the first confirmation action. Every action from \(\varPi \) is extended with the precondition \(\mathtt {planning}\). Next we introduce fresh virtual goal facts \(\mathtt {done}_1, \ldots , \mathtt {done}_n\). Fact \(\mathtt {done}_i\) means that the confirmation action of the ith agent was used. The confirmation action \(\textsf {confirm} (\alpha ,i)\) of the ith agent \(\alpha \) can be used when all the goals relevant for \(\alpha \) (i.e., public goals and \(\alpha \)'s internal goals) are satisfied. Its add effect is the virtual goal \(\mathtt {done}_i\), and it deletes \(\mathtt {planning}\); formally, \( \textsf {confirm} (\alpha ,i) = \langle {\textsf {goal} (\varPi )}\mathop {\triangleright }{\alpha },\{\mathtt {done}_i\},\{\mathtt {planning}\} \rangle \). The goal of the transformed MA-Strips problem is set to \(\{\mathtt {done}_1,\ldots ,\mathtt {done}_n\}\).
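A sketch of this compilation, reusing the dict-based representation of the earlier sketches (the fact and id encodings are illustrative):

```python
def compile_internal_goals(agent_problems, public_goal):
    """Transform internal goals away (Sect. 7.3) -- a sketch.

    `agent_problems` maps each agent name to a dict with 'actions'
    (a list of action dicts) and 'internal_goal' (a set of facts)."""
    new_goal = set()
    for i, (agent, prob) in enumerate(agent_problems.items(), start=1):
        # Every original action now requires that planning is in progress.
        for a in prob['actions']:
            a['pre'] = a['pre'] | {'planning'}
        done_i = ('done', i)
        # Public confirmation action of the i-th agent: applicable once
        # all goals relevant for this agent are satisfied.
        prob['actions'].append({'id': ('confirm', agent),
                                'pre': public_goal | prob['internal_goal'],
                                'add': {done_i},
                                'del': {'planning'}})
        new_goal.add(done_i)
    # The fact 'planning' is additionally added to the initial state.
    return new_goal
```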

There is a direct correspondence between solutions of the original and of the transformed problem. Every solution of the transformed problem can be translated to a solution of the original problem by removing confirmation actions. Moreover, the goal facts of the transformed problem can be freely published because they do not carry any confidential information.

8 Experimental results

We have performed a set of experiments to compare our planners with other state-of-the-art multiagent planners and to evaluate the impact of plan verification on planning times. We have decided to compare our planners with Fmap [34], RDFF [32], and GPPP [26]. All planners are compared on well-defined problems taken from the international planning competition (IPC) problems as published by the Fmap authors. Fmap classifies facts as public or internal using a manual selection of public predicate names. On the other hand, RDFF and GPPP use the privacy classification as defined by MA-Strips. In practice, Fmap public facts are a superset of MA-Strips public facts, and our PSM-based algorithms can handle both privacy classifications. In our experiments, we use exactly the same input files as the authors of Fmap used during its evaluation, and we also use the same time limit of 30 min for each problem.

The above-mentioned state-of-the-art multiagent planners are compared with several variants of our PSM-based planner. Variant Psm is the basic version described by Algorithm 3 in Sect. 5. Then we have two extensions of this basic algorithm. Variant Psm-r uses the initial relaxed plan landmarks described in Sect. 6.2. Variant Psm-v is the basic algorithm extended with the plan verification described in Sect. 6.1. Finally, there is Psm-rv, which combines both extensions into one planner that benefits from all these features.

The experiments are organized as follows. Firstly, we describe the benchmark domains in Sect. 8.1. Then, in Sect. 8.2, we compare the variants of our algorithm and compare them with the Fmap planner, which currently has the highest coverage among multiagent planners. In Sect. 8.3, we further analyze the communication of the Psm variants and explore its connection with the number of iterations needed to solve a problem. The experiments are concluded in Sect. 9, where we focus on different privacy classifications. We compare our algorithm with MA-Strips-based planners and show how increasing the privacy of facts and goals affects the performance of our planner.

8.1 Benchmark domains

We have performed experiments on 244 PDDL problems from 10 domains described in Fig. 7. These problems, published by the authors of Fmap, are inspired by traditional benchmark problems of the international planning competition (IPC). The privacy classification is defined as a list of predicates shared with other agents. It is therefore possible to specify that some knowledge is shared between two agents only. However, in these problems, all knowledge is always shared among all the agents. Goals are always public. Let us call this set of problems Fmap problems and refer to this level of privacy as Fmap privacy.

In Fmap problems, constants known to all agents are often considered to be internal. For example, every agent knows that there is a road between two cities, but the agents do not know that the other agents know it. Thus, for example in the Driverlog domain, every truck publishes its \(\mathtt {drive}\) action without the precondition that there is a road connecting the two cities. In the descriptions below, when it is stated that “All information is public,” the problem can still contain these internal constants.

Fig. 7 Fmap benchmark domains description

8.2 Overall benchmark results

Table 1 shows the overall coverage of solved problems. We can see that Fmap has better results in most of the domains, and also in the overall coverage, when compared with the basic Psm, Psm-r, and Psm-v variants. Nevertheless, both Psm-r and Psm-v excel in a few domains (Psm-r in Elevators and Logistics, and Psm-v in Rovers). Psm extended with both features—Psm-rv—keeps the benefits of these features and outperforms Fmap in the overall score.

Table 1 Number of problems solved by the compared planners

A relaxed plan helps especially in the Elevators and Logistics domains. In both domains, the relaxed plan captures the coordination points well, that is, where the passenger (or package) will be transported by which agent (elevator or truck). The relaxed plan thus represents a solution outline which is then extended by each agent with internal actions. This allows the task to be solved in a single iteration.

Plan verification, represented by the Psm-v variant, helped in the Driverlog and Rovers problems, where many solutions generated by an agent were unacceptable to some other agent. In Driverlog, this is caused by the privacy of the constants defining the topology of the world (the predicates \(\mathtt {link}\) and \(\mathtt {path}\) describing connected locations). The convergence is improved by trimming out parts of plans which are impossible to fulfill, that is, a \(\mathtt {drive}\) action between locations which are not connected by a road. In Rovers, the situation is similar—private constants describe the abilities of each rover. When an agent requires from another agent some action which cannot be performed, the verification allows removing such a plan from the landmarks so that other agents are not confused by it.

Table 2 compares the run times needed to solve selected tasks solvable by all the Psm variants. We can see that all the Psm variants scale much better than Fmap, especially on the Openstacks problems, which are all solved in the first iteration. Psm performs best in most domains. This follows from the requirement that the selected problems have to be solvable by all the variants, while the basic Psm does not need to spend time on relaxed plan creation or on verification.

Table 2 Comparison of run times on selected problems solved by all the planners

The left graph of Fig. 8 shows the time needed by Psm-rv to solve different problems as a function of problem size. The problem size is calculated as the number of actions and facts of the grounded problem (described in Sect. 7.1). The right graph of Fig. 8 shows the time spent on the verification of other agents' plans. It shows that an agent spends less than one-third of its computation time on verification; on average, it is approximately 14 % of agent computation time. The relative time needed for verification is independent of the problem size; whether it grows or declines depends on the particular domain.

8.3 Communication overhead evaluation

Figure 9 shows the average number of iterations required to solve problems of selected domains (left) and the amount of communication among the agents (right). The averages are taken only over the problems solved by all three variants: Psm-r, Psm-v, and Psm-rv. The numbers in parentheses show how many problems are in this intersection (note that no Depots or Logistics problem is solvable by Psm-r, and thus these domains are not included). The performance measured by the number of iterations does not differ much across the domains (with the exception of Elevators, where the number of iterations of one problem is significantly decreased). The amount of communication is measured as the total number of actions sent among all agents. This includes the exchange of public plan projections used as landmarks and also the queries for plan verification. As expected, we can see that verification requires additional communication, but in several domains this increase is outweighed by the decrease in the number of iterations needed to solve the task.

Fig. 8 Left: time needed to solve a task as a function of problem size (log axis; time 2000 means not solved). Right: portion of time an agent spends on the verification of other agents' plans

Fig. 9 Number of iterations and amount of communication of different methods, always taken as an average over the problems solved by all methods in the graph (the number of problems is appended to the domain name)

9 Privacy analysis of planning problems

In this section, we analyze different privacy classifications (Sect. 9.1) and we compare benchmark domains with respect to the privacy classifications (Sect. 9.2). We experimentally evaluate the impact of privacy classifications on the performance of our best PSM-based planner Psm-rv. We compare Psm-rv with other state-of-the-art MA-Strips planners (Sect. 9.3).

9.1 Privacy classifications

In Sect. 3 above, we have described the MA-Strips privacy classification, which defines the minimal amount of public information that needs to be shared by different agents. MA-Strips requires facts used by actions of at least two different agents to be public. We have converted the Fmap problems to MA-Strips problems (as described in Sect. 7.1), which reduces the amount of public information. Moreover, we allow agents to have internal goals, which further increases privacy.

We now have four different privacy classifications for our benchmark problem set. First is the original Fmap privacy setting. Second, denoted \(\textsc {Fmap} ^\circleddash \), is a variant of Fmap where those facts mentioned only by a single agent are made internal (which can make some goals internal). Third, denoted MA-Strips, is the MA-Strips privacy classification with public goals. Fourth, denoted \(\textsc {MA} {\text {-}}\textsc {Strips} ^\circleddash \), is the MA-Strips classification which allows internal goals.

Let \(\textsc {Fmap} (\varPi )\) denote the percentage of private facts in problem \(\varPi \) with respect to Fmap classification, and similarly for the other three privacy classifications. It certainly holds that \(\textsc {Fmap} (\varPi ) \le \textsc {Fmap} ^\circleddash (\varPi )\) and \(\textsc {MA} {\text {-}}\textsc {Strips} (\varPi ) \le \textsc {MA} {\text {-}}\textsc {Strips} ^\circleddash (\varPi )\), and finally \( \textsc {Fmap} (\varPi ) \le \textsc {MA} {\text {-}}\textsc {Strips} (\varPi ) \).

9.2 Privacy in benchmark domains

Privacy of agent knowledge in the different domains is described in detail in Fig. 10.

Fig. 10 Description of domains with extended privacy

Table 3 Percentage of internal facts (left) and actions (right) in benchmark domains with respect to different privacy classifications

Privacy classifications on the benchmark domains, measured as the relative ratio of internal facts and actions, are shown in Table 3.

9.3 Privacy benchmarks

In Sect. 8 above, we have compared our PSM-based planners with Fmap on problems with the Fmap privacy classification. Now we compare our best variant Psm-rv with the MA-Strips-based algorithms RDFF and GPPP. Both planners, similarly to Fmap, are state-space search planners. RDFF [32] is based on a distributed A* algorithm with distributed heuristics, with a variable number of agents involved in the heuristic estimation. Greedy Privacy Preserving Planner (GPPP) [26] is based on an iterative deepening search in relaxed subproblems enhanced with landmarks. Table 4 shows that Psm-rv outperforms both state-of-the-art algorithms, even though GPPP solved more instances of the Depots, Satellite, and Zenotravel domains.

Table 4 Number of MA-Strips problems solved by the compared planners: RDFF, GPPP, and Psm-rv
Table 5 Problem coverage of Psm-rv on benchmark domains with different privacy classifications

Table 5 shows the overall results of the Psm-rv algorithm for all privacy settings. Unfortunately, we are not aware of any planner that could be used to compare the performance on problems with internal goals. We can see that in some domains increased privacy improves the performance of the planner, while in others the performance decreases. For example, in the Satellites problems, the most difficult goal is the final pointing position of a satellite (predicate \(\mathtt {pointing}\)). If this goal is internal, then the complexity of the problem decreases. The opposite case is represented by the Openstacks domain. When the goals are public, a producer (agent manufacturer) can more easily create a plan fitting the requirements of a consumer (agent manager). Internal goals improve the performance with the MA-Strips classifications, but the effect on the Fmap classifications is the opposite. The best overall coverage is achieved by \(\textsc {MA} {\text {-}}\textsc {Strips} ^\circleddash \) with 195 solved problems out of 244. This shows that an increase in internal facts not only supports privacy but can also improve performance (Fig. 11).

Fig. 11 Performance of Psm-rv with different privacy classifications measured by the number of iterations and the amount of communication. Values are averages over the problems solved by all the planners. The number of problems considered for each domain is given in parentheses

10 Conclusions

We have proposed a sound and complete approach to privacy preserving multiagent planning with external actions. It combines a compilation for a classical planner with a compact representation of local plans. The compact representation of local plans in the form of planning state machines (PSMs) allows us to effectively implement the crucial operations, namely PSM intersection and public projection. Improving the planner with distributed delete-relaxation heuristics (Psm-r) and approximate plan analysis (Psm-v) proved to be effective in practice. Each of the improvements helps to solve a particular class of problems, and we have shown that both improvements can be combined without impairing each other (Psm-rv). Experimental evaluation revealed that Psm-rv outperforms the state-of-the-art planner FMAP by more than 8 %. The improvements also reduced the required communication by decreasing the number of iterations needed to solve benchmark problems. Private goals were added to the planner by a compilation scheme which transforms private goals to public ones without loss of privacy.

We have performed an analysis of common multiagent planning benchmarks from the perspective of different privacy classifications. An indisputable advantage of our approach is that it can be used under different privacy classifications, which has been confirmed by experiments. The result is that Psm-rv performs best on the privacy classification defined by MA-Strips with private goals, that is, on the classification which contains the lowest amount of public facts out of the tested classifications. The usability of our planner under different privacy classifications allowed us to directly compare our approach with other planners designed particularly for the MA-Strips privacy classification with public goals. The result is that Psm-rv outperforms the state-of-the-art planners RDFF and GPPP.

The proposed planning approach constitutes a generic scheme which can be extended with other improvements and heuristics. While two such extensions were presented in this work, further extensions can be introduced to target the most recent challenges in multiagent planning. Extensions targeting planning with a pair-wise visibility of facts, planning in a lifted form, and planning with joint actions are left for future work. With these extensions, we believe the presented approach will become a scalable multiagent planner able to tackle real-world problems.