
1 Introduction

Multi-agent planning deals with the problem of finding a coordinated sequence of actions for a set of entities (or agents), such that the actions are applicable from a predefined initial state and transform the environment into a state where predefined goals are fulfilled. If the environment and the actions are deterministic (that is, their outcome is unambiguously defined by the state they are applied in), the problem is a deterministic multi-agent planning problem [3]. Furthermore, if the set of goals is common to all agents and the agents cooperate in order to achieve the goals, the problem is a cooperative multi-agent planning problem. The reason the agents cannot simply feed their problem descriptions into a centralized planner typically lies in the fact that, although the agents cooperate, they want to share only the information necessary for their cooperation, not the information about their inner processes. Such privacy constraints are respected by privacy preserving multi-agent planners.

A number of privacy preserving multi-agent planners have been proposed in recent years, such as MAFS [10], FMAP [15], MADLA [19], PSM [16] and GPPP [8]. Although all of the mentioned planners claim to be privacy-preserving, formal proofs of such claims do not exist. The privacy of MAFS is discussed in [10] and expanded upon in [2], which proposes Secure-MAFS, a version of MAFS with stronger privacy guarantees. These guarantees are proven for a family of planning problems, but do not hold in general. The approach was recently generalized in the form of Macro-MAFS [7], however without strengthening the claims about privacy.

Here, we propose a parameterized variant of strong privacy preserving planning using the definition of privacy in [10]. We show how the two extremities of the parameter lead either to a strong privacy preserving but inefficient planner, or to a weak privacy preserving but efficient one.

This article is a reworked and extended version of the paper [17]. In this version, we have reformulated most of the definitions and proofs to improve readability and conciseness of the arguments. We provide tighter bounds on privacy leakage and propose a novel example illustrating the principle of the proposed planner. Moreover, the novel example provides grounds for a novel claim about privacy leakage in multi-agent planning in general.

2 Multi-agent Planning

The most common model for multi-agent planning is MA-Strips [3] and models derived from it (such as MA-MPT [10] using multi-valued variables). We reformulate the MA-Strips definition and also generalize it to multi-valued variables. Formally, for a set of agents \(\mathcal {A}\), a problem \(\mathcal {M}=\{\Pi _{i}\}_{i=1}^{|\mathcal {A}|}\) is a set of agent problems. An agent problem of agent \(\upalpha _{i}\in \mathcal {A}\) is defined as

$$\begin{aligned} \Pi _{i}=\left\langle \mathcal {V}_{i}=\mathcal {V}^{\mathsf {pub}}\cup \mathcal {V}^{\mathsf {priv}}_{i},\mathcal {O}_{i}=\mathcal {O}^{\mathsf {pub}}_{i}\cup \mathcal {O}^{\mathsf {priv}}_{i}\cup \mathcal {O}^{\mathsf {proj}},s_{I},s_{\star }\right\rangle , \end{aligned}$$

where \(\mathcal {V}_{i}\) is a set of variables s.t. each \(V\in \mathcal {V}_{i}\) has a finite domain \(\mathsf {dom}(V)\); if all variables are binary (i.e. \(|\mathsf {dom}(V)|=2\)), the formalism corresponds to MA-Strips. The set of variables is partitioned into the set \(\mathcal {V}^{\mathsf {pub}}\) of public variables (with all values public), common to all agents, and the set \(\mathcal {V}^{\mathsf {priv}}_{i}\) of variables private to \(\upalpha _{i}\) (with all values private), such that \(\mathcal {V}^{\mathsf {pub}}\cap \mathcal {V}^{\mathsf {priv}}_{i}=\emptyset \). A complete assignment over \(\mathcal {V}\) is a state; a partial assignment over \(\mathcal {V}\) is a partial state. We denote by s[V] the value of V in a (partial) state s and by \(\mathsf {vars}(s)\) the set of variables defined in s. The state \(s_{I}\) is the initial state of the agent \(\upalpha _i\), containing only the \(\mathcal {V}^{\mathsf {pub}}\) and \(\mathcal {V}^{\mathsf {priv}}_i\) variables. \(s_{\star }\) is a partial state representing the goal condition, that is, if \(s_{\star }[V]=s[V]\) for all variables \(V\in \mathsf {vars}(s_{\star })\), then s is a goal state. Similarly to [9], we require all goals to be public, as private goals can be transformed into a public equivalent [16], i.e. \(\mathsf {vars}(s_{\star }) \subseteq \mathcal {V}^{\mathsf {pub}}\).

The set \(\mathcal {O}^{}_{i}\) of actions comprises a set \(\mathcal {O}^{\mathsf {priv}}_{i}\) of private actions of \(\upalpha _{i}\) and a set \(\mathcal {O}^{\mathsf {pub}}_{i}\) of public actions of \(\upalpha _{i}\). A public projection of an action removes all its private parts. The set \(\mathcal {O}^{\mathsf {proj}}\) contains public projections of other agents’ actions. \(\mathcal {O}^{\mathsf {pub}}_{i}\), \(\mathcal {O}^{\mathsf {priv}}_{i}\), and \(\mathcal {O}^{\mathsf {proj}}\) are pairwise disjoint. An action is defined as a tuple \(a=\left\langle \mathsf {\mathsf {pre}}(a),\mathsf {\mathsf {eff}}(a)\right\rangle \), where \(\mathsf {\mathsf {pre}}(a)\) and \(\mathsf {\mathsf {eff}}(a)\) are partial states representing the precondition and effect respectively. An action a is applicable in a state s if \(s[V]=\mathsf {\mathsf {pre}}(a)[V]\) for all \(V\in \mathsf {vars}(\mathsf {\mathsf {pre}}(a))\), and the application of a in s, denoted \(a\circ s\), results in a state \(s'\) s.t. \(s'[V]=\mathsf {\mathsf {eff}}(a)[V]\) if \(V\in \mathsf {vars}(\mathsf {\mathsf {eff}}(a))\) and \(s'[V]=s[V]\) otherwise. A public action can be defined using a mixture of public and private preconditions and effects. A private action can be defined only over the private variables. As we often consider the planning problem from the perspective of agent \(\upalpha _{i}\), we omit the index i.

We model all “other” agents as a single agent (the adversary), as all the agents can collude and combine their information in order to infer more. The public part of the problem \(\Pi \), which can be shared with the adversary, is denoted as a public projection. The public projection of a (partial) state s is \(s^{\vartriangleright }\), restricted only to variables in \(\mathcal {V}^{\mathsf {pub}}\), that is \(\mathsf {vars}(s^{\vartriangleright })=\mathsf {vars}(s)\cap \mathcal {V}^{\mathsf {pub}}\). We say that \(s,s'\) are publicly equivalent states if \(s^{\vartriangleright }=s'^{\vartriangleright }\). The public projection of action \(a\in \mathcal {O}^{\mathsf {pub}}\) is \(a^{\vartriangleright }=\left\langle \mathsf {\mathsf {pre}}(a)^{\vartriangleright },\mathsf {\mathsf {eff}}(a)^{\vartriangleright }\right\rangle \) and of action \(a'\in \mathcal {O}^{\mathsf {priv}}\) is an empty action \(\mathsf {noop}\). The public projection of \(\Pi \) is \(\Pi ^{\vartriangleright }=\left\langle \mathcal {V}^{\mathsf {pub}},\{a^{\vartriangleright }|a\in \mathcal {O}^{\mathsf {pub}}\},s_{I}^{\vartriangleright },s_{\star }^{\vartriangleright }\right\rangle .\)
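To make the formalism concrete, the following is a minimal sketch of (partial) states, actions, applicability, application and public projection, assuming a dictionary-based encoding; the variable and action names are illustrative only and are not part of the formal model.

```python
# Minimal sketch of MA-MPT (partial) states and actions; the dict-based
# encoding and all names are illustrative assumptions, not the formal model.

def applicable(action, state):
    """a is applicable in s iff s agrees with pre(a) on all defined variables."""
    pre, _ = action
    return all(state.get(v) == val for v, val in pre.items())

def apply_action(action, state):
    """a o s: effect values override, all other variables keep their value."""
    _, eff = action
    new_state = dict(state)
    new_state.update(eff)
    return new_state

def project(partial_state, public_vars):
    """Public projection of a (partial) state: restriction to public variables."""
    return {v: val for v, val in partial_state.items() if v in public_vars}

def project_action(action, public_vars):
    """Public projection of a public action; a private action projects to noop."""
    pre, eff = action
    return (project(pre, public_vars), project(eff, public_vars))

# Tiny usage example with one public and two private binary variables.
public_vars = {"in-crown-brno"}
state = {"in-crown-brno": True, "at-truck-brno": True, "in-crown-truck": False}
load = ({"in-crown-brno": True, "at-truck-brno": True},     # precondition
        {"in-crown-brno": False, "in-crown-truck": True})   # effect
print(applicable(load, state))                  # True
print(apply_action(load, state))                # crown moves into the truck
print(project_action(load, public_vars))        # private parts removed
```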

We define the solution to \(\Pi \) as follows. A sequence \(\uppi =(a_{1},...,a_{k})\) of actions from \(\mathcal {O}^{}\), s.t. \(a_{1}\) is applicable in \(s_{I}=s_{0}\) and for each \(1\le i\le k\), \(a_{i}\) is applicable in \(s_{i-1}\) and \(s_{i}=a_{i}\circ s_{i-1}\), is a local \(s_{k}\)-plan, where \(s_{k}\) is the resulting state. If \(s_{k}\) is a goal state, \(\uppi \) is a local plan, that is a local solution to \(\Pi \). A local plan contains actions of only one agent, be they public, private or projected. Note that the actions in \(\mathcal {O}^{\mathsf {proj}}\) are assumed to belong to the particular agent as well.

Such a local plan \(\uppi \) does not have to be a global solution to \(\mathcal {M}\), as the actions of other agents (\(\mathcal {O}^{\mathsf {proj}}\)) are used only as public projections and are missing the private preconditions and effects of the other agents. The public projection of \(\uppi \) is defined as \(\uppi ^{\vartriangleright }=(a_{1}^{\vartriangleright },...,a_{k}^{\vartriangleright })\) with the \(\mathsf {noop}\) actions omitted.

From the global perspective of \(\mathcal {M}\), a public plan \(\uppi ^{\vartriangleright }=(a_{1}^{\vartriangleright },...,a_{k}^{\vartriangleright })\) is a sequence of public projections of actions of various agents from \(\mathcal {A}\) such that the actions are sequentially applicable with respect to \(\mathcal {V}^{\mathsf {pub}}\) starting in \(s_{I}^{\vartriangleright }\) and the resulting state satisfies \(s_{\star }^{\vartriangleright }\). A public plan is \(\upalpha _i\)-extensible if, by replacing each \(a_{k'}^{\vartriangleright }\) s.t. \(a_{k'}\in \mathcal {O}^{\mathsf {pub}}_{i}\) by the respective \(a_{k'}\) and adding actions \(a_{k''}\in \mathcal {O}^{\mathsf {priv}}_{i}\) at the required places, we obtain a local plan (solution) to \(\Pi _{i}\). According to [16], a public plan \(\uppi ^{\vartriangleright }\) that is \(\upalpha _i\)-extensible by all \(\upalpha _{i}\in \mathcal {A}\) is a global solution to \(\mathcal {M}\).

The nature of MA-MPT planning allows for plans containing repeated action sequences (in the extreme, repeated infinitely many times). Such repetitions, however, do not provide a transformation to a state not yet visited. Therefore the length of meaningful global plans is bounded by the number of all possible states

$$\begin{aligned} \prod _{V\in \mathcal {V}^{\mathsf {pub}}\cup \bigcup _{i=1}^{|\mathcal {A}|}\mathcal {V}^{\mathsf {priv}}_i}{|\mathsf {dom}(V)|}. \end{aligned}$$
(1)

We define the sets of all meaningful global and local plans as \(\textsc {{sols}}(\mathcal {M})\) and \(\textsc {{sols}}(\Pi )\) respectively. The lengths of plans in both sets are limited by the presented bound on meaningful plans and therefore the sets are finite.

Since an agent can be required to repeatedly produce a particular value assignment of a variable, which is consumed (needed in a precondition and changed in an effect) by an action of another agent, we have to allow for meaningful repetitions in \(\textsc {{sols}}(\Pi )\). The length of a local plan with such repetitions is, however, still limited by the maximal length of its related global plan solving \(\mathcal {M}\). Therefore we use the same bound on length both for plans in \(\textsc {{sols}}(\mathcal {M})\) and in \(\textsc {{sols}}(\Pi )\) where \(\Pi \in \mathcal {M}\). \(\textsc {{sols}}_l(\Pi )\) will denote the set of local plans of length l; \(\textsc {{sols}}_{\le l}(\Pi )\) and \(\textsc {{sols}}_{> l}(\Pi )\) denote the sets of local plans of length no more than l and longer than l respectively. Finally, \(\textsc {{seqs}}(\Pi )\) will denote all possible action sequences (incl. non-plans) of the problem \(\Pi \).

2.1 Privacy

The first quantitative definition of privacy leakage was proposed in [18]. It was based on enumeration of all plans, which we use as the underlying principle for measuring privacy leakage in this work. However, we do not explicitly require enumeration of the particular plans, as in their looser form our bounds work with all possible action sequences. Note that the work in [18] is also not easily applicable to MA-Strips and MA-MPT.

The only rigorous definition of privacy for MA-Strips and MA-MPT so far was proposed in [10] and extended in [2]. The authors present two notions, weak and strong privacy preservation:

An algorithm is weak privacy-preserving if, during the whole run of the algorithm, the agent does not openly communicate private parts of the states, private actions or private parts of the public actions. In other words, the agent openly communicates only the information in \(\Pi ^{\vartriangleright }\). Even if it is not communicated, the adversary may deduce the existence and values of private variables, preconditions and effects from the (public) information communicated.

An algorithm is strong privacy-preserving if the adversary can deduce no information about private variables and their values or about private preconditions/effects of an action, beyond what can be deduced from the public projection \(\Pi ^{\vartriangleright }\) and the public projection of the solution plan \(\uppi ^{\vartriangleright }\).

2.2 Secure Computation

In general, any function can be computed securely [1, 20, 21]; however, it is not known how to encode the whole planning process into one function [14]. In this contribution, we focus on the narrower problem of private set intersection (PSI), where each agent has a private set of numbers and the agents want to securely compute the intersection of their private sets while not disclosing any numbers which are not in the intersection. The ideal PSI supposes that no information beyond the result is transferred between the agents [11].

Ideal PSI can be solved with a trusted third party which receives both private sets, computes the intersection, and sends it back to the agents. As long as the third party is honest, the computation is correct and no information leaks.
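As a minimal sketch, the trusted-third-party variant of ideal PSI amounts to a plain set intersection computed by the third party; the function below is illustrative, not a protocol specification.

```python
def trusted_third_party_psi(private_sets):
    """Ideal PSI via a trusted third party: it receives the private sets,
    computes the intersection and returns only the intersection to the agents."""
    sets = [set(s) for s in private_sets]
    result = sets[0]
    for s in sets[1:]:
        result &= s
    return result

print(trusted_third_party_psi([{3, 7, 42}, {7, 42, 99}]))  # {42, 7}
```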

In the literature (e.g., [6, 11]), we can find several approaches to solving ideal PSI without a trusted third party. The presented solutions are based on several computational hardness assumptions, e.g., intractability of factorization of large numbers, the Diffie-Hellman assumption [4], etc. All these assumptions break when an agent has access to unlimited computational power; therefore all the results hold under the assumption that \(P \ne NP\), in other words under the computational intractability of breaking PSI.
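To illustrate one of the cryptographic constructions, the following is a toy sketch of a Diffie-Hellman-style PSI based on commutative exponentiation modulo a prime; the chosen prime, the hashing into the group and the absence of subgroup checks are simplifying assumptions, not a secure real-world implementation.

```python
import hashlib
import secrets

# Toy Diffie-Hellman-style PSI sketch: each agent blinds the hashed elements
# with its secret exponent; since exponentiation commutes, the doubly blinded
# values match exactly for elements in the intersection.
P = 2**127 - 1  # a Mersenne prime used as the modulus of the toy group

def to_group(element):
    digest = hashlib.sha256(str(element).encode()).digest()
    return int.from_bytes(digest, "big") % (P - 2) + 2

def dh_psi(set_a, set_b):
    a = secrets.randbelow(P - 2) + 1   # secret exponent of the first agent
    b = secrets.randbelow(P - 2) + 1   # secret exponent of the second agent

    # Each agent blinds its own elements and sends the blinded values over.
    blinded_a = {x: pow(to_group(x), a, P) for x in set_a}
    blinded_b = [pow(to_group(y), b, P) for y in set_b]

    # The other agent re-blinds them: h(x)^(ab) == h(y)^(ba) iff x == y.
    double_a = {x: pow(v, b, P) for x, v in blinded_a.items()}
    double_b = {pow(v, a, P) for v in blinded_b}

    return {x for x, v in double_a.items() if v in double_b}

print(dh_psi({"plan-1", "plan-2", "plan-3"}, {"plan-2", "plan-3", "plan-4"}))
```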

3 \({\upvarepsilon }\)-Strong Privacy Preserving Multi-agent Planner

A multi-agent planner fulfilling the strong privacy requirement represents the lower bound on information exchanged between the agents. Agents do not leak any information about their internal problems and thus their cooperation cannot be effective [14]; nevertheless, a strong privacy preserving multi-agent planner is an important theoretical result that could lead to a better understanding of privacy preservation during multi-agent planning and consequently also to the creation of more privacy preserving planners.

In this contribution, we present a planner that is not strong privacy preserving but can be arbitrarily close to it. We focus on planning using projected plan-space search [13] and thus we will define the terms in that respect. In the following definitions and proofs we suppose that there are two semi-honest (honest but curious) agents \(\upalpha _-\) and \(\upalpha _+\). For simplicity of the presentation we will consider the perspective of the agent \(\upalpha _-\) trying to detect the private information of \(\upalpha _+\), but all results hold for both agents and also for a larger group of agents. Similarly to [2], we also assume that \(\mathcal {O}^{\mathsf {priv}}=\emptyset \). This assumption can be stated WLOG, as each sequence of private actions followed by a public action can be compiled into a single public action.

Definition 1

(Public Plan Acceptance). Public plan acceptance \(P(\uppi ^{\vartriangleright })\) is the probability, known to agent \(\upalpha _-\), that a plan \(\uppi ^{\vartriangleright }\) is \(\upalpha _+\)-extensible.

When the algorithm starts, \(\upalpha _-\) has some a priori information \(P^0(\uppi ^{\vartriangleright })\) about acceptance of a plan \(\uppi ^{\vartriangleright }\) by agent \(\upalpha _+\) (e.g., 0.5 probability of acceptance of each plan in the case when \(\upalpha _-\) knows nothing about \(\upalpha _+\)). At the end of the execution of the algorithm, this information changes to \(P^*(\uppi ^{\vartriangleright })\). Obviously, every agent knows that the solution public plan \(\uppi ^*\) the agents agreed on is extensible and thus accepted by every agent, i.e. \(P^*(\uppi ^*) = 1\). The difference between \(\upalpha _-\)’s a priori information and the final information represents the information which leaked from \(\upalpha _+\) during their communication. How certain an agent is about acceptance of a plan can be expressed as \(|1 - 2P(\uppi ^{\vartriangleright })|\), normalized to the interval \(\langle 0,1 \rangle \), where 0 means not knowing anything about acceptance of the plan (\(P(\uppi ^{\vartriangleright })= 0.5\)) and 1 means certainty (\(P(\uppi ^{\vartriangleright })= 1\) or \(P(\uppi ^{\vartriangleright })= 0\)).

Definition 2

(Leaked Information). The information leaked from the perspective of one agent during an execution of a multi-agent planner leading to a solution \(\uppi ^*\) is the sum of changes in certainty about acceptance of the plans from the beginning \(P^0(\uppi ^{\vartriangleright })\) to the end \(P^*(\uppi ^{\vartriangleright })\) of planning, excluding the solution plan \(\uppi ^*\):

$$\begin{aligned} \uplambda = \sum _{\uppi ^{\vartriangleright } \in \{\uppi ^{\vartriangleright }|\uppi \in \textsc {{sols}}(\Pi )\} \setminus \{\uppi ^*\}} \left| 1 - 2 P^*(\uppi ^{\vartriangleright })\right| - \left| 1 - 2 P^0(\uppi ^{\vartriangleright })\right| . \end{aligned}$$
(2)

As we do not assume the agents intentionally increase the uncertainty of other agents by sending invalid plans (they are honest but curious), the certainty about acceptance of a plan can only grow, i.e.

$$\begin{aligned} \left| 1 - 2 P^*(\uppi ^{\vartriangleright })\right| - \left| 1 - 2 P^0(\uppi ^{\vartriangleright })\right| \ge 0, \text { thus } \uplambda \ge 0. \end{aligned}$$
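A direct transcription of Definition 2, assuming the acceptance probabilities are available as dictionaries keyed by public plan projections; the numbers below are purely illustrative.

```python
def certainty(p):
    """|1 - 2P|: 0 means no knowledge (P = 0.5), 1 means certainty."""
    return abs(1 - 2 * p)

def leaked_information(prior, posterior, solution):
    """Lambda of Definition 2: growth of certainty summed over all plan
    projections except the solution plan the agents agreed on."""
    return sum(certainty(posterior[plan]) - certainty(prior[plan])
               for plan in posterior if plan != solution)

# Illustrative values: the adversary starts with no knowledge (0.5 everywhere)
# and ends up certain that projection "pi1" was refused by the other agent.
prior = {"pi1": 0.5, "pi2": 0.5, "pi*": 0.5}
posterior = {"pi1": 0.0, "pi2": 0.5, "pi*": 1.0}
print(leaked_information(prior, posterior, "pi*"))   # 1.0
```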

The definition of an algorithm’s leaked information allows us to formally define strong privacy of a projected plan-space planning algorithm.

Proposition 1

(Strong Privacy). A planning algorithm is strong privacy preserving if it assures \(\uplambda = 0\).

Proof

Any information leakage allowing deduction of private information (preconditions or effects) in an agent’s planning problem during planning affects the probability of acceptance or rejection of plan projections by other agents, as the preconditions and effects are the only principle deciding acceptance or rejection of a sequence of actions. Therefore \(\uplambda = 0\) holds if and only if no information in the sense of Definition 2 leaked.

Definition 3

(\({\upvarepsilon }\)-Strong Privacy Preserving Planner). For a given \({\upvarepsilon }> 0\), a planning algorithm is \({\upvarepsilon }\)-strong privacy preserving if it leaks acceptance or rejection of no more than \({\upvarepsilon }\) local plans, i.e. \(\uplambda \le {\upvarepsilon }\).

The high-level idea of our proposed planner (Algorithm 1) is based on a systematic generate-and-test principle, similar to the principle we recently proposed in [16]. Local plans \(\uppi \) are generated in parallel by all agents and their public projections \(\uppi ^{\vartriangleright }\) are distributively tested for projections \(\uppi ^*\) common to all agents. Since only acceptable local plans \(\uppi \), and thus public projections \(\uppi ^{\vartriangleright }\), are generated and tested, if a public projection common to all agents is found, it is guaranteed to be a global solution [16]. Provided that the distributed testing is done such that no information leaks, the only other point of possible information leakage is the fact that a global solution was not found for a particular set of generated local plans. In other words, knowing that \(\upalpha _+\) refused all possible solutions in a well defined set of plans tells \(\upalpha _-\) that the plans were refused because of some private preconditions of \(\upalpha _+\). Technically, the only parts of the algorithm where the agents can learn something about each other’s plans are therefore lines 9 and 11, where all agents know that a solution was not found for all plans generated by the iterations of the algorithm so far.

Let us assume the systematicity of the generate-and-test principle consists in testing incrementally longer plans. The length l of the agents’ local plans grows with each iteration of the main loop (lines 3, 4 and 13), therefore each distributed intersection (line 9) is done for generated plans of length \(\le l\). After each iteration which did not end at line 11, the agent \(\upalpha _-\) knows that the agent \(\upalpha _+\) refuses a local plan projection \(\uppi ^{\vartriangleright }\) generated by \(\upalpha _-\). This increases the certainty about refusal of \(\uppi ^{\vartriangleright }\) and therefore increases \(\uplambda \). Note that such a situation can be caused only by unfulfilled private preconditions of an action of the agent \(\upalpha _+\) which prevent \(\upalpha _+\) from generating \(\uppi ^{\vartriangleright }\). This principle is known as privately dependent actions; for more detail see [12].

The principle of iterations synchronized by length was used only for the sake of a clearer explanation. The argument, however, holds WLOG also for other iterative schemes. If the length of the generated plans is not synchronized by the iterations, all local plans of length l will eventually be generated by both agents \(\upalpha _+\) and \(\upalpha _-\). When a solution of length \(l+1\) is found, \(\upalpha _-\) can use the same reasoning as in the previous paragraph to deduce that \(\upalpha _+\) has some unfulfilled private preconditions in \(\uppi ^{\vartriangleright }\).

Generally, the same line of reasoning can even be used for any systematic plan generating algorithm, under the assumption that all agents know the other agents’ systematic plan generation algorithms. There always exists a point in the future when \(\upalpha _-\) knows that \(\upalpha _+\) must have already generated a plan which would have been a solution of the problem, and the algorithm would have ended. The only reason why this has not happened is that \(\upalpha _+\) has some unfulfilled private preconditions.

To fulfill the \({\upvarepsilon }\)-privacy requirement, Algorithm 1 supplements the systematically generated plans, which can leak information, with a sufficiently large number of randomly generated plans at line 7. As these plans are longer than the systematic ones, with a probability proportional to the number of such plans generated, they can shortcut the search and find a solution sooner than systematic plan generation alone would. The formula \(|\textsc {{sols}}_{> l}(\Pi _i)|(1-\frac{{\upvarepsilon }}{|\textsc {{sols}}_{\le l}(\Pi _i)|})\) for the number of shortcut plans k will be explained later as part of the \({\upvarepsilon }\)-privacy proof.

In summary, all agents sequentially generate solutions to their local problems \(\Pi _i\) at line 5. The systematic local plans are supplemented by longer randomly generated local plans at line 7. Then the agents create public plans by making public projections \(\uppi ^{\vartriangleright }\) of their generated solutions. The created public plans \(\uppi ^{\vartriangleright }\) are then stored in a set \({\Phi }_i\). As the plans \(\uppi \) are local solutions, they are \(\upalpha _i\)-extensible. The agents continuously check whether there are some plans in the intersection of these sets over all agents. It is important to compute the intersection without disclosing any information about plans which do not belong to this intersection. Plans in the intersection are guaranteed to be \(\upalpha _i\)-extensible by all agents and thus form global solutions. If at least one global solution is found in \({\Phi }\), the algorithm ends at line 11. The algorithm ends for all agents in the same iteration, as the secure intersection is done distributively by all agents for all agents. Therefore the termination condition at line 10 is evaluated by all agents equally.

Algorithm 1. \(\mathtt {SecureMAPlanner}()\) (pseudocode not reproduced here).
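Since the pseudocode figure is not reproduced here, the following self-contained toy sketch mirrors the generate-and-test loop described in the text. The hard-coded plan sets, the deterministic selection of "shortcut" plans and the plain set intersection (standing in for ideal PSI) are all illustrative assumptions, not the authors' implementation, and the Python lines do not correspond one-to-one to the numbered pseudocode lines.

```python
def plans_up_to(by_len, lo, hi):
    """Union of the projected-plan sets for lengths lo..hi (inclusive)."""
    out = set()
    for length in range(lo, hi + 1):
        out |= by_len.get(length, set())
    return out

def secure_ma_planner(agents, epsilon, max_len):
    """Toy version of the SecureMAPlanner loop: systematic plans of growing
    length l plus k shortcut plans per agent, intersected after each round."""
    budget = {name: float(epsilon) for name in agents}   # per-agent epsilon
    published = {name: set() for name in agents}         # the sets Phi_i
    for l in range(1, max_len + 1):
        for name, by_len in agents.items():
            sols_le = plans_up_to(by_len, 1, l)            # systematic plans
            sols_gt = plans_up_to(by_len, l + 1, max_len)  # longer plans
            # Eq. (6): shortcut plans needed to keep leakage under budget.
            k = len(sols_gt) * (1 - budget[name] / max(1, len(sols_le)))
            k = max(0, int(round(k)))
            budget[name] = max(0.0, budget[name] - k)      # epsilon <- epsilon - k
            shortcuts = set(sorted(sols_gt)[:k])           # "random" longer plans
            published[name] |= sols_le | shortcuts
        # In the real planner this step is an ideal PSI and leaks nothing
        # about the plans outside the intersection.
        common = set.intersection(*published.values())
        if common:
            return common, l
    return set(), None

# Toy data: projection "S" is the only projection common to both agents.
agents = {
    "plane": {1: {"A"}, 2: {"B", "C"}, 3: {"S"}},
    "truck": {1: {"D"}, 2: {"E"}, 3: {"S"}},
}
print(secure_ma_planner(agents, epsilon=100, max_len=3))  # found at l = 3
print(secure_ma_planner(agents, epsilon=0, max_len=3))    # found at l = 1
```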

The description of the planning algorithm is followed by proofs of its soundness (a result of the algorithm is always a correct solution), completeness (if a planning problem has a solution, it is returned by the algorithm) and the assurance that no more than the required \({\upvarepsilon }\) of information leaks.

Theorem 1

(Soundness and Completeness). Algorithm SecureMAPlanner is sound and complete, under the assumption that the systematic plan generation procedure (line 5) is complete.

Proof

(Soundness) Every public plan returned by the algorithm is \(\upalpha _i\)-extensible by every agent, and thus it can be extended by all agents to a valid global solution (Lemma 1 in [16]).

(Completeness) Since there is only a finite number of different plans of length at most l, all plans are eventually (in finite time) added to the plan set \({\Phi }_i\), under the assumption that the underlying plan generation procedure for the local solutions (line 5) is complete. The longest possible solution is finite by Eq. (1), thus \(\mathtt {SecureMAPlanner}()\) with a systematic local planner ends in finite time when \(\mathcal {M}\) has a solution.

Theorem 2

(\({\upvarepsilon }\)-Strong Privacy). Algorithm \(\mathtt {SecureMAPlanner}()\) is \({\upvarepsilon }\)-strong privacy preserving when ideal PSI is used for the secure plan projection intersection (line 9).

Proof

The only points in the algorithm where the agents communicate are the distributed intersection of the public projections (line 9) and, implicitly, the synchronized termination (lines 10–12).

To ensure privacy of the first point, both agents encode the public projections of their plans into a set of numbers using the same encoding. Then they just need to compare two sets of numbers representing their sets of plausible public plans; in other words, they need to compute ideal PSI [6, 11]. No private information leaks within ideal PSI, therefore no private information leaks during the distributed intersection.
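One possible way to encode public plan projections into numbers with a common encoding, as assumed above, is canonical serialization followed by hashing; the serialization format below is an illustrative assumption.

```python
import hashlib

def encode_projection(projected_plan):
    """Map a projected plan (sequence of public action names) to a number;
    both agents must use exactly the same canonical encoding."""
    canonical = "|".join(projected_plan)
    return int.from_bytes(hashlib.sha256(canonical.encode()).digest(), "big")

pi = ("unload(plane,brno)", "load(truck,brno)", "unload(truck,ostrava)")
print(encode_projection(pi))
```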

There can, however, be private information leakage when the algorithm continues for several iterations, i.e. the algorithm does not terminate (the second point). When the agent \(\upalpha _-\) finds out that some set of plans is unacceptable to the agent \(\upalpha _+\) (which is the only reason why the algorithm has to continue with another iteration), private information leaks simply by the growth of certainty by Definition 2.

As \(\upalpha _+\) does not know how many plans \(\upalpha _-\) has generated and thus how many plans were refused, if we want to upper-bound the possible leaked information, \(\upalpha _+\) has to consider that all possible plans of length l were generated by \(\upalpha _-\) and refused by \(\upalpha _+\), i.e. \(\uplambda _{l+1}-\uplambda _{l} \le |\textsc {{sols}}_l(\Pi _i)|\). Such a situation reflects the maximal possible growth in the certainty about acceptance of possible plans. Summing over all plan lengths we get \(\uplambda _l \le \sum \limits _{1 \le l' \le l}|\textsc {{sols}}_{l'}(\Pi _i)| = |\textsc {{sols}}_{\le l}(\Pi _i)|\).

To limit the leakage by shortcutting to the solution prematurely using the randomly selected plans, it has to happen that all agents generate a global solution by chance (line 7) sooner than systematically in its iteration by the length l. As the best we can get is an upper-bound (the number of real solutions is not known beforehand) on the needed number of randomly generated plans k, we can assume that there is only one solution plan. The chance of randomly choosing one particular plan by a selection of k random plans from all \(|\textsc {{sols}}_{>l}|=n\) solutions longer than the current iteration length l is

$$\begin{aligned} \frac{{n \atopwithdelims ()k} - {n-1 \atopwithdelims ()k}}{{n \atopwithdelims ()k}} = \frac{{n-1 \atopwithdelims ()k-1}}{{n \atopwithdelims ()k}} = \frac{k}{n} = \frac{k}{|\textsc {{sols}}_{>l}(\Pi _i)|}. \end{aligned}$$
(3)

The chance of not selecting the solution plan is simply \(1-\frac{k}{|\textsc {{sols}}_{>l}(\Pi _i)|}\), which, together with the upper-bound on the certainty about acceptance of possible plans, gives us a parameterized tighter upper-bound on the leaked information \(\uplambda _l \le |\textsc {{sols}}_{\le l}(\Pi _i)|(1-\frac{k}{|\textsc {{sols}}_{>l}(\Pi _i)|}) \le |\textsc {{sols}}_{\le l}(\Pi _i)|\). Note that each agent has the same chance to select the one common solution, therefore the chance does not decrease with an increasing number of agents.
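A quick numeric sanity check of Eq. (3) for illustrative values of n and k, using the standard binomial coefficient:

```python
from math import comb

# C(n,k) - C(n-1,k) counts the k-subsets containing one fixed plan, hence the
# selection probability (C(n,k) - C(n-1,k)) / C(n,k) = C(n-1,k-1)/C(n,k) = k/n.
n, k = 30, 7
print((comb(n, k) - comb(n - 1, k)) / comb(n, k))   # 0.2333...
print(comb(n - 1, k - 1) / comb(n, k), k / n)       # both 0.2333...
```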

By Definition 3, \(\uplambda \le {\upvarepsilon }\) has to hold for an \({\upvarepsilon }\)-privacy preserving planner, which means that for the last iteration with a systematically found global plan, \(\uplambda _{|\uppi ^*|} \le {\upvarepsilon }\). As the length of the systematic solution plan \(|\uppi ^*|\) is not known beforehand, we have to heuristically estimate it. As \(l \le |\uppi ^*|\) holds for all iterations of the algorithm, we can modify the formula and derive the upper-bound on the number of shortcut plans needed to fulfill \({\upvarepsilon }\):

$$\begin{aligned} \uplambda _{l}\le & {} {\upvarepsilon },\end{aligned}$$
(4)
$$\begin{aligned} |\textsc {{sols}}_{\le l}(\Pi _i)|(1-\frac{k}{|\textsc {{sols}}_{>l}(\Pi _i)|})\le & {} {\upvarepsilon }, \end{aligned}$$
(5)
$$\begin{aligned} |\textsc {{sols}}_{> l}(\Pi _i)|(1-\frac{{\upvarepsilon }}{|\textsc {{sols}}_{\le l}(\Pi _i)|})\le & {} k. \end{aligned}$$
(6)

As such k is only an estimate assuming each iteration is the last one, we have to decrease the allowed leakage in each iteration by \({\upvarepsilon }\leftarrow {\upvarepsilon }- k\) at line 6.

The parameter \({\upvarepsilon }\), and consequently also k, acts as a trade-off parameter between security and efficiency. If the agent “randomly” selects all its plans \(\textsc {{sols}}_{> l}(\Pi _i)\), then no information about refused plans can leak, as it is assured that planning finds the (at least one existing) solution in the first iteration. Thus it implies strong privacy, i.e. for \(k = |\textsc {{sols}}_{>l}(\Pi _i)|\) we get \({\upvarepsilon }\ge 0 \ge \uplambda \) from Eq. 5 and Definition 3. Conversely, if we plan only systematically, \(k = 0\), the leakage is upper-bounded by \({\upvarepsilon }\ge |\textsc {{sols}}_{\le l}(\Pi _i)| \ge \uplambda \).

In the previous cases, \(\textsc {{sols}}_{>l}\) can be replaced by \(\textsc {{seqs}}_{>l}\), as there cannot be fewer sequences than solutions and we are dealing with upper bounds. However, we kept the tighter \(\textsc {{sols}}_{>l}\) in the proof and discussion. The possible issue with \(|\textsc {{sols}}_{>l}|\) is that it can be hard to evaluate efficiently, which is not a problem with \(\textsc {{seqs}}_{>l}\). The drawback of \(\textsc {{seqs}}_{>l}\) is their exponentially larger number, and therefore an exponential “looseness” of the bounds and a need for a possibly exponentially larger k.

The leakage bounds are illustrated in Fig. 1 for an example planning problem. The problem has 2, 4, 8, 16 and 32 solutions (in the set \(\textsc {{sols}}_{\le l}(\Pi _i)\)) for plan lengths l = 1, 2, 3, 4 and 5 respectively. This gives us 30, 28, 24, 16 and 0 solutions in the set \(\textsc {{sols}}_{> l}(\Pi _i)\), again for lengths l = 1, 2, 3, 4 and 5. The lines depict the upper-bound of the leakage \(\uplambda \) for different numbers of shortcut plans k. For example, in the first iteration, only the two possibly refused plans can leak information, therefore even when \(k = 0\), i.e. without any shortcut plans, \(\uplambda _1 = 2\). Conversely, to assure that the planning process ends in the first iteration and does not leak any information, k has to equal the number of remaining plans for \(>l\), which is 30; then the maximal leakage is ensured to be 0. Based on the changing numbers of already generated and still to be generated plans, the ratio changes in the following iterations.
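The numbers behind Fig. 1 can be reproduced directly from the bound \(\uplambda _l \le |\textsc {{sols}}_{\le l}(\Pi _i)|(1-\frac{k}{|\textsc {{sols}}_{>l}(\Pi _i)|})\); the following short computation uses the solution counts given above and a few illustrative values of k.

```python
# Leakage upper-bounds for the example of Fig. 1.
sols_le = {1: 2, 2: 4, 3: 8, 4: 16, 5: 32}   # cumulative solution counts
total = 32                                    # all solutions of the problem

for l in range(1, 5):                         # at l = 5 no longer plans remain
    n_le, n_gt = sols_le[l], total - sols_le[l]
    for k in (0, n_gt // 2, n_gt):            # no, half, and all shortcut plans
        bound = n_le * (1 - k / n_gt)
        print(f"l={l}, k={k:2d}: leakage bound = {bound}")
```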

Fig. 1. Upper-bounds of the leakage of the planning algorithm for different numbers of required shortcut plans, across the iterations of the presented example planning problem.

The points where the upper-bounds equal 0 represent the numbers of shortcut plans needed to provide strong privacy (Proposition 1). Fig. 2 depicts the numbers k of shortcut plans for the particular iterations of the example planning problem. Although the example is only a small and synthetic planning problem, the principles and trends of the privacy bounds are general.

Fig. 2. Numbers of shortcut plans k assuring that no private information leaks, i.e. a privacy preserving run of the planning algorithm, in the presented example planning problem.

To increase the efficiency of the intersections, the principle proposed in the PSM planner [16] can be used. Each agent stores the generated plans in the form of planning state machines, a special version of finite state machines. An algorithm which can be used for secure intersection of planning state machines was presented in [5]. In the case of a different representation of public plans, the more general approach of generic secure computation can be applied [1, 20, 21].

4 Logistics Example

Let us consider a simple logistics scenario to demonstrate how private information can leak for \(k=0\) and how the leakage decreases with larger values of k.

In this scenario, there are two transport vehicles (\(\texttt {plane} \) and \(\texttt {truck} \)) operating in three locations (\(\texttt {prague} \), \(\texttt {brno} \), and \(\texttt {ostrava} \)). The \(\texttt {plane} \) can travel from \(\texttt {prague} \) to \(\texttt {brno} \) and back, while the \(\texttt {truck} \) provides a connection between \(\texttt {brno} \) and \(\texttt {ostrava} \). The goal is to transport the \(\texttt {crown} \) from \(\texttt {prague} \) to \(\texttt {ostrava} \).

This problem can be expressed using MA-Strips as follows. Actions

$$\begin{aligned} \texttt {fly} ({{\texttt {\textit{loc}}}}_1,{{\texttt {\textit{loc}}}}_2) \mathrm {\; and \;} \texttt {drive} ({{\texttt {\textit{loc}}}}_1,{{\texttt {\textit{loc}}}}_2) \end{aligned}$$

describe movement of \(\texttt {plane} \) and \(\texttt {truck} \) respectively. Actions \(\texttt {load} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}})\) and \(\texttt {unload} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}})\) describe loading and unloading of \(\texttt {crown} \) by a given vehicle at a given location.

We define two agents \({{\texttt {\textit{Plane}}}}\) and \({{\texttt {\textit{Truck}}}}\). The agents are defined by sets of executable actions as follows

The aforementioned actions are defined using binary variables \(\texttt {at} ({{\texttt {\textit{veh}}}},{{\texttt {\textit{loc}}}})\in \{\mathsf {\mathsf {true}}, \mathsf {false}\}\) to describe possible vehicle locations and binary variables \(\texttt {in} (\texttt {crown} ,{{\texttt {\textit{loc}}}})\) and \(\texttt {in} (\texttt {crown} ,{{\texttt {\textit{veh}}}})\) to describe positions of \(\texttt {crown} \). A variable is assigned the value \(\mathsf {\mathsf {true}}\) if the fact it describes holds. E.g. \(\texttt {in} (\texttt {crown} ,\texttt {plane} ) = \mathsf {\mathsf {true}}\) represents the fact that \(\texttt {crown} \) is in \(\texttt {plane} \). We omit action names in examples when no confusion can arise. For example, we have the following actions:

The initial state and the goal are given as follows:

$$\begin{aligned} \begin{array}{ll} s_{I}= &{} \{ \texttt {at} (\texttt {plane} ,\texttt {prague} ) = \mathsf {\mathsf {true}},\; \texttt {at} (\texttt {truck} ,\texttt {brno} ) = \mathsf {\mathsf {true}}, \\ &{} \texttt {in} (\texttt {crown} ,\texttt {prague} ) = \mathsf {\mathsf {true}}\} \\ s_{\star }= &{} \{ \texttt {in} (\texttt {crown} ,\texttt {ostrava} ) = \mathsf {\mathsf {true}}\} \end{array} \end{aligned}$$

All other variables not present in the initial state \(s_{I}\) are \(\mathsf {false}\).

In our example, the only variable shared by the two agents is \(\texttt {in} (\texttt {crown} ,\texttt {brno} )\) and, as required by \(\mathsf {vars}(s_{\star })\subseteq \mathcal {V}^{\mathsf {pub}}\) (see Sect. 2), the goal \(\texttt {in} (\texttt {crown} ,\texttt {ostrava} )\). We have the following variable classification:

$$\begin{aligned} \begin{array}{rl} \mathcal {V}^{\mathsf {pub}} = &{} \{ \texttt {in} (\texttt {crown} ,\texttt {brno} ), \\ &{} \texttt {in} (\texttt {crown} ,\texttt {ostrava} ) \}, \\ \mathcal {V}^{\mathsf {priv}}_{{{\texttt {\textit{Plane}}}}} = &{} \{ \texttt {at} (\texttt {plane} ,\texttt {prague} ), \texttt {at} (\texttt {plane} ,\texttt {brno} ), \\ &{} \texttt {in} (\texttt {crown} ,\texttt {prague} ), \texttt {in} (\texttt {crown} ,\texttt {plane} ) \}, \\ \mathcal {V}^{\mathsf {priv}}_{{{\texttt {\textit{Truck}}}}} = &{} \{ \texttt {at} (\texttt {truck} ,\texttt {brno} ), \texttt {at} (\texttt {truck} ,\texttt {ostrava} ), \\ &{} \texttt {in} (\texttt {crown} ,\texttt {truck} ) \}. \end{array} \end{aligned}$$

The actions and their projections important for the following discussion are:

All the actions arranging vehicle movements are private. Only the actions providing package treatment at public locations (\(\texttt {brno} \), \(\texttt {ostrava} \)) are public:

$$\begin{aligned} \begin{array}{rl} \mathcal {O}^{\mathsf {pub}}_{{{\texttt {\textit{Truck}}}}} = \{\ \texttt {load} (\texttt {truck} ,\texttt {brno} ), \texttt {unload} (\texttt {truck} ,\texttt {brno} ), \\ \texttt {load} (\texttt {truck} ,\texttt {ostrava} ), \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \},\\ \\ \mathcal {O}^{\mathsf {pub}}_{{{\texttt {\textit{Plane}}}}} = \{ \texttt {load} (\texttt {plane} ,\texttt {brno} ), \texttt {unload} (\texttt {plane} ,\texttt {brno} ) \}. \end{array} \end{aligned}$$

The agent \({{\texttt {\textit{Plane}}}}\) generates possible plans using a systematic plan generation algorithm (e.g. Best-First Search) and thus it sequentially generates the following public plans:

$$\begin{aligned} \begin{array}{ll} \uppi ^{{\texttt {\textit{Plane}}}}_1 = \langle &{} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \rangle , l = 1 \\ \uppi ^{{\texttt {\textit{Plane}}}}_2 = \langle &{} \texttt {unload} (\texttt {plane} ,\texttt {brno} ),\ \\ &{} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\ \rangle , l = 2, \\ \uppi ^{{\texttt {\textit{Plane}}}}_3 = \langle &{} \texttt {unload} (\texttt {truck} ,\texttt {brno} ),\ \\ &{} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\ \rangle , l = 2 \\ \uppi ^{{\texttt {\textit{Plane}}}}_4 = \langle &{} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ),\ \\ &{} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\ \rangle , l = 2 \\ \ldots \\ \uppi ^{{\texttt {\textit{Plane}}}}_n = \langle &{} \texttt {unload} (\texttt {plane} ,\texttt {brno} ),\ \texttt {load} (\texttt {truck} ,\texttt {brno} ),\ \\ &{} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \ \rangle , l = 3. \\ \end{array} \end{aligned}$$

Note that any locally valid sequence of actions containing the action

$$\begin{aligned} \texttt {unload} (\texttt {truck} ,\texttt {ostrava} ) \end{aligned}$$

seems to be a valid solution to the \({{\texttt {\textit{Plane}}}}\) agent. In this example, \(\uppi ^{{\texttt {\textit{Plane}}}}_n\) is the first plan extensible to a global solution by both \({{\texttt {\textit{Plane}}}}\) and \({{\texttt {\textit{Truck}}}}\) that is generated by the systematic planning process.

Similarly, agent \({{\texttt {\textit{Truck}}}}\) sequentially generates the following public plans:

We can see that \({{\texttt {\textit{Truck}}}}\) generates an extensible plan as its first one, while \({{\texttt {\textit{Plane}}}}\) generates an equivalent solution only as the n-th plan. Thus, once both agents agree on a solution, agent \({{\texttt {\textit{Plane}}}}\) can try to deduce something about \({{\texttt {\textit{Truck}}}}\)’s private information. Since all plans \(\uppi ^{{\texttt {\textit{Plane}}}}_1, \dots , \uppi ^{{\texttt {\textit{Plane}}}}_4\) are strictly shorter than the accepted solution \(\uppi ^{{\texttt {\textit{Plane}}}}_n\) and they were not generated by \({{\texttt {\textit{Truck}}}}\), it implies that these plans are not acceptable by \({{\texttt {\textit{Truck}}}}\), i.e. for example \(P^*(\uppi ^{{\texttt {\textit{Plane}}}}_1) = 0\). More specifically, \({{\texttt {\textit{Plane}}}}\) can deduce the following about \({{\texttt {\textit{Truck}}}}\)’s private information:

  • The action \(\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\) has to contain some private precondition, otherwise \(\uppi ^{{\texttt {\textit{Plane}}}}_1\) would have been generated also by \({{\texttt {\textit{Truck}}}}\) before \(\uppi ^{{\texttt {\textit{Truck}}}}_1\), because it is shorter.

  • Private preconditions of \(\texttt {unload} (\texttt {truck} ,\texttt {ostrava} )\) certainly depend on a private fact (possibly indirectly) generated by \(\texttt {load} (\texttt {truck} ,\texttt {brno} )\), otherwise \(\uppi ^{{\texttt {\textit{Plane}}}}_2\) would have been generated before \(\uppi ^{{\texttt {\textit{Truck}}}}_1\).

In this example, we have shown how systematic generation of plans can cause private information leakage. Let us now consider the case when both agents add one shortcut plan after each systematically generated one, i.e. \(k=1\). For simplicity, we will consider the previous sequence of plans, where \(\uppi ^{{\texttt {\textit{Plane}}}}_n\) is selected as the shortcut plan in the first iteration for \(l = 1\) by both agents. In such a case, the amount of leaked information is smaller by Eq. (5). If there is only one solution \(\uppi ^{{\texttt {\textit{Plane}}}}_n\), the leakage will be 0 and the agents will not be able to deduce any private information about each other. If the shortcut plan \(\uppi ^{{\texttt {\textit{Plane}}}}_n\) is added only in the iteration for length \(l = 3\), \({{\texttt {\textit{Plane}}}}\) can deduce that \(P^*(\uppi ^{{\texttt {\textit{Plane}}}}_1) = 0\), but cannot deduce the same about other plans. \({{\texttt {\textit{Plane}}}}\) could deduce that \({{\texttt {\textit{Truck}}}}\) accepts no plan of length 2 only once it is sure that all of them have been systematically generated. But thanks to adding the shortcut plan, the solution can be found sooner.

Obviously, \(k=1\) decreases the leaked information only minimally. To decrease the private information leakage significantly, k has to grow by Eq. (5) towards \(|\textsc {{sols}}_{> l}(\Pi _{{\texttt {\textit{Plane}}}})|\), as we showed in the proof of Theorem 2.

5 Code Lock Example and General Privacy Leakage Upper-Bound

The second example is designed to show the successive leaking of information by learning about refused sequences of actions. There is a combination code lock and two agents. The agent \(\upalpha _-\) is unlocking the lock with the help of the other agent \(\upalpha _+\), which knows the combination. The lock requires a correct sequence of n pressed buttons reachable by \(\upalpha _+\) (each button can be pressed only once), finished by pressing two unlock buttons, each reachable only by one of the agents. Note that, strictly speaking, if only one combination is correct, no private information would leak, as Definition 2 excludes the solution plan \(\uppi ^*\). We could modify the example such that there are more correct solutions and \(\upalpha _-\) is trying to deduce all of them; however, to simplify the later discussion, we will stick to one solution, which we assume to be secret.

This assumption is not unrealistic, as in reality the press actions would be private; however, by the requirement of privacy preserving MA-MPT planning stated above, all actions, including press, are public.

The binary variables of the problem are

$$\begin{aligned} \begin{array}{ll} \mathcal {V}^{\mathsf {pub}} &{} = \{ \mathtt {unlocked}_+, \mathtt {unlocked}_- \}, \\ \mathcal {V}^{\mathsf {priv}}_{\upalpha _+} &{} = \{ \mathtt {pressed}_0, \mathtt {pressed}_1, \ldots , \mathtt {pressed}_n \}, \\ \mathcal {V}^{\mathsf {priv}}_{\upalpha _-} &{} = \emptyset . \end{array} \end{aligned}$$

The two public unlocked variables describe whether the two agents successfully unlocked their sides of the lock. For simplicity, we assume only \(\upalpha _+\) needs to enter the correct sequence, which allows it to unlock its side. The agent \(\upalpha _-\) attempts to deduce the constraints among the presses during the process. Provided that the agent \(\upalpha _-\) had a similar combination as \(\upalpha _+\), the example would work symmetrically for both agents, or even for more agents unlocking the lock in coordination. The successfully entered steps of the combination are represented by the private pressed variables.

In the initial state, all variables are set to \(\mathsf {false}\) with the exception of \(\mathtt {pressed}_0\), which allows pressing the first correct button. The goal of the problem is to unlock both sides of the lock:

$$\begin{aligned} \begin{array}{ll} s_{I}= \{ \mathtt {pressed}_0 = \mathsf {\mathsf {true}}\},\\ s_{\star }= \{ \mathtt {unlocked}_+ = \mathsf {\mathsf {true}}, \mathtt {unlocked}_- = \mathsf {\mathsf {true}}\}. \end{array} \end{aligned}$$

The actions of the problem are the two actions unlocking the lock and the actions representing pressing the buttons. All actions are public (following the assumption of no private actions); however, the actions of \(\upalpha _+\) have private preconditions and effects constraining them to the correct unlock sequence. The action sets consist of:

$$\begin{aligned} \begin{array}{rl} \mathcal {O}^{\mathsf {pub}}_{\upalpha _+} &{} = \{ \mathtt {unlock}_+, \mathtt {press}_1, \mathtt {press}_2, \ldots , \mathtt {press}_n \}, \\ \mathcal {O}^{\mathsf {pub}}_{\upalpha _-} &{} = \{ \mathtt {unlock}_- \}. \end{array} \end{aligned}$$

The action \(\mathtt {unlock}_-\) (and its projection) has no preconditions and the sole effect \(\mathtt {unlocked}_- \leftarrow \mathsf {\mathsf {true}}\), which is required by the goal:

$$\begin{aligned} \begin{array}{rcr} \mathtt {unlock}_- = \mathtt {unlock}_-^{\vartriangleright } = \langle \mathsf {\mathsf {pre}}(.) = \emptyset , \mathsf {\mathsf {eff}}(.) = \{ \mathtt {unlocked}_- \leftarrow \mathsf {\mathsf {true}}\}\rangle . &{} \\ \end{array} \end{aligned}$$

Let the unlocking sequence of button indices be described by a mapping \(u(i) \mapsto i'\) which for each step i returns the next button index \(i'\) to be pressed. Then each button pressing action is defined as:

$$\begin{aligned} \begin{array}{rcr} \mathtt {press}_{u(i)} = \langle \mathsf {\mathsf {pre}}(.) = &{} \{ \mathtt {pressed}_{i-1} = \mathsf {\mathsf {true}}\}, &{} \\ \quad \mathsf {\mathsf {eff}}(.) = &{} \{ \mathtt {pressed}_{i} \leftarrow \mathsf {\mathsf {true}}\}\rangle . &{} \\ \mathtt {unlock}_+ = \langle \mathsf {\mathsf {pre}}(.) = &{} \{ \mathtt {pressed}_n = \mathsf {\mathsf {true}}\}, &{} \\ \quad \mathsf {\mathsf {eff}}(.) = &{} \{ \mathtt {unlocked}_+ \leftarrow \mathsf {\mathsf {true}}\}\rangle . &{} \\ \end{array} \end{aligned}$$

For example, an \(n=3\) step unlocking sequence 2, 3, 1 induces the following actions for the agent \(\upalpha _+\):

$$\begin{aligned} \begin{array}{rcr} \mathtt {press}_2 = \langle \mathsf {\mathsf {pre}}(.) = &{} \{ \mathtt {pressed}_0 = \mathsf {\mathsf {true}}\}, &{} \\ \quad \mathsf {\mathsf {eff}}(.) = &{} \{ \mathtt {pressed}_1 \leftarrow \mathsf {\mathsf {true}}\}\rangle , &{} \\ \mathtt {press}_3 = \langle \mathsf {\mathsf {pre}}(.) = &{} \{ \mathtt {pressed}_1 = \mathsf {\mathsf {true}}\}, &{} \\ \quad \mathsf {\mathsf {eff}}(.) = &{} \{ \mathtt {pressed}_2 \leftarrow \mathsf {\mathsf {true}}\}\rangle , &{} \\ \mathtt {press}_1 = \langle \mathsf {\mathsf {pre}}(.) = &{} \{ \mathtt {pressed}_2 = \mathsf {\mathsf {true}}\}, &{} \\ \quad \mathsf {\mathsf {eff}}(.) = &{} \{ \mathtt {pressed}_3 \leftarrow \mathsf {\mathsf {true}}\}\rangle . &{} \\ \mathtt {unlock}_+ = \langle \mathsf {\mathsf {pre}}(.) = &{} \{ \mathtt {pressed}_3 = \mathsf {\mathsf {true}}\}, &{} \\ \quad \mathsf {\mathsf {eff}}(.) = &{} \{ \mathtt {unlocked}_+ \leftarrow \mathsf {\mathsf {true}}\}\rangle . &{} \\ \end{array} \end{aligned}$$
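A small sketch that generates these actions of \(\upalpha _+\) from the mapping u, reusing the dict-based action encoding from the sketch in Sect. 2; for u = (2, 3, 1) it reproduces the actions listed above.

```python
def code_lock_actions(u):
    """Build the press/unlock actions for an unlocking sequence u,
    where u[i-1] is the button pressed at step i."""
    actions = {}
    for step, button in enumerate(u, start=1):
        actions[f"press_{button}"] = (
            {f"pressed_{step - 1}": True},   # private precondition
            {f"pressed_{step}": True},       # private effect
        )
    actions["unlock_+"] = ({f"pressed_{len(u)}": True}, {"unlocked_+": True})
    return actions

for name, (pre, eff) in code_lock_actions((2, 3, 1)).items():
    print(name, pre, eff)   # press_2, press_3, press_1, unlock_+ as above
```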

The projections of all \(\mathtt {press}\) actions have no preconditions and effects (from the perspective of \(\upalpha _-\) they are effectively \(\mathsf {noop}\)s with different action names), as the \(\mathtt {pressed}\) variables are private. The projection of \(\mathtt {unlock}_+\) has only its goal effect:

$$\begin{aligned} \begin{array}{rll} \mathtt {press}_1^{\vartriangleright }, \ldots , \mathtt {press}_n^{\vartriangleright } &{} = &{} \langle \mathsf {\mathsf {pre}}(.) = \emptyset , \mathsf {\mathsf {eff}}(.) = \emptyset \rangle = \mathsf {noop}, \\ \mathtt {unlock}_+^{\vartriangleright } &{} = &{} \langle \mathsf {\mathsf {pre}}(.) = \emptyset , \mathsf {\mathsf {eff}}(.) = \{ \mathtt {unlocked}_+ \leftarrow \mathsf {\mathsf {true}}\}\rangle . \\ \end{array} \end{aligned}$$

From the perspective of \(\upalpha _-\), any action sequence of \(\upalpha _+\) ending with \(\mathtt {unlock}_+\) is legitimate. On the contrary, only the correct sequence of \(\mathtt {press}\) actions followed by \(\mathtt {unlock}_+\) is valid from the perspective of \(\upalpha _+\).

Fig. 3. Hypothetical upper-bounds on information leakage in the Code Lock example problem for \(n = 2\), without decreasing \({\upvarepsilon }\) during the iterations by \({\upvarepsilon }\leftarrow {\upvarepsilon }- k\). For \(k=8\) the leakage is \(\uplambda = 0\). For \(l \ge 3\), the leakage assumes possibly longer plans of the problem (e.g., with more than two agents) than those limited by \(l \le n\).

In each iteration l, if a solution is not found, \(\upalpha _-\) eliminates all possible combinations of the code of length l. These eliminations represent the private variables \(\mathtt {pressed}\) in the form of preconditions and effects of actions of \(\upalpha _+\). Because of the problem formulation, the number of projected local plans of \(\upalpha _+\) from the perspective of \(\upalpha _-\) is:

$$\begin{aligned} \begin{array}{rll} n^1 + n^2 + \cdots + n^l + n^{l+1} + \cdots + n^{n-1} + n^n, \end{array} \end{aligned}$$

where n is the number of \(\mathtt {press}\) actions and l is the intermediate length of the solution. We can bound the sizes of \(\textsc {{sols}}_{\le l}\) and \(\textsc {{sols}}_{> l}\) using this sequence, with \(l \le n\) following from the count of meaningful plans in Eq. 1 and with the assumption that repetitions of the \(\mathtt {press}\) actions are allowed, as \(\upalpha _-\) cannot know otherwise:

$$\begin{aligned} |\textsc {{sols}}_{\le l}|\le & {} \sum _{i=1}^{l}n^i \le n^{l+1},\end{aligned}$$
(7)
$$\begin{aligned} |\textsc {{sols}}_{> l}|\le & {} \sum _{i=1}^{n-l}n^{l+i} \le n^{n+1}. \end{aligned}$$
(8)

Without the shortcut plans, \(k=0\), we get that \(n^{l+1} \le {\upvarepsilon }\) using Eqs. (5) and (7). This shows that the possible maximal leakage of information is exponentially bounded by the length of the plan, which is bounded by the number of actions in \(\Pi _+\). It is not surprising, though, that if we want a strong privacy preserving variant, \({\upvarepsilon }= 0\), we need to generate an exponential number of plans, \(n^{n+1}\), in the number of actions of the problem already in the first iteration, by Eqs. (6) and (8).

The exponential growth of information leakage in l in this example is illustrated in Fig. 3. The other exponential dependency is on the number of (\(\mathtt {press}\)) actions, i.e. on n.

As the Code Lock problem is designed such that there are no public dependencies among the actions (with the exception of the required goal conditions), it represents a planning problem with the maximal amount of private information and only one solution, as assumed in the proof of Theorem 2. Therefore the resulting exponential bounds on leakage in l and in n hold not only for this particular planning problem, but as a general upper bound on the privacy leakage in MA-Strips planning with n actions and k shortcut plans in the l-th iteration of the \(\mathtt {SecureMAPlanner}()\) algorithm (by Eqs. (5), (7) and (8)):

$$\begin{aligned} \uplambda _l \le n^{l+1}(1-\frac{k}{n^{n+1}}). \end{aligned}$$
(9)
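The general bound of Eq. (9) is easy to evaluate numerically; the values of n, l and k below are illustrative and match the n = 2 setting of Fig. 3.

```python
def leakage_bound(n, l, k):
    """Eq. (9): lambda_l <= n^(l+1) * (1 - k / n^(n+1))."""
    return n ** (l + 1) * (1 - k / n ** (n + 1))

n = 2                                    # two press actions, as in Fig. 3
for l in (1, 2):
    for k in (0, 4, n ** (n + 1)):       # k = n^(n+1) = 8 removes all leakage
        print(f"n={n}, l={l}, k={k}: bound = {leakage_bound(n, l, k)}")
```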

6 Conclusions

In this article, we have provided a refined variant of the privacy preserving multi-agent planning principle from the previous version of the paper [17]. The principle of the algorithm is an application of the private set intersection (PSI) algorithm to privacy preserving multi-agent planning, using an intersection of sets of plans. As the plans are generated as extensible to a global solution provided that all agents agree on a selection of such local plans, the soundness of the planning approach is ensured. As we showed in [17] and in the refined proof of Theorem 2 here, the intersection process can be made secure in one iteration by PSI, but some private information can leak during the iterative generation of the local plans, which is the only practical way to solve generally intractable planning problems. Over more iterations, plans which are extensible by some agents but not extensible by all agents can leak private information about private dependencies of actions within the plans. In other words, if an agent says the proposed solution can, from its perspective, be used as a solution to the planning problem, but it cannot be used as a solution by another agent, the first one learns that the other one needs to use some private actions which obviate the usage (extensibility) of the plan as a global solution. In the previous version of the algorithm, we proposed to dilute the plans by a sufficient amount of randomized plans; however, the number needed was in general infinite [17]. In this variant of the algorithm, we have shown that the privacy leakage can be arbitrarily decreased by shortcutting with randomly selected local plans and fully prevented by using all solutions (exponential in the number of actions) already in the first iteration. Although the number of such shortcut plans achieving strong privacy is exponential in general, in contrast to the dilution approach, the number is finite. The results are also in agreement with our recent results in [14]. As in the previous version of the paper, we have illustrated the principle on the logistics example, however with new results using the improved version of the algorithm. Newly, we have demonstrated the principle on a synthetic planning problem, which also shows the novel privacy leakage upper bound (Eq. 9) that holds in general.