Keywords

1 Introduction

Multi-agent path finding (MAPF) represents a task of finding collision free paths for a set of mobile agents where each agent is assigned unique start and goal positions [13, 19, 22, 27]. An environment with agents is often abstracted by undirected graph in the literature [17, 36]. Agents in this abstraction are represented by items placed in vertices of the graph while edges represent passable regions. At most one agent can be placed in each vertex to model physical space occupancy by agents. Agents can traverse a single edge at each time step.

Various movement schemes exist for MAPF on graphs. Often an agent can move into an unoccupied neighbor not entered by another agent [28] – this will be called move-to-unoccupied variant. This variant requires at least one vertex in the graph unoccupied to be able to perform some movements at all. Other movement schemes include chains of agents moving simultaneously with only the leader entering the unoccupied vertex or cases with no vacant vertex in the graph where rotations along non-trivial cycles are allowed [32]. We base our presentation on the move-to-unoccupied variant. Let us note that techniques shown in this paper are generic across all these movement schemes.

The MAPF problem and its variants are strongly practically motivated. Applications range from navigation of multiple mobile robots [5, 8] through traffic optimization [12, 15] to movement planning in computer games [34]. We refer the reader to various studies such as [19, 20] for the detailed survey of applications.

1.1 Optimality in MAPF

We address optimal MAPF in which we search for paths that are optimal with respect to a given objective. The two basic cumulative objectives studied in the literature are sum-of-costs [19, 25] and makespan [29].

The sum-of-costs objective assumes that unit costs are assigned to move and wait actions. The cost of plan is the sum of action costs along all the paths and over all agents. The aim is to obtain a plan with the minimum cost.

Under the makespan objective, the aim is to obtain a plan that can be executed in as short as possible time while each movement consumes 1 unit of time. In other words, we need the longest path out of all the paths to be as short as possible.

As we will show later, there may be situations where the increase in the sum-of-costs leads to a shorter makespan and vice versa.

A feasible solution of MAPF can be found in polynomial time [13, 35]. Adding any of the discussed objectives renders the decision version of MAPF (a yes/no question if a given MAPF has a solution of specified makespan/sum-of-costs) to be NP-complete [16, 28, 32].

We will keep the further description around the sum-of-costs variant but it is important to note that the presented techniques apply for the makespan variant as well.

1.2 Contributions to SAT-Based MAPF

Techniques for solving MAPF optimally include translation of the decision version into propositional formula [10, 11]. The formula is satisfiable if and only if the instance of MAPF is solvable for a given value of the objective function. Assuming that satisfiability of such formula is a non-decreasing function of the value of objective function, the optimum can be obtained by querying the satisfiability multiple times. A trivial strategy of increasing the value of objective function by one turned out to be a good choice [30] in many cases (thanks to the non-uniform difficulty of each query).

Satisfiability of the formula can be decided by an off-the-shelf SAT solver [2, 6] which is one of the advantages of the SAT-based approach. All advanced techniques developed in recent SAT solvers [4] can be employed for solving MAPF in this way.

The most significant bottleneck of all existing SAT-based algorithms for MAPF is the large size and combinatorial difficulty of the target propositional formula that grow significantly with the increasing number of agents as well as with growing size of the underlying graph. This kind of growth of combinatorial difficulty has already been addressed by Standley [24] in his search-based optimal MAPF solving algorithm. Standley described various variants of a method called independence detection that tries to determine the smallest possible groups of agents for which paths can be found independently of other groups.

Our contribution consists in integrating two variants of independence detection – simple independence detection (SID) and independence detection (ID) – with MDD-SAT – the most recent SAT-based MAPF solver [30]. As there are differences in how the original Standley’s search-based algorithm and SAT-based approach work, we suggested modifications to ID to be compatible with the SAT-based approach. Our new solvers are called MDD-SAT+SID and MDD-SAT+ID following the notation of [24]. Conducted experiments demonstrate similar performance benefit as in the case of original application of SID and ID in the search-based approach.

This paper is an extension of [31]. We describe in more detail the encoding of MAPF to a Boolean formula. In addition to [31] we also present experimental evaluation of MDD-SAT+SID. The paper is organized as follows. After the formal introduction of the MAPF problem a brief exposition of related work is done. Then, the Boolean encoding of MAPF is presented, also the original SID and ID are recalled and their integration with the SAT-based approach is presented. Finally, an experimental evaluation with grids and large maps is presented.

2 MAPF Definition

An arbitrary undirected graph can be used to model the environment where agents are moving. Let \( G = \left( {V,E} \right) \) be such a graph where \( V = \left\{ {v_{1} ,v_{2} , \ldots ,v_{n} } \right\} \) is a finite set of vertices and \( E\, \subseteq \left( {\begin{array}{*{20}c} V \\ 2 \\ \end{array} } \right) \) is a set of edges.

The placement of agents in the environment is modeled by assigning them vertices of the graph. Let \( A = \left\{ {a_{1} ,a_{2} , \ldots ,a_{m} } \right\} \) be a finite set of agents. Then, an arrangement of agents in vertices of graph \( G \) will be fully described by a location function \( \alpha{:}\, A \to V; \) the interpretation is that an agent \( a \in A \) is located in a vertex \( \alpha \left( a \right). \) At most one agent can be located in each vertex; that is \( \alpha \) is uniquely invertible.

Definition 1

(MAPF). An instance of multi-agent path-finding problem is a quadruple \( \Sigma = \left[ {G = \left( {V,E} \right),A,\alpha_{0} ,\alpha_{ + } } \right] \) where location functions \( \alpha_{0} \) and \( \alpha_{ + } \) define the initial and the goal arrangement of a set of agents \( A \) in \( G \) respectively. □

The dynamicity of the model assumes a discrete time divided into time steps. An arrangement \( \alpha_{i} \) at the \( i \)-th time step can be transformed by a transition action which instantaneously moves agents in the non-colliding way to form a new arrangement \( \alpha_{i + 1} \). The transition between \( \alpha_{i} \) and \( \alpha_{i + 1} \) must satisfy the following validity conditions:

  1. (1)

    \( \forall a \in A \) either \( \alpha_{i} \left( a \right) = \alpha_{i + 1} \left( a \right) \) or \( \{ \alpha_{i} \left( a \right),\alpha_{i + 1} \left( a \right)\} \in E \) holds

    (agents move along edges or wait at their current location),

  2. (2)

    \( \forall a \in A\,\alpha_{i} \left( a \right) \ne \alpha_{i + 1} \left( a \right) \Rightarrow \alpha_{i}^{ - 1} \left( {\alpha_{i + 1} (a} \right)) = \bot \)

    (agents move to vacant vertices only), and

  3. (3)

    \( \forall a,b \in A\,a \ne b \Rightarrow \alpha_{i + 1} \left( a \right) \ne \alpha_{i + 1} \left( b \right) \)

    (no two agents enter the same target/unique invertibility of resulting arrangement).

The task in MAPF is to transform \( \alpha_{0} \) using above valid transitions to \( \alpha_{ + } \). An illustration of MAPF and its solution is depicted in Fig. 1.

Fig. 1.
figure 1

An example of a MAPF instance from [31] with three agents \( a_{1} \), \( a_{2} \), and \( a_{3} \) (left). A solution of the instance is shown (right).

Definition 2

(MAPF solution). A solution for MAPF instance \( {\Sigma } = \left[ {G,A,\alpha_{0} ,\alpha_{ + } } \right] \) is a sequence of arrangements \( \left[ {\alpha_{0} ,\alpha_{1} ,\alpha_{2} , \ldots ,\alpha_{\mu } } \right] \) where \( \alpha_{\mu } = \alpha_{ + } \) and \( \alpha_{i + 1} \) is a result of valid transition from \( \alpha_{i} \) for every \( = 1,2, \ldots ,\mu - 1 \). □

Makespan \( \mu \) is the total number of time steps until the last agent reaches its destination. Sum-of-costs denoted \( \xi \) is the sum of path costs per individual agents. Each action (including wait) of an agent before it reaches its goal has unit cost.

2.1 Makespan vs. Sum-of-Costs

There exists an instance in which all the sum-of-costs optimal solutions are not makespan optimal. Similarly, none of the makespan optimal solution is sum-of-costs optimal there (see Fig. 2 for illustration).

Fig. 2.
figure 2

An instance of the MAPF problem from [31] in which no makespan optimal solution is sum-of-costs optimal and no sum-of-costs optimal solution is makespan optimal.

In the SAT-based optimal MAPF solver described below, a proper relation between makespan and sum-of-costs need to be found as both objectives are bounded during search. We need to ensure that smallest cost found under the given makespan bound is optimal (see [30] for more detailed discussion).

3 Related Work

Many other successful algorithms exist for the optimal MAPF solving. The state-of-the-art search-based algorithms (though there is no universal winner) include increasing cost tree search - ICTS [19], conflict base search - CBS [20], and improved CBS – ICBS [7]. These algorithms excel in setups with relatively few agents on large maps.

Another research direction is represented by methods based on reduction of the MAPF problem to another formalism. Except the SAT as a target formalism, successful attempts to reduce MAPF to constraint optimization problem [18], inductive logic programming [33], and answer set programming [9] have been made. These approaches (the SAT approach including) can be generally characterized by a high performance in MAPFs with small underlying graph densely populated with agents. This is a natural outcome of the maturity of solvers used to solve hard combinatorial problems in the target formalism.

Recently new research directions driven by applications have been identified in the MAPF context. For example, it is not always necessary to distinguish between individual agents – see [14] for detailed survey.

4 SAT Encoding for Optimal Sum-of-Costs

In this paper, we follow the algorithm solving sum-of-cost optimal MAPF via reduction to SAT presented in [30].

The basic approach in solving MAPF via SAT is to create a time expansion graph (denoted TEG) [29]. A TEG is a directed acyclic graph (DAG). First, the set of vertices of the underlying graph G are duplicated for all time-steps from 0 up to the given bound \( \mu \). Then, possible actions (move along edges or wait) are represented as directed edges between successive time steps. Formally a TEG is defined as follows:

Definition 3

(TEG). Time expansion graph of depth \( \mu \) for underlying graph \( \left( {V,E} \right) \) is a digraph \( \left( {V_{\mu } ,E_{\mu } } \right) \) where \( V_{\mu } = \left\{ {u_{j}^{t} {|}t = 0,1, \ldots ,\mu \wedge u_{j} \in V} \right\} \) and \( E_{\mu } = {{\{ }}(u_{j}^{u} ,u_{k}^{t + 1} {)}| \) \( t = 0,1, \ldots ,\mu - 1 \wedge \left( {\left\{ {u_{j} ,u_{k} } \right\} \in E \vee j = k} \right)\} . \)

The encoding for MAPF introduces propositional variables and constraints for a single time-step t in order to represent any possible arrangement of agents at time t. Given a desired makespan \( \mu \), the formula represents the question of whether there is a solution in the TEG of \( \mu \) time steps. The search for optimal makespan is done by iteratively incrementing \( \mu \left( { = 0,1,2 \ldots } \right) \) until a satisfiable formula is obtained.

To find the optimal sum-of-costs solution, we use similar technique as with optimal makespan solution. The sequence of decision problems is whether there exists a solution of a given sum-of-cost \( \xi \). However, encoding this decision problem is more challenging than the makespan case, because one needs to both bound the sum-of-costs, but also to predict how many time expansions are needed. We address this challenge by using two key techniques described next: (1) Cardinality constraint for bounding \( \xi \) and (2) Bounding the Makespan.

4.1 Cardinality Constraint for Bounding \( {\varvec{\upxi}} \)

The SAT literature offers a technique for encoding a cardinality constraint [3, 21], which allows calculating and bounding a numeric cost within the formula. Formally, for a bound \( \lambda \in {\mathbb{N}} \) and a set of propositional variables \( X = \left\{ {x_{1} ,x_{2} , \ldots ,x_{k} } \right\} \) the cardinality constraint \( \le_{\lambda } \left\{ {x_{1} ,x_{2} , \ldots ,x_{k} } \right\} \) is satisfied iff the number of variables from the set X that are set to TRUE is \( \le \lambda \).

In our SAT encoding, we bound the sum-of-costs by mapping every agent’s action to a propositional variable, and then encoding a cardinality constraint on these variables. Thus, one can use the general structure of the makespan SAT encoding (which iterates over possible makespans), and add such a cardinality constraint on top.

4.2 Bounding the Makespan for the Sum of Costs

We compute how many time expansions (\( \mu \)) are needed to guarantee that if a solution with sum-of-costs \( \xi \) exists then it will be found. In other words, in our encoding, the values we give to \( \xi \) and \( \mu \) must fulfill the following requirement:

R1: all possible solutions with sum-of-costs \( \xi \) must be possible for a makespan of at most \( \mu \).

To find a \( \mu \) value that meets R1, we require the following definitions. Let \( \xi_{0} \left( {a_{i} } \right) \) be the cost of the shortest individual path for agent \( a_{i} \), and let \( \xi_{0} = \mathop \sum \nolimits_{{a_{t} \in A}} \xi_{0} \left( {a_{i} } \right). \) \( \xi_{0} \) was called the sum of individual costs (SIC) [19]. \( \xi_{0} \) is an admissible heuristic for optimal sum-of-costs search algorithms, since \( \xi_{0} \) is a lower bound on the minimal sum-of-costs. \( \xi_{0} \) is calculated by relaxing the problem by omitting the other agents. Similarly, we define \( \mu_{0} = \mathop {\hbox{max} }\limits_{{a_{t} \in A}} \xi_{0} \left( {a_{i} } \right). \) \( \mu_{0} \) is length of the longest of the shortest individual paths and is thus a lower bound on the minimal makespan. Finally, let \( {\Delta } \) be the extra cost over SIC (as done in [19]). That is, let \( {\Delta } = \xi - \xi_{0} \).

Proposition 1.

For makespan \( \mu \) of any solution with sum-of-costs \( \xi \), R1 holds for \( \mu \le \mu_{0} + \Delta \).

Proof Outline:

The worst-case scenario, in terms of makespan, is that all the \( {\Delta } \) extra moves belong to a single agent. Given this scenario, in the worst case, \( {\Delta } \) is assigned to the agent with the largest shortest path. Thus, the resulting path of that agent would be \( \mu_{0} + {\Delta }, \) as required. □

Using Proposition 1, we can safely encode the decision problem of whether there is a solution with sum-of-costs \( \xi \) by using \( \mu = \mu_{0} + {\Delta } \) time expansions, knowing that if a solution of cost \( \xi \) exists then it will be found within \( \mu = \mu_{0} + {\Delta } \) time expansions. In other words, Proposition 1 shows relation of both parameters \( \mu \) and \( \xi \) which will be both changed by changing \( {\Delta } \). Algorithm 1 summarizes our optimal sum-of-costs algorithm. In every iteration, \( \mu \) is set to \( \mu_{0} + {\Delta } \) and the relevant TEGs (described below) for the various agents are built. Next a decision problem asking whether there is a solution with sum-of-costs \( \xi \) and makespan \( \mu \) is queried. The first iteration starts with \( {\Delta } = 0 \). If such solution exists, it is returned. Otherwise \( \xi \) is incremented by one, \( {\Delta } \) and consequently \( \mu \) are modified accordingly and another iteration of SAT consulting is activated.

figure a

This algorithm clearly terminates for solvable MAPF instances as we start seeking a solution of \( \xi = \xi_{0} \left( {{\Delta } = 0} \right) \) and increment \( {\Delta } \) (which increments \( \xi \) and \( \mu \) as well) to all possible values. The unsolvability of an MAPF instance can be checked separately by a polynomial-time complete sub-optimal algorithm such as PUSH-AND-ROTATE [35].

4.3 Efficient Use of the Cardinality Constraint

The complexity of encoding a cardinality constraint depends linearly in the number of constrained variables [21, 23]. Since each agent \( a_{i} \) must move at least \( \xi_{0} \left( {a_{i} } \right) \), we can reduce the number of variables counted by the cardinality constraint by only counting the variables corresponding to extra movements over the first \( \xi_{0} \left( {a_{i} } \right) \) movement \( a_{i} \) makes. We implement this by introducing a TEG for a given agent \( a_{i} \) (labeled TEGi).

TEG i differs from TEG (Definition 3) in that it distinguishes between two types of edges: \( E_{i} \) and \( F_{i} \cdot E_{i} \) are (directed) edges whose destination is at time step \( \le \xi_{0} \left( {a_{i} } \right) \). These are called standard edges. \( F_{i} \) denoted as extra edges are directed edges whose destination is at time step \( \ge \xi_{0} \left( {a_{i} } \right) \). Figure 3 shows an underlying graph for agent \( a_{1} \) (left) and the corresponding TEG1. Note that the optimal solution of cost 2 is denoted by the diagonal path of the TEG. Edges that belong to \( F_{i} \) are those that their destination is time step 3 (dotted lines). The key in this definition is that the cardinality constraint would only be applied to the extra edges, that is, we will only bound the number of extra edges (they sum up to \( {\Delta }) \) making it more efficient. There are various possibilities to define what happens to an agent when it reaches the goal (disappears, waits etc.). In all cases, edges in TEGs corresponding to wait actions at the goal are not marked as extra. Importantly, our SAT approach is robust across all these variants.

Fig. 3.
figure 3

A TEG for an agent that needs to go from \( u_{1} \) to \( u_{3} \).

4.4 Detailed Description of the SAT Encoding

Agent \( a_{i} \) must go from its initial position to its goal within TEGi. This simulates its location in time in the underlying graph G. That is, the task is to find a path from \( a_{0}^{0} \left( {a_{i} } \right) \) to \( a_{ + }^{\mu } \left( {a_{i} } \right) \) in TEGi. The search for such a path will be encoded within the Boolean formula. Additional constraints will be added to capture all movement constraints such as collision avoidance etc. And, of course, we will encode the cardinality constraint that the number of extra edges must be exactly \( \Delta \).

We want to ask whether a sum-of-costs solution of \( \xi \) exists. For this we build TEGi for each agent \( a_{i} \in A \) of depth \( \mu_{0} + {\Delta } \). We use \( V_{i} \) to denote the set of vertices in TEGi that agent \( a_{i} \) might occupy during the time steps. Next we introduce the Boolean encoding (denoted BASIC-SAT) which has the following Boolean variables:

  1. 1.

    \( {\upchi }_{j}^{t} \left( {a_{i} } \right) \) for every \( t \in \left\{ {0,1, \ldots , \mu } \right\} \) and \( u_{j}^{t} \in V_{i} \) – Boolean variable of whether agent \( a_{i} \) is in vertex \( v_{j} \) at time step t.

  2. 2.

    \( {\mathcal{E}}_{j,k}^{t} \left( {a_{i} } \right) \) for every \( t \in \left\{ {0,1, \ldots , \mu - 1} \right\} \) and \( \left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in \left( {E_{i} \cup F_{i} } \right) \) – Boolean variable that model transition of agent \( a_{i} \) from vertex \( v_{j} \) to \( v_{k} \) through any edge (standard or extra) between time steps t and \( t + 1 \) respectively.

  3. 3.

    \( \text{C}^{t} \left( {a_{i} } \right) \) for every \( t \in \left\{ {0,1, \ldots , \mu - 1} \right\} \) such that there exist \( u_{j}^{t} \in V_{i} \) and \( u_{k}^{t + 1} \in V_{i} \) with \( \left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in F_{i} \) – Boolean variables that model cost of movements along extra edges (from \( F_{i} \)) between time steps t and \( t + 1 \).

We now introduce constraints on these variables to restrict illegal values as defined by our variant of MAPF. Other variants may use a slightly different encoding but the principle is the same. Let \( T_{\mu } = \left\{ {0,1, \ldots ,\mu - 1} \right\}. \) Several groups of constraints are introduced for each agent \( a_{i} \in A \) as follows:

  1. C1:

    If an agent appears in a vertex at a given time step, then it must follow through exactly one adjacent edge into the next time step. This is encoded by the following two constraints, which are posted for every \( t \in T_{\mu } \) and \( u_{j}^{t} \in V_{i} \).

    $$ {\upchi }_{j}^{t} \left( {a_{i} } \right) \Rightarrow \mathop {\bigvee }\nolimits_{{\left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in \left( {E_{i} \cup F_{i} } \right)}} {\mathcal{E}}_{j,k}^{t} \left( {a_{i} } \right) $$
    (1)
    $$ \mathop {\bigwedge }\nolimits_{{\left( {u_{j}^{t} ,u_{k}^{t + 1} } \right),\left( {u_{j}^{t} ,u_{l}^{t + 1} } \right) \in \left( {E_{i} \cup F_{i} } \right) \wedge k < l}} \neg {\mathcal{E}}_{j,k}^{t} \left( {a_{i} } \right) \vee \neg {\mathcal{E}}_{j,l}^{t} \left( {a_{i} } \right) $$
    (2)
  2. C2:

    Whenever an agent occupies an edge it must also enter it before and leave it at the next time-step. This is ensured by the following constraint introduced for every \( t \in T_{\mu } \) and \( \left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in \left( {E_{i} \cup F_{i} } \right). \)

    $$ {\mathcal{E}}_{j,k}^{t} \left( {a_{i} } \right) \Rightarrow {\upchi }_{j}^{t} \left( {a_{i} } \right) \wedge {\upchi }_{k}^{t + 1} \left( {a_{i} } \right) $$
    (3)
  3. C3:

    The target vertex of any movement except wait action must be empty. This is ensured by the following constraint introduced for every \( t \in T_{\mu } \) and \( \left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in \left( {E_{i} \cup F_{i} } \right) \) such that \( j \ne k. \)

    $$ {\mathcal{E}}_{j,k}^{t} \left( {a_{i} } \right) \Rightarrow \mathop {\bigwedge }\nolimits_{{a_{l} \in A \wedge a_{l} \ne a_{i} \wedge u_{k}^{t + 1} \in V_{l} }} {\upchi }_{k}^{t + 1} \left( {a_{l} } \right) $$
    (4)
  4. C4:

    No two agents can appear in the same vertex at the same time step (although the previous constraint ensures that an agent does not collide with an agent currently residing in a vertex it does not prevent simultaneous entering of the same vertex by multiple agents). That is the following constraint is added for every \( t \in T_{\mu } \) and pair of agents \( a_{i} ,a_{l} \in A \) such that \( i \ne l. \)

    $$ \mathop {\bigwedge }\nolimits_{{u_{j}^{t} \in V_{i} \cap V_{l} }} \neg {\upchi }_{j}^{t} \left( {a_{i} } \right) \vee \neg {\upchi }_{j}^{t} \left( {a_{l} } \right) $$
    (5)
  5. C5:

    Whenever an extra edge is traversed the cost needs to be accumulated. In fact, this is the only cost that we accumulate as discussed above. This is done by the following constraint for every \( t \in T_{\mu } \) and extra edge \( \left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in F_{i} \).

    $$ {\mathcal{E}}_{j,k}^{t} \left( {a_{i} } \right) \Rightarrow \text{C}^{t} \left( {a_{i} } \right) $$
    (6)
  6. C6:

    Cardinality Constraint. Finally, the bound on the total cost needs to be introduced. Reaching the sum-of-costs of \( \xi \) corresponds to traversing exactly \( {\Delta } \) extra edges from \( F_{i} \). The following cardinality constrains ensures this:

    $$ \le_{\Delta } \left\{ {\text{C}^{t} \left( {a_{i} } \right){|}i = 1,2, \ldots ,n \wedge t = 0,1, \ldots \mu - 1 \wedge \left\{ {\left( {u_{j}^{t} ,u_{k}^{t + 1} } \right) \in F_{i} } \right\} \ne \emptyset } \right\} $$
    (7)

The resulting Boolean formula that is a conjunction of C1 … C7 will be denoted as \( {\mathcal{F}}_{BASIC} \left( {{\Sigma },\mu ,{\Delta }} \right) \) and is the one that is consulted by Algorithm 1.

The following proposition summarizes the correctness of our encoding.

Proposition 2.

MAPF \( {\Sigma } = \left( {G = \left( {V,E} \right),A,\alpha_{0} ,\alpha_{ + } } \right) \) has a sum-of-costs solution of \( \xi \) if and only if \( {\mathcal{F}}_{BASIC} \left( {{\Sigma },\mu ,{\Delta }} \right) \) is satisfiable. Moreover, a solution of MAPF \( {\Sigma } \) with the sum-of-costs of \( \xi \) can be extracted from the satisfying valuation of \( {\mathcal{F}}_{BASIC} \left( {{\Sigma },\mu ,{\Delta }} \right) \) by reading its \( {\upchi }_{j}^{t} \left( {a_{i} } \right) \) variables.

Proof:

The direct consequence of the above definitions is that a valid solution of a given MAPF \( {{ \varSigma }} \) corresponds to non-conflicting paths in the TEGs of the individual agents. These non-conflicting paths further correspond to satisfying the variable assignment of \( {\mathcal{F}}_{BASIC} \left( {{\Sigma },\mu ,{\Delta }} \right), \) i.e., that there are \( {\Delta } \) extra edges in TEGs of depth \( \mu = \mu_{0} + {\Delta }. \) □

As discussed in [30], the limitation of BASIC-SAT encoding is its size which is implied by the size of the time expanded graph. To mitigate this limitation Surynek et al. took inspiration from another successful search-based solver called increasing cost tree search (ICTS) [19]. Vertices whose sum of distances from \( a_{0}^{0} \left( {a_{i} } \right) \) and \( a_{ + }^{\mu } \left( {a_{i} } \right) \) in TEGi is greater than \( \mu \) can never be visited by \( a_{i} \) in any optimal solution or else \( a_{i} \) would not have enough time steps to reach \( a_{ + }^{\mu } \left( {a_{i} } \right). \) Omitting those vertices from TEGs that are too far in the aforementioned sense would not compromise soundness of the solving process but would lead to a smaller formula. In [30], this version of TEGs where unreachable vertices are omitted is called MDD and corresponding formula is denoted as \( {\mathcal{F}}_{MDD} \left( {{\Sigma },\mu ,{\Delta }} \right) \). When referring to MDD-SAT solver we assume the version with MDDs.

Using MDDs can rule out many vertices that would be normally considered in standard time expansions. Experiments confirmed that MDDs enabled using the SAT-based approach even for large MAPF instances for which the size of encodings without MDD was prohibitive.

5 Independence Detection

Our major aim is to increase performance of the SAT-based MAPF solver by reducing the number of agents needs to be considered at once. This has been successfully done in search based methods via a technique called independence detection.

In this section, we will describe the original method of independence detection proposed by Standley (2010). The main idea behind this technique is that difficulty of MAPF solving optimally grows exponentially with the number of agents. It would be ideal, if we could divide the problem into a series of smaller sub problems, solve them independently at low computational effort, and then combine them.

The simple approach, called simple independence detection (SID), assigns each agent to a group so that every group consists of exactly one agent. Then, for each of these groups, an optimal solution is found independently. Every pair of these solutions is evaluated and if the two groups’ solutions are in conflict (that is, when collision of agents belonging to different group occurs), the groups are merged and replanned together. If there are no conflicting solutions, the solutions can be merged to a single solution of the original problem. This approach can be further improved by avoiding merging of groups.

Generally, each agent has more than one possible optimal path. However, SID considers only one of these paths. The improvement of SID known as independence detection (ID) is as follows. Let’s have two conflicting groups G1 and G2. First, try to replan G1 so that the new solution has the same cost and the steps that are in conflict with G2 are forbidden.

If no such solution is possible, try to similarly replan G2. If this is not possible, merge G1 and G2 into a new group. In case either of the replanning was successful, that group needs to be evaluated with every other group again. This can lead to infinite cycle. Therefore, if two groups were already in conflict before, merge them without trying to replan.

figure b

Standley uses ID in combination with the A* algorithm. While planning, it is preferred to find paths that create the least possible amount of conflicts with other groups that have already planned paths. For this purpose, the conflict avoidance table is created (see Algorithm 2 for pseudo-code).

The table stores moves of agents in other groups. In case A* has a choice between several nodes with the same minimal f() cost, the one with least amount of conflicts is expanded first. This technique yields an optimal solution that has a minimal number of conflicts with other groups. This property is useful when replanning of a group’s solution is needed.

Both SID and ID do not solve MAPF on their own, they only divide the problem into smaller sub-problems that are solved by any possible MAPF algorithms. Thus, ID and SID are general frameworks, which can be executed on top of any MAPF solver.

6 Integrating SID and ID into MDD-SAT

SID can be integrated into the SAT-based framework as a top-level algorithm where MDD-SAT merely serves as a procedure for optimal MAPF solving restricted on an individual group. Hence, no modification of the core MDD-SAT procedure is needed.

ID however requires modification of the original ID since in the propositional formula it is not possible to express preference that individual paths of groups of agents should avoid occupied positions in the conflict avoidance table. In the yes/no SAT environment we either manage to avoid occupied positions or not while in the negative case there is no easy tool how to control the number of conflicts.

The SAT-based version of ID works in similar way to the original version of Standley but instead of resolving conflicts between a pair of conflicting groups \( G_{1} \) and \( G_{2} \) it resolves conflict of group \( G_{1} \) with all other groups. If this attempt is successful, \( G_{1} \) is independent on others and the process can continue with resolving conflicts between remaining groups (see Fig. 4 where \( G_{1} \) has been made independent).

Fig. 4.
figure 4

A schematic illustration from [31] of path replanning within the independence detection technique. A path for the group \( G_{1} \) conflicted with paths of other two groups (left part). Then path for \( G_{1} \) has been successfully replanned (right part).

If the attempt to resolve conflict between \( G_{1} \) and \( G_{2} \) by making \( G_{1} \) independent fails, the same is tried for \( G_{2} \). If the attempt for \( G_{2} \) fails too groups are merged. The pseudo-code is shown as Algorithm 3.

In contrast to original ID we strictly require avoidance with respect to the conflict avoidance table instead of stating it as a preference only. This is technically done by omitting the conflicting vertices in the MDD. The SAT approach does not allow to express a preference like in the search based algorithm. This is the reason why ID in the SAT-based solver differs from the original one.

figure c

7 Experiments

We performed experimental comparison of the proposed MDD-SAT+SID and MDD-SAT+ID solvers with other state-of-the-art solvers – namely with the previous best SAT-based solver MDD-SAT and also with search-based algorithms ICTS and ICBS.

The MDD-SAT+SID and MDD-SAT+ID have been implemented in C++ as an extension of an existing implementation of the MDD-SAT solver. A couple of minor improvements have been done in the original MDD-SAT encoding – some auxiliary propositional variables have been eliminated which reduced the size of the encoding and consequently saved runtime while generating formulae (this improvement affects both MDD-SAT and new MDD-SAT+SID, MDD-SAT+ID used in presented experiments).

We used Glucose 3.0 [1] in variants of MDD-SAT which is a top performing SAT solver according to the recent SAT Competitions [4]. The complete implementation of the MDD-SAT solvers is available on-line to allow reproducibility of the presented results: http://ktiml.mff.cuni.cz/~surynek/research/icaart2017.

ICTS and ICBS have been implemented in C#. The original implementations of these algorithms have been used.

All the tests were run on Xeon 2 Ghz, and on Phenom II 3.6 Ghz, both with 12 Gb of memory.

The experimental setup followed the scheme used in the literature [22] which tests MAPF algorithms on 4-connected grids. Let us note however that all the suggested algorithms are designed and implemented for general undirected graphs (the fact that grids are used in the experiments is not exploited to increase efficiency of solving in any way).

7.1 Small Grids Evaluation

The first series of experiments takes place on small square grids of sizes 8 × 8, 16 × 16, and 32 × 32 with 10% of vertices occupied by obstacles. In this setup of the environment, we increased population of agents from 1 and observed the runtime of all the solvers until no solver was able to solve the instance within the given time limit of 300 s (this was 20 agents for 8 × 8 grid, and 40 and 60 for 16 × 16 and 32 × 32 girds respectively).

Ten randomly generated instances per number of agents were used. The initial positions were generated by choosing a subset of vertices randomly. The goal arrangement has been generated as a long random walk from the initial state following valid moves – this ensured solvability of all the tested instances.

To be able to communicate results of experiments more easily we intuitively distinguish three different categories of instances with respect to the density of agents as follows. The behavior of solvers is then discussed with respect to these categories:

  • Low density – few interactions among agents, paths for individual agents can be planned independently.

  • Medium density – some interaction among agents are inevitable but there exist multiple groups of agents that are independent of each other.

  • High density – majority of agents are interdependent and form one large group.

The small grid experiment contains instances from all these three cases. The hypothesis is that the SID and ID technique will be helpful in instances with medium density of agents while ID is expected to reach benefit in higher densities of agents. We also expect that in the case of low density of agents there will be some benefit of SID and ID since many agents will just follow their shortest paths towards goals in such a case. As in low and medium density cases the complexity of the formula is not proportional to the difficulty of the instance.

Furthermore, we expect rather negative effect of using SID and ID in instances with high density of agents. This is because of the fact that most agents will be gradually merged into a large group while the process of merging represents an overhead in such a case.

Experimental result for the small grids (see Fig. 5) confirmed the hypothesis. MDD-SAT+SID/ID win in low to medium density of agents. For the higher density of agents, both MDD-SAT+SID/ID tend to be eventually outperformed by the original MDD-SAT. If SID and ID are compared then we can see that ID has more significant benefit than SID in most cases.

Fig. 5.
figure 5

Results of experiments on small grid maps of sizes 8 × 8, 16 × 16, and 32 × 32. Figures show how many instances were solved within the given runtime and sorted runtimes (right bottom part). Clearly versions of MDD-SAT dominate in the test over search based algorithms ICTS and ICBS except few quickly solvable cases. Moreover, MDD-SAT+ID and MDD-SAT+SID outperforms MDD-SAT in cases with low to medium density of agents. MDD-SAT+ID and MDD-SAT+SID exhibit similar performance while ID shows its advantage in instances requiring more time.

7.2 Large Maps – Dragon Age

We also experimented on three structurally different large maps from Dragon Age: Origins [26] – ost003d , den520d , and brc202d (see Fig. 6). Our choice of maps is driven by the choice of authors in the previous literature [20, 30].

Fig. 6.
figure 6

Illustration of large Dragon Age maps ost003d (size 194 × 194), den520d (size 257 × 256), and brc202d (size 481 × 530).

We used setup with 16 and 32 agents randomly paced agents which represents low to medium density. Let us note that a case with high density of agents in the map of that size is currently out of reach of any existing algorithm.

To obtain problems of various difficulties the distance of agents from initial positions to their goals has been varied in the range 8, 16, 24, …, 320.

For each distance 10 random instances were generated in which initial positions were selected randomly and then random walk has been performed until all the agents reach at least the given distance from its initial position.

The hypothesis for large maps is that MDD-SAT+SID/ID should dominate generally with some expected advantage of ID which in fact is the same hypothesis as in the case of small grids because here we have only the low-medium density case. However, as there are important structural differences between the three tested maps which impact is hardly predictable. Intuitively, SID/ID should have been more beneficial in ost003d and den520d maps since in these maps there is more room to find alternative paths.

Results for the three Dragon Age maps are shown in Figs. 7, 8, and 9. Again the number of instances solved in the given runtime is shown. The difficulty (runtime) grows with the growing distance of agents from their goals in this setup.

Fig. 7.
figure 7

Results of experiments on Dragon Age map ost003d . MDD-SAT+ID outperforms MDD-SAT in harder instances while MDD-SAT+SID performs worse than MDD-SAT. All MDD-SAT versions are dominated by ICTS.

Fig. 8.
figure 8

Results of experiments on Dragon Age map den520d . ID brings minor benefit in harder instances while SID has merely a negative effect.

Fig. 9.
figure 9

Results of experiments on Dragon Age map brc202d . ID brings significant improvement in harder instances with 32 agents. SID again has rather a negative effect in MDD-SAT.

It can be read from these results that MDD-SAT+ID tends to outperform MDD-SAT in more difficult instances. In these instances, the interaction among agents in non-trivial but on the other hand the interdependence among agents is tractable by ID.

Surprising results have been obtained for MDD-SAT+SID which performed generally worse than MDD-SAT. SID hence was unsuccessful in independence detection enough to produce any performance benefit in MDD-SAT expect the case of very easy instances.

The intuitive hypothesis was not confirmed completely since surprisingly MDD-SAT is better than MDD-SAT+ID in easier instances of medium density category usually and the performance of MDD-SAT+SID remained behind expectation. Our initial intuitive hypothesis did estimate well the effort needed for merging groups that eventually represents a big overhead in case of large maps. Hence, MDD-SAT+ID can show its benefit after the difficulty of the formula representing the entire MAPF instance prevails over the difficulty of group merging.

Another surprising result was obtained in brc202d map where MDD-SAT+ID was a very clear winner in harder instances with 32 agents.

Moreover, we cannot say that SAT-based approach represented by MDD-SAT and MDD-SAT+SID/ID is a universal winner as there are cases where ICTS and ICBS dominate ( ost003d with 32 agents is such an example).

7.3 Discussion

It can be generally observed that ID brings worthwhile improvement to MDD-SAT solver which by itself performs very well. The simple version of independence detection SID provides worse benefit than ID and in large instances its effect is even negative.

Experimental results indicate that there is a certain range of the density of agents though not precisely determined in our evaluation in which ID is beneficial while outside this range it cases an overhead.

The implementation of ID within the MDD-SAT+ID solver did not use any special reasoning about what groups of agents should be merged or not. The groups were processed in the ordering given by the original ordering of agents. We expect that more careful reasoning about merging can bring yet more improvements.

8 Conclusion

Integration of the existing technique of independence detection (ID) into the SAT-based approach to MAPF has been presented. Performed experimental evaluation shows that using ID in the SAT-based approach to MAPF has a significant performance benefit. This can be especially observed in instances with medium density of agents where interactions among agents are non-trivial but there is still chance to find sufficiently many small independent groups of agents.

The new solver called MDD-SAT+ID is a combination of an existing SAT-based MAPF solver MDD-SAT and ID. We have shown that MDD-SAT+ID is a new state-of-the-art in the optimal SAT-based MAPF solving for the presented categories of the MAPF problem. Moreover, the new MDD-SAT+ID performs well also with respect to the best search based solvers like ICTS and ICBS though we cannot say that MDD-SAT+ID is a universal winner.

The future research of the presented topic will focus on the following aspects: (i) The classification of density of agents was intuitive only in the presented experimental evaluation. Hence the immediate future work is to develop concepts for more precise classification of the density and interaction among agents. (ii) ID is not beneficial across all instances and sometimes represents an overhead. Hence, having a more rigorous classification of the density of agents, we can develop a mechanism for automated deciding whether to use ID or not according to the classified density of agents. (iii) Currently we take groups of agents to be merged in the same order as they appear in the input. A more careful consideration of which groups to merge may lead to further performance improvements.