1 Introduction

The domains robots can operate in is rapidly expanding as robotic capabilities increase. Some robotic domains, such as mass casualty response, will require close coupling between the human and robot responders in order to successfully complete the mission. A taxonomy for categorizing multi-agent system problems includes the types of robots (single-task robots vs. multi-task robots), the number of robots per task (single-robot tasks vs. multi-robot tasks), and when information is available (instantaneous allocation vs. time-extended allocation) [20]. This manuscript addresses centralized, domain-independent planning for multi-task robot, multi-robot task, instantaneous allocation missions.

The planning problem uses an initial state and a set of actions and constraints to derive a plan to achieve a goal state. The planning problem complexity is partially a function of the number of robots and tasks and partially a function of the domain model complexity [16]. Planning for expressive real-world models with durative actions, joint actions, concurrently executing actions, continuous variables, and continuous effects is much harder than planning domains with instantaneous actions and boolean variables.

Consider a mass casualty response scenario after a tornado, such as the EF-4 tornado in Tuscaloosa, AL on April 27, 2011. The immediate response involved hundreds of responders from local government agencies and required coordination to complete several tasks, including clearing impassable roads, securing prohibited items, triaging the wounded, and locating victims in the disaster area. Moving about the environment, clearing debris from roads, and all other actions require time and are not instantaneous actions. Agents must be able to coordinate actions and work concurrently. Real-world domains include continuous variables, such as fuel level, that must be considered. The heterogeneous agents, complex tasks, and complex environment complicate this difficult planning problem.

One method to address planning complexity is factored planning, which splits the goal into lower complexity subgoals. Multi-agent planning approaches using factored planning focus on how the entire coalition [4, 14] or an individual agent [46] can solve subproblems. Factored planning as part of single-agent planning does not address multiple agents [7, 23]. Solutions for path-planning [47] and target tracking [24] exist, but are domain-dependent. The tools presented in this manuscript address domain-independent planning problems with multiple, heterogeneous agents and multiple, complicated tasks in complex domains with durative actions, concurrently executing actions, continuous variables, and continuous effects by factoring the problem by both tasks and coalitions of agents.

Coalition formation can manage problem complexity by allocating a team of agents to each task. Coalition formation uses the capabilities of the agents and the capability requirements of the tasks to form teams of agents that can accomplish a set of assigned tasks, while optimizing an objective function (e.g., utility, cost, or number of tasks completed) [40]. The factoring produced by coalition formation generates several smaller problems from the original problem; however, coalition formation cannot guarantee allocated coalitions will be able to execute the task to which they are assigned to complete.

Current coalition formation research does not address the nonexecutable coalition problem (a coalition which cannot complete its assigned task) and current planning research does not perform task allocation on the scale of coalition formation. Factored planning is a popular approach to decomposing the problem into manageable subproblems, but existing factored planning algorithms do not consider agents. Multi-agent planning performs goal allocation, but typically does not consider how multiple agents working together simultaneously on a single task affects plan quality. Three novel tools incorporating both coalition formation and planning are presented: coalition formation then planning, relaxed plan coalition formation, and task fusion. The coalition formation then planning approach is used as the basis for the other tools that utilizes the output from the coalition formation problem as the inputs to the planning problem. Tasks are planned separately by the allocated coalitions and the results are merged into a single solution for the original problem. The coalition formation then planning tool derives satisficing solutions quickly, but raises two problems: nonexecutable coalitions and suboptimal solutions. If coalition formation allocates a coalition to a task that the coalition cannot complete, then a new coalition must be allocated. Relaxed plan coalition formation augments coalition formation then planning by performing iterative planning, relaxed planning, and coalition formation until a valid executable plan is identified. The second problem with coalition formation then planning is suboptimal solutions. Limiting the agents available for planning limits the problem complexity, but reduces the size of the problem solution set; thus, leading to suboptimal solutions. Task fusion balances solution quality with problem complexity by planning for tasks and coalitions together for which higher solution quality outweighs increased problem complexity.

Section 2 provides an overview of related research. Section 3 presents a formal definition of the problem. Three experimental domains are given in Sect. 4. Section 5 presents the experimental design used to evaluate the presented planning tools. Section 6 presents the tools and the results for each tool solving problems in each experimental domain followed by a discussion of how the results motivate the next tool. Finally, a conclusion and future work is presented in Sect. 7.

2 Related work

2.1 Coalition formation

Coalition formation is a subclass of the task allocation problem without constraints on the number of agents (robot or human) allocated to each task nor the number of coalitions to which an agent is a member. The coalition formation problem is NP-complete [37], is difficult to approximate [40], and represents a multi-task, multi-robot problem that incorporates algorithms for both instantaneous allocation [42, 50] and time extended allocation [25, 34]. The goal is to form teams of agents that are together more capable than the team’s individual agents and can accomplish a set of assigned tasks, while optimizing an objective function (e.g., utility, cost, tasks completed). The general coalition formation problem assumes a grand coalition of n agents, \({\varPhi }=\{\phi _1,\ldots ,\phi _n\}\), and a set of m tasks, \(V=\{v_1,\ldots ,v_m\}\). A solution is a map of each task, \(v \in V\), to a coalition assigned to the task, \({\varPhi }_v \subseteq {\varPhi }\) [37].

Agents and tasks are modeled as their capabilities offered or capabilities required, respectively. Two different capability models are used, the resource model and the service model. The resource model treats each agent as a set of available resources (e.g., chemical sensor, camera, laser) and each task as a set of required resources. Let Res be a vector of possible resource types, where \(Res_i\) is the ith resource type. Each agent, \(\phi \), is modeled as a resources available vector, \(Res^{\phi }\), and a coalition, \({\varPhi }_i \subseteq {\varPhi }\), is modeled as a vector equal to the sum of the available resource vectors of the constituent agents, \(Res^{{\varPhi }_i}=\sum _{\phi \in {\varPhi }_i} Res^{\phi }\). A task, v, is similarly defined as a resources required vector, \(Res^v\). All elements of resources available vectors and resources required vectors must be non-negative, and at least one element in each vector must be non-zero. A coalition, \({\varPhi }_j\), is a candidate coalition for a task, v, if and only if it has available at least as many of each resource type as the task requires, \(\forall i, Res^{{\varPhi }_j}_i \ge Res^v_i\). Only a candidate coalition for v can be allocated to v.

The service model associates a set of functions that each agent can perform with the particular agent (e.g., box-pushing, mapping, sentry-duty). Let Ser be a vector of possible service types, where \(Ser_i\) is the ith service type. An agent, \(\phi \), has a services available vector indicating whether or not each service is offered by the agent, \(Ser^\phi _i \in \{0,1\}\), where \(Ser^\phi _i\) is 1 if \(\phi \) offers service i and 0 if not. A coalition, \({\varPhi }\), has a services available vector equal to the sum of the services available vectors of its constituent agents, \(Ser^{{\varPhi }}=\sum _{\phi \in {\varPhi }} Ser^{\phi }\). A task, v, is modeled as a services required vector, \(Ser^v\), where \(Ser^v_i \in {\mathbb {N}}\) is a non-negative integer representing the number of services of type \(Ser_i\) required to satisfy v and \(\exists j, Ser^v_j > 0\). A coalition, \({\varPhi }\), is a candidate coalition for a task, v, if and only if has available at least as many services as the task requires, \(\forall i, Ser^{{\varPhi }}_i \ge Ser^v_i\).

There are many heuristic-based coalition formation algorithms, each with its own strengths and weaknesses. Greedy algorithms can derive solutions quickly, but make no guarantees on the solution quality [40, 42, 44, 51]. Approximation algorithms provide solution quality guarantees, but suffer from poor worst-case run-time complexity, which can render them inappropriate for real-time applications [27, 33]. Market-based techniques offer fault-tolerance for a distributed system, but have high communication processing requirements [10, 41, 43, 50]. Biologically inspired ant colony optimization algorithms have been applied to several NP-complete problems, including coalition formation [36, 38]. Different coalition formation algorithms provide different solutions with differing performance. For example, selecting a market-based algorithm with high communications requirements for use in a communications constrained environment results in poor performance. The intelligent Coalition Formation for Humans and Robots system was developed to autonomously reason over the specified mission constraints in order to select a subset of coalition formation algorithms to apply to a particular allocation problem [39].

2.2 Planning

Classical planning results in a satisficing plan that contains a sequence of actions that achieves a goal state [18]. While classical planning is for single-task robots executing single-robot tasks with an instantaneous allocation, variants of classical planning span the Gerkey and Matarić taxonomy. The extensions of classical planning most applicable to this research are temporal planning, continuous planning, and multi-agent planning. Temporal planning admits durative actions to the action set and allows concurrently executed actions in the solution, continuous planning incorporates continuous variables in the state space and continuous effects in the actions, and multi-agent planning models multiple agents executing actions, rather than a single agent, as in classical planning.

Temporal planning incorporates durative actions and concurrent action execution. The model for durative actions expands the classical action model to include a duration and temporal specifications for action conditions and effects. Action duration specifies the length of time required to execute the action. The temporal specifications indicate when conditions must be satisfied (at the beginning, at the end, or over the entire action duration) and when the effects are applied (at the beginning or at the end of action execution). A solution to the temporal planning problem is a satisficing plan that combines a set of actions with execution constraints to achieve a goal state. Temporal planning solutions can be classified as single-task agents executing single-agent tasks in an instantaneous allocation. State-space based search is a popular method for planning, such as Yet Another Heuristic Search Planner (YAHSP) [49], but other methods such as SAT-based planners (ITSAT [35]) also exist.

Approaches to managing problem complexity include subgoal partitioning and state-based decomposition. Subgoal Plan solved large problems by creating a subgoal partitioning through goal constraint analysis [7]. The subproblems were solved by Metric Fast Forward [21] and were significantly easier to solve than the original problem. The time to solve a problem exponentially decreased as the subproblems’ size was linearly reduced. Divide-and-Evolve, similar to Subgoal Plan, used a preprocessing step to decrease problem complexity before an encapsulated satisficing planner was used to solve the problem [3]. Divide-and-Evolve used a state-based decomposition strategy to find a sequence of intermediate states that collectively solve the problem. Divide-and-Evolve with YAHSP [49] as the encapsulated planner solved significantly more problems than YAHSP alone.

Continuous planning incorporates continuous variables in the state space and expands the action model to include continuous effects. Classical planning models require continuous variables to be discretized; however, real-world models are more accurate when state variables, such as fuel level and temperature, can be modeled as continuous variables. Continuous effects must be combined with temporal planning and durative actions can have effects applied over the entire action duration, known as a continuous effect. For example, an accurate real-world model of aircraft flight must include a continuously decreasing fuel level. If continuous effects are not admitted, then the change in fuel level over the entire action duration must be applied instantaneously. A solution to the continuous planning problem is a satisficing plan that combines a set of actions with execution constraints to achieve a goal state. Continuous planning solutions can be classified as single-task agents executing single-agent tasks in an instantaneous allocation.

Some planners, such as Temporal Fast Downward (TFD) [17], support continuous variables, but not continuous change. Zeno was the first planner to allow continuous change in planning problems [31]; however, it was unable to handle concurrent continuous effects, such as are required for an accurate model of in-flight refueling. COLIN (COntinuous LINear) extended state space search techniques to manage continuous effects [8]. Other continuous planning algorithms include IxTeT [26], Sapa [12], and dReal[5]. Accommodating domains with non-linear continuous change allows real-world domains to be more accurately modeled, but is only supported by dReal.

Multi-agent planning explicitly considers multiple agents executing actions by substituting a set of agents for a set of actions in the classical planning definition, where each agent is modeled as a set of actions that the agent can execute. The solution is a plan specifying a set of actions with execution constraints and an associated agent responsible for executing each action. Multi-agent planning solutions exist for single-task and multi-task agents, single-agent and multi-agent tasks, and instantaneous allocation.

Factored planning approaches are natural multi-agent planning solutions. One of the first such algorithms used individual agent planning to generate a heuristic for use in global planning for the original problem [14, 15]. Another factored planning approach performed distributed planning followed by agent voting [46]. Agents modified a base plan and distributed it to the other agents. A new base plan was selected from the set of agent plans by voting. If all agents indicated the base plan satisfied their task, then planning ended, otherwise another iteration of planning and voting occurred.

Deriving plans individually requires a plan merge step to integrate the plans to a single global solution. Plan merge allows agents to take advantage of side products, the unused product of other agents’ actions, to eliminate redundant actions in plan merge steps [52]. Another approach treats the problem as a plan-space search problem in which incremental changes are made until the plan is valid [9]. Part of the plan merge problem requires satisfying all temporal constraints. Simple temporal networks have been applied by encoding the temporal constraints of the individual plans and finding consistent variable assignments representing a valid plan merge [1].

The domain complexity and the method used for factoring the problem are the differentiating aspects addressed by the presented tools. Existing multi-agent planning solutions focus on instantaneous actions and discrete state spaces or assume that tasks are executable by a single agent [4, 11, 13, 30]. Developing real-world domain models requires durative actions in continuous state spaces for tasks that require multiple agents.

2.3 Integrated task allocation and planning

Chance-constrained task allocation is an example approaches that incorporate task allocation with aspects of planning [32]. Each agent estimated the utility of it completing each task. Allocation utility was a function of the agent allocated to the task, when the agent will be able to execute the task, and a predefined model of problem uncertainty. However, real-world problems do not consist exclusively of single-agent tasks.

Approaches to coalition formation in which tasks have temporal and spatial constraints can address task allocation and scheduling, but do not produce plans for how agents will execute their allocated tasks when they reach the task location [25, 34]. Both approaches are for multi-task agents executing multi-agent tasks in a time-extended allocation and assumed all agents capable of reaching the task location were able to contribute to the task, an invalid assumption in some missions (e.g., an agent without a camera cannot assist an imaging task).

Auction style coalition formation algorithms allow agents to perform task planning prior to allocation and typically grant exclusive ownership of tasks, which can be detrimental when agents fail to complete their task and no method for informing the other agents of the failure exists. A method based on bounty hunters and bail bondsmen allows for nonexclusive task execution [53]. Following the bail analogy, agents act as bounty hunters and auctioneers as bail bondsmen. The bail bondsmen increase the value of each task until it has been completed. Agents commit to a task and announce to the other agents that they have committed to the task. Agents can plan how to complete each available task, but can only commit to a single task. Agents receive the task utility only upon completing the task. If an agent fails to complete a task, the system adapts by incentivizing other agents to complete the task through increasing task value. This approach lacks collaboration among the agents, a key feature in real-world problems representative of multi-robot task domains. These auction approaches are for multi-task agents executing single-agent tasks.

The Automatic Synthesis of Multi-robot Task solutions through software Reconfiguration (ASyMTRe) system used connected agent and task schemas to allocate coalitions to tasks [45]. Agents were modeled by perceptual, motor, and communication schema and tasks were modeled as a set of motor schema requirements. ASyMTRe connected robot schemas to develop a joint schema capable of accomplishing assigned tasks. For example, if robot \(r_i\) knows its position relative to robot \(r_j\) and \(r_j\) knows its position in a global reference frame, then \(r_i\) can derive its position in the same global reference frame. Connecting the various robotic schemas determined which agents were capable of jointly completing a task, but did not plan how the robots completed the task. A similar system, Remote Object Control Interface, considered robots as nodes offering expanded functionality dependent on the other nodes in the system [6]. Both systems fail to produce executable plans for the assigned tasks. These two systems are both for multi-task agents executing multi-agent tasks.

One domain-dependent integration of task allocation and planning that has been extensively studied is the multi-robot task allocation and path planning problem [2, 29, 54]. Simultaneously considering the task allocation and the path to the task allows for collision-free trajectories to be developed more efficiently than if the two problems were considered independently. One application incorporates two agents that swap tasks when a collision is detected [47, 48]. The new trajectories for the agents are guaranteed not to collide with one another. A search and destroy problem with attack UAVs developed a plan offline to determine an optimal search pattern to locate mobile targets whose locations were unknown a priori; thus, the decision regarding which UAVs will perform the attack must be made online [24]. A distributed probabilistic approach considered the path for each UAV to reach the target, the UAV’s attack capability, and the probability of target destruction. These approaches are feasible for the specific domains, but many, diverse multi-agent domains exist and developing a different solution for each domain is impractical.

Task allocation and planning are closely coupled problems, but there is minimal existing literature that addresses the interaction between the two problems. Planning affects task allocation via the developed plans, as the plan constrains the set of agents available by requiring agents to perform actions at specific times. Task allocation affects planning by determining which agents are available when developing a plan. If an agent is not allocated to a task, then the planning algorithm will not use the agent to develop a plan. Tools for coupling domain-independent task allocation and planning will facilitate solving planning problems consisting of multi-task agents executing multi-agent tasks.

3 Formal definition

The presented tools are for planning for multi-task robot, multi-robot task, instantaneous allocation problems [20]. This Hybrid Mission Planning with Coalition Formation problem couples coalition formation with planning to facilitate solving complex problem instances with heterogeneous multi-task robots executing multi-robot tasks.

Definition 1

(Hybrid Mission Planning with Coalition Formation) The hybrid mission planning with coalition formation problem is defined as a tuple, \(\langle S, I, {\varPhi }, V, C \rangle \), where:

  • S is the state space,

  • I is the initial state,

  • \({\varPhi } = \{ \phi _1, \phi _2, \ldots , \phi _m \}\) is the grand coalition of agents,

  • \(A = 2^{\varPhi } \rightarrow 2^{Act}\) is the coalition-action set mapping,

  • \(V = \{ v_1, v_2, \ldots , v_n \}\) is the set of tasks, and

  • \(C = \langle Cap, C_{\varPhi }, C_V \rangle \) is the capability vector, coalition capability mapping, and the task capability mapping.

The hybrid state space, S, includes boolean, discrete, and continuous variables. A state, s, is an assignment of each state space variable to a value in its associated domain. The initial state, I, is the environment state at the beginning of the mission.

The grand coalition, \({\varPhi }\), is the set of all available agents. A coalition, \({\varPhi }_i \subseteq {\varPhi }\), is any non-empty set of agents. The coalition-action set mapping, A, maps a possible coalition, \(2^{\varPhi }\), to a set of actions the coalition can execute, \(2^{Act}\), where Act is the set of all possible actions. An action is modeled as a tuple, \(\langle {\varPhi }_{exec}, eff , cond, dur \rangle \), where:

  • \({\varPhi }_{exec}\) is the executor coalition,

  • \(cond = \langle cond_\vdash , cond_\leftrightarrow , cond_\dashv \rangle \) is the action state constraints that must be satisfied at the beginning, during, and at the end of action execution, respectively, and

  • \( eff = \langle eff _\vdash , eff _\leftrightarrow , eff _\dashv \rangle \) is the action effects for atomic fact transitions applied to the state at the beginning of, during, and at the end of action execution, respectively,

  • dur is a constraint on the length of the time interval required to execute the action.

The executor coalition, \({\varPhi }_{exec}\), for an action, a, is the set of agents that execute a. If \({\varPhi }_{exec}\) is a singleton coalition consisting of a single agent, then a is a single-agent action. If \({\varPhi }_{exec}\) includes more than one agent, then a is a joint action between multiple agents. A state constraint can be applied to boolean or continuous state variables. Constraints on boolean variables specify the truth value the variable must take, while constraints on continuous variables specify the interval to which the variable’s value must belong. Action state constraints can be specified as applying at the beginning, during, or end of action execution, \(cond_\vdash , cond_\leftrightarrow \), and \(cond_\dashv \), respectively. Action effects at the beginning of action execution, \( eff _\vdash \), can apply to boolean state variables (as setting the value to true or false) or to continuous state variables (as an instantaneous change in value). Action effects throughout action execution, \( eff _\leftrightarrow \), must apply to continuous state variables and represent a continuous change in the value of the variable during action execution. Action effects at the end of action execution, \( eff _\dashv \), can apply to boolean state variables or to continuous state variables. The action duration constraint, dur, is the interval to which action duration must belong. Action duration must be non-negative. Similar actions, such as navigating between waypoints, are considered different if they are executed by different agents. For example, \(\phi _i\) navigating from \(w_r\) to \(w_s\) is different than \(\phi _j\) navigating from \(w_r\) to \(w_s\).

The task set, V, is a set of tasks to be satisfied. Each task, \(v \in V\), is modeled as a set of goal state constraints. A task, v, is satisfied in a state, s, if and only if all of v’s goal state constraints are satisfied in s.

The capability vector, \(Cap = [ Cap_1, Cap_2, \ldots ]\), is the vector of coalition formation capabilities used in the problem. The coalition capability mapping, \(C_{\varPhi }\), is a mapping of each agent to a capability available vector. The elements of a capabilities available vector are non-negative values, with at least one non-zero element. Each agent, \(\phi \), has a capabilities available vector, \(Cap^\phi \). For example, if \(|Cap| = 5\) and \(\phi \) has two of \(Cap_3\) and three of \(Cap_5\), then \(Cap^\phi = [ Cap^\phi _1 = 0, Cap^\phi _2 = 0, Cap^\phi _3 = 2, Cap^\phi _4 = 0, Cap^\phi _5 = 3]\), where \(Cap^i_j\) is the amount of \(Cap_j\) that entity i (agent or coalition) has at its disposal. Each coalition, \({\varPhi }\), has a capabilities available vector, \(Cap^{\varPhi }\), equal to the sum of the capability available vectors of \({\varPhi }\)’s constituent agents, \(Cap^{\varPhi } = \sum _{\phi \in {\varPhi }} Cap^\phi \). The task capability mapping, \(C_V\), is a mapping of each task to a capability required vector. The elements of a capability required vector are non-negative reals, with at least one non-zero element. For example, if \(|Cap| = 5\) and v requires one of \(Cap_2\) and two of \(Cap_3\), then \(Cap^v = [ Cap^v_1 = 0, Cap^v_2 = 1, Cap^v_3 = 2, Cap^v_4 = 0, Cap^v_5 = 0 ]\), where \(Cap^i_j\) is the amount of \(Cap_j\) required to satisfy i.

A plan, \(\pi \), is a set of action steps. An action step consists of an action, a start time to begin executing the associated action, and the duration of the action. An executable plan is a plan for which the action steps are executed validly. An action step is executed validly if the associated action’s state constraints are satisfied. Executing the action steps in a executable plan transitions the environment from the initial state, I, to an end state, \(s_{end}\), achieved after all action steps have finished. A solution to the problem is a satisficing plan, an executable plan in which \(s_{end}\) satisfies the goal state constraints of each task, \(v \in V\). A utility function, such as makespan or number of action steps, can be used to compare satisficing plans. An optimal plan is a satisficing plan that maximizes the selected utility function. A coalition is an executable coalition if a satisficing plan has been derived for the coalition to complete its task. A nonexecutable coalition is a coalition for which a satisficing plan has not been derived for the coalition to complete its task.

4 Example domains

The goal is to solve complex real-world domain problems with multiple heterogeneous agents, durative actions, and complex state spaces. Existing planning problems were modified as a first step towards achieving this goal and evaluating the presented tools. Most existing planning domains lack at least one of the aspects representative of the desired domains and to properly evaluate the presented planning tools. A modified Blocks World domain will be used to illustrate the formal problem definition. Two additional planning domains, Rovers and a modified Zenotravel, are presented and used to experimentally validate the tools. Each domain, and the modifications to each, are presented and implemented in the Planning Domain Definition Language (PDDL) [19].

4.1 Blocks World

The modified Blocks World domain requires that heterogeneous robotic arms manipulate stacks of heterogeneous blocks on a table of finite size. Each arm has a subset of end effectors available to it, while each block requires a specific end effector to be manipulated. A block can be manipulated by an arm if and only if the arm has the block’s required end effector. While blocks have the same dimensions, blocks can be either single- or double-weight. Single-weight blocks can be manipulated by a single arm with the required end effector, while double-weight blocks require two arms, each with the required end effector, in order to be manipulated. The block stacks rest on a table with only enough space for a finite number of block stacks. The goal state is a rearrangement of the blocks from the initial state into a specified set of block stacks. The modified domain has been made freely available.Footnote 1

Fig. 1
figure 1

Example states with the double-weight blocks shaded and required end effector for each block in italics. a Initial state, b goal state constraints

The state space, S, includes both boolean and continuous variables. The boolean variables describe the block stacks, each block’s required end effector type, which block each arm is holding, and each arm’s available end effectors. The continuous variables describe the height of each arm and block, the number of blocks on the table, and the table capacity. The domain of the continuous variables is non-negative integers, which is not continuous; however, modeling the variables as continuous simplifies the state model by not requiring all possible values to be enumerated and ordered. The initial state, I, is an assignment of a value to each variable in the state space. As a partial example, the middle stack in the example initial state in Fig. 1a is expressed by assigning the value true to the following variables: \((onTable \ C), (onBlock \ D \ C), (onBlock \ E \ D), (requires \ C \ encompass), (requires \ D magnetic)\), and \((requires \ E \ friction)\).

Fig. 2
figure 2

PDDL implementation for an arm to pick up a block off another block

The grand coalition, \({\varPhi }\), is the set of arms executing actions. The actions are the up and down arm movement and block manipulation. The duration of each action is a linear function of the number of arms executing the action, i.e., a single arm picking up a single-weight block is a shorter action than two arms picking up a double-weight block due to fewer arms executing the action. An example PDDL implementation of arm a picking up a single-weight block \(b_1\) off of block \(b_2\) is presented in Fig. 2. The action has a duration of 1. Executing the action requires that a be empty, that \(b_1\) be clear, and that \(b_1\) be on \(b_2\) at the start of action execution, while over the entire action execution a must be at the same height as \(b_1\), that \(b_1\) require the specified end effector, and that a have the specified end effector. The action has three start of action effects, a is no longer empty, \(b_1\) is no longer clear, and \(b_1\) is no longer on \(b_2\). The two end of action effects are that \(b_2\) is clear, and that a is holding \(b_1\). The combination of effects at the beginning and end of action execution ensures logical consistency throughout action execution. For example, the combination of effects ensures that a third block cannot be placed on \(b_2\) while \(b_2\) is being removed from on top of \(b_1\).

Each stack of blocks in the goal state corresponds to a task. The example goal state in Fig. 1b is divided into three tasks: \(v_C, v_E\), and \(v_F\). \(v_C\) is the stack with C on the bottom and the goal state constraints for \(v_C\) are satisfied when C is on the table and B is on C, i.e., when (onBlockBC) and (onTableC) are both true.

The capability vector for the Blocks World domain corresponds to the end effector types: [suctionfrictionmagneticencompass]. The capabilities offered vector for each arm is a function of the end effectors available to the arm. For example, an arm with a friction end effector and an encompass end effector has the capabilities available vector [0, 1, 0, 1]. Double-weight blocks require twice the capabilities of single-weight blocks, because manipulating double-weight blocks requires two robotic arms. The capabilities for each stack are a function of two sets of blocks, the blocks in the goal stack and the blocks that must be manipulated to access the blocks in the goal stack. For example, the capabilities required vector for \(v_E\) is a function of E and G, because they are the blocks in the goal stack and there are no other blocks above E and G in the initial state. E requires two suction capabilities and G requires the two friction capabilities; therefore, the capabilities required vector for \(v_E\) is [2, 2, 0, 0]. The capabilities required vector for \(v_C\) is a function of B and C, as they are in the goal stack, and of D and E, because they are above C in the initial state. The capabilities required vector will be constructed iteratively as an example. E requires two friction capabilities, thus, [0, 2, 0, 0]. D adds a requirement for a single magnetic capability, [0, 2, 1, 0]. C adds a single encompass end effector, [0, 2, 1, 1]. B requires a single encompass end effector, but an encompass end effector is already part of the capabilities required vector; therefore, the capabilities required vector is not modified. The final capabilities required vector for \(v_C\) is [0, 2, 1, 1].

4.2 Rovers

The Rovers domain has been used for several iterations of the International Planning Competition (IPC) [28]. The domain models rovers navigating between waypoints, collecting different classes of scientific data at a subset of waypoints, and communicating the data back to the central lander. The five classes of scientific data are soil analysis, rock analysis, high-resolution imagery, low-resolution imagery, and color imagery. Each rover can independently navigate a subsection of the environment and collect a subset of the classes of scientific data, but only one rover at a time can communicate data to the central lander. Rock analysis is required at a subset of waypoints and soil analysis is required at a subset of waypoints. Rovers must be at a waypoint to perform rock or soil analysis on waypoint and must be equipped for the analysis. Up to three types of imagery data can be collected at each waypoint. A rover must have the correct camera type and the target waypoint must be visible in order for the rover to collect imagery data for the target waypoint. The PDDL implementation of the domain is identical to the simple time version of the domain used in the 2002 International Planning Competition,Footnote 2 with the exception of modified action durations.

The state space contains only boolean variables and describes waypoint connectivity, waypoint visibility, rover scientific tools, data collection types and location, central lander location, and communication channel capacity. Each action has a fixed duration. The domain’s capability model corresponds to the classes of scientific data being collected. Each rover’s capabilities offered vector is a function of the tools available to the rover. The goal is subdivided into a task for each class of scientific data, e.g., all the state constraints concerning rock analysis are grouped into a single task. The capabilities required vector for each task corresponds to the types of scientific data collected for the task.

4.3 Zenotravel

The Zenotravel domain was originally created for testing the Zeno planner [31] and was modified to include hub and spoke airports, passengers and cargo, and short-range and long-range planes. Spoke airports are airports in smaller cities, with each spoke connected to a single hub airport. Hub airports are located in larger cities and are connected to a set of spoke airports. Short-range planes fly only between a hub and its connected spokes. The set of spoke airports for each pair of hubs is disjoint. All hubs are connected and only long-range planes can fly between them. Each plane has limited passenger and cargo capacity. The goal is satisfied when all passengers and cargo are at their destinations. The modified domain has been made freely available.Footnote 3

The state space includes both boolean and continuous variables. The boolean variables describe the location of each passenger, cargo, and plane. The continuous variables include the amount of passengers and cargo on each plane, each plane’s passenger and cargo capacity, each plane’s fuel level and capacity, and the distance between connected cities. The number of passengers, amount of cargo and their respective capacities for each plane are not continuous variables; however, similar to Blocks World, modeling the values as continuous variables in PDDL facilitates the experiments and expressing the models by not requiring all possible values to be enumerated. The actions to load and unload passengers and cargo from a plane have fixed duration. Fuel use and the action duration for a plane to fly between two cities is a linear function of the distance traveled. The time required to refuel a plane is a linear function of the fuel level at the start of action execution and the fuel capacity. The capability model includes passenger and cargo capacity and the hub cities. For example, a short-range plane based out of the hub airport of ATL in Atlanta, Georgia has a capabilities offered vector corresponding to its passenger and cargo capacity and its ability to travel between ATL and ATL’s spoke airports. A long-range plane has a capabilities offered vector corresponding to its passenger and cargo capacity and its ability to travel between any two hub airports, such as ATL and LAX in Los Angeles, California. The goal state is divided into tasks based on the origin and destination airports of the passengers and cargo. All passengers and cargo originating in a city and traveling to the same city are grouped into a single task. The capabilities required vector of each task is a function of the number of passengers and cargo included in the task, the origin, and the destination.

5 Experimental design

This section describes the experimental design for each tool when solving the hybrid mission planning and coalition formation problem.

5.1 Random problem generation

Grand coalitions and missions were generated for each domain. A grand coalition consists of a set of agents and their associated capabilities. A Mission consists of an initial state and a goal state description. Each grand coalition in each domain was paired with each Mission in the same domain to create a problem to be solved. Ten grand coalitions and ten missions were generated for each domain, for a total of 100 generated problems for each domain. The specific experimental details for each domain are presented.

5.1.1 Blocks World

The grand coalitions in the Blocks World domain were a randomly generated set of robotic arms. Four types of end effectors were used: friction, suction, magnetic, and encompass. Each grand coalition had between four and eight arms, with each arm averaging two end effectors. The grand coalitions required at least two arms with each end effector to guarantee the ability to execute each mission. The generated grand coalitions were manually validated as possessing the required end effectors. If a grand coalition was deficient, then the least capable arm in the grand coalition was augmented with the missing end effector(s). The grand coalitions ranged from 4 to 8 arms, with an average of 6.5 arms. Each arm averaged 2.6 end effectors. The mission initial states included between three and five block stacks, with each stack having three blocks, for a total of nine to fifteen blocks. The missions averaged 4.1 stacks of blocks in the initial state. Each mission’s goal state description required a random rearrangement of the blocks from the initial block stacks into an equal number of block stacks. The problems generated from the same mission differ in the number of arms and the number of and types of end effectors on the arms. The problems generated from the same grand coalition differ in the number of blocks and the goal state description.

5.1.2 Rovers

The grand coalitions included ten randomly generated rovers. Each rover was allocated tools allowing it to collect an average of two of the five classes of scientific data, defined in Sect. 4.2. The mission initial states included the connections between the waypoints, the waypoints each rover was able to traverse, each rover’s starting location, scientific data source locations, and the central lander’s location. The mission goal state description requires all the scientific data to be communicated to the central lander. The missions averaged 103 waypoints, with an average of 4.7 waypoints traversable from each waypoint. Each mission required collecting an average of 116.1 pieces of scientific data. Each grand coalition averaged 4.2 rovers capable of collecting a given class of scientific data, with a minimum of two rovers in each grand coalition capable of collecting each class of scientific data.

5.1.3 Zenotravel

The grand coalitions in the Zenotravel domain were a randomly generated set of long-range planes and short-range planes. Each hub city had between one and three short-range planes and five to ten long-range planes were randomly distributed across the hubs in each mission’s initial state. The generated missions use the same set of hubs and spokes, based on real airports in the US and the distances between each. Seven hub airports and forty-two spoke airports were selected, with each hub having between five and seven associated spokes. The missions consisted of an average of 60.1 passengers and 59.7 units of cargo were spread over 17 tasks. The short-range planes had a capacity of four passengers and four cargo units and the long-range planes had a capacity of eight passengers and eight cargo units. The grand coalitions averaged 8.1 long-range planes and 14.5 short-range planes.

5.2 Metrics

Table 1 Dependent variables

A test case is a single problem attempted by a single planning tool. The dependent variables, as presented in Table 1, were recorded during each test case. Makespan is the amount of time required to execute a plan:

$$\begin{aligned} makespan(\pi ) = \max _{s \in \pi } start(s) + dur(s), \end{aligned}$$

where \(\pi \) is a plan and s is an action execution step in \(\pi \). The number of action execution steps in the generated plan was recorded. Memory usage was recorded using the Linux getrusage function. The getrusage function returns resource usage measures of the current process, including the maximum resident set size, which is an indicator of the amount of memory required by the planning tool. Reported memory values are in gigabytes. Planning tool time is the time for the planning tool to produce a solution in seconds.

Three potential outcomes exist for each test case. First, a satisficing plan for the grand coalition to achieve the goal is produced, in which case the metrics are reported. Second, no plan is produced due to a grand coalition being nonexecutable for a mission, which happens when an allocated coalition is confirmed by the planning tool as unable to complete its assigned task. If a coalition is unable to complete its task, then the grand coalition is nonexecutable for the mission. Finally, the planning tool can exceed either memory or computation time limits. All planning problems were limited to 48 GB of memory. Computation time limits were varied and are given with the results. VAL, the plan validator for PDDL [22], was used to confirm that the produced plans were satisficing. The experiments were run under Xubuntu 16.04 using an Intel Core i7-5820K CPU with 64 GB RAM. All source code is written in C\(++\) and compiled with g\(++\) 5.2.1.

5.3 Coalition formation and planning algorithms

Three planning algorithms and three coalition formation algorithms were used with the presented tools. ITSAT [35], a SAT-based planner, was selected for planning in the Rovers domain and a service model approximation algorithm [40] was selected for coalition formation. ITSAT was selected due to being open source after the 2014 International Planning Competition and a desire to test multiple classes of planning algorithms. The service model approximation algorithm provides solutions quickly and supports the service model of coalition formation used in the Rovers domain. TFD [17], a state space search planner using the context-enhanced additive heuristic modified for continuous state variables and temporal planning, was selected for the Blocks World domain and a dynamic programming coalition formation algorithm [40] was selected for coalition formation. TFD was selected due to being open source, performing well in the temporal track of the 2014 International Planning Competition, and supporting the continuous state variables. The dynamic programming algorithm finds solutions for the Blocks World problems quickly and works with either service or resource models. Coalition formation in the Blocks World domain uses the service model. The enforced hill climbing (EHC) version of the COLIN planner [8], a state space search planner using a relaxed plan graph heuristic, supports continuous linear effects and was selected for planning in the Zenotravel domain. COLIN was selected due to being open source and supporting the continuous effects required in the Zenotravel domain. A greedy algorithm [42] was selected for coalition formation in the Zenotravel domain. The greedy algorithm works quickly by limiting potential coalition size.

All three coalition formation algorithms can be applied to the Rovers and Blocks World domain, but the service model approximation algorithm cannot be applied in the Zenotravel domain due to not supporting the resource model of coalition formation. COLIN supports all the necessary features to plan problems in all three domains. TFD can be used to solve problems in the Rovers and Blocks World domains, but does not support the continuous effects present in the Zenotravel domain. ITSAT does not support continuous variables, so it can only be used with the Rovers domain.

6 Planning tool motivation and analysis

This section presents the tools used to solve the randomly generated problems in each domain. Each tool is presented and described using the experimental domains, followed by the experimental results, and a summary of the results and how they motivate the next tool. A summary is located at the end of this section to provide an overview of the results.

6.1 Planning alone

figure a

A solution for the generated problems can be derived using existing planning algorithms. The problem is solved as a single planning problem. All agents in the grand coalition are available for planning and all task state constraints must be satisfied simultaneously in the solution’s end state. Planning for all tasks with all agents simultaneously can consider all possible interactions between the tasks and agents, but planning as a single problem becomes computationally prohibitive as the expressive features of the state model, the number of agents, and the number of tasks increase. As the number of agents increases, so too does the number of actions that can be executed in any given state, the size of the state space, and the number of executable plans that can be considered by the planning algorithm. As the number of tasks increases, more constraints are placed on the goal state and the size of the set of goal states decreases; thus, fewer plans qualify as satisficing. These two effects combine to increase the problem’s difficulty.

The problem formalization is translated to a single planning problem by Algorithm 1. The set of available actions for planning, Actions, is a function of the agents in the grand coalition, \(A({\varPhi })\), shown in Line 1. The goals for the planning problem, G, are combined in Line 2. G is satisfied if and only if all the tasks, \(v \in V\), are satisfied. Plan in Line 3 represents a planning algorithm capable of reasoning over the action model, the state space (S), and the goals. Each presented tool is agnostic of the underlying planning algorithm. Different planning algorithms are used for each domain, as discussed in Sect. 5.3.

6.1.1 Blocks World

Forty-two of the one hundred generated problems for the Blocks World domain were solved by planning alone. Table 2 presents the five number summary for plan makespan, number of action execution steps, time to derive the plan, and memory required to derive the plan for the solved problems. The temporal makespan of the solved problems is presented in Table 3. A value of “MEM” indicates the memory limit was exceeded while trying to solving the problem. The makespan of the derived plans ranged from 21 to 69, with a median of 33. Plans were not produced for any of the Grand Coalitions in three of the Missions (3, 4, and 8) and a plan was derived for only one of the Grand Coalitions in Mission 7. Planning alone derived plans for all the Grand Coalitions in Mission 2. The number of action execution steps, presented in Table 4, ranged from 46 steps to 139 steps and had a median of 68 steps. The time to derived the satisficing plan for each problem is presented in Table 5 and ranged from 6.1 to 6489.1 s, with a median of 72.2 s. Table 6 presents the memory required during plan derivation for each problem, ranging from 0.05 to 43.91 GB, with a median of 0.49 GB. Planning alone failed to derive plans for 58 of the 100 problems due to exceeding the memory limit and required more than 10 GB of memory when solving nine problems; thus, there is much room for improvement. The heatmap ranges applied for the presented Blocks World results are also applied to the next sets of Blocks World results for comparison of results for the same problems across tools.

Table 2 The five number summary for metrics collected while solving the Blocks World problems using Planning Alone
Table 3 Temporal makespan using planning alone for the Blocks World problems (Color table online)
Table 4 Number of action execution steps in the derived plans using planning alone for the Blocks World problems (Color table online)
Table 5 Planning time in seconds using planning alone for the Blocks World problems (Color table online)
Table 6 Required memory in gigabytes using planning alone for the Blocks World problems (Color table online)

6.1.2 Rovers

Table 7 The five number summary for metrics collected while solving the Rovers problems using planning alone
Table 8 Temporal makespan using planning alone for the Rovers problems (Color table online)
Table 9 Number of action execution steps in the derived plans using planning alone for Rovers problems (Color table online)
Table 10 Planning time in seconds using planning alone for the Rovers problems (Color table online)
Table 11 Required memory in gigabytes using planning alone for the Rovers problems (Color table online)

All one hundred Rovers problems were solved by Planning Alone. Table 7 presents the five number summary for plan makespan, number of actions, time to derive the plans, and memory required to derive the plans. The makespan of the derived solutions, presented in Table 8, ranged from 876 to 3463, with a median of 1585. Grand Coalition 10 had the shortest average makespan, while Grand Coalition 1 had the longest. Mission 10 had the longest average makespan, while Mission 3 had the shortest. The number of action execution steps for each problem is presented in Table 9 and ranged from 481 steps to 1403 steps, with a median of 759 steps. Mission 10 required the most steps (998), while Mission 3 with 556 steps required the least. Grand Coalition 1 required the most steps to complete a Mission (998), while Grand Coalition 10 with 674 steps required the least. Table 10 presents the time to derive the satisficing plan for each problem, ranging from 1118.0 to 13475.4 s, with a median of 1790.4 s. Deriving plans for Mission 10 required an average of 5123 s, the highest of all the Missions, while Mission 3 required the least, at 1387 s. Memory usage ranged from 12.41 to 40.75 GB, with a median of 17.29 GB, and is presented in Table 11. Mission 10 required the most memory (25.56 GB), while Mission 3 with 14.29 GB required the least. Grand Coalition 6 required the most memory (23.86 GB), while Grand Coalition 8 at 15.19 GB required the least. The large amount of memory and time required to solve these problems leaves room for improvement. The same heatmap ranges are applied to the next sets of Rovers results.

Missions 3 and 10 were the hardest and easiest Missions, respectively, in terms of time and memory required to derive a plan and in terms of derived plan makespan and number of steps. The average number of scientific data collections per Mission was 116. Mission 10 (149 data collections) was 1.64 standard deviations above the average, while Mission 3 (92 data collections) was 1.25 standard deviations below the average. The hardest and easiest Grand Coalitions for which to plan in terms of the time and memory limitations, makespan, and number of steps in the derived plan is much less clear than the equivalent analysis with the Missions. The average Grand Coalition offered 21 capabilities. Grand Coalition 10 was the most capable with a total of 28 capabilities, with at least five of the ten rovers being capable of collecting each class of scientific data. The greater capabilities translated to better plans in terms of makespan and number of steps; however, the greater capabilities did not require the most time nor memory with which to plan. Grand Coalition 8 was the least capable of the coalitions, with three rovers capable of collecting each class of scientific data (total of 15 capabilities offered), but the plans derived for Grand Coalition 8 were not the longest, in terms of makespan or number of steps. The derived plans for Grand Coalition 1 were the longest in terms of makespan and number of steps. The reason for the difference is the number of rovers capable of soil analysis. Both rock and soil analyses require more navigation throughout the environment than the imagery analyses, because the rover must be at the waypoint to perform the analysis, while the imagery analyses only require that the rover have line of sight to the waypoint. Grand Coalition 8 had three rovers capable of soil analysis, while Grand Coalition 1 had two rovers capable of soil analysis.

6.1.3 Zenotravel

None of the Zenotravel problems were solved by planning alone using the COLIN planner in EHC mode. A 2 h time limit was enforced for solving the Zenotravel problems, which resulted in no satisficing plans being generated for any of the problems. Five of the one hundred problems were selected to reevaluate without the time limit. Each of the five problems exceeded the 48 GB memory limit after an average of 16.5 h of execution. The problems being solved in the Zenotravel domain average 60 passengers, 60 cargo, and 22 planes. The large number of planes, passengers, and cargo produces a large branching factor. Assume, as a conservative estimate, each plane can refuel, fly to three different cities, or load a single passenger or cargo. The branching factor for such a situation is over 100. Each plane is likely to be able to fly to about five cities from any particular state and load or unload cargo and passengers, thus, the true branching factor is likely to be much higher.

6.1.4 Summary

Table 12 Planning alone summary

The results from each domain provide room for improvement, as summarized in Table 12. Results from each of the domains leave room for improvement. Only 42 of the 100 Blocks World problems were solved and no plans were produced for three of the Missions. All 58 failures for the Blocks World problems were due to exceeding the 48 GB memory limit. All Rovers domain problems were solved by ITSAT, but computational resource usage needs to be improved. ITSAT required, on average, 2157 s and 18.5 GB of memory to solve the problems. None of the Zenotravel domain problems were solved within the 2 h time limit and removing the time limit did not allow for any of the problems to be solved.

The primary issue when using planning alone is exponential complexity. The number of agents, the number of tasks, and the domain complexity (durative actions, continuous state variables, etc.) all contribute to the problem complexity. One option is to reduce the domain complexity through various relaxations, such as assuming all actions are instantaneous and executed sequentially. A plan is only as good as the domain and problem description from which it is created; thus, plans are more likely to succeed in real-world missions when they are created from accurate domain models with representative durative actions and continuous variables. A better option is to address the other aspects of problem complexity, i.e., reduce the number of agents and number of tasks considered at any one time. The three presented tools address problem complexity by reducing the number of goals and agents considered during planning, while maintaining high fidelity state models and plans.

6.2 Coalition formation then planning

figure j

The coalition formation then planning tool (CFP) produces several smaller planning instances focused on a subset of the goals, each using a subset of the agents to satisfy the goals. Algorithm 2 presents the CFP algorithm, which begins with an empty plan and no goals, Lines 1 and 2, respectively. The capabilities offered vector for each agent is identified using the capability mappings in Line 3. The capabilities required vector for each task is identified using the capability mappings in Line 4. Coalition formation is applied in Line 5 to allocate coalitions to tasks. Coalition formation’s result is an assignment of a candidate coalition to each task. A coalition is a candidate coalition for a task if and only if the coalition has at least as many of each type of resource as required by task, i.e., \({\varPhi }_v\) is a candidate coalition for v if and only if \(\forall i, Cap^{{\varPhi }_v}_i \ge Cap^v_i\). The planning loop, Line 6–12, is executed after coalition formation. The goals for \(v_i\) are combined with G to form the goals to be solved in the current iteration, Line 7. Combining the previous constraints with the current iteration constraints allows the iterative plan to break previous constraints in the course of planning, as long as the constraints are satisfied at the end of the iterative planning. The initial state, \(I_i\), for the current iteration is the end state achieved after simulating the current plan, \(\pi \), from the initial state, I, using VAL, Line 8. The available actions, Actions, are a function of the coalition allocated to the current task, \({\varPhi }_i\), and available for planning, Line 9. An appropriate planning algorithm, Plan, finds an iterative plan, \(\pi _i\), to satisfy G from the initial state, \(I_i\), using the actions of the available coalition, Actions, in Line 10. CFP relies on coalition formation to produce executable coalitions. If \({\varPhi }_i\) is a nonexecutable coalition, then CFP reports the problem as failed, else the iterative plan must be merged. \(\pi _i\) is merged with the current plan, \(\pi \), to create a plan to satisfy G when executed from I, Line 11. The planning problem solved during each iteration in Line 10 is analogous to deriving a plan, executing the plan, and being given an additional planning goal to satisfy. The current goals, G, have been satisfied in the current state, \(I_i\), but additional goals are given, so augmenting G with the additional goal constraints, \(G = G \wedge v_i\), and a plan must be derived to transition from \(I_i\) to a state satisfying G. The plan merge step, Line 11, can be as simple as modifying the action execution steps in \(\pi _i\), such that the action execution steps begin execution immediately when \(\pi \) ends; however, more complex scheduling can occur. A greedy scheduling approach, in which each action execution step in \(\pi _i\) is modified, one at a time in increasing original start time order, to occur as early as possible in the resulting plan, is applied.

Performing coalition formation affects problem complexity in two ways. First, the number of goal constraints addressed at any one time is reduced. Assume the Blocks World example from Fig. 1b. The set of goal constraints for planning alone addresses the seven blocks in the figure, whereas the goal constraints in CFP are divided into three sets of goals constraints, two of which address the locations of two blocks and one of which addresses the location of three blocks. Each of the stacks of blocks in the goal state description of the Blocks World problems correspond to a task. Planning for a single stack at a time prunes blocks from the state space that must be considered. Second, the number of agents is reduced; thus, the number of actions and the state space is reduced. Reducing the number of actions creates a lower branching factor in the search tree, but also eliminates states from the search tree. Goal states can be among the eliminated states; thus, reducing the search branching factor can force deeper searches to identify a goal state. The state space is reduced due to eliminating the variables and domain values that are no longer reachable given the reduced action set. The effects of the reduced action set and reduced state space combine to increase the number of states that can be searched per unit of time. Coalition formation in the Blocks World problems identifies a coalition of arms to allocate to each task. Each identified coalition is selected based on the end effectors available to the coalition and the end effectors required to achieve each block stack. If the grand coalition is not allocated to the task, then the branching factor in planning for the task has been reduced.

6.2.1 Blocks World

Twenty-six of the one hundred generated problems for the Blocks World domain were solved by CFP. Table 13 presents the five number summary for plan makespan, number of action execution steps, time required to derive the plan, and memory required to derive the plan for the solved problems. The temporal makespan of the derived plans are presented in Table 14. Values of “NE”, “MEM”, and “TIME” indicate a nonexecutable coalition, an exceeded memory limit, and an exceeded time limit, respectively. The time limit was 1 h. The makespan ranged from 30 to 114, with a median of 52. Grand Coalition 10 was the only Grand Coalition for which plans were not derived for any of the Missions and Mission 9 was the only Mission for which plans were not derived for any of the Grand Coalitions. The number of action execution steps ranged from 37 steps to 153 steps, with a median of 84 steps, and are presented in Table 15. The time to derive a plan had a median of 103.1 s, ranging from 2.6 to 3093.4 s, and are presented in Table 16. The memory required to derive a plan is presented in Table 17. The memory required ranged from 0.01 to 22.86 GB, with a median of 0.36 GB.

Table 13 The five number summary for solving the Blocks World problems using CFP
Table 14 Temporal makespan using CFP for the Blocks World problems (Color table online)
Table 15 Number of action execution steps in the derived plans using CFP for the Blocks World problems (Color table online)
Table 16 Planning time in seconds using CFP for the Blocks World problems (Color table online)
Table 17 Required memory in gigabytes using CFP for the Blocks World problems (Color table online)

Seventeen problems were commonly solved by PA and CFP. A comparison of the seventeen problems is presented in Table 18. The ratio of the metric when using PA to the metric when using CFP was calculated, and the median ratio is presented. The median makespan ratio was 1.46, with 11 of the problems having higher makespan solutions (ratio \(>1.0\)) when solved using CFP compared to PA. Ten of the CFP derived plans had more actions than the counterparts derived using PA, with a median ratio of 1.06. Computation time was lower with CFP than with PA for 12 of the problems (ratio \(<1.0\)), with a median ratio of 0.71. Finally, thirteen of the problems required less memory to solve with CFP than with PA, with a median ratio of 0.71.

Table 18 Median ratio of the CFP metric value to the PA metric value for Blocks World problems

The plan merge step ensures that all actions execution steps in \(\pi _i\) do not occur until their conditions are satisfied in \(\pi \). For example, if block A is placed on the table as a result of executing step m in \(\pi \) and a step n in \(\pi _i\) relies on A being on the table, then the plan merge step ensures that n does not start until after A has been placed on the table by m.

CFP failed to solve nine problems due to exceeding the memory limit and failed to solve 66 of the 100 problems due to allocating nonexecutable coalitions. CFP solved fewer problems than planning alone, but the median problems required 29% less memory and 29% less time than planning alone. The problems that failed due to nonexecutable coalitions can be divided into two categories, both of which are a result of the order in which tasks were planned. First, arms were not required to drop blocks at the end of their tasks. The finite table limited the number of blocks on the table and forced agents to place blocks on top of one another. If block a was placed on top of b and the task required moving b, but the coalition was incapable of moving a due to no agent in the coalition possessing the required end effector, then the coalition was nonexecutable. Second, agents were not required to release blocks at the end of a task. If arm \(\phi \) is holding a at the end of plan execution and \(v_j\) requires a for a later plan, then \({\varPhi }_j\) allocated to \(v_j\) will fail planning, unless \(\phi \) releases a before the algorithm plans for \(v_j\) or \(\phi \in {\varPhi }_j\). If \(\phi \) releases a before the algorithm plans for \(v_j\), then a will be a part of some tower in \(I_j\). If \(\phi \in {\varPhi }_j\), then planning will succeed due to \(\phi \) holding a in \(I_j\) and \(\phi \) being able to place a where it belongs. One option to avoid the second failure mode is to require agents to satisfy “cleanup goals” before planning each task. The cleanup goals will ensure that each agent is prepared to plan for the next task. Cleanup goals in the Blocks World problems will require that all agents not hold any blocks at the end of task planning. A design decision was made not to test cleanup goals for these experiments. Requiring agents to drop blocks between tasks can detrimentally affect plan quality. Consider the \(\phi \) holding a case from earlier. Assume \(\phi \) is holding \(a, v_j\) is the next task to be planned, and \({\varPhi }_j\) is allocated to \(v_j\). If \(\phi \in {\varPhi }_j\) and a is a part of \(v_j\), then requiring \(\phi \) to drop a before planning \(v_j\) wastes time, since some agent in \({\varPhi }_j\) will pickup a again.

The results also demonstrate that coalition formation can be detrimental. Five problems (Mission 2 with Grand Coalition 5, Mission 5 with Grand Coalitions 5 and 7, Mission 6 with Grand Coalition 5, and Mission 10 with Grand Coalition 3) were solved by planning alone, but exceeded the memory limit when attempted by CFP. More states had to be created and searched because the decrease in branching factor caused by using fewer agents did not offset the increase in the search depth required to solve the problems.

6.2.2 Rovers

Table 19 The five number summary for solving the Rovers problems using CFP
Table 20 Temporal makespan using CFP for the Rovers problems (Color table online)
Table 21 Number of action execution steps in the derived plans using CFP for Rovers problems (Color table online)

Seventy-six of the one hundred generated Rovers problems were solved by CFP. The five number summary for plan makespan, number of action execution steps, time required to derive the plan, and memory required to derive the plan for the solved problems are presented in Table 19. The temporal makespan of the derived plans ranged from 1154 to 3161 and had a median of 2299. Full makespan results are presented in Table 20. A value of “NE” indicates a nonexecutable coalition was allocated. The number of action execution steps is presented in Table 21 and ranged from 486 steps to 935 steps, with a median of 656 steps. The computation time and memory usage is presented in Tables 22 and 23, respectively. Computation time ranged from 812 to 2292.8 s and had a median of 1296.1 s. Memory usage ranged from 0.93 to 4.41 GB, with a median of 1.88 GB.

Table 22 Planning time in seconds using CFP for the Rovers problems (Color table online)
Table 23 Required memory in gigabytes using CFP for the Rovers problems (Color table online)

A comparison of the set of 76 problems commonly solved by PA and CFP is shown in Table 24. The median makespan ratio was 1.34, with 62 of the problems having higher makespan solutions (ratio \(>1.0\)) when solved using CFP compared to PA. Six of the CFP derived plans had more actions than the corresponding plans derived using PA, with a median ratio of 0.85. Computation time was lower with CFP than with PA for 71 of the problems (ratio \(<1.0\)), with a median ratio of 0.73. Finally, all 76 of the problems required less memory to solve with CFP than with PA, with a median ratio of 0.11.

The plan merge step in the Rovers domain is simple as the rovers are mostly independent. The difficulty comes from the single shared communications channel with the lander. A simple merge where each plan is executed concurrently does not work because the communications channel is disrupted if multiple rovers attempt to transmit at the same time. The merged plan must ensure no two rovers are ever concurrently transmitting with the lander.

Table 24 Median ratio of the CFP metric value to the PA metric value for Rovers problems

The twenty-four problems that were not solved by CFP due to a nonexecutable coalition being assigned to a task failed due to not being able to reach required waypoints to collect scientific data. Most of the instances were due to a nonexecutable coalition assigned to the rock or soil analysis task. A rover must be at waypoint \(w_i\) to collect rock or soil analysis at \(w_i\). If no rover in the coalition can reach \(w_i\), then the coalition is nonexecutable. The imagery analyses are less likely to be nonexecutable due to a rover not being required to be at \(w_i\) to collect imagery data for \(w_i\). Imagery data for each waypoint requiring imagery analysis was collectable from multiple waypoints. Multiple specific waypoints must be unreachable in order for a coalition to fail this task whereas a single waypoint being unreachable can make rock or soil analysis nonexecutable.

6.2.3 Zenotravel

Eighty-five of the one hundred generated problems were solved by CFP using the resource model greedy coalition formation algorithm and the COLIN planner in EHC mode. The fifteen problems were due to using the incomplete EHC-based planning algorithm. The five number summary for plan makespan, number of steps, time to derive the plan, and memory required to derive the plan for the solved problems are presented in Table 25. Derived plan makespan is presented in Table 26. One Grand Coalition completed all Missions (Grand Coalition 9), while five Missions were completed by all Grand Coalitions (Missions 1, 2, 6, 9, and 10). The makespan of the derived plans ranged from 686 to 1891, with a median of 1055. Grand Coalition 6 had the smallest average makespan, while Grand Coalition 2 had the largest. Mission 5 had the longest average makespan, while Mission 7 had the shortest. The number of action execution steps ranged from 459 steps to 663 steps, with a median of 529 steps, and is presented in Table 27. Mission 6 required the most steps (649.0), while Mission 10, with 467.0 steps, required the least. The time to derive a satisficing plan for each problem ranged from 279.7 to 936.4 s, with a median of 410.2 s. Full results are presented in Table 28. Mission 6 required the most time, at 805 s, while Mission 10 required the least, at 316 s. The required memory results are presented in Table 29 and ranged from 0.15 to 0.19 GB, with a median of 0.16 GB.

Table 25 The five number summary for solving the Zenotravel problems using CFP
Table 26 Temporal makespan using CFP for the Zenotravel problems (Color table online)
Table 27 Number of action execution steps in the derived satisficing plan using CFP for the Zenotravel problems (Color table online)
Table 28 Planning time in seconds using CFP for the Zenotravel problems (Color table online)
Table 29 Required memory in gigabytes using CFP for the Zenotravel problems (Color table online)

Coalition formation was the least likely to generate nonexecutable coalitions for the Zenotravel domain. The domain and capability model was designed such that a candidate coalition will always be executable. The only instance in which a coalition will be nonexecutable occurs if planning for an earlier task transports a passenger to an airport inaccessible to the nonexecutable coalition. The iterative initial state will have changed enough that the candidate coalition will no longer be executable. Assume the goal for \(v_i\) is to transport passenger \(pass_i\) from \(city_o\) to \(city_i\) and the goal for \(v_j\) is to transport \(pass_j\) from \(city_o\) to \(city_j\). \({\varPhi }_j\) is assigned to \(v_j\) and can only travel between \(city_o\) and \(city_j\). If \(v_i\) is planned before \(v_j\) and the plan for \(v_i\) transports \(pass_j\) to \(city_i\), then \({\varPhi }_j\) will be nonexecutable due to being unable to pickup \(pass_j\) in \(city_i\).

The Zenotravel domain has the simplest plan merge due to all tasks being independent. Two plans executed by disjoint coalitions can be executed concurrently. \(plane_a\) transporting \(cargo_i\) from \(city_m\) to \(city_n\) is independent of \(plane_b\) transporting \(cargo_j\) from \(city_m\) to \(city_n\). Plan merge must consider when the agent’s prior action ended execution and if the passengers or cargo are prepared for loading or unloading. For example, a plane can begin refueling as soon as the plane arrives at an airport, but it will have to wait to continue to its destination if new passengers or cargo are not yet ready to board.

The average makespan for plans derived for Grand Coalition 2 was 1360, which is 0.93 standard deviations above the average makespan, while the average makespan for plans derived for Grand Coalition 6 was 913, 0.71 standard deviations below the average makespan. Grand Coalition 2 was composed of six long-range planes and twelve short-range planes, while Grand Coalition 6 was composed of nine long-range planes and fourteen short-range planes. Having more planes of both types allowed Grand Coalition 6 to complete more tasks simultaneously. Mission 6 required an average of 649 action execution steps per derived plan. Mission 6 had nineteen tasks to be planned, with fifteen tasks requiring that the passengers and cargo be unloaded from one plane and loaded onto another plane. Mission 1 also had nineteen tasks, but required a below average number of action execution steps. Eleven of the Mission 1 tasks did not require passengers and cargo to change planes; therefore, fewer actions were required to produce a satisficing plan. The memory results are an effect of using the EHC version of COLIN. Focusing on specific agents and goals decreases the size of the heuristic plateaus that require breadth-first search.

6.2.4 Summary

Table 30 Coalition formation then planning summary

CFP provided improved performance in terms of computational resource usage compared to PA on some problems, while introducing a new failure mode, nonexecutable coalitions. A summary of the results is presented in Table 30. The number of Blocks World problems solved by CFP is lower than the number of problems solved by PA, with 26 problems solved with CFP versus 42 solved with PA. However, most of the 74 failures are correctable. Nonexecutable coalition failures are correctable and caused 69 failures. Only five of the failures were due to exceeding the computational resource limits, which is not correctable by any of the presented tools. Only one Mission failed to generate a plan with any of the Grand Coalitions, compared to three Missions with no plans for PA. Fewer problems were solved for the Rovers domain by CFP compared to PA. CFP solved 76 problems, while PA solved 100. However, all the failures were due to nonexecutable coalitions, which is a correctable failure mode. Most of the Rovers problems were solved faster by CFP than PA. Furthermore, CFP had a required memory usage median of only 11% that required by PA. Most (85 out of 100) of the Zenotravel problems were solved using CFP, which is much better than the zero problems solved using PA. The fifteen failures were due to using a planning algorithm based on the incomplete Enforced Hill Climbing algorithm. Planning algorithm failures are not addressed by the presented tools, but using a complete underlying planning algorithm can allow the problems to be solved.

CFP provides an alternative method of centralized planning for multi-agent systems providing reduced computational resource usage at the cost of being an incomplete algorithm. Problems in both the Rovers and Blocks World domains failed due to nonexecutable coalitions. The coalition formation algorithms used only consider task capability requirements and capabilities available to the coalitions; thus, the algorithms are reliant on the capability models being accurate representations of each coalition’s action set and each task’s requirements. Research in coalition formation includes spatial and temporal constraints on tasks. The more expressive coalition formation problem is more difficult, but provides more information as to whether or not a coalition can complete a task. The Rovers domain is an example of task spatial constraints. Considering the waypoints each rover can reach allows a coalition formation algorithm to eliminate rovers which cannot reach required waypoints; however, the more complex coalition formation formalisms are still estimates of the coalitions capable of completing a task. A plan must still be produced and coalitions can still be found to be nonexecutable. The relaxed plan coalition formation modification to CFP was developed to addresses this shortcoming to CFP.

6.3 Relaxed plan coalition augmentation

figure w

The Coalition Formation then Planning with Relaxed Plan Coalition Augmentation (RPCA) tool addresses the nonexecutable coalition limitation that arises with the CFP tool. RPCA adds logic to the planning loop of the Coalition Formation then Planning tool (see lines 11–17 in Algorithm 3). Planning for relaxed domains creates plans from low fidelity models of the real-world, but the problem is easier; thus, relaxed plans are appropriate for use as a heuristic or, in this case, coalition modification. Planning is attempted as in CFP, Line 10. If planning fails, then the algorithm enters a loop (lines 11–17). A nonexecutable coalition, \({\varPhi }_i\), is modified in order to make it executable. Coalitions can be modified by adding agents, removing agents, or a combination thereof; however, allowing any modifications introduces an exponential number of possible coalitions. The tool is limited to adding agents in order to transform a nonexecutable coalition into an executable coalition. The set of actions available to \({\varPhi }, AllActions\), is derived, Line 12. A relaxed plan to complete the task, \(\pi _r\), is generated from AllActions, Line 13. The grand coalition, \({\varPhi }\), the currently allocated coalition, \({\varPhi }_i\), and the generated relaxed plan, \(\pi _r\), are analyzed to select additional agent(s) to allocate to the task, Line 14. Each action execution step in the relaxed plan is analyzed in execution order. If the action in the step can be executed by an agent in \({\varPhi }_i\) or a subcoalition of \({\varPhi }_i\), then analysis continues to the next action execution step, otherwise, the agent or coalition assigned to the action execution step is added to \({\varPhi }_i\) and relaxed plan analysis stops. At least one agent must be added to \({\varPhi }_i\) before planning is attempted again, so if all steps of the relaxed plan are analyzed and no agent has been added, then the agent, \(\phi \), executing the most action execution steps in \(\pi _r\) and \(\phi \not \in {\varPhi }_i\) is added to \({\varPhi }_i\). Planning is attempted with the new coalition, Line 16, and the loop repeats if necessary.

The relaxed plan coalition formation augmentation loop (Lines 11–17) has two potential completions. Either an executable plan is derived using the newly generated coalitions and planning attempts or the grand coalition is allocated to the task and the algorithm is unable to identify a plan. The latter case is inconclusive, since the grand coalition can be nonexecutable for multiple reasons. The grand coalition may be nonexecutable due to previous planning decisions that make the task nonexecutable. An example can be created from a finite fuel modification to Zenotravel. The version of Zenotravel used in these experiments allows unlimited refueling, but if the amount of fuel available is limited, then prior planning decisions using excessive amounts of fuel can create situations in which no planes can reach their destination. The grand coalition may also be nonexecutable if the task cannot actually be executed, in which case the problem is unsolvable by any tool. An example of a nonexecutable grand coalition in the Blocks World domain is a grand coalition for which there are fewer than two arms with access to a required end effector type. If all the initially allocated coalitions are executable, then the loop starting at Line 11 is never entered; thus, RPCA reduces to CFP when all coalitions are executable. All Zenotravel coalitions were executable, so RPCA results are not reported for zenotravel.

6.3.1 Blocks World

Table 31 The five number summary for solving the Blocks World problems using RPCA

Seventy of the one hundred problems were solved by RPCA within the time limit of 1 h, a large improvement over the 42 and 25 problems solved by planning alone and CFP, respectively. The five number summary for the solved problems are presented in Table 31. The quality of the derived plans are presented in Table 32. At least one satisficing plan was derived for all Grand Coalitions and all Missions. Plans were derived for all Grand Coalitions for Mission 9 and for all Missions for Grand Coalition 6. The number of actions in each plan is presented in Table 33. The derived plans had a median of 70 action execution steps and ranged from 37 to 168 action execution steps. Time and memory requirements are presented in Tables 34 and 35, respectively. Time requirements ranged from 2.6 to 3111.9 s, with a median of 76.7 s. Memory usage had a median of 0.35 GB and ranged from 0.01 to 30.80 GB.

Table 32 Temporal makespan using RPCA for the Blocks World problems (Color table online)
Table 33 Number of action execution steps in the derived plans using RPCA for the Blocks World problems (Color table online)
Table 34 Planning time in seconds using RPCA for the Blocks World problems (Color table online)
Table 35 Required memory in gigabytes using RPCA for the Blocks World problems (Color table online)
Table 36 Median ratio of the RPCA metric value to the PA metric value for Blocks World problems

A set of 37 Blocks World problems were commonly solved by both PA and RPCA. The median metric ratios for the commonly solved problems are presented in Table 36. The solution quality metric ratios were 1.45 and 1.11 for makespan and steps, respectively. A ratio greater than 1 indicates that the plans derived by PA were better than the plans derived by RPCA. The makespan and action ratios were greater than 1 for 29 and 24 problems, respectively. However, the computational resource ratios were 0.73 and 0.70 for time and memory, respectively, indicating that RPCA derived plans while using less memory than PA. RPCA required less time and memory than PA for 25 and 24 problems, respectively.

The coalition formation failures in CFP are a result of the plans derived for earlier tasks and can be divided into two different cases. Assume a problem with the start state in Fig. 1a and the goal state description illustrated in Fig. 1b. Achieving stack C in the goal state requires moving D and E in order to access C. The goal state to plan to achieve stack C does not place any constraints on D and E. The first coalition formation failure case occurs when an arm is still holding a block at the end of a task plan. Blocks D or E in this example can remain held at the end of the plan due to not being part of the goal description. The second coalition formation failure case occurs when a block is placed such that the blocks in the goal description are inaccessible. Block D placed on G is an example of the second failure case. The end effector required to manipulate D is not considered in the capabilities required vector for the stack E task. If the coalition for stack E cannot manipulate D, then the coalition will be nonexecutable.

6.3.2 Rovers

Table 37 The five number summary for solving the Rovers domain problems using RPCA
Table 38 Temporal makespan using RPCA for the Rovers problems (Color table online)
Table 39 Number of action execution steps in the derived plans using RPCA for Rovers problems (Color table online)
Table 40 Planning time in seconds using RPCA for the Rovers problems (Color table online)
Table 41 Required memory in gigabytes using RPCA for the Rovers problems (Color table online)

RPCA solved all of the Rovers problems. The five number summary of the collected metrics is presented in Table 37. Each problem’s solution makespan is presented in Table 38 and ranged from 1074 to 4043, with a median of 2355. The number of action execution steps in the derived plans is presented in Table 39 and had a median of 658 steps. Plan derivation time ranged from 777.0 to 2289.3 s, with a median of 1280.6 s. Full derivation time results are presented in Table 40 and the memory results are presented in Table 41. Required memory ranged from 0.93 to 7.99 GB.

All 100 Rovers problems were solved by both PA and RPCA. The median metric ratios are presented in Table 42. The solution quality metric ratios were 1.38 and 0.87 for makespan and steps, respectively. PA produced lower makespan plans than RPCA for 85 of the problems, while produced plans with more actions than the corresponding RPCA plans for 86 of the problems. The computational resource ratios were 0.71 and 0.11 for time and memory, respectively, indicating that RPCA derived plans while using less memory than PA. PA required more time than RPCA for 95 of the problems, and more memory to solve all of the problems.

RPCA was able to solve the 24 problems that CFP failed to solve and in so doing solved every problem that planning alone was able to solve, but with a much lower computational resource requirement. The median memory usage and computation time for RPCA was 11% and 72% that of PA, respectively. The CFP failures were the result of rovers being unable to reach a necessary waypoint to collect scientific data. The generated relaxed plan provided information regarding which rovers were able to reach the required waypoints. A single rover was selected to augment the coalition and planning reattempted.

Table 42 Median ratio of the RPCA metric value to the PA metric value for Rovers problems

6.3.3 Summary

The RPCA algorithm was able to correct all the nonexecutable coalition failures that occurred during the CFP evaluation. RPCA solved at least as many problems as CFP in each domain, as summarized in Table 43. The RPCA algorithm solved more Blocks World problems than PA and did so with lower average computation time and memory. RPCA solved the same Blocks World problems as CFP in addition to others, but PA solved some problems that RPCA did not. RPCA solved all the Rovers problems, requiring median lower memory and computation time compared to PA. All of the nonexecutable coalition failures from CFP were fixed by applying RPCA. No nonexecutable coalition failures occurred when CFP was applied to the the Zenotravel problems; thus, the RPCA results are identical to the CFP results, because RPCA only addresses CFP’s nonexecutable coalition failure mode.

The ability to correct for nonexecutable coalitions is imperative when generating plans for real-world problems. Generating a relaxed plan is easier than the original planning problem, so the grand coalition can be used in relaxed planning. Using the grand coalition during the relaxed planning step generates additional information unavailable during coalition formation. An agent, previously unassigned to the coalition, is selected to augment a nonexecutable coalition and transform it into an executable coalition. Adding agents to a nonexecutable coalition does not guarantee that the coalition will become executable, but it does provide additional actions that can be considered during planning to produce an executable plan for the coalition’s assigned task.

Table 43 Relaxed plan coalition augmentation summary

6.4 Task fusion

The Coalition Formation followed by Task Fusion then Planning (TF) tool addresses the limited ability of CFP to reason over task interactions. Coalition formation is applied as in CFP, Line 5 in Algorithm 4. Task fusion reasons over the tasks and their capabilities required and the coalitions and their capabilities offered to guess which tasks and coalitions are most likely to interact and benefit from joint planning, Line 6. Two coalition-task pairs, \(\langle {\varPhi }_a, v_a \rangle \) and \(\langle {\varPhi }_b, v_b \rangle \), selected to be fused create a single coalition-task pair, \(\langle {\varPhi }_a \cup {\varPhi }_b, v_a \wedge v_b \rangle \).

Fusing tasks into a single task permits more potential interactions between the tasks to be considered during the planning step. The coalition capability offerings and the task capability requirements will be used to determine which tasks to fuse. Assume two coalitions, \({\varPhi }_i\) and \({\varPhi }_j\), are allocated to two tasks, \(v_i\) and \(v_j\), respectively. If \({\varPhi }_i\) is also a candidate coalition for \(v_j\) and \({\varPhi }_j\) is also a candidate coalition for \(v_i\), then the coalitions can assist each other; fusing the coalition-task allocations allows both coalitions to contribute to both tasks during the planning step. Each coalition-task allocation fusion decreases the number of subproblems by one, but the difficulty of planning for the two tasks is much lower than planning for the fused subproblem due to the exponential complexity of the planning problem. The goal of Task Fusion is to select the coalition-task pairs to fuse for which improved plan quality is worth the increase in computational resources.

TF applied to the Blocks World domain allows planning to intentionally make progress towards constructing multiple stacks. A block can be placed intentionally where it belongs in the goal state. TF can be applied to the other experimental domains as well. Fusing tasks in the Rovers domain allows rovers to work together. Assume the rock and soil tasks have been fused. If a rover is capable of both analyses and is at a waypoint requiring both analyses, then the rover can perform both rock and soil analysis. Zenotravel task fusion allows more of each plane’s capacity to be used. The CFP algorithm only plans for one task at a time; thus, the current task’s passengers and cargo must be transported to their destination before passengers and cargo for subsequent tasks can be considered. Fusing tasks involving transport between the same hubs will allow more of each plane’s capacity to be used.

figure af

TF allows mission planners to tune computational resource usage and plan quality. Analyzing TF is most logical in mission planning problems for which plan quality is measured by plan makespan due to the tradeoff between planning time and plan quality introduced by the tool. The time to derive a solution can be as important as the quality of the derived solution. Solving problems in domains, such as planetary rovers, can take longer to plan, because the time to execute the plan is likely to dominate the planning plus execution time equation due to the large areas being explored. If the makespan unit in the example Blocks World problem is 30 s, then TF minimizes planning time plus execution time. If the makespan unit is 1 s, then CFP minimizes planning plus execution time. If the makespan unit is 1 min, then planning alone minimizes planning plus execution time. TF can be applied to decrease plan makespan at the cost of increased memory usage and planning time. However, TF must be applied intelligently, as increasing the planning time by minutes is illogical if the plan makespan is only decreased by seconds. TF represents an intermediate option between CFP and planning alone. Fusing selected coalition-task pairs can produce better plans than CFP, while still satisfying computational resource constraints.

Fig. 3
figure 3

Number of problems in each domain with derived plans by algorithm

Fig. 4
figure 4

Metrics box plots for Rovers problems solved with PA and RPCA. a Makespan, b steps, c computation Time and d required memory

6.5 Summary

The first tool evaluated, PA, is an existing method for solving multi-agent planning problems containing expressive state models. The other two tools evaluated in this research, CFP and RPCA, present an alternative for multi-agent planning addressing the problem’s computational complexity. Three domains were used to evaluate the tools, Blocks World, Rovers, and Zenotravel. Each tool was evaluated using 100 randomly generated problems from each domain, with the number of problems for which plans were derived presented in Fig. 3. RPCA derived plans for at least as many problems as PA in each domain. Every problem for which CFP derived a plan, an identical plan was derived by RPCA. Furthermore, every nonexecutable coalition failure mode that occurred during the CFP evaluation was corrected by RPCA; thus, RPCA dominates CFP and CFP will not be further analyzed. Statistical analysis of the PA and RPCA is complicated by the computation resource limits applied during the evaluation. The limits are realistic in that real-world applications of planning research must consider the time and memory required to solve a problem, but the limits cause the collected data to be right-censored, i.e., no data is collected for problems requiring time above the limit or memory above the limit, even though problems capable of generating such data are included in the evaluation. Missing data limits the data points that can be used in statistical significance testing and when determining effect size. All the problems attempted during the evaluation are solvable, but data is lacking for many, especially for the PA algorithm. Given the limits of the data collection, statistical analysis using limited assumptions regarding the data was conducted.

Both PA and RPCA solved Blocks World problems that the other algorithm did not, with PA solving 42 problems and RPCA solving 70 problems. A McNemar’s test found a significant difference in the proportion of problems for which plans were derived by each tool (\(a=37, b=5, c=33, d=25, p<0.001\)). The set of 37 problems commonly solved by both PA and RPCA was used for analyzing the four metrics using the sign test. Significant differences between the PA and RPCA algorithms were found for makespan (\(Z=3.66, p<0.001\)) and computation time (\(Z=2.33, p=0.028\)), but there was not enough evidence to reject the significance hypothesis for differences in required memory (\(Z=1.81, p=0.099\)) or number of actions (\(Z=1.86, p=0.090\)). Given that all 58 of the PA failures were due to exceeding memory limits, it is predicted that increasing the memory limit will produce additional data points that will demonstrate significant differences for required memory between PA and RPCA in the Blocks World domain.

All the Rovers problems were solved by both PA and RPCA. The box plots for each metric are presented in Fig. 4, please note that the y-axis for the time and memory box plots are on a log scale. The sign test found significant differences between PA and RPCA for all four metrics (\(Z=7.00, p<0.001\) for makespan, \(Z=7.20, p<0.001\) for action steps, \(Z=9.00, p<0.001\) for computation time, and \(Z=10.00, p<0.001\) for required memory), with PA deriving plans with lower makespan, but RPCA requiring less time and memory to derive plans with fewer action executions.

PA derived no plans for the Zenotravel domain; therefore, it is infeasible to provide an analysis of the results compared to RPCA. RPCA derived plans for 85 of the problems.

7 Conclusion

Multi-agent systems will only become more complex and robots will need to plan and act autonomously alongside humans, not simply be controlled by humans. An important aspect of mission planning is the tradeoff between mission planning time and plan quality. If unlimited computational resources are available, then existing planning algorithms alone are able to solve large, complex mission planning instances for high fidelity real-world domain models. However, computational resource constraints and real-world time constraints encourage development of new tools to support real-time planning for dynamic, uncertain real-world domains. The presented tools manage the planning problem’s exponential complexity. The Coalition Formation then Planning tool uses existing coalition formation algorithms to factor the problem prior to applying existing planning algorithms in order to generate executable plans. Two issues arise with the Coalition Formation then Planning tool: nonexecutable coalitions and suboptimal plans. The Relaxed Plan Coalition Formation tool addresses the nonexecutable coalition issue by using relaxed planning to select agents to insert into coalitions in order to transition nonexecutable coalitions into executable coalitions. Task Fusion addresses the suboptimal plans issue by fusing coalition-task pairs for which the increase in plan quality from planning simultaneously offsets the increase in computational resource requirements. Deciding whether or not to apply Task Fusion is mission dependent. If plan quality is judged by makespan, then makespan must be reduced by an amount of time greater than or equal to the additional time required to derive the plan. These tools work together to generate executable plans for large, complex mission planning problems with high fidelity domain models that include continuous variables and concurrently executing actions.

Future work involves integrating the Task Fusion augmentation into the existing framework for coalition formation then planning and analyzing different heuristics for selecting tasks for fusion. A test domain modeling a first response scenario and associated problems will be evaluated with each of the presented tools. Finally, the mission planning system will be integrated with the i-CiFHaR coalition formation system in order to create a complete system for real-world mission deployments.