Hybrid mission planning with coalition formation

Dukeman, Anton; Adams, Julie A.

doi:10.1007/s10458-017-9367-7

Published: 08 May 2017

Volume 31, pages 1424–1466, (2017)

Autonomous Agents and Multi-Agent Systems

Abstract

The increase in robotic capabilities and the number of such systems being used has resulted in opportunities for robots to work alongside humans in an increasing number of domains. The current robot control paradigm of one or multiple humans controlling a single robot is not scalable to domains that require large numbers of robots and is infeasible in communications constrained environments. Robots must autonomously plan how to accomplish missions composed of many tasks in complex and dynamic domains; however, mission planning with a large number of robots for such complex missions and domains is intractable. Coalition formation can manage planning problem complexity by allocating the best possible team of robots for each task. A limitation is that simply allocating the best possible team does not guarantee an executable plan can be formulated. However, coupling coalition formation with planning creates novel, domain-independent tools resulting in the best possible teams executing the best possible plans for robots acting in complex domains.

1 Introduction

The range of domains in which robots can operate is rapidly expanding as robotic capabilities increase. Some robotic domains, such as mass casualty response, will require close coupling between the human and robot responders in order to successfully complete the mission. A taxonomy for categorizing multi-agent system problems includes the types of robots (single-task robots vs. multi-task robots), the number of robots per task (single-robot tasks vs. multi-robot tasks), and when information is available (instantaneous allocation vs. time-extended allocation) [20]. This manuscript addresses centralized, domain-independent planning for multi-task robot, multi-robot task, instantaneous allocation missions.

The planning problem uses an initial state and a set of actions and constraints to derive a plan to achieve a goal state. The planning problem complexity is partially a function of the number of robots and tasks and partially a function of the domain model complexity [16]. Planning for expressive real-world models with durative actions, joint actions, concurrently executing actions, continuous variables, and continuous effects is much harder than planning domains with instantaneous actions and boolean variables.

Consider a mass casualty response scenario after a tornado, such as the EF-4 tornado in Tuscaloosa, AL on April 27, 2011. The immediate response involved hundreds of responders from local government agencies and required coordination to complete several tasks, including clearing impassable roads, securing prohibited items, triaging the wounded, and locating victims in the disaster area. Moving about the environment, clearing debris from roads, and all other actions require time and are not instantaneous actions. Agents must be able to coordinate actions and work concurrently. Real-world domains include continuous variables, such as fuel level, that must be considered. The heterogeneous agents, complex tasks, and complex environment complicate this difficult planning problem.

One method to address planning complexity is factored planning, which splits the goal into lower complexity subgoals. Multi-agent planning approaches using factored planning focus on how the entire coalition [4, 14] or an individual agent [46] can solve subproblems. Factored planning as part of single-agent planning does not address multiple agents [7, 23]. Solutions for path-planning [47] and target tracking [24] exist, but are domain-dependent. The tools presented in this manuscript address domain-independent planning problems with multiple, heterogeneous agents and multiple, complicated tasks in complex domains with durative actions, concurrently executing actions, continuous variables, and continuous effects by factoring the problem by both tasks and coalitions of agents.

Coalition formation can manage problem complexity by allocating a team of agents to each task. Coalition formation uses the capabilities of the agents and the capability requirements of the tasks to form teams of agents that can accomplish a set of assigned tasks, while optimizing an objective function (e.g., utility, cost, or number of tasks completed) [40]. The factoring produced by coalition formation generates several smaller problems from the original problem; however, coalition formation cannot guarantee that the allocated coalitions will be able to execute the tasks to which they are assigned.

Current coalition formation research does not address the nonexecutable coalition problem (a coalition that cannot complete its assigned task), and current planning research does not perform task allocation on the scale of coalition formation. Factored planning is a popular approach to decomposing the problem into manageable subproblems, but existing factored planning algorithms do not consider agents. Multi-agent planning performs goal allocation, but typically does not consider how multiple agents working together simultaneously on a single task affects plan quality.

Three novel tools incorporating both coalition formation and planning are presented: coalition formation then planning, relaxed plan coalition formation, and task fusion. The coalition formation then planning approach, which serves as the basis for the other tools, uses the output of the coalition formation problem as the input to the planning problem. Tasks are planned separately by the allocated coalitions and the results are merged into a single solution for the original problem. The coalition formation then planning tool derives satisficing solutions quickly, but raises two problems: nonexecutable coalitions and suboptimal solutions. If coalition formation allocates a coalition to a task that the coalition cannot complete, then a new coalition must be allocated. Relaxed plan coalition formation augments coalition formation then planning by performing iterative planning, relaxed planning, and coalition formation until a valid executable plan is identified. The second problem with coalition formation then planning is suboptimal solutions. Limiting the agents available for planning limits the problem complexity, but also reduces the size of the problem solution set, which can lead to suboptimal solutions. Task fusion balances solution quality with problem complexity by planning jointly for those tasks and coalitions for which the higher solution quality outweighs the increased problem complexity.

Section 2 provides an overview of related research. Section 3 presents a formal definition of the problem. Three experimental domains are given in Sect. 4. Section 5 presents the experimental design used to evaluate the presented planning tools. Section 6 presents the tools and the results for each tool solving problems in each experimental domain, followed by a discussion of how the results motivate the next tool. Finally, conclusions and future work are presented in Sect. 7.

2 Related work

2.1 Coalition formation

Coalition formation is a subclass of the task allocation problem without constraints on the number of agents (robot or human) allocated to each task or on the number of coalitions of which an agent is a member. The coalition formation problem is NP-complete [37], is difficult to approximate [40], and represents a multi-task, multi-robot problem that incorporates algorithms for both instantaneous allocation [42, 50] and time extended allocation [25, 34]. The goal is to form teams of agents that are together more capable than the team’s individual agents and can accomplish a set of assigned tasks, while optimizing an objective function (e.g., utility, cost, tasks completed). The general coalition formation problem assumes a grand coalition of n agents, ${\varPhi }=\{\phi _1,\ldots ,\phi _n\}$, and a set of m tasks, $V=\{v_1,\ldots ,v_m\}$. A solution is a map of each task, $v \in V$, to a coalition assigned to the task, ${\varPhi }_v \subseteq {\varPhi }$ [37].

Agents and tasks are modeled as their capabilities offered or capabilities required, respectively. Two different capability models are used, the resource model and the service model. The resource model treats each agent as a set of available resources (e.g., chemical sensor, camera, laser) and each task as a set of required resources. Let Res be a vector of possible resource types, where $Res_i$ is the ith resource type. Each agent, $\phi $, is modeled as a resources available vector, $Res^{\phi }$, and a coalition, ${\varPhi }_i \subseteq {\varPhi }$, is modeled as a vector equal to the sum of the available resource vectors of the constituent agents, $Res^{{\varPhi }_i}=\sum _{\phi \in {\varPhi }_i} Res^{\phi }$. A task, v, is similarly defined as a resources required vector, $Res^v$. All elements of resources available vectors and resources required vectors must be non-negative, and at least one element in each vector must be non-zero. A coalition, ${\varPhi }_j$, is a candidate coalition for a task, v, if and only if it has available at least as many of each resource type as the task requires, $\forall i, Res^{{\varPhi }_j}_i \ge Res^v_i$. Only a candidate coalition for v can be allocated to v.

The service model associates with each agent a set of functions that the agent can perform (e.g., box-pushing, mapping, sentry-duty). Let Ser be a vector of possible service types, where $Ser_i$ is the ith service type. An agent, $\phi $, has a services available vector indicating whether or not each service is offered by the agent, $Ser^\phi _i \in \{0,1\}$, where $Ser^\phi _i$ is 1 if $\phi $ offers service i and 0 if not. A coalition, ${\varPhi }$, has a services available vector equal to the sum of the services available vectors of its constituent agents, $Ser^{{\varPhi }}=\sum _{\phi \in {\varPhi }} Ser^{\phi }$. A task, v, is modeled as a services required vector, $Ser^v$, where $Ser^v_i \in {\mathbb {N}}$ is a non-negative integer representing the number of services of type $Ser_i$ required to satisfy v and $\exists j, Ser^v_j > 0$. A coalition, ${\varPhi }$, is a candidate coalition for a task, v, if and only if it has available at least as many services of each type as the task requires, $\forall i, Ser^{{\varPhi }}_i \ge Ser^v_i$.
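The candidate coalition test of the service model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the manuscript: the function names `coalition_services` and `is_candidate` are hypothetical, and the three-element vectors in the usage note are made-up values ordered as (box-pushing, mapping, sentry-duty).

```python
from typing import List, Sequence


def coalition_services(agent_vectors: Sequence[Sequence[int]]) -> List[int]:
    """Sum the binary services-available vectors of a coalition's agents.

    Hypothetical helper: each inner sequence is one agent's Ser^phi vector.
    """
    if not agent_vectors:
        raise ValueError("a coalition must contain at least one agent")
    width = len(agent_vectors[0])
    for vec in agent_vectors:
        if len(vec) != width:
            raise ValueError("all service vectors must have the same length")
        if any(entry not in (0, 1) for entry in vec):
            raise ValueError("agent service entries must be 0 or 1")
    # Element-wise sum: entry i counts the agents that offer service i.
    return [sum(column) for column in zip(*agent_vectors)]


def is_candidate(agent_vectors: Sequence[Sequence[int]],
                 required: Sequence[int]) -> bool:
    """Return True iff the coalition offers at least the required services."""
    if any(r < 0 for r in required) or not any(r > 0 for r in required):
        raise ValueError("requirements must be non-negative with one non-zero entry")
    available = coalition_services(agent_vectors)
    if len(available) != len(required):
        raise ValueError("requirement vector has the wrong length")
    return all(a >= r for a, r in zip(available, required))
```

For instance, with agents `[1, 0, 1]` and `[1, 1, 0]` and a task requiring `[2, 1, 0]`, the two-agent coalition passes the test, while the first agent alone does not because it offers only one box-pushing service and no mapping service.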

There are many heuristic-based coalition formation algorithms, each with its own strengths and weaknesses. Greedy algorithms can derive solutions quickly, but make no guarantees on the solution quality [40, 42, 44, 51]. Approximation algorithms provide solution quality guarantees, but suffer from poor worst-case run-time complexity, which can render them inappropriate for real-time applications [27, 33]. Market-based techniques offer fault-tolerance for a distributed system, but have high communication processing requirements [10, 41, 43, 50]. Biologically inspired ant colony optimization algorithms have been applied to several NP-complete problems, including coalition formation [36, 38]. Different coalition formation algorithms provide different solutions with differing performance. For example, selecting a market-based algorithm with high communications requirements for use in a communications constrained environment results in poor performance. The intelligent Coalition Formation for Humans and Robots system was developed to autonomously reason over the specified mission constraints in order to select a subset of coalition formation algorithms to apply to a particular allocation problem [39].

2.2 Planning

Classical planning results in a satisficing plan that contains a sequence of actions that achieves a goal state [18]. While classical planning is for single-task robots executing single-robot tasks with an instantaneous allocation, variants of classical planning span the Gerkey and Matarić taxonomy. The extensions of classical planning most applicable to this research are temporal planning, continuous planning, and multi-agent planning. Temporal planning admits durative actions to the action set and allows concurrently executed actions in the solution, continuous planning incorporates continuous variables in the state space and continuous effects in the actions, and multi-agent planning models multiple agents executing actions, rather than a single agent, as in classical planning.

Temporal planning incorporates durative actions and concurrent action execution. The model for durative actions expands the classical action model to include a duration and temporal specifications for action conditions and effects. Action duration specifies the length of time required to execute the action. The temporal specifications indicate when conditions must be satisfied (at the beginning, at the end, or over the entire action duration) and when the effects are applied (at the beginning or at the end of action execution). A solution to the temporal planning problem is a satisficing plan that combines a set of actions with execution constraints to achieve a goal state. Temporal planning solutions can be classified as single-task agents executing single-agent tasks in an instantaneous allocation. State-space search is a popular planning method, used by planners such as Yet Another Heuristic Search Planner (YAHSP) [49], but other methods, such as SAT-based planning (e.g., ITSAT [35]), also exist.

Approaches to managing problem complexity include subgoal partitioning and state-based decomposition. Subgoal Plan solved large problems by creating a subgoal partitioning through goal constraint analysis [7]. The subproblems were solved by Metric Fast Forward [21] and were significantly easier to solve than the original problem. The time to solve a problem exponentially decreased as the subproblems’ size was linearly reduced. Divide-and-Evolve, similar to Subgoal Plan, used a preprocessing step to decrease problem complexity before an encapsulated satisficing planner was used to solve the problem [3]. Divide-and-Evolve used a state-based decomposition strategy to find a sequence of intermediate states that collectively solve the problem. Divide-and-Evolve with YAHSP [49] as the encapsulated planner solved significantly more problems than YAHSP alone.

Continuous planning incorporates continuous variables in the state space and expands the action model to include continuous effects. Classical planning models require continuous variables to be discretized; however, real-world models are more accurate when state variables, such as fuel level and temperature, can be modeled as continuous variables. Continuous planning must be combined with temporal planning, because only a durative action can have an effect applied over the entire action duration, known as a continuous effect. For example, an accurate real-world model of aircraft flight must include a continuously decreasing fuel level. If continuous effects are not admitted, then the change in fuel level over the entire action duration must be applied instantaneously. A solution to the continuous planning problem is a satisficing plan that combines a set of actions with execution constraints to achieve a goal state. Continuous planning solutions can be classified as single-task agents executing single-agent tasks in an instantaneous allocation.

Some planners, such as Temporal Fast Downward (TFD) [17], support continuous variables, but not continuous change. Zeno was the first planner to allow continuous change in planning problems [31]; however, it was unable to handle concurrent continuous effects, such as those required for an accurate model of in-flight refueling. COLIN (COntinuous LINear) extended state space search techniques to manage continuous effects [8]. Other continuous planning algorithms include IxTeT [26], Sapa [12], and dReal [5]. Accommodating domains with non-linear continuous change allows real-world domains to be more accurately modeled, but is only supported by dReal.

Multi-agent planning explicitly considers multiple agents executing actions by substituting a set of agents for a set of actions in the classical planning definition, where each agent is modeled as a set of actions that the agent can execute. The solution is a plan specifying a set of actions with execution constraints and an associated agent responsible for executing each action. Multi-agent planning solutions exist for single-task and multi-task agents, single-agent and multi-agent tasks, and instantaneous allocation.

Factored planning approaches are natural multi-agent planning solutions. One of the first such algorithms used individual agent planning to generate a heuristic for use in global planning for the original problem [14, 15]. Another factored planning approach performed distributed planning followed by agent voting [46]. Agents modified a base plan and distributed it to the other agents. A new base plan was selected from the set of agent plans by voting. If all agents indicated the base plan satisfied their task, then planning ended, otherwise another iteration of planning and voting occurred.

Deriving plans individually requires a plan merge step to integrate the plans into a single global solution. Plan merge allows agents to take advantage of side products, the unused products of other agents’ actions, to eliminate redundant actions in plan merge steps [52]. Another approach treats the problem as a plan-space search problem in which incremental changes are made until the plan is valid [9]. Part of the plan merge problem requires satisfying all temporal constraints. Simple temporal networks have been applied by encoding the temporal constraints of the individual plans and finding consistent variable assignments representing a valid plan merge [1].
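To make the simple temporal network idea concrete, the sketch below checks consistency with the standard all-pairs shortest path construction (Floyd-Warshall on the distance graph, where a negative cycle means the constraints cannot all hold). It is a generic textbook technique, not the method of [1]; the function name `earliest_schedule` and the encoding of a constraint as a triple `(i, j, w)` meaning $t_j - t_i \le w$ are assumptions made for illustration.

```python
from itertools import product
from typing import List, Optional, Tuple

INF = float("inf")


def earliest_schedule(num_points: int,
                      constraints: List[Tuple[int, int, float]]
                      ) -> Optional[List[float]]:
    """Return earliest times for each time point, or None if inconsistent.

    Each constraint (i, j, w) means t_j - t_i <= w. Time point 0 is the
    reference point (mission start) and is fixed at time 0.
    """
    # Distance graph: d[i][j] is the tightest known bound on t_j - t_i.
    d = [[0.0 if i == j else INF for j in range(num_points)]
         for i in range(num_points)]
    for i, j, w in constraints:
        d[i][j] = min(d[i][j], w)

    # Floyd-Warshall; product() varies k slowest, so k is the outer loop.
    for k, i, j in product(range(num_points), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]

    # A negative diagonal entry reveals a negative cycle: no valid merge.
    if any(d[i][i] < 0 for i in range(num_points)):
        return None

    # The earliest consistent time for point i is -d[i][0].
    return [0.0 - d[i][0] for i in range(num_points)]


# Hypothetical merge of two single-action plans.
# Points: 0 = mission start, 1/2 = start/end of action A (duration 3),
# 3/4 = start/end of action B (duration 2). B may not start before A ends.
merge_constraints = [
    (1, 2, 3), (2, 1, -3),   # t2 - t1 == 3
    (3, 4, 2), (4, 3, -2),   # t4 - t3 == 2
    (3, 2, 0),               # t2 - t3 <= 0, i.e. B starts after A ends
    (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0),  # nothing before mission start
]
schedule = earliest_schedule(5, merge_constraints)
```

Adding a contradictory requirement, such as B ending no later than A starts (`(1, 4, 0)`), creates a negative cycle and the function returns `None`, signalling that the individual plans cannot be merged under those constraints.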

The domain complexity and the method used for factoring the problem are the differentiating aspects addressed by the presented tools. Existing multi-agent planning solutions focus on instantaneous actions and discrete state spaces or assume that tasks are executable by a single agent [4, 11, 13, 30]. Developing real-world domain models requires durative actions in continuous state spaces for tasks that require multiple agents.

2.3 Integrated task allocation and planning

Chance-constrained task allocation is an example of an approach that incorporates task allocation with aspects of planning [32]. Each agent estimated the utility of completing each task. Allocation utility was a function of the agent allocated to the task, when the agent would be able to execute the task, and a predefined model of problem uncertainty. However, real-world problems do not consist exclusively of single-agent tasks.

Approaches to coalition formation in which tasks have temporal and spatial constraints can address task allocation and scheduling, but do not produce plans for how agents will execute their allocated tasks when they reach the task location [25, 34]. Both approaches are for multi-task agents executing multi-agent tasks in a time-extended allocation and assumed all agents capable of reaching the task location were able to contribute to the task, an invalid assumption in some missions (e.g., an agent without a camera cannot assist an imaging task).

Auction style coalition formation algorithms allow agents to perform task planning prior to allocation and typically grant exclusive ownership of tasks, which can be detrimental when agents fail to complete their task and no method for informing the other agents of the failure exists. A method based on bounty hunters and bail bondsmen allows for nonexclusive task execution [53]. Following the bail analogy, agents act as bounty hunters and auctioneers as bail bondsmen. The bail bondsmen increase the value of each task until it has been completed. Agents commit to a task and announce to the other agents that they have committed to the task. Agents can plan how to complete each available task, but can only commit to a single task. Agents receive the task utility only upon completing the task. If an agent fails to complete a task, the system adapts by incentivizing other agents to complete the task through increasing task value. This approach lacks collaboration among the agents, a key feature in real-world problems representative of multi-robot task domains. These auction approaches are for multi-task agents executing single-agent tasks.

The Automatic Synthesis of Multi-robot Task solutions through software Reconfiguration (ASyMTRe) system used connected agent and task schemas to allocate coalitions to tasks [45]. Agents were modeled by perceptual, motor, and communication schema and tasks were modeled as a set of motor schema requirements. ASyMTRe connected robot schemas to develop a joint schema capable of accomplishing assigned tasks. For example, if robot $r_i$ knows its position relative to robot $r_j$ and $r_j$ knows its position in a global reference frame, then $r_i$ can derive its position in the same global reference frame. Connecting the various robotic schemas determined which agents were capable of jointly completing a task, but did not plan how the robots completed the task. A similar system, Remote Object Control Interface, considered robots as nodes offering expanded functionality dependent on the other nodes in the system [6]. Both systems fail to produce executable plans for the assigned tasks. These two systems are both for multi-task agents executing multi-agent tasks.

One domain-dependent integration of task allocation and planning that has been extensively studied is the multi-robot task allocation and path planning problem [2, 29, 54]. Simultaneously considering the task allocation and the path to the task allows for collision-free trajectories to be developed more efficiently than if the two problems were considered independently. One application incorporates two agents that swap tasks when a collision is detected [47, 48]. The new trajectories for the agents are guaranteed not to collide with one another. A search and destroy problem with attack UAVs developed a plan offline to determine an optimal search pattern to locate mobile targets whose locations were unknown a priori; thus, the decision regarding which UAVs will perform the attack must be made online [24]. A distributed probabilistic approach considered the path for each UAV to reach the target, the UAV’s attack capability, and the probability of target destruction. These approaches are feasible for the specific domains, but many, diverse multi-agent domains exist and developing a different solution for each domain is impractical.

Task allocation and planning are closely coupled problems, but there is minimal existing literature that addresses the interaction between the two problems. Planning affects task allocation via the developed plans, as the plan constrains the set of agents available by requiring agents to perform actions at specific times. Task allocation affects planning by determining which agents are available when developing a plan. If an agent is not allocated to a task, then the planning algorithm will not use the agent to develop a plan. Tools for coupling domain-independent task allocation and planning will facilitate solving planning problems consisting of multi-task agents executing multi-agent tasks.

3 Formal definition

The presented tools are for planning for multi-task robot, multi-robot task, instantaneous allocation problems [20]. This Hybrid Mission Planning with Coalition Formation problem couples coalition formation with planning to facilitate solving complex problem instances with heterogeneous multi-task robots executing multi-robot tasks.

Definition 1

(Hybrid Mission Planning with Coalition Formation) The hybrid mission planning with coalition formation problem is defined as a tuple, $\langle S, I, {\varPhi }, A, V, C \rangle $, where:

S is the state space,
I is the initial state,
${\varPhi } = \{ \phi _1, \phi _2, \ldots , \phi _m \}$ is the grand coalition of agents,
$A: 2^{\varPhi } \rightarrow 2^{Act}$ is the coalition-action set mapping,
$V = \{ v_1, v_2, \ldots , v_n \}$ is the set of tasks, and
$C = \langle Cap, C_{\varPhi }, C_V \rangle $ is the capability vector, coalition capability mapping, and the task capability mapping.

The hybrid state space, S, includes boolean, discrete, and continuous variables. A state, s, is an assignment of each state space variable to a value in its associated domain. The initial state, I, is the environment state at the beginning of the mission.

The grand coalition, ${\varPhi }$, is the set of all available agents. A coalition, ${\varPhi }_i \subseteq {\varPhi }$, is any non-empty set of agents. The coalition-action set mapping, A, maps a possible coalition, $2^{\varPhi }$, to a set of actions the coalition can execute, $2^{Act}$, where Act is the set of all possible actions. An action is modeled as a tuple, $\langle {\varPhi }_{exec}, eff , cond, dur \rangle $, where:

${\varPhi }_{exec}$ is the executor coalition,
$ eff = \langle eff _\vdash , eff _\leftrightarrow , eff _\dashv \rangle $ are the action effects for atomic fact transitions applied to the state at the beginning of, during, and at the end of action execution, respectively,
$cond = \langle cond_\vdash , cond_\leftrightarrow , cond_\dashv \rangle $ are the action state constraints that must be satisfied at the beginning of, during, and at the end of action execution, respectively, and
dur is a constraint on the length of the time interval required to execute the action.

The executor coalition, ${\varPhi }_{exec}$, for an action, a, is the set of agents that execute a. If ${\varPhi }_{exec}$ is a singleton coalition consisting of a single agent, then a is a single-agent action. If ${\varPhi }_{exec}$ includes more than one agent, then a is a joint action between multiple agents. A state constraint can be applied to boolean or continuous state variables. Constraints on boolean variables specify the truth value the variable must take, while constraints on continuous variables specify the interval to which the variable’s value must belong. Action state constraints can be specified as applying at the beginning, during, or end of action execution, $cond_\vdash , cond_\leftrightarrow $, and $cond_\dashv $, respectively. Action effects at the beginning of action execution, $ eff _\vdash $, can apply to boolean state variables (as setting the value to true or false) or to continuous state variables (as an instantaneous change in value). Action effects throughout action execution, $ eff _\leftrightarrow $, must apply to continuous state variables and represent a continuous change in the value of the variable during action execution. Action effects at the end of action execution, $ eff _\dashv $, can apply to boolean state variables or to continuous state variables. The action duration constraint, dur, is the interval to which action duration must belong. Action duration must be non-negative. Similar actions, such as navigating between waypoints, are considered different if they are executed by different agents. For example, $\phi _i$ navigating from $w_r$ to $w_s$ is different than $\phi _j$ navigating from $w_r$ to $w_s$.

The task set, V, is a set of tasks to be satisfied. Each task, $v \in V$, is modeled as a set of goal state constraints. A task, v, is satisfied in a state, s, if and only if all of v’s goal state constraints are satisfied in s.

The capability vector, $Cap = [ Cap_1, Cap_2, \ldots ]$, is the vector of coalition formation capabilities used in the problem. The coalition capability mapping, $C_{\varPhi }$, is a mapping of each agent to a capability available vector. The elements of a capabilities available vector are non-negative values, with at least one non-zero element. Each agent, $\phi $, has a capabilities available vector, $Cap^\phi $. For example, if $|Cap| = 5$ and $\phi $ has two of $Cap_3$ and three of $Cap_5$, then $Cap^\phi = [ Cap^\phi _1 = 0, Cap^\phi _2 = 0, Cap^\phi _3 = 2, Cap^\phi _4 = 0, Cap^\phi _5 = 3]$, where $Cap^i_j$ is the amount of $Cap_j$ that entity i (agent or coalition) has at its disposal. Each coalition, ${\varPhi }$, has a capabilities available vector, $Cap^{\varPhi }$, equal to the sum of the capability available vectors of ${\varPhi }$’s constituent agents, $Cap^{\varPhi } = \sum _{\phi \in {\varPhi }} Cap^\phi $. The task capability mapping, $C_V$, is a mapping of each task to a capability required vector. The elements of a capability required vector are non-negative reals, with at least one non-zero element. For example, if $|Cap| = 5$ and v requires one of $Cap_2$ and two of $Cap_3$, then $Cap^v = [ Cap^v_1 = 0, Cap^v_2 = 1, Cap^v_3 = 2, Cap^v_4 = 0, Cap^v_5 = 0 ]$, where $Cap^i_j$ is the amount of $Cap_j$ required to satisfy i.
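As a worked check of the two example vectors above, assume the element-wise candidate coalition test of Sect. 2.1 carries over unchanged to capability vectors. The agent $\phi $ alone then fails the test for v because it offers no $Cap_2$. The second agent, $\phi '$, is hypothetical and introduced only for illustration:

$$Cap^{\phi } = [0, 0, 2, 0, 3], \qquad Cap^{v} = [0, 1, 2, 0, 0], \qquad Cap^{\phi }_2 = 0 < 1 = Cap^{v}_2$$

$$Cap^{\phi '} = [0, 1, 0, 0, 0] \;\Rightarrow \; Cap^{\{\phi , \phi '\}} = Cap^{\phi } + Cap^{\phi '} = [0, 1, 2, 0, 3], \qquad \forall j,\; Cap^{\{\phi , \phi '\}}_j \ge Cap^{v}_j$$

Under that assumption, the coalition $\{\phi , \phi '\}$ is a candidate coalition for v, and the surplus of three units of $Cap_5$ is simply unused.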

A plan, $\pi $, is a set of action steps. An action step consists of an action, a start time to begin executing the associated action, and the duration of the action. An executable plan is a plan for which the action steps are executed validly. An action step is executed validly if the associated action’s state constraints are satisfied. Executing the action steps in an executable plan transitions the environment from the initial state, I, to an end state, $s_{end}$, achieved after all action steps have finished. A solution to the problem is a satisficing plan, an executable plan in which $s_{end}$ satisfies the goal state constraints of each task, $v \in V$. A utility function, such as makespan or number of action steps, can be used to compare satisficing plans. An optimal plan is a satisficing plan that optimizes the selected utility function (e.g., minimizes makespan). A coalition is an executable coalition if a satisficing plan has been derived for the coalition to complete its task. A nonexecutable coalition is a coalition for which a satisficing plan has not been derived for the coalition to complete its task.
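The plan structure and the makespan comparison can be sketched as follows. This is an illustrative data model only: the class name `ActionStep`, the helper names, and the action labels in the usage note are hypothetical, and the sketch assumes that both plans being compared are already known to be satisficing.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ActionStep:
    """One action step: an action label, its start time, and its duration."""
    action: str
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration


def makespan(plan: Iterable[ActionStep]) -> float:
    """Time at which the last action step finishes (0.0 for an empty plan)."""
    return max((step.end for step in plan), default=0.0)


def shortest_plan(plans: List[List[ActionStep]]) -> List[ActionStep]:
    """Pick the plan with the smallest makespan among satisficing plans."""
    return min(plans, key=makespan)
```

For example, a sequential plan `[ActionStep("navigate", 0, 4), ActionStep("sample", 4, 1)]` has a makespan of 5, while a plan that runs the same two steps concurrently from time 0 has a makespan of 4 and would be preferred when makespan is the comparison criterion.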

4 Example domains

The goal is to solve complex real-world domain problems with multiple heterogeneous agents, durative actions, and complex state spaces. Existing planning problems were modified as a first step towards achieving this goal and evaluating the presented tools. Most existing planning domains lack at least one of the aspects that are representative of the desired domains and needed to properly evaluate the presented planning tools. A modified Blocks World domain will be used to illustrate the formal problem definition. Two additional planning domains, Rovers and a modified Zenotravel, are presented and used to experimentally validate the tools. Each domain, and the modifications to each, are presented and implemented in the Planning Domain Definition Language (PDDL) [19].

4.1 Blocks World

The modified Blocks World domain requires that heterogeneous robotic arms manipulate stacks of heterogeneous blocks on a table of finite size. Each arm has a subset of end effectors available to it, while each block requires a specific end effector to be manipulated. A block can be manipulated by an arm if and only if the arm has the block’s required end effector. While blocks have the same dimensions, blocks can be either single- or double-weight. Single-weight blocks can be manipulated by a single arm with the required end effector, while double-weight blocks require two arms, each with the required end effector, in order to be manipulated. The block stacks rest on a table with only enough space for a finite number of block stacks. The goal state is a rearrangement of the blocks from the initial state into a specified set of block stacks. The modified domain has been made freely available (Footnote 1).

The state space, S, includes both boolean and continuous variables. The boolean variables describe the block stacks, each block’s required end effector type, which block each arm is holding, and each arm’s available end effectors. The continuous variables describe the height of each arm and block, the number of blocks on the table, and the table capacity. The domain of the continuous variables is non-negative integers, which is not continuous; however, modeling the variables as continuous simplifies the state model by not requiring all possible values to be enumerated and ordered. The initial state, I, is an assignment of a value to each variable in the state space. As a partial example, the middle stack in the example initial state in Fig. 1a is expressed by assigning the value true to the following variables: $(onTable \ C), (onBlock \ D \ C), (onBlock \ E \ D), (requires \ C \ encompass), (requires \ D \ magnetic)$, and $(requires \ E \ friction)$.

The grand coalition, ${\varPhi }$, is the set of arms executing actions. The actions are the up and down arm movement and block manipulation. The duration of each action is a linear function of the number of arms executing the action, i.e., a single arm picking up a single-weight block is a shorter action than two arms picking up a double-weight block due to fewer arms executing the action. An example PDDL implementation of arm a picking up a single-weight block $b_1$ off of block $b_2$ is presented in Fig. 2. The action has a duration of 1. Executing the action requires that a be empty, that $b_1$ be clear, and that $b_1$ be on $b_2$ at the start of action execution. Over the entire action execution, a must be at the same height as $b_1$, $b_1$ must require the specified end effector, and a must have the specified end effector. The action has three start of action effects: a is no longer empty, $b_1$ is no longer clear, and $b_1$ is no longer on $b_2$. The two end of action effects are that $b_2$ is clear and that a is holding $b_1$. The combination of effects at the beginning and end of action execution ensures logical consistency throughout action execution. For example, the combination of effects ensures that a third block cannot be placed on $b_2$ while $b_1$ is being removed from on top of $b_2$.

Each stack of blocks in the goal state corresponds to a task. The example goal state in Fig. 1b is divided into three tasks: $v_C, v_E$, and $v_F$. $v_C$ is the stack with C on the bottom and the goal state constraints for $v_C$ are satisfied when C is on the table and B is on C, i.e., when $(onBlock \ B \ C)$ and $(onTable \ C)$ are both true.

The capability vector for the Blocks World domain corresponds to the end effector types: [suction, friction, magnetic, encompass]. The capabilities offered vector for each arm is a function of the end effectors available to the arm. For example, an arm with a friction end effector and an encompass end effector has the capabilities offered vector [0, 1, 0, 1]. Double-weight blocks require twice the capabilities of single-weight blocks, because manipulating double-weight blocks requires two robotic arms. The capabilities required for each stack are a function of two sets of blocks: the blocks in the goal stack and the blocks that must be manipulated to access the blocks in the goal stack. For example, the capabilities required vector for $v_E$ is a function of E and G, because they are the blocks in the goal stack and there are no other blocks above E and G in the initial state. E requires two friction capabilities and G requires two suction capabilities; therefore, the capabilities required vector for $v_E$ is [2, 2, 0, 0]. The capabilities required vector for $v_C$ is a function of B and C, as they are in the goal stack, and of D and E, because they are above C in the initial state. The capabilities required vector will be constructed iteratively as an example. E requires two friction capabilities, thus, [0, 2, 0, 0]. D adds a requirement for a single magnetic capability, [0, 2, 1, 0]. C adds a requirement for a single encompass capability, [0, 2, 1, 1]. B requires a single encompass capability, but an encompass capability is already part of the capabilities required vector; therefore, the capabilities required vector is not modified. The final capabilities required vector for $v_C$ is [0, 2, 1, 1].
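The iterative construction for $v_C$ can be sketched in Python. The sketch assumes that the capabilities required vector is the element-wise maximum over the relevant blocks, which matches the worked example (B does not change the vector) but is an interpretation rather than a stated rule. The function names and the (end effector, arms needed) encoding of a block are hypothetical.

```python
# Capability order taken from the text: [suction, friction, magnetic, encompass].
CAPABILITIES = ["suction", "friction", "magnetic", "encompass"]


def offered_vector(end_effectors):
    """Capabilities offered by one arm: 1 for each end effector it has."""
    return [1 if c in end_effectors else 0 for c in CAPABILITIES]


def required_vector(blocks):
    """Capabilities required for a task.

    `blocks` is an iterable of (end_effector, arms_needed) pairs, where
    arms_needed is 1 for a single-weight and 2 for a double-weight block.
    Assumption: requirements combine by element-wise maximum.
    """
    required = [0] * len(CAPABILITIES)
    for end_effector, arms_needed in blocks:
        i = CAPABILITIES.index(end_effector)
        required[i] = max(required[i], arms_needed)
    return required


# Blocks relevant to v_C, in the order used in the text: E, D, C, B.
v_c_blocks = [
    ("friction", 2),   # E, double-weight
    ("magnetic", 1),   # D
    ("encompass", 1),  # C
    ("encompass", 1),  # B, already covered by C's requirement
]

print(offered_vector({"friction", "encompass"}))  # [0, 1, 0, 1]
print(required_vector(v_c_blocks))                # [0, 2, 1, 1]
```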

4.2 Rovers

The Rovers domain has been used for several iterations of the International Planning Competition (IPC) [28]. The domain models rovers navigating between waypoints, collecting different classes of scientific data at a subset of waypoints, and communicating the data back to the central lander. The five classes of scientific data are soil analysis, rock analysis, high-resolution imagery, low-resolution imagery, and color imagery. Each rover can independently navigate a subsection of the environment and collect a subset of the classes of scientific data, but only one rover at a time can communicate data to the central lander. Rock analysis is required at a subset of waypoints and soil analysis is required at a subset of waypoints. A rover must be at a waypoint to perform rock or soil analysis on that waypoint and must be equipped for the analysis. Up to three types of imagery data can be collected at each waypoint. A rover must have the correct camera type and the target waypoint must be visible in order for the rover to collect imagery data for the target waypoint. The PDDL implementation of the domain is identical to the simple time version of the domain used in the 2002 International Planning Competition,^{Footnote 2} with the exception of modified action durations.

The state space contains only boolean variables and describes waypoint connectivity, waypoint visibility, rover scientific tools, data collection types and location, central lander location, and communication channel capacity. Each action has a fixed duration. The domain’s capability model corresponds to the classes of scientific data being collected. Each rover’s capabilities offered vector is a function of the tools available to the rover. The goal is subdivided into a task for each class of scientific data, e.g., all the state constraints concerning rock analysis are grouped into a single task. The capabilities required vector for each task corresponds to the types of scientific data collected for the task.

4.3 Zenotravel

The Zenotravel domain was originally created for testing the Zeno planner [31] and was modified to include hub and spoke airports, passengers and cargo, and short-range and long-range planes. Spoke airports are airports in smaller cities, with each spoke connected to a single hub airport. Hub airports are located in larger cities and are connected to a set of spoke airports. Short-range planes fly only between a hub and its connected spokes. The sets of spoke airports of any two hubs are disjoint. All hubs are connected and only long-range planes can fly between them. Each plane has limited passenger and cargo capacity. The goal is satisfied when all passengers and cargo are at their destinations. The modified domain has been made freely available.^{Footnote 3}

The state space includes both boolean and continuous variables. The boolean variables describe the location of each passenger, cargo unit, and plane. The continuous variables include the number of passengers and amount of cargo on each plane, each plane’s passenger and cargo capacity, each plane’s fuel level and capacity, and the distance between connected cities. The number of passengers, the amount of cargo, and their respective capacities for each plane are not continuous quantities; however, similar to Blocks World, modeling the values as continuous variables in PDDL simplifies both the models and the experiments by not requiring all possible values to be enumerated. The actions to load and unload passengers and cargo from a plane have fixed duration. The fuel use and the duration of a flight between two cities are linear functions of the distance traveled. The time required to refuel a plane is a linear function of the fuel level at the start of action execution and the fuel capacity. The capability model includes passenger and cargo capacity and the hub cities. For example, a short-range plane based out of the hub airport of ATL in Atlanta, Georgia has a capabilities offered vector corresponding to its passenger and cargo capacity and its ability to travel between ATL and ATL’s spoke airports. A long-range plane has a capabilities offered vector corresponding to its passenger and cargo capacity and its ability to travel between any two hub airports, such as ATL and LAX in Los Angeles, California. The goal state is divided into tasks based on the origin and destination airports of the passengers and cargo. All passengers and cargo originating in the same city and traveling to the same destination city are grouped into a single task. The capabilities required vector of each task is a function of the number of passengers and amount of cargo included in the task, the origin, and the destination.
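The division of the goal state into tasks can be sketched as a grouping by origin and destination. The function name `group_into_tasks`, the tuple encoding of a transport request, and the example requests are hypothetical; each cargo request is assumed to be one cargo unit.

```python
from collections import defaultdict


def group_into_tasks(requests):
    """Group transport requests into tasks keyed by (origin, destination).

    `requests` is an iterable of (kind, origin, destination) tuples with
    kind in {"passenger", "cargo"}. Each task records how many passengers
    and cargo units it contains, the quantities that feed the task's
    capabilities required vector.
    """
    tasks = defaultdict(lambda: {"passenger": 0, "cargo": 0})
    for kind, origin, destination in requests:
        tasks[(origin, destination)][kind] += 1
    return dict(tasks)


# Hypothetical requests between the two hubs named in the text.
requests = [
    ("passenger", "ATL", "LAX"),
    ("passenger", "ATL", "LAX"),
    ("cargo", "ATL", "LAX"),
    ("cargo", "LAX", "ATL"),
]

tasks = group_into_tasks(requests)
for (origin, destination), load in sorted(tasks.items()):
    print(origin, destination, load)
```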

5 Experimental design

This section describes the experimental design for each tool when solving the hybrid mission planning and coalition formation problem.

5.1 Random problem generation

Grand coalitions and missions were generated for each domain. A grand coalition consists of a set of agents and their associated capabilities. A mission consists of an initial state and a goal state description. Each grand coalition in each domain was paired with each mission in the same domain to create a problem to be solved. Ten grand coalitions and ten missions were generated for each domain, for a total of 100 generated problems for each domain. The specific experimental details for each domain are presented.
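The pairing of grand coalitions with missions is a Cartesian product, which a short Python sketch makes concrete. The string identifiers are placeholders and do not reflect how the generated problems were actually named or stored.

```python
from itertools import product

# Placeholder identifiers for the ten grand coalitions and ten missions
# generated for a single domain.
grand_coalitions = [f"coalition_{i}" for i in range(10)]
missions = [f"mission_{j}" for j in range(10)]

# Every grand coalition is paired with every mission in the same domain.
problems = list(product(grand_coalitions, missions))

print(len(problems))  # 100
```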

5.1.1 Blocks World

The grand coalitions in the Blocks World domain were a randomly generated set of robotic arms. Four types of end effectors were used: friction, suction, magnetic, and encompass. Each grand coalition was generated with between four and eight arms, with each arm averaging two end effectors. The grand coalitions required at least two arms with each end effector to guarantee the ability to execute each mission. The generated grand coalitions were manually validated as possessing the required end effectors. If a grand coalition was deficient, then the least capable arm in the grand coalition was augmented with the missing end effector(s). The resulting grand coalitions ranged from 4 to 8 arms, with an average of 6.5 arms, and each arm averaged 2.6 end effectors. The mission initial states included between three and five block stacks, with each stack having three blocks, for a total of nine to fifteen blocks. The missions averaged 4.1 stacks of blocks in the initial state. Each mission’s goal state description required a random rearrangement of the blocks from the initial block stacks into an equal number of block stacks. The problems generated from the same mission differ in the number of arms and the number and types of end effectors on the arms. The problems generated from the same grand coalition differ in the number of blocks and the goal state description.
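The validation and augmentation step was performed manually, but its logic can be sketched in Python under some assumptions: "least capable" is taken to mean the arm with the fewest end effectors, ties are broken by arm name, and the augmentation repeats until every end effector is held by at least two arms. The function names and the example coalition are hypothetical.

```python
END_EFFECTORS = ["friction", "suction", "magnetic", "encompass"]
MIN_ARMS_PER_EFFECTOR = 2  # each end effector must be held by two arms


def missing_effectors(coalition):
    """End effectors held by fewer than the required number of arms.

    `coalition` maps an arm name to the set of its end effectors.
    """
    return [
        e for e in END_EFFECTORS
        if sum(e in effs for effs in coalition.values()) < MIN_ARMS_PER_EFFECTOR
    ]


def augment(coalition):
    """Return a repaired copy in which no end effector is deficient."""
    repaired = {arm: set(effs) for arm, effs in coalition.items()}
    while True:
        deficits = missing_effectors(repaired)
        if not deficits:
            return repaired
        effector = deficits[0]
        # Least capable arm that lacks the effector (assumed tie-break: name).
        candidates = [a for a, effs in repaired.items() if effector not in effs]
        arm = min(candidates, key=lambda a: (len(repaired[a]), a))
        repaired[arm].add(effector)


# Hypothetical deficient coalition: no arm has an encompass end effector.
coalition = {
    "arm1": {"friction"},
    "arm2": {"friction", "suction"},
    "arm3": {"magnetic"},
    "arm4": {"magnetic", "suction"},
}

repaired = augment(coalition)
print(missing_effectors(coalition), missing_effectors(repaired))
```

In this example the encompass end effector is added to the two arms that started with a single end effector, and the input coalition is left unmodified.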

5.1.2 Rovers

The grand coalitions included ten randomly generated rovers. Each rover was allocated tools allowing it to collect an average of two of the five classes of scientific data defined in Sect. 4.2. The mission initial states included the connections between the waypoints, the waypoints each rover was able to traverse, each rover’s starting location, scientific data source locations, and the central lander’s location. The mission goal state description required all the scientific data to be communicated to the central lander. The missions averaged 103 waypoints, with an average of 4.7 waypoints traversable from each waypoint. Each mission required collecting an average of 116.1 pieces of scientific data. Each grand coalition averaged 4.2 rovers capable of collecting a given class of scientific data, with a minimum of two rovers in each grand coalition capable of collecting each class of scientific data.

5.1.3 Zenotravel

The grand coalitions in the Zenotravel domain were a randomly generated set of long-range planes and short-range planes. Each hub city had between one and three short-range planes, and five to ten long-range planes were randomly distributed across the hubs in each mission’s initial state. The generated missions used the same set of hubs and spokes, based on real airports in the US and the distances between them. Seven hub airports and forty-two spoke airports were selected, with each hub having between five and seven associated spokes. The missions consisted of an average of 60.1 passengers and 59.7 units of cargo spread over 17 tasks. The short-range planes had a capacity of four passengers and four cargo units and the long-range planes had a capacity of eight passengers and eight cargo units. The grand coalitions averaged 8.1 long-range planes and 14.5 short-range planes.

5.2 Metrics

Table 1 Dependent variables
