1 Introduction

There is an emerging need to enhance the capabilities of mobile robots deployed in factories, warehouses, and other logistics operations. In these settings, robots are often required to pick up and deliver objects from multiple locations within certain time windows. This gives rise to challenging task-and-motion planning problems, as it intertwines logical and temporal constraints on the operations with geometric and differential constraints related to obstacle avoidance and robot dynamics. Robot dynamics, which are often high-dimensional and nonlinear, constrain the feasible motions, making it difficult to generate trajectories that enable the robot to complete its tasks. Moreover, due to its limited capacity, a robot may have to deliver some objects before picking up others, causing it to travel longer, which could violate temporal constraints. When the problem is oversubscribed, the robot also has to decide which tasks to abandon while seeking to complete as many of the remaining tasks as possible.

Fig. 1. Example of a multi-goal motion-planning problem with time windows and capacity constraints. The vehicle is required to pick up items, without exceeding its capacity, and deliver them to the corresponding locations within the specified time windows. Videos of solutions obtained by our approach can be found anonymously at https://bit.ly/2Ef7HMQ.

To address these challenges, we build upon the burgeoning body of research on task-and-motion planning. In a discrete setting, which ignores the obstacles and robot dynamics, there has been significant progress in solving challenging vehicle routing problems that often include time windows, pickups, deliveries, capacities, or premium service requests [1, 2, 4, 7, 10, 14]. In a continuous setting, sampling-based motion planning has shown great success in dealing with constraints imposed by collision avoidance and robot dynamics [3, 5, 11, 12, 15]. Sampling-based motion planning has even been coupled with TSP solvers to effectively solve multi-goal motion-planning problems [16]. Temporal reasoning has also been integrated to allow for time windows [6].

We advance research on task-and-motion planning by developing an effective approach that solves significantly more complex problems. While related work can solve multi-goal motion planning with time windows [6], this paper develops the first effective approach that also incorporates pickups, deliveries, and capacities. Another contribution is that our approach deals with oversubscription by generating collision-free and dynamically-feasible trajectories that enable the robot to pick up and deliver as many objects as possible within the specified time windows. This is made possible by first developing an effective discrete solver based on Monte-Carlo search to quickly generate low-cost tours. Central to our approach is an effective coupling of vehicle-routing solvers in a discrete setting with sampling-based motion planning in a continuous setting that accounts for obstacles and robot dynamics. On the one hand, vehicle routing provides low-cost tours that effectively guide sampling-based motion planning as it explores the vast space of feasible motions. On the other hand, motion planning provides feasibility estimates that vehicle routing uses to refine its plans. To deal with oversubscription, our approach associates costs with travel and benefits with fulfilling pickups and deliveries, seeking to minimize temporal violations and maximize the net benefit. Experiments with nonlinear vehicle models show the efficiency and scalability of the approach as the complexity of the operations increases.

2 Problem Formulation

Robot dynamics are expressed as differential equations

$$\begin{aligned} \dot{s} \leftarrow f(s, a), s \in \mathcal {S}, a \in \mathcal {A}, \end{aligned}$$
(1)

where \(\mathcal {S}\) and \(\mathcal {A}\) denote the state space and action space, respectively. As an example, the dynamics of the car model (Fig. 2) used in the experiments are defined as

$$\begin{aligned} \dot{x} = v \cos (\theta ) \cos (\psi ), \dot{y} = v \sin (\theta ) \cos (\psi ), \end{aligned}$$
(2)
$$\begin{aligned} \dot{\theta } = v \sin (\psi ), \dot{v} = a_{{\mathrm {acc}}}, \dot{\psi } = a_\omega , \end{aligned}$$
(3)

where the state \((x, y, \theta , \psi , v) \in \mathcal {S}\) defines the position, orientation, steering angle, and velocity; the action \((a_{{\mathrm {acc}}}, a_\omega ) \in \mathcal {A}\) defines the acceleration and steering rate. As another example, a snake model (Fig. 2) can be obtained by attaching several trailers to the car and augmenting f as

$$\begin{aligned} \dot{\theta _i} = ({v}/{H})(\sin (\theta _{i-1})-\sin (\theta _0)) \mathop {\prod }\nolimits _{j=1}^{i-1} \cos (\theta _{j-1}-\theta _j), \end{aligned}$$
(4)

where \(\theta _0 = \theta \), \(\theta _i\) is the orientation of the i-th trailer for \(i \in \{1, \ldots , N\}\), N is the number of trailers, and H is the hitch distance [11].
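For concreteness, a minimal sketch of the car dynamics f from Eqs. (2)–(3) follows; the tuple layouts and Python names are illustrative, not taken from the implementation.

```python
import math

def car_dynamics(s, a):
    """Car model of Eqs. (2)-(3).  State s = (x, y, theta, psi, v):
    position, orientation, steering angle, velocity.  Action
    a = (a_acc, a_omega): acceleration and steering rate."""
    x, y, theta, psi, v = s
    a_acc, a_omega = a
    return (
        v * math.cos(theta) * math.cos(psi),  # x'
        v * math.sin(theta) * math.cos(psi),  # y'
        v * math.sin(psi),                    # theta'
        a_omega,                              # psi'
        a_acc,                                # v'
    )
```

The snake model would augment the returned tuple with the trailer terms of Eq. (4).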

Fig. 2. Vehicle models (car and snake) used in the experiments.

When applying an action \(a \in \mathcal {A}\) to a state \(s \in \mathcal {S}\), the robot changes state according to its dynamics. The new state \(s_{{\mathrm {new}}} \in \mathcal {S}\) is computed by a function

$$\begin{aligned} s_{{\mathrm {new}}} \leftarrow \small {\textsc {simulate}}(s, a, f, dt), \end{aligned}$$
(5)

which numerically integrates \(f\) for one time step dt. Applying a sequence of actions \({\langle {a_1, \ldots , a_{i-1}} \rangle }\) gives rise to a dynamically-feasible motion trajectory \(\zeta : \{1, \ldots , i\} \rightarrow \mathcal {S}\), where \(\zeta (1) \leftarrow s\) and \(\forall j \in \{2, \ldots , i\}\):

$$\begin{aligned} \zeta (j) \leftarrow \small {\textsc {simulate}}(\zeta (j-1), a_{j-1}, f, dt). \end{aligned}$$
(6)
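For concreteness, a minimal sketch of SIMULATE (Eq. 5) and the trajectory rollout (Eq. 6); the paper does not specify the numerical integrator, so plain Euler integration with substeps is assumed here, and the 1-indexed \(\zeta \) becomes a 0-indexed Python list.

```python
def simulate(s, a, f, dt, substeps=10):
    """Eq. (5): numerically integrate the dynamics f for one time
    step dt under constant action a (Euler integration assumed)."""
    h = dt / substeps
    for _ in range(substeps):
        ds = f(s, a)
        s = tuple(si + h * dsi for si, dsi in zip(s, ds))
    return s

def trajectory(s, actions, f, dt):
    """Eq. (6): roll out the dynamically-feasible trajectory zeta
    obtained by applying a sequence of actions from state s."""
    zeta = [s]
    for a in actions:
        zeta.append(simulate(zeta[-1], a, f, dt))
    return zeta
```

With the car_dynamics sketch above, `trajectory((0, 0, 0, 0, 1), [(0.1, 0.0)] * 50, car_dynamics, 0.1)` rolls out 50 steps of gentle acceleration.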

Multi-Goal Motion Planning with Time Windows, Pickups, Deliveries, and Capacities: The world contains obstacles, pickup regions \(\{\mathcal {R}_1, \ldots ,\mathcal {R}_k\}\), and delivery regions \(\{\mathcal {R}_{k+1}, \ldots , \mathcal {R}_{n=2k}\}\). Each \(\mathcal {R}_i \in \mathcal {R}= \{\mathcal {R}_1, \ldots , \mathcal {R}_n\}\) is associated with a load \(\ell _i \in \mathbb {Z}\) and a time interval \([t_i^{{\mathrm {start}}}, t_i^{{\mathrm {end}}}]\) during which the pickup or delivery can be made. For each pickup-delivery pair \({\langle {\mathcal {R}_i, \mathcal {R}_{i+\lfloor \frac{n}{2}\rfloor }} \rangle }\), it holds that \(\ell _i = -\ell _{i + \lfloor {\frac{n}{2}}\rfloor } \ge 0\). The robot has a maximum capacity \(C_{{\mathrm {max}}} \in {\mathbb {N}}\).

Let \(\mathcal {G}= \{\mathcal {G}_1, \ldots , \mathcal {G}_n\}\) denote the goals, where \(\mathcal {G}_i = {\langle {\mathcal {R}_i, [t_i^{{\mathrm {start}}}, t_i^{{\mathrm {end}}}], \ell _i} \rangle }\). Let \(\small {\textsc {goals}}(\zeta )\) denote the goals completed by the robot as it moves according to a trajectory \(\zeta \). A pickup goal \(\mathcal {G}_i\) is completed only when the robot reaches \(\mathcal {G}_i\) within its time window \([t_i^{{\mathrm {start}}}, t_i^{{\mathrm {end}}}]\) and has sufficient remaining capacity to add the load \(\ell _i\). The robot makes a delivery only when the region is reached within its time window and the corresponding load has already been picked up.

Given an initial state \(s_{{\mathrm {init}}} \in \mathcal {S}\), the objective is then to compute a collision-free and dynamically-feasible trajectory \(\zeta \) that enables the robot to complete all the goals, i.e., \(\small {\textsc {goals}}(\zeta ) = \mathcal {G}\). When this is not possible, the objective is to minimize the number of missed goals \(|\mathcal {G}\setminus \small {\textsc {goals}}(\zeta )|\).
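To illustrate the completion semantics, the following sketch computes \(\small {\textsc {goals}}(\zeta )\) for a timed trajectory; the region tests, the goal ordering (pickups \(\mathcal {R}_1, \ldots , \mathcal {R}_k\) followed by their paired deliveries), and all names are our own encoding.

```python
def completed_goals(zeta, dt, goals, C_max):
    """Sketch of GOALS(zeta).  goals[i] = (inside, t_lo, t_hi, load),
    where inside(s) tests membership in region R_i; pickups (load > 0)
    occupy indices 0..n/2-1 and delivery i is paired with pickup
    i - n//2, mirroring the formulation."""
    done, load, n = set(), 0, len(goals)
    for j, s in enumerate(zeta):
        t = j * dt
        for i, (inside, t_lo, t_hi, l) in enumerate(goals):
            if i in done or not inside(s) or not (t_lo <= t <= t_hi):
                continue
            if l > 0 and load + l <= C_max:       # pickup fits on board
                done.add(i); load += l
            elif l < 0 and (i - n // 2) in done:  # delivery after pickup
                done.add(i); load += l
    return done
```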

3 Discrete Problem Solving

As mentioned, our approach relies on vehicle routing over a discrete abstraction to guide sampling-based motion planning. The discrete solver is required to solve a variant of TSP that includes prizes, time windows, and capacity constraints for pickup and delivery tasks. The discrete solver is invoked numerous times by the approach. While the initial call requires the discrete solver to find a solution involving all pickups and deliveries, subsequent calls involve only a subset of the original problem as some goals may have already been reached. This makes the problem more challenging since the discrete solver has to deal with partial problems where some orders have already been completed and some items have already been picked up and are still on the vehicle.

We now define the discrete problem and then describe a specialized solver based on Monte-Carlo search and a general PDDL3 planner. Section 4 describes how the discrete solvers are integrated within the overall framework.

Algorithm 1. Rollout function for the discrete problem.

Definition 1

(Prize-Collecting TSP with Time Windows, Pickups, Deliveries, and Capacity Constraints) Given

  • a graph \(G = (V, E, T)\) with \(V = \{v_{{\mathrm {start}}}, g_1, \ldots , g_m\}\), \(E = V \times V\), and times \(T = \{t_{(v',v'')} : (v', v'') \in E\}\)

  • goals \(\{{\langle {g_1, t_1^{{\mathrm {start}}}, t_1^{{\mathrm {end}}}, \ell _1, p_1} \rangle }, \ldots , {\langle {g_m, t_m^{{\mathrm {start}}}, t_m^{{\mathrm {end}}}, \ell _m, p_m} \rangle }\}\)

    • \([t_i^{{\mathrm {start}}}, t_i^{{\mathrm {end}}}]\) is the time interval to visit \(g_i\)

    • \(\ell _i \in \mathbb {Z}\) represents the load associated with \(g_i\)

    • \(p_i \in {\mathbb {R}}\) is the profit gained by visiting \(g_i\)

  • orders \(O = \{{\langle {p_1,d_1} \rangle }, \ldots ,{\langle {p_k,d_k} \rangle }\}\) of pickup \(P = \{p_1,\ldots , p_k\}\) and delivery \(D = \{d_1,\ldots , d_k\}\) tasks

    • \(P \cup D = \{g_1,\ldots ,g_m\}\), with P and D mutually disjoint

    • if \(p_i\) is empty (denoted by \(p_\bot \)), then \({\langle {p_i, d_i} \rangle }\) is a delivery-only order

    • otherwise, the load for \(p_i\) must be positive and equal to the negation of the load for \(d_i\)

  • maximum capacity \(C_{{\mathrm {max}}}\) that exceeds the preload, i.e., \(C_{{\mathrm {max}}} \ge C_{{\mathrm {pre}}} = -\sum _{p_i \in P, p_i = p_\bot } {{\mathrm {load}}}(d_i)\)

compute a path \(\sigma = {\langle {\sigma _1,\ldots ,\sigma _{|\sigma |}} \rangle }\) over \(G = (V, E, T)\) and start times \({\langle {t_1, \ldots , t_{|\sigma |}} \rangle }\) such that (i) \(\sigma \) starts at \(v_{{\mathrm {start}}}\); (ii) each goal \(\sigma _i\) is reached within its time window; (iii) pickups are visited before their corresponding deliveries; (iv) the capacity is never exceeded; and (v) the accrued net benefit \(\small {\textsc {payoff}}(\sigma ) - a \cdot \small {\textsc {travelcost}}(\sigma ) + b \cdot \small {\textsc {goalsvisited}}(\sigma )\) is maximized, where a and b are user-defined constants.

This objective function allows for oversubscription, making it possible to generate partial tours. We have implemented a solver based on Monte-Carlo search and another on PDDL3.
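To make Definition 1 concrete, the following sketch checks conditions (i)–(iv) for a candidate tour and accrues the net benefit of condition (v); the data structures are ours, and travel time stands in for the travel cost.

```python
def tour_score(sigma, t_edge, windows, loads, prizes, pickup_of,
               C_pre, C_max, a, b):
    """Feasibility and net benefit of a tour sigma = [v_start, g, ...].
    t_edge[(u, v)]: travel time; windows[g] = (t_lo, t_hi);
    pickup_of[g]: pickup paired with delivery g (None if delivery-only).
    Returns None if the tour is infeasible."""
    t, load, cost, payoff, visited = 0.0, C_pre, 0.0, 0.0, set()
    for u, g in zip(sigma, sigma[1:]):
        t += t_edge[(u, g)]
        cost += t_edge[(u, g)]
        t_lo, t_hi = windows[g]
        t = max(t, t_lo)                        # wait on early arrival
        if t > t_hi:
            return None                         # (ii) window missed
        p = pickup_of.get(g)
        if loads[g] < 0 and p is not None and p not in visited:
            return None                         # (iii) delivery before pickup
        load += loads[g]
        visited.add(g)
        if not (0 <= load <= C_max):
            return None                         # (iv) capacity violated
        payoff += prizes[g]
    return payoff - a * cost + b * len(visited)
```

Oversubscription then amounts to scoring tours over subsets of the goals and keeping the best.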

Monte-Carlo Solver. Monte-Carlo search [19] is a randomized reinforcement-learning strategy that utilizes random rollouts. Nested rollout policy adaptation (NRPA) [17] was introduced as a way to use gradient ascent on a rollout policy rather than navigating the search tree directly. The algorithm starts at nesting level L and recursively calls levels \(L-1\), \(L-2\), and so on; when the lowest recursion level is reached, a rollout is performed, which may find a better solution.

Pseudocode for the rollout function developed for our discrete problem is shown in Algorithm 1. The rollout incrementally constructs the tour, penalizing capacity-constraint violations and rewarding deliveries. To handle the fact that the vehicle may have to wait when it arrives early at a goal, we use two different variables, one for the makespan and one for the distance.

An important principle of NRPA is to bias rollouts through learning by weighting the possible actions. The action weights are updated in each step of the algorithm to prioritize the moves of the best routing sequence found so far. In addition, each simulation state is encoded distinctly so that the policy learned for one state does not affect the action selection at subsequent steps. The choices within a rollout are random, but the probabilities differ from step to step according to the learned weights. Policy adaptation increases the weight of the actions in the best sequence and decreases the weights of the alternative actions.
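A compact sketch of NRPA adapted to this setting is shown below; the interfaces legal_moves and score are our own, the policy is keyed by (step, move) to keep simulation states distinct, and in the actual solver the rollout is replaced by Algorithm 1 with its penalties and rewards.

```python
import copy, math, random

def nrpa(level, policy, legal_moves, score, n_iter=100, alpha=1.0):
    """Nested rollout policy adaptation [17] (sketch)."""
    if level == 0:
        return rollout(policy, legal_moves, score)
    best, best_seq = -math.inf, []
    for _ in range(n_iter):
        s, seq = nrpa(level - 1, copy.deepcopy(policy), legal_moves,
                      score, n_iter, alpha)
        if s > best:
            best, best_seq = s, seq
        policy = adapt(policy, best_seq, legal_moves, alpha)
    return best, best_seq

def rollout(policy, legal_moves, score):
    """Sample one tour, choosing moves with softmax probabilities."""
    seq = []
    while True:
        moves = legal_moves(seq)
        if not moves:
            return score(seq), seq
        w = [math.exp(policy.get((len(seq), m), 0.0)) for m in moves]
        seq.append(random.choices(moves, weights=w)[0])

def adapt(policy, seq, legal_moves, alpha):
    """Gradient step: boost the moves of the best sequence found and
    proportionally demote the alternatives at each step."""
    for i, m in enumerate(seq):
        moves = legal_moves(seq[:i])
        z = sum(math.exp(policy.get((i, m2), 0.0)) for m2 in moves)
        for m2 in moves:
            policy[(i, m2)] = (policy.get((i, m2), 0.0)
                               - alpha * math.exp(policy.get((i, m2), 0.0)) / z)
        policy[(i, m)] = policy.get((i, m), 0.0) + alpha
    return policy
```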

Fig. 3. Two actions in the PDDL3 domain.

PDDL3 Planner. Preferences are soft constraints on logical expressions. In PDDL3 [8], the quality of a plan is computed based on how many preferences it satisfies. We transform the routing problem of Definition 1 into a PDDL3 planning domain and task description, and call a domain-independent planner to generate a feasible plan. We use OPTIC [2], which handles preferences and uses CPLEX as its backend MILP solver.

The two main PDDL3 actions are shown in Fig. 3. For pickup and delivery tasks, we split the durative action execute task into one action for pickups, one for deliveries, and one that executes a delivery-only task, based on the preload value.

4 Overall Approach

Our approach enhances the framework for integrating temporal reasoning into motion planning [6]. The main components are as follows: (i) construct a roadmap to provide a suitable discrete abstraction, (ii) expand a tree in the vast state space of feasible motions and partition the motion tree into equivalence classes, and (iii) use discrete solvers to guide the expansion of the equivalence classes.

We make significant technical contributions to enhance these components, namely: (a) develop an efficient discrete solver based on Monte-Carlo search, (b) develop effective notions of equivalence classes that account for pickups, deliveries, and capacities, and (c) enable guidance of the motion-tree expansion by partial tours to handle oversubscription.

Pseudocode is shown in Algorithm 2. The rest of the section describes the components and their interplay in more detail.

Fig. 4. Scenes used in the experiments (scene 1 shown in Fig. 1). Figures (b) and (c) also show roadmaps constructed by our approach to facilitate navigation.

Discrete Abstraction. As in [6], the discrete abstraction is obtained by constructing a roadmap \(RM\) to capture the connectivity of the environment. As in PRM [9], the roadmap is obtained by sampling collision-free configurations and connecting neighboring configurations. This process continues until all the regions in \(\mathcal {R}\) are connected. Figure 4 shows some examples.

After constructing the roadmap, Dijkstra's shortest-path algorithm is invoked from each region \(\mathcal {R}_i\) to compute minimum-cost paths to every roadmap vertex. As an enhancement to [6], in order to generate paths that do not pass too close to the obstacles, we add a clearance component to the cost of each roadmap edge, i.e., \(\small {\textsc {cost}}(q_i, q_j) = ||q_i-q_j|| / \min (\small {\textsc {clear}}(q_i), \small {\textsc {clear}}(q_j))\), where \(\small {\textsc {clear}}(q_i)\) denotes the separation of \(q_i\) from the obstacles. These paths are later used by the discrete solver when computing tours.
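A sketch of the clearance-weighted edge cost (clear(q) is assumed to be supplied by the collision checker):

```python
import math

def edge_cost(q_i, q_j, clear):
    """Roadmap edge cost: Euclidean length divided by the smaller of
    the endpoint clearances, so that Dijkstra favors slightly longer
    paths with good obstacle clearance over paths that hug obstacles."""
    return math.dist(q_i, q_j) / min(clear(q_i), clear(q_j))
```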

Motion Tree and Equivalence Classes. A motion tree \(\mathcal {T}\) is incrementally expanded by adding new vertices and edges. Each vertex v is associated with a collision-free state, denoted by v.s. Each edge \((v_i, v_j)\) is labeled with a control action a and represents a collision-free and dynamically-feasible motion from \(v_i.s\) to \(v_j.s\), i.e., \(v_j.s \leftarrow \small {\textsc {simulate}}(v_i.s, a, f, dt)\).

Let \(\zeta _\mathcal {T}(v)\) denote the trajectory from the root of \(\mathcal {T}\) to the vertex v. If \(\small {\textsc {goals}}(\zeta _\mathcal {T}(v)) = \mathcal {G}\), then \(\zeta _\mathcal {T}(v)\) constitutes a solution since all goals would have been completed. As \(\mathcal {T}\) is expanded, our approach keeps track of the vertex v that minimizes \(|\mathcal {G}\setminus \small {\textsc {goals}}(\zeta _\mathcal {T}(v))|\), using the time duration of \(\zeta _\mathcal {T}(v)\) to break ties. The tree expansion continues until a solution is found, a runtime limit is reached, or a convergence criterion is met (no improvement in best v after numerous iterations). At the end, the approach returns the best trajectory \(\zeta _\mathcal {T}(v)\).

How should \(\mathcal {T}\) be expanded from v? Since \(\zeta _\mathcal {T}(v)\) has already completed \(\small {\textsc {goals}}(\zeta _\mathcal {T}(v))\), the objective is to complete as many of the remaining goals as possible. This is where the discrete solver comes into play, as it can provide tours that visit the remaining goals while accounting for time windows, pickups, deliveries, and capacity constraints. Since the discrete solver operates over the roadmap, we must also map v.s to the roadmap, e.g., by selecting the roadmap configuration nearest to v.s. Let v.t denote the time duration of \(\zeta _\mathcal {T}(v)\), \(v.\mathrm {rem}\) the remaining goals, i.e., \(v.\mathrm {rem} = \mathcal {G}\setminus \small {\textsc {goals}}(\zeta _\mathcal {T}(v))\), and v.q the mapping of v.s to a roadmap configuration. The discrete solver can then be invoked to provide a tour over the roadmap that starts from v.q at time v.t and completes as many of the remaining goals \(v.\mathrm {rem}\) as possible.

Algorithm 2. Overall guided-search approach.

It is infeasible, however, to call the discrete solver for every vertex since \(\mathcal {T}\) can have tens of thousands of vertices. For this reason, \(\mathcal {T}\) is partitioned into equivalence classes, which group together vertices that provide the same discrete information. A new vertex \(v_{{\mathrm {new}}}\) is added to an equivalence class \(\mathcal {X}_v\) only if it has the same remaining goals (hence, the same remaining capacity, pickups, and deliveries), maps to the same roadmap configuration, and the tour associated with v (denoted by \(v.\sigma \)) is compatible with the start time of \(v_{{\mathrm {new}}}\), i.e.,

$$\begin{aligned} \mathcal {X}_v = \{v_{{\mathrm {new}}} \in \mathcal {T}: v.q = v_{{\mathrm {new}}}.q \wedge v.{{\mathrm {rem}}} = v_{{\mathrm {new}}}.{{\mathrm {rem}}} \wedge \\ \small {\textsc {CompatibleTour}}(v.\sigma , v_{{\mathrm {new}}}.t)\} \end{aligned}$$

When a new vertex \(v_{{\mathrm {new}}}\) is added to \(\mathcal {T}\), the approach checks whether it can be added to an existing equivalence class. If not, a new equivalence class \(\mathcal {X}_{v_{{\mathrm {new}}}}\) is created, and the discrete solver is invoked to compute the tour associated with \(\mathcal {X}_{v_{{\mathrm {new}}}}\).
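A sketch of this bookkeeping follows; the vertex fields q, rem, and t mirror the text, while the class structure and helper names are ours.

```python
from dataclasses import dataclass, field

@dataclass
class EquivalenceClass:
    rep: object                  # representative vertex
    sigma: list                  # tour computed by the discrete solver
    vertices: list = field(default_factory=list)
    nr_selections: int = 0       # used by the selection weight, Eq. (7)

def class_key(v):
    """Vertices agreeing on roadmap configuration and remaining goals
    are candidates for the same class; tour/start-time compatibility
    is checked separately."""
    return (v.q, frozenset(v.rem))

def insert_vertex(classes, v_new, compatible_tour, discrete_solver):
    """Join a compatible existing class, or create a new one; only the
    latter case invokes the (expensive) discrete solver."""
    for X in classes.setdefault(class_key(v_new), []):
        if compatible_tour(X.sigma, v_new.t):
            X.vertices.append(v_new)
            return X
    X_new = EquivalenceClass(v_new, discrete_solver(v_new), [v_new])
    classes[class_key(v_new)].append(X_new)
    return X_new
```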

Putting it All Together: Guided Search

As shown in Algorithm 2, the overall approach starts by constructing a roadmap \(RM\), computing shortest paths in \(RM\), and then initializing the motion tree \(\mathcal {T}\) and the equivalence classes \(\mathcal {X}\). The core loop is driven by two procedures: (i) selecting an equivalence class \(\mathcal {X}_v\) and (ii) expanding the motion tree \(\mathcal {T}\) along the temporal plan \(\mathcal {X}_v.\sigma \). The equivalence classes are continually updated each time a vertex is added to \(\mathcal {T}\). These procedures are invoked repeatedly until a solution is found, a runtime limit is reached, or the convergence criterion is met.

Selecting an Equivalence Class. A weight is defined for each equivalence class \(\mathcal {X}_v \in \mathcal {X}\) as

$$\begin{aligned} w(\mathcal {X}_v) = \frac{\alpha ^{\small {\textsc {NrSelections}}(\mathcal {X}_v)}}{\beta ^{\small {\textsc {duration}}(\mathcal {X}_{v}.\sigma )} + \gamma ^{|\mathcal {X}_v.\sigma |} + \delta ^{|\mathcal {X}_v.{{\mathrm {rem}}}| - |\mathcal {X}_v.\sigma |}}. \end{aligned}$$
(7)

and the equivalence class with maximum weight is selected for expansion. The term \(\beta ^{\small {\textsc {duration}}(\mathcal {X}_{v}.\sigma )}\), \(\beta > 1\), increases the weight when the temporal plan \(\mathcal {X}_v.\sigma \) is short. The term \(\gamma ^{|\mathcal {X}_v.\sigma |}\), \(\gamma > 1\), increases the weight when the tour has only a few goals. The term \(\delta ^{|\mathcal {X}_v.{{\mathrm {rem}}}| - |\mathcal {X}_v.\sigma |}\), \(\delta > 1\), is essential for dealing with oversubscription (which was not considered in [6]). Note that \(\mathcal {X}_v.{{\mathrm {rem}}}\) denotes the goals that remain to be reached, so \(\mathcal {X}_v.\sigma \) is a partial or complete tour over these goals; the term essentially penalizes temporal plans that miss goals. We recommend setting \(\delta \gg \gamma \gg \beta \) so that the weight is dominated by the number of goals a temporal plan misses, followed by the number of goals in the tour, and lastly by the plan duration. The term \(\alpha ^{\small {\textsc {NrSelections}}(\mathcal {X}_v)}\), \(0< \alpha < 1\), decreases the weight over repeated selections. This avoids repeatedly selecting \(\mathcal {X}_v\) when expansions from it keep failing due to constraints imposed by the obstacles and the robot dynamics.
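A sketch of the selection step, extending the EquivalenceClass sketch above with rem and duration fields; the constants are illustrative, chosen only to respect \(\delta \gg \gamma \gg \beta \).

```python
def weight(X, alpha=0.95, beta=1.05, gamma=4.0, delta=16.0):
    """Eq. (7).  X.rem: remaining goals; X.sigma: the tour over them;
    X.duration: duration of the temporal plan (assumed precomputed
    when the tour is stored)."""
    missed = len(X.rem) - len(X.sigma)
    return alpha ** X.nr_selections / (
        beta ** X.duration + gamma ** len(X.sigma) + delta ** missed)

def select_class(all_classes):
    """Core-loop step (i): pick the maximum-weight class from the flat
    collection; recording the selection lets alpha discount classes
    whose expansions keep failing."""
    X = max(all_classes, key=weight)
    X.nr_selections += 1
    return X
```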

Expanding the Equivalence Class. After selecting \(\mathcal {X}_v\), the approach seeks to expand \(\mathcal {T}\) to complete the pickup and delivery goals of the temporal plan \(\mathcal {X}_v.\sigma \). To reach the first goal \(\mathcal {G}_1\) in \(\mathcal {X}_v.\sigma \), the approach retrieves the minimum-cost path \({\langle {q_1,\ldots , q_k} \rangle }\) in the roadmap from v.q to \(\mathcal {G}_1\) and attempts to expand \(\mathcal {T}\) along this path. A proportional-integral-derivative (PID) controller [18] is used to expand \(\mathcal {T}\) toward \(q_1, \ldots , q_k\) in succession by turning and then moving the vehicle toward each target. Intermediate collision-free states are added to \(\mathcal {T}\) as new vertices. The expansion stops when a collision occurs or \(\mathcal {G}_1\) is reached. Once the expansion stops, the approach updates the equivalence classes and their weights to account for the new vertices added to \(\mathcal {T}\) and returns to the core loop, where possibly another equivalence class is selected. When the approach has difficulty expanding \(\mathcal {T}\), it penalizes the roadmap edges involved so that in the next iteration the expansion can be attempted along a different path. This allows the approach to discover alternative paths to reach each goal.

5 Experiments and Results

Experiments are conducted using challenging logistics scenarios, where a vehicle with nonlinear dynamics (car or snake) has to navigate in obstacle-rich environments (Figs. 1 and 4) to pick up and deliver items from numerous locations within specified time windows. The efficiency and scalability of the approach are evaluated by increasing the number of goals, tightening the time windows, and reducing the vehicle capacity.

Fig. 5. Runtime results when varying the discrete planner in our approach: (a) MC, (b) Optic, (c) Random.

Discrete Solvers. We use Monte-Carlo as the specialized solver and Optic [2] as the general temporal planner. We also use a Random solver as a baseline for small problem instances. Random iterates in a random order over all the possible permutations with n goals (then with \(n-1, \ldots , 1\)), stopping as soon as it finds a valid tour.
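A sketch of the Random baseline; valid_tour is assumed to check time-window, precedence, and capacity constraints (e.g., along the lines of the tour-scoring sketch in Sect. 3).

```python
import itertools, random

def random_solver(goals, valid_tour):
    """Try permutations of all n goals in random order, then of n-1
    goals, and so on, returning the first valid tour found.  Only
    practical for the small instances on which Random serves as a
    baseline, since there are n! permutations."""
    for k in range(len(goals), 0, -1):
        perms = list(itertools.permutations(goals, k))
        random.shuffle(perms)
        for sigma in perms:
            if valid_tour(sigma):
                return list(sigma)
    return []
```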

Benchmark Instances. We extended the benchmark generator in [6] to incorporate loads and capacity constraints. Specifically, 30 instances were generated for each combination of a scene S and number of goals n, denoted by \(\mathcal {I}_{\langle {S, n} \rangle }\). Each instance is generated by randomly placing the goals and the vehicle until there are no collisions. Each goal \(\mathcal {G}_i\) is assigned a random load (from 1 to 5) and a random assignment is then used to create pickup-delivery pairs. Time bounds are generated by first computing a random tour with the constraint that a pickup must appear before its corresponding delivery. The time window for goal \(\mathcal {G}_i\) is set to \([(1 - \epsilon ) t_i, (1 + \epsilon ) t_i]\), where \(t_i\) is the time to reach \(\mathcal {G}_i\) in the tour when traveling along the shortest paths in the roadmap, using the expected velocity to convert distances to time. The parameter \(\epsilon \) is used to tighten the time window (default is 0.2).
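A sketch of the window generation; travel_time is assumed to return the shortest-path roadmap time between consecutive stops, converting distance to time via the expected velocity.

```python
def time_windows(tour, travel_time, eps=0.2):
    """Walk a random tour (pickups before their paired deliveries),
    accumulate the arrival time t_i at each goal, and set its window
    to [(1 - eps) * t_i, (1 + eps) * t_i].  tour[0] is the start."""
    t, windows, prev = 0.0, {}, tour[0]
    for g in tour[1:]:
        t += travel_time(prev, g)
        windows[g] = ((1 - eps) * t, (1 + eps) * t)
        prev = g
    return windows
```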

The vehicle capacity, by default, is set to \(100\%\) of the sum of the loads. Even in such cases, it may be difficult or even impossible to reach all the goals due to constraints imposed by the time windows, obstacles, and dynamics. Experiments are also conducted with reduced capacity.

For each combination of parameters, the planner is run over all instances. Results report mean runtime, distance traveled, time duration, and goals missed. The mean is computed after dropping runs that are below the first quartile or above the third quartile to avoid the influence of outliers. Runtime measures everything from reading the input until finding a solution or reaching the runtime limit (set to 40 s per run).
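The trimmed mean can be computed as in the following sketch (the exact quartile convention used in the experiments may differ at the boundaries).

```python
import statistics

def iqr_mean(runs):
    """Mean after dropping runs below the first quartile or above the
    third quartile, to limit the influence of outliers."""
    q1, _, q3 = statistics.quantiles(runs, n=4)
    return statistics.mean(x for x in runs if q1 <= x <= q3)
```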

Fig. 6. Quality of the dynamically-feasible and collision-free solution trajectories computed by the approach, as measured by the time duration, distance traveled, and goals missed. The results are normalized with respect to the tour obtained from the initial location over the discrete abstraction. Results are shown for our approach using the Monte-Carlo discrete solver.

Fig. 7. (left) Results when adjusting the time window \([(1 - \epsilon ) t_i, (1 + \epsilon ) t_i]\). (right) Results when adjusting the vehicle capacity as a percentage of the sum of the loads.

Fig. 8. Runtime distribution as a percentage of the total runtime (bottom to top): (1) roadmap construction and shortest paths, (2) collision checking and SIMULATE, (3) discrete solver, and (4) other.

Results: Increasing the Number of Goals. Figure 5 shows the efficiency and scalability of the approach as the number of goals is increased from 6 to 30. Even when coupled with Random, the approach is able to quickly solve problems with up to 10 goals. The approach performs similarly when using Optic, which provides generality but is not as fast as specialized solvers. Indeed, the approach achieves its best results when coupled with the Monte-Carlo discrete solver, scaling up to 30 goals in a matter of seconds.

Figure 6 shows results on the quality of the solution trajectories in terms of the time duration, distance traveled, and goals missed. The results are normalized with respect to the tour obtained from the initial location (tour associated with \(\mathcal {X}_{v_{{\mathrm {init}}}}\)). The results show that our approach produces high-quality solutions. Note that in some cases our approach is able to find even shorter solutions than those associated with the initial tour. This could be due to the vehicle traveling faster than the expected velocity used in the tour calculations or taking smoother turns when following tour edges.

Results: Adjusting the Time Windows. Figure 7 (left) shows the results when varying \(\epsilon \in \{0.05, 0.1, 0.2, 0.3, 0.4\}\) to adjust the time window \([(1 - \epsilon ) t_i, (1 + \epsilon ) t_i]\) for each goal \(\mathcal {G}_i\). Note that the approach is quite efficient even for small values. As expected, when \(\epsilon \) is too small, e.g., \(\epsilon = 0.05\), it becomes quite difficult to find solutions, causing the approach to miss some goals (which is also why the reported solutions are shorter).

Results: Adjusting the Vehicle Capacity. Figure 7 (right) summarizes the results when varying the vehicle capacity from 100% of the sum of the loads down to 40%. The results show that the approach is quite efficient even when the vehicle capacity is considerably reduced. When the capacity becomes too small, the problem becomes significantly more difficult, which in some cases causes our approach to miss some goals.

Results: Runtime Distribution. Figure 8 summarizes the runtime distribution. As the problems increase in complexity so does the time taken by the discrete solver. Recall that the discrete solver is invoked numerous times during each run to compute tours for each equivalence class that is created during the motion-tree expansion. The discrete solver enables our approach to focus the expansion along promising directions, thus saving valuable computational time.

6 Discussion

We have developed an effective and scalable approach for solving challenging multi-goal motion-planning problems with time windows, pickups, deliveries, and capacity constraints. A crucial aspect of our approach is the use of a discrete solver over a suitable abstraction to effectively guide sampling-based motion planning. We expect our framework to benefit further from advances in discrete planning. A natural but involved research direction is minimizing energy consumption, e.g., by adding recharging stations. Another direction is to enable more expressive PDDL tasks. We also plan to extend this work toward cooperative motion planning for several robots [13].