1 Introduction

Motion planning is one of the fundamental problems in robotics, and consists of guiding the robot from an initial state to a final one along a collision-free path.

For many robots, focusing only on kinematics could result in collision-free paths that are impossible to be executed by the actual system. In particular, for systems that are differentially constrained, such a decoupled approach makes it difficult to turn a collision-free path into a feasible trajectory. In order to tackle this problem, Donald et al. (1993) proposed the idea of kinodynamic planning, which combines the search for a collision-free path with the underlying dynamics of the system, so that the resulting trajectory would be feasible.

For most robotic applications, the solution to the planning problem should not only be feasible and collision-free, but also satisfy some properties such as, e.g., reaching the goal in minimum time, minimizing the energy consumption, and maximizing safety. These additional requirements have shifted the focus from simply designing collision-free and feasible trajectories to finding optimal ones.

This paper deals with optimal kinodynamic motion planning for systems with complex dynamics and subject to constraints.

1.1 Literature review

A significant amount of work in the robotics community has then been dedicated to the problem of kinodynamic planning so as to determine a trajectory that fulfills the differential constraints arising from the dynamics of the robot. Solving this problem is complex, in general, since it requires a search in the state space of the robot, which often implies a higher-dimensional search space compared to a pure kinematic planning.

Most of the motion planners can be classified under the two categories of exact and sampling-based methods, LaValle (2006). The former looks for a solution in the continuous state space, while the latter samples this space redefining it as a graph where nodes are connected via edges representing local trajectories between sampled states.

Exact methods are said to be complete, since they terminate in finite time with a solution, if one exists, and return the nonexistence otherwise. However, with the exact approaches even the simplest problem is PSPACE hard (Reif 1979). Exact methods require that the obstacles are represented explicitly in the state space, dramatically increasing the problem complexity. However, they can provide practical solutions for problems that are characterized by a low dimensional state space, or for which a low dimensional obstacle representation can be adopted. Obstacles introduce non convex constraints in the admissible state space, and make the problem of computing an optimal trajectory hard. In particular, most of the algorithms, that are gradient based, can only find a solution in the same homotopy class of the initial guess (LaValle 2006). There has been a significant progress to address this issue and, in particular, the ideas of dividing the global optimal trajectory planning problem into simpler subproblems and of using numerical optimization to compute locally optimal trajectories have been explored in, e.g., (Park et al. 2015; Kuderer et al. 2014).

Sampling-based approaches emerged to handle systems with high dimensional state spaces, and they became the most popular approaches in the planning literature, representing the practical way to tackle the problem. The basic idea is to sample states (nodes) in the continuous state space and connect these nodes with trajectories in the collision-free space, building a roadmap in the form of a graph or a tree of feasible trajectories. These algorithms avoid an explicit representation of obstacles by using a collision check module that allows to determine the feasibility of a tentative trajectory. They are not complete, but they satisfy the probabilistic completeness property, i.e., they return a solution with a probability converging to one as the number of samples grows to infinity, if such a solution exists.

Probabilistic Road Map (PRM), introduced by Kavraki et al. (1996), and Rapidly exploring Random Trees (RRT), introduced by LaValle and Kuffner (2001), were the first popular sampling-based planners. PRM first creates a graph in the free configuration space by randomly sampling nodes and connecting them to the already existing ones in the graph using a local planner. The graph can then be used to answer multiple queries, where in each query a start node and a goal node are added to the graph and a path connecting the two nodes is looked for. RRT, on the other hand, incrementally builds a tree starting from a given node, returning a solution as soon as the tree reaches the goal region, hence providing a fast on-line implementation. In all the different formulations of sampling-based planners, a steering function is required to design a trajectory (edge in the tree terminology) connecting two nodes of the tree.

Considering the quality of the solution, an important progress has been made with the introduction of \(\hbox {RRT}^{\star }\) (Rapidly exploring Random Tree star) and \(\hbox {PRM}^{\star }\) (Probabilistic Road Map star), which have been proven to be asymptotically optimal, i.e., the probability of finding an optimal solution, if there exists one, converges to 1 as the tree cardinality grows to infinity, (Karaman and Frazzoli 2011). The main idea of these algorithms is to ensure that each node is connected to the graph optimally, possibly rewiring the graph by testing connections with pre-existing nodes that are in a suitably defined neighborhood. The same strategy applies to kinodynamic planning (Karaman and Frazzoli 2010) as well, with the additional difficulty that when optimality is required, implementing the steering function involves solving a two point boundary value problem (TPBVP), which is computationally challenging especially when dealing with complex dynamics, such as for non-holonomic robots, in presence of actuation constraints. In the context of kinodynamic planning \(\hbox {RRT}^{\star }\) and \(\hbox {PRM}^{\star }\) cannot be considered in the same way. In fact, \(\hbox {PRM}^{\star }\) is limited to symmetric costs and to those systems for which the cost associated to a TPBVP is conserved when the boundary pairs are swapped. Note that, nonholonomic systems do not belong to this class.

Considering instead \(\hbox {RRT}^{\star }\), it must be noticed that for various dynamical systems, such as non-holonomic vehicles, the presence of kinodynamic constraints makes the constrained optimization problem that the steering function has to solve extremely complex.

To deal with this computational complexity, some effort has been made towards developing effective steering functions for different types of dynamical systems. Webb and van den Berg (2013) have obtained the closed-form analytical solution for a minimum time minimum energy optimization problem for systems with linear dynamics, and extended it to non-linear dynamics using first-order Taylor approximation. Other works (Perez et al. 2012; Goretkin et al. 2013) have focused on approximating the solution for systems with linearizable dynamics, by locally linearizing the system and applying linear quadratic regulation (LQR).

Some recent attempts have been made towards optimality without formulating and solving a TPBVP, as well. For example, algorithms like Stable Sparse \(\hbox {RRT}^{\star }\) (\(\hbox {SST}^{\star }\)) (Li et al. 2016) have proved asymptotic optimality given access only to a forward propagation model. The idea is to iteratively sample a set of controls and final times instead of explicitly solving the BVP. Similarly, a variant of \(\hbox {RRT}^{\star }\) (hwan Jeon et al. 2011) uses a shooting method without a steering function to improve the solution by pruning branches from the tree. If a sampled node has a lower cost compared to another one that is close by and that shares the same parent, the pre-existing node is pruned from the tree and its branches are connected to the newly sampled node, or they are pruned completely if they are not collision-free. This approach generates feasible but inherently suboptimal solutions. Other works on extending \(\hbox {RRT}^{\star }\) to handle kinodynamic constraints include limiting the volume in the state space from which nodes are selected by tailoring it to the considered dynamical system in order to improve computational effectiveness (Karaman and Frazzoli 2013).

Nevertheless, solving the TPBVP for an arbitrary nonlinear system remains challenging and typically calls for numerical solvers. The algorithms that account for a nonlinear optimization tool, like for example ACADO toolkit (Houska et al. 2011), GPOPS-II (Patterson and Rao 2014), etc., commonly use Sequential Quadratic Programming (SQP) (Boggs and Tolle 1995) for solving the TPBVPs numerically, and embed it as a subroutine in the sampling based planning framework such as in RRT\(^\star \) (Stoneman and Lampariello 2014) or in Batch-Informed-Trees star (\(\hbox {BIT}^{\star }\)) (Gammell et al. 2014).

A different class of algorithms, aiming at optimality, is based on graph search and adopt a gridding approach. The main idea is to discretize the state space, building a grid, and compute a graph. The motion planning problem is then recast into finding the best sequence of motions by traversing this graph with an optimal search algorithm like \(\hbox {A}^{\star }\) (Pearl 1984). This graph is often represented by a state lattice, a set of states distributed in a regular pattern, where the connections between states are provided by feasible/optimal trajectories (Pivtoraiko et al. 2009). Likhachev and Ferguson (2009) improved the idea of state lattice by using a multi-resolution lattice such that the portion of the graph that is close to either the initial or the goal state has a higher resolution than the other parts. These approaches have been successfully applied to several robotic systems and found to be effective for dynamic environments (Likhachev and Ferguson 2009; Dolgov et al. 2010). However, these algorithms are resolution optimal, such that the optimality is guaranteed up to the grid resolution. Furthermore, their computational effectiveness is highly related to the resolution: the finer is the grid, the higher the branching factor, and thus the computational time and the required memory to execute a graph traversal algorithm.

1.2 Contribution of the paper

The main contribution of this paper is proposing an algorithm, called \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\), which extends \(\hbox {RRT}^{\star }\) by introducing a database of pre-computed motion primitives in order to avoid the online solution of a constrained TPBVP for the edge computation. The database is composed of a set of trajectories, each one connecting an initial state to a final one in a suitably defined grid. By sampling in the gridded state space, the implementation of the steering function adopted for growing and rewiring the \(\hbox {RRT}^{\star }\) tree reduces to the search of a motion primitive in a pre-computed Look Up Table (LUT).

The proposed approach is applicable to any dynamical system described by differential equations and subject to analytical constraints, for which edge design can be formulated as a TPBVP. Notably, when a model of the robot is not available, the database can be derived directly from experimental trajectories.

The main difference of the proposed approach, with respect to existing algorithms that use a database of pre-computed trajectories, e.g., search based approaches as in (Pivtoraiko et al. 2009), is that it leverages on a dynamic tree whose size depends only on the number of samples, but not on the number of motion primitives that affects the accuracy in the approximation of the robot kinematic and dynamic characteristics. As a consequence, memory consumption to store the tree and computation time to determine a solution on a given tree can be bounded selecting an appropriate maximum number of samples, without introducing undesired constraints in the robot action space.

Search based approaches, instead, have to strongly limit the action space, keeping the number of motion primitives low, as the branching factor of the graph, i.e., the number of edges generated expanding each node, depends on the size of the database.

The effectiveness of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  has been validated in simulation, showing that the time required to build the tree is greatly reduced with the introduction of a LUT. This represents a promising result for online applications, especially in dynamic environments where the planner has to generate a new trajectory in response to changes in the obstacle-free state space.

An analysis of the probabilistic completeness and optimality properties of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  is also provided. This involves a two-step procedure where we assess how close the proposed sample-based solution gets to the optimal one in the gridded state space as the number of samples grows to infinity, and how it gets close to the optimal solution in the continuous state space as the gridding gets finer and finer.

1.3 Paper structure

The paper is organized as follows. Section 2 introduces a formal description of the problem. In Sect. 3 the proposed \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\) algorithm is explained in detail. An analysis of its probabilistic completeness and optimality properties is presented in Sect. 4. A numerical validation of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  is provided in Sect. 5. Finally, some concluding remarks are drawn in Sect. 6.

2 Problem statement

In this work, dynamical systems with state vector \(\mathbf {q}\in \mathbb {R}^d\) and control input \(\mathbf {u}\in \mathbb {R}^m\), governed by

$$\begin{aligned} \dot{\mathbf {q}}(t)=f \left( \mathbf {q}(t),\mathbf {u}(t)\right) \end{aligned}$$
(1)

where f is continuously differentiable as a function of both arguments, are considered. The control input \(\mathbf {u}(t)\) is subject to actuation constraints, and the admissible control space is denoted as \(U \subset \mathbb {R}^m\). The state \(\mathbf {q}(t)\) is constrained in the set \(Q \subset \mathbb {R}^d\), and initialized with \(\mathbf {q}(0)=\mathbf {q}_0\in Q\). Both U and Q are assumed to be compact. An open subset \(Q_{goal}\) of Q represents the goal region that the state has to reach.

A trajectory of system (1) is defined by the tuple \(\mathbf {z}=\left( \mathbf {q}(\cdot ), \mathbf {u}(\cdot ),\tau \right) \), where \(\tau \) is the duration of the trajectory, and \(\mathbf {q}(\cdot ):[0,\tau ] \rightarrow Q \) and \(\mathbf {u}(\cdot ):[0,\tau ] \rightarrow U\) define the state and control input evolution along the time interval \([0,\tau ]\), satisfying the differential equation (1) for \(t \in [0,\tau ]\), the initial condition \(\mathbf {q}(0)=\mathbf {q}_0\), and the final condition \(\mathbf {q}(\tau )\in Q_{goal}\).

Obstacles are represented via an open subset \(Q_{obs}\) of Q. The free space is then defined as \(Q_{free} := Q \setminus Q_{obs}\), and the assumption \(\mathbf {q}_0 \in Q_{free}\) is enforced.

A trajectory \(\mathbf {z}=\left( \mathbf {q}(\cdot ), \mathbf {u}(\cdot ),\tau \right) \) of system (1) is said to be collision free, if it avoids collisions with obstacles, i.e., \(\mathbf {q}(t) \in Q_{free}\), \(t \in [0,\tau ]\).

The set of collision free trajectories is denoted as \(Z_{free}\).

An optimal kinodynamic motion planning problem can then be formalized as finding a feasible and collision free trajectory \(\mathbf {z}^\star = \left( \mathbf {q}^\star (\cdot ),\mathbf {u}^\star (\cdot ),\tau ^\star \right) \in Z_{free}\) that is optimal according to a cost criterion \(J(\mathbf {z}):Z_{free} \rightarrow \mathbb {R}_{\ge 0}\) that is expressed as

$$\begin{aligned} J(\mathbf {z}) = \int \limits _{0}^{\tau }g\left( \mathbf {q}(t),\mathbf {u}(t)\right) \mathrm {d}t \end{aligned}$$

where \(g:Q\times U\rightarrow \mathbb {R}_{\ge 0}\) is an instantaneous cost function. We assume that trajectories joining two different states have a non-zero cost. This is for instance the case in minimum time optimization where \(g(\mathbf {q}, \mathbf {u})=1\) and the trajectory duration \(\tau \) is one of the optimization variables of the problem.

Remark 1

(translation invariance property) In the context of motion planning, system (1) represents the robot equations of motion and, consequently, its state vector \(\mathbf {q}\) includes the robot position with respect to a given absolute reference frame. In the following, a translation invariance property is supposed to hold. This means that, if obstacles are neglected and a pair of initial and final states and the associated optimal trajectory \(\mathbf {z}^\star = \left( \mathbf {q}^\star (\cdot ),\mathbf {u}^\star (\cdot ),\tau ^\star \right) \) are considered, by shifting the origin of the coordinate system and rewriting all relevant quantities—including system dynamics (1), and initial and final states—in the new coordinate system and applying input \(\mathbf {u}^\star (\cdot )\), the optimal robot trajectory is obtained, which is \(\mathbf {z}^\star \) rewritten in the new coordinates.

3 \(\hbox {RRT}^{\star }\) with motion primitives

The approach here proposed is based on previous works on search-based (Pivtoraiko et al. 2009; Likhachev and Ferguson 2009), and sampling-based (Karaman and Frazzoli 2011, 2010) methods, and combines them in a novel way.

In particular, it relies on a uniform discretization of the state space, and on the computation of a finite set of motion primitives by solving a constrained optimization problem with boundary conditions on the grid points of a smaller (uniform) grid. The motion primitives are then embedded in the \(\hbox {RRT}^{\star }\) algorithm, where they are used to connect the randomly generated nodes to the tree, thus eliminating the need of solving online challenging and time consuming TPBVPs.

3.1 Database of motion primitives

The database of motion primitives is built by gridding the continuous state space in order to obtain a finite set of boundary conditions (initial and final states), and by solving offline a constrained boundary value optimization problem for each pair. The resulting set of optimal trajectories is then used repeatedly online, implementing a procedure that, when an edge connecting two nodes is requested by the planner, simply picks up from the database a suitable trajectory.

Given a state tuple \(\mathbf {q}\in Q\), \(\mathbf {q}= [\varvec{\pi }^T, \ldots ]^T \), including the robot position \(\varvec{\pi }\), motion primitives are computed for each pair of initial and final states \((\bar{\mathbf {q}}_0,{\bar{\mathbf {q}}}_f)\) with \(\bar{\mathbf {q}}_0=[\bar{\varvec{\pi }}_0^T, \ldots ]^T\). Given a boundary value pair \(({\bar{\mathbf {q}}}_0,{\bar{\mathbf {q}}}_f)\), a motion primitive is computed by solving the following optimization problem

$$\begin{aligned} \left. \begin{array}{l@{\qquad }l} \underset{\mathbf {u}(\cdot ), \tau }{\text {mininimize}} &{} \int \limits _{0}^{\tau }g\left( \mathbf {q}(t),\mathbf {u}(t)\right) \mathrm {d}t \\ \text {subject to }&{} \dot{\mathbf {q}}(t)=f\left( \mathbf {q}(t),\mathbf {u}(t)\right) \\ &{}\mathbf {u}(t)\in U, \, t \in [0,\tau ] \\ &{}\mathbf {q}(t)\in Q_{free}, \, t \in [0,\tau ] \\ &{}\mathbf {q}(0)={\bar{\mathbf {q}}}_0,\;\mathbf {q}(\tau )={\bar{\mathbf {q}}}_f \end{array}\right. \end{aligned}$$
(2)

Finally, the database is generated by considering distinct pairs of initial and final states \(({\bar{\mathbf {q}}}_0^i,{\bar{\mathbf {q}}}_f^i)\), \(i=1,\ldots ,N\), and computing the corresponding set of motion primitives \(\mathcal {Z}= \{\mathbf {z}^\star _i = \left( \mathbf {q}^\star _i(t),\mathbf {u}^\star _i(t),\tau ^\star _i\right) , \, i=1,\ldots ,N \} \) with the associated costs \(\mathcal {C}=\{C^\star _i, \, i=1,\ldots ,N\}\).

Fig. 1
figure 1

A subset of the motion primitives computed for a 3D search space \((x,y,\theta )\). Red dots correspond to the initial and final positions, and black lines represent the resulting trajectories for different final orientations (\(\theta \)) (Color figure online)

Note that, thanks to the translation invariance property introduced in Remark 1, the size of the database can be kept small while covering a wide range of the space where the robot is moving. Indeed, one can, without loss of generality, set the initial position \(\bar{\varvec{\pi }}_0\) to \(\bar{\varvec{\pi }}_0=0\) when building the database, and recover the optimal trajectory for an arbitrary initial position \({\varvec{\pi }}_0\) by simply centring the coordinate system in \({\varvec{\pi }}_0\).

Example 1

Consider, as an example, a planning problem for a unicycle robot characterised by a 3D search space \((x,y,\theta )\), including the position (xy) and the orientation \(\theta \), and by a 2D actuation space \((v,\omega )\), constituted by the linear velocity v and the angular velocity \(\omega \).

Motion primitives are computed solving the following TPBVP

$$\begin{aligned}&\underset{v(\cdot ), \omega (\cdot ), \tau }{\text {mininimize}}&\int \limits _{0}^{\tau } \left( 1 + 0.5v(t)^2 + 0.5\omega (t)^2 \right) \mathrm {d}t \\&\text {subject to }&\dot{x}(t)=v(t)\cos \left( \theta (t)\right) \\&\dot{y}(t)=v(t)\sin \left( \theta (t)\right) \\&\dot{\theta }(t)=\omega (t) \\&v(t)\in \left[ 0,2\right] , \, t \in [0,\tau ] \\&\omega (t)\in \left[ -2,2\right] , \, t \in [0,\tau ] \\&x(0)=\bar{x}_0,\; y(0)=\bar{y}_0,\; \theta (0)=\bar{\theta }_0 \\&x(\tau )=\bar{x}_f,\; y(\tau )=\bar{y}_f,\; \theta (\tau )=\bar{\theta }_f \end{aligned}$$

for different initial and final poses.

Figure 1 shows a subset of these motion primitives, characterized by trajectories starting from \(\bar{x}_0=\bar{y}_0=\bar{\theta }_0=0\).

As can be seen in Fig. 1, with the dynamical system and cost function considered in this example, motion primitives, corresponding to boundary conditions that are symmetric with respect to the x-axis, are symmetric. A further analysis reveals that the same property holds for the y-axis as well, and that symmetric primitives are characterized by the same cost.

Remark 2

When for the considered dynamical system and cost criterion stronger invariance properties hold, like the axis symmetry in Example 1, the size of the database can be further reduced by storing only a few “reference trajectories”, and generating all the others using the invariance transformation.

Figure 2 shows an example, referred to the TPBVP considered in Example 1, where the trajectories represented by the pairs \((\mathbf {q}_0,\mathbf {q}_f^1)\), \((\mathbf {q}_0,\mathbf {q}_f^2)\), \((\mathbf {q}_0,\mathbf {q}_f^3)\) are characterised by the same cost and can be easily mapped to a “reference trajectory” corresponding to the boundary value pair \(\left( \mathbf {q}_0,\mathbf {q}_f\right) \). In this case, the size of the database can be further reduced storing only the “reference trajectory”.

Fig. 2
figure 2

“Reference trajectory” corresponding to \((\mathbf {q}_0,\mathbf {q}_f)\) (red solid line), and symmetric trajectories defined by the pair of boundary conditions \((\mathbf {q}_0,\mathbf {q}_f^1)\), \((\mathbf {q}_0,\mathbf {q}_f^2)\), \((\mathbf {q}_0,\mathbf {q}_f^3)\) (Color figure online)

3.2 Search space design

In order to take advantage of the pre-computed database of motion primitives in sampling-based planning, the search space of the planner has to be defined appropriately so that every time the planner needs to connect two nodes, the corresponding optimal trajectory can be found in the database. To guarantee that this is indeed the case, the search space of the planner is uniformly gridded as the region where motion primitives are built. The translation invariance propertyFootnote 1 can then be exploited, as any optimal trajectory which connects a pair of initial and final states (in the discretized search space) can be computed by first shifting the initial and final states so that the initial robot position corresponds to the zero position, then picking a suitable motion primitive in the database, and finally shifting the motion primitive so as to get back to the original reference coordinate system. Translation invariance, jointly with uniform gridding, allow a reduced number of motion primitives to cover the entire (discretized) search space. Note that, as the resolution of the database and the uniform grid size of the search space are coupled, we often use these two terms interchangeably.

Determining optimal state space discretization depends on the specific application, and is out of the scope of this work. We shall assume here that the state space grid should include the initial state \(\mathbf {q}_0\), at least a grid point in the goal region, and few points in the free space \(Q_{free}\). Moreover, in order to find a solution that reaches the goal region, the algorithm should be able to search within all homotopy classes that are feasible given the robot footprint. In other words, one should be able to represent in the grid space all sets of trajectories in the continuous state space that can be obtained applying a smooth transformation and lead to the goal region. Missing a homotopy class could highly deteriorate performance in terms of achieved cost.

In Sect. 5 some numerical examples are provided, in which different grids are used for solving the same case study and results are compared.

figure e

3.3 Motion planning

This section introduces \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\) (Algorithm ), the proposed variant of \(\hbox {RRT}^{\star }\) integrating the database of motion primitives for the computation of a collision-free optimal trajectory (cf. Sect. 2).

As in \(\hbox {RRT}^{\star }\), \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  is based on the construction of a random tree \(T=(Q_T,E_T)\) where \(Q_T\subset Q\) is the set of nodes, and \(E_T\) is the set of edges. Nodes are states and edges are optimal trajectories, each one connecting a pair of origin and destination nodes and solving the TPBVP in (2).

The tree T is expanded for a maximum number of iterations N, defined by the user, starting from \(Q_T=\{\mathbf {q}_0\}\), where \(\mathbf {q}_0\in Q_{free}\) is the initial state, and \(E_T=\emptyset \), as described in Algorithm .

Every node \( \mathbf {q}\in Q_T\) is connected to \(\mathbf {q}_0\) via a single sequence of intermediate nodes \(\mathbf {q}_i \in Q_T\), \(i=1, \ldots , n-1\), \(n\le N\), and associated edges \(e_i=e_{\mathbf {q}_i,\mathbf {q}_{i+1}} \in E_T\), \(i=0,1, \ldots , n-1\), with \(\mathbf {q}_{n}=\mathbf {q}\).

One can then associate to this sequence a cost \(C(\rightarrow \mathbf {q}_n)\) given by

$$\begin{aligned} C(\rightarrow \mathbf {q}_n) = \sum \limits _{i=0}^{n-1}C(e_i) \end{aligned}$$

where \(C(e_i)\) denotes the cost associated with edge \(e_i\in E_T\) and computed as in (2).

figure f

Tree growing is based on four main steps—random sampling, finding near nodes, extending the tree, and rewiring—that are described in the following.

3.3.1 Random sampling

A random state \(\mathbf {q}_{rand}\) is sampled from the free state space \(Q_{free}\) according to a uniform distribution by \(\mathtt{SAMPLE}\left( Q_\mathrm {free} \right) \). Unlike the original \(\hbox {RRT}^{\star }\) algorithm, however, the node is not sampled from the continuous state space, but from its discretization according to a uniform grid. For this reason, there is also a non zero probability that the same state \(\mathbf {q}_{rand}\) is sampled again in the next iterations of the algorithm.

3.3.2 Near nodes (Algorithm 2)

In \(\hbox {RRT}^{\star }\) a random state \(\mathbf {q}_{rand}\) can be connected only to a node that is within the set of its near nodes.

For Euclidean cost metrics, the set of near nodes is defined as a d-dimensional ball centered at \(\mathbf {q}_{rand}\) of radius

$$\begin{aligned} \gamma _{ball} = \gamma _{RRT^\star }\left( \log (n)/n\right) ^{1/d} \end{aligned}$$

where n is the tree cardinality at the current iteration of the algorithm and \(\gamma _{RRT^\star }\) is a suitable constant selected as

$$\begin{aligned} \gamma _{RRT^\star } > 2 \left( 1+1/d\right) ^{1/d}\left( \mu (Q_{free})/\zeta _d\right) ^{1/d} \end{aligned}$$

\(\mu (Q_{free})\) and \(\zeta _d\) denoting the volume of the free configuration space and of the unit ball, respectively, in a d-dimensional Euclidean space.

For non-Euclidean cost metrics, the distance between two states is represented by the optimal cost of the trajectory that connects them, and near nodes are selected from a set of reachable states, \(Q_{reach}\), defined as the set of states that can be reached from \(\mathbf {q}_{rand}\) or that can reach \(\mathbf {q}_{rand}\) with a cost that satisfies some threshold value. More specifically,

$$\begin{aligned} Q_{reach}= & {} \{\mathbf {q}\in Q : C(e_{\mathbf {q}_{rand},\mathbf {q}}) \le l(n) \vee \nonumber \\&C(e_{\mathbf {q},\mathbf {q}_{rand}}) \le l(n)\} \end{aligned}$$
(3)

where \(e_{\mathbf {q}_i,\mathbf {q}_j}\) denotes the edge from \(\mathbf {q}_i\) to \(\mathbf {q}_j\), and l(n) is a cost threshold that decreases over the iterations of the algorithm as \(l(n)=\gamma _l \left( \log n/n\right) \) such that a ball of volume \(\gamma ^{d}\left( \log n/n\right) \) is contained within \(Q_{reach}\), where \(\gamma _l\) and \(\gamma \) are suitable constants (see Karaman and Frazzoli 2010, for further details).

Fig. 3
figure 3

Steps involved in the coordinate transformation within the \(\mathtt{FindTrajectory}\) procedure

In \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\), however, the set \(Q_{reach}\) has to be further constrained to ensure that there is a pre-computed trajectory in the database for each connection in the set of near nodes, that is thus redefined as follows

$$\begin{aligned} Q_{near}=Q_{reach} \cap \mathtt{BoundingBox}(\varvec{\pi }_{rand}) \end{aligned}$$

where \(\mathtt{BoundingBox}(\varvec{\pi }_{rand})\) denotes the box of grid points in the state space Q adopted for the database construction, with the origin of the reference coordinate system shifted from \({\bar{\pi }}_0=0\) to \(\pi _{rand}\), which is the robot position associated to state \(\mathbf {q}_{rand}\) and obtained by using \(\mathtt{GetPosition}\).

If \(Q_{near}\) occurs to be an empty set, the algorithm continues to the next iteration selecting a new \(\mathbf {q}_{rand}\).

3.3.3 Extend (Algorithm 3)

The tree is extended to include \(\mathbf {q}_{rand}\) by selecting the node \(\mathbf {q}_{best}\in Q_T\) such that the edge \(e_{\mathbf {q}_{best},\mathbf {q}_{rand}}\) connects \(\mathbf {q}_{rand}\) with a minimum cost collision free trajectory.

\(\mathbf {q}_{best}\) is determined as follows

$$\begin{aligned} \mathbf {q}_{best} = \mathop {\hbox {argmin}}\limits _{\mathbf {q}\in Q_{feasible}} C(\rightarrow \mathbf {q}) + C(e_{\mathbf {q},\mathbf {q}_{rand}}) \end{aligned}$$

where \(Q_{feasible}\subseteq Q_{near}\) is the set of nodes \(\mathbf {q}\) that belong to \(Q_{near}\) and such that the trajectory connecting \(\mathbf {q}\) to \(\mathbf {q}_{rand}\) is collision free, i.e.,

$$\begin{aligned} Q_{feasible} = \{\mathbf {q}\in Q_{near} | \mathtt{CollisionFree} (e_{\mathbf {q},\mathbf {q}_{rand}})=1 \} \end{aligned}$$

where \(\mathtt{CollisionFree}: E_T \rightarrow \{0,1\}\) is a function that returns 1 for an edge that is collision free, 0 otherwise.

The selection of an edge from the database, connecting \(\mathbf {q}\) to \(\mathbf {q}_{rand}\), is a peculiarity of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\), and is performed by the \(\mathtt{FindTrajectory}\) function as follows:

  1. 1.

    a translation is applied to the pair of initial and final states \(\left( \mathbf {q},\mathbf {q}_{rand}\right) \) (Fig. 3a), obtaining the normalized pair \(\left( \tilde{\mathbf {q}},\tilde{\mathbf {q}}_{rand}\right) \), such that the resulting \(\tilde{\mathbf {q}}\) has the position \(\tilde{\varvec{\pi }}\) corresponding to the null vector (Fig. 3b);

  2. 2.

    a query is executed on the database to look for the trajectory \(\mathbf {z}_i \in \mathcal {Z}\) and the cost \(C(e_{\tilde{\mathbf {q}},\tilde{\mathbf {q}}_{rand}}) \in \mathbb {R}\);

  3. 3.

    the inverse of the previous translation is applied in order to recover the trajectory connecting the actual pair of boundary values \(\left( \mathbf {q},\mathbf {q}_{rand}\right) \), determining the edge \(e_{\mathbf {q},\mathbf {q}_{rand}}\) (Fig. 3c).

At the end of this procedure, if \(\mathbf {q}_{rand}\) is not already in the tree, then it is added to the tree together with the minimum cost edge, i.e., \(Q_T\) is replaced by \(\{\mathbf {q}_{rand} \}\cup Q_T\) and \(E_T\) by \(\{e_{\mathbf {q}_{best},\mathbf {q}_{rand}}\}\cup E_T\) (see steps 7 and 8 in Algorithm ). If \(\mathbf {q}_{rand}\) is already in the tree and if the computed \(\mathbf {q}_{best}\) is different from the current parent node, \(\mathbf {q}_{prev}\), of \(\mathbf {q}_{rand}\), given by \(\mathtt{PARENT}(\mathbf {q}_{rand})\), then the previous edge, \(e_{\mathbf {q}_{prev},\mathbf {q}_{rand}}\), is replaced by the new edge, \(e_{\mathbf {q}_{best},\mathbf {q}_{rand}}\) (see step 14 in Algorithm ).

figure g
figure h

3.3.4 Rewiring (Algorithm 4)

In order to ensure that all node pairs are connected by an optimal sequence of edges, every time a new node \(\mathbf {q}_{rand}\) is added to the tree, a check is performed to verify if an already existing node can be reached from this newly added node with a smaller cost.

Therefore, \(\forall \mathbf {q}\in Q_{near}\) if \(e_{\mathbf {q}_{rand},\mathbf {q}}\) is collision free, and the following conditions hold

$$\begin{aligned}&C(e_{\mathbf {q}_{rand},\mathbf {q}}) \le l(n) \\&C(\rightarrow \mathbf {q}_{rand}) + C(e_{\mathbf {q}_{rand},\mathbf {q}}) \le C(\rightarrow \mathbf {q}) \end{aligned}$$

the tree is rewired, i.e.,

$$\begin{aligned} E_T \leftarrow \{E_T \setminus e_{prev}\} \cup \{e_{\mathbf {q}_{rand},\mathbf {q}}\} \end{aligned}$$

where \(e_{prev}\) is the previous edge connecting the node \(\mathbf {q}\) to the tree.

3.3.5 Termination and best sequence selection

After the maximum number of iterations is reached the procedure to build the tree terminates.

The best trajectory is selected as the node sequence reaching the goal region with the minimum cumulative cost C.

Note that, using a discretized search space limits the number of nodes that can be sampled, once all of them have been sampled the tree cardinality does not increase any more, but the algorithm can still continue updating the edges to ensure that each node is connected with the best possible parent node.

4 Completeness and optimality analysis

In this section, probabilistic completeness of the proposed planning algorithm and optimality of the solution are discussed. Furthermore, some results to assess how close the solution obtained using a discretized state space and motion primitives is to the optimal trajectory computed considering a continuous state space are provided.

Let \(Q^{{\varDelta }}\) define the set of grid points that represent the discretized state spaceFootnote 2 and similarly \(Q^{{\varDelta }}_{free} := Q^{\varDelta }\cap Q_{free} \) represents the free discrete state space. Assuming that the discretization step size is chosen properly, then, the collection of all grid points \(\mathbf {q}\in Q^{{\varDelta }}_{free}\) that can be reached from \(\mathbf {q}_0\) by concatenating a sequence of motion primitives in \(Q_{free}\) is a non empty set. We shall denote this set as \(V^{\varDelta }_{free}\) and its cardinality as \(N^{\varDelta }\). Note that the end points of the concatenated motion primitives are grid points in \( Q^{{\varDelta }}_{free}\) and hence they belong to \(V^{\varDelta }_{free}\).

Let \(\mathcal {G}^{\varDelta }_{free} = (V^{{\varDelta }}_{free}, E^{\varDelta }_{free})\) be a graph where the set of nodes is given by \(V^{\varDelta }_{free}\) defined before and the set of edges \(E^{\varDelta }_{free}\) is the collection of all the (possibly translated) motion primitives iteratively built as follows: starting from \(\mathbf {q}_{0}\) consider all the (translated) motion primitives that lie in \(Q_{free}\) and connect \(\mathbf {q}_0\) to all possible grid points in \(Q^{\varDelta }_{free}\), and, then, continue with the same strategy for all of the newly reached grid points iteratively until it is not possible to further expand the graph.

Finally, \(Q_{goal}^{{\varDelta }}\) denotes the set of those grid points of \(V^{{\varDelta }}_{free}\) that belong to \(Q_{goal}\). The motion planning problem using the grid representation admits a solution if \(Q_{goal}^{{\varDelta }}\) is not empty since this means that there exists a way of reaching a state in \(Q_{goal}\) starting from \(\mathbf {q}_{0}\) with the available motion primitives. In the following derivations we assume that \(Q_{goal}^{{\varDelta }} \ne \emptyset \).

Similarly to \(\hbox {RRT}^{\star }\), \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  generates a tree T based on the random samples extracted from \(Q^{\varDelta }_{free}\). However, unlike \(\hbox {RRT}^{\star }\), the nodes and edges added to the tree belong respectively to \(V^{\varDelta }_{free}\) and \(E^{\varDelta }_{free}\), so that the obtained tree T is a sub-graph of \(\mathcal {G}^{\varDelta }_{free}\), i.e., \(T \subset \mathcal {G}^{\varDelta }_{free}\). The subscript i is used to denote the generated tree and the cost of the lowest cost trajectory represented in that tree after i-th iterations, i.e., \(T_i\) and \(c_i\), respectively.

Since \(Q_{goal}^{{\varDelta }} \ne \emptyset \), then, there exists at least an optimal branch of \(\mathcal {G}^{\varDelta }_{free}\) composed of the ordered sequence of nodes

$$\begin{aligned} S^\star := \{\mathbf {q}_0^\star ,\mathbf {q}_1^\star ,\ldots ,\mathbf {q}_k^\star \}, \end{aligned}$$

which represents a resolution optimal\({\varDelta }\)-trajectory, a minimum cost trajectory that starts at the initial state \(\mathbf {q}_0\) and ends in the goal region, i.e., \(\mathbf {q}_k^\star \in Q_{goal}^{{\varDelta }}\), such that \(\mathbf {q}_j^\star \in V^{{\varDelta }}_{free}\) and \(e_{\mathbf {q}_{j-1}^\star ,\mathbf {q}_{j}^\star } \in E^{\varDelta }_{free}\), for any \(j=1, \ldots ,k\). Let \(c^{\star {\varDelta }}\) denote the cost of this optimal trajectory, i.e., \(C(\rightarrow \mathbf {q}_k^\star )= c^{\star {\varDelta }}\), which is named resolution optimal\({\varDelta }\)-cost.

The goal of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  can then be reformulated as that of generating a tree that contains an optimal branch \(S^\star \) to reach \(Q^{{\varDelta }}_{goal}\) that is represented in \(\mathcal {G}^{\varDelta }_{free}\). Note that, one could in principle build \(G^{\varDelta }_{free}\) and apply an exhaustive search on it. However, this can still be an issue due to the combinatorial nature of the problem, in particular due to the branching caused by the dimensionality of the state space and the number of nodes contained in the graph.

In this section the quality of the solution obtained by \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  is analyzed by addressing the following questions:

  1. 1.

    resolution optimality: if there exists a resolution optimal \({\varDelta }\)-trajectory \(S^\star \) in \(\mathcal {G}^{\varDelta }_{free}\), then, is it possible to obtain such a trajectory?

  2. 2.

    asymptotic optimality: how close is the resolution optimal \({\varDelta }\)-cost to the cost of the optimal trajectory, as the grid resolution increases and the grid converges to the continuous state space?

Theorem 1

As the number of iterations goes to infinity the cost of the trajectory returned by \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  converges to the resolution optimal \({\varDelta }\)-cost with a probability equal to 1, i.e.,

$$\begin{aligned} \mathbb {P}\left( \left\{ \lim _{i \rightarrow \infty } c_i = c^{\star {\varDelta }} \right\} \right) =1. \end{aligned}$$

Moreover, the expected number of iterations required to converge to \(c^{\star {\varDelta }}\) is upper bounded by \(k|Q_{free}^{{\varDelta }}|\), where k is the length of an optimal branch of \(\mathcal {G}^{\varDelta }_{free}\).

Proof

\(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\) returns the resolution optimal solution if it discovers an optimal branch, \(S^\star = \{\mathbf {q}_0^\star ,\mathbf {q}_1^\star ,\ldots ,\mathbf {q}_k^\star \}\), defined on \(\mathcal {G}_{free}^{\varDelta }\). Assuming that the constant defining the cost threshold for selecting nearby nodes, \(\gamma _l\), is selected large enough (e.g. \(\gamma _l\) can be chosen so that \(Q_{reach}\) defined in (3) contains the \(\mathtt{BoundingBox}\)\((\varvec{\pi }_{rand})\) for \(n=N^{\varDelta }\)), as soon as \(\{\mathbf {q}_0^\star ,\mathbf {q}_1^\star ,\ldots ,\mathbf {q}^\star _{j-1}\}\) is a branch in the tree, \(T=(Q_T,E_T)\), then the algorithm adds the edge \(e_{\mathbf {q}_{j-1},\mathbf {q}_j}\) to the tree in one of the following two ways:

  1. 1.

    any time \(\mathbf {q}_{j}^\star \) is sampled it is connected to \(\mathbf {q}_{j-1}^\star \) by the EXTEND procedure. In fact, as \(e_{\mathbf {q}_{j-1}^\star ,\mathbf {q}_j^\star }\) is part of the optimal sequence, there is no better way of reaching \(\mathbf {q}_{j}^\star \) other than \(e_{\mathbf {q}_{j-1}^\star ,\mathbf {q}_j^\star }\).

  2. 2.

    if \(\mathbf {q}_{j}^\star \) is sampled before \(\mathbf {q}_{j-1}^\star \) and connected to some node \(\mathbf {q}_{prev}\), such that \(\mathbf {q}_j^\star \in Q_T\) and \(e_{\mathbf {q}_{prev},\mathbf {q}_j^\star } \in E_T\), the edge \(e_{\mathbf {q}_{j-1}^\star ,\mathbf {q}_j^\star }\) is selected any time \(\mathbf {q}_{j-1}^\star \) is sampled, thanks to the REWIRE procedure.

This property allows to model the process of determining the sequence \(S^\star \) as an absorbing Markov chain initialized at \(\mathbf {q}_0^\star \) with \(\mathbf {q}_k^\star \) as absorbing state and all intermediate \(\mathbf {q}_j^\star \), \(j=1,2, \ldots k-1 \) that are transient states. There is a positive probability \(\mathcal {P}_j\) of advancing in the sequence, i.e., moving from \(\mathbf {q}^\star _{j-1}\) to \(\mathbf {q}^\star _j\), and a probability \(1-\mathcal {P}_j\) of staying at the same state. Considering that at each iteration a new grid point is sampled from \(Q_{free}^{{\varDelta }}\) independently and according to a uniform distribution, each one has a probability \(\frac{1}{|Q_{free}^{{\varDelta }}|}\) of being extracted. Therefore, there is a probability \(\mathcal {P}_j=\frac{1}{|Q_{free}^{{\varDelta }}|}\) of advancing in the Markov chain and, since all states are transient states apart from \(\mathbf {q}_k^\star \) which is the absorbing state, the probability that the process is absorbed by \(\mathbf {q}_k^\star \) tends to 1 as i tends to infinity (Bertsekas and Tsitsiklis 2002). Moreover, there is a finite number of expected iterations, \(k/\mathcal {P}_j=k\,|Q_{free}^{{\varDelta }}|\), before the process is absorbed, i.e., before \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  returns the resolution optimal solution. \(\square \)

Clearly, the number of expected iterations increases with the depth of the solution, k, and the number of states represented in the grid. However, one advantage of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\), as \(\hbox {RRT}^{\star }\) is the possibility of obtaining a solution rapidly and possibly improving its quality within the allowed computing time.

Corollary 1

\(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\) is probabilistically resolution complete, as the number of iterations goes to infinity the algorithm will return a solution to the motion planning problem, if there exists one in \(\mathcal {G}_{free}^{\varDelta }\), with a probability 1.

The remaining of this section deals with the relation between the resolution optimal \({\varDelta }\)-trajectory returned by \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  as the number of iterations grows to infinity and the truly optimal trajectory in the continuous state space. To this purpose, we shall focus with the case when there are no actuation constraints and enforce the following assumptions regarding the properties of the dynamical system (1) and the existence of a solution.

Assumption 1

The following properties hold for the dynamical system in (1)

  • the system is small-time locally attainable (STLA)Footnote 3;

  • function \(f(\cdot )\), representing the system dynamics, is Lipschitz continuous with Lipschitz constant \(\mathcal {K}_f\);

  • function \(C(\cdot )\), assigning a cost to an edge, satisfies the following Lipschitz-like continuity condition with Lipschitz constant \(\mathcal {K}_c\):

    $$\begin{aligned} \left| C(e_{\mathbf {q}_0,\mathbf {q}_1})- C(e_{\tilde{\mathbf {q}}_0, \tilde{\mathbf {q}}_1})\right| \le \mathcal {K}_c \left\| \left| \begin{array}{c} \mathbf {q}_0 \\ \mathbf {q}_1 \\ \end{array} \right| - \left| \begin{array}{c} \tilde{\mathbf {q}}_0 \\ \tilde{\mathbf {q}}_1 \\ \end{array} \right| \right\| \end{aligned}$$

    for each pair of edges \(e_{\mathbf {q}_0,\mathbf {q}_1}\) and \(e_{\tilde{\mathbf {q}}_0, \tilde{\mathbf {q}}_1}\).

Derivations in the rest of this section apply straightforwardly to state spaces that are Euclidean, and can be generalized to state spaces that are manifolds if the following assumption holds.

Assumption 2

The state space manifold of system (1) with d state variables is a subspace of the d-dimensional Euclidean space, \(\mathbb {R}^d\), therefore can be locally treated as \(\mathbb {R}^d\).

With a slight abuse of the previously introduced notation, in the rest of this section we use the term “trajectory” for the state space component of the tuple \(\mathbf {z}\) defined in Sect. 2. In order to compare the trajectory returned by \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  and the optimal trajectory in the continuous state space, firstly, trajectories whose points are all away from obstacles by a certain distance are considered. For this reason, the definition of obstacle clearance of a trajectory, i.e., the minimum distance between obstacles and points belonging to the trajectory, has to be introduced.

Definition 1

(\(\epsilon \)-obstacle clearance) Given a trajectory \(\sigma (t)\), \(t \in [0,T]\), if the ball \(\mathcal {B}_{\epsilon }\left( \sigma (t)\right) \) of radius \(\epsilon \) and centred at \(\sigma (t)\) is strictly inside \(Q_{free}\), for any \(t \in [0,T]\), then, the \(\epsilon \)-obstacle clearance property holds for \(\sigma \).

Definition 2

(\(\epsilon \)-free space) Let \(\sigma (t):[0,T] \rightarrow Q_{free}\) be a trajectory which has \(\epsilon \)-obstacle clearance, the \(\epsilon \)-free space along \(\sigma \) is given by

$$\begin{aligned} Q^{\epsilon }_{\sigma } := \bigcup _{t \in [0,T]}\mathcal {B}_{\epsilon }\left( \sigma (t)\right) \end{aligned}$$

Then, a set of trajectories that are called to be \(\epsilon \)-similar to \(\sigma \) can be introduced.

Definition 3

(\(\epsilon \)-similarity) Any trajectory \(\tilde{\sigma }(t):[0,\tilde{T}] \rightarrow Q_{free}\) is said to be \(\epsilon \)-similar to \(\sigma \) if it lies in the \(Q^{\epsilon }_{\sigma }\) free-space, i.e., if

$$\begin{aligned} \tilde{\sigma }(t) \in Q^{\epsilon }_{\sigma }, \, t \in [0,\tilde{T}]. \end{aligned}$$
Fig. 4
figure 4

An example of a trajectory \(\sigma _2\) that is \(\epsilon \)-similar to a trajectory \(\sigma _1\)

Figure 4 shows an example of a trajectory that is \(\epsilon \)-similar to another one.

Note that having an \(\epsilon \)-free space along a trajectory is not sufficient to guarantee the existence of \(\epsilon \)-similar trajectories (\(\epsilon \)-dynamic clearance), as this existence depends also on the properties of the dynamical system (1). The following definition relates the \(\epsilon \)-similarity with the \(\epsilon \)-free space through the system dynamics.

Definition 4

(\(\epsilon \)-dynamic clearance) Given a trajectory \(\sigma (t) :[0,T] \rightarrow Q_{free}\) which has \(\epsilon \) obstacle clearance, if for any pair of time instants \( t_1, t_2\), such that \(0 \le t_1 < t_2 \le T\), there exists a set of states inside a ball of radius \(\alpha \epsilon \), with \(0 < \alpha \le 1\), centered at \(\sigma (t_2)\) that are reachable from \(\sigma (t_1)\) according to dynamics (1) without leaving the \(\epsilon \)-free space around \(\sigma (t)\), then \(\sigma \) has \(\epsilon \)-dynamic clearance.

Fig. 5
figure 5

\(\sigma _\epsilon ^\star \) is the optimal trajectory with at least \(\epsilon \)-obstacle and dynamic clearance

Let \({\varSigma }_\epsilon \) denote all the trajectories that solve the motion planning problem and have at least \(\epsilon \)-obstacle and dynamic clearance. Let \(c^\star _\epsilon \) denote the minimum cost over all \({\varSigma }_\epsilon \), which corresponds to the \(\epsilon \)-optimal trajectory, \(\sigma ^\star _\epsilon \) (Fig. 5). Note that, as \(\epsilon \) tends to zero \(\sigma ^\star _\epsilon \) converges to the truly optimal trajectory in the continuous state space. Due to the discretized nature of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  it is not possible to converge to the optimal trajectory in the continuous state space, however in the following it is proven that the graph, \(\mathcal {G}^{\varDelta }_{free}\), contains an \(\epsilon \)-similar trajectory, \(\sigma ^{{\varDelta }}_\epsilon \), to the optimal trajectory \(\sigma ^\star _\epsilon \) with a particular \(\epsilon \)-clearance. Consequently, this result will be used to show that as the resolution of the grid increases, and as \(\epsilon \) converges to zero, the resolution optimal\({\varDelta }\)-cost will converge to the truly optimal cost, i.e., \(c^\star \).

Theorem 2

Let \(\bar{\epsilon } >0\) be smaller than half of the shortest side of the bounding box and such that the system (1) admits an \(\epsilon \)-optimal trajectory for any \(\epsilon \le \bar{\epsilon }\), then,

  • for a sufficiently fine gridding, \(\mathcal {G}^{\varDelta }_{free}\) contains an \(\epsilon \)-similar trajectory to \(\sigma ^\star _\epsilon \), \(\forall \epsilon \le \bar{\epsilon }\),

  • as the grid resolution increases, the resolution optimal\({\varDelta }\)-cost converges to \(c^\star \).

Fig. 6
figure 6

Consecutive samples taken along the \(\epsilon \)-optimal trajectory to design a \(\{\mathcal {B}_{\delta }\}\) sequence. The corresponding \(\mathtt{BB}(\varvec{\pi }_m)\) are shown in red (Color figure online)

Proof

Let us fix \(\epsilon \), with \(0 < \epsilon \le \bar{\epsilon }\), and sample a set of states, \(\{\mathbf {q}_m :m=0,1, \ldots , M \}\), along the \(\epsilon \)-optimal trajectory, \(\sigma _\epsilon ^\star \), starting from the initial state and ending in the final state, in such a way that a ball of radius \(\bar{\epsilon }\) centered at sample \(\mathbf {q}_{m}\) would be contained in the \(\mathtt{BoundingBox}(\varvec{\pi }_{m-1})\) centered at the preceding sample \(\mathbf {q}_{m-1}\) and touching its boundary for \( m=1, \ldots , M-1\) (Fig. 6), till the final state is included within \(\mathtt{BoundingBox}(\varvec{\pi }_{M-1})\).

As \(\sigma _\epsilon ^\star \) has \(\epsilon \)-dynamic clearance, it is true that for any sample \(\mathbf {q}_{m}\) along the trajectory there exists a set of states within a ball of radius \(\alpha \epsilon \) centered at \(\mathbf {q}_{m+1}\) that is reachable without leaving \(Q^\epsilon _{\sigma _\epsilon ^\star }\), which is the \(\epsilon \)-free space along \(\sigma _\epsilon ^\star \). From Assumption 1, in particular from the Lipschitz continuity of the system dynamics, it follows that a sequence of non-overlapping balls, \(\{\mathcal {B}_{\delta }\left( \mathbf {q}_m\right) \}\), centered at the samples along \(\sigma _\epsilon ^\star \) and characterized by radius \(\delta \) where

$$\begin{aligned} \delta = \frac{\alpha \epsilon }{2 \mathcal {K}_f} \end{aligned}$$
(4)

can be determined such that any state within \(\mathcal {B}_{\delta }\left( \mathbf {q}_{m}\right) \) can reach any state in \(\mathcal {B}_{\delta }\left( \mathbf {q}_{m+1}\right) \) without leavingFootnote 4\(Q^\epsilon _{\sigma _\epsilon ^\star }\) (see, Khalil 1996; Karaman and Frazzoli 2010). Assuming that the discretization is fine enough so that none of the \(\mathcal {B}_{\delta }\left( \mathbf {q}_m\right) \) is empty, there exists a sequence of nodes \( \{ \tilde{\mathbf {q}}_m :m=0,1, \ldots , M \} \) represented in \(\mathcal {G}^{\varDelta }_{free}\), such that \( \tilde{\mathbf {q}}_m \in \mathcal {B}_{\delta }\left( \mathbf {q}_{m}\right) , m=0,1, \ldots , M \), \(\tilde{\mathbf {q}}_0 = \mathbf {q}_0\), \(\tilde{\mathbf {q}}_M\) belongs to the goal region and the corresponding trajectory \(\sigma _\epsilon ^{\varDelta }\) in \(\mathcal {G}^{\varDelta }_{free}\) is \(\epsilon \)-similar to \(\sigma _\epsilon ^\star \) (Fig. 7).

Fig. 7
figure 7

Construction of the ball sequence \(\{\mathcal {B}_{\delta }\left( \mathbf {q}_m\right) \}\). Tiling \(\sigma _\epsilon ^\star \) with balls of radius \(\delta \) centered at consecutive samples, there exist trajectories that are \(\epsilon \)-similar to \(\sigma _\epsilon ^\star \) represented on \(\mathcal {G}^{\varDelta }_{free}\). The cost of the trajectory connecting two consecutive samples is larger than or equal to \(l_{min}\)

The cost of the optimal trajectory can be defined as the sum of the costs of the trajectories connecting consecutive \(\mathcal {B}_{\delta }\left( \mathbf {q}_m\right) \) balls, i.e.,

$$\begin{aligned} c_\epsilon ^\star = \sum _{m=1}^{M} C(e_{\mathbf {q}_{m-1} , \mathbf {q}_{m}}), \end{aligned}$$
(5)

and similarly the cost of the \(\sigma ^{{\varDelta }}_\epsilon \) is

$$\begin{aligned} c^{{\varDelta }}_{\epsilon } = \sum _{m=1}^{M} C(e_{\tilde{\mathbf {q}}_{m-1}, \tilde{\mathbf {q}}_{m}}). \end{aligned}$$
(6)

Subtracting (5) from (6), the difference can be written as

$$\begin{aligned} c^{{\varDelta }}_{\epsilon } - c_\epsilon ^\star = \sum _{m=1}^{M} \left( C(e_{\tilde{\mathbf {q}}_{m-1}, \tilde{\mathbf {q}}_{m}}) - C(e_{\mathbf {q}_{m-1}, \mathbf {q}_{m}})\right) . \end{aligned}$$

By the Lipschitz continuity of the cost function given in Assumption 1 it follows that

$$\begin{aligned} c^{{\varDelta }}_{\epsilon } - c_\epsilon ^\star \le \sum _{m=1}^{M} \mathcal {K}_c \delta , \end{aligned}$$

where \(\mathcal {K}_c\) is the Lipschitz constant for the cost function. Therefore,

$$\begin{aligned} c^{{\varDelta }}_{\epsilon } \le c_\epsilon ^\star + \mathcal {K}_c M \delta . \end{aligned}$$
(7)

We next derive an upper bound on the number of trajectory segments, M, using the lower bound on the cost of the trajectory connecting two consecutive samples, \(\mathbf {q}_{m-1}\) and \(\mathbf {q}_{m}\). To this purpose, let us define a smaller bounding box centered at \(\mathbf {q}_{m}\) for each sample, which we shall denote as \(\mathtt{BB}(\varvec{\pi }_{m})\), such that the frontier of \(\mathtt{BB}(\varvec{\pi }_{m})\) intersects with \(\sigma ^\star _\epsilon \) at \(\mathbf {q}_{m}\) (see Fig. 6) and all its sides are obtained by decreasing those of \(\mathtt{BoundingBox}(\varvec{\pi }_{m})\) of \(2{\bar{\epsilon }}\) evenly. Then, we can define a minimum cost \(l_{min}\) among all the trajectories that reach the frontier of \(\mathtt{BB}(\varvec{\pi }_0)\) from \(\varvec{\pi }_0\). Note that \(l_{min}>0\) since the length of the shortest side of \(\mathtt{BB}(\varvec{\pi }_0)\) is larger than zero and we assumed non-zero cost for trajectories joining two different states. Then, the cost \(C(e_{\mathbf {q}_{m-1} , \mathbf {q}_{m}})\) of the trajectory connecting \(\mathbf {q}_{m-1}\) and \(\mathbf {q}_{m}\), for \(m=1, \ldots , M-1\), satisfies the following inequality

$$\begin{aligned} l_{min} \le C(e_{\mathbf {q}_{m-1} , \mathbf {q}_{m}}). \end{aligned}$$
(8)

Note that we have to treat separately the last trajectory segment connecting \(\mathbf {q}_{M-1}\) to \(\mathbf {q}_{M}\) since the existence of an \(\bar{\epsilon }\)-ball centered at \(\mathbf {q}_M\) and contained within \(\mathtt{BoundingBox}\left( \varvec{\pi _{M-1}}\right) \) can not be guaranteed as \(\mathbf {q}_{M}\) is the final state which is fixed. Therefore, (8) may not hold for the last trajectory segment. We can then rewrite Eq. (5) as follows

$$\begin{aligned} c_\epsilon ^\star = \sum _{m=1}^{M-1} C(e_{\mathbf {q}_{m-1} , \mathbf {q}_{m}}) + C(e_{\mathbf {q}_{M-1},\mathbf {q}_{M}}) \end{aligned}$$

and lower bound the cost of \(\sigma _\epsilon ^\star \) as

$$\begin{aligned} c_\epsilon ^\star \ge (M-1)l_{min} + 0. \end{aligned}$$

Considering that the number of trajectory segments can be upper bounded by

$$\begin{aligned} M-1 \le \dfrac{c_\epsilon ^\star }{l_{min}}, \end{aligned}$$

from (7) it follows that

$$\begin{aligned} c^{{\varDelta }}_{\epsilon } \le \left( 1 + \frac{\mathcal {K}_c\delta }{l_{min}}\right) c_\epsilon ^\star + \mathcal {K}_c \delta . \end{aligned}$$

Substituting the definition of \(\delta \) given in (4), we get

$$\begin{aligned} c^{{\varDelta }}_{\epsilon } \le \left( 1 + \frac{\mathcal {K}_c \, \alpha \, \epsilon }{2 \, \mathcal {K}_f \, l_{min}}\right) c_\epsilon ^\star + \dfrac{\mathcal {K}_c \, \alpha \, \epsilon }{2 \, \mathcal {K}_f}, \end{aligned}$$

which shows that the cost of the \(\epsilon \)-similar trajectory for a specific gridding is upper bounded by the cost of the \(\epsilon \)-optimal trajectory and the \(\epsilon \)-clearance. Now the cost \(c^{\star {\varDelta }}\) of the optimal trajectory that can be obtained given a particular discretization satisfies

$$\begin{aligned} c^{\star {\varDelta }}\le c^{{\varDelta }}_{\epsilon } \le \left( 1 + \frac{\mathcal {K}_c \, \alpha \, \epsilon }{2\mathcal {K}_f \, l_{min}}\right) c_\epsilon ^\star + \dfrac{\mathcal {K}_c \, \alpha \, \epsilon }{2\mathcal {K}_f} \end{aligned}$$
(9)

which provides an upper bound on the resolution optimal\({\varDelta }\)-cost, \(c^{\star {\varDelta }}\).

As the resolution of the discretization increases and as the grid converges to the continuous state space, \(\epsilon \) can converge to zero. Then, since \(\mathcal {K}_f\) and \(\mathcal {K}_c\) are constants and \(l_{min}\) is fixed (it depends on the upper bound \({\bar{\epsilon }}\) on \(\epsilon \)), from (9) it follows that as \(\epsilon \) goes to zero the resolution optimal\({\varDelta }\)-cost converges to the cost \(c^\star \) of the optimal trajectory, i.e.,

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} c^{\star {\varDelta }} = c^\star . \end{aligned}$$
(10)

\(\square \)

Theorem 2 states that when the number of nodes in \(\mathcal {G}^{\varDelta }_{free}\) goes to infinity, i.e., when the uniform gridding converges to the continuous state space, the resolution optimal \({\varDelta }\)-cost converges to the cost \(c^\star \) of the optimal trajectory in the continuous state space without gridding. By Theorem 1, it then follows that, when the uniform gridding converges to the continuous state space, as the number of iterations goes to infinity the cost of the trajectory returned by \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  converges to the cost of the optimal trajectory. Furthermore, Theorem 2 establishes an upper bound on the resolution optimal\({\varDelta }\)-cost as a function of the cost of the \(\epsilon \)-optimal trajectory and the \(\epsilon \)-clearance such that as the grid resolution increases the upper bound decreases (see Eq. (9)). However, as shown in Theorem 1, as the number of discrete states increase, \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  will take more iterations to return the resolution optimal\({\varDelta }\)-trajectory. Depending on the problem at hand, one should make the best compromise between the computing time (and size of the database) and the performance in terms of cost.

Fig. 8
figure 8

The three grids used to generate the database. The red circle is the initial position, the blue dots the final ones. Red arrows represent the initial headings and velocities, blue arrows are the final ones. Different arrow sizes correspond to different velocities (Color figure online)

5 Numerical example

In this section a numerical example is presented to show the effectiveness of the proposed algorithm.

A 4D state-space \((x,y,\theta ,v)\) representing a unicycle like robot moving on a planar surface is considered. The robot is described by the following equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{x}(t)=v(t)\cos \theta (t)\\ \dot{y}(t)=v(t)\sin \theta (t)\\ \dot{\theta }(t)=w(t)\\ \dot{v}(t)=a(t) \end{array}\right. } \end{aligned}$$
(11)

where (xy) is the position of the robot and \(\theta \) the orientation with respect to a global reference frame, v and w are the linear and angular velocity, respectively. The control input is represented by \(\mathbf {u}=[w,a]^T\), where w and a are the angular velocity and linear acceleration, respectively.

Fig. 9
figure 9

Linear velocity and actuation profiles for the optimal trajectories in Figs. 10c (pink line), 11c (blue line) and 12c (green line). Red lines show the velocity and actuation limits (Color figure online)

The motion primitives in the database are computed for each pair of initial and final state, \(\mathbf {q}_0=[x_0,y_0,\theta _0, v_0]\) and \(\mathbf {q}_f=[x_f,y_f,\theta _f,v_f]\), solving the TPBVP in (2) for the differential equations given in (11) and the cost function

$$\begin{aligned} J(\mathbf {u}, \tau )=\int \limits _{0}^{\tau }\left[ 1 + \mathbf {u}(t)^TR\mathbf {u}(t)\right] \mathrm {d}t \end{aligned}$$

that minimizes the total time of the trajectory \(\tau \), penalizing the total actuation effort with a weight \(R=0.5I_2\). The control variables a and w are bounded as \(a \in [-3,3]\,\text {m/s}^2\), \(w \in [-5,5]\,\text {rad/s}\).

TPBVPs are solved using MATLAB toolbox GPOPS (Patterson and Rao 2014), a nonlinear optimization tool based on the Gauss pseudo-spectral collocation method.

Fig. 10
figure 10

Trajectories generated with an indoor map and a coarse resolution uniform square gridding for various number of iterations. Magenta square is the goal region, and the optimal trajectories are represented in red (Color figure online)

Fig. 11
figure 11

Trajectories generated with an indoor map and a fine resolution uniform square gridding for various number of iterations. Magenta square is the goal region, and the optimal trajectories are represented in red (Color figure online)

Fig. 12
figure 12

Trajectories generated with an indoor map and a fine resolution uniform diamond gridding for various number of iterations. Magenta square is the goal region, and the optimal trajectories are represented in red (Color figure online)

Three databases based on different grids have been considered. For all of them the initial state is characterised by the same position \((x_0,y_0) = (0,0)\).

The first database is based on a coarse resolution uniform square grid (Fig. 8a), where \((x_f,y_f) \in [-2,2] \times [-2,2] \setminus \{(0,0)\}\) and each square cell has a size of one meter. The initial orientation \(\theta _0\) is selected among three values \(\{0, \pi /4, \pi /2 \}\,\text {rad}\), the final orientation \(\theta _f\) can take 8 equally spaced values in the range \([0,2\pi )\,\text {rad}\). For the initial and final velocities, \(v_0\) and \(v_f\), a minimum and a maximum velocity of \(1\,\text {m/s}\) and \(4\,\text {m/s}\) is considered together with the zero velocity.

The second database is based on a fine resolution uniform square grid (Fig. 8b), where \((x_f,y_f) \in [-2,2] \times [-2,2] \setminus \{(0,0)\}\) and each square cell has a size of half a meter. The initial and final orientations can take 24 equally spaced values in the range \([0,2\pi )\,\text {rad}\). The initial and final velocities are selected among 5 equally spaced values in the range \([0,4]\,\text {m/s}\).

Finally, the third database is based on a uniform diamond grid (Fig. 8c), characterised by the same initial and final states as the previous one, plus some additional final states at \((x_f,y_f) \in [-1.75,1.75] \times [-1.75, 1.75]\) with a discretization step of half a meter in each direction. These additional states are characterized by the same orientation and velocity of the rest of the database.

Fig. 13
figure 13

Cost with respect to the number of iterations. Coarse resolution square gridding (red line), fine resolution square gridding (green line) and diamond gridding (blue line) as a function of number of iterations (Color figure online)

Simulations are performed on an IntelCore i7@2.40 GHz personal computer with 8Gb RAM and the algorithm has been implemented in MATLAB.

An indoor map is considered (Figs. 1011, and 12), setting the robot initial pose at \(\left( 1,0,\pi /2\right) \) with zero velocity. The goal area is defined as a square of half a meter side and centred at (9, 11). The robot should stop at the end of the trajectory.

Fig. 14
figure 14

Tree cardinality with respect to the number of iterations, for coarse (red line) and fine (blue line) resolution square gridding (Color figure online)

Fig. 15
figure 15

Computation time with respect to the number of iterations, for coarse (red line) and fine (blue line) resolution square gridding (Color figure online)

Figures 1011, and 12 show the trees and the optimal trajectories obtained for different number of iterations and for the three different gridding strategies. Correspondingly, Fig. 9 shows the velocity and the actuation profiles for the optimal trajectories computed with 3000 and 50,000 iterations, and reported in Figs. 10c, 11c and 12c, clearly demonstrating that the velocity constraint and the actuation bounds are satisfied. Note that the velocity profile that corresponds to the coarse resolution square gridding exhibits a jerky acceleration behaviour, due to the fact that the velocity at each node is constrained to be exactly one of the values in the database. This demonstrates that the velocity discretization step has to be accurately selected if a smoother velocity profile is required.

The same planning problem has been solved for 10 independent simulation runs. Figure 13 shows the average cost evolutions related to the coarse resolution, fine resolution and diamond gridding, as the number of iterations increases. As expected, the cost reduces increasing the number of iterations, and converges to the resolution optimal\({\varDelta }\)-cost: once this minimum is achieved the solution will not further improve.

As can be easily seen, motion primitives computed using a denser grid provide lower cost plans. Moreover, the resolution optimal \({\varDelta }\)-cost achieved using the fine resolution square and diamond grids are similar, demonstrating that the choice of the discretization step is strictly related to the specific problem. Finally, it is worth mentioning that as the resolution of the grid increases, the cardinality of \(Q_{free}^{{\varDelta }}\) increases as well, slowing down the convergence to the resolution optimal \({\varDelta }\)-cost.

In order to assess the impact of the grid resolution on the size of the search space, in Table 1 we report the number of nodes corresponding to the grids of the three adopted databases (see Fig. 8), together with the corresponding minimum and maximum branching factors, i.e., the number of neighbors that each node is connected to. The computed number of nodes is an upper bound on the cardinality of \(\mathcal {G}^{\varDelta }_{free}\). Yet, from the figures in Table 1, it should be clear the combinatorial nature of the problem, which makes it hard building the whole graph of motion primitives and applying a graph search. As a matter of fact, the most commonly used lattice-based approaches use graph search algorithms that resort to some heuristic (see for example dynamic \({\hbox {A}}^{\star }\) (\(\hbox {D}^{\star }\)) and anytime repairing \(\hbox {A}^{\star }\) (\(\hbox {ARA}^{\star }\)) by Stentz (1994) and Likhachev et al. (2008) respectively). Note that \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  does not need to adopt any heuristic, but if a smart heuristic were available, it could be integrated within \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  to speed up the search.

Table 1 Number of nodes of the state space grids in Fig. 8 corresponding to different resolutions together with the minimum and maximum branching factors
Fig. 16
figure 16

Histogram of the computation time for solving a single TPBVP, median of the values is represented with the red line. The histogram is determined based on 1000 boundary pairs randomly selected from the same subspace used for building the motion primitives (Color figure online)

Figure 14 shows the cardinality of the tree for the coarse and fine resolution square griddings, as the number of iterations increases. Since the nodes that can be added to the tree are limited to the elements of the grid, the tree cardinality converges to the cardinality of \(\mathcal {G}^{\varDelta }_{free}\): once this value is achieved no more nodes can be added.

Figure 15 shows the computation time evolution for the coarse and fine resolution griddings, as the number of iterations increases. Though \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  does not necessarily add a new node at each iteration, but can only change the existing connections, showing the computation time evolution with respect to the number of iterations allows to easily relate this quantity to the achieved cost, and thus the degree of sub-optimality of the planned trajectory.

It is also worth mentioning that the resulting computation time is promising, even for online replanning in the case of dynamic and partially known environments. Code optimisation and a C/C++ implementation can be considered for a further speed up.

To better emphasize the advantage of using a precomputed database, we report in Fig. 16 the histogram of the computation time for solving a single TPBVP of the considered example using the GPOPS commercial numerical solver (Patterson and Rao 2014). As can be seen from this figure, it typically takes around 400 ms to get a solution for a single TPBVP while a trajectory can be extracted from the database in a time of the order of 0.01 ms (values ranged between 0.008 ms and 0.015 ms over 100 trials). Note that at each iteration of the standard \(\mathtt{RRT}^{\star }\) algorithm, a set of TPBVPs that corresponds to the set of tentative trajectories connecting \(\mathbf {q}_{rand}\) to a set of nearby nodes has to be solved. When the TPBVP is not easy to be solved (like in the considered example), the applicability of \(\mathtt{RRT}^{\star }\) to dynamic and partially known environments is hampered.

6 Conclusions

In this paper, a variant of \(\hbox {RRT}^{\star }\), named \(\mathtt{RRT}^{\star } \_ \mathtt{Motion}{} \mathtt{Primitives}\), that allows to introduce motion primitives in the \(\hbox {RRT}^{\star }\) planning framework is presented. In particular, a set of pre-computed trajectories, named motion primitives, is used to substitute the computationally challenging step of solving for a steering action. Then, in order to ensure that for any queried steering action a pre-computed trajectory exists, a grid representation of the state space has been introduced.

This newly conceived algorithm is supported by an accurate theoretical analysis, demonstrating the optimality and probabilistic completeness.

The performance of \(\mathtt{RRT}^{\star } \_ \mathtt{MotionPrimitives}\)  has been verified in simulation, showing promising results in terms of quality of the planned trajectory and computation time, that is particularly important for an online usage in the case of dynamic environments that require repeated replanning. The results show also that as the grid size gets smaller, asymptotic optimality is achieved. Having a fine resolution, however, increases the size of the database and the number of iterations required to converge to the resolution optimal trajectory. Nevertheless, one advantage of adopting a sampling based approach is the possibility of computing a feasible though sub-optimal solution first, and then, in case more time is available, improve it. One should indeed choose the best compromise between computing time and performance, according to the application at hand.