1 Introduction

Motion planning is a crucial aspect of autonomous robots [1, 2]. However, it has been shown to be difficult and nontrivial for legged locomotion due to problems of mechanical complexity, irregular ground surfaces, as well as hybrid, underactuated and high-dimensional nature of the dynamics of legged robots [3,4,5]. To overcome these difficulties, simplified template models are often utilized to simplify the planning problem, which enables online reactive control in changing situations [6, 7].

There are numerous related works in literature on online motion planning using simplified template models [8, 9]. Kajita et al. first proposed a preview control scheme for biped walking pattern generation based on the linear inverted pendulum (LIP) model [10]. This scheme utilizes predefined footholds as the zero moment point (ZMP, referred to as center of pressure hereafter) reference, which allows for the generation of dynamically stable walking motion through the use of a linear quadratic regulator (LQR) [11]. Subsequently, a generic approach called model predictive control (MPC) has been developed by adding constraints on the center of pressure (CoP). Results demonstrate this approach improves stability against disturbances [12, 13].

However, due to the intrinsic under-actuation of the floating-based dynamics of legged robots, the center of mass (CoM) cannot be directly controlled and is subject to the ground reaction forces (GRFs) interacting with the environment [14, 15]. The aforementioned approaches predefine footholds, which enforce the GRFs within a specified support polygon formed by the feet, which becomes limiting in a changing environment. As a result, the responsiveness and feasibility of the motion against unexpected disturbances are severely restricted by pre-allocated footholds.

Instead of following the predefined footholds, some studies have formulated linear MPC problems that simultaneously optimize the CoM trajectories and footholds. This allows for the implementation of reactive online motion planning with automatic foothold adaptation in both bipedal locomotion, assuming a fixed step duration [16, 17]. Similarly, our previous work has proposed to apply LIP model and MPC to quadruped locomotion, considering foothold optimization [18].

In addition, more flexible locomotion optimizations have been proposed in literature [19, 20] by formulating the planning problem as a mixed-integer quadratic program (MIQP), which allows for the adaptation of both the duration and position of the footsteps.

With the development of recent numerical and theoretical algorithms in nonlinear optimization, more advanced nonlinear MPC can be used for online applications [21, 22]. The work in [21] extends the linear MPC in [8] to a nonlinear MPC based on the LIP model, which demonstrated improved performance and the ability to avoid obstacles for humanoids. Another motion planning approach for quadruped robots was formulated as a nonlinear programming problem, which simultaneously optimizes both the CoM trajectories and footholds [23].

Inspired by these approaches, we proposed a reactive motion planning approach based on a nonlinear MPC for simultaneous optimization of the CoM trajectories and footholds for legged locomotion, as depicted in Fig. 1.

Fig. 1
figure 1

Control scheme of the proposed motion planning approach. Using the task, robot and terrain information, a unified nonlinear optimization problem is formulated for both long-term and reactive planning that allows for simultaneous footholds and CoM trajectory optimization. Leveraging the advantages of the MPC and vector field strategy, the long-term results are used as input for reactive re-planning

The proposed approach combines long-term pre-planning and reactive re-planning for the simultaneous optimization of CoM trajectories and footholds. It is based on a unified optimization formulation that takes into account information related to the target and environmental constraints, such as obstacles, robot kinematics and dynamics. By utilizing a combination of MPC, the vector field guidance, and long-term optimal CoM–CoP planning, this approach offers a robust and efficient method for motion planning.

Instead of utilizing traditional time-based motion planning methods, the proposed approach uses motion planning based on the CoM spatial deviation to generate the reference velocity for reactive tracking. The approach takes into account the deviation of the current state of the CoM from the pre-planned trajectory, and generates reactive motions that are more robust against disturbances and uncertainties. Furthermore, by incorporating off-line footholds planning as an initial condition for MPC, this approach enables more efficient convergence to the pre-planned long-term motions, resulting in a more stable and reliable performance of the robot.

Previous work has investigated quadruped motion generation method based on the MPC [18]. However, the method did not take into account reaching the target location, obstacle avoidance, and jumping gait. In contrast to previous work based on LIP model [18, 23], the proposed approach utilizes a generic point-mass representation as the template model, which overcomes the limitations of the constant height assumption in the LIP model, allowing for the generation of more natural and adaptable movements with varying heights and aerial, flight phases. Furthermore, the gait transitions between walking and jumping are available. The main contributions of our work are summarized below:

  1. 1.

    A reactive motion planning approach allows for simultaneous CoM trajectory and footholds optimization, taking into account the target location, robot kinematics/dynamics, and environmental constraints such as obstacles.

  2. 2.

    A unified optimization problem is formulated for long-term off-line pre-planning and reactive online re-planning.

  3. 3.

    The approach combines the advantages of MPC and vector field strategy, making it long-term trajectory convergence, fast to compute, and more robust against disturbances, which mitigates the limitations of motions dominated by time-based functions.

  4. 4.

    The use of a generic point-mass model enables the generation of more natural and adaptable gaits, such as walking, jumping, and their gait transitions.

The remainder of this paper is organized as follows: Sect. 2 presents the formulation of the unified nonlinear optimization problem for long-term off-line planning. Section 3 elaborates on the reactive online planning, which utilizes vector field guidance for tracking and the nonlinear MPC in detail. The results and analysis of numerical simulations and experiments are discussed in Sect. 4, and the conclusion is presented in Sect. 5.

2 2. Point-Mass Model Walking

With the advantage of the computational efficiency, a low-dimensional generic point-mass model is introduced to abstract the dynamics of the legged robots, as depicted in Fig. 2. In this paper, the legged robot is reduced to a generic point-mass model with mass m. A support polygon (represented by the yellow area) is formed by multi-contact point feet for quadruped robots or a finite-sized foot for biped robots, with a geometric center that represents the foothold \({\mathbf{p}}^{f}\). The vector \({\mathbf{r}}\) represents the leg vector from the foothold to the CoM. The support polygon, determined by the foothold, forms a CoP constraint area, which restricts the location of the CoP \({\mathbf{p}}^{p}\). The ground applies a GRF \({\mathbf{f}}\) to the point-mass model.

Fig. 2
figure 2

A simplified generic point-mass model: a mass m and a massless, extendable leg r; a foothold with a geometric center pf in contact with the ground; and the GRF f applied from the CoP pp

The point-mass model simplification in legged robot dynamics is based on the assumption of conservation of angular momentum over time, specifically, that its derivative is equal to zero. Additionally, it is assumed that there is sufficient friction present to allow for the dynamics of the point-mass model to be represented as:

$$m{\mathbf{\ddot{c}}} = {\mathbf{f}} + m{\mathbf{g}}.$$
(1)

This model is further illustrated through the periodic locomotion of the robot, where it takes a certain number of steps \(\overline{m}\) forward before coming to a stop, and the use of virtual swing legs (depicted as gray dotted lines) to explicitly describe the exchange of support legs. It is important to note that the footholds remain constant during each step and only change upon exchange of the support foot.

The concept of a foothold is utilized in legged robot locomotion to determine the position of the support feet. However, it should be noted that the foothold does not always align with the CoP \({\mathbf{p}}^{p}\), which is constrained by the support polygon defined by the foothold.

3 Optimization-Based Motion Planning

To facilitate efficient locomotion, an off-line optimization approach can be employed to plan a long-term, optimal trajectory prior to initiating the task. This trajectory can then serve as a reference for an online motion planner, which utilizes reactive walking patterns in conjunction with the off-line results, while ensuring compliance with a set of constraints. In light of this, a unified, nonlinear optimization problem can be formulated to incorporate environmental constraints, robot motion, and the task goal for both off-line planning and online MPC algorithms.

The explicit relationship between the CoM acceleration and the GRF in the point-mass dynamics Eq. (1) is well defined and takes into account the constraints on the footholds and the CoP in its motion analysis. Therefore, the optimization of the acceleration of the CoM and the foothold is chosen as the decision variable \({{\varvec{\upchi}}}\), namely:

$${{\varvec{\upchi}}}(t) = \left[ {{\mathbf{\ddot{c}}}(t),\;{\mathbf{p}}^{f} (t)} \right].$$
(2)

To proceed with this optimization, cost functions and corresponding constraints must be formulated.

3.1 Cost Functions

The goal of off-line planning is to determine the decision variables that will result in the desired performance of the point-mass model. To quantify the discrepancy between predicted and desired behavior explicitly, it is common practice to use a least-squares formula to express the cost functions in the optimization problem:

$$\mathop {\min }\limits_{{{{\varvec{\upchi}}}(t)}} \quad J_{{{{\varvec{\upchi}}}(t)}} = \int_{{t_{0} }}^{{t_{F} }} {\omega_{{{\varvec{\upchi}}}} \left\| {f({{\varvec{\upchi}}}(t)) - f({\tilde{\mathbf{\chi }}}(t))} \right\|^{2} } ,$$
(3)

where \(f({{\varvec{\upchi}}}(t))\) and \(f({\tilde{\mathbf{\chi }}}(t))\) indicate the predicted performance and desired performance, respectively.

Considering the locomotion performance index, the cost functions for off-line optimization are discussed as follows.

  1. 1.

    CoM states

In the case of predominantly sagittal motion, it is imperative for the CoM velocity along the x axis to adhere to the specified commands. To minimize energy expenditure and ensure a smooth trajectory for the CoM, it is essential to minimize velocities along the y and z axes, as well as accelerations. The cost functions of the CoM states are given by:

$$J_{{{\mathbf{c}}^{a,v} }} = \int_{{t_{0} }}^{{t_{F} }} {\omega_{{\mathbf{c}}}^{a,v} \left\| {{\mathbf{c}}^{a,v} (t) - {\tilde{\mathbf{c}}}^{a,v} (t)} \right\|}^{2} .$$
(4)
  1. 2.

    CoP position

To preserve the stability of the robot’s motion and minimize the impact of GRFs on the angular momentum of the system, it is crucial for the CoP to be situated as close to the foothold as possible. As a result, it is important for the GRF vector (from CoP to CoM) to align with the vector of the leg. The alignment between these two vectors can be mathematically quantified using the cross-multiplication of the vectors. This leads to the formulation of the CoP position cost function:

$$J_{{{\text{cop}}}} = \int_{{t_{0} }}^{{t_{F} }} {\omega_{p} \left\| {{\mathbf{r}}(t) \times {\mathbf{f}}(t)} \right\|}^{2} .$$
(5)

The dynamics of the system can be incorporated into the optimization problem by utilizing Eq. (1) to calculate the GRF.

  1. 3.

    CoM height

    The significant fluctuation of the CoM height is generally detrimental for legged robots without effective energy storage components. This can result in unstable motion, significant variations in GRFs, and an increase in energy consumption. To prevent the height of the CoM from hitting the upper or lower limits and to mitigate the fluctuation of the CoM height, a cost function is formulated as:

    $$J_{z} = \int_{{t_{0} }}^{{t_{F} }} {\omega_{z} \left\| {{\mathbf{r}}^{z} (t) - {\tilde{\mathbf{r}}}^{z} (t)} \right\|}^{2} .$$
    (6)

For walking gait, it is reasonable to establish a desired CoM height that remains constant, which should be determined based on the robot structure. However, for jumping gait, it is challenging to predefine an energy-efficient CoM height function. For simplicity, the desired CoM height is set to be constant for the nature trajectory generated by the optimizer.

3.2 Constraints

Throughout the whole task, the system states’ reachable constraint space can be uniformly expressed as:

$$\left\{ \begin{gathered} \chi (t) - \chi_{{{\text{upp}}}} \le 0 \hfill \\ \chi_{{{\text{low}}}} - \chi (t) \le 0 \hfill \\ \end{gathered} \right..$$
(7)
  1. 1.

    CoM states

    When taking into account the specific mode of walking and the limitations imposed by the robot's actuators, the constraints of the CoM states can be described as:

    $$\left\{ \begin{gathered} {\mathbf{c}}^{a,v,p} (t) - {\mathbf{c}}_{{{\text{upp}}}}^{a,v,p} \le 0 \hfill \\ {\mathbf{c}}_{{{\text{low}}}}^{a,v,p} - {\mathbf{c}}^{a,v,p} (t) \le 0 \hfill \\ \end{gathered} \right..$$
    (8)

    It is important to note that during the aerial phase of jump gait, the acceleration of the CoM is restricted to gravity acceleration.

  1. 2.

    Leg length

To ensure consistency with the full-body model of the legged robot, the point-mass model has defined upper and lower limits for the length of the legs, which are represented by:

$$\left\{ \begin{gathered} \left\| {{\mathbf{r}}(t)} \right\|^{2} - {\mathbf{r}}_{{{\text{upp}}}}^{2} \le 0 \hfill \\ {\mathbf{r}}_{{{\text{low}}}}^{2} - \left\| {{\mathbf{r}}(t)} \right\|^{2} \le 0 \hfill \\ \end{gathered} \right..$$
(9)
  1. 3.

    Step length

    The step length between any two adjacent footholds is restricted by both the length of the legs and the stride length of the robot. As a result, the step length constraint can be expressed as:

    $$\left\{ \begin{gathered} (p^{f,xy} (t + T_{{{\text{step}}}} ) - p^{f,xy} (t)) - l_{{{\text{upp}}}}^{df,xy} \le 0 \hfill \\ l_{{{\text{low}}}}^{df,xy} - (p^{f,xy} (t + T_{{{\text{step}}}} ) - p^{f,xy} (t)) \le 0 \hfill \\ \end{gathered} \right.,$$
    (10)

    where \(l_{{{\text{upp}}}}^{df,xy}\) and \(l_{{{\text{low}}}}^{df,xy}\) denote the upper and lower limits of the step length, respectively.

    Furthermore, it can be noted that the position of the foothold remains fixed throughout each step duration, which can be written as:

    $$p^{f} (t) \equiv {\text{C, }}t \in [t_{{{\text{step0}}}} ,\;t_{{{\text{step0}}}} + T_{{{\text{step}}}} ).$$
    (11)
  2. 4.

    CoP constraint

    Similar to the contact sequences of bipedal walking, quadrupedal trotting is an efficient gait among the various forms of locomotion. The constraints on the CoP for both quadrupedal trotting and bipedal locomotion are depicted in Fig. 3.

    Fig. 3
    figure 3

    Locomotion of the point-mass model walking \(\overline{m }\) steps forward and then stop. The red line segments represent the finite-sized feet centered at \(p^{f}\) to constrain the CoP location \(p^{p}\)

The analysis of the walking process from the perspective of the point-mass model demonstrates that the CoP cannot be situated too far from the relevant foothold. This is due to the fact that (1) the CoP must be situated within the support polygon; (2) the point-mass model does not account for the momentum generated by the dynamics of the legged robots. As the distance between the center of pressure and the current foothold increases, the discrepancy between the model and reality increases in the reduced model.

Figure 4 demonstrates the constraints of the CoP for quadrupedal (trotting) and bipedal gaits. The cross symbol indicates the geometric center of the conservative CoP constraints. As illustrated by the blue rectangle in Fig. 4, the conservative constraint on the position of the CoP position can be expressed mathematically as:

$$\left\{ \begin{gathered} (p^{p,xy} (t) - p^{f,xy} (t)) - l_{{{\text{upp}}}}^{p,xy} \le 0 \hfill \\ l_{{{\text{low}}}}^{p,xy} - (p^{p,xy} (t) - p^{f,xy} (t)) \le 0 \hfill \\ \end{gathered} \right.,$$
(12)

where \(l_{{{\text{upp}}}}^{p,xy}\) and \(l_{{{\text{low}}}}^{p,xy}\) indicate the upper and lower limits of the conservative CoP constraint, respectively.

Fig. 4
figure 4

CoP constraints for quadrupedal (trotting) and bipedal gaits. The cross symbol indicates the geometric center of the conservative CoP constraints (blue rectangle)

  1. 5.

    Trotting CoP constraint

For a quadrupedal trot with a point foot, as depicted in Fig. 4, the CoP position must be located on the line-contact constraint made by the two diagonal support feet. The green line (conservative CoP constraint) is situated on the plane defined by the three points: CoM, CoP, and the foothold. Consequently, the cross product of the leg vector and the GRF vector generates a normal vector to the plane, which is orthogonal to the green line. This relationship can be utilized as a constraint on the position of the CoP while trotting. The constraint can therefore be mathematically expressed as:

$$({\mathbf{r}}(t) \times {\mathbf{f}}(t)) \cdot {\mathbf{r}}_{{{\text{trot}}}} (t) = 0,$$
(13)

where \({\mathbf{r}}_{{{\text{trot}}}} (t)\) indicates the vector formed by the points of the two diagonal support feet, namely \({\mathbf{r}}_{{{\text{trot}}}} (t) = \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{{p_{{{\text{HL}}}} p_{{{\text{FR}}}} }}\) in Fig. 4. Note that the above equation remains valid even if the leg vector and the GRF vector are coincident.

Moreover, the green line remains constant throughout each step duration given by:

$${\mathbf{r}}_{{{\text{trot}}}} (t) \equiv C,\;t \in [t_{{{\text{step0}}}} ,\;t_{{{\text{step0}}}} + T_{{{\text{step}}}} ).$$
(14)

By combining the constraint of Eq. (12), a conservative constraint on the CoP is established for quadrupedal trotting. It should be emphasized that the CoP constraint for bipedal locomotion and quadrupedal walking during the four-foot support phase can be defined without the constraint of Eq. (13). Additionally, the quadrupedal pace gait can also be formulated by modifying vector \({\mathbf{r}}_{{{\text{trot}}}} (t)\) with the support feet on the same side.

  1. 6.

    Terrain surface

Physical constraints dictate that the foothold must be restricted to the plane of the terrain. Once the x- and y- coordinates of the foothold have been determined, the z- coordinate can be calculated using the function of the terrain surface as specified by:

$$p^{f,z} (t) = f_{{{\text{terr}}}} (p^{f,x} (t),\;p^{f,y} (t)),$$
(15)

where \(f_{{{\text{terr}}}} ( \cdot )\) indicates the function of the terrain surface.

  1. 7.

    Obstacle avoidance

To avoid collisions with the environment, obstacle avoidance is a crucial aspect of motion planning. Due to the diversity of obstacle shapes and sizes, it is challenging to describe them in a general mathematical form. As an approximation, surrounding obstacles are represented using minimum enveloping cylinders. Therefore, obstacle avoidance constraints on the ground (vertical cylinders) and in the air (horizontal cylinders) can be expressed as:

$$\left\{ \begin{gathered} (r_{{{\text{obs}}}} + r_{{{\text{sec}}}} )^{2} - (c^{x} (t) - P_{{{\text{obs}}}}^{x} )^{2} - (c^{y} (t) - P_{{{\text{obs}}}}^{y} )^{2} \le 0 \hfill \\ (r_{{{\text{obs}}}} + r_{{{\text{sec}}}} )^{2} - (c^{x} (t) - P_{{{\text{obs}}}}^{x} )^{2} - (c^{z} (t) - P_{{{\text{obs}}}}^{z} )^{2} \le 0 \hfill \\ \end{gathered} \right.,$$
(16)

where \(r_{{{\text{obs}}}}\) represents the radius of the cylinder and \(r_{{{\text{sec}}}}\) represents the security threshold radius of the obstacle. The vertical cylinders and horizontal cylinders are centered at \((P_{{{\text{obs}}}}^{x} ,\,P_{{{\text{obs}}}}^{y} ,\,0)\) and \((P_{{{\text{obs}}}}^{x} ,\,0,\,P_{{{\text{obs}}}}^{z} )\), respectively.

Additionally, the constraint for the footholds in obstacle avoidance can also be mathematically represented in a similar manner.

$$(r_{{{\text{obs}}}} + r_{{{\text{sec}}}} )^{2} - (P^{f,x} (t) - P_{{{\text{obs}}}}^{x} )^{2} - (P^{f,y} (t) - P_{{{\text{obs}}}}^{y} )^{2} \le 0.$$
(17)
  1. 8.

    Final CoM velocity

The legged robot should approach the final position in a steady manner, with a final speed of zero. Therefore, the constraint on the final CoM velocity can be represented as:

$$\left\{ \begin{gathered} \dot{c}(t_{{\text{F}}} ) - \dot{c}_{{{\text{Fupp}}}} \le 0 \hfill \\ \dot{c}_{{{\text{Flow}}}} - \dot{c}(t_{{\text{F}}} ) \le 0 \hfill \\ \end{gathered} \right..$$
(18)

Ideally, the CoM velocity should be zero, \(\dot{c}_{{{\text{Flow}}}} = \dot{c}_{{{\text{Fupp}}}} = 0\), at the final position. However, in practice, ensuring a CoM velocity of zero at the final goal can significantly reduce the range of feasible numerical solutions for off-line optimization. Furthermore, the presence of external disturbances and limitations on the CoP and foothold constraints can make the off-line optimization problem computationally expensive or even unsolvable. To mitigate these issues, it is reasonable to allow for a small non-zero final velocity in the optimization formulation, which allows for the CoM to be brought to a stop through the use of a large support polygon. This approach allows for an appropriate extension of the final CoM velocity constraints, which increases the feasible range of optimal solutions and reduces computational costs.

  1. 9.

    Final CoM position

To achieve a specific final position and stop for a legged robot, a constraint on the final position of the CoM can be written as:

$$\left\{ \begin{gathered} c^{xy} (t_{F} ) - c_{{{\text{fupp}}}}^{xy} \le 0 \hfill \\ c_{{{\text{flow}}}}^{xy} - c^{xy} (t_{F} ) \le 0 \hfill \\ \end{gathered} \right..$$
(19)

In the same way, if there is no strict restriction on the final position, the boundary for the final CoM position can be expanded. By combining this constraint with the terrain surface constraint represented in Eq. (15), discrete-time decision variables can be defined to numerically solve the continuous-time optimization problem. The optimization problem is formulated by discretizing the continuous-time decision variables into a finite number of variables.

$${{\varvec{\upchi}}} = [\ddot{c}_{1} , \ldots ,\ddot{c}_{{\overline{n}}} ,p_{1}^{f,xy} , \ldots ,p_{{\overline{m}}}^{f,xy} ]^{{\text{T}}} ,$$
(20)

where \(\overline{m} = \frac{{t_{{\text{F}}} - t_{0} }}{{T_{{{\text{step}}}} }}\), \(\overline{n} = \frac{{t_{{\text{F}}} - t_{0} }}{T}\). By entering the number of steps \(\overline{m}\) and a fixed duration, the number of samples \(\overline{n}\) in the discrete-time optimization problem is fixed.

The above optimization problem can be further formulated mathematically as:

$$\begin{gathered} \mathop {\min }\limits_{{{{\varvec{\upchi}}}(t)}} \quad J_{{\overline{n}}} ({{\varvec{\upchi}}}) = \sum\limits_{i = 1}^{{\overline{n}}} {\omega_{{{\varvec{\upchi}}}} \left\| {f({{\varvec{\upchi}}}_{i} ) - f({\tilde{\mathbf{\chi }}}_{i} )} \right\|^{2} } \hfill \\ {\text{s}}{\text{.t}}{.}\quad \;\left\{ \begin{gathered} {{\varvec{\upchi}}}_{i} - {\tilde{\mathbf{\chi }}}_{{i,{\text{upp}}}} \le 0 \hfill \\ {\tilde{\mathbf{\chi }}}_{{i,{\text{low}}}} - {{\varvec{\upchi}}}_{i} \le 0 \hfill \\ \end{gathered} \right.. \hfill \\ \end{gathered}$$
(21)

Based on the optimization formulation, the long-term optimal CoM trajectory and footholds can be solved off-line.

4 4. Reactive Motion Planning

While off-line pre-planning can produce a long-term optimal solution, uncertainties arise during execution, such as model mismatches in both the robot and the terrain, and external disturbances. These uncertainties can lead to deviation from the pre-planned CoM trajectory and footholds, which can inevitably affect the states of the system as it moves. This deviation can potentially result in failure as the off-line optimization is based on an ideal scenario and cannot respond to online changes and adaptations. To overcome this limitation, it is necessary to implement a reactive motion planning approach, which is capable of adapting to the changing states of the system in real time, and can handle uncertainties in an online manner.

In this section, a reactive motion planner using a MPC approach can be implemented in an online manner. This approach not only allows for the finding of a local optimum solution based on current state feedback, but also aims to closely track the long-term optimum while satisfying current constraints. Additionally, the use of an online reactive motion planner has the advantage of efficient convergence and a reduced likelihood of getting trapped in local minima, as it is guided by the results of off-line optimization.

In this section, the implementation of a reactive motion planner using MPC aims to not only find a locally optimal solution based on current state feedback, but also to closely track the long-term optimum while satisfying current constraints. Additionally, online reactive motion planning, guided by off-line optimization results, has the advantage of efficient convergence and avoiding getting trapped in local minima.

Given that the footholds are only related to the GRFs to accelerate and decelerate the CoM, this work focuses on the CoM trajectory. As illustrated in Fig. 5, under significant unexpected uncertainties, e.g., a large offset, it is often impossible for the actual CoM trajectory to precisely track the long-term optimum. Instead, a tractable solution is to generate intermediate, transitional trajectories that converge to the long-term optima.

Fig. 5
figure 5

Attraction vector field for converging the CoM trajectory: the long-term optimum (\({\tilde{\mathbf{v}}}_{k}^{ref}\)) and the new convergent velocity (\({\tilde{\mathbf{v}}}_{k}\)) under the unexpected disturbances

As illustrated in Fig. 5, the black line indicates the long-term optimal CoM trajectory projected on the xy plane. The light red dashed line is the actual CoM trajectory with the current CoM position pk at the kth sample instant. The current CoM position deviates from the long-term trajectory due to disturbances. As can be seen from the figure, the current ideal CoM velocity generated off-line should be the vector vk, which is equal to the reference velocity \({\tilde{\mathbf{v}}}_{k}^{{{\text{ref}}}}\) on the foot \({\tilde{\mathbf{p}}}_{k}^{{{\text{ref}}}}\) of the perpendicular of the point pk onto the long-term trajectory. In the theoretical case, the CoM position in the future would move along the gray curve parallel to the long-term optimal trajectory without converging to the long-term optimum. Therefore, in addition to the reference velocity calculated off-line, another velocity, namely the convergence velocity, is required for the CoM position to converge to the long-term trajectory. Otherwise, the walking task would fail as the position offset accumulates.

To address the problem of the CoM position's convergence, we utilize the concept of vector field. Without loss of generality, the distance along the long-term optimal trajectory is defined as the x-coordinate, the perpendicular distance of any points in space is defined as the y-coordinate. Only the CoM trajectory projected on the xy 2D plane is considered as the height of the CoM is significantly affected by terrain, robot structures, and the walking gait.

We define the desired velocity \({\tilde{\mathbf{v}}}_{k}\) and the corresponding convergence velocity \({\mathbf{v}}_{k}^{cov}\) at the current sample instant \(t_{k}\) as follows:

$${\tilde{\mathbf{v}}}_{k} = {\tilde{\mathbf{v}}}_{k}^{{{\text{ref}}}} + {\mathbf{v}}_{k}^{{{\text{cov}}}} ,$$
(22)
$${\mathbf{v}}_{k}^{{{\text{cov}}}} = {\mathbf{v}}_{{\text{C}}} \left( {\frac{{\varepsilon_{k} }}{d}} \right)^{\kappa } ,$$
(23)

where \({\mathbf{v}}_{{\text{C}}} \equiv {\text{C}}\) is the maximum safe convergence velocity, which remains constant. \(\kappa \ge 1\) is the convergence ratio. \(\varepsilon_{k}\) indicates the perpendicular distance between the current CoM position \({\mathbf{p}}_{k}\) and the long-term trajectory. d is the allowable perpendicular distance threshold, which defines a convergence space. Thereby, \(\varepsilon_{k}\) is bounded within d. As long as the CoM position is within this space, the vector field formula can guide the CoM toward the long-term optimal trajectory.

The convergence space, depicted as the gray area in Fig. 6, is set here to ensure that the deviation of the actual CoM trajectory from the long-term optimum is not too large throughout the entire walking task. If the actual CoM position falls outside of this convergence space, even if the actual CoM eventually converges to the long-term trajectory, but the deviation may become too large, resulting in an online trajectory that is not optimal for the entire locomotion. In addition, using the off-line footholds as initial guesses may not significantly reduce the computation time of MPC as expected. To prevent the long-term optimization result from becoming infeasible for reactive re-planning, it is necessary to interrupt the ongoing re-planning process and bring the robot to a halt. Restart a new long-term domain pre-planning with the current state as a starting point, followed by a reactive re-planning motion planning toward the target location. It should be noted that this work does not address the optimization problem of MPC within a small sampling interval in the order of milliseconds.

Fig. 6
figure 6

Convergence of the vector field guidance: the blue lines are the re-generated references, which converges to the original reference

To generate a new desired velocity trajectory within the prediction horizon for the MPC, an iterative algorithm is applied to calculate the future convergence velocity trajectory at each control loop:

$$\left\{ \begin{gathered} \varepsilon_{k,1} = \varepsilon_{k} - {\mathbf{v}}_{k}^{{{\text{cov}}}} T,\quad \quad \;\,{\mathbf{v}}_{k,1}^{{{\text{cov}}}} = {\mathbf{v}}_{{\text{C}}} \left( {\frac{{\varepsilon_{k,1} }}{d}} \right)^{\kappa } \; \hfill \\ \varepsilon_{k,2} = \varepsilon_{k,k + 1} - \, {\mathbf{v}}_{k,1}^{{{\text{cov}}}} T,\quad {\mathbf{v}}_{k,2}^{{{\text{cov}}}} = {\mathbf{v}}_{{\text{C}}} \left( {\frac{{\varepsilon_{k,2} }}{d}} \right)^{\kappa } \hfill \\ \cdots \hfill \\ \varepsilon_{{k,\hat{n}}} = \varepsilon_{{k,\hat{n} - 1}} - {\mathbf{v}}_{{k,\hat{n} - 1}}^{{{\text{cov}}}} T,\quad {\mathbf{v}}_{{k,\hat{n}}}^{{{\text{cov}}}} = {\mathbf{v}}_{{\text{C}}} \left( {\frac{{\varepsilon_{{k,\hat{n}}} }}{d}} \right)^{\kappa } \hfill \\ \end{gathered} \right.,$$
(24)

where \(\hat{n}\) indicates the prediction horizon, defined here as the duration of the following three steps for the online MPC.

Then the desired velocity trajectory can be given by:

$${\tilde{\mathbf{v}}}_{k,i} = {\tilde{\mathbf{v}}}_{k + i}^{{{\text{ref}}}} + {\mathbf{v}}_{k,i}^{{{\text{cov}}}} ,\quad i = 1, \ldots ,\hat{n}.$$
(25)

Considering the actual CoM trajectory on both sides of the long-term trajectory, the y-coordinates on the left side of the long-term trajectory are defined as positive, and negative on the right side. The vector field trajectory tracking in the xy plane is illustrated in Fig. 6.

The reference velocity (on the x axis) is assumed to be constant. The black arrows represents the desired velocity attraction within the convergence space, which is represented by the gray area in the figure. The red dots indicate two examples of disturbed CoM positions. With the guidance of the velocity vector field, the new CoM trajectories (blue lines) are re-generated as the further updated reference for reactive online MPC planning.

5 Results

The proposed reactive motion planning approach is conducted to demonstrate the adaptability and robustness of the generated locomotion using the nonlinear programming solver Knitro [24]. Firstly, based on previous work on the motion parameters of the legged robot, we have assumed a set of duration parameters for motion planning implementation, as shown in Table 1. In addition, the nominal height of the CoM is defined as 0.54 m according to the height of the hip joint.

Table 1 Related duration value for motion planning implementation

A series of simulations were carried out to evaluate the performance of the walking gait, jumping gait, and the reaction to external disturbances. It should be noted that the optimization model for bipedal walking and quadrupedal trotting are identical, except for the addition of a CoP constraint for trotting. Therefore, a trotting gait is used in this work. Here, “walking gait” refers to a trot without four-feet stance phase, and “jumping” means a flying trot.

5.1 Walking and Jumping Gaits

Firstly, we carried out walking and jumping based on the proposed approach. Figure 7 depicts the results of the motion planning for walking trot on flat terrain from (0, 0) to (2.0, 0). The red line represents the CoM trajectory, on which the CoM positions (red dots) are plotted at each sampling instant. The virtual legs of the point-mass model are indicated by blue line segments at each sampling interval, while the green dots on the ground (orange translucent plane) represent footholds and the pink dots indicate the CoP.

Fig. 7
figure 7

Reactive motion planning for walking gait on flat terrain from (0, 0) to (2.0, 0)

As shown in Fig. 7, the height of the CoM (0.520–0.564 m) remains relatively constant. Significant changes in the height of the CoM occur during the start and stop periods, with a fluctuation in the range of about ± 3.7%. The performance of the generic point-mass model is found to be similar to that of the fixed height linear inverted pendulum. Furthermore, note that the fluctuation in the y-direction is less than 2.8 cm, which is caused by the diagonal CoP constraint.

Figure 8 shows the jumping results of the reactive motion planning on sloped terrain. Set the duration of the support and aerial phases as shown in Table 1. In comparison to the walking gait, the jumping gait differs in that the acceleration of the CoM is constrained by the gravitational acceleration in the aerial phase. The proposed approach is able to generate a jumping trot with an aerial phase (represented by the red dotted curves).

Fig. 8
figure 8

Reactive motion planning for jumping gait on a slope from (0, 0) to (2.5, 0)

It is observed that the jumping trot gait exhibits a spring-like property, similar to that of the spring-loaded inverted pendulum, without the need for an explicit, parametric spring-loaded pendulum model, which is another classic simplified model of legged robots. The proposed approach possesses the capability to achieve the dynamic behavior of both the LIP model and the spring-loaded inverted pendulum model.

It is worth noting that in both Figs. 7 and 8, the positions of the CoP are close to the corresponding footholds. As a result, in the absence of an apparent lateral GRF applied from the ground, the CoM trajectory does not exhibit significant deviation in the lateral plane, and the acceleration of the CoM is primarily in the forward direction. This explains the observed variations in the height of the CoM during the start and the end phases as a means of compensating for the acceleration and deceleration shown in Fig. 7.

5.2 Gait Transitions on Changing Terrain

The second validation is to traverse over a specific terrain with a step height greater than the robot's climbing ability. As illustrated in Fig. 9, the height is 0.23 m, which is approximately half of the nominal CoM. Using a walking gait would not be feasible in this scenario, so an alternate solution, such as jumping, is proposed as a method of traversal. This approach is akin to the jumping gait employed by animals in similar situations.

Fig. 9
figure 9

Gait transitions to traverse over uneven terrain at the height. The gait is adopted based on walking on a flat surface, and the jumping gait naturally emerged through optimization while walking no longer negotiates with the significant terrain height

When the terrain height changes, it is reasonable to use a jumping gait instead of a walking gait. The traversal performance validates that the proposed approach provides the ability to adapt to vertical surfaces. However, as a proof of concept, the timing of the gait transition here is set manually. This means that the position of each foothold in the online optimization cannot be too far away from the long-term one.

5.3 Obstacle Avoidance and Robustness to Disturbances

In the third validation, the proposed approach was employed to track the long-term optimal solution online to evaluate its performance of obstacle avoidance and reactive motion under several unexpected disturbances. These disturbances were simulated as a constant push applied to the CoM acceleration at a magnitude of 0.3 m/s2, and were introduced during specific steps of the walking gait, specifically, steps 1st–3rd, 10th, 20th, 30th and 35th–36th with each step time of 0.5 s.

The results, as depicted in Fig. 10, demonstrate the reactive performance of obstacle avoidance, in the scenario of dynamic environmental changes that were not considered during the off-line optimization.

Fig. 10
figure 10

Online reactive gait in presence of obstacles and external pushes: the black line is the reference CoM trajectory (long-term trajectory) generated by the off-line optimization; the red line is the online CoM trajectory (reactive trajectory) under unexpected disturbances

As shown in Fig. 10, a horizontal cylinder without black meshes was introduced as a new obstacle. This was used to demonstrate the reactive performance of obstacle avoidance (see zoomed side view of the adaptation of height), in the scenario of dynamic environmental changes that were not considered during the off-line optimization. It can be observed that the CoM trajectories and associated footholds generated both online and off-line can effectively avoid obstacles based on the explicit environmental constraints formulated in Eqs. (16) and (17).

The proposed reactive motion planning approach demonstrates the ability to maintain approximate long-term optimal walking performance even under multiple disturbances. This is shown by the alignment of the reactive CoM trajectory (represented by the red line) with the long-term trajectory (represented by the black line). For clarification, the velocity and position trajectories of the CoM are demonstrated separately in Fig. 11.

Fig. 11
figure 11

Comparison of the reference and actual CoM velocity and position from the off-line and reactive planning under the external disturbances. The black line indicates the reference velocity and position generated off-line, and the red line is the reactive actual trajectories optimized online

The duration of the external disturbances is indicated by the gray-shaded rectangles, with positive and negative disturbances represented by purple + and − signs, respectively. The original long-term motion was planned for 30 steps. In contrast, the actual number of the total steps in the reactive walking motion was increased to 40, as the total duration of the reactive online trajectory was extended due to the external pushes. It should be emphasized that the number of steps is not fixed and may vary depending on whether or not the robot reaches its target.

The reactive MPC motion planning demonstrates the ability to effectively follow a spatial motion trajectory despite temporal deviations. Specifically, when the robot is subject to backward pushes in the 1st and 2nd steps, the reactive CoM velocity deviates significantly from the long-term one. However, through the use of an online MPC strategy that optimize the footholds and the CoM trajectory simultaneously and utilizing a velocity reference vector generated by the vector field as in Eq. (25), the approach is able to effectively follow the spatial motion trajectory with small deviations.

When the long-term trajectory is shifted by approximately 2 s later, the planned long-term trajectory and the reactive planning exhibit an overall match. This demonstrates the ability of the proposed approach to follow the spatial motion regardless of time. It should be emphasized that the desired velocity in Eq. (22) is not an explicit function of time. Instead, it is a constantly updated reference velocity vector based on the long-term trajectory. The proposed approach utilizes a reactive strategy based on the CoM spatial deviation, which serves to alleviate the computational demands associated with time-based re-planning. This feature of the approach serves to emphasize the efficacy of the reactive approach, as it demonstrates robustness against disturbances while simultaneously exhibiting computational efficiency.

In addition, the proposed motion planning was further implemented on a hydraulic quadruped prototype, EHbot. The footholds of the reduced model were mapping onto the diagonal feet of the quadruped robot. An experiment was performed using the previously developed controller [25, 26]. As depicted in Fig. 12, EHbot was able to securely navigate from the original location to its desired target location of (3.5 m, 0.5 m) while avoiding an obstacle centered at (1.0 m, 0.25 m). Currently, the states of the EHbot were estimated using extended Kalman filter (EKF) algorithms. However, due to the inherent noise and drift in the system, the estimated states did not exactly match the real states, especially for the CoM position.

Fig. 12
figure 12

Motion capture system and the experimental situation of the quadruped robot moving toward the target with an obstacle

In order to capture the true position of the EHbot’s CoM for comparison, motion analysis were employed to capture the actual states of the quadruped in real time. The results are depicted in Fig. 13, where the estimated and captured CoM trajectories of the EHbot are compared.

Fig. 13
figure 13

The estimated (blue dashed line) and captured (red solid line) CoM trajectories of the quadruped robot

The blue dashed line and red solid lines indicate the estimated and captured CoM trajectories of the quadruped robot. The coordinates of the estimated and captured targets are (3.528 m, 0.5246 m) and (3.716 m, 0.5527 m), respectively. In comparison with the desired target at (3.5 m, 0.5 m), the deviation of the estimated target of CoM from the desired target is within 4 cm. The errors between the estimated and desired targets of the CoM along the x and y axes are 0.8% and 4.9%, respectively. However, due to the error accumulation of the CoM over time, the deviation of the captured target from the desired target is about 22 cm. This indicates that the deviation of the CoM target mainly arises from the error in the state estimation of the EKF algorithm (Fig. 14).

Fig. 14
figure 14

Snapshots from the trotting experiment on the quadrupedal prototype, EHbot

6 Conclusions

The proposed reactive motion planning approach simultaneously searches for approximate long-term optimal CoM trajectory and footholds online. The approach is based on a generic point-mass model that generates natural and adaptive walking patterns, such as walking, jumping, and gait transitions. A unified optimization formulation is utilized to solve the long-term motion as off-line planning, which is used to compute the reactive walking online via MPC and vector field guidance. However, there are several limitations. The approach does not consider the torso geometry and multiple feet for legged locomotion without collisions. Additionally, step duration and number of steps are treated as inputs rather than decision variables. Another issue not covered in this work is the automatic invoking of gait transition when terrain changes are detected, which could have significantly improved the adaptability to complex environmental scenarios in future work.