1 Introduction

The gait stability of a bipedal robot is largely determined by its ability to reject the effects of model uncertainties and unknown external disturbances. The rejection of model uncertainties has been extensively studied and addressed through approaches that include exhaustive gait pattern generation [7, 8, 12, 13], \(\mathcal{L}_{1}\) adaptive control [34], sliding mode control [11, 35], robust stabilization of periodic orbits [20, 21], and robust boundedness based on parameter uncertainties [28, 29]. Benefits in robustness have also been explored with the use of a motorized reaction wheel [3, 4]. Overall, these control techniques have achieved successful walking in controlled environments. However, they isolate the problem of dynamic uncertainties from the problem of external disturbance attenuation; thus, external disturbances are not explicitly considered and their rejection is not guaranteed. Rejection of external disturbances (unknown forces and torques) has received less attention, and fewer reports are found in the literature. Available methods make use of gait pattern generators based on heuristics [30, 36] and robust optimization [18, 27]. Conditions to achieve aperiodic gait patterns in the presence of persistent disturbances have been proposed in [40]. A recent development reported in [33] makes use of a robust nonlinear \(\mathcal{H}_{\infty }\) controller that ensures internal asymptotic stability. Despite the effectiveness and conservative nature of this controller, the approach does not allow one to bound the control signals, and the resulting peak torques may be unattainable in a physical implementation. Alternative approaches that address both model uncertainties and external disturbances have not been sufficiently explored.

This work introduces a hybrid disturbance rejection control strategy that rejects both model uncertainties and external disturbances by incorporating two key control elements: (i) a continuous model-based active disturbance rejection control (ADRC) and (ii) a discrete adaptive controller. The continuous model-based ADRC allows disturbance estimation and active rejection in the continuous domain of the robot dynamics. The proposed strategy extends the traditional (non-model-based) ADRC approach, which uses a simplified model to estimate and reject a unified disturbance signal [9]. This traditional control method has been evaluated in a variety of applications such as bipedal robots [1], gait exoskeletons [31], and other mechatronic systems [2, 5, 37]. Although the traditional ADRC approach has shown robust closed-loop behavior, it has also shown limitations in its closed-loop performance caused by the limited use of the available model information [16, 17, 42]. In contrast, the proposed model-based ADRC approach allows one to estimate and reject the effects of model uncertainties and external disturbances using all available model information.

The discrete adaptive controller in the proposed hybrid disturbance rejection control strategy resets the gait trajectories after each support-leg exchange. The trajectories are reset with a control law based on a nonlinear state observer, which estimates the states of the robot after the support-leg exchange. In this way, the design of the hybrid disturbance rejection control strategy considers external disturbances and model uncertainties in both continuous and discrete dynamics. The stability of the gait can be evaluated using a reduced-order dynamics, i.e., an extended hybrid zero dynamics (EHZD).

This paper is organized as follows: Sect. 2 describes the features of a dynamic bipedal robot and its mathematical model. Section 3 presents the proposed model-based ADRC for trajectory tracking. Section 4 proposes a discrete adaptive trajectory generation strategy based on virtual constraints; here, a technique to guarantee the zero dynamics invariance for walking under uncertainties is developed. Section 5.1 develops the EHZD, which includes the uncertainties and disturbances in the zero dynamics. Section 5.2 presents the EHZD-based stability test; here, the Poincaré return map is used to analyze the asymptotic orbital stability of periodic walking under uncertainties. Section 6 contains the numerical evaluation of the proposed strategies. Section 7 presents the description of the Saurian testbed and shows the results of stable walking experiments. Section 8 summarizes the accomplishments of this work and draws recommendations for future developments.

2 Hybrid model

The gait of dynamic bipedal robots is mathematically described with a hybrid dynamic model. This model integrates continuous dynamics and discrete dynamics. The continuous dynamics takes place in the swing gait phase with a single support-leg. The discrete dynamics takes place in a double support phase during the support-leg exchange. The double support phase is considered instantaneous and the support leg-end has unilateral constraints [41]. These constraints allow one to study the contact between the support leg-end and the ground as a passive pivot. They also imply that the normal reaction force in the support leg-end is repulsive and that the tangential reaction does not produce slipping.

The bipedal robot developed in this work is designed with a torso and two identical legs (Fig. 1(a)). The robot has five rigid links that form a planar mechanism (Fig. 1(b)). Each leg has two links connected by a revolute (pin) joint that forms the knee. The leg-ends are point feet without ankles.

Fig. 1 Robot developed at the Control Laboratory in the National University of Colombia, Bogotá

Each of the robot’s legs is an open linkage of two rigid bodies: thigh and shin. Figure 2 shows the linkage corresponding to the left leg, which is identical to that of the right leg. The locations of the (left) hip and knee joints are represented by \(h_{l}\) and \(k_{l}\), respectively. In this figure, \(\tau _{hl}\) and \(\tau _{kl}\) represent the torques from the actuators on the hip and knee mechanisms, respectively. Here, the subindex \(l\) is used for the left side and the subindex \(r\) is used for the right side. The hip and knee mechanisms have two four-bar mechanisms, each connected through a bi-directional spring arrangement with an equivalent torsional spring constant \(E\). The springs provide compliance to the robot’s dynamics, which (i) isolates the actuators from the effect of the impacts produced during the support-leg exchange and (ii) stores the impact energy to be used in the propulsion of the next step. In total, the robot has 11 degrees of freedom (DOFs) distributed as follows:

\(k_{l}\): left knee joint

\(k_{r}\): right knee joint

\(h_{l}\): left hip joint

\(h_{r}\): right hip joint

\(q_{T}\): torso absolute angle

\(p_{1}^{h}\): horizontal position of the support leg-end

\(p_{1}^{v}\): vertical position of the support leg-end

\(k_{l}^{\prime }\): flexible left knee joint

\(k_{r}^{\prime }\): flexible right knee joint

\(h_{l}^{\prime }\): flexible left hip joint

\(h_{r}^{\prime }\): flexible right hip joint

Fig. 2 Torque transmission mechanism

The Lagrange differential equation can be used to derive the mathematical model of the swing phase of a planar dynamic bipedal robot with rigid bodies and serial compliant actuation [39]. This model treats the effect of the springs as part of the input generalized forces; thus, the swing phase dynamics is defined by the Euler–Lagrange equation

$$ D_{s}(q_{s})\ddot{q}_{s}+C_{s}(q_{s}, \dot{q}_{s})\dot{q}_{s}+G_{s}(q _{s})= \varGamma _{s}, $$
(1)

where \(\varGamma _{s}\) is the vector of generalized forces and torques, \(q_{s}:={[\begin{array}{ccccc} h_{l}&h_{r}&k_{l}&k_{r}&q_{T} \end{array}]}^{\mathrm {T}}\) is the generalized coordinates vector shown in Fig. 1(b), \(D_{s}(q_{s})\) is the inertia matrix, \(C_{s}(q_{s}, \dot{q}_{s})\dot{q}_{s}\) is the vector of centripetal and Coriolis effects, and \(G_{s}(q_{s})\) is the vector of torques associated with gravity.

In our model, the vector of generalized forces and torques \(\varGamma _{s}\) is defined as

$$ \varGamma _{s}=B_{s}(q_{s})u+K \bigl(q_{b}-q^{\prime }_{b}\bigr)+\delta (q_{s}, \dot{q}_{s})+\zeta (t), $$
(2)

where \(u\) is the vector of torque control inputs, the input matrix \(B_{s}(q_{s})\) defines how the control inputs \(u\) affect the controlled joints, the position vector \(q_{b}:={[\begin{array}{cccc} h_{l}&h_{r}&k_{l}&k_{r} \end{array}]}^{\mathrm {T}}\) contains the controlled joints, \(q^{\prime }_{b}:={[\begin{array}{cccc} h^{\prime }_{l}&h^{\prime }_{r}&k^{\prime }_{l}&k^{\prime }_{r} \end{array}]}^{\mathrm {T}}\) contains the relative angles of the compliant transmission, and \(\zeta (t)\) is the vector of unknown external disturbances. In order to simplify the model, the compliance is modeled as an external force instead of being included as part of the potential energy [39]. Thus, the interacting forces in the compliant transmission are captured by \(K(q_{b}-q^{\prime }_{b})\), where \(K\) is the matrix with the spring stiffness values \(E\), and \(\delta (q_{s},\dot{q}_{s})\) is a vector that accounts for model mismatch and parameter uncertainties. In this case, \(\delta (q_{s}, \dot{q}_{s})\) includes the unmodeled dynamics of the flexible joints.

The spring torques, model uncertainties, and external disturbances are lumped into a vector of total disturbance signals \(\gamma (x,t)\). In this way, the perturbed model can be expressed by rewriting (1) and (2) in the general input affine form

$$ \dot{x}=f(x) + g(x)u + \gamma (x,t), $$
(3)

where \(x:={\left [\begin{array}{cc} {q_{s}}^{\mathrm {T}}&{\dot{q}_{s}}^{\mathrm {T}} \end{array} \right ]}^{\mathrm {T}}\) is the state space vector in \(\mathbb{R}^{n} \),

$$ f(x):=\left [ \textstyle\begin{array}{c} \dot{q}_{s} \\ D_{s}(q_{s})^{-1} [-C_{s}(q_{s},\dot{q}_{s})\dot{q}_{s}-G_{s}(q _{s}) ] \end{array}\displaystyle \right ], $$
(4)
$$ g(x):=\left [ \textstyle\begin{array}{c} 0 \\ D_{s}(q_{s})^{-1}B_{s} \end{array}\displaystyle \right ], $$
(5)

and

$$ \gamma (x,t):=\left [ \textstyle\begin{array}{c} 0 \\ D_{s}(q_{s})^{-1} [K(q_{b}-q^{\prime }_{b})+\delta (q_{s}, \dot{q}_{s})+\zeta (t) ] \end{array}\displaystyle \right ]. $$
(6)
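The assembly of the nominal vector fields (4) and (5) from the Euler–Lagrange terms can be sketched numerically. The following Python fragment is only an illustration: the helper name `make_input_affine` and the one-link pendulum used to exercise it are assumptions introduced here, not the five-link robot model, and the disturbance term \(\gamma (x,t)\) of (6) is omitted.

```python
import numpy as np

def make_input_affine(D, C, G, B):
    """Build f(x) and g(x) of (4)-(5) from callables D(q), C(q, dq), G(q), B(q)."""
    def f(x):
        n = x.size // 2
        q, dq = x[:n], x[n:]
        # Lower block of (4): D^{-1} [ -C dq - G ], solved without forming the inverse
        ddq = np.linalg.solve(D(q), -C(q, dq) @ dq - G(q))
        return np.concatenate([dq, ddq])

    def g(x):
        n = x.size // 2
        q = x[:n]
        Bq = np.atleast_2d(B(q))
        # Upper block is zero, lower block is D^{-1} B, as in (5)
        return np.vstack([np.zeros_like(Bq), np.linalg.solve(D(q), Bq)])

    return f, g

# Hypothetical one-link pendulum (mass, length, gravity are illustrative values)
m, l, grav = 1.0, 1.0, 9.81
D = lambda q: np.array([[m * l**2]])
C = lambda q, dq: np.array([[0.0]])
G = lambda q: np.array([m * grav * l * np.sin(q[0])])
B = lambda q: np.array([[1.0]])

f, g = make_input_affine(D, C, G, B)
x = np.array([np.pi / 2, 0.0])   # horizontal link, at rest
x_dot_unforced = f(x)            # drift term of (3) with u = 0 and gamma = 0
```

Solving the linear system with `np.linalg.solve` avoids an explicit inverse of the inertia matrix, which is usually the better-conditioned choice.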

The support-leg exchange is assumed to be an inelastic impact that takes place during an instantaneous double-leg support event [24]. In our model, this impact produces sudden changes in the angular velocities of the joints and triggers the reset function

$$ x^{+} := \Delta \bigl(x^{-}\bigr)+\gamma _{\Delta }, $$
(7)

where \(x^{-}\) and \(x^{+}\) are the state variables just before and after the support-leg exchange, respectively, \(\gamma _{\Delta }\) models the uncertainties in the reset function, and

$$ \Delta \bigl(x^{-}\bigr):=\left [ \textstyle\begin{array}{c} \Delta _{q} (q_{s}^{-} ) \\ \Delta _{\dot{q}} (q_{s}^{-} )\dot{q}_{s}^{-} \end{array}\displaystyle \right ], $$
(8)

where \(\Delta _{q} (q_{s}^{-} )\) represents a rearrangement of the position vector \(q_{s}\) and \(\Delta _{\dot{q}} (q_{s}^{-} ) \dot{q}_{s}^{-}\) represents the angular velocity changes.

The hybrid model considers the continuous model (3) and the discrete reset function (7) as

$$ \textstyle\begin{array}{c} \varSigma :\left \{ \textstyle\begin{array}{l@{\quad}l} \dot{x}=f(x) + g(x)u + \gamma (x,t), & x^{-} \notin \mathcal{S}, \\ x^{+} =\Delta (x^{-})+\gamma _{\Delta }, & x^{-}\in \mathcal{S}, \end{array}\displaystyle \right . \end{array} $$
(9)

where the switching set is defined as

$$ {\mathcal{{S}}}:=\left \{ {\left [\textstyle\begin{array}{c@{\quad}c} {q_{s}}^{\mathrm {T}}&{\dot{q}_{s}}^{\mathrm {T}} \end{array}\displaystyle \right ]}^{\mathrm {T}} \in \mathbb{R}^{n} \, | \, p_{2}^{v}(q_{s})=d, \ \dot{p}_{2}^{v}(q_{s}, \dot{q}_{s})< 0\right \}, $$
(10)

where \(p_{2}^{v}(q_{s})\) is the vertical Cartesian position of the swing leg-end, and \(d\) is the terrain height, which under nominal conditions is \(d=0\).
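The structure of the hybrid model (9) with the switching set (10) can be illustrated with a minimal Python sketch. It is not the robot model: the flow `f`, the functions `height` and `height_rate`, and the `reset` map below are hypothetical stand-ins for a toy point mass, the uncertainty terms \(\gamma \) and \(\gamma _{\Delta }\) are set to zero, and a fixed-step Euler integrator is assumed purely for brevity.

```python
import numpy as np

def simulate_hybrid(f, height, height_rate, reset, x0, d=0.0, dt=1e-3, t_end=1.0):
    """Integrate x' = f(x) and apply the reset map when the guard of (10) is met."""
    x = np.asarray(x0, dtype=float)
    traj, impacts = [x.copy()], 0
    for _ in range(int(round(t_end / dt))):
        x_next = x + dt * f(x)                      # continuous phase (Euler step)
        # Guard: height has reached the terrain level d while moving downward
        if height(x_next) <= d and height_rate(x_next) < 0.0:
            x_next = reset(x_next)                  # discrete phase: x+ = Delta(x-)
            impacts += 1
        x = x_next
        traj.append(x.copy())
    return np.array(traj), impacts

# Toy example: state x = [z, vz], a point mass falling under gravity
f = lambda x: np.array([x[1], -9.81])
height = lambda x: x[0]
height_rate = lambda x: x[1]
reset = lambda x: np.array([0.0, -0.5 * x[1]])     # hypothetical impact map

traj, impacts = simulate_hybrid(f, height, height_rate, reset, x0=[1.0, 0.0])
```

In a fixed-step scheme the guard is detected one step late at most; an event-detecting integrator would locate the crossing more precisely.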

3 Model-based active disturbance rejection control (ADRC) for trajectory tracking

In order to provide robustness to the control of continuous dynamics against unknown model uncertainties and external disturbances, active disturbance rejection control (ADRC)-based tracking has been successfully utilized [10, 23]. The ADRC-based tracking collects both endogenous (state-dependent) disturbances and exogenous (external force-dependent) disturbances into a lumped signal referred to as the total disturbance. The total disturbance is treated as an unknown bounded signal with \(m\) continuous and bounded derivatives [37]. The core component of the ADRC-based tracking is the design of an extended state observer (ESO) that estimates the total disturbance, which is then actively rejected through feedback control [23]. With this approach, the nonlinearities of the system are represented in a simplified model, affine in the control input, with a chain of integrators and the total disturbance. This approach has been shown to handle differences between the dynamics of the physical robot and its mathematical model, driving the tracking errors to small, acceptable values [17]. Unfortunately, neglecting the system nonlinearities drastically reduces the performance of the closed-loop system.

In this work, the trajectory tracking incorporates a model-based ADRC for hybrid dynamical systems that considers all the known system nonlinearities. In this approach, a nonlinear ESO estimates the total disturbance as well as the state variables. The design of the proposed model-based ADRC is divided into three stages. First, a local coordinate transformation is performed to express the robot model into a normal form. Second, a nonlinear extended state observer (NESO) is designed to estimate the total disturbances in the robot. Finally, a feedback control law is proposed to perform an active cancellation of the disturbances.

3.1 Local coordinate transformation

A local coordinate transformation is proposed to express the model of the robot into a normal form. To this end, the controlled output vector \(h(x)\) (tracking error) is expressed as a function of the control-input vector \(u\) (torque). This transformation formulates an explicit expression for the underactuated dynamics and decomposes the model into a reachable part and an unreachable part. The transformation also reveals important properties of the model such as its relative degree. In this way, the output vector, which is a function of the generalized coordinates \(q_{s}\), is defined as

$$ y:=h(x)=q_{d}(q_{s})-q_{b}, $$
(11)

where \(q_{b}\) is the vector of controlled joints and \(q_{d}(q_{s})\) is the vector of target trajectories. In order to express the output vector as a function of the control-input vector, successive time differentiations of (11) are performed until the control-input terms appear explicitly. That is,

$$\begin{aligned} \frac{dy}{dt} &=\frac{\partial h}{\partial x}\dot{x} \end{aligned}$$
(12)
$$\begin{aligned} &=\left [ \textstyle\begin{array}{c@{\quad}c} \frac{\partial h}{\partial q_{s}}&\frac{\partial h}{\partial \dot{q} _{s}} \end{array}\displaystyle \right ] \bigl[f(x) + g(x)u + \gamma (x,t) \bigr] \end{aligned}$$
(13)
$$\begin{aligned} &=\nabla hf(x)+\nabla hg(x)u+\nabla h\gamma (x,t) \end{aligned}$$
(14)
$$\begin{aligned} &=L_{f}h+L_{g}hu+L_{\gamma }h, \end{aligned}$$
(15)

where \(h\) is the short expression for \(h(x)\) and \(L_{f}h\), \(L_{g}h\), \(L_{\gamma }h\), are the Lie derivatives of \(h\) along the vector fields \(f\), \(g\), and \(\gamma \), respectively. Given that \(h\) is independent of \(\dot{q}_{s}\), then

$$ \frac{\partial h}{\partial \dot{q}_{s}}=0, $$
(16)

therefore,

$$ \nabla h=\left [ \textstyle\begin{array}{c@{\quad}c} \frac{\partial h}{\partial q_{s}}&0 \end{array}\displaystyle \right ]. $$
(17)

Based on the structure of (5), (6) and (17), the Lie derivatives of \(h\) along \(g\) and \(\gamma \) are

$$ L_{g}h = 0, \hspace{1em} \mbox{and} \hspace{1em} L_{\gamma }h = 0. $$

Then the time derivative of the output can be expressed as

$$ \frac{dy}{dt}=L_{f}h. $$
(18)

Given that (18) is independent of \(u\), an additional time derivative of the output is applied,

$$\begin{aligned} \frac{d^{2}y}{dt^{2}} &= \left [ \textstyle\begin{array}{c@{\quad}c} \frac{\partial }{\partial q_{s}} (\frac{\partial h}{\partial q _{s}}\dot{q}_{s} )&\frac{\partial h}{\partial q_{s}} \end{array}\displaystyle \right ] \bigl[f(x) + g(x)u + \gamma (x,t) \bigr], \end{aligned}$$
(19)
$$\begin{aligned} &= L^{2}_{f}h+L_{g}L_{f}hu+L_{\gamma }L_{f}h, \end{aligned}$$
(20)

where \(L^{2}_{f}h:=L_{f} (L_{f}h )\), \(L_{g}L_{f}h=\frac{ \partial h}{\partial q_{s}}D_{s}(q_{s})^{-1}B_{s}\) is a known decoupling matrix that is locally invertible [41], and \(L_{\gamma }L_{f}h\) is a vector with the lumped total disturbance signals. The input-state interaction found in (20) implies that the robot has a relative degree equal to the sum of the relative degrees associated with each output, which are defined by the number of time derivatives required to make the controlled output a function of the control signal. That is, \(r=r_{1} + \cdots + r_{k}\), where \(k\) is the number of controlled joints. Since each output requires two differentiations, the relative degree for the bipedal robot considered in this work, which has one degree of underactuation, is \(r=2k=n-2\), where \(n\) is the number of state variables in the swing phase. After the definition of the relative degree \(r\), it is possible to define a mapping

$$ \varPhi (x):={\left [ \textstyle\begin{array}{ccccccccc} \phi _{1,1}(x),&\cdots ,&\phi _{1,k}(x),&\phi _{2,1}(x),&\cdots ,&\phi _{2,k}(x),&\phi _{r+1}(x),&\cdots ,&\phi _{n}(x) \end{array}\displaystyle \right ]}^{\mathrm {T}}, $$
(21)

such that the Jacobian matrix of \(\varPhi (x)\) at the equilibrium point \(x^{\circ }\) is nonsingular. Then \(\varPhi (x)\) is a local coordinate transformation of (3), that is, it is locally invertible in the neighborhood of \(x^{\circ }\) [26, Sect. 4.1].

Let us define the first \(r\) coordinate transformation functions as

$$ \xi =\left [ \textstyle\begin{array}{c} \xi _{1,1} \\ \vdots \\ \xi _{1,k} \\ \xi _{2,1} \\ \vdots \\ \xi _{2,k} \\ \end{array}\displaystyle \right ]:=\left [ \textstyle\begin{array}{c} \phi _{1,1}(x) \\ \vdots \\ \phi _{1,k}(x) \\ \phi _{2,1}(x) \\ \vdots \\ \phi _{2,k}(x) \\ \end{array}\displaystyle \right ]:=\left [ \textstyle\begin{array}{c} h_{1}(x) \\ \vdots \\ h_{k}(x) \\ L_{f}h_{1}(x) \\ \vdots \\ L_{f}h_{k}(x) \end{array}\displaystyle \right ], $$
(22)

where \(\xi \) is the vector of the first \(r\) state variables of the transformation.

Since \(n-r=2\), it is possible to define two control-input independent functions, \(\phi _{r+1}(x)\) and \(\phi _{r+2}(x)\), to complete the transformation, such that

$$ L_{g}\phi _{r+1}(x)=0, \hspace{1em} \mbox{and} \hspace{1em} L_{g}\phi _{r+2}(x)=0. $$

Following [41], these two functions can be defined as,

$$ \eta =\left [ \textstyle\begin{array}{c} \eta _{1} \\ \eta _{2} \end{array}\displaystyle \right ]:=\left [ \textstyle\begin{array}{c} \phi _{r+1}(x) \\ \phi _{r+2}(x) \end{array}\displaystyle \right ]:=\left [ \textstyle\begin{array}{c} \varTheta (q_{s}) \\ D_{n}(q_{s})\dot{q}_{s} \end{array}\displaystyle \right ], $$
(23)

where \(\eta \) is the vector of the last \(n-r\) state variables of the transformation, and \(\varTheta (q_{s})\) is the angle between the ground and the virtual link that connects the support leg-end with the hip (see Fig. 1(b)).

The transformation defined by (22) and (23) is invertible. That is,

$$ x=\left [ \textstyle\begin{array}{c} q_{s} \\ \dot{q}_{s} \end{array}\displaystyle \right ]=\varPhi ^{-1}( \xi ,\eta ). $$
(24)

For the specific case of our robot,

$$ q_{s}=\left [ \textstyle\begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ -1&0&-\frac{1}{2}&0&1 \end{array}\displaystyle \right ]\left [ \textstyle\begin{array}{c} q_{d,1}-\xi _{1,1} \\ q_{d,2}-\xi _{1,2} \\ q_{d,3}-\xi _{1,3} \\ q_{d,4}-\xi _{1,4} \\ \eta _{1} \end{array}\displaystyle \right ]+ \left [ \textstyle\begin{array}{c} 0 \\ 0 \\ 0 \\ 0 \\ \frac{\pi }{2} \end{array}\displaystyle \right ] $$
(25)

and

$$ \dot{q}_{s}=\left [ \textstyle\begin{array}{c} \frac{\partial h}{\partial q_{s}} \\ D_{n}(q_{s}) \end{array}\displaystyle \right ]^{-1}\left [ \textstyle\begin{array}{c} \xi _{2,1} \\ \xi _{2,2} \\ \xi _{2,3} \\ \xi _{2,4} \\ \eta _{2} \end{array}\displaystyle \right ]. $$
(26)
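The position part of the inverse transformation, (25), is a fixed linear map plus an offset and can be evaluated directly. The sketch below transcribes the matrix and offset of (25); the function name `positions_from_normal_form` and the numerical inputs are illustrative assumptions.

```python
import numpy as np

# Constant matrix and offset transcribed from (25)
T = np.array([
    [ 1.0, 0.0,  0.0, 0.0, 0.0],
    [ 0.0, 1.0,  0.0, 0.0, 0.0],
    [ 0.0, 0.0,  1.0, 0.0, 0.0],
    [ 0.0, 0.0,  0.0, 1.0, 0.0],
    [-1.0, 0.0, -0.5, 0.0, 1.0],
])
OFFSET = np.array([0.0, 0.0, 0.0, 0.0, np.pi / 2])

def positions_from_normal_form(q_d, xi_1, eta_1):
    """Recover q_s from the targets q_d, the tracking errors xi_1, and eta_1."""
    q_b = np.asarray(q_d, dtype=float) - np.asarray(xi_1, dtype=float)
    return T @ np.append(q_b, eta_1) + OFFSET

# Illustrative values only: zero tracking error, eta_1 = 1 rad
q_s = positions_from_normal_form([0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 0.0, 0.0], 1.0)
```

The first four entries of the result are the controlled joints \(q_{b}=q_{d}-\xi _{1}\), and the last entry is the torso angle recovered from \(\eta _{1}\).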

The definition of the robot model considers disturbances that cannot be decoupled from the system dynamics; therefore, the Lie derivatives of \(L_{f}h\) and \(\phi _{r+2}(x)\) along \(\gamma \) are different from zero,

$$ L_{\gamma }L_{f}h \neq 0, \hspace{1em} \mbox{and} \hspace{1em} L_{\gamma }\phi _{r+2}(x)\neq 0. $$

Notably, since \(\phi _{r+1}(x)\) is a function of the generalized coordinates only, then \(L_{\gamma }\phi _{r+1}(x)\) is equal to zero.

In order to model the disturbances, let us assume that each term of the vector \(L_{\gamma }L_{f}h\) can be locally approximated by a self-updating time polynomial of degree \(m-1\). Then the \(m\)th time derivative of \((L_{\gamma }L_{f}h )_{j}\) vanishes,

$$ \biggl(\frac{d^{m}}{dt^{m}} \bigl(L_{\gamma }L_{f}h \bigl(\varPhi ^{-1}( \xi ,\eta ) \bigr) \bigr) \biggr)_{j}\approx 0. $$
(27)

In the same way, let us assume that \(L_{\gamma }\phi _{r+2}(x)\) can be locally approximated by a constant, then

$$ \frac{d}{dt} \bigl( L_{\gamma }\phi _{r+2}(x) \bigr)\approx 0. $$
(28)

The aforementioned assumptions are valid in the signal processing framework; thus, \((L_{\gamma }L_{f}h )_{j}\) and \(L_{\gamma }\phi _{r+2}(x)\) can be locally estimated in a small time-window around the current time. Based on the above assumptions (27) and (28), one can define \(L_{\gamma }L_{f}h\), its corresponding \(m-1\) time derivatives, and \(L_{\gamma }\phi _{r+2}(x)\) as extended state variables of the model,

$$\begin{aligned} z_{j}=\left [ \textstyle\begin{array}{c} z_{1,j} \\ z_{2,j} \\ \vdots \\ z_{m,j} \end{array}\displaystyle \right ] &:=\left [ \textstyle\begin{array}{c} (L_{\gamma }L_{f}h )_{j} \\ (\frac{d}{dt} (L_{\gamma }L_{f}h ) )_{j} \\ \vdots \\ (\frac{d^{m-1}}{dt^{m-1}} (L_{\gamma }L_{f}h ) ) _{j} \end{array}\displaystyle \right ], \end{aligned}$$
(29)
$$\begin{aligned} \rho &:=L_{\gamma }\phi _{r+2}(x), \end{aligned}$$
(30)

for all \(j\in \{1,\ldots ,k\}\).

In an extended normal form, the continuous dynamics (3) can be expressed as a function of the state variables \(\xi \), \(\eta \), \(z_{j}\), and \(\rho \) as

$$\begin{aligned} &\left . \textstyle\begin{array}{rcl} \dot{\xi }_{1,j}&=&\xi _{2,j} \\ \dot{\xi }_{2,j}&=&\varphi _{j}(\xi ,\eta )+z_{1,j} \\ \dot{z}_{1,j}&=&z_{2,j} \\ \dot{z}_{2,j}&=&z_{3,j} \\ &\vdots& \\ \dot{z}_{m,j}&=& (\frac{d^{m}}{dt^{m}} (L_{\gamma }L_{f}h (\varPhi ^{-1}(\xi ,\eta ) ) ) )_{j} \end{array}\displaystyle \right \}\forall j, \end{aligned}$$
(31)
$$\begin{aligned} &\left [ \textstyle\begin{array}{c} \dot{\eta }_{1} \\ \dot{\eta }_{2} \\ \dot{\rho } \end{array}\displaystyle \right ]= \left [ \textstyle\begin{array}{l} L_{f}\phi _{r+1} (\varPhi ^{-1}(\xi ,\eta ) ) \\ L_{f}\phi _{r+2} (\varPhi ^{-1}(\xi ,\eta ) )+\rho (t) \\ \frac{d}{dt}\rho (t) \end{array}\displaystyle \right ], \end{aligned}$$
(32)
$$\begin{aligned} & y= h \bigl(\varPhi ^{-1}(\xi ,\eta ) \bigr), \end{aligned}$$
(33)

where

$$ \varphi _{j} (\xi ,\eta )= \bigl(L^{2}_{f}h \bigl(\varPhi ^{-1}(\xi ,\eta ) \bigr) \bigr)_{j}+ \bigl(L_{g}L_{f}h \bigl(\varPhi ^{-1}(\xi ,\eta ) \bigr)u \bigr)_{j}. $$

In Sect. 3.2, the extended model described by (31), (32), and (33) is used to derive a nonlinear extended state observer that estimates the state variables and the total disturbances.

3.2 Nonlinear extended state observer

An estimation of the state variables \(\xi \), \(\eta \), \(z_{j}\), and \(\rho \) can be found with the use of a nonlinear extended state observer (NESO). The NESO design is based on the extended state model proposed in the local coordinate transformation (31), (32), and (33). Then, let us define the NESO as

$$\begin{aligned} &\left . \textstyle\begin{array}{rcl} \dot{\hat{\xi }}_{1,j}&=&\hat{\xi }_{2,j}+l_{m+1,j}(\xi _{1,j}- \hat{\xi }_{1,j}), \\ \dot{\hat{\xi }}_{2,j}&=&\varphi _{j}(\hat{\xi },\hat{\eta })+\hat{z} _{1,j}+l_{m,j}(\xi _{1,j}-\hat{\xi }_{1,j}), \\ \dot{\hat{z}}_{1,j}&=&\hat{z}_{2,j}+l_{m-1,j}(\xi _{1,j}-\hat{\xi } _{1,j}), \\ \dot{\hat{z}}_{2,j}&=&\hat{z}_{3,j}+l_{m-2,j}(\xi _{1,j}-\hat{\xi } _{1,j}), \\ &\vdots& \\ \dot{\hat{z}}_{m,j}&=&l_{0,j}(\xi _{1,j}-\hat{\xi }_{1,j}), \\ \end{array}\displaystyle \right \}\forall j, \end{aligned}$$
(34)
$$\begin{aligned} &\left [ \textstyle\begin{array}{c} \dot{\hat{\eta }}_{1} \\ \dot{\hat{\eta }}_{2} \\ \dot{\hat{\rho }} \end{array}\displaystyle \right ]= \left [ \textstyle\begin{array}{l} L_{f}\phi _{r+1} (\varPhi ^{-1}(\hat{\xi },\hat{\eta }) )+\alpha _{2}(\eta _{1}-\hat{\eta }_{1}) \\ L_{f}\phi _{r+2} (\varPhi ^{-1}(\hat{\xi },\hat{\eta }) )+ \hat{\rho }+\alpha _{1}(\eta _{1}-\hat{\eta }_{1}) \\ \alpha _{0}(\eta _{1}-\hat{\eta }_{1}) \end{array}\displaystyle \right ], \end{aligned}$$
(35)

where \(l\) and \(\alpha \) are the observer gains and \(\hat{\xi }\) is the estimate of \(\xi \) (and likewise for \(z\), \(\eta \), and \(\rho \)). Let us define the estimation error vectors as follows:

$$\begin{aligned} \hat{e}_{j}&=\left [ \textstyle\begin{array}{c} \xi _{1,j}-\hat{\xi }_{1,j} \\ \xi _{2,j}-\hat{\xi }_{2,j} \\ z_{1,j}-\hat{z}_{1,j} \\ \vdots \\ z_{m,j}-\hat{z}_{m,j} \\ \end{array}\displaystyle \right ]\forall j, \end{aligned}$$
(36)
$$\begin{aligned} \hat{\varrho }&=\left [ \textstyle\begin{array}{c} \eta _{1}-\hat{\eta }_{1} \\ \eta _{2}-\hat{\eta }_{2} \\ \rho -\hat{\rho } \end{array}\displaystyle \right ]. \end{aligned}$$
(37)

Now, subtracting the observer equations (34) and (35) from the extended model equations (31) and (32), it is possible to write the estimation error dynamics as

$$\begin{aligned} \dot{\hat{e}}_{j}&=\hat{A}_{j}\hat{e}_{j}+ \hat{B} \biggl(\frac{d^{m} (L_{\gamma }L_{f}h (\varPhi ^{-1}(\xi ,\eta ) ) )}{dt ^{m}} \biggr)_{j} + \hat{F}, \hspace{3mm} \forall j, \end{aligned}$$
(38)
$$\begin{aligned} \dot{\hat{\varrho }}&=\tilde{A}\hat{\varrho }+\tilde{B} \frac{d\rho }{dt} + \tilde{F}, \end{aligned}$$
(39)

where

$$\begin{aligned} \hat{A}_{j}&= \begin{bmatrix} -l_{m+1,j} & 1 & 0 &\cdots & 0 \\ -l_{m,j} & 0 & 1 &\cdots & 0 \\ \vdots &\vdots &\vdots &\ddots &\vdots \\ -l_{1,j} & 0 & 0 &\cdots & 1 \\ -l_{0,j} & 0 & 0 &\cdots & 0 \end{bmatrix} , \qquad \hat{B}= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} , \qquad \hat{F}= \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \bigl(\varphi _{j} (\xi ,\eta ) -\varphi _{j} (\hat{\xi }, \hat{\eta } ) \bigr), \\ \tilde{A}&= \begin{bmatrix} -\alpha _{2} & 1 & 0 \\ -\alpha _{1} & 0 & 1 \\ -\alpha _{0} & 0 & 0 \\ \end{bmatrix} , \qquad \tilde{B}= \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} , \\ \tilde{F}&= \begin{bmatrix} L_{f}\phi _{r+1} (\varPhi ^{-1}(\xi ,\eta ) )-L_{f}\phi _{r+1} (\varPhi ^{-1}(\hat{\xi },\hat{\eta }) ) \\ L_{f}\phi _{r+2} (\varPhi ^{-1}(\xi ,\eta ) )-L_{f}\phi _{r+2} (\varPhi ^{-1}(\hat{\xi },\hat{\eta }) ) \\ 0 \end{bmatrix} . \end{aligned}$$

Note that the difference between \(\varphi _{j}(\xi ,\eta )\) and \(\varphi _{j}(\hat{\xi },\hat{\eta })\) is treated as a disturbance in the estimation dynamics; thus, its effect is attenuated by a suitable selection of the observer gains. The same applies to the differences

$$ L_{f}\phi _{r+1}\bigl(\varPhi ^{-1}(\xi ,\eta )\bigr)- L_{f}\phi _{r+1}\bigl(\varPhi ^{-1}( \hat{ \xi },\hat{\eta })\bigr) $$

and

$$ L_{f}\phi _{r+2}\bigl(\varPhi ^{-1}(\xi ,\eta ) \bigr)- L_{f}\phi _{r+2}\bigl(\varPhi ^{-1}( \hat{\xi },\hat{\eta })\bigr). $$

The observer gains \(l\) and \(\alpha \) must be selected so that the matrices \(\hat{A}_{j},\forall j\), and \(\tilde{A}\) are Hurwitz, which is required to achieve asymptotic estimation of the states and the disturbances. The convergence rate must also be fast enough for the estimation error to settle close to zero before the next support-leg exchange. To this end, note that the convergence rate of the NESO is set by the observer gains \(l\) and \(\alpha \), and that the average duration of the swing phase can be estimated from the target forward speed and the target step length.
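One common way to obtain a Hurwitz \(\hat{A}_{j}\) is to place all of its eigenvalues at a single location \(-\omega \). This bandwidth-style parameterization is an assumption made here for illustration and is not stated in the text. Because \(\hat{A}_{j}\) is in observer companion form, its characteristic polynomial is \(s^{m+2}+l_{m+1,j}s^{m+1}+\cdots +l_{0,j}\), so matching it with \((s+\omega )^{m+2}\) gives the gains as binomial coefficients. The helper names below are hypothetical.

```python
import math
import numpy as np

def eso_gains(omega, m):
    """Return [l_0, ..., l_{m+1}] so that det(sI - A_hat) = (s + omega)^(m+2)."""
    N = m + 2
    return [math.comb(N, i) * omega ** (N - i) for i in range(N)]

def error_matrix(gains):
    """Assemble A_hat with first column -l_{m+1}, ..., -l_0 and a unit superdiagonal."""
    N = len(gains)
    A = np.zeros((N, N))
    A[:, 0] = -np.array(gains[::-1])
    A[:-1, 1:] = np.eye(N - 1)
    return A

# Illustrative values: constant-disturbance model (m = 1), omega = 20 rad/s
gains = eso_gains(omega=20.0, m=1)
A_hat = error_matrix(gains)
eigenvalues = np.linalg.eigvals(A_hat)
```

A larger \(\omega \) shortens the settling time of the estimation error relative to the swing period, at the cost of higher sensitivity to measurement noise.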

3.3 Feedback control law

Let us use a feedback linearization strategy to design the control law for the model described in (20). This strategy is observer-based and includes state feedback as well as the disturbance estimate. That is,

$$\begin{aligned} u &=u^{*}+v, \end{aligned}$$
(40)
$$\begin{aligned} u^{*} &=- [L_{g}L_{f}h ]^{-1} \bigl[L^{2}_{f}h \bigr], \end{aligned}$$
(41)
$$\begin{aligned} v &=- [L_{g}L_{f}h ]^{-1} [K_{D}\hat{ \xi }_{\mathit{low}}+K _{P}h +\hat{z}_{\mathit{low}} ], \end{aligned}$$
(42)

where

$$\begin{aligned} &\hat{\xi }_{\mathit{low}}={\left [\textstyle\begin{array}{c@{\quad}c@{\quad}c@{\quad}c}\hat{\xi }_{2,1}&\hat{\xi }_{2,2}&\cdots &\hat{\xi }_{2,k}\end{array}\displaystyle \right ]}^{\mathrm {T}}, \\ &\hat{z}_{\mathit{low}}={\left [\textstyle\begin{array}{c@{\quad}c@{\quad}c@{\quad}c}\hat{z}_{1,1}&\hat{z}_{1,2}&\cdots &\hat{z}_{1,k}\end{array}\displaystyle \right ]}^{\mathrm {T}}, \end{aligned}$$

and \(K_{D}, K_{P}\) are diagonal positive definite matrices that tune the gains of the feedback control.

Substituting (40), (41), and (42) into (20), the closed-loop system takes the form

$$ \frac{d^{2}h}{dt^{2}} +K_{D}\hat{\xi }_{\mathit{low}}+K_{P}h =L_{\gamma }L_{f}h- \hat{z}_{\mathit{low}}. $$
(43)

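When the estimates \(\hat{\xi }_{\mathit{low}}\) and \(\hat{z} _{\mathit{low}}\) are accurate, that is, when the estimation errors converge asymptotically to zero, \(\hat{\xi }_{\mathit{low}}\) approaches \(dh/dt\) and \(\hat{z} _{\mathit{low}}\) approaches \(L_{\gamma }L_{f}h\). Then the closed-loop system dynamics is dominated by the differential equation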

$$ \frac{d^{2}h}{dt^{2}} +K_{D}\frac{dh}{dt}+K_{P}h \approx 0. $$
(44)

Finally, the control gain matrices \(K_{D}\) and \(K_{P}\) can be arbitrarily selected such that (44) is stable and satisfies a desired convergence rate.
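The interplay between the observer (34) and the control law (40)–(42) can be illustrated on a scalar toy plant. The sketch below assumes a double integrator \(\ddot{y}=u+d\) with a constant unknown disturbance \(d\), so that \(L^{2}_{f}h=0\), \(L_{g}L_{f}h=1\), and \(m=1\); the gains, the Euler integration step, and the function name `simulate_adrc` are illustrative choices, not values from the paper.

```python
def simulate_adrc(d=2.0, omega=20.0, kp=25.0, kd=10.0, dt=1e-3, t_end=5.0):
    """Closed loop of a toy double integrator with an extended state observer."""
    l2, l1, l0 = 3 * omega, 3 * omega**2, omega**3   # observer poles at -omega
    y, dy = 1.0, 0.0                                 # plant: tracking error and rate
    y_hat, dy_hat, z_hat = 0.0, 0.0, 0.0             # observer states
    for _ in range(int(round(t_end / dt))):
        # Control law (40)-(42) with u* = 0 and unit decoupling matrix
        u = -(kd * dy_hat + kp * y + z_hat)
        e = y - y_hat                                # output estimation error
        # Observer (34) for m = 1, with phi = u for this plant
        y_hat_dot = dy_hat + l2 * e
        dy_hat_dot = u + z_hat + l1 * e
        z_hat_dot = l0 * e
        # Plant with the unknown disturbance d
        ddy = u + d
        y, dy = y + dt * dy, dy + dt * ddy
        y_hat += dt * y_hat_dot
        dy_hat += dt * dy_hat_dot
        z_hat += dt * z_hat_dot
    return y, z_hat

y_final, z_final = simulate_adrc()
```

In this toy setting the disturbance estimate settles at \(d\) and the tracking error decays according to (44); a time-varying disturbance would leave a residual error that depends on the observer bandwidth.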

4 Discrete adaptive controller for trajectory generation

In order to provide robustness to the walking in the discrete dynamics, a reset control law is proposed to generate an adaptive trajectory that ensures zero tracking error after the support-leg exchange. The proposed trajectory generator uses a reset control law to restart the gait trajectories, driving the tracking error to zero even when walking over unknown terrain. While an event-based controller can be implemented to update the reference, such an approach requires a repository of gait patterns [14, 19]. Instead, this work proposes a smooth transition function from the post-impact states to a nominal trajectory. This smooth transition benefits the estimation of the state variables because jumps in the controlled joints are avoided and the state estimation remains in an invariant set [38, p. 68]. The trajectory design is divided into two parts. The first part is the design of a nominal reference trajectory through the use of hybrid zero dynamics. The second part is the design of a reset control law to perform the smooth transition from the post-impact state to the nominal reference trajectory.

4.1 Design of the nominal reference trajectory

To achieve a stable gait, the controlled joints follow a set of nominal reference trajectories, which are synchronized with the underactuated degree of freedom through a feedback control loop. The reference trajectories designed are known as the virtual holonomic constraints (VHC) [41]. To synchronize the controlled joints with the underactuated degree of freedom \(q_{T}\), the nominal reference trajectories are designed as functions of the angle \(\varTheta (q_{s})\) between the ground and the virtual link that connects the support leg-end with the hip. Since \(\varTheta (q_{s})\) is a function of \(q_{T}\), synchronizing the references with \(\varTheta (q_{s})\) also synchronizes the references with \(q_{T}\).

In order to obtain a smooth motion of the controlled joints, the nominal reference trajectories are determined through the Bézier polynomials

$$ \bar{q}_{d,j} \bigl(\varTheta (q_{s}) \bigr)=\sum _{i=0}^{M}\beta _{i} ^{j} \frac{M!}{i!(M-i)!}s^{i}(1-s)^{M-i}, \quad \forall j, $$
(45)

where

$$ s:=\frac{\varTheta (q_{s})-\varTheta ^{+}}{\varTheta ^{-}-\varTheta ^{+}}, $$
(46)

and \(\varTheta ^{+}\) and \(\varTheta ^{-}\) are the values of \(\varTheta \) at the beginning and end of the swing phase, respectively. The values of \(\varTheta ^{+}\) and \(\varTheta ^{-}\) are functions of the nominal configuration of the robot at the support-leg exchange. The coefficients \(\beta _{i}^{j}\) in (45) are numerically selected to satisfy a set of constraints that guarantee a stable gait in nominal conditions. To this end, let us define the nominal values of the states just before and after the support-leg exchange as

$$ \bar{x}^{-}:=\left [ \textstyle\begin{array}{c} \bar{q}_{s}^{-} \\ \dot{\bar{q}}_{s}^{-} \end{array}\displaystyle \right ] \hspace{1em}\mbox{and} \hspace{1em} \bar{x}^{+}=\left [ \textstyle\begin{array}{c} \Delta _{q} (\bar{q}_{s}^{-} ) \\ \Delta _{\dot{q}} (\bar{q}_{s}^{-} )\dot{\bar{q}}_{s}^{-} \end{array}\displaystyle \right ], $$
(47)

respectively. Then the constraints in the trajectories at the beginning of the swing phase can be defined as

$$ h \bigl(\Delta \bigl(\bar{x}^{-}\bigr) \bigr)=0 \hspace{1em}\mbox{and} \hspace{1em} L_{f}h \bigl(\Delta \bigl(\bar{x}^{-}\bigr) \bigr)=0. $$
(48)

An additional constraint is defined with the solution \(\varphi _{f}(t)\) of the continuous dynamics of the robot (3) at the beginning, \(t_{0}\), and at the end, \(t_{f}\), of the swing phase. That is,

$$ \bar{x}^{+}:=\varphi _{f}(t_{0}), \hspace{1em} \mbox{and} \hspace{1em} \bar{x}^{-}:=\varphi _{f}(t_{f}). $$
(49)
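
As an illustration of (45) and (46), the following minimal Python sketch evaluates the nominal reference of a single joint from its Bézier coefficients. The function name `bezier_reference`, the coefficient values, and the phase limits are hypothetical and chosen only for the example; they are not the values used for the robot.

```python
from math import comb


def bezier_reference(theta, theta_plus, theta_minus, beta):
    """Evaluate the nominal reference (45) for one joint.

    beta holds the M+1 Bezier coefficients beta_0..beta_M of that joint.
    """
    # Normalized phase variable s from (46): 0 at the start of the swing
    # phase (theta_plus) and 1 at its end (theta_minus).
    s = (theta - theta_plus) / (theta_minus - theta_plus)
    M = len(beta) - 1
    # comb(M, i) equals M! / (i! (M - i)!), the binomial factor in (45).
    return sum(b * comb(M, i) * s**i * (1.0 - s) ** (M - i)
               for i, b in enumerate(beta))


# Hypothetical coefficients and phase limits, for illustration only.
beta_demo = [0.2, 0.5, 0.1, -0.3]
q_start = bezier_reference(1.2, theta_plus=1.2, theta_minus=1.9, beta=beta_demo)
q_end = bezier_reference(1.9, theta_plus=1.2, theta_minus=1.9, beta=beta_demo)
```

In this sketch the reference starts at the first coefficient and ends at the last one, which is the property of Bézier polynomials that makes boundary constraints such as (48) easy to impose on \(\beta _{0}^{j}\) and \(\beta _{M}^{j}\).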

4.2 Smooth transition from the post-impact state to the nominal reference trajectory

Under uncertain discrete dynamics, it is not possible to ensure that the robot’s states before the support-leg exchange are at their nominal values \(\bar{x}^{-}\). Accordingly, let us define the pre-switching states as

$$ x^{-}=\bar{x}^{-}+\varepsilon ^{-}, $$
(50)

where \(\varepsilon ^{-}\) is the offset from the nominal pre-switching states. Then the post-switching states are defined by an extension of the reset map in (7) as

$$\begin{aligned} x^{+} &= \Delta \bigl(x^{-}\bigr)+\gamma _{\Delta }, \end{aligned}$$
(51)
$$\begin{aligned} \bigl(\bar{x}^{+}+\varepsilon ^{+} \bigr) &=\Delta \bigl( \bar{x} ^{-}+\varepsilon ^{-} \bigr). \end{aligned}$$
(52)

It follows that, even if the reference trajectories are designed to satisfy the conditions (48), these conditions are not guaranteed to hold on uncertain terrain.

To avoid the sudden changes that uncertain terrain could produce in the controlled output \(h(x)\) and its Lie derivative \(L_{f}h(x)\), the post-impact angles \(q_{b}^{+}\) and the estimates of the angular velocities \(\hat{\dot{q}}_{s}^{-}\) are used to perform a smooth transition from the post-impact states to the nominal reference trajectories. To this end, let us define a passive trend function that acts as a passive reference just after the support-leg exchange. That is,

$$ \vartheta _{j}= \bigl(\Delta _{\dot{q}}\bigl(q_{s}^{-} \bigr)\hat{\dot{q}}_{s}^{-} \bigr) _{j}\tau +q_{b,j}^{+}, \hspace{3mm}\forall j, $$
(53)

where \(\tau \) is a time variable, which is reset to zero after each support-leg exchange. The values of \(\vartheta \) in (53) are used as the main passive reference just after the support-leg exchange. Then a transition from \(\vartheta \) to the nominal trajectory, \(\bar{q}_{d}(\varTheta (q_{s}))\), is performed with the use of the equation

$$ q_{d,j}=\bar{q}_{d,j} \bigl(\varTheta (q_{s}) \bigr)+ \bigl(\vartheta _{j}-\bar{q}_{d,j}\bigl(\varTheta (q_{s})\bigr) \bigr)B_{z}(t_{\mathit{aux}}), \quad \forall j, $$
(54)

where \(B_{z}(t_{\mathit{aux}})\) is a smooth function that changes from one to zero in a given period of time. The function \(B_{z}(t_{\mathit{aux}})\) is defined by a Bézier polynomial as

$$\begin{aligned} &B_{z}(t_{\mathit{aux}})=\sum_{i=0}^{3}b_{i} \frac{3!}{i!(3-i)!}t_{\mathit{aux}}^{i}(1-t _{\mathit{aux}})^{3-i}, \quad 0\leq t_{\mathit{aux}} \leq 1, \end{aligned}$$
(55)

satisfying the boundary conditions

$$\begin{aligned} &B_{z}(0)=b_{0}=1, \qquad B_{z}(1)=b_{3}=0, \\ &\left . \biggl(\frac{\partial B_{z}(t_{\mathit{aux}})}{\partial t_{\mathit{aux}}} \biggr)\right \vert _{t_{\mathit{aux}}=0}=3(b_{1}-b_{0})=0, \end{aligned}$$

and

$$ \left . \biggl(\frac{\partial B_{z}(t_{\mathit{aux}})}{\partial t_{\mathit{aux}}} \biggr)\right \vert _{t_{\mathit{aux}}=1}=3(b_{3}-b_{2})=0. $$

The auxiliary time variable \(t_{\mathit{aux}}\) is defined such that (55) converges to zero in a fraction of the duration of a nominal swing phase. Thus,

$$ t_{\mathit{aux}}=k_{\tau }\tau , $$
(56)

where \(k_{\tau }\) is a constant that sets the transition period of the reference to \(1/k_{\tau }\).
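
The transition defined by (53)–(56) can be sketched in a few lines of Python. The boundary conditions above fix the coefficients of (55) to \(b_{0}=b_{1}=1\) and \(b_{2}=b_{3}=0\). The function names, the scalar (per-joint) interface, and the clamping of \(t_{\mathit{aux}}\) to \([0,1]\) once the transition period has elapsed are assumptions made for this illustration.

```python
from math import comb

# Coefficients fixed by the boundary conditions: b0 = b1 = 1 and b2 = b3 = 0.
B_COEFFS = (1.0, 1.0, 0.0, 0.0)


def blend(t_aux):
    """Cubic Bezier blend B_z of (55); t_aux is clamped to [0, 1] (assumption)."""
    t = min(max(t_aux, 0.0), 1.0)
    return sum(b * comb(3, i) * t**i * (1.0 - t) ** (3 - i)
               for i, b in enumerate(B_COEFFS))


def reference(q_nominal, q_post, qdot_post, tau, k_tau=3.3):
    """Blended reference (54) for one joint.

    q_nominal: value of the nominal trajectory at the current phase
    q_post:    post-impact joint angle q_b^+
    qdot_post: j-th entry of Delta_qdot(q_s^-) * qdot_hat_s^-
    tau:       time elapsed since the last support-leg exchange
    """
    trend = qdot_post * tau + q_post      # passive trend function (53)
    weight = blend(k_tau * tau)           # t_aux = k_tau * tau from (56)
    return q_nominal + (trend - q_nominal) * weight
```

At \(\tau =0\) the reference equals the post-impact angle, so no jump is commanded at the support-leg exchange; after \(1/k_{\tau }\) seconds the weight is zero and the reference coincides with the nominal trajectory.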

5 Stability analysis

5.1 Extended hybrid zero dynamics (EHZD)

The maximal internal dynamics of the system when the output is identically zero are called the zero dynamics [26, p. 162]. Typically, the dimension of the zero dynamics is determined by the difference between the order of the system and its relative degree, \(n-r\); however, this dimension does not account for unmodeled dynamics, model mismatching, or parameter uncertainties, which affect the stability of the system.

This paper proposes the use of an extended state variable \(\rho (t)\) that locally models the effects of the uncertainties and disturbances in the zero dynamics. The extended state \(\rho (t)\) and the states \(\eta _{1}\) and \(\eta _{2}\) in (32), with the state vector \(\xi \) being identically zero, produce the extended zero dynamics

$$ \left [ \textstyle\begin{array}{c} \dot{\eta }_{1} \\ \dot{\eta }_{2} \\ \dot{\rho } \end{array}\displaystyle \right ]= \left [ \textstyle\begin{array}{l} a(\eta ) \\ b(\eta )+\rho (t) \\ \frac{d}{dt}\rho (t) \end{array}\displaystyle \right ], $$
(57)

where

$$\begin{aligned} a(\eta ) &=\left .L_{f}\phi _{r+1} \bigl(\varPhi ^{-1}(\xi ,\eta ) \bigr)\right \vert _{\xi =0}, \\ b(\eta ) &=\left .L_{f}\phi _{r+2} \bigl(\varPhi ^{-1}(\xi ,\eta ) \bigr)\right \vert _{\xi =0}. \end{aligned}$$

The dimension of the extended zero dynamics is considerably less than the dimension of the system. Thus, the gait stability is encoded into a lower-dimensional system defined by the extended zero dynamics.

Since the impact produced at the support-leg exchange affects the system dynamics with a discrete event, the state variables of the extended zero dynamics are also affected. These state variables are reset after each support-leg exchange with the discrete reset function

$$ \left [ \textstyle\begin{array}{c} \eta ^{+} \\ \rho ^{+} \end{array}\displaystyle \right ] = \left [ \textstyle\begin{array}{c} \Delta _{\eta } (\eta ^{-}) \\ \Delta _{\rho } (\rho ^{-}) \end{array}\displaystyle \right ], $$
(58)

where \((\eta ^{-},\rho ^{-})\) and \((\eta ^{+},\rho ^{+})\) are the state variables of the extended zero dynamics just before and after the support-leg exchange, respectively. \(\Delta _{\rho }(\rho ^{-})\) produces an unknown bounded value and \(\Delta _{\eta }\) can be computed directly from (7) and (23). That is,

$$ \eta ^{+}=\left [ \textstyle\begin{array}{c} \eta _{1}^{+} \\ \eta _{2}^{+} \end{array}\displaystyle \right ]=\left [ \textstyle\begin{array}{c} \varTheta (q_{s}^{+}) \\ D_{n}(q_{s}^{+})\dot{q}_{s}^{+} \end{array}\displaystyle \right ]=\left [ \textstyle\begin{array}{c} \varTheta (\Delta _{q} (q_{s}^{-} )) \\ D_{n}(\Delta _{q} (q_{s}^{-} ))\Delta _{\dot{q}} (q_{s} ^{-} )\dot{q}_{s}^{-} \end{array}\displaystyle \right ]=\Delta _{\eta } \bigl(\eta ^{-}\bigr). $$
(59)

An extended hybrid zero dynamics (EHZD) is defined by (57) and (58). In a compact form, the EHZD can be written as

$$ \textstyle\begin{array}{c} \varSigma _{\eta }:\left \{ \textstyle\begin{array}{c@{\quad}c} \dot{z}_{\eta }=f_{\eta }(z_{\eta }), & z^{-}_{\eta }\notin \mathcal{S}\cap \mathcal{Z}, \\ z^{+}_{\eta } =\Delta _{z} (z^{-}_{\eta }), & z^{-}_{\eta }\in \mathcal{S}\cap \mathcal{Z}, \end{array}\displaystyle \right . \end{array} $$
(60)

where

$$ {\mathcal{{Z}}}:=\left \{ {\left [\textstyle\begin{array}{c@{\quad}c} {q_{s}}^{\mathrm {T}}&{\dot{q}_{s}}^{\mathrm {T}} \end{array}\displaystyle \right ]}^{\mathrm {T}} \in \mathbb{R}^{n} \, | \, y=h(q_{s})=0, \ \dot{y}=\frac{\partial h(q_{s})}{\partial q_{s}}\dot{q} _{s}=0\right \} , $$
(61)
$$ z_{\eta }=\left [ \textstyle\begin{array}{c} \eta _{1} \\ \eta _{2} \\ \rho \end{array}\displaystyle \right ], \qquad f_{\eta }(z_{\eta })= \left [ \textstyle\begin{array}{l} a(\eta ) \\ b(\eta )+\rho (t) \\ \frac{d}{dt}\rho (t) \end{array}\displaystyle \right ], \hspace{1em} \mbox{and} \hspace{1em} \Delta _{z} \bigl(z^{-}_{\eta }\bigr)=\left [ \textstyle\begin{array}{c} \Delta _{\eta } (\eta ^{-}) \\ \Delta _{\rho } (\rho ^{-}) \end{array}\displaystyle \right ]. $$

The next section defines a stability test for the EHZD using the estimations of the extended state observer developed in Sect. 3.2.

5.2 Asymptotic periodic orbits in EHZD

Under the assumption that the states \(h\) and \(L_{f}h\) converge asymptotically to zero and remain zero after the support-leg exchange, there is an invariant set determined by the EHZD, which has the gait stability encoded. In this way, the gait stability can be defined by the presence of asymptotic periodic orbits in the evolution of the state variables of the EHZD. The use of the Poincaré return map transforms the problem of finding periodic orbits into a problem of finding fixed points of a particular discrete-time, nonlinear system [41, Chap. 4].

This paper uses a stability test of the EHZD with the inclusion of an extended state that models the zero dynamics uncertainties. The use of an extended state in the computation of the Poincaré return map allows one to determine whether the uncertainties in the hybrid zero dynamics are periodic. If the uncertainties in the hybrid zero dynamics are periodic, then the uncertainties are bounded and do not cause instability.

In this work, the uncertainties (and disturbances) can be classified as persistent and non-persistent. Persistent uncertainties are inherent to the model and include unmodeled dynamics, model mismatching, and parameter uncertainties. Non-persistent uncertainties are sporadic external events and include rough terrain and external disturbances. In this way, the extended state variable \(\rho \) can be expressed as

$$ \rho =\rho _{p}+\rho _{\bar{p}}, $$
(62)

where \(\rho _{p}\) and \(\rho _{\bar{p}}\) represent the effect of the persistent and non-persistent uncertainties and disturbances in the zero dynamics, respectively. A periodic dynamic behavior is only observed under undisturbed conditions of operation, e.g., walking on flat terrain, constant mechanical properties, and no external disturbances. In this case, \(\rho _{\bar{p}}=0\). Thus, the stability can be determined based on the periodicity of the EHZD with persistent uncertainties and disturbances.

In order to numerically compute the Poincaré return map, values of \(\eta _{1}\), \(\eta _{2}\), and \(\hat{\rho }\) are sampled just before the support-leg exchange. Sampled values are collected in the vector \(\tilde{z}_{\eta }(k):={\left [\begin{array}{ccc} \eta _{1}^{-}&\eta _{2}^{-}&\hat{\rho }^{-} \end{array} \right ]}^{\mathrm {T}}\), where \(k\) is a discrete-time variable that represents the \(k\)th support-leg exchange. Then the Poincaré return map can be expressed as

$$ \tilde{z}_{\eta }(k+1)=\tilde{P}\bigl(\tilde{z}_{\eta }(k) \bigr), $$
(63)

where \(\tilde{P}(\tilde{z}_{\eta }(k))\) maps the state variables of the extended zero dynamics just before the support-leg exchange of the current step \(\tilde{z}_{\eta }(k)\) to the states of the next support-leg exchange \(\tilde{z}_{\eta }(k+1)\). In order to test the stability of (63), a linear approximation of \(\tilde{P}(\tilde{z}_{\eta }(k))\) around a fixed point \(\tilde{z}_{ \eta }^{*}(k)\) is performed. The linear approximation is given by

$$ \tilde{z}_{\eta }(k+1)\approx \tilde{\varPhi }\tilde{z}_{\eta }(k), $$
(64)

where \(\tilde{\varPhi }\) is the Jacobian of \(\tilde{P}(\tilde{z}_{\eta }(k))\) around \(\tilde{z}_{\eta }^{*}(k)\).

The eigenvalues of \(\tilde{\varPhi }\) indicate whether the gait is stable. If the magnitudes of all the eigenvalues \(\lambda (\tilde{\varPhi })\) satisfy

$$ \bigl\vert \lambda (\tilde{\varPhi }) \bigr\vert < 1, $$
(65)

then (64) is asymptotically stable. In that case, the EHZD and, consequently, the full hybrid model have asymptotic periodic orbits; therefore, the gait is stable.
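
The test in (63)–(65) can be sketched numerically as follows. This is a minimal illustration, assuming that the return map is available as a callable `P` (in practice, one simulated step of the EHZD sampled just before the support-leg exchange) and that a fixed point is already known. The Jacobian is approximated by central differences on deviations from the fixed point. The map `P_toy` is a hypothetical affine contraction used only to exercise the code; it is not the robot's return map.

```python
import numpy as np


def return_map_jacobian(P, z_star, eps=1e-6):
    """Central-difference Jacobian of the return map P at the fixed point z_star."""
    z_star = np.asarray(z_star, dtype=float)
    n = z_star.size
    J = np.zeros((n, n))
    for i in range(n):
        dz = np.zeros(n)
        dz[i] = eps
        forward = np.asarray(P(z_star + dz), dtype=float)
        backward = np.asarray(P(z_star - dz), dtype=float)
        J[:, i] = (forward - backward) / (2.0 * eps)
    return J


def is_orbit_stable(P, z_star):
    """Condition (65): every eigenvalue of the Jacobian lies inside the unit circle."""
    eigvals = np.linalg.eigvals(return_map_jacobian(P, z_star))
    return bool(np.all(np.abs(eigvals) < 1.0)), eigvals


# Toy return map (hypothetical): affine contraction with a known fixed point.
A_toy = np.array([[0.5, 0.1, 0.0],
                  [0.0, 0.3, 0.2],
                  [0.0, 0.0, 0.8]])
z_fix = np.array([1.0, 2.0, 0.5])


def P_toy(z):
    return z_fix + A_toy @ (np.asarray(z, dtype=float) - z_fix)


stable, lams = is_orbit_stable(P_toy, z_fix)
```

For a return map obtained from simulation, the perturbation size `eps` would need to be chosen larger than the integration and event-detection errors, since each column of the Jacobian requires two simulated steps.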

6 Numerical simulation

A numerical simulation is performed to evaluate the proposed hybrid disturbance rejection control strategy. The physical parameters of the robot are summarized in Table 1. The number of extended state variables in each controlled joint is \(m=1\). The observer constants \(l\) and \(\alpha \) are selected such that \(\hat{A}_{j}\) in (38) and \(\tilde{A}\) in (39) are Hurwitz. The 12 selected eigenvalues of \(\hat{A}_{j}\) are \(\lambda (\hat{A} _{j}) = -1300, -1200, \ldots , -200\), and the three selected eigenvalues of \(\tilde{A}\) are \(\lambda (\tilde{A}) = -990,-5 \pm 8.7i\). The constant \(k_{\tau }=3.3\) is used in (56).

Table 1 Physical parameters of the robot
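
As a quick consistency check of the eigenvalues quoted above, the following Python sketch verifies that they are Hurwitz and computes the characteristic polynomial implied by the three eigenvalues of \(\tilde{A}\). It works only with the quoted eigenvalues; how the constants \(l\) and \(\alpha \) map to them depends on the structure of (38) and (39) and is not reproduced here. The variable and function names are hypothetical.

```python
import numpy as np

# Eigenvalues quoted in the text.
eig_A_hat = np.arange(-1300.0, -100.0, 100.0)     # -1300, -1200, ..., -200
eig_A_tilde = np.array([-990.0, -5.0 + 8.7j, -5.0 - 8.7j])


def is_hurwitz(eigenvalues):
    """A matrix is Hurwitz when all of its eigenvalues have negative real part."""
    return bool(np.all(np.real(eigenvalues) < 0.0))


# Monic characteristic polynomial with the prescribed roots,
# coefficients ordered from the highest power of s to the lowest.
char_poly = np.real(np.poly(eig_A_tilde))

# Natural frequency (rad/s) and damping ratio of the complex-conjugate pair.
omega_n = abs(eig_A_tilde[1])
zeta = -eig_A_tilde[1].real / omega_n

# Transition period 1/k_tau of the reference, from (56) with k_tau = 3.3.
transition_period = 1.0 / 3.3
```

The resulting polynomial is \(s^{3}+1000s^{2}+10000.69s+99683.1\), and the transition period is approximately 0.30 s.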

6.1 Nominal conditions

Walking on flat terrain is evaluated for 50 steps under nominal conditions (i.e., without external disturbances). The tracking trajectories and control torques for the first two seconds of the simulation are shown in Fig. 3(a). Effective tracking of the reference is achieved with torques bounded in the range \({\pm}5~\mbox{N}\,\mbox{m}\). The reaction forces in the support leg-end are shown in Fig. 3(b). As expected, the normal reaction force \(F_{N}\) is non-negative. Also, the absolute value of the ratio between the tangential and normal reaction forces is less than the friction coefficient, \(\mu =0.6\). This confirms that the unilateral constraints at the support leg-end are satisfied.
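
The two contact conditions checked above can be written as a short Python predicate. This is a minimal sketch; the function name and the symbol used for the tangential force are assumptions, and the ratio test is rewritten as \(|F_{T}|<\mu F_{N}\) so that no division by the normal force is needed.

```python
def contact_constraints_ok(f_tangential, f_normal, mu=0.6):
    """Check the unilateral contact conditions for one sample of reaction forces.

    The support leg-end neither lifts off nor slips when the normal force is
    non-negative and the tangential force stays strictly inside the friction
    cone, |F_T| < mu * F_N.
    """
    if f_normal < 0.0:
        return False                      # the ground cannot pull on the leg-end
    return abs(f_tangential) < mu * f_normal
```

Applied to every sample of a simulated step, the predicate flags the instants at which the point-contact model of the support leg would no longer be valid.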

Fig. 3 (a) Tracking trajectories and control signals. (b) Reaction forces in the support leg-end (Color figure online)

6.2 Model uncertainties

To test the robustness against model uncertainties, the gait is evaluated over uneven terrain, as shown in Fig. 4. In this simulation, the terrain has random height variations in the range of ±5 mm. The simulation shows that the hybrid control performs robustly in the presence of random conditions at the support-leg exchange.

Fig. 4 Stick diagram of the walking simulation over uneven terrain

6.3 External disturbances

To test the robustness against external disturbances, a gait test over flat terrain is performed for 20 steps. At the 10th step, the robot is perturbed with external torques. The value of the external torques is \(2~\mbox{N}\,\mbox{m}\), which corresponds to 40% of the nominal torque of the actuators. The external torques are applied simultaneously to all the controlled joints during the simulation. The evolution of the state variables of the EHZD is shown in Fig. 5. Even in the presence of external disturbances, the EHZD state variables converge to a nominal periodic orbit.

Fig. 5 Orbital behavior of the state variables in the extended hybrid zero dynamics. Red: Nominal orbit. Blue: Perturbed behavior (Color figure online)

To compare the model-based ADRC with a classical HZD control strategy, an observer-based feedback control is designed based on [15]. This control is referred to as nonlinear proportional-derivative (NPD). NPD and model-based ADRC are tested in simulation under equal operating conditions. The simulation tests are performed with gaits over flat terrain for 25 steps. In the middle of the 20th step, external torques of \(1~\mbox{N}\,\mbox{m}\) are applied for 0.02 s to the actuated and underactuated joints of the robot. Figure 6 shows the behavior of \(\varTheta \) and \(\dot{\varTheta }\) during the 25 steps. Both controllers keep the robot walking; however, the model-based ADRC rejects the disturbance better than the NPD. These simulations confirm the practical convenience of using the proposed model-based ADRC over a classical HZD with an NPD control strategy.

Fig. 6 External disturbance rejection. Brown: Model-based ADRC. Blue: Classical HZD with an NPD controller (Color figure online)

7 Design of the testbed (Saurian) and physical experiments

In order to analyze, design, and test the proposed disturbance rejection control strategies for hybrid dynamic systems, a bipedal robot is designed and fabricated. This robot, referred to as Saurian, combines the continuous dynamics of the swing walking phase with the discrete dynamics of the support-leg exchange. An overview of the design and fabrication of Saurian is presented below.

7.1 Design of mechanisms

Saurian is conceived to move along its sagittal plane. Its lateral movements are constrained by a radial bar attached to a central column through a universal joint. The radial bar is long enough that the robot’s gait can be considered straight for all practical purposes.

The mechanical configuration of Saurian is inspired by the French robot RABBIT [6], which is an active bipedal robot developed to perform the natural and dynamic movements of a passive dynamic walker but on horizontal terrain [32]. Like RABBIT, Saurian is designed with five rigid links: two shins, two thighs, and one torso. It takes advantage of the passive dynamics to achieve power-efficient walking through point-feet leg-ends without ankles. The leg-ends are built with fixed rubber wheels, which provide a pivot contact with the ground and a high friction coefficient.

A significant difference from RABBIT is that Saurian incorporates flexible actuated joints as compliant mechanisms that isolate the motor shafts from the impacts produced during the support-leg exchange. The compliant mechanisms in Saurian also increase the power efficiency of the walking through a spring effect that stores impact kinetic energy and releases it during the propulsion of the body in the next swing phase [22, 25, 39].

7.2 Sensors, actuators, and control hardware

Saurian is equipped with four brushed DC motors coupled to high-efficiency titanium gearboxes that transmit the torque to the controlled joints with a high power-to-weight ratio. A torque feedback control in the DC motors is designed and implemented using four current sensors. Each joint has an incremental encoder (nine in total): one encoder measures the absolute angle of the torso with respect to the central column, four encoders measure the relative angles of the actuated joints, and four encoders measure the relative angles of the flexible joints. Four linear potentiometers are also installed in the actuated joints to define a fixed reference frame for the corresponding angular positions. Two axial load cells are part of the shins to detect the reaction forces in the legs. Saurian’s testbed is shown in Fig. 7.

Fig. 7 Saurian’s testbed composed of a Host-PC acting as the human–machine interface, and an XPC-Target dedicated computer in charge of computing the control signals

Saurian uses a real-time computing system to perform the feedback control and store data for off-line analysis. The real-time computing system is developed on the xPC-Target toolbox from Matlab&Simulink® with a data acquisition (DAQ) environment that acquires data from the sensors and transmits control outputs to the actuators, all at a rate of 1.0 kHz. Figure 8 shows the flow diagram with the connections of the testbed and the interaction between the controller hardware, sensors, and the robot.

Fig. 8 Flow diagram containing six modules: Host-PC, XPC-Target, Signal Conditioning, Power Drive, Power Supply, and Saurian

7.3 Walking experiments

Saurian is used to experimentally validate the proposed hybrid disturbance rejection control strategy. For this experiment, the robot is set up to walk over flat terrain (a wooden table). Video snapshots of the experiment are shown in Fig. 9. The evolution of the controlled angles \((h_{l},h_{r},k_{l},k_{r})\), the flexible joints \((h_{l}^{\prime },h_{r}^{\prime },k_{l}^{\prime },k _{r}^{\prime })\), the trajectory references \((h_{l,d},h_{r,d},k_{l,d},k _{r,d})\), and the control torques \((\tau _{hl},\tau _{hr},\tau _{kl}, \tau _{kr})\) is shown in Fig. 10. The controlled joints closely track the trajectory references. The control torques are bounded by the operating range of the gear motors.

Fig. 9 Snapshots sampled with a period of 1.0 s. See the video at https://youtu.be/sNaXcFzXQFc

Fig. 10 Behavior of the state variables during the physical experiment with Saurian on flat terrain (Color figure online)

The evolution of the state variables is presented in the form of phase portraits in Fig. 11. The average trends are identified in these plots and are shown to be periodic. The evolution of the pair \((\varTheta (q_{s}),\dot{\varTheta }(q_{s}))\) from the simulation and physical experimentation is shown in Fig. 12. Bounded deviations around the simulation results can be observed. Such deviations are caused by model uncertainties and external disturbances.

Fig. 11 Periodic behavior in the state variables of Saurian. Experimental data

Fig. 12 Phase portrait of \(\varTheta (q_{s})\) vs. \(\dot{\varTheta }(q _{s})\). Experimental data vs. simulation

The sources of the model uncertainties include model mismatching and manufacturing imperfections. Model mismatching is caused by factors such as joint compliance and robot asymmetry. The joint compliance is a product of the flexible coupling between the actuators and the joints, which causes deviations in the controlled angles that are treated in the robot’s model as unmodeled dynamics. The robot asymmetry is caused in large part by the difference in the radii of the circular paths of the internal and the external (right and left) legs of the robot. Manufacturing imperfections, which are ever-present in the fabrication and assembly of a physical prototype, cause aleatory uncertainties that produce asymmetries and joint backlash. Finally, since the double support phase of the walking is not instantaneous, as assumed in the mathematical model, the robot experiences unexpected external perturbing forces at the swing leg-end. Notably, despite the multiple sources of uncertainty, the time evolution of all state variables shows a periodic trend, demonstrating that the proposed hybrid disturbance rejection control strategy is able to maintain Saurian’s periodic and stable gait.

8 Conclusions and remarks

A hybrid disturbance rejection control strategy for dynamic bipedal robots is developed in this work. The control strategy is robust against model uncertainties and external disturbances in both continuous and discrete dynamics. To reject the total disturbance in the continuous dynamics, the disturbances and state variables are estimated using a nonlinear extended state observer. These estimates are used in a model-based active disturbance rejection controller (ADRC). In conjunction with the model-based ADRC, a discrete adaptive trajectory generator is developed. The trajectory generator uses a discrete reset control law that maintains zero tracking error after the support-leg exchange even under model uncertainties.

This work extends the methods to evaluate periodic stability from hybrid zero dynamics to hybrid dynamic systems with model uncertainties. The proposed extended hybrid zero dynamics (EHZD) incorporates an extended state that models the uncertainties of the zero dynamics. The Poincaré method is used to search for periodic orbits on the EHZD to assess the robot’s periodic stability.

The proposed control strategy is validated using numerical simulation and physical experimentation. Numerical simulations demonstrate robust gait stability in the presence of model mismatching and external disturbances. Experimental validation was carried out on a bipedal robot testbed, referred to as Saurian. The testbed was built to evaluate the proposed hybrid disturbance rejection control strategy. The performance in laboratory conditions shows the effectiveness of the proposed control. The results of the simulation and experimentation indicate that it is possible to reject disturbances in the robot’s underactuated dynamics. In order to continuously update optimal gait patterns, ongoing work focuses on the sensitivity analysis of gait parameters such as step length, speed, cadence, and squat performance.