1 Introduction

Trajectory tracking of underactuated mechanical systems is a challenging control problem that has attracted increasing attention in the research community [8, 9, 35]. Flexible and soft robots are notable examples of underactuated mechanical systems. Other examples include surface vessels and swimming robots [34, 55]. Underactuated mechanical systems often possess unstable internal dynamics, that is, the dynamics that remain when the output is constrained [15, 32, 47]. In general, stable inversion methods can be used for the feedforward control design. For example, a boundary value problem can be constructed considering suitable eigenvectors and eigenvalues for the zero dynamics, see [12, 50]. Alternatively, an optimal control approach can be employed, which avoids the derivation and computation of eigenvectors/eigenvalues, see [6, 7, 33].

In practical applications, closed-loop control is required to ensure satisfactory performance, particularly in trajectory tracking and in the presence of disturbances. Model predictive control (MPC) is a feedback control strategy where an optimal control problem is solved at each MPC iteration [1, 14, 31, 41]. Consequently, MPC is also known as receding horizon control or moving horizon optimal control. MPC can be employed for dynamic systems with instability and nonlinearity; however, feasibility, stability, and performance can be challenging to guarantee for complex dynamic systems. Another important aspect is the computational efficiency of MPC, because an online implementation is required in practical applications. This aspect is particularly challenging for nonlinear systems, fast dynamics, and large-scale problems, see [10, 17, 18].

The nominal MPC is suitable in case of a small mismatch between model and plant [30]. Conversely, robust MPC is appropriate if a significant difference between the dynamic model and the plant exists [38]. This can occur due to parametric uncertainty and external disturbances. A way to approach this problem involves considering a robust MPC related to the nominal problem [48] and applying the same strategy if a disturbance is present. When the dynamic system to be controlled is nonlinear, this control strategy is called nonlinear model predictive control (NMPC). Robust NMPC can be employed for a wider class of systems, including underactuated mechanical systems, although some challenges remain. Aspects related to the robustness of NMPC are discussed in [13, 37]: Chen and Shaw [13] considered a receding horizon feedback control (RHFC) for nonlinear autonomous systems, where the horizon distance is computed as an explicit function of the state; Magni and Sepulchre [37] showed that receding-horizon control possesses the stability margins of optimal control laws, using the nonlinear counterpart of the Fake Riccati equation. Most robust NMPC schemes that take the uncertainty/disturbance directly into account are related to a min-max formulation such as in [11]. A more recent control strategy called tube-MPC can also account for model/plant discrepancies by employing an ancillary control together with MPC. In [40], tube-based MPC for linear systems is extended to achieve robust control of nonlinear systems subject to additive disturbances, where the ancillary control is computed using MPC. In [39], two competing versions of robust and stochastic tube-MPC are compared, concluding that a control policy (sequence of control laws) can be preferable to a sequence of control actions.

Tube-MPC algorithms have been applied to a wide variety of systems, thus underscoring the general validity of this approach. A tube-MPC for a class of Lipschitz nonlinear systems with application to a simple example is proposed in [53]. A tube-based NMPC for autonomous mobile robots with tire–terrain interactions is presented in [45]. A tube-based MPC with relaxed stability for smart grids is presented in [49]. A sliding-mode control (SMC) for constrained MPC of nonlinear systems is employed in [21]. In [20, 43], tube-MPC is investigated for uncertain multiagent systems and for systems with non-additive dynamic disturbances, respectively. Finally, Dong and Angeli [19] investigated a homothetic tube-based MPC for systems with nonzero mean disturbances. In the aforementioned works, the tube-MPC employs constant tube parameters which define a fixed region of attraction. Instead, in [36] the tube geometry is treated as a design variable and is optimized alongside states and control, thus improving dynamic performance and robustness. In particular, SMC is employed as ancillary control so that the gap between linear and nonlinear homothetic/elastic tube-MPC is eliminated and a state-dependent uncertainty can be compensated. However, in order to employ this method the nonlinear dynamic system is required to be either feedback linearizable or minimum phase.

This work aims to extend the dynamic tube-MPC proposed in [36] to a class of underactuated mechanical systems. To this end, the interconnection and damping assignment passivity-based control (IDA-PBC) methodology is employed to compute the ancillary control instead of SMC. We have chosen the IDA-PBC methodology since it is ideally suited to underactuated systems [44] and it provides a physical interpretation of the control action in terms of mechanical energy. In addition, an adaptive observer is used to compensate the effect of unknown disturbances and model uncertainties under some realistic assumptions [24, 27]. In summary, the main contributions of this work include the following points.

  1. An analytical formulation of the ancillary control law constructed with the IDA-PBC methodology considering the case of underactuated mechanical systems with matched and bounded disturbances first, and then extending the results to the case of matched and unmatched disturbances with bounded time-derivative.

  2. The design of the dynamic tube by employing an energy-based approach resulting in a tube dynamics representative of a first-order filter, and the study of the stability conditions with a Lyapunov approach.

  3. The integration of the ancillary control with an MPC algorithm resulting in a new robust receding-horizon optimization problem that accounts for the tube parameters.

  4. Simulation results highlighting the benefits of the proposed approach for two examples of nonlinear underactuated mechanical systems.

The rest of the paper is organized as follows. For completeness, Sect. 2 provides an overview of the dynamic tube-MPC for fully actuated systems that inspired our work. Section 3 outlines the MPC for underactuated systems. Section 4 details the proposed dynamic tube-MPC for underactuated mechanical systems, which represents the main methodological contribution of this work. In Sect. 5, the methodology is applied to a two-mass-spring-damper system with parametric uncertainties and to an inertia-wheel pendulum system with external disturbances. Conclusions and directions for future work are discussed in Sect. 6.

2 Overview of dynamic tube-MPC for fully actuated systems

For reference, this section outlines the approach presented in [36] which defines the control law \(\pi \) as the sum of two terms: the MPC control input u, and the ancillary control input \(u_{\text {smc}}\). The former is a standard MPC with additional constraints related to the dynamic tube variables to be optimized. The latter is an SMC law consisting of a continuous term and a switching term. In order to formulate the optimization problem, consider the system dynamics in a so-called affine form

$$\begin{aligned} \dot{x} = f(x) + u - \delta , \end{aligned}$$
(1)

where \(\delta \) is a disturbance. The dynamic tube-MPC control law [36] is

$$\begin{aligned}&\pi = u + u_{\text {smc}}, \end{aligned}$$
(2)
$$\begin{aligned}&u_{\text {smc}} = -f(x) - k_1 s - k_2 \text {sign}(s/\phi ), \end{aligned}$$
(3)
$$\begin{aligned}&\min _{u,\nu } J = h(\breve{x}(t_f)) + \int _{t_i}^{t_f} l(\breve{x}, u, u_{\text {smc}}, \alpha , \nu ) \, \hbox {d}\tau , \end{aligned}$$
(4)

where \(s= \dot{x}_d - \dot{x} - \lambda (x_d - x)\), and \(s=0\) is the sliding surface, \(x_d\) is the desired trajectory for the output, \(\phi \) is the dynamic tube with time derivative \(\dot{\phi } = \Delta - \alpha \phi \), and \(\Delta \) is the maximum value of the lumped model uncertainty, which is assumed known and bounded, that is \(|\delta | < \Delta \). The terms \(k_1\), \(k_2\) are tuning parameters, and the MPC optimization is subject to the constraints

$$\begin{aligned}&\dot{\breve{x}} = f(\breve{x}) + u, \end{aligned}$$
(5)
$$\begin{aligned}&\dot{\phi } = \Delta - \alpha \phi , \end{aligned}$$
(6)
$$\begin{aligned}&\dot{\alpha } = \nu , \end{aligned}$$
(7)

in addition to the initial conditions and the final conditions. The main limitation of this approach is that it requires full actuation (see Assumption 4 in [36]). This is a considerable restriction, and it precludes the use of this technique for underactuated mechanical systems. In this study, we relax the former assumption by computing the ancillary control law with the IDA-PBC methodology instead of using SMC. The ancillary control is then combined with MPC thus resulting in a new control algorithm (see Sect. 4).
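As an illustration of the tube dynamics (6), the following minimal sketch integrates \(\dot{\phi } = \Delta - \alpha \phi \) and shows that the tube width converges to \(\Delta /\alpha \). It assumes a constant \(\alpha \) (whereas in [36] \(\alpha \) evolves according to (7) and is optimized by the MPC); the function name and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_tube(phi0, delta_bound, alpha, h, n_steps):
    """Integrate phi' = delta_bound - alpha * phi for constant alpha and delta_bound."""
    decay = np.exp(-alpha * h)
    phi_inf = delta_bound / alpha          # steady-state tube width
    phi = np.empty(n_steps + 1)
    phi[0] = phi0
    for k in range(n_steps):
        # exact one-step solution of the linear first-order filter
        phi[k + 1] = phi_inf + (phi[k] - phi_inf) * decay
    return phi

# illustrative values (assumptions, not taken from the paper)
phi = simulate_tube(phi0=2.0, delta_bound=0.5, alpha=5.0, h=0.01, n_steps=500)
```

A larger \(\alpha \) yields a narrower steady-state tube and a faster contraction, which is why treating \(\alpha \) as a design variable trades control effort against tracking accuracy.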

3 Overview of MPC for underactuated mechanical systems

3.1 Equations of motion for mechanical systems

The equations of motion of a multibody system expressed according to Newton’s second law and using generalized position coordinates are

$$\begin{aligned} \mathbf {M}(\mathbf {q})\ddot{\mathbf {q}} + \mathbf {g}(\mathbf {q},\dot{\mathbf {q}},t) = \mathbf {A}\mathbf {u} - \mathbf {\delta }. \end{aligned}$$
(8)

This is a system of ordinary differential equations where \(\mathbf {q}\) is the vector of n generalized coordinates in position, \(\mathbf {M}(\mathbf {q})\) is the positive definite and invertible inertia matrix, \(\mathbf {g}(\mathbf {q},\dot{\mathbf {q}},t)\) is the vector of internal and complementary inertia forces. The input matrix \(\mathbf {A}\) distributes the m control inputs \(\mathbf {u}\) onto the directions of the system coordinates, and \(\mathbf {\delta }\) is an external disturbance. If the system is fully actuated, then \(m = n\), while for underactuated systems \(n > m\). The equation of motion in matrix form yields

$$\begin{aligned}&\left[ \begin{array}{c} \dot{\mathbf {q}}\\ \dot{\mathbf {v}}\end{array}\right] = \left[ \begin{array}{c}\mathbf {v}\\ - {\mathbf {M}}^{-1}\mathbf {g}(\mathbf {q},\mathbf {v},t)\end{array}\right] + \left[ \begin{array}{c}0\\ \mathbf {M}({\mathbf {q}})^{-1} \mathbf {A}\end{array}\right] \,\mathbf {u} - \left[ \begin{array}{c} 0 \\ \mathbf {\delta } \end{array}\right] \end{aligned}$$
(9)
$$\begin{aligned}&\mathbf {y}=\mathbf {h}(\mathbf {q},\mathbf {v}), \end{aligned}$$
(10)

where (9) is a differential equation representing the dynamics of the system, and (10) is an algebraic equation defining the output \(\mathbf {y}\) of the system. The term \(\mathbf {v}\) indicates the time derivative of the generalized position coordinates. In particular, the matrix \(\mathbf {h(q,v)}\mathbf {M}(\mathbf {q})^{-1} \mathbf {A} \) is assumed nonsingular. In general, system (9), (10) can have relative degree 1 or 2; however, this work focuses on systems with relative degree 2 (see examples in Sect. 5).

3.2 Time integration for underactuated mechanical systems

In order to perform the time integration, which is required to implement the MPC algorithm for the system (9), the generalized-\(\alpha \) method is employed [16]. This method was initially designed for structural dynamics, but was subsequently applied to multibody systems, and it is known to provide stable convergence for constrained mechanical systems [2]. In particular, further equations can be added to (8) in order to compose a differential-algebraic equation (DAE), e.g., to consider mechanical contact. The generalized-\(\alpha \) method is a generalization of the Newmark method [42], and it can be written as

$$\begin{aligned} \dot{\mathbf {q}}^{(k+1)}= & {} \dot{\mathbf {q}}^{(k)} + (1 - \gamma _{*})h\mathbf {a}^{(k)} + \gamma _{*} h\mathbf {a}^{(k+1)}, \end{aligned}$$
(11)
$$\begin{aligned} \mathbf {q}^{(k+1)}= & {} \mathbf {q}^{(k)} + h\dot{\mathbf {q}}^{(k)} + \left( \frac{1}{2} - \beta _{*}\right) h^2\mathbf {a}^{(k)} \nonumber \\&+ \beta _{*} h^2\mathbf {a}^{(k+1)}, \end{aligned}$$
(12)
$$\begin{aligned} \mathbf {a}^{(k+1)}= & {} \frac{-\alpha _m\mathbf {a}^{(k)} + (1-\alpha _f){\ddot{\mathbf {q}}}^{(k+1)}+\alpha _f{\ddot{\mathbf {q}}}^{(k)}}{1-\alpha _m}, \end{aligned}$$
(13)

where \(\mathbf {a}\) is termed pseudo-acceleration, while \(\gamma _{*}\), \(\beta _{*}\), \(\alpha _m\) and \(\alpha _f\) are parameters that can be tuned to combine unconditional stability and second-order accuracy. In particular \(\gamma _{*}\), \(\beta _{*}\), \(\alpha _m\) and \(\alpha _f\) depend on the spectral radius at infinite frequencies \(\rho _{\infty } \in [0,1]\), where \(\rho _{\infty }=0\) represents the maximum numerical damping in the integration, and \(\rho _{\infty }=1\) denotes the absence of numerical damping (see [16] for further details). The parameter \(k=1,2,\ldots ,N_{\text {tot}}\) is the index corresponding to the discretized time step. Equations (11)–(12) represent the Newmark method, and (13) is an additional acceleration update law defining the generalized-\(\alpha \) method.
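The update (11)–(13) can be sketched for a linear test system \(\mathbf {M}\ddot{\mathbf {q}} + \mathbf {C}\dot{\mathbf {q}} + \mathbf {K}\mathbf {q} = \mathbf {f}\), for which the implicit step reduces to one linear solve. The parameter formulas below are the commonly used choice in terms of \(\rho _{\infty }\) for this method; the linear system, the function names, and the initialization \(\mathbf {a}^{(1)} = \ddot{\mathbf {q}}^{(1)}\) are assumptions made for this illustration and are not taken from the text.

```python
import numpy as np

def generalized_alpha_params(rho_inf):
    """Parameters as functions of the spectral radius at infinite frequency."""
    alpha_m = (2.0 * rho_inf - 1.0) / (rho_inf + 1.0)
    alpha_f = rho_inf / (rho_inf + 1.0)
    gamma = 0.5 + alpha_f - alpha_m
    beta = 0.25 * (gamma + 0.5) ** 2
    return alpha_m, alpha_f, gamma, beta

def generalized_alpha_step(M, C, K, f_next, q, v, acc, a, h, rho_inf):
    """One step of (11)-(13) for M q'' + C q' + K q = f (linear, so no Newton loop)."""
    alpha_m, alpha_f, gamma, beta = generalized_alpha_params(rho_inf)
    # (13) written as a_next = c0 + c1 * acc_next
    c1 = (1.0 - alpha_f) / (1.0 - alpha_m)
    c0 = (alpha_f * acc - alpha_m * a) / (1.0 - alpha_m)
    # parts of (11) and (12) that do not depend on the unknown acc_next
    v_pred = v + (1.0 - gamma) * h * a + gamma * h * c0
    q_pred = q + h * v + (0.5 - beta) * h**2 * a + beta * h**2 * c0
    # enforce the equation of motion at the new time step
    S = M + gamma * h * c1 * C + beta * h**2 * c1 * K
    acc_next = np.linalg.solve(S, f_next - C @ v_pred - K @ q_pred)
    a_next = c0 + c1 * acc_next
    v_next = v + (1.0 - gamma) * h * a + gamma * h * a_next
    q_next = q + h * v + (0.5 - beta) * h**2 * a + beta * h**2 * a_next
    return q_next, v_next, acc_next, a_next

# undamped unit oscillator, q(0) = 1, q'(0) = 0, exact solution cos(t)
M, C, K = np.eye(1), np.zeros((1, 1)), np.eye(1)
q, v = np.array([1.0]), np.array([0.0])
acc = np.linalg.solve(M, -C @ v - K @ q)
a = acc.copy()                      # simple initialization of the pseudo-acceleration
h, n_steps = 0.01, 100
for _ in range(n_steps):
    q, v, acc, a = generalized_alpha_step(M, C, K, np.zeros(1), q, v, acc, a, h, rho_inf=0.8)
q_final = q[0]
```

For the nonlinear system (8), the same relations would be combined with an iterative solve of the residual at each step, or imposed as equality constraints as done in Sect. 3.3.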

3.3 Feedforward control design: numerical optimal control

In order to define an appropriate feedforward control for the virtual plant, we consider an inverse dynamics problem. Unlike the direct dynamics problem, which computes the output of a given dynamic system for a known input force/torque, the inverse dynamics problem aims to compute the control input that produces a desired trajectory of the system’s output (10). In general, the direct dynamics problem (e.g., initial value problem) is well understood for both fully actuated and underactuated systems. However, the inverse dynamics problem can be a challenge for complex nonlinear underactuated systems. For a class of underactuated mechanical systems that are typically non-minimum phase, the so-called stable inversion methods (e.g., optimal control) are appropriate for the trajectory tracking problem. In such cases, pre/post-actuation is often required to stabilize the internal dynamics; thus, a non-causal solution can be obtained using optimal control, see [7], where the unactuated states and the control input start before the beginning of the output trajectory in the pre-actuation phase and continue after the end of the output trajectory in the post-actuation phase. In this scenario, the feedforward control is applied to the plant in an open-loop fashion, while MPC is activated in order to minimize the output error.

We employ the direct transcription method in our optimal control approach to deal with nonlinearities and instability that might occur for underactuated mechanical systems, e.g., trajectory tracking of flexible manipulators [5]. The direct transcription method follows the rule "first discretize, then optimize", and it allows solving an optimal control problem numerically by treating the equations of motion, the time integration, and other boundary conditions or state/control constraints as equality constraints (i.e., binding constraints) in the optimization algorithm (see [4] for further details). In addition, inequality constraints for the design variables can also be enforced. The discretization in time \(t^{(k)}\), \(k=1,\ldots ,N_{\text {tot}}\) results in a finite number of points \(N_{\text {tot}}\). The set of design variables \(\mathbf {\chi }\) includes at each time step the positions \(\mathbf {q}\), velocities \(\dot{\mathbf {q}}\), accelerations \(\ddot{\mathbf {q}}\), pseudo-accelerations \(\mathbf {a}\), and control inputs \(\mathbf {u}_{\text {ff}}\). A subset of the design variables at each time step k is defined as

$$\begin{aligned} \mathbf {\chi }^{(k)} = \left( \mathbf {q}^{(k)} \dot{\mathbf {q}}^{(k)} \ddot{\mathbf {q}}^{(k)} \mathbf {a}^{(k)} \mathbf {u}_{\text {ff}}^{(k)}\right) ^\mathrm{T}. \end{aligned}$$
(14)

Thus, the set of design variables including all time steps can be written as the vector \(\mathbf {\chi } = (\mathbf {\chi }^{(1)} \mathbf {\chi }^{(2)} \ldots \mathbf {\chi }^{(N_{\text {tot}})})^\mathrm{T}\), and its dimension is \(N_{\text {tot}}(4n + m)\). After discretization, the feedforward control design indicated as optimization problem \(\mathcal {P}_1\) takes the general form

$$\begin{aligned} \begin{aligned}&\mathcal {P}_1: \\&\underset{\mathbf {\mathbf {\chi }}}{\text {min}}&J = \sum _{k = 1}^{N_{\text {tot}}-1} \left[ (\mathbf {e}_\mathrm{num}^{(k)})^T \mathbf {Q} (\mathbf {e}_\mathrm{num}^{(k)}) + (\mathbf {u}_{\text {ff}}^{(k)})^T \mathbf {R} (\mathbf {u}_{\text {ff}}^{(k)})\right] h \\&\text {s.t.}&\mathbf {c}_{\text {mot}}(\mathbf {\chi }) = \mathbf {res}(\mathbf {q}, \dot{\mathbf {q}}, \ddot{\mathbf {q}}, \mathbf {u}) \\&&\mathbf {A}_{\text {intg}} \mathbf {\chi } - \mathbf {b}_{\text {intg}} = 0 \\&&\mathbf {y}^{(N_{\text {tot}})} = \mathbf {y}_d(t^{(N_{\text {tot}})}), \end{aligned} \end{aligned}$$

where \( \mathbf {e}_\mathrm{num}^{(k)} = \mathbf {y}_d(t^{(k)}) - \mathbf {y}^{(k)}\) represents the output error of the system, \(\mathbf {res}(\mathbf {q}, \dot{\mathbf {q}}, \ddot{\mathbf {q}}, \mathbf {u})\) is the residual of the dynamic equation (i.e., the numerical error related to the dynamic equation (8) in the optimization process, which approaches zero once a solution of the optimization problem is found and is accounted for with the equality constraint \( \mathbf {c}_{\text {mot}}(\mathbf {\chi }) \approx 0\)), while \( \mathbf {A}_{\text {intg}} \mathbf {\chi } - \mathbf {b}_{\text {intg}} = 0\) is a system of linear equations representing the time integration based on the generalized-\(\alpha \) method corresponding to (11)–(13). The bound equation \(\mathbf {y}_d(t^{(N_{\text {tot}})}) = \mathbf {y}^{(N_{\text {tot}})}\) is the vector of terminal constraints related to the output at the end of the trajectory, \(\mathbf {u_{\text {ff}}}\) is the feedforward control, h is the time step and \(\mathbf {Q}\) and \(\mathbf {R}\) are weight matrices of the objective function. Note that \(\mathcal {P}_1\) represents a weak form of the trajectory tracking problem, since the system output is imposed through the performance index. Instead, a strong form would require defining equality constraints for the output. Finally, note that the initial condition is not required since the optimization algorithm computes it automatically. A direct computation is employed for the sensitivity analysis: the gradients \({\partial J / \partial \mathbf {\chi }}\) and \({\partial \mathbf {c_{\text {mot}}} / \partial \mathbf {\chi }}\) are computed for each variable and are combined for the entire set. Instead, the linear constraints and the bound constraints do not require any additional computation.
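To make the bookkeeping of the design vector concrete, the sketch below stacks the per-step blocks of (14) into \(\mathbf {\chi }\), recovers them again, and evaluates the weak-form performance index of \(\mathcal {P}_1\). It assumes a linear output \(\mathbf {y} = \mathbf {C}_{\text {out}}\mathbf {q}\) purely for illustration; the function names and the matrix \(\mathbf {C}_{\text {out}}\) are hypothetical.

```python
import numpy as np

def pack(q, qd, qdd, a, u_ff):
    """Stack per-step blocks into chi; q, qd, qdd, a are (N_tot, n), u_ff is (N_tot, m)."""
    return np.hstack([q, qd, qdd, a, u_ff]).ravel()

def unpack(chi, n, m, N_tot):
    """Inverse of pack: split chi into the five per-step arrays."""
    blocks = chi.reshape(N_tot, 4 * n + m)
    q, qd, qdd, a = (blocks[:, i * n:(i + 1) * n] for i in range(4))
    return q, qd, qdd, a, blocks[:, 4 * n:]

def tracking_cost(chi, y_des, C_out, Q, R, h, n, m, N_tot):
    """Performance index of P_1 for the assumed linear output y = C_out q."""
    q, _, _, _, u_ff = unpack(chi, n, m, N_tot)
    e = y_des - q @ C_out.T                       # output error at every step
    stage = (np.einsum("ki,ij,kj->k", e, Q, e)
             + np.einsum("ki,ij,kj->k", u_ff, R, u_ff))
    return h * stage[:-1].sum()                   # sum over k = 1, ..., N_tot - 1

# illustrative dimensions: n = 2 coordinates, m = 1 input, N_tot = 4 steps
n, m, N_tot, h = 2, 1, 4, 0.1
chi = np.zeros(N_tot * (4 * n + m))
J = tracking_cost(chi, np.ones((N_tot, 1)), np.array([[1.0, 0.0]]),
                  np.array([[2.0]]), np.array([[1.0]]), h, n, m, N_tot)
```

In a full implementation the same layout would index the rows of \(\mathbf {A}_{\text {intg}}\) and the entries of \(\mathbf {c}_{\text {mot}}\), so that the constraint Jacobians are sparse and block-banded.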

3.4 Feedback control design: model predictive control

MPC can be seen as a type of feedback control law that predicts the dynamics of a system: this is achieved by considering constraints for states and control input in closed-loop and by minimizing a performance index. MPC requires solving various optimal control problems in a receding horizon recursively, where, after obtaining a solution at each feedback loop, just the input command at the first time step is applied to the plant. The subset of the design variables at each time step i is defined as

$$\begin{aligned} \mathbf {\chi }^{(k|i)} = \left( \mathbf {q}^{(k|i)} \dot{\mathbf {q}}^{(k|i)} \ddot{\mathbf {q}}^{(k|i)} \mathbf {a}^{(k|i)} \mathbf {u}^{(k|i)}\right) ^\mathrm{T}, \end{aligned}$$
(15)

with \(i=1,\ldots ,N_\mathrm{tot}-k\). Then, the set of design variables including all time steps is \(\mathbf {\chi } = (\mathbf {\chi }^{(1)}, \mathbf {\chi }^{(2)}, \ldots \mathbf {\chi }^{(N_\mathrm{tot} - k)})^T\) and its dimension is \((N_\mathrm{tot} - k)(4n + m)\) at each iteration. After discretization, the nonlinear programming problem takes the general form

$$\begin{aligned} \begin{aligned}&\mathcal {P}_2: \\&\underset{\mathbf {\chi }}{\text {min}}&J = \sum _{i = 1}^{N_\mathrm{tot}-(k+1)} \left[ (\mathbf {e}^{(k|i)})^T \mathbf {Q} (\mathbf {e}^{(k|i)}) \right. \\&&\left. + (\Delta \mathbf {u}^{(k|i)})^T \mathbf {R} (\Delta \mathbf {u}^{(k|i)}) \right] h \\&\text {s.t.}&\mathbf {c}_{\text {mot}}(\mathbf {\chi }) = \mathbf {res}(\mathbf {q}, \dot{\mathbf {q}}, \ddot{\mathbf {q}}, \mathbf {u}) \\&&\mathbf {A}_{\text {intg}} \mathbf {\chi } - \mathbf {b}_{\text {intg}} = 0 \\&&\mathbf {q}^{(k|1)} = \mathbf {q}_{\text {out}}^{(k)} \\&&\mathbf {y}^{(N_\mathrm{tot}-k)} = \mathbf {y}_d(t^{(N_\mathrm{tot}-k)}), \end{aligned} \end{aligned}$$

where \(\mathbf {res}(\mathbf {q}, \dot{\mathbf {q}}, \ddot{\mathbf {q}}, \mathbf {u})\) is the residual of the dynamic equation (8), \(\mathbf {e}^{(k|i)} = \mathbf {y}_d(t^{(k|i)}) - \mathbf {y}^{(k|i)}\) represents the output error of the system, \(\mathbf {Q}\) and \(\mathbf {R}\) are weight matrices, while \(\Delta \mathbf {u}^{(k|i)} = \mathbf {u}^{(k|i+1)} - \mathbf {u}^{(k|i)} \) and h is the time step. Unlike in \(\mathcal {P}_1\), the vector of equality constraints corresponding to the discretized path for \(\mathcal {P}_2\) is \(\mathbf {c}_{\text {mot}}(\mathbf {\chi }) = (\mathbf {c}_{\text {mot}}^{(1)}, \ldots , \mathbf {c}_{\text {mot}}^{(N_\mathrm{tot}-k)})^T\).
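The receding-horizon logic of \(\mathcal {P}_2\) can be sketched on a deliberately simple case. The example below assumes a linear, unconstrained, single-input double integrator, so that each horizon problem reduces to a linear least-squares solve; it is not the nonlinear program used in this work, the terminal constraint is not imposed explicitly, and all names and numerical values are hypothetical. It retains the two features described above: the horizon shrinks as \(N_{\text {tot}} - k\), and only the first input of each solution is applied.

```python
import numpy as np

def prediction_matrices(A, B, C, N):
    """Phi, Gamma such that (y_1, ..., y_N) = Phi x0 + Gamma (u_0, ..., u_{N-1})."""
    n = A.shape[0]
    Phi, Gamma = np.zeros((N, n)), np.zeros((N, N))
    A_pow, impulse = np.eye(n), []
    for i in range(N):
        impulse.append((C @ A_pow @ B).item())    # C A^i B
        A_pow = A @ A_pow
        Phi[i] = (C @ A_pow).ravel()              # C A^(i+1)
    for i in range(N):
        for j in range(i + 1):
            Gamma[i, j] = impulse[i - j]
    return Phi, Gamma

def mpc_step(A, B, C, x, y_des, q_w, r_w):
    """Minimize q_w*|y_des - y|^2 + r_w*|Delta u|^2 over the horizon; return first input."""
    N = len(y_des)
    Phi, Gamma = prediction_matrices(A, B, C, N)
    D = np.diff(np.eye(N), axis=0)                # rows give u_{i+1} - u_i
    lhs = np.vstack([np.sqrt(q_w) * Gamma, np.sqrt(r_w) * D])
    rhs = np.concatenate([np.sqrt(q_w) * (y_des - Phi @ x), np.zeros(N - 1)])
    U, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return U[0]

h, N_tot = 0.1, 30
A = np.array([[1.0, h], [0.0, 1.0]])
B = np.array([[0.5 * h**2], [h]])
C = np.array([[1.0, 0.0]])
y_ref = np.ones(N_tot)                            # desired output at steps 1..N_tot
x = np.zeros(2)
for k in range(N_tot):
    u = mpc_step(A, B, C, x, y_ref[k:], q_w=1.0, r_w=0.01)   # horizon N_tot - k
    x = A @ x + B.ravel() * u                     # apply only the first input
y_final = (C @ x).item()
```

In the nonlinear setting each call to `mpc_step` would be replaced by a solve of \(\mathcal {P}_2\), warm-started from the previous solution to limit the online computational cost.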

4 Dynamic tube-MPC for underactuated systems

The proposed dynamic tube-MPC combines two powerful control methodologies, namely IDA-PBC and MPC, resulting in a modular algorithm that can be applied to different systems by tailoring the analytical expression of the ancillary control. In addition, a dynamic tube equation akin to a first-order filter is defined according to an energy-based approach. Finally, the tube parameters are treated as new design variables in the optimization problem, while new constraints and a new performance index are employed in the tube-MPC algorithm, thus resulting in a different optimization problem compared to \(\mathcal {P}_2\).

To streamline the presentation, the first part of this section provides a brief overview of IDA-PBC, the second part considers the case of matched disturbances (i.e. only affecting the actuated position coordinates), while the third part considers both matched and unmatched disturbances (i.e. affecting actuated and unactuated coordinates). The last part summarizes the complete dynamic tube-MPC algorithm for underactuated mechanical systems.

4.1 Overview of IDA-PBC for underactuated mechanical systems

The system dynamics (8) can be expressed in port-controlled Hamiltonian form by defining the total energy as

$$\begin{aligned} H(\mathbf {q},\mathbf {p}) = \frac{1}{2} \mathbf {p}^T \mathbf {M}^{-1} \mathbf {p} + V(\mathbf {q}), \end{aligned}$$
(16)

where the first term is the kinetic energy and the second is the potential energy. The system states are the position \(\mathbf {q}\) and the momenta \(\mathbf {p} = \mathbf {M}(\mathbf {q})\dot{\mathbf {q}}\). The Hamiltonian is positive definite and radially unbounded; thus, it is a suitable Lyapunov candidate function. The open-loop dynamics (8) in the presence of a physical damping matrix \(\mathbf {D} = \mathbf {D}^T > 0\) characterizing the model and of the disturbances \(\delta \) is then

$$\begin{aligned} \left[ \begin{array}{c} \dot{\mathbf {q}} \\ \dot{\mathbf {p}} \end{array}\right]&= \left[ \begin{array}{cc} 0 &{} \mathbf {I}^n \\ -\mathbf {I}^n &{} -\mathbf {D} \end{array}\right] \left[ \begin{array}{c} \nabla _q H\\ \nabla _p H \end{array}\right] + \left[ \begin{array}{c}0\\ \mathbf {G}(\mathbf {q})\end{array}\right] \,\mathbf {u} - \left[ \begin{array}{c} 0 \\ \mathbf {\delta } \end{array}\right] , \end{aligned}$$
(17)
$$\begin{aligned} \mathbf {y}&= \mathbf {G}^T(\mathbf {q}) \nabla _p H(\mathbf {q},\mathbf {p}). \end{aligned}$$
(18)

The control input is \(\mathbf {u} \in \mathbb {R}^{m}\), and the input matrix is \(\mathbf {G}\left( \mathbf {q} \right) \in \mathbb {R}^{n \times m}\), with \(\text {rank}\left( \mathbf {G} \right) = m < n\) for all \(\mathbf {q} \in \mathbb {R}^{n}\). Note that \(\mathbf {G} \) in (17) corresponds to \(\mathbf {A}\) in (9). The control aim typically corresponds to stabilizing the equilibrium \((\mathbf {q},\mathbf {p}) = (\mathbf {q}^*,0)\), which can be unstable in open-loop and satisfies the condition \(\nabla _q V(\mathbf {q}) = 0\); thus, it is a regulation problem.

In the absence of disturbances, the IDA-PBC control law is constructed to achieve the closed-loop dynamics

$$\begin{aligned} \begin{aligned} \begin{bmatrix} \dot{\mathbf {q}} \\ \dot{\mathbf {p}} \\ \end{bmatrix} = \begin{bmatrix} 0 &{} \mathbf {M}^{- 1}\mathbf {M_{d}} \\ - \mathbf {M_{d}}\mathbf {M}^{- 1} &{} \mathbf {J_{2}}-\mathbf {D_{d}} \\ \end{bmatrix}\begin{bmatrix} \nabla _{q}H_{d} \\ \nabla _{p}H_{d} \\ \end{bmatrix} - \begin{bmatrix} 0 \\ \delta \\ \end{bmatrix}, \end{aligned} \end{aligned}$$
(19)

where \(H_{d} = \frac{1}{2}\mathbf {p}^\mathrm{T}\mathbf {M_{d}}^{- 1}\mathbf {p} + V_{d}\) and \(\mathbf {q^{*}} = \text {argmin}\left( V_{d} \right) \) corresponds to a strict minimizer of the closed-loop potential energy \(V_{d}\). The closed-loop damping in (19) is defined as \(\mathbf {D_{d}}=(\mathbf {G}k_{v}\mathbf {G}^\mathrm{T} + \mathbf {DM^{- 1}M_{d}})\). The term \(\mathbf {M_{d}} = \mathbf {M_{d}}^\mathrm{T} > 0\) is the closed-loop inertia matrix, \(\mathbf {J_{2}} = - \mathbf {J_{2}}^\mathrm{T}\) is a free-parameter matrix typically defined as a linear function of the momenta, and \(k_{v} = k_{v}^\mathrm{T} > 0\) is a constant gain matrix. The prescribed equilibrium \(\mathbf {q^{*}}\) is asymptotically stable provided that \(\mathbf {D_{d}}>0\) [28]. This condition is always met for certain classes of mechanical systems, including systems with constant inertia matrix, or in case the kinetic shaping is achieved with \(\mathbf {M_{d}}=k_{T} \mathbf {M}\) for some \(k_{T}>0\) as in [26]. Introducing the pseudo-inverse \(\mathbf {G}^{\dagger } = {\left( \mathbf {G}^\mathrm{T}\mathbf {G}\right) }^{- 1}\mathbf {G}^\mathrm{T}\), the IDA-PBC control law that achieves the closed-loop dynamics (19) is expressed as the sum of an energy-shaping component \(u_{\text {es}}\), which assigns the closed-loop equilibrium \(\mathbf {q^{*}}\), and of a damping-assignment component \(u_{\text {di}}\), which injects damping in the system

$$\begin{aligned} u_{\text {ida-pbc}}= & {} u_{\text {es}} + u_{\text {di}}, \nonumber \\ u_{\text {es}}= & {} \mathbf {G}^{\dagger }\left( \nabla _{q}H - \mathbf {M_{d}M}^{- 1}\nabla _{q}H_{d} + \mathbf {J_{2}M_{d}^{- 1}p} \right) , \nonumber \\ u_{\text {di}}= & {} - k_{v}\mathbf {G}^\mathrm{T}\nabla _{p}H_{d}. \end{aligned}$$
(20)

The terms \(\mathbf {M_{d}}\) and \(V_{d}\) should satisfy the following partial-differential-equations (PDEs), where \(\mathbf {G}^{\bot }\) is a full-rank left annihilator of \(\mathbf {G}\), that is \(\mathbf {G}^{\bot }\mathbf {G} = 0\) and \(\text {rank}\left( \mathbf {G}^{\bot } \right) = n - m\):

$$\begin{aligned} 0= & {} \mathbf {G}^{\bot } \left( \nabla _{q}(\mathbf {p}^\mathrm{T}\mathbf {M}^{- 1}\mathbf {p}) - \mathbf {M_{d}M}^{- 1}\nabla _{q}(\mathbf {p}^\mathrm{T}\mathbf {M_{d}}^{- 1}\mathbf {p})\right) \nonumber \\&+ \mathbf {G}^{\bot } \left( 2\mathbf {J_{2}M_{d}}^{- 1}\mathbf {p} \right) , \end{aligned}$$
(21)
$$\begin{aligned} 0= & {} \mathbf {G}^{\bot }\left( \nabla _{q}V - \mathbf {M_{d}M}^{- 1}\nabla _{q}V_{d} \right) . \end{aligned}$$
(22)

If (21)–(22) are satisfied \(\forall \left( \mathbf {q,p} \right) \in \mathbb {R}^{2n}\) and if \(\delta = 0\), then the equilibrium \(\left( \mathbf {q,p} \right) = \left( \mathbf {q^{*}},0 \right) \) is locally stable. If \(\nabla _{q}V_{d}\left( \mathbf {q^{*}} \right) = 0\) and \(\nabla _{q}^{2}V_{d}\left( \mathbf {q^{*}} \right) > 0\), the equilibrium is a strict-minimizer of \(V_{d}\). Finally, asymptotic stability is concluded if the output \({\mathbf {y} = \mathbf {G}}^\mathrm{T}\nabla _{p}H_{d}\) is detectable even if \(\mathbf {D}=0\) [44].
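The algebraic objects used in (20)–(22) can be checked numerically with a few lines. The sketch below assumes a hypothetical input matrix with \(n = 2\) and \(m = 1\), computes the pseudo-inverse \(\mathbf {G}^{\dagger }\) and one possible left annihilator \(\mathbf {G}^{\bot }\) (obtained here from a singular value decomposition, which is only one of several valid constructions), and evaluates the damping term \(u_{\text {di}}\) using \(\nabla _{p}H_{d} = \mathbf {M_{d}}^{-1}\mathbf {p}\). The numerical values are illustrative assumptions.

```python
import numpy as np

# hypothetical input matrix (n = 2 coordinates, m = 1 actuator)
G = np.array([[-1.0], [1.0]])
n, m = G.shape

# pseudo-inverse G^dagger = (G^T G)^{-1} G^T, so that G^dagger G = I_m
G_pinv = np.linalg.inv(G.T @ G) @ G.T

# left annihilator: rows spanning the orthogonal complement of range(G)
U, _, _ = np.linalg.svd(G, full_matrices=True)
G_perp = U[:, m:].T

def damping_injection(G, M_d, p, k_v):
    """u_di = -k_v G^T grad_p H_d, with grad_p H_d = M_d^{-1} p."""
    grad_p_Hd = np.linalg.solve(M_d, p)
    return -k_v @ (G.T @ grad_p_Hd)

# illustrative closed-loop inertia, momenta, and gain (assumptions)
u_di = damping_injection(G, np.eye(2), np.array([1.0, 3.0]), np.array([[0.5]]))
```

Since \(\mathbf {G}^{\bot }\mathbf {G} = 0\), premultiplying the matching condition by \(\mathbf {G}^{\bot }\) removes the control input, which is how the PDEs (21) and (22) arise.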

A recent extension of IDA-PBC for tracking control was presented in [51] for systems with constant inertia matrix and constant damping matrix. Further to that, a path-following controller that employs the Immersion and Invariance (I&I) approach was proposed in [52]. However, the latter does not guarantee the satisfaction of time constraints.

4.2 Ancillary control for systems with matched disturbances

In this section, the IDA-PBC methodology is employed to design the ancillary control by focusing on underactuated mechanical systems with matched disturbances. Thus, the following assumption is introduced.

Assumption 1

The disturbances \(\delta \) are matched, bounded, and null at equilibrium (i.e., \(\delta = \mathbf {G}\delta _{0}\), \(\left| \delta _{0} \right| < \varepsilon \), where \(\varepsilon \) is known).

Considering that \(H_{d}\) is globally positive definite and radially unbounded, we define a sub-domain of attraction for the closed-loop system (19) as

$$\begin{aligned} \Omega _{c} = \left\{ (\mathbf {q,p}) \in \mathbb {R}^{2n}|H_{d}\left( \mathbf {q,p} \right) < c \right\} , \end{aligned}$$
(23)

where c is a positive scalar time-varying parameter representing the width of the dynamic tube. Once the system trajectory enters the sub-domain of attraction (23), it should remain there indefinitely. To this end, the closed-loop energy \(H_{d}\) should decrease at a faster rate than the scalar c, that is \({\dot{H}}_{d} \le \dot{c}\). This first condition is employed to define the time-derivative of c. Conversely, in case the system trajectory lies outside the sub-domain of attraction, an additional control action is defined to bring the system states to \(\Omega _{c}\). The resulting controller design is detailed in the following proposition, which represents the first theoretical contribution of this work.

Proposition 1

Consider system (17) under Assumption 1 in closed-loop with the IDA-PBC control law

$$\begin{aligned} u_{\text {ida-pbc}}= & {} u_{\text {es}} + u_{\text {di}} \quad H_{d}\left( \mathbf {q,p} \right) < c, \nonumber \\ u_{\text {ida-pbc}}= & {} u_{\text {es}} + u_{\text {di}} + u_{0} \quad H_{d}\left( \mathbf {q,p} \right) \ge c, \end{aligned}$$
(24)

where \(u_{\text {es}}\), \(u_{\text {di}}\) are defined in (20), \( k_{v} \ge \alpha c\), \(\alpha \) is a parameter, while \(u_{0}\) and \(\dot{c}\) are defined as

$$\begin{aligned} u_{0}= & {} - \alpha c \left( \mathbf {G}^\mathrm{T}\nabla _{p}H_{d}\right) , \nonumber \\ \dot{c}= & {} \left| \nabla _{p}H_{d}^\mathrm{T}\mathbf {G} \right| \varepsilon - \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {G}\mathbf {G}^\mathrm{T}\nabla _{p}H_{d} \right) c. \end{aligned}$$
(25)

Assume in addition that \(\mathbf {D_{d}}>0\) and that \(\mathbf {q^{*}} = \text {argmin}(V_{d})\). Then, all system trajectories originating in \(\Omega _{c}\) remain in \(\Omega _{c}\) indefinitely. In addition, all system trajectories starting outside \(\Omega _{c}\) converge to \(\Omega _{c}\) asymptotically.
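The switching logic in (24) and the tube dynamics (25) can be sketched as two small functions of the passive output \(\mathbf {y} = \mathbf {G}^\mathrm{T}\nabla _{p}H_{d}\). The sketch assumes that \(\left| \cdot \right| \) denotes the Euclidean norm when \(m > 1\) and that \(H_{d}\) and \(\mathbf {y}\) have already been evaluated; the function names and numerical values are hypothetical.

```python
import numpy as np

def tube_rate(y, c, eps, alpha):
    """c_dot from (25): |y| * eps - alpha * (y^T y) * c, with y = G^T grad_p H_d."""
    return np.linalg.norm(y) * eps - alpha * float(y @ y) * c

def extra_control(y, c, alpha, H_d):
    """u_0 from (25), applied only outside the tube (H_d >= c) as in (24)."""
    if H_d >= c:
        return -alpha * c * y
    return np.zeros_like(y)

# illustrative values (assumptions)
y = np.array([2.0])
c_dot = tube_rate(y, c=0.5, eps=0.1, alpha=1.0)
u0_outside = extra_control(y, c=0.5, alpha=1.0, H_d=1.0)
u0_inside = extra_control(y, c=0.5, alpha=1.0, H_d=0.1)
```

For a frozen nonzero output, the tube rate vanishes at \(c = \varepsilon /(\alpha \left| \mathbf {y} \right| )\), which mirrors the steady state \(\Delta /\alpha \) of the first-order filter (6).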

Proof

Computing the time derivative of \(H_{d}\) and substituting (19) yields

$$\begin{aligned} {\dot{H}}_{d} = - \nabla _{p}H_{d}^\mathrm{T}\left( \mathbf {D_{d}}\right) \nabla _{p}H_{d} - \nabla _{p}H_{d}^\mathrm{T}\mathbf {G}\delta _{0}. \end{aligned}$$
(26)

A sufficient condition for the system trajectories starting in \(\Omega _{c}\) to remain in \(\Omega _{c}\) indefinitely is given by \({\dot{H}}_{d} < \dot{c}\). Computing the former inequality while substituting (26) and \(\mathbf {D_{d}}\) yields

$$\begin{aligned} \begin{aligned} {\dot{H}}_{d}&\le - \nabla _{p}H_{d}^\mathrm{T}\left( \mathbf {G}k_{v}\mathbf {G}^\mathrm{T} +\mathbf { DM^{- 1}M_{d}} \right) \nabla _{p}H_{d} \\&\quad + \left| \nabla _{p}H_{d}^\mathrm{T}\mathbf {G} \right| \left| \delta _{0} \right| < \dot{c}. \end{aligned} \end{aligned}$$
(27)

Define the tube dynamics as in (25), which represents a stable first-order filter of the disturbance bound \(\varepsilon \) and is similar in structure to the definition of \(\dot{\varphi }\) in [36]. Substituting \(\dot{c}\) from (25) into (27) and refactoring common terms yields

$$\begin{aligned} \begin{aligned} {\dot{H}}_{d}&\le - \nabla _{p}H_{d}^\mathrm{T}\left( \mathbf {G}k_{v}\mathbf {G}^\mathrm{T} +\mathbf { DM^{- 1}M_{d}} \right) \nabla _{p}H_{d} \\&\quad + \left| \nabla _{p}H_{d}^\mathrm{T}\mathbf {G} \right| \left| \delta _{0} \right| < - \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d} \right) c \\&\quad + \left| \nabla _{p}H_{d}^\mathrm{T}\mathbf {G} \right| \varepsilon , \end{aligned} \end{aligned}$$
(28)

which is verified for all \(\alpha c \le k_{v}\) since \(\mathbf {D_{d}}>0 \) and \( \left| \delta _{0} \right| < \varepsilon \) by hypothesis (see Assumption 1). Thus, if the trajectory of the closed-loop system with control input (24) enters the sub-domain of attraction, then it will remain there.

If instead the system trajectory is outside the sub-domain of attraction, that is \(H_{d} > c\), we define the new Lyapunov function candidate \(W = H_{d} - c\) which according to (23) is positive definite in this case. Computing the time-derivative of W and substituting \(\dot{c}\) from (25) yields

$$\begin{aligned} \begin{aligned} \dot{W}&\le - \nabla _{p}H_{d}^\mathrm{T}\left( \mathbf {D_{d}}\right) \nabla _{p}H_{d} + \nabla _{p}H_{d}^\mathrm{T}\mathbf {G}u_{0} \\&\quad + \left| \nabla _{p}H_{d}^\mathrm{T}\mathbf {G} \right| \left( \left| \delta _{0} \right| - \varepsilon \right) \\&\quad + \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d} \right) c. \end{aligned} \end{aligned}$$
(29)

Simplifying terms in (29) gives

$$\begin{aligned} \begin{aligned} \dot{W}&\le - \nabla _{p}H_{d}^\mathrm{T}\left( \mathbf {D_{d}} \right) \nabla _{p}H_{d} + \nabla _{p}H_{d}^\mathrm{T}\mathbf {G}u_{0} \\&\quad + \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d} \right) c. \end{aligned} \end{aligned}$$
(30)

Substituting \(u_{0}\) from (25) into (30) finally yields \(\dot{W} \le - \nabla _{p}H_{d}^\mathrm{T}\mathbf {D_{d}}\nabla _{p}H_{d} \le 0 \). Thus, the storage function W converges to zero and the system trajectory converges to the dynamic tube asymptotically. \(\square \)
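The switching law (24) and the tube dynamics (25) can be illustrated with a minimal numerical sketch for a single-input system. Here `s` stands for the scalar \(\mathbf {G}^\mathrm{T}\nabla _{p}H_{d}\), the terms \(u_{\text {es}}\) and \(u_{\text {di}}\) are supplied by the caller, and all function names are hypothetical. The integration with a frozen value of `s` is only meant to show the filter behaviour of (25), not the closed-loop response.

```python
def tube_rate(s: float, c: float, alpha: float, eps: float) -> float:
    """Right-hand side of the tube dynamics (25) for a single input.

    s stands for the scalar G^T grad_p H_d (assumption of this sketch:
    one actuator, so the quadratic term reduces to s**2).
    """
    return abs(s) * eps - alpha * s * s * c


def ancillary_control(u_es: float, u_di: float, s: float, h_d: float,
                      c: float, alpha: float) -> float:
    """Switching law (24): u_0 = -alpha*c*s is added only when H_d >= c."""
    u = u_es + u_di
    if h_d >= c:
        u += -alpha * c * s
    return u


def integrate_tube(s: float, c0: float, alpha: float, eps: float,
                   h: float, steps: int) -> float:
    """Forward-Euler integration of (25) with s frozen (illustration only)."""
    c = c0
    for _ in range(steps):
        c += h * tube_rate(s, c, alpha, eps)
    return c
```

With `s` held constant and nonzero, the width settles at \(\varepsilon /(\alpha |s|)\), which is the steady state of the first-order filter in (25).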

Remark 1

The control law (24) is in fact a standard IDA-PBC with a discontinuous and time-varying damping-injection term. Comparing (20) and (24) shows that the latter could be expressed as \(u_{\text {ida-pbc}} = u_{\text {es}} + u_{\text {di}}^{'}\), where \(u_{\text {di}}^{'} = - k_{v}^{'}G^\mathrm{T}\nabla _{p}H_{d}\) and \(k_{v}^{'} = k_{v} + \alpha c\) for the case \(H_{d} \ge c\). The difference from the conventional damping injection is that \(\alpha \) is part of the design variables and is computed by the dynamic tube-MPC algorithm. Note that, since \(\alpha c > 0\), then \(k_{v}^{'} > k_{v}\), and provided that \(\mathbf {DM}^{- 1}\mathbf {M_{d}} > 0\), it would be possible to set \(k_{v} = 0\) such that \(k_{v}^{'} = \alpha c\) when \(H_{d}\ge c\). Thus, the damping assignment would only occur when the system trajectory is outside the dynamic tube.
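The equivalence stated in Remark 1 can be checked with a short sketch. It assumes a single input, with `s` denoting the scalar \(\mathbf {G}^\mathrm{T}\nabla _{p}H_{d}\); the function names are hypothetical.

```python
def effective_damping_gain(k_v: float, alpha: float, c: float,
                           outside_tube: bool) -> float:
    """Gain k_v' of Remark 1: k_v + alpha*c outside the tube, k_v inside."""
    return k_v + alpha * c if outside_tube else k_v


def damping_injection(k_v: float, s: float) -> float:
    """Damping injection -k_v * s, with s the scalar G^T grad_p H_d."""
    return -k_v * s
```

Outside the tube, `damping_injection(k_v, s) - alpha * c * s` coincides with `damping_injection(effective_damping_gain(k_v, alpha, c, True), s)`. Setting `k_v = 0` leaves damping injection active only outside the tube, as noted in the remark.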

Remark 2

The control law (24) employs a robust-control approach (i.e., the disturbance is assumed bounded and the dynamic tube depends on the bound \(\varepsilon \)), which is conservative by nature. We have shown in [24, 27] that linearly parameterized disturbances can be estimated adaptively from the open loop (17). Thus, the control law could include the adaptive estimate of \(\delta _{0}\) and the boundedness assumption could be relaxed. The assumption of matched disturbances is relatively strong in practice (i.e., Assumption 1). While unmatched disturbances can be treated in a similar way to matched disturbances for specific classes of mechanical systems such as flexible continuum manipulators [25, 26], this is not the case in general. In particular, extending (24) to unmatched disturbances is not trivial since the rank-deficient term \(\nabla _{p}H_{d}^\mathrm{T}\mathbf {G}u_{0}\) cannot cancel the term \(\alpha \left( \nabla _{p}H_{d}^\mathrm{T}\nabla _{p}H_{d} \right) c\), which is full-rank. To address this issue, an alternative implementation of the dynamic tube and of the ancillary control for the case of matched and unmatched disturbances is proposed in the next sub-section.

4.3 Ancillary control for systems with matched and unmatched disturbances

In order to investigate the case of matched and unmatched disturbances, an adaptive observer is included in the ancillary control law and Assumption 1 is replaced in this section by the following.

Assumption 2

The disturbances include matched and unmatched components which are unknown but bounded (with unknown bound), and their time-derivative is bounded, that is \(\left| \dot{\delta } \right| < \mu \left| \mathbf {p} \right| \) for some known \(\mu > 0\).

In addition, we define a new sub-domain of attraction as

$$\begin{aligned} \Omega _{c}^{'} = \left\{ (\mathbf {q,p},z) \in \mathbb {R}^{3n}|H_{d}^{'}\left( \mathbf {q,p},z \right) < c \right\} , \end{aligned}$$
(31)

where c is the width of the dynamic tube. The new storage function is \(H_{d}^{'} = \frac{1}{2}\mathbf {p}^\mathrm{T}\mathbf {M_{d}}^{- 1}\mathbf {p} + V_{d}^{'} + \frac{1}{2}z^\mathrm{T}z\), where \(\nabla _{q}V_{d}^{'} = \nabla _{q}V_{d} + \Lambda \left( \mathbf {q} \right) \), and the estimation error of the adaptive observer is \(z = \widehat{\delta } + \beta \left( \mathbf {p} \right) - \delta \). The term \(\Lambda (\mathbf {q})\) can be interpreted as a vector of closed-loop non-conservative forces, and it is computed from the following set of algebraic matching equations and strict-minimizer condition [24]

$$\begin{aligned} \mathbf {G}^{\bot }\left( \widetilde{\delta } - \mathbf {M_{d}M}^{- 1}\Lambda (\mathbf {q} ) \right)= & {} 0, \nonumber \\ \nabla _{q}V_{d}(\mathbf {q^{*}}) + \Lambda \left( \mathbf {q^{*}} \right)= & {} 0. \end{aligned}$$
(32)

The adaptive estimate of \(\delta \) is \(\widetilde{\delta } = \widehat{\delta } + \beta \left( \mathbf {p} \right) \), where according to the I&I methodology [3]

$$\begin{aligned} \dot{\widehat{\delta }}= & {} - \nabla _{p}\beta ^\mathrm{T}\left( - \nabla _{q}H - \mathbf {D} \nabla _{p}H + \mathbf {G} u_{\text {ida-pbc}} - \widetilde{\delta }\right) , \nonumber \\ \beta= & {} - \gamma \mathbf {p}, \end{aligned}$$
(33)

where \(\gamma >0 \) is a tuning parameter.
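The behaviour of the estimator (33) can be illustrated on a toy example. The sketch below assumes a hypothetical one-degree-of-freedom plant with unit mass whose momentum dynamics read \(\dot{p} = f - \delta \), with \(f = - \nabla _{q}H - D\nabla _{p}H + Gu\), and a constant disturbance. The plant data, the step size and the function name are illustrative and are not taken from the examples of this work.

```python
def simulate_observer(delta: float = 2.0, gamma: float = 5.0,
                      h: float = 1e-3, steps: int = 5000) -> float:
    """Return the final estimate delta_tilde for a constant disturbance.

    Assumed toy plant (not from the source): unit mass, spring k, damper d,
    momentum dynamics p_dot = f - delta with f = -k*q - d*p + u.
    """
    k, d, u = 1.0, 0.5, 0.0              # hypothetical plant data
    q, p = 0.0, 0.0
    delta_hat = 0.0
    for _ in range(steps):
        f = -k * q - d * p + u           # -grad_q H - D grad_p H + G u
        delta_tilde = delta_hat - gamma * p   # delta_hat + beta(p), beta = -gamma*p
        # (33) with grad_p beta = -gamma: delta_hat_dot = gamma*(f - delta_tilde)
        delta_hat_dot = gamma * (f - delta_tilde)
        q_dot = p                        # unit mass
        p_dot = f - delta
        q += h * q_dot
        p += h * p_dot
        delta_hat += h * delta_hat_dot
    return delta_hat - gamma * p
```

For a constant disturbance the estimation error obeys \(\dot{z} = - \gamma z\), so the returned estimate approaches `delta` at a rate set by \(\gamma \).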

Proposition 2

Consider system (17) under Assumption 2 in closed-loop with the IDA-PBC control law

$$\begin{aligned} \begin{array}{llll} u_{\text {ida-pbc}} &{} = &{} u_{\text {es}} + u_{\text {di}} + u_{\text {adpt}} &{}\quad H_{d}^{'}\left( \mathbf {q,p},z \right) < c, \\ u_{\text {ida-pbc}} &{} = &{} u_{\text {es}} + u_{\text {di}} +u_{\text {adpt}} + u_{0} &{}\quad H_{d}^{'}\left( \mathbf {q,p},z \right) \ge c, \\ \end{array} \end{aligned}$$
(34)

where \( k_{v} \ge \alpha c\) and \(u_{es},u_{di}\) are defined in (20). In addition, \(u_{\text {adpt}}\) is given as

$$\begin{aligned} u_{\text {adpt}} = \mathbf {G}^{\dagger }\left( \widetilde{\delta } - \mathbf {M_{d}M}^{- 1}\Lambda \left( \mathbf {q} \right) \right) . \end{aligned}$$
(35)

Finally, \(u_{0}\) and \(\dot{c}\) are defined as

$$\begin{aligned} u_{0}= & {} - \alpha c \left( \mathbf {G}^\mathrm{T}\nabla _{p}H_{d}\right) , \nonumber \\ \dot{c}= & {} \mu ^{2}\mathbf {p}^\mathrm{T}\mathbf {p} - \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d} \right) c. \end{aligned}$$
(36)

Assume also that there exist some \(k_{v},\gamma ,\mu \) such that

$$\begin{aligned}&\gamma \left( \mathbf {G}k_{v}\mathbf {G}^\mathrm{T} +\mathbf { DM^{- 1}M_{d}}\right) - \frac{1}{4}(\mathbf {I}^n + \mu \mathbf {M_{d}})^{2}> 0, \nonumber \\&\gamma \left( \mathbf {D M^{-1}M_{d}}\right) - \frac{1}{4}(\mathbf {I}^n + \mu \mathbf {M_{d}})^{2} > 0, \end{aligned}$$
(37)

and that the equilibrium \(\mathbf {q^{*}} = \text {argmin}(V_{d})\) is assignable, that is

$$\begin{aligned} G^{\bot }\left( \nabla _{q}V(\mathbf {q^{*}}) + \delta \right) = 0. \end{aligned}$$

Then, all system trajectories originating in \(\Omega _{c}^{'}\) remain in \(\Omega _{c}^{'}\) indefinitely. In addition, all system trajectories starting outside \(\Omega _{c}^{'}\) converge to \(\Omega _{c}^{'}\) asymptotically.

Proof

Computing the time-derivative of z while substituting (33) yields \(\dot{z} = - \gamma z -\dot{\delta }\). The closed-loop dynamics with the IDA-PBC control law (34) is then given by

$$\begin{aligned} \begin{aligned} \begin{bmatrix} \dot{\mathbf {q}} \\ \dot{\mathbf {p}} \\ \end{bmatrix} = \begin{bmatrix} 0 &{} \mathbf {M^{- 1}M_{d}} \\ - \mathbf {M_{d}M^{- 1}} &{} \mathbf {J_{2}} - \mathbf {D_{d}} \\ \end{bmatrix} \begin{bmatrix} \nabla _{q}H_{d} + \Lambda (\mathbf {q}) \\ \nabla _{p}H_{d} \\ \end{bmatrix} + \begin{bmatrix} 0\\ z \\ \end{bmatrix}. \end{aligned} \end{aligned}$$
(38)

Computing the time derivative of \(H_{d}^{'}\) along the trajectories of the closed-loop system while substituting (38) yields

$$\begin{aligned} \begin{aligned} \dot{H_{d}^{'}} = - \nabla _{p}H_{d}^\mathrm{T}\mathbf {D_{d}}\nabla _{p}H_{d} + \nabla _{p}H_{d}^\mathrm{T}z - \gamma z^\mathrm{T}z - z^\mathrm{T}\dot{\delta }. \end{aligned} \end{aligned}$$
(39)

Substituting into (39) \(\left| \dot{\delta } \right| < \mu \left| p \right| \) for some \(\mu > 0\) (see Assumption 2) yields

$$\begin{aligned} \begin{aligned} \dot{H_{d}^{'}}&\le - \nabla _{p}H_{d}^\mathrm{T} \mathbf {D_{d}}\nabla _{p}H_{d} - \gamma z^\mathrm{T}z \\&\quad + \nabla _{p}H_{d}^\mathrm{T}(\mathbf {I}^n + \mu \mathbf {M_{d}})z , \end{aligned} \end{aligned}$$
(40)

which can be refactored as

$$\begin{aligned} \begin{aligned} \dot{H_{d}^{'}} \le - \begin{bmatrix} \nabla _{p}H_{d}^\mathrm{T} &{} z^\mathrm{T} \\ \end{bmatrix}\begin{bmatrix} \mathbf {D_{d}} &{} - \frac{1}{2}\Theta _0 \\ - \frac{1}{2}\Theta _0 &{} \gamma \mathbf {I}^n \\ \end{bmatrix} \begin{bmatrix} \nabla _{p}H_{d} \\ z \\ \end{bmatrix}, \end{aligned} \end{aligned}$$
(41)

where \(\Theta _0=(\mathbf {I}^n + \mu \mathbf {M_{d}})\). Employing a Schur complement argument in (41) results in the first inequality in (37), which is a sufficient condition for local stability within the dynamic tube.
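For a scalar system (\(n = 1\)) the Schur complement argument can be verified directly. The following sketch compares positive definiteness of the block matrix in (41) with the first inequality in (37); it is restricted to the scalar case and the function names are hypothetical.

```python
def block_matrix_is_pd(d_d: float, gamma: float, theta0: float) -> bool:
    """Positive definiteness of [[d_d, -theta0/2], [-theta0/2, gamma]] (n = 1).

    Sylvester's criterion for a symmetric 2x2 matrix: both leading
    principal minors must be positive.
    """
    return d_d > 0.0 and d_d * gamma - 0.25 * theta0 ** 2 > 0.0


def schur_condition(d_d: float, gamma: float, mu: float, m_d: float) -> bool:
    """Scalar form of the first inequality in (37), together with gamma > 0."""
    theta0 = 1.0 + mu * m_d
    return gamma > 0.0 and gamma * d_d - 0.25 * theta0 ** 2 > 0.0
```

In the scalar case with \(D_{d} > 0\), the condition reduces to \(\gamma > \Theta _0^{2}/(4D_{d})\), which gives a lower bound for the observer gain.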

Following the logic of Proposition 1, a sufficient condition for the system trajectories starting in \(\Omega _{c}^{'}\) to remain in \(\Omega _{c}^{'}\) indefinitely is given by \({\dot{H}}_{d}^{'} \le \dot{c}\). Computing the former inequality while substituting (40) and \(\dot{c}\) from (36) yields

$$\begin{aligned} \begin{aligned}&- \nabla _{p}H_{d}^\mathrm{T} \mathbf {D_{d}}\nabla _{p}H_{d} + \nabla _{p}H_{d}^\mathrm{T}(\mathbf {I}^n + \mu \mathbf {M_{d}})z \\&\quad - \gamma z^\mathrm{T}z \le \mu ^{2}\mathbf {p}^\mathrm{T}\mathbf {p} \\&\quad - \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d}\right) c, \end{aligned} \end{aligned}$$
(42)

which is verified, provided that \( k_{v} \ge \alpha c\) and that the second inequality in (37) holds. Clearly, the condition resulting from the dynamic tube is different and typically more stringent compared to the condition for local stability in (41).

In case the system trajectory lies outside the dynamic tube, we define the new Lyapunov function candidate \(W^{'} = H_{d}^{'} - c\). Computing its time derivative yields

$$\begin{aligned} \begin{aligned} {\dot{W}}^{'}&\le - \begin{bmatrix} \nabla _{p}H_{d}^\mathrm{T} &{} z^\mathrm{T} \\ \end{bmatrix}\begin{bmatrix} \mathbf {D_{d}} &{} - \frac{1}{2}\Theta _0 \\ - \frac{1}{2}\Theta _0 &{} \gamma \mathbf {I}^n \\ \end{bmatrix} \begin{bmatrix} \nabla _{p}H_{d} \\ z \\ \end{bmatrix} \\&\quad - \mu ^{2}\mathbf {p}^\mathrm{T}\mathbf {p} + \nabla _{p}H_{d}^\mathrm{T}\mathbf {G}u_{0} \\&\quad + \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d}\right) c. \end{aligned} \end{aligned}$$
(43)

Substituting \(u_{0}\) from (36) into (43) finally yields \(\dot{W}^{'} \le 0 \). Thus, the function \(W^{'}\) converges to zero asymptotically and the system trajectory converges to the dynamic tube asymptotically. Note finally that the condition that ensures convergence of the system trajectory to the dynamic tube corresponds to the condition required for local stability (i.e., it is less stringent than the condition required for the trajectory to remain in the dynamic tube indefinitely). \(\square \)

Remark 3

Employing a different assumption for the disturbances results in a different expression of the dynamic tube. For comparison purposes, assume matched and unmatched disturbances to be bounded and with bounded time-derivative such that \(\left| \dot{\delta } \right| < \varepsilon \) and \(\dot{\delta } = 0\) in proximity of the equilibrium. Computing the time-derivative of \(H_{d}^{'}\) and substituting Young’s inequality \(\left| z^\mathrm{T}\dot{\delta } \right| \le \frac{1}{4}z^{2} + \varepsilon ^{2}\) yields

$$\begin{aligned} \begin{aligned} \dot{H_{d}^{'}}&\le - \nabla _{p}H_{d}^\mathrm{T} \mathbf {D_{d}}\nabla _{p}H_{d} + \nabla _{p}H_{d}^\mathrm{T}z \\&\quad - \left( \gamma - \frac{1}{4} \right) z^{2} + \varepsilon ^{2}. \end{aligned} \end{aligned}$$
(44)

Refactoring terms in (44) yields in this case

$$\begin{aligned} \begin{aligned} \dot{H_{d}^{'}} \le - \begin{bmatrix} \nabla _{p}H_{d}^\mathrm{T} &{} z \\ \end{bmatrix}\begin{bmatrix} \mathbf {D_{d}} &{} - \frac{1}{2}\mathbf {I}^n \\ - \frac{1}{2}\mathbf {I}^n &{} \left( \gamma - \frac{1}{4} \right) \mathbf {I}^n\\ \end{bmatrix} \begin{bmatrix} \nabla _{p}H_{d} \\ z \\ \end{bmatrix} + \varepsilon ^{2}. \end{aligned} \end{aligned}$$
(45)

Employing a Schur complement argument in (45) results in the following sufficient conditions for ultimate boundedness

$$\begin{aligned} \mathbf {D_{d}}\left( \gamma - \frac{1}{4} \right) - \frac{1}{4}\mathbf {I^{n}} > 0. \end{aligned}$$
(46)

Defining the dynamic tube as

$$\begin{aligned} \dot{c} = \varepsilon ^{2} - \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG} ^\mathrm{T}\nabla _{p}H_{d} \right) \text {c}, \end{aligned}$$
(47)

and computing the inequality \({\dot{H}}_{d}^{'} \le \dot{c}\) yields then

$$\begin{aligned} \begin{aligned}&- \nabla _{p}H_{d}^\mathrm{T} \mathbf {D_{d}}\nabla _{p}H_{d} + \nabla _{p}H_{d}^\mathrm{T}z - \left( \gamma - \frac{1}{4} \right) z^{2} + \varepsilon ^{2} \\&\quad \le \varepsilon ^{2} - \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG} ^\mathrm{T}\nabla _{p}H_{d} \right) \text {c}, \end{aligned} \end{aligned}$$
(48)

which is verified provided that \( k_{v} \ge \alpha c\) and that

$$\begin{aligned} \mathbf {D M^{-1}M_{d}}\left( \gamma - \frac{1}{4} \right) - \frac{1}{4}I^{n} > 0. \end{aligned}$$
(49)

When the system trajectory lies outside the dynamic tube, the new Lyapunov function candidate \(W^{'} = H_{d}^{'} - c\) is defined. Computing its time derivative while introducing the additional control action \(u_{0}\) yields

$$\begin{aligned} \begin{aligned} {\dot{W}}^{'}&\le - \begin{bmatrix} \nabla _{p}H_{d}^\mathrm{T} &{} z \\ \end{bmatrix}\begin{bmatrix} \mathbf {D_{d}}&{} - \frac{1}{2}\mathbf {I}^n \\ - \frac{1}{2}\mathbf {I}^n &{} \left( \gamma - \frac{1}{4} \right) \mathbf {I}^n \\ \end{bmatrix} \begin{bmatrix} \nabla _{p}H_{d} \\ z \\ \end{bmatrix} + \varepsilon ^{2} \\&\quad - \varepsilon ^{2} + \nabla _{p}H_{d}^\mathrm{T}\mathbf {G}u_{0} \\&\quad + \alpha \left( \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d} \right) \text {c.} \end{aligned} \end{aligned}$$
(50)

Simplifying common terms in (50) and computing \(u_{0}\) in order to cancel the term dependent on c yields

$$\begin{aligned} u_{0} = - \alpha c\mathbf {G}^\mathrm{T}\nabla _{p}H_{d}, \end{aligned}$$

which is the same as in Proposition 2. Note, however, that in this hypothetical case the dynamic tube would correspond to a filter with input \(\varepsilon ^{2}\), which is always positive unless the system is at equilibrium. Thus, differently from Proposition 2, the dynamic tube (47) would only affect the initial transient since its width would continue to grow indefinitely in time. This indicates that, while different implementations of the dynamic tube are possible, the physical implications of the underlying assumptions should be considered at the controller design stage. Note finally that a discrete-time implementation of (34) can be obtained by employing a discrete-time version of the I&I method [23] and of IDA-PBC [22], but it is beyond the scope of this work.
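The different behaviour of the two tube definitions can be visualized with a small numerical sketch. It assumes a scalar system with \(G = 1\) and \(M_{d} = 1\), so that \(\mathbf {G}^\mathrm{T}\nabla _{p}H_{d} = p\), and a prescribed momentum \(p(t) = e^{-t}\) that decays to the equilibrium. These choices and the function name are illustrative assumptions, not part of the proposed design.

```python
import math


def tube_widths(t_final: float = 20.0, h: float = 1e-3, c0: float = 0.5,
                alpha: float = 1.0, eps: float = 1.0, mu: float = 1.0):
    """Integrate the tube dynamics (47) and (36) along p(t) = exp(-t).

    Assumptions of this sketch: scalar system with G = 1 and M_d = 1, so
    G^T grad_p H_d = p, and a momentum that decays to the equilibrium.
    """
    c_const, c_state = c0, c0
    steps = int(round(t_final / h))
    for i in range(steps):
        p = math.exp(-i * h)
        s2 = p * p                                   # (G^T grad_p H_d)^2
        c_const += h * (eps ** 2 - alpha * s2 * c_const)         # input eps^2, as in (47)
        c_state += h * (mu ** 2 * p * p - alpha * s2 * c_state)  # input mu^2 p^T p, as in (36)
    return c_const, c_state
```

Under these assumptions the width driven by the constant input \(\varepsilon ^{2}\) keeps growing approximately linearly in time, whereas the width driven by \(\mu ^{2}\mathbf {p}^\mathrm{T}\mathbf {p}\) remains bounded once the momentum has decayed.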

4.4 Control implementation

An MPC algorithm is now constructed by including the ancillary control and the dynamic tube equation as additional constraints. The variables related to the dynamic tube \(\alpha \), \(\dot{\alpha }\), c, \(\dot{c}\) and the auxiliary control variable \(\mathbf {u}_\mathrm{aux}\) are included in the set of design variables at each time step i thus

$$\begin{aligned} \begin{aligned} \mathbf {\chi }^{(k|i)}&= \left( \mathbf {q}^{(k|i)} \dot{\mathbf {q}}^{(k|i)} \ddot{\mathbf {q}}^{(k|i)} \mathbf {a}^{(k|i)} c^{(k|i)} \alpha ^{(k|i)} \right. \\&\quad \left. \dot{c}^{(k|i)} \dot{\alpha }^{(k|i)} \mathbf {u}^{(k|i)} \mathbf {u}_{\text {aux}}^{(k|i)}\right) ^\mathrm{T}, \end{aligned} \end{aligned}$$
(51)

where \(i=1,\ldots ,N_\mathrm{tot}-k\) and k represents the instant when the output error is higher than the preset tolerance. The total set of design variables is \(\mathbf {\chi } = (\mathbf {\chi }^{(1)} \mathbf {\chi }^{(2)} \ldots \mathbf {\chi }^{(N_\mathrm{tot} - k)})^T\) and has dimension \((N_{\text {tot}} - k)(4n + 2m + 4)\). The variables c, \(\dot{c}\) and \(\mathbf {u}_\mathrm{aux}\) are not independent; thus, they could be eliminated in theory. However, retaining them improves the robustness and the numerical conditioning of the problem. Thus, the nonlinear programming problem indicated as \(\mathcal {P}_3\) is formulated as
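As a quick check of the problem size, the dimension of \(\mathbf {\chi }\) can be computed as follows. The helper name is hypothetical, and the sketch assumes that \(\mathbf {a}\) has dimension n, as implied by the term 4n.

```python
def num_design_variables(n: int, m: int, n_tot: int, k: int) -> int:
    """Size of chi: (N_tot - k) * (4n + 2m + 4)."""
    # 4n: q, q_dot, q_ddot, a; 2m: u, u_aux; 4: c, alpha and their rates
    per_step = 4 * n + 2 * m + 4
    return (n_tot - k) * per_step
```

For instance, with \(n = 2\), \(m = 1\), \(N_{\text {tot}} = 120\) and \(k = 20\), the problem has 1400 design variables.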

$$\begin{aligned} \begin{aligned}&\mathcal {P}_3: \\&\underset{\mathbf {\chi }}{\text {min}}&J = \sum _{i = 1}^{N_{\text {tot}}-k+1} \left[ \gamma _{\text {obj}} (\alpha ^{(k|i)})^2 + (\mathbf {u}^{(k|i)})^T \mathbf {R} (\mathbf {u}^{(k|i)})\right] h \\&\text {s.t.}&\mathbf {c}_{\text {mot}}(\mathbf {\chi }) = \mathbf {res}(\mathbf {q}, \dot{\mathbf {q}}, \ddot{\mathbf {q}}, \mathbf {u}) \\&&\mathbf {A}_{\text {intg}} \mathbf {\chi } - \mathbf {b}_{\text {intg}} = 0 \\&&\mathbf {q}^{(k|1)} = \mathbf {q}_{\text {out}}^{(k)} \\&&\mathbf {y}^{(N_{\text {tot}}-k)} = \mathbf {y}_d(t^{(N_{\text {tot}}-k)}) \\&&\mathbf {c}_{\text {tube}}(\mathbf {\chi }) = -\dot{c}^{(k|i)} + \varepsilon w_1(\mathbf {q}, \dot{\mathbf {q}}) - c^{(k|i)} \alpha ^{(k|i)} w_2(\mathbf {q}, \dot{\mathbf {q}}) \\&&\mathbf {c}_{\text {aux}}(\mathbf {\chi }) = -(\mathbf {u}_{\text {aux}}^{(k|i)} + \mathbf {u}^{(k|i)}) + u_{\text {ida-pbc}}^{(k|i)}(\mathbf {q}, \dot{\mathbf {q}})\\&&\mathbf {c}_{\alpha }(\mathbf {\chi }) = -\dot{\alpha }^{(k|i)} + \Delta \alpha ^{(k|i)}/ h \\&&\mathbf {v}^{\alpha }_{\text {ineq}}(\mathbf {\chi }) \triangleq \alpha ^{(k|i)}> 0 \\&&\mathbf {v}^c_{\text {ineq}}(\mathbf {\chi }) \triangleq c^{(k|i)} > 0 \\&&\mathbf {v}^{k_v}_{\text {ineq}}(\mathbf {\chi }) \triangleq \alpha ^{(k|i)} c^{(k|i)} \le k_v . \\ \end{aligned} \end{aligned}$$

The term \(\mathbf {res}(\mathbf {q}, \dot{\mathbf {q}}, \ddot{\mathbf {q}}, \mathbf {u})\) is the residual of the dynamic equation considering the MPC control input \(\mathbf {u}\), \(\mathbf {c}_{\text {mot}}\) is a set of binding constraints in the optimization problem, and the condition \(\mathbf {c}_{\text {mot}} \approx 0\) is verified when a solution of the optimization problem \(\mathcal {P}_3\) is found. The initial state at each iteration that lies outside the prescribed trajectory is \(\mathbf {q}_{\text {out}}^{(k)}\). The last six lines of \(\mathcal {P}_3\) are related to the discretized dynamic tube-MPC for \(i = 1,\ldots ,N_{\text {tot}} - k \): the constraint \(\mathbf {c}_{\text {tube}}\) represents the dynamic tube equation; the constraint \(\mathbf {c}_{\text {aux}}\) is related to the ancillary control, where \(u_{\text {ida-pbc}}\) is given in (24) for matched and bounded disturbances, and in (34) for unmatched non-vanishing disturbances; the constraint \(\mathbf {c}_{\alpha }\) is required since \(\alpha \) is time varying. The complete set of equality constraints is

$$\begin{aligned} \begin{aligned} \mathbf {c}_{\text {eq}}&= \left( \mathbf {c}_{\text {mot}}^{(1)},\mathbf {c}_{\text {tube}}^{(1)}, \mathbf {c}_{\text {aux}}^{(1)},\mathbf {c}_{\alpha }^{(1)}\ldots ,\mathbf {c}_{\text {mot}}^{(N_{\text {tot}}-k)}, \mathbf {c}_{\text {tube}}^{(N_{\text {tot}}-k)}, \right. \\&\qquad \left. \mathbf {c}_{\text {aux}}^{(N_{\text {tot}}-k)}, \mathbf {c}_{\alpha }^{(N_{\text {tot}}-k)}\right) \approx 0. \end{aligned} \end{aligned}$$
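The tube-related constraints of \(\mathcal {P}_3\) can be sketched as residual functions evaluated at one node of the horizon. By comparison with (25), this sketch assumes \(w_1 = \left| \nabla _{p}H_{d}^\mathrm{T}\mathbf {G} \right| \) and \(w_2 = \nabla _{p}H_{d}^\mathrm{T}\mathbf {GG}^\mathrm{T}\nabla _{p}H_{d}\), and it reads \(\Delta \alpha \) as a backward difference; both are assumptions, and the function names are hypothetical.

```python
def tube_residual(c_dot: float, c: float, alpha: float, eps: float,
                  w1: float, w2: float) -> float:
    """c_tube = -c_dot + eps*w1 - c*alpha*w2 (zero when the tube equation holds)."""
    return -c_dot + eps * w1 - c * alpha * w2


def alpha_rate_residual(alpha_dot: float, alpha_now: float,
                        alpha_prev: float, h: float) -> float:
    """c_alpha = -alpha_dot + (alpha_now - alpha_prev)/h.

    Reading Delta alpha as a backward difference is an assumption here.
    """
    return -alpha_dot + (alpha_now - alpha_prev) / h


def inequalities_hold(alpha: float, c: float, k_v: float) -> bool:
    """Inequality constraints: alpha > 0, c > 0 and alpha*c <= k_v."""
    return alpha > 0.0 and c > 0.0 and alpha * c <= k_v
```

A nonlinear programming solver would drive these residuals to zero at every node while keeping the three inequalities satisfied.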

The gradient for this equality constraint is \({\partial \mathbf {c_{\text {eq}}} / \partial \mathbf {\chi }}\). Inequalities \(\mathbf {v}^{\alpha }_{\text {ineq}}\) and \(\mathbf {v}^c_{\text {ineq}}\) impose positive values for \(\alpha \) and c, while inequality \(\mathbf {v}^{k_v}_{\text {ineq}}\) imposes a bound on \(k_v\), which is a tuning parameter in (20). The complete set of inequality constraints is

$$\begin{aligned} \begin{aligned} \mathbf {v}_{\text {ineq}}&= \left( {\mathbf {v}_{\text {ineq}}^{\alpha }}^{(1)}, {\mathbf {v}_{\text {ineq}}^{\text {c}}}^{(1)}, {\mathbf {v}_{\text {ineq}}^{k_v}}^{(1)},\ldots , {\mathbf {v}_{\text {ineq}}^{\alpha }}^{(N_{\text {tot}}-k)}, \right. \\&\qquad \left. {\mathbf {v}_{\text {ineq}}^{\text {c}}}^{(N_{\text {tot}}-k)}, {\mathbf {v}_{\text {ineq}}^{k_v}}^{(N_{\text {tot}}-k)}\right) . \end{aligned} \end{aligned}$$

The gradient for this inequality constraint is \({\partial \mathbf {v_{\text {ineq}}} / \partial \mathbf {\chi }}\). Following the principles of MPC, only the first element of each of the design variables \(\mathbf {u}\), \(\mathbf {u}_{\text {aux}}\), c and \(\alpha \) at each iteration is employed in the dynamic tube-MPC control policy

$$\begin{aligned} \mathbf {\pi } = \mathbf {u}_{\text {fb}} = k_m \mathbf {u} + k_a \mathbf {u}_{\text {aux}}, \end{aligned}$$
(52)

which is applied to the plant when the output error is larger than the preset tolerance. The parameters \(k_m\) and \(k_a\) are gains affecting the responsiveness of the system, and \(k_m/k_a\) defines the direction of actuation in the control policy. The dynamic tube-MPC based on IDA-PBC is summarized as Algorithm 1, and a block diagram is shown in Fig. 1.

Algorithm 1: Dynamic tube-MPC for underactuated mechanical systems.

  i. compute offline the feedforward control \(\mathbf {u}_{\text {ff}}\), solving \(\mathcal {P}_1\), and then apply it to the plant;

  ii. compute the system output y, and if \(e = y_d -y > \text {tolerance}\), then proceed solving dynamic tube-MPC recursively;

    (a) estimate states;

    (b) compute initial guess for the design variables;

    (c) solve \(\mathcal {P}_3\) for the horizon \(t^{(k|N_{\text {tot}})} - t^{(k|i)} \);

    (d) from the solution of \(\mathcal {P}_3\) obtain \(\mathbf {u}\), \(\mathbf {u}_\mathrm{aux}\), c, \(\alpha \): use the first elements of these vectors (in time) to compute control policy \(\pi = \mathbf {u}_{\text {fb}}\), and apply it to the plant;

    (e) compute y for the next time step, and

      if \((e > \text {tolerance})\)

      – go to (a), in a receding horizon problem \(k = k + 1\) with \(i = 1, 2 \ldots (N_{\text {tot}} - k + 1)\);

      else \((e < \text {tolerance})\)

      – go to point iii;

      end

  iii. continue applying the feedforward control \(\mathbf {u}_{\text {ff}}(k)\) to the plant.
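The supervisory logic of Algorithm 1 can be sketched as a simple loop. In this sketch `plant_step` and `solve_p3` are placeholder callables supplied by the user (the latter stands in for steps (a) to (d)), the absolute value of the output error is compared with the tolerance, and the output is compared with the reference of the previous sample; these are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def run_dynamic_tube_mpc(
    u_ff: Sequence[float],
    y_d: Sequence[float],
    plant_step: Callable[[float], float],
    solve_p3: Callable[[int, float], Tuple[float, float]],
    tolerance: float,
    k_m: float = 1.0,
    k_a: float = 1.0,
) -> List[str]:
    """Supervisory loop of Algorithm 1 with user-supplied plant and solver.

    plant_step(u) applies one input and returns the measured output y;
    solve_p3(k, y) stands in for steps (a)-(d) and returns the first elements
    (u, u_aux) of the optimal sequences. Both callables are placeholders.
    """
    modes = []
    y = plant_step(u_ff[0])                        # step i: feedforward
    modes.append("ff")
    for k in range(1, len(u_ff)):
        if abs(y_d[k - 1] - y) > tolerance:        # steps ii and (e)
            u, u_aux = solve_p3(k, y)
            y = plant_step(k_m * u + k_a * u_aux)  # control policy (52)
            modes.append("fb")
        else:                                      # step iii
            y = plant_step(u_ff[k])
            modes.append("ff")
    return modes
```

The returned list records, for each time step, whether the feedforward input or the feedback policy was applied.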

Fig. 1 Block diagram of the dynamic tube-MPC algorithm

5 Examples and results

Two examples are presented in order to illustrate the proposed approach: a two-mass-spring-damper system with parametric uncertainty and an inertia-wheel pendulum system with a constant but unknown external disturbance.

5.1 Parametric uncertainty: two-mass-spring-damper

The underactuated mechanical system shown in Fig. 2 consists of two masses connected by springs and by one damper with parametric uncertainty. We shall initially consider linear springs and subsequently one nonlinear spring. The vector of the generalized position coordinates is defined as \(\mathbf {q} = (x_1, x_2)^T\), where \(x_j\) are the relative translations of the nodes \(j = 1,2\). The state vector is defined as \(\mathbf {x} = (x_1, \dot{x}_1, x_2, \dot{x}_2)^T\), and the dynamic equations for the linear case are

$$\begin{aligned} \varvec{\dot{x}} = \mathbf {A}_{\text {ss}} \mathbf {x} + \mathbf {B}_{\text {ss}} \mathbf {u}, \end{aligned}$$
(53)

where

$$\begin{aligned} \begin{aligned} \displaystyle \mathbf {A}_{\text {ss}}&= \left( \begin{array}{cccc} 0 &{}\quad 1 &{}\quad 0 &{}\quad 0 \\ -\frac{(k_1 + k_2)}{m_1} &{}\quad -\frac{b}{m_1} &{} \quad \frac{k_2}{m_1} &{}\quad \frac{b}{m_1} \\ 0 &{} \quad 0 &{}\quad 0 &{}\quad 1 \\ \frac{k_2}{m_2} &{}\quad \frac{b}{m_2} &{}\quad -\frac{(k_2 + k_3)}{m_2} &{}\quad - \frac{b}{m_2} \end{array} \right) , \\ \displaystyle \mathbf {B}_{\text {ss}}&= \left( \begin{array}{c} 0 \\ \frac{1}{m_1} \\ 0 \\ 0 \end{array} \right) . \end{aligned} \end{aligned}$$

The mechanical parameters of the system are defined as \(m_1 = 1.0 \text {kg}\), \(m_2 = 3.0 \text {kg}\), \(k_1 = 5.0 \text {N/m}\), \(k_2 = 3.0 \text {N/m}\), \(k_3 = 1.0 \text {N/m}\) and \(b = 7.0 \text {N s/m}\). The physical damper has a parametric uncertainty \(\Delta b = 2.0 \text {N s/m}\). The plant output is simulated as a direct dynamic problem (i.e., initial value problem); hence, numerical errors can also be a source of uncertainty. The time discretization is implemented using 120 points, the initial time \(t_0 = 0 \text {s}\), the final time \(t_\mathrm{f} = 30 \text {s}\), and the time step \(h = 0.25 \text {s}\). The initial state of the system is \(\mathbf {x} = (0.5, 0.8294, 1.0, -0.0027)^\mathrm{T}\).
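The linear model (53) with the parameters listed above can be written compactly as follows. The sketch evaluates \(\varvec{\dot{x}}\) for a given damper coefficient, so that the nominal value \(b\) and a perturbed value can be compared; modelling the perturbed plant as \(b + \Delta b\) is an assumption of this sketch, and the names are hypothetical.

```python
M1, M2 = 1.0, 3.0            # kg
K1, K2, K3 = 5.0, 3.0, 1.0   # N/m
B_NOM, DELTA_B = 7.0, 2.0    # N s/m

X0 = (0.5, 0.8294, 1.0, -0.0027)   # initial state (x1, x1_dot, x2, x2_dot)


def state_derivative(x, u, b=B_NOM):
    """x_dot = A_ss x + B_ss u for x = (x1, x1_dot, x2, x2_dot)."""
    x1, v1, x2, v2 = x
    acc1 = (-(K1 + K2) * x1 - b * v1 + K2 * x2 + b * v2 + u) / M1
    acc2 = (K2 * x1 + b * v1 - (K2 + K3) * x2 - b * v2) / M2
    return (v1, acc1, v2, acc2)


dx_nominal = state_derivative(X0, 0.0)
dx_perturbed = state_derivative(X0, 0.0, B_NOM + DELTA_B)  # assumed sign of the mismatch
```

Comparing `dx_nominal` and `dx_perturbed` at the initial state gives a first impression of how strongly the damper mismatch affects the accelerations.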

Fig. 2 Schematic of the two-mass-spring-damper system

The potential energy of the system is \(V = \frac{k_{1}x_{1}^{2}}{2} + \frac{k_{2}\left( x_{2} - x_{1} \right) ^{2}}{2} + \frac{k_{3}x_{2}^{2}}{2}\). The kinetic energy is \(T = \frac{1}{2}m_{1}{\dot{x}}_{1}^{2} + \frac{1}{2}m_{2}{\dot{x}}_{2}^{2}\). The damping matrix is \(\mathbf {D} = b\ \begin{bmatrix} 1 &{} - 1 \\ - 1 &{} 1 \\ \end{bmatrix}\). The input matrix for the port-controlled Hamiltonian formulation is \(\mathbf {G} = \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix}\), and the inertia matrix \(\mathbf {M} = \begin{bmatrix} m_{1} &{} 0 \\ 0 &{} m_{2} \\ \end{bmatrix}\) is constant. The only stable equilibrium point of the system in open loop is \(x_{1} = x_{2} = 0\).

The control goal corresponds to moving the second mass to a prescribed position such that \(\left( x_{1},x_{2} \right) = (x_{1}^{*},x_{2}^{*})\). Since the system is underactuated, the position \(x_{1}^{*}\) of the first mass would depend on that of the second mass. Thus, the two masses cannot be driven to independent prescribed values simultaneously. We employ an IDA-PBC design, such that the closed-loop system has constant inertia matrix given by \(\mathbf {M_{d}} = \begin{bmatrix} a_{1} &{} - a_{2} \\ - a_{2} &{} a_{3} \\ \end{bmatrix}\), where \(a_{1}> 0,\ a_{2}> 0,\ a_{3} > 0\) and \(a_{1}a_{3} > a_{2}^{2}\). Since \(\mathbf {M}\) is constant, the kinetic-energy PDE (21) can be trivially solved with a constant \(\mathbf {M_{d}}\) and with \(J_{2} = 0\). The potential-energy PDE (22) becomes instead

$$\begin{aligned} - \frac{a_{2}}{m_{1}}\nabla _{x_{1}}V_{d} + \frac{a_{3}}{m_{2}}\nabla _{x_{2}}V_{d} = \left( k_{2} + k_{3} \right) x_{2} - k_{2}x_{1}, \end{aligned}$$

and the candidate solution is chosen as

$$\begin{aligned} V_{d}&= \frac{\left( m_{1}x_{1}^{2}\left( k_{2} + \frac{\left( a_{3}m_{1}\left( k_{2} + k_{3} \right) \right) }{a_{2}m_{2}} \right) \right) }{2a_{2}} \nonumber \\&\quad - \frac{m_{1}^{2}\left( k_{2} + k_{3} \right) \left( \frac{a_{3}x_{1}^{2}}{m_{2}} + \frac{a_{2}x_{1}x_{2}}{m_{1}} \right) }{a_{2}^{2}}\nonumber \\&\quad + \frac{m_{1}^{2}m_{2}\left( k_{2} + \ k_{3} \right) \left( \frac{a_{3}x_{1}}{m_{2}}\ + \frac{a_{2}\left( x_{2}\ - x_{2}^{*} \right) }{m_{1}} \right) ^{2}}{2a_{2}^{2}a_{3}} \nonumber \\&\quad + \frac{m_{1}{x_{2}^{*}}^{2}\left( k_{2}\ + \ k_{3} \right) ^{2}}{2a_{2}k_{2}}. \end{aligned}$$
(54)

Typically, the minimizer conditions \(\nabla _{q}V_{d}\left( q^{*} \right) = 0\) and \(\nabla _{q}^{2}V_{d}\left( q^{*} \right) > 0\) are satisfied by introducing in \(V_{d}\) a term dependent on the position error \(\left( q - q^{*} \right) \) multiplied by a constant tuning parameter \(k_{p}\). However, in this case the expression of \(V_{d}\) in (54) does not contain \(k_{p}\) since the latter has been chosen as \(k_{p} = \frac{m_{1}^{2}m_{2}\left( k_{2} + k_{3} \right) }{a_{2}^{2}a_{3}}\) in order to verify the conditions \(\nabla _{x}V_{d} = 0\) and \(\nabla _{x}^{2}V_{d} = \frac{ k_{2}m_{1}m_{2}\left( k_{2}\ + \ k_{3} \right) }{a_{2}a_{3}} > 0\) at \(\left( x_{1},x_{2} \right) = \left( \frac{\left( k_{2} + k_{3} \right) x_{2}^{*}}{k_{2}},x_{2}^{*} \right) \), for any \(x_{2}^{*}\).
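The candidate solution (54) can be checked numerically against the potential-energy PDE and the stationarity condition. The sketch below uses central finite differences and the illustrative values \(a_{2} = 0.5\), \(a_{3} = 2\) and \(x_{2}^{*} = 1\), chosen here only to satisfy \(a_{2}> 0,\ a_{3} > 0\); these values and the function names are assumptions of the sketch.

```python
M1, M2 = 1.0, 3.0      # kg, from the example
K2, K3 = 3.0, 1.0      # N/m, from the example


def v_d(x1, x2, x2_star, a2, a3):
    """Candidate closed-loop potential energy (54)."""
    kk = K2 + K3
    t1 = M1 * x1**2 * (K2 + a3 * M1 * kk / (a2 * M2)) / (2.0 * a2)
    t2 = -(M1**2) * kk * (a3 * x1**2 / M2 + a2 * x1 * x2 / M1) / a2**2
    t3 = (M1**2 * M2 * kk * (a3 * x1 / M2 + a2 * (x2 - x2_star) / M1) ** 2
          / (2.0 * a2**2 * a3))
    t4 = M1 * x2_star**2 * kk**2 / (2.0 * a2 * K2)
    return t1 + t2 + t3 + t4


def grad_v_d(x1, x2, x2_star, a2, a3, step=1e-5):
    """Central-difference gradient of V_d with respect to (x1, x2)."""
    g1 = (v_d(x1 + step, x2, x2_star, a2, a3)
          - v_d(x1 - step, x2, x2_star, a2, a3)) / (2.0 * step)
    g2 = (v_d(x1, x2 + step, x2_star, a2, a3)
          - v_d(x1, x2 - step, x2_star, a2, a3)) / (2.0 * step)
    return g1, g2


def pde_residual(x1, x2, x2_star, a2, a3):
    """Left-hand side minus right-hand side of the potential-energy PDE."""
    g1, g2 = grad_v_d(x1, x2, x2_star, a2, a3)
    return -a2 / M1 * g1 + a3 / M2 * g2 - ((K2 + K3) * x2 - K2 * x1)
```

The residual should vanish at any test point, and the gradient should vanish at \(\left( x_{1},x_{2} \right) = \left( (k_{2} + k_{3})x_{2}^{*}/k_{2},\ x_{2}^{*} \right) \), up to finite-difference rounding.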

The baseline IDA-PBC control law is then given by

$$\begin{aligned} u_{\text {ida-pbc}}&= u_{\text {es}} + u_{\text {di}}\nonumber \\ u_{\text {es}}&= k_{1}x_{1}\ + \ k_{2}\left( x_{1}\ - \ x_{2} \right) \nonumber \\&\quad + \frac{a_{1}\left( k_{2}x_{2}^{*}\ - \ k_{2}x_{1}\ + \ k_{3}x_{2}^{*} \right) }{a_{2}} \nonumber \\&\quad + \frac{a_{2}\left( k_{2}\ + \ k_{3} \right) \left( x_{2}\ - \ x_{2}^{*} \right) }{a_{3}}, \nonumber \\ u_{\text {di}}&= - \frac{k_{v}\left( a_{3}m_{1}{\dot{x}}_{1}\ + \ a_{2}m_{2}{\dot{x}}_{2} \right) }{a_{1}a_{3}\ - \ a_{2}^{2}}, \end{aligned}$$
(55)

where \(k_{v}, a_{1}, a_{2}, a_{3}\) are tuning parameters. The closed-loop damping matrix \(\mathbf {D_{d}}\) is positive definite and has determinant \(k_{v}b\left( a_{2}\ + \ a_{3} \right) /m_{2}\). It must be noted that the control input does not depend on the open-loop damping \(\mathbf {D}\), which therefore can be uncertain. Since this system does not have external disturbances, we set \(u_{\text {adpt}} = 0\). The dynamic tube equation and the additional control action \(u_{0}\) that is included in \(u_{\text {ida-pbc}}\) in the optimization problem \(\mathcal {P}_3\) are given by (25)

$$\begin{aligned} \begin{aligned} u_{0}&= - \alpha c \left( \frac{a_{3}m_{1}\dot{x_{1}}+a_{2}m_{2}\dot{x_{2}}}{a_{1}a_{3}-a_{2}^{2}}\right) , \\ \dot{c}&= \left| \frac{a_{3}m_{1}\dot{x_{1}}+a_{2}m_{2}\dot{x_{2}}}{a_{1}a_{3}-a_{2}^{2}}\right| \varepsilon \\&\quad - \alpha c \left( \frac{a_{3}m_{1}\dot{x_{1}}+a_{2}m_{2}\dot{x_{2}}}{a_{1}a_{3}-a_{2}^{2}}\right) ^{2}. \end{aligned} \end{aligned}$$
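The expressions in (55) and the example-specific form of (25) translate directly into code. The following sketch uses the mechanical parameters of this example as default arguments; the function names are hypothetical and the helper `s_term` is introduced here to denote \(\mathbf {G}^\mathrm{T}\nabla _{p}H_{d}\).

```python
def s_term(v1, v2, a1, a2, a3, m1=1.0, m2=3.0):
    """G^T grad_p H_d = (a3*m1*v1 + a2*m2*v2) / (a1*a3 - a2**2)."""
    return (a3 * m1 * v1 + a2 * m2 * v2) / (a1 * a3 - a2**2)


def u_es(x1, x2, x2_star, a1, a2, a3, k1=5.0, k2=3.0, k3=1.0):
    """Energy-shaping term of (55)."""
    return (k1 * x1 + k2 * (x1 - x2)
            + a1 * (k2 * x2_star - k2 * x1 + k3 * x2_star) / a2
            + a2 * (k2 + k3) * (x2 - x2_star) / a3)


def u_di(v1, v2, k_v, a1, a2, a3):
    """Damping-injection term of (55)."""
    return -k_v * s_term(v1, v2, a1, a2, a3)


def u_0(v1, v2, alpha, c, a1, a2, a3):
    """Additional control action applied outside the dynamic tube."""
    return -alpha * c * s_term(v1, v2, a1, a2, a3)


def c_dot(v1, v2, alpha, c, eps, a1, a2, a3):
    """Tube dynamics (25) specialised to this example."""
    s = s_term(v1, v2, a1, a2, a3)
    return abs(s) * eps - alpha * c * s * s
```

As a consistency check, at the equilibrium \(x_{1} = (k_{2} + k_{3})x_{2}^{*}/k_{2}\), \(x_{2} = x_{2}^{*}\) the energy-shaping term reduces to the static holding force \((k_{1} + k_{2})x_{1} - k_{2}x_{2}\) obtained from (53), while \(u_{\text {di}}\), \(u_{0}\) and \(\dot{c}\) vanish at zero velocity.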

The proposed control algorithm can be readily extended to a system with a nonlinear spring \(k_1=k_{10}+ k_{11} x_1^{2}\). In such case, the potential energy V changes accordingly, but the potential-energy PDE (22) is preserved. Thus, the candidate solution (54) remains valid and the equations defining the dynamic tube remain the same. However, due to the change in V, the control input \(u_{\text {es}}\) in (55) changes to

$$\begin{aligned} \begin{aligned} u_{\text {es}}&= k_{10}x_{1} + 2k_{11}x_{1}^{3} + k_{2}\left( x_{1} - x_{2} \right) \\&\quad + \frac{a_{1}\left( k_{2}x_{2}^{*} - k_{2}x_{1} + k_{3}x_{2}^{*} \right) }{a_{2}} \\&\quad + \frac{a_{2}\left( k_{2}\ + \ k_{3} \right) \left( x_{2}\ - \ x_{2}^{*} \right) }{a_{3}}. \end{aligned} \end{aligned}$$

5.1.1 Simulation results

The IDA-PBC parameters were selected as \(a_{1} =1,\ a_{2} =-1,\ a_{3} =2\), \(\varepsilon = 1\), \(k_v = 1\). The weights of the performance index were tuned empirically as \(\mathbf {R}=\beta _{\text {obj}} = 0.5\) and \(\gamma _{\text {obj}} = 0.7\), while \(k_a=k_m=1\). The output error tolerance was set to 0.12 m.

Fig. 3 Simulation results: dynamic tube-MPC for the two-mass-spring-damper system

Figure 3a shows the relative error for the output, which has a maximum amplitude of around 15%. Figure 3b shows the control input, the feedforward control computed using \(\mathcal {P}_1\), and the feedback control computed using \(\mathcal {P}_3\). The dynamic tube-MPC control policy acts to correct the output in the time interval \(1.5 \le t \le 3.0\). Figure 3c shows the positions and a comparison between the desired trajectory and the plant output \(x_2\). Figure 3d shows the velocities of the masses. Figure 3e shows the phase diagram for the untracked position \(x_{1}\), while Fig. 3f, g show the independent variable \(\alpha \) computed by \(\mathcal {P}_3\) and its time derivative \(\dot{\alpha }\) to illustrate their evolution in time. Figure 3h shows the evolution of c during the optimization: c is constant in all iterations for the interval \(6.0 \le t \le 24.0\). Figure 3i shows the dynamic tube geometry that is obtained by using the first values of c at each iteration of the dynamic tube-MPC algorithm. The negative of c is also plotted, with a dotted line, for graphical purposes.

Fig. 4 Simulation results comparing the output for different controllers: a, b dynamic tube-MPC, tube-MPC (fixed \(\alpha \)) and MPC; c, d different gains in the control policy

Fig. 5 Simulation results for a new prescribed trajectory with the same passage points and with a linear interpolation

Fig. 6 Simulation results for a new prescribed trajectory (pulse command)

A further set of results is shown in Fig. 4 to compare the dynamic tube-MPC with a tube-MPC that employs a fixed value of \(\alpha \), and with a baseline MPC (solution of \(\mathcal {P}_2\)). In these simulations, the dynamic tube-MPC outperforms both the baseline MPC and the tube-MPC with fixed tube geometry. Figure 4c, d illustrates the effect of the ratio \(k_m/k_a\) on the output dynamics. In particular, varying \(k_m\) has only a small effect on the initial transient and a negligible effect thereafter. To further validate the proposed approach, two additional simulations were performed with the same parameters: (i) a new desired trajectory with the same passage points (reference for the interpolation) but with a linear interpolation, see Fig. 5, for which the dynamic tube-MPC needed six iterations to converge (one more than for the first trajectory); (ii) a new desired trajectory given by a pulse command, see Fig. 6.

Fig. 7 Simulation results for nonlinear spring with stiffness \(k_1=k_{10}+ k_{11} x_1^{2}\)

Additional results for the case of a nonlinear spring \(k_1=k_{10}+ k_{11} x_1^{2}\), where \(k_{10}=5\) N/m and \(k_{11}=0.05\) N/m\(^{3}\), are shown in Fig. 7. In this case too, the proposed control algorithm achieves the tracking goal. Unlike in the linear case, the initial transient shows some oscillations, which vanish over time. Note that the feedback control is active for a longer time than in the linear case, even though the same tuning parameters have been employed. This is attributed to the higher complexity of the nonlinear problem.

5.2 External disturbance: inertia-wheel pendulum system

The inertia-wheel pendulum consists of an unactuated pendulum with a balanced actuated rotor at the tip [46] (see Fig. 8). This system is a classical example of an underactuated mechanism that exhibits nonlinear characteristics due to its inverted pendulum structure [29, 54]. The vector of angular positions is defined as \(\mathbf {q} = (q_1, q_2)^T\). The pendulum angle \(q_{1}\) is measured from the vertical, while the rotor angle \(q_{2}\) is measured relative to the pendulum. The equations of motion are

$$\begin{aligned} \mathbf {M} {\ddot{\mathbf {q}}} + \mathbf {B}_n {\dot{\mathbf {q}}} + \mathbf {g}_n \sin (\mathbf {q}) = \mathbf {A} u - \mathbf {\delta }, \end{aligned}$$
(56)

where

$$\begin{aligned} \begin{array}{cccc} \displaystyle \mathbf {M} = \left( \begin{array}{cc} (a_1 + a_2) &{} a_2 \\ a_2 &{} a_2 \end{array} \right) , \quad &{} \displaystyle \mathbf {B}_n = \left( \begin{array}{cc} b_1 &{} 0 \\ 0 &{} b_2 \end{array} \right) , \\ \displaystyle \mathbf {g}_n = \left( \begin{array}{cc} a_3 &{} 0 \\ 0 &{} 0 \end{array} \right) , \quad &{} \displaystyle \mathbf {A} = \left( \begin{array}{c} 0 \\ 1 \end{array} \right) , \end{array} \end{aligned}$$

and \( a_1 = m_p l_{c1}^2 + m_w l^2 + I_p\), \(a_2 = I_w\) and \(a_3 = g (m_p l_{c1} + m_w l)\). The values of the parameters are defined in Table 1 and correspond to [46].
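To make the structure of (56) concrete, the following sketch solves it for the accelerations using the closed-form inverse of the constant 2x2 inertia matrix. It is a minimal illustration under stated assumptions: the function name and the numerical values in the example call are hypothetical (the entries of Table 1 are not reproduced here), and the signs are taken exactly as written in (56).

```python
import math


def iwp_acceleration(q, q_dot, u, delta, a1, a2, a3, b1, b2):
    """Solve (56) for q_ddot = M^{-1} (A u - delta - B_n q_dot - g_n sin(q)).

    q, q_dot and delta are 2-tuples. Only sin(q1) enters, because the
    second row of g_n is zero.
    """
    # Right-hand side, row by row (A = (0, 1)^T).
    r1 = -delta[0] - b1 * q_dot[0] - a3 * math.sin(q[0])
    r2 = u - delta[1] - b2 * q_dot[1]
    # M = [[a1 + a2, a2], [a2, a2]], so det(M) = a1 * a2.
    det = (a1 + a2) * a2 - a2 ** 2
    q1_ddot = (a2 * r1 - a2 * r2) / det
    q2_ddot = (-a2 * r1 + (a1 + a2) * r2) / det
    return q1_ddot, q2_ddot


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the function.
    print(iwp_acceleration(q=(0.0, 0.01), q_dot=(0.0, 0.0), u=1.0,
                           delta=(0.0, 0.0), a1=0.5, a2=0.25, a3=2.0,
                           b1=0.0, b2=0.0))
```

Because \(\mathbf {M}\) is constant, its inverse can be computed once and reused at every time step of a simulation.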

The open-loop potential energy is \(V = a_{3}\cos (q_{1})\), and the inertia matrix \(\mathbf {M}\) is constant. An unmodeled constant external disturbance \(\mathbf {\delta } = [18, 0]^T\) acts on the unactuated pendulum. The time discretization uses 120 points, with initial time \(t_0 = 0\,\text {s}\), final time \(t_\mathrm{f} = 0.15\,\text {s}\), and time step \(h = 0.0013\,\text {s}\). The initial configuration of the system is \(\mathbf {q}(t_0) = (0, 0.01)^T\) and \({\dot{\mathbf {q}}}(t_0) = (0.1874, 0.1074)^T\).
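The quoted time step is consistent with a uniform grid, as the short check below suggests. It assumes that the 120 points include both end points, which the text does not state explicitly; the helper name is hypothetical.

```python
def time_grid(t0, tf, n_points):
    """Uniform grid with n_points nodes including both end points (assumption)."""
    h = (tf - t0) / (n_points - 1)
    return [t0 + k * h for k in range(n_points)], h


grid, h = time_grid(0.0, 0.15, 120)
print(len(grid), round(h, 4))  # 120 0.0013
```

If the 120 points were instead counted as intervals, the step would be 0.15/120 = 0.00125 s, which also agrees with the quoted value to two significant figures.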

The control goal is to follow a desired output trajectory \(q_{2} = q_{2}^{*}\). Note that the position \(q_1\) can only be stabilized at \(q_1 = 0\) and \(q_1 = \pi \); thus, trajectory tracking is only possible for \(q_2\). Since the matrix \(\mathbf {M}\) is constant, the kinetic-energy PDE (21) is solvable with \(\mathbf {J_{2}} = 0\) and with a constant inertia matrix \(\mathbf {M_{d}} = \left( a_{1}a_{2} - a_{2}^{2} \right) \begin{bmatrix} m_{1} &{} m_{2} \\ m_{2} &{} m_{3} \\ \end{bmatrix}\), where \(m_{2} = m_{1}a_{2}/a_{1} + \psi \) and \(m_{1},m_{3},\psi \) are constant positive parameters. The candidate solution of the potential-energy PDE (22) is

$$\begin{aligned} \begin{aligned} V_{d}&= a_{3}\cos \left( q_{1} \right) \frac{1}{\left( m_{1} - m_{2} \right) a_{2}} + \frac{k_{p}}{2}\left( q_{2} + \gamma _{0}q_{1} \right) ^{2}, \\ \gamma _{0}&= - \frac{m_{2}a_{1} - m_{1}a_{2}}{\left( m_{1} - m_{2} \right) a_{2}}. \\ \end{aligned} \end{aligned}$$
(57)

The baseline IDA-PBC control law, which includes the disturbance compensation term \(u_{\text {adpt}}\), is

$$\begin{aligned}&u_{\text {ida-pbc}} = u_{\text {es}} + u_{\text {di}} + u_{\text {adpt}}, \nonumber \\&u_{\text {es}} = \gamma _{2}\sin \left( q_{1} \right) - k_{p}\gamma _{3}\left( q_{2} - q_{2}^{*} + \gamma _{0}q_{1} \right) , \nonumber \\&u_{\text {di}} = - \frac{k_{v}\left( m_{1}p_{2} - m_{2}p_{1} \right) }{\left( a_{1}a_{2} - a_{2}^{2} \right) \left( m_{1}m_{3} - m_{2}^{2} \right) }, \nonumber \\&u_{\text {adpt}} = \widetilde{\delta }_2 - \widetilde{\delta }_1 \frac{m_2 - m_3}{m_1 - m_2} + k_p \gamma _3 \gamma _{0} q_1^*. \end{aligned}$$
(58)

The disturbance estimates \(\widetilde{\delta }_1 = \widehat{\delta }_1 + \beta _1\) and \(\widetilde{\delta }_2 = \widehat{\delta }_2 + \beta _2\) are computed according to (33) as

$$\begin{aligned} \begin{aligned} \dot{\widehat{\delta }}_1&= \gamma (a_2 a_3 \sin (q_1) - a_2 u_{\text {ida-pbc}} - a_2 \widetilde{\delta }_1 + a_2 \widetilde{\delta }_2 ), \\ \dot{\widehat{\delta }}_2&= \gamma (-a_2 a_3 \sin (q_1) + (a_1 + a_2) u_{\text {ida-pbc}} \\&\quad + a_2 \widetilde{\delta }_1 - (a_1 + a_2) \widetilde{\delta }_2 ), \\ \beta _1&= -\gamma ((a_1 + a_2)a_2 - a_2^2) \dot{q}_1, \\ \beta _2&= -\gamma ((a_1 + a_2)a_2 - a_2^2) \dot{q}_2, \end{aligned} \end{aligned}$$
(59)

where the terms \(k_{p},\ k_{v}, \gamma >0\) are constant tuning parameters, while the constant terms \(\gamma _{2},\gamma _{3}\) are defined as

$$\begin{aligned} \begin{aligned} \gamma _{2}&= a_{3}(m_{2} - m_{3})/(m_{1} - m_{2})\ , \\ \gamma _{3}&= (\varepsilon a_{1}(m_{2} - m_{3})/(m_{1} - m_{2}) - (m_{3}a_{1} - m_{2}a_{2})). \\ \end{aligned} \end{aligned}$$
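The estimator (59) can be transcribed term by term, as in the sketch below. This is an illustrative reading of the equations rather than the authors' code: the function names and the numerical values in the example are assumptions, and the time integration of \(\widehat{\delta }_1, \widehat{\delta }_2\) is left to whichever integrator is used for the plant.

```python
import math


def estimator_rates(q1, u_ida_pbc, d1_tilde, d2_tilde, a1, a2, a3, gamma):
    """Right-hand sides of the two estimator state equations in (59)."""
    d1_hat_dot = gamma * (a2 * a3 * math.sin(q1) - a2 * u_ida_pbc
                          - a2 * d1_tilde + a2 * d2_tilde)
    d2_hat_dot = gamma * (-a2 * a3 * math.sin(q1) + (a1 + a2) * u_ida_pbc
                          + a2 * d1_tilde - (a1 + a2) * d2_tilde)
    return d1_hat_dot, d2_hat_dot


def beta_terms(q1_dot, q2_dot, a1, a2, gamma):
    """Velocity-dependent terms beta_1, beta_2 in (59)."""
    k = gamma * ((a1 + a2) * a2 - a2 ** 2)  # equals gamma * a1 * a2
    return -k * q1_dot, -k * q2_dot


def disturbance_estimates(d_hat, beta):
    """Estimates delta_tilde_i = delta_hat_i + beta_i."""
    return d_hat[0] + beta[0], d_hat[1] + beta[1]


if __name__ == "__main__":
    # Hypothetical values for a1, a2, a3 and the state; gamma = 3.5 as in Sect. 5.2.1.
    beta = beta_terms(q1_dot=2.0, q2_dot=-4.0, a1=0.5, a2=0.25, gamma=3.5)
    d_tilde = disturbance_estimates((0.0, 0.0), beta)
    print(estimator_rates(0.1, 0.0, d_tilde[0], d_tilde[1],
                          a1=0.5, a2=0.25, a3=2.0, gamma=3.5))
```

As written in (59), both rates vanish when \(\widetilde{\delta }_1 = a_{3}\sin (q_{1})\) and \(\widetilde{\delta }_2 = u_{\text {ida-pbc}}\), which can serve as a quick consistency check of a transcription.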

The dynamic tube equation and the additional control action \(u_{0}\) that is included in \(u_{\text {ida-pbc}}\) in the optimization problem \(\mathcal {P}_3\) are given by (36) as

$$\begin{aligned} \begin{aligned} u_{0}&= - \frac{\alpha c \left( m_{1}p_{2} - m_{2}p_{1} \right) }{\left( a_{1}a_{2} - a_{2}^{2} \right) \left( m_{1}m_{3} - m_{2}^{2} \right) }, \\ \dot{c}&= \left( p_{1}^{2}+p_{2}^{2}\right) \mu ^{2} \\&\quad - \alpha c \left( \frac{\left( m_{1}p_{2} - m_{2}p_{1} \right) }{\left( a_{1}a_{2} - a_{2}^{2} \right) \left( m_{1}m_{3} - m_{2}^{2} \right) }\right) ^{2}. \end{aligned} \end{aligned}$$
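Comparing \(u_{0}\) above with \(u_{\text {di}}\) in (58) shows that both are proportional to the same scalar, so their sum equals a damping injection with gain \(k_{v} + \alpha c\). The sketch below evaluates the three quantities to make this explicit. The function name and all numerical values in the example are hypothetical, and the momenta \(p_{1}, p_{2}\) are taken as given inputs.

```python
def damping_and_tube_terms(p1, p2, c, alpha, k_v, a1, a2, m1, m2, m3, mu):
    """Return (u_di, u0, c_dot) for the inertia-wheel pendulum."""
    denom = (a1 * a2 - a2 ** 2) * (m1 * m3 - m2 ** 2)
    s = (m1 * p2 - m2 * p1) / denom   # scalar shared by u_di, u0 and c_dot
    u_di = -k_v * s                   # damping injection from (58)
    u0 = -alpha * c * s               # additional tube control action
    c_dot = (p1 ** 2 + p2 ** 2) * mu ** 2 - alpha * c * s ** 2
    return u_di, u0, c_dot


if __name__ == "__main__":
    # Hypothetical values chosen so that denom = 1 and s = 3.
    u_di, u0, c_dot = damping_and_tube_terms(
        p1=1.0, p2=2.0, c=2.0, alpha=0.5, k_v=0.8,
        a1=2.0, a2=1.0, m1=2.0, m2=1.0, m3=1.0, mu=1.0)
    print(u_di, u0, c_dot)
```

In this reading, a larger product \(\alpha c\) increases the effective damping applied to the plant.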

5.2.1 Simulation results

The IDA-PBC tuning parameters have been chosen as \(\psi = 1.0\), \(k_v = 0.8\), \(k_p = 2.0\), \(\mathbf {M_{d}} = \begin{bmatrix} 0.4 &{} \star \\ (\frac{-0.4 a_2}{a_1+a_2}+\psi ) &{} 5 \\ \end{bmatrix}\), and \(\gamma =3.5\). The weights \(\mathbf {R}=\beta _{\text {obj}} = 5.0\) and \(\gamma _{\text {obj}} = 0.7\) were tuned empirically. The output error tolerance was set to \(0.0015 \text {rad}\) and the disturbance bound to \(\mu =1\). The control policy was implemented with \(k_m=20\) and \(k_a=-4\).

Fig. 8 Schematic of the inertia-wheel pendulum

Table 1 Mechanical parameters of the inertia-wheel-pendulum

Figure 9a shows the output error with a maximum amplitude of approximately \(2.3\%\). Figure 9b shows the control input, the feedforward control (problem \(\mathcal {P}_1\)), and the feedback control (dynamic tube-MPC, problem \(\mathcal {P}_3\)). The control policy corrects the output during the time intervals \(0.100 \le t \le 0.111\) and \(0.130 \le t \le 0.139\). Figure 9c shows the angular positions, where the plant output is \(q_2\), and the desired trajectory. Figure 9d shows the angular velocities of the simulated plant. Figure 9e shows the phase diagram of the unactuated state \(q_{1}\), while Fig. 9f, g show the independent variable \(\alpha \) computed by \(\mathcal {P}_3\) and its time derivative \(\dot{\alpha }\) to illustrate their evolution in time. Figure 9h shows that c varies in time during the optimization process. In this case, thirteen iterations were required for the convergence of the dynamic tube-MPC, eight for the first part and five for the second. Figure 9i shows the tube geometry corresponding to the first values of c at each iteration of the dynamic tube-MPC algorithm. Finally, Fig. 10 shows that the dynamic tube-MPC outperforms a tube-MPC with fixed tube geometry (i.e., one that employs a fixed value of \(\alpha \)).

A further set of results for larger values of \(q_1\) is shown in Figs. 11 and 12 to better highlight the nonlinear characteristics of the inertia-wheel pendulum. The control policy has been implemented with \(k_m=10, k_a=-5\) in the former and \(k_m=2, k_a=-1.2\) in the latter, while an output error tolerance of 0.0015 rad has been employed in both cases. The proposed control algorithm achieves the regulation goal for \(q_2\), while \(q_1\) remains in a nonlinear range (i.e., \(\sin {(q_1)} \ne q_1\)) similar to that employed in [29]. In summary, the results indicate that dynamic tube-MPC is effective for nonlinear underactuated mechanical systems in the presence of a class of external disturbances.

Fig. 9 Simulation results showing the proposed dynamic tube-MPC for the inertia-wheel-pendulum

Fig. 10 Simulation results comparing dynamic tube-MPC and tube-MPC with fixed \(\alpha \)

Fig. 11 Simulation results showing the proposed dynamic tube-MPC for the inertia-wheel-pendulum with larger angles \(q_1 \ge 0.2\)

Fig. 12 Simulation results showing the proposed dynamic tube-MPC for the inertia-wheel-pendulum with larger angles \(q_1 \ge 1\)

6 Conclusion

A dynamic tube-MPC for underactuated mechanical systems has been proposed. This robust control method combines MPC with an ancillary control designed with the IDA-PBC methodology, resulting in a new nonlinear MPC algorithm. An adaptive control law is also introduced to compensate for the effect of a class of unknown external disturbances under some assumptions. The proposed strategy was demonstrated with simulations on two examples: a two-mass-spring-damper system with parametric uncertainty on the damper and with either a linear or a nonlinear spring, and an inertia-wheel-pendulum system with unmodeled external disturbances. The simulation results indicate that the proposed approach is superior to tube-MPC with fixed tube geometry.

While the proposed approach is general in principle, practical limitations include: (i) the need to solve the PDEs analytically in order to compute the ancillary control law using IDA-PBC; (ii) the large number of parameters; (iii) the possibility of numerical errors, which can propagate across multiple optimal control problems and might become comparable to the output error.

Future work aims to extend the proposed method by further relaxing the assumptions on the external disturbances. In addition, we aim to investigate the performance of the proposed algorithm for a wider range of underactuated mechanical systems and to analyze in more detail the effect of the tuning parameters on the performance and on the stability of the closed-loop system.