Abstract
We derive an optimal control formulation for a nonholonomic mechanical system using the nonholonomic constraint itself as the control. We focus on Suslov’s problem, which is defined as the motion of a rigid body with a vanishing projection of the body frame angular velocity on a given direction \(\varvec{\xi }\). We derive the optimal control formulation, first for an arbitrary group, and then in the classical realization of Suslov’s problem for the rotation group \(\textit{SO}(3)\). We show that it is possible to control the system using the constraint \(\varvec{\xi }(t)\) and present numerical examples in which the system tracks quite complex trajectories such as a spiral.
1 Introduction
Controlling systems with constraints, in particular nonholonomic constraints, is an important topic of modern mechanics (Bloch 2003). Nonholonomic constraints tend to appear in mechanics as idealized representations of rolling without slipping, or some other physical process when a nonpotential force such as friction prevents the motion of the body in some directions that are controlled by generalized velocities (Arnold et al. 1997). Because of the importance of these systems in mechanics, control of nonholonomic systems has been studied extensively. We refer the reader to the recent papers on the motion planning of nonholonomic systems (Li and Canny 1992; Jean 2014), controllability (Lewis and Murray 1997; Lewis 2000; Cortés and Martínez 2004), controlled Lagrangians (Zenkov 2000; Zenkov et al. 2002; Bloch et al. 2015b), and the symmetry reduction approach to the optimal control of holonomic and nonholonomic systems (Bloch 1996; Koon and Marsden 1997; Bloch 2003; Agrachev and Sachkov 2004; Bullo and Lewis 2005). Recent progress in the area of optimal control of nonholonomic systems from the geometric point of view with extensive examples has been summarized in Bloch et al. (2015a), to which we refer the reader interested in the historical background and recent developments.
Physically, the methods of introducing control into a nonholonomic mechanical system can be roughly divided into two categories based on the actual realization. One way is to apply an external force while requiring the system to respect the nonholonomic constraints for all times. This is akin to the control of the motion of the Chaplygin sleigh using internal point masses (Osborne and Zenkov 2005) and the control of the continuously variable transmission (CVT) studied in Bloch et al. (2015a). The second way is to control the enforced direction of motion itself, as is done, for example, in the snakeboard (Koon and Marsden 1997). On some level, one can understand the physical validity of the latter control approach, since one would expect that a reasonably constructed mechanical system with an adequate steering mechanism should be controllable. The goal of this paper can be viewed as “orthogonal” to the latter approach. More precisely, we consider the control of the direction normal to the allowed motion. Physically, it is not obvious that such a control mechanism is viable, since it provides quite a “weak” control of the system: there could be many directions normal to a given vector, and the system remains free to move in a high-dimensional hyperplane. As it turns out, however, this type of control has the advantage of preserving appropriate integrals of motion, which yield additional restrictions on the motion of the system. While this discussion is by no means rigorous, it shows that there is a possibility of the system being controllable.
Moreover, the approach of control using the nonholonomic constraint itself has additional advantages. As was discussed recently in Gay-Balmaz and Putkaradze (2016), allowing nonholonomic constraints to vary in time preserves integrals of motion of the system. A general theory was derived for energy conservation in the case of nonholonomically constrained systems whose configuration manifold is a semidirect product group (for example, a rolling unbalanced ball on a plane). It was also shown that additional integrals of motion can persist under perturbations of the constraints. These ideas, we believe, are also useful for control theory applications. Indeed, we shall show that using the nonholonomic constraints themselves as controls preserves energy and thus puts additional constraints on the possible trajectories in phase space. On the one hand, this makes points with different energies unreachable; on the other hand, for tracking a desired trajectory on a fixed energy surface, the control is more efficient because it reduces the number of effective dimensions in the dynamics.
The paper is structured as follows. Section 2 outlines the general concepts and notation behind the mechanics of a rigid body with nonholonomic constraints, in order to make the discussion self-contained. We discuss the concepts of symmetry reduction, variational principles, and both the Lagrange–d’Alembert and vakonomic approaches to nonholonomic mechanics. Section 2.3 outlines the general principles of Pontryagin optimal control for dynamical systems defined in \({\mathbb {R}}^n\). Sections 3 and 4 discuss the control of a specific problem posed by Suslov, which describes the motion of a rigid body under the influence of a nonholonomic constraint stating that the projection of the body angular velocity onto a fixed axis (i.e., a nullifier axis) vanishes. While this problem is quite artificial in its mechanical realization, its relative simplicity has made it quite popular in the mechanics literature. The idea of this paper is to control the motion of the rigid body by changing the nullifier axis in time. A possible mechanical realization of Suslov’s problem when the nullifier axis is fixed is given in Borisov et al. (2011); however, it is unclear how to realize Suslov’s problem when the nullifier axis is permitted to change in time. We have chosen to focus on Suslov’s problem because it is one of the (deceptively) simplest examples of a mechanical system with nonholonomic constraints. Thus, this paper is concerned with the general theory applied to Suslov’s problem rather than with its physical realization. Section 3 derives the pure and controlled equations of motion for an arbitrary group, while Sect. 4 derives the pure and controlled equations of motion for \(\textit{SO}(3)\). In Sect. 3.2, particular attention is paid to the derivation of the boundary conditions needed to correctly apply the principles of Pontryagin optimal control while obtaining the controlled equations of motion. In Sect. 4.2, we show that this problem is controllable for \(\textit{SO}(3)\).
In Sect. 5, we derive an optimal control procedure for this problem and show numerical simulations that illustrate the possibility to solve quite complex optimal control and trajectory tracking problems. Section 6 provides a conclusion and summary of results, while Appendix A gives a brief survey of numerical methods to solve optimal control problems.
2 Background: Symmetry Reduction, Nonholonomic Constraints, and Optimal Control in Classical Mechanics
2.1 Symmetry Reduction and the Euler–Poincaré Equation
A mechanical system consists of a configuration space, which is a manifold M, and a Lagrangian \(L(q,{\dot{q}}): \textit{TM} \rightarrow {\mathbb {R}}\), \((q,{\dot{q}}) \in \textit{TM}\). The equations of motion are given by Hamilton’s variational principle of stationary action, which states that
\[ \delta \int _a^b L(q,{\dot{q}})\, \mathrm {d}t = 0 \qquad \mathrm{(2.1)} \]
for all smooth variations \(\delta q(t)\) of the curve q(t) that are defined for \(a\le t \le b\) and that vanish at the endpoints (i.e., \(\delta q(a)=\delta q(b)=0\)). Application of Hamilton’s variational principle yields the Euler–Lagrange equations of motion:
\[ \frac{\mathrm {d}}{\mathrm {d}t} \frac{\partial L}{\partial {\dot{q}}^i} - \frac{\partial L}{\partial q^i} = 0, \quad i=1,\ldots ,n. \qquad \mathrm{(2.2)} \]
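Hamilton’s principle can also be checked numerically for a concrete system. The following sketch (our illustration; the example Lagrangian \(L = \frac{1}{2}{\dot{x}}^2 - \frac{1}{2}x^2\) is not taken from the text) verifies that the action is stationary at a solution of the Euler–Lagrange equation: perturbing the solution by a variation vanishing at the endpoints changes the action only at second order.

```python
import math

# Illustrative sketch (example Lagrangian chosen by us, not from the text):
# for L(x, xdot) = xdot**2/2 - x**2/2 the Euler-Lagrange equation is
# xddot = -x, solved by x(t) = sin(t).  We verify that the action
# S = int_0^{pi/2} L dt is stationary there: perturbing the solution by
# eps*sin(2t), which vanishes at both endpoints, changes S only at O(eps**2).

def action(eps, n=20000):
    a, b = 0.0, math.pi / 2
    h = (b - a) / n
    s = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h                      # midpoint quadrature
        x = math.sin(t) + eps * math.sin(2 * t)
        xdot = math.cos(t) + 2 * eps * math.cos(2 * t)
        s += (0.5 * xdot ** 2 - 0.5 * x ** 2) * h
    return s

S0 = action(0.0)
dS = (action(1e-4) - action(-1e-4)) / 2e-4         # first variation of S
print(S0, dS)                                      # the first variation vanishes
```

The central difference in `eps` isolates the first variation, which vanishes at the true solution up to quadrature error.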
In the case when there is an intrinsic symmetry in the equations, in particular when \(M=G\), a Lie group, and when there is an appropriate invariance of the Lagrangian with respect to G, these Euler–Lagrange equations, defined on the tangent bundle of the group TG, can be substantially simplified, which is the topic of the Euler–Poincaré description of motion (Holm 2011; Poincaré 1901). More precisely, if the Lagrangian L is left-invariant, i.e., \(L(hg,h {\dot{g}})=L(g,{\dot{g}})\) \(\forall h \in G\), we can define the symmetry-reduced Lagrangian through the symmetry reduction \(\ell =\ell (\xi )\), \(\xi = g^{-1} {\dot{g}}\). Then, the equations of motion (2.2) are equivalent to the Euler–Poincaré equations of motion obtained from the variational principle
\[ \delta \int _a^b \ell (\xi )\, \mathrm {d}t = 0, \qquad \delta \xi = {\dot{\eta }} + \mathrm{ad}_{\xi }\, \eta . \qquad \mathrm{(2.3)} \]
The variations \(\eta (t)\), assumed to be sufficiently smooth, are sometimes called free variations. Application of the variational principle (2.3) yields the Euler–Poincaré equations of motion:
\[ \frac{\mathrm {d}}{\mathrm {d}t} \frac{\delta \ell }{\delta \xi } - \mathrm{ad}^*_{\xi }\, \frac{\delta \ell }{\delta \xi } = 0. \qquad \mathrm{(2.4)} \]
For right-invariant Lagrangians, i.e., \(L(gh,{\dot{g}}h)=L(g,{\dot{g}})\) \(\forall h \in G\), the Euler–Poincaré equations of motion (2.4) change by altering the sign in front of \(\mathrm{ad}^*_\xi \) from minus to plus. In what follows, we shall only consider the left-invariant systems for simplicity of exposition.
As an illustrative example, let us consider the motion of a rigid body rotating about its center of mass, fixed in space, with the unreduced Lagrangian defined as \(L=L(\Lambda , {\dot{\Lambda }})\), \(\Lambda \in \textit{SO}(3)\). The left-invariance of the Lagrangian reflects the physical fact that the Lagrangian of a rigid body is invariant under rotations. The reduced Lagrangian is then just the kinetic energy, \(\ell (\varvec{\Omega })=\frac{1}{2} {\mathbb {I}} \varvec{\Omega }\cdot \varvec{\Omega }\), with \({\mathbb {I}}\) being the inertia tensor and \(\varvec{\Omega }=\left( \Lambda ^{-1} {\dot{\Lambda }}\right) ^\vee \). With respect to the action of the group on the left, which corresponds to the description of the equations of motion in the body frame, the symmetry-reduced Lagrangian is of the form \(\ell \left( \Lambda ^{-1} {\dot{\Lambda }} \right) \). Here, we have defined the hat map \((\,\cdot \,)^\wedge : {\mathbb {R}}^3 \rightarrow \mathfrak {so}(3)\) and its inverse \((\,\cdot \,)^\vee : \mathfrak {so}(3) \rightarrow {\mathbb {R}}^3\) to be isomorphisms between \(\mathfrak {so}(3)\) (antisymmetric matrices) and \({\mathbb {R}}^3\) (vectors), computed as \(\widehat{a}_{ij}=- \epsilon _{i j k} a_k\). Then, with \(\varvec{\Pi }= {\mathbb {I}} \varvec{\Omega }\), \(\mathrm{ad}^*_{\varvec{\Omega }} \varvec{\Pi }= - \varvec{\Omega }\times \varvec{\Pi }\) and the Euler–Poincaré equations of motion for the rigid body become
\[ {\mathbb {I}} \dot{\varvec{\Omega }} = {\mathbb {I}} \varvec{\Omega }\times \varvec{\Omega }, \qquad \mathrm{(2.5)} \]
which are the well-known Euler equations of motion for a rigid body rotating about its center of mass, fixed in space.
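The conservation properties of the Euler equations can be verified numerically. The sketch below (with illustrative inertia values and initial data of our choosing) integrates \({\mathbb {I}}\dot{\varvec{\Omega }} = {\mathbb {I}}\varvec{\Omega }\times \varvec{\Omega }\) with a fourth-order Runge–Kutta step and checks that the kinetic energy remains constant along the flow.

```python
# Minimal numerical check (illustrative values, not from the paper): integrate
# the free rigid body Euler equations  I dOmega/dt = (I Omega) x Omega  with
# RK4 and verify that the kinetic energy <I Omega, Omega>/2 is conserved.
I = (1.0, 2.0, 3.0)                      # assumed principal moments of inertia

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def rhs(w):
    Iw = tuple(I[i] * w[i] for i in range(3))
    t = cross(Iw, w)                     # torque-free right-hand side, I Omega x Omega
    return tuple(t[i] / I[i] for i in range(3))

def rk4_step(w, h):
    k1 = rhs(w)
    k2 = rhs(tuple(w[i] + 0.5*h*k1[i] for i in range(3)))
    k3 = rhs(tuple(w[i] + 0.5*h*k2[i] for i in range(3)))
    k4 = rhs(tuple(w[i] + h*k3[i] for i in range(3)))
    return tuple(w[i] + h/6*(k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(3))

def energy(w):
    return 0.5 * sum(I[i] * w[i] * w[i] for i in range(3))

w = (0.5, -0.3, 0.8)
E0 = energy(w)
for _ in range(5000):
    w = rk4_step(w, 1e-3)
print(abs(energy(w) - E0))               # the drift remains tiny
```

The squared magnitude of the body angular momentum \(|{\mathbb {I}}\varvec{\Omega }|^2\) is conserved as well, as can be checked on the same trajectory.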
2.2 Nonholonomic Constraints and Lagrange–d’Alembert’s Principle
Suppose a mechanical system with configuration space M, a manifold of dimension n, must satisfy \(m < n\) constraints that are linear in velocity. To express these velocity constraints formally, the notion of a distribution is needed. Given the manifold M, a distribution \({\mathcal {D}}\) on M is a subset of the tangent bundle \(\textit{TM} = \bigcup _{q \in M} T_q M\): \({\mathcal {D}} = \bigcup _{q \in M} {\mathcal {D}}_q\), where \({\mathcal {D}}_q \subset T_q M\) and \(m = \mathrm {dim} \, {\mathcal {D}}_q < \mathrm {dim} \, T_q M = n\) for each \(q \in M\). A curve \(q(t) \in M\) satisfies the constraints if \({\dot{q}}(t) \in {\mathcal {D}}_{q(t)}\). Lagrange–d’Alembert’s principle states that the equations of motion are determined by
\[ \delta \int _a^b L(q,{\dot{q}})\, \mathrm {d}t = 0 \qquad \mathrm{(2.6)} \]
for all smooth variations \(\delta q(t)\) of the curve q(t) such that \(\delta q(t) \in {\mathcal {D}}_{q(t)}\) for all \(a\le t \le b\) and such that \(\delta q(a)=\delta q(b)=0\), and for which \({\dot{q}}(t) \in {\mathcal {D}}_{q(t)}\) for all \(a\le t \le b\). If one writes the nonholonomic constraint in local coordinates as \(\sum _{i=1}^n A(q)^j_i {\dot{q}}^i=0\), \(j=1, \ldots , m < n\), then (2.6) is written in local coordinates as
\[ \frac{\mathrm {d}}{\mathrm {d}t} \frac{\partial L}{\partial {\dot{q}}^i} - \frac{\partial L}{\partial q^i} = \sum _{j=1}^m \lambda _j A(q)^j_i, \quad i=1,\ldots ,n, \qquad \mathrm{(2.7)} \]
where the \(\lambda _j\) are Lagrange multipliers enforcing \(\sum _{i=1}^n A(q)^j_i {\delta q}^i=0\), \(j=1, \ldots , m\). Aside from Lagrange–d’Alembert’s approach, there is also an alternative vakonomic approach to derive the equations of motion for nonholonomic mechanical systems. Simply speaking, that approach relies on substituting the constraint into the Lagrangian before taking variations or, equivalently, enforcing the constraints using the appropriate Lagrange multiplier method. In general, it is an experimental fact that all known nonholonomic mechanical systems obey the equations of motion resulting from Lagrange–d’Alembert’s principle (Lewis and Murray 1995).
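As a concrete illustration of Lagrange–d’Alembert’s principle with a single velocity constraint, the following sketch integrates the classical knife-edge system (a standard textbook example, not treated in this paper): \(L = \frac{1}{2}({\dot{x}}^2 + {\dot{y}}^2 + {\dot{\varphi }}^2)\) with the nonholonomic constraint \({\dot{x}} \sin \varphi - {\dot{y}} \cos \varphi = 0\). Differentiating the constraint determines the multiplier, and the forward speed is then a conserved quantity.

```python
import math

# Standard textbook illustration (the "knife edge", not from this paper):
# L = (xdot^2 + ydot^2 + phidot^2)/2, constraint xdot*sin(phi) - ydot*cos(phi) = 0.
# Lagrange-d'Alembert gives
#   xddot = lam*sin(phi),  yddot = -lam*cos(phi),  phiddot = 0,
# and differentiating the constraint yields
#   lam = -phidot*(xdot*cos(phi) + ydot*sin(phi)).
# The forward speed v = xdot*cos(phi) + ydot*sin(phi) is conserved.

def rhs(s):
    x, y, phi, xd, yd, phid = s
    lam = -phid * (xd * math.cos(phi) + yd * math.sin(phi))
    return (xd, yd, phid, lam * math.sin(phi), -lam * math.cos(phi), 0.0)

def rk4_step(s, h):
    k1 = rhs(s)
    k2 = rhs(tuple(s[i] + 0.5*h*k1[i] for i in range(6)))
    k3 = rhs(tuple(s[i] + 0.5*h*k2[i] for i in range(6)))
    k4 = rhs(tuple(s[i] + h*k3[i] for i in range(6)))
    return tuple(s[i] + h/6*(k1[i] + 2*k2[i] + 2*k3[i] + k4[i]) for i in range(6))

s = (0.0, 0.0, 0.0, 1.0, 0.0, 0.5)       # forward speed 1, turning rate 0.5
for _ in range(4000):
    s = rk4_step(s, 1e-3)
x, y, phi, xd, yd, phid = s
constraint = xd * math.sin(phi) - yd * math.cos(phi)
speed = xd * math.cos(phi) + yd * math.sin(phi)
print(constraint, speed)                 # constraint stays satisfied, speed conserved
```

With a constant turning rate the trajectory is a circle, which is the expected knife-edge motion.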
2.3 Optimal Control and Pontryagin’s Minimum Principle
Given a dynamical system with a state \({\mathbf {x} }\) in \({\mathbb {R}}^n\), a fixed initial time a, and a fixed or free terminal time \(b>a\), suppose it is desired to find a control \(\mathbf {u}\) in \({\mathbb {R}}^k\) that minimizes
subject to satisfying the equations of motion \({\dot{{\mathbf {x} }}} = \mathbf {f}( {\mathbf {x} }, \mathbf {u},t)\), \(m_1\) initial conditions \(\varvec{\phi }({\mathbf {x} }(a))=\mathbf {0}\), and \(m_2\) terminal conditions \(\varvec{\psi }({\mathbf {x} }(b),b)=\mathbf {0}\). Following Bryson (1975) and using the method of Lagrange multipliers, this problem may be solved by finding \(\mathbf {u}\), \({\mathbf {x} }(a)\), and b that minimize
for an \(m_1\)-dimensional constant Lagrange multiplier vector \(\varvec{\rho }\), an \(m_2\)-dimensional constant Lagrange multiplier vector \(\varvec{\nu }\), and an n-dimensional time-varying Lagrange multiplier vector \(\varvec{\pi }\). Defining the Hamiltonian H as \(H({\mathbf {x} }, \mathbf {u},\varvec{\pi },t) = L({\mathbf {x} }, \mathbf {u},t)+\left\langle \varvec{\pi },\mathbf {f}({\mathbf {x} }, \mathbf {u},t)\right\rangle \) and by integrating by parts, S becomes
Before proceeding further with Pontryagin’s minimum principle, some terminology from the calculus of variations is briefly reviewed. Suppose that y is a time-dependent function, w is a time-independent variable, and Q is a scalar-valued functional that depends on y and w. The variation of y is \(\delta y \equiv \left. \frac{\partial y}{\partial \epsilon } \right| _{\epsilon =0}\), the differential of y is \(\mathrm {d} y \equiv \delta y + {\dot{y}} \mathrm {d}t = \left. \frac{\partial y}{\partial \epsilon } \right| _{\epsilon =0}+\left. \frac{\partial y}{\partial t} \right| _{\epsilon =0} \mathrm {d}t\), and the differential of w is \(\mathrm {d} w \equiv \left. \frac{\mathrm {d} w}{\mathrm {d} \epsilon } \right| _{\epsilon =0}\), where \(\epsilon \) represents an independent “variational” variable. The variation of Q with respect to y is \(\delta _y Q \equiv \frac{\partial Q}{\partial y} \delta y\), while the differential of Q with respect to w is \(\mathrm {d}_{w} Q \equiv \frac{\partial Q}{\partial w} \mathrm {d} w\). The total differential (or for brevity “the differential”) of Q is \(\mathrm {d} Q \equiv \delta _y Q + \mathrm {d}_{w} Q =\frac{\partial Q}{\partial y} \delta y + \frac{\partial Q}{\partial w} \mathrm {d} w\). Colloquially, the variation of Q with respect to y means the change in Q due to a small change in y, the differential of Q with respect to w means the change in Q due to a small change in w, and the total differential of Q means the change in Q due to small changes in y and w. The extension to vectors of time-dependent functions and time-independent variables is straightforward. If \(\mathbf {y}\) is a vector of time-dependent functions, \(\mathbf {w}\) is a vector of time-independent variables, and Q is a scalar-valued functional depending on \(\mathbf {y}\) and \(\mathbf {w}\), then the variation of \(\mathbf {y}\) is \(\delta \mathbf {y} \equiv \left. 
\frac{\partial \mathbf {y}}{\partial \epsilon } \right| _{\epsilon =0}\), the differential of \(\mathbf {y}\) is \(\mathrm {d} \mathbf {y} \equiv \delta \mathbf {y} + {\dot{\mathbf {y}}} \mathrm {d}t = \left. \frac{\partial \mathbf {y}}{\partial \epsilon } \right| _{\epsilon =0}+\left. \frac{\partial \mathbf {y}}{\partial t} \right| _{\epsilon =0} \mathrm {d}t\), the differential of \(\mathbf {w}\) is \(\mathrm {d} \mathbf {w} \equiv \left. \frac{\mathrm {d} \mathbf {w}}{\mathrm {d} \epsilon } \right| _{\epsilon =0}\), the variation of Q with respect to \(\mathbf {y}\) is \(\delta _\mathbf {y} Q \equiv \frac{\partial Q}{\partial \mathbf {y}} \delta \mathbf {y}\), the differential of Q with respect to \(\mathbf {w}\) is \(\mathrm {d}_{\mathbf {w}} Q \equiv \frac{\partial Q}{\partial \mathbf {w}} \mathrm {d} \mathbf {w}\), and the total differential (or for brevity “the differential”) of Q is \(\mathrm {d} Q \equiv \delta _\mathbf {y} Q + \mathrm {d}_{\mathbf {w}} Q =\frac{\partial Q}{\partial \mathbf {y}} \delta \mathbf {y} + \frac{\partial Q}{\partial \mathbf {w}} \mathrm {d} \mathbf {w}\).
Returning to (2.10), demanding that \(\mathrm {d} S=0\) for all variations \(\delta {\mathbf {x} }\), \(\delta \mathbf {u}\), and \(\delta \varvec{\pi }\) and for all differentials \(\mathrm {d} \varvec{\rho }\), \(\mathrm {d} \varvec{\nu }\), \(\mathrm {d} {\mathbf {x} }(b)\), and \(\mathrm {d} b\) gives the optimally controlled equations of motion, which are canonical in the variables \({\mathbf {x} }\) and \(\varvec{\pi }\),
In addition to these equations, the solution must satisfy the optimality condition
and the boundary conditions which are obtained by equating the appropriate boundary terms involving variations or differentials to zero. The left boundary conditions are
and the right boundary conditions are
where the final right boundary condition (2.18) is only needed if the terminal time b is free. If \(H_{\varvec{u}\varvec{u}}\) is nonsingular, the implicit function theorem guarantees that the optimality condition (2.13) determines the k-vector \(\varvec{u}\). Equations (2.11), (2.12), and (2.14)–(2.18) define a two-point boundary value problem (TPBVP). If the terminal time b is fixed, then the solution of the 2n differential equations (2.11) and (2.12) and the choice of the \(m_1+m_2\) free parameters \(\varvec{\rho }\) and \(\varvec{\nu }\) are determined by the \(2n+m_1+m_2\) boundary conditions (2.14)–(2.17). If the terminal time b is free, then the solution of the 2n ordinary differential equations (2.11) and (2.12) and the choice of the \(m_1+m_2+1\) free parameters \(\varvec{\rho }\), \(\varvec{\nu }\), and b are determined by the \(2n+m_1+m_2+1\) boundary conditions (2.14)–(2.18).
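The structure of such a TPBVP can be illustrated on a toy problem (the example, and all values in it, are ours): minimize \(\int _0^1 \frac{1}{2}u^2\, \mathrm {d}t\) subject to \({\dot{x}} = u\), \(x(0)=0\), \(x(1)=1\). The Hamiltonian is \(H = \frac{1}{2}u^2 + \pi u\); the optimality condition \(H_u = 0\) gives \(u = -\pi \), and \({\dot{\pi }} = -H_x = 0\), so the costate is constant. The unknown initial costate can be found by shooting:

```python
# Toy illustration of the TPBVP structure (example values are ours, not the
# paper's): minimize int_0^1 u^2/2 dt subject to xdot = u, x(0)=0, x(1)=1.
# H = u^2/2 + pi*u; H_u = 0 gives u = -pi, and pidot = -H_x = 0, so the
# costate pi is constant.  We shoot on the unknown initial costate pi(0).

def terminal_x(pi0, n=1000):
    # integrate xdot = -pi with pi constant on [0, 1] by forward Euler
    x, h = 0.0, 1.0 / n
    for _ in range(n):
        x += h * (-pi0)
    return x

# bisection on the shooting residual x(1) - 1 (decreasing in pi0)
lo, hi = -10.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if terminal_x(mid) - 1.0 > 0:
        lo = mid
    else:
        hi = mid
pi0 = 0.5 * (lo + hi)
print(pi0)            # close to the analytic answer pi = -1 (so u = 1)
```

The analytic optimum is the constant control \(u \equiv 1\), i.e., \(\pi \equiv -1\), which the shooting iteration recovers.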
3 Suslov’s Optimal Control Problem for an Arbitrary Group
3.1 Derivation of Suslov’s Pure Equations of Motion
Suppose G is a Lie group having all appropriate properties for the application of the Euler–Poincaré theory (Marsden and Ratiu 2013; Holm 2011). As we mentioned above, if the Lagrangian \(L=L(g,{\dot{g}})\) is left G-invariant, then the problem can be reduced to the consideration of the symmetry-reduced Lagrangian \(\ell (\Omega )\) with \(\Omega =g^{-1} {\dot{g}}\). Here, we concentrate on the left-invariant Lagrangians as being pertinent to the dynamics of a rigid body. A parallel theory of right-invariant Lagrangians can be developed as well in a completely equivalent fashion (Holm 2011). We also assume that there is a suitable pairing between the Lie algebra \({\mathfrak {g}}\) and its dual \({\mathfrak {g}}^*\), which leads to the co-adjoint operator
\[ \left\langle \mathrm{ad}^*_{\zeta }\, \mu , \sigma \right\rangle = \left\langle \mu , \mathrm{ad}_{\zeta }\, \sigma \right\rangle , \qquad \mu \in {\mathfrak {g}}^*, \quad \zeta , \sigma \in {\mathfrak {g}}, \quad \mathrm{ad}_{\zeta }\, \sigma = [\zeta , \sigma ]. \]
Then, the equations of motion are obtained by Euler–Poincaré’s variational principle
with the variations \(\delta \Omega \) satisfying
where \(\eta (t)\) is an arbitrary \({\mathfrak {g}}\)-valued function satisfying \(\eta (a)=\eta (b)=0\). Then, the equations of motion are the Euler–Poincaré equations of motion
Let \(\xi (t) \in {\mathfrak {g}}^*\), with \(\xi (t) \ne 0 \; \forall t\), and introduce the constraint
\[ \left\langle \xi , \Omega \right\rangle = \gamma (\xi ,t). \qquad \mathrm{(3.4)} \]
Due to the constraint (3.4), Lagrange–d’Alembert’s principle states that the variations \(\eta \in {\mathfrak {g}}\) have to satisfy
\[ \left\langle \xi , \eta \right\rangle = 0. \qquad \mathrm{(3.5)} \]
Using (3.5), Suslov’s pure equations of motion are obtained:
\[ \frac{\mathrm {d}}{\mathrm {d}t} \frac{\delta \ell }{\delta \Omega } - \mathrm{ad}^*_{\Omega }\, \frac{\delta \ell }{\delta \Omega } = \lambda \xi , \qquad \mathrm{(3.6)} \]
where \(\lambda \) is the Lagrange multiplier enforcing (3.5). In order to explicitly solve (3.6) for \(\lambda \), we will need to further assume a linear connection between the angular momentum \(\Pi \) and the angular velocity \(\Omega \). Thus, we assume that \(\Pi = {\mathbb {I}} \Omega \), where \({\mathbb {I}} :{\mathfrak {g}} \rightarrow {\mathfrak {g}}^*\) is an invertible linear operator with an adjoint \({\mathbb {I}}^* :{\mathfrak {g}}^* \rightarrow {\mathfrak {g}}\); \({\mathbb {I}}\) has the physical meaning of the inertia operator when the Lie group G under consideration is the rotation group \(\textit{SO}(3)\). Under this assumption, we pair both sides of (3.6) with \({\mathbb {I}}^{-1*} \xi \) and obtain the following expression for the Lagrange multiplier \(\lambda \):
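The computation behind the multiplier formula (3.7) can be sketched as follows (our reconstruction of the steps, using the sign conventions of the left-invariant Euler–Poincaré equations and the adjoint conventions stated above):

```latex
% Our reconstruction of the computation behind the multiplier formula (3.7).
% With \Pi = \mathbb{I}\Omega, the constrained equation of motion reads
%   \dot\Pi = \mathrm{ad}^*_\Omega \Pi + \lambda \xi .
% Pairing both sides with \mathbb{I}^{-1*}\xi and using
% \langle \dot\Pi, \mathbb{I}^{-1*}\xi \rangle = \langle \xi, \dot\Omega \rangle,
% while differentiating the constraint \langle \xi, \Omega \rangle = \gamma(\xi,t)
% gives \langle \xi, \dot\Omega \rangle = \dot\gamma - \langle \dot\xi, \Omega \rangle,
% so that
\begin{align*}
\dot{\gamma} - \left\langle \dot{\xi}, \Omega \right\rangle
  &= \left\langle \mathrm{ad}^*_{\Omega}\, \mathbb{I}\Omega ,\; \mathbb{I}^{-1*}\xi \right\rangle
     + \lambda \left\langle \xi ,\; \mathbb{I}^{-1*}\xi \right\rangle , \\
\lambda
  &= \frac{\dot{\gamma} - \left\langle \dot{\xi}, \Omega \right\rangle
           - \left\langle \mathrm{ad}^*_{\Omega}\, \mathbb{I}\Omega ,\; \mathbb{I}^{-1*}\xi \right\rangle }
          {\left\langle \xi ,\; \mathbb{I}^{-1*}\xi \right\rangle } .
\end{align*}
```

The denominator is nonzero precisely when \(\xi \ne 0\) and \({\mathbb {I}}\) is positive definite, which is the setting of the rigid body.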
If we, moreover, assume that \(\gamma (\xi ,t)\) is a constant, e.g., \(\gamma (\xi ,t)=0\) as in the standard formulation of Suslov’s problem, and \({\mathbb {I}} :{\mathfrak {g}} \rightarrow {\mathfrak {g}}^*\) is a time-independent, invertible linear operator that is also self-adjoint (i.e., \({\mathbb {I}}={\mathbb {I}}^*\)), then (3.7) simplifies to
\[ \lambda = - \frac{\left\langle {\dot{\xi }}, \Omega \right\rangle + \left\langle \mathrm{ad}^*_{\Omega }\, {\mathbb {I}} \Omega , {\mathbb {I}}^{-1} \xi \right\rangle }{\left\langle \xi , {\mathbb {I}}^{-1} \xi \right\rangle }, \qquad \mathrm{(3.8)} \]
the kinetic energy is
\[ E = \frac{1}{2} \left\langle {\mathbb {I}} \Omega , \Omega \right\rangle , \qquad \mathrm{(3.9)} \]
the time derivative of the kinetic energy is
\[ \frac{\mathrm {d} E}{\mathrm {d} t} = \lambda \left\langle \xi , \Omega \right\rangle = \lambda \, \gamma (\xi ,t), \qquad \mathrm{(3.10)} \]
and kinetic energy is conserved if \(\gamma (\xi ,t)=0\).
3.2 Derivation of Suslov’s Optimally Controlled Equations of Motion
Consider the problem (3.6) and assume that \(\Pi = {\mathbb {I}} \Omega \) so that the explicit equation for the Lagrange multiplier (3.7) holds. We now turn to the central question of the paper, namely optimal control of the system by varying the nullifier (or annihilator) \(\xi (t)\). The optimal control problem is defined as follows. Consider a fixed initial time a, a fixed or free terminal time \(b > a\), the cost function \(C\left( \Omega ,\dot{\Omega },\xi ,{\dot{\xi }},t\right) \), and the following optimal control problem
and subject to the left and right boundary conditions \(\Omega (a)=\Omega _a\) and \(\Omega (b)=\Omega _b\). Construct the performance index
where the additional unknowns are a \({\mathfrak {g}}\)-valued function of time \(\kappa (t)\) and the constants \(\rho , \, \nu \in {\mathfrak {g}}^*\) enforcing the boundary conditions.
Remark 3.1
(On the nature of the pairing in (3.12)) For simplicity of calculation and notation, we assume that the pairing in (3.12) between vectors in \({\mathfrak {g}}\) and \({\mathfrak {g}}^*\) is the same as the one used in the derivation of Suslov’s problem in Sect. 3.1. In principle, one could use a different pairing, which would necessitate a different notation for the \(\mathrm{ad}\) operator. We believe that while such a generalization is rather straightforward, it introduces a cumbersome and nonintuitive notation. For the case when \(G=\textit{SO}(3)\) considered later in Sect. 4, we will take the simplest possible pairing, the scalar product of vectors in \({\mathbb {R}}^3\). In that case, the \(\mathrm{ad}\) and \(\mathrm{ad}^*\) operators are simply the vector cross product with an appropriate sign.
Pontryagin’s minimum principle gives necessary conditions that a minimum solution of (3.11) must satisfy, if it exists. These necessary conditions are obtained by equating the differential of S to 0, resulting in appropriately coupled equations for the state and control variables. While this calculation is well established (Bryson 1975; Hull 2013), we present it here for completeness of the exposition as it is relevant to our further discussion.
Following Bryson (1975), we denote all variations of S coming from the time-dependent variables \(\kappa \), \(\Omega \), and \(\xi \) as \(\delta S\) and write \(\delta S = \delta _{\kappa } S+\delta _{\Omega } S+\delta _{\xi } S\). By using partial differentiation, the differential of S with respect to each time-independent variable \(\rho \), \(\nu \), and b is \(\left\langle \frac{\partial S}{\partial \rho },\mathrm {d} \rho \right\rangle \), \(\left\langle \frac{\partial S}{\partial \nu }, \mathrm {d} \nu \right\rangle \), and \(\frac{\partial S}{\partial b} \mathrm {d} b\), respectively. Thus, the differential of S is given by
Each term in \(\mathrm {d} S\) is computed below. It is important to present this calculation in some detail, in particular, because of the contribution of the boundary conditions. The variation of S with respect to \(\kappa \) is
Since \(\delta \Omega (a)=0\), \(\mathrm {d} \Omega (b) = \delta \Omega (b) + {\dot{\Omega }}(b) \mathrm {d} b\), and
the variation of S with respect to \(\Omega \) is
Since \(\mathrm {d} \xi (b) = \delta \xi (b) + {\dot{\xi }}(b) \mathrm {d} b \), the variation of S with respect to \(\xi \) is
The remaining terms in \(\mathrm {d}S\), due to the differentials of S with respect to the time-independent variables, are
and
Adding all the terms in \(\mathrm {d} S\) together and demanding that \(\mathrm {d} S=0\) for all \(\delta \kappa \), \(\delta \Omega \), \(\delta \xi \), \(\mathrm {d} \Omega (b)\), \(\mathrm {d} \xi (b)\), \(\mathrm {d} \rho \), \(\mathrm {d} \nu \), and \(\mathrm {d} b\) (note here that \(\delta \kappa \), \(\delta \Omega \), and \(\delta \xi \) are variations defined for \(a \le t \le b\)) gives the two-point boundary value problem defined by the following equations of motion on \(a\le t \le b\)
the left boundary conditions at \(t=a\)
and the right boundary conditions at \(t=b\)
where \(\lambda \) is given by (3.7) and the final right boundary condition (3.28) is only needed if the terminal time b is free. Equations (3.21), (3.22), and (3.23) together with the left boundary conditions (3.24)–(3.25) and the right boundary conditions (3.26)–(3.27) and, if needed, (3.28), constitute the optimally controlled equations of motion for Suslov’s problem using change in the nonholonomic constraint direction as the control.
4 Suslov’s Optimal Control Problem for Rigid Body Motion
4.1 Derivation of Suslov’s Pure Equations of Motion
Having discussed the formulation of Suslov’s problem in the general case for an arbitrary group, let us now turn our attention to the case of the particular Lie group \(G=\textit{SO}(3)\), which represents Suslov’s problem in its original formulation and where the unreduced Lagrangian is \(L=L(\Lambda , {\dot{\Lambda }})\), with \(\Lambda \in \textit{SO}(3)\). Suslov’s problem studies the behavior of the body angular velocity \(\varvec{\Omega }\equiv \left[ \Lambda ^{-1} {\dot{\Lambda }} \right] ^\vee \in {\mathbb {R}}^3\) subject to the nonholonomic constraint
\[ \left\langle \varvec{\Omega }, \varvec{\xi }\right\rangle = 0 \qquad \mathrm{(4.1)} \]
for some prescribed, possibly time-varying vector \(\varvec{\xi }\in {\mathbb {R}}^3\) expressed in the body frame. Physically, such a system corresponds to a rigid body rotating about a fixed point, with the body angular velocity required to remain normal to the prescribed vector \(\varvec{\xi }(t)\in {\mathbb {R}}^3\). The fact that the vector \(\varvec{\xi }\) identifying the nonholonomic constraint is defined in the body frame makes direct physical interpretation and realization of Suslov’s problem somewhat challenging. Still, Suslov’s problem is perhaps one of the simplest and, at the same time, most insightful and pedagogical problems in the field of nonholonomic mechanics, and it has attracted considerable attention in the literature. The original formulation of this problem is due to Suslov in 1902 (Suslov 1946) (still available only in Russian), where he assumed that \(\varvec{\xi }\) was constant. This paper considers the more general case where \(\varvec{\xi }\) varies with time. In order to match the standard state-space notation of control theory, the control is taken to be \(\varvec{u}= {\dot{\varvec{\xi }}}\). We shall also note that the control-theoretic treatment of unconstrained rigid body motion from the geometric point of view is discussed in detail in Agrachev and Sachkov (2004), Chapters 19 (for general compact Lie groups) and 22.
For conciseness, the time dependence of \(\varvec{\xi }\) is often suppressed in what follows. We shall note that there is a more general formulation of Suslov’s problem when \(G=\textit{SO}(3)\) which includes a potential energy in the Lagrangian,
Depending on the type of potential energy, there are up to 3 additional integrals of motion. For a review of Suslov’s problem and a summary of results in this area, the reader is referred to an article by Kozlov (2002).
Let us choose a body frame coordinate system with an orthonormal basis \(\left( \mathbf {E}_1,\mathbf {E}_2,\mathbf {E}_3 \right) \) in which the rigid body’s inertia matrix \({\mathbb {I}}\) is diagonal (i.e., \({\mathbb {I}} = \mathrm {diag}({{\mathbb {I}}_1,{\mathbb {I}}_2,{\mathbb {I}}_3})\)) and suppose henceforth that all body frame tensors are expressed with respect to this particular choice of coordinate system. Let \(\left( \mathbf {e}_1,\mathbf {e}_2,\mathbf {e}_3 \right) \) denote the orthonormal basis for the spatial frame coordinate system and denote the transformation from the body to spatial frame coordinate systems by the rotation matrix \(\Lambda (t) \in \textit{SO}(3)\). The rigid body’s Lagrangian is its kinetic energy: \(\ell = \frac{1}{2} \left\langle {\mathbb {I}} \varvec{\Omega }, \varvec{\Omega }\right\rangle \). Applying Lagrange–d’Alembert’s principle to the nonholonomic constraint (4.1) yields the equations of motion
\[ {\mathbb {I}} \dot{\varvec{\Omega }} = {\mathbb {I}} \varvec{\Omega }\times \varvec{\Omega }+ \lambda \varvec{\xi }, \qquad \mathrm{(4.3)} \]
where the Lagrange multiplier \(\lambda \) is given as
\[ \lambda = - \frac{\left\langle \varvec{\Omega }, {\dot{\varvec{\xi }}} \right\rangle + \left\langle {\mathbb {I}} \varvec{\Omega }\times \varvec{\Omega }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle }{\left\langle \varvec{\xi }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle }, \qquad \mathrm{(4.4)} \]
thereby incorporating the constraint equation. In order to make \(\lambda \) well defined in (4.4), note that it is implicitly assumed that \(\varvec{\xi }\ne 0\) (i.e., \(\varvec{\xi }(t) \ne 0 \; \forall t\)). As is easy to verify, equations (4.3) and (4.4) are a particular case of the equations of motion (3.6) and the Lagrange multiplier (3.8). Also, equations (4.3) and (4.4) generalize the well-known equations of motion for Suslov’s problem (Bloch 2003) to the case of time-varying \(\varvec{\xi }(t)\). For the purposes of optimal control theory, we rewrite (4.3) and (4.4) into a single equation as
\[ \mathbf {q}\left( \varvec{\Omega },\varvec{\xi }\right) \equiv {\mathbb {I}} \dot{\varvec{\Omega }} - {\mathbb {I}} \varvec{\Omega }\times \varvec{\Omega }- \lambda \varvec{\xi }= \mathbf {0}, \qquad \mathrm{(4.5)} \]
with \(\lambda \) given by (4.4).
We would like to state several useful observations about the nature of the dynamics in the free Suslov’s problem, i.e., the results that are valid for arbitrary \(\varvec{\xi }(t)\), before proceeding to the optimal control case.
On the nature of constraint preservation Suppose that \(\varvec{\Omega }(t)\) is a solution to (4.5) (equivalently (4.3)), for a given \(\varvec{\xi }(t)\) with \(\lambda \) given by (4.4). We can rewrite the equation for the Lagrange multiplier as
On the other hand, multiplying both sides of (4.3) by \({\mathbb {I}}^{-1} \varvec{\xi }\) and solving for \(\lambda \) gives
Thus, from (4.6) and (4.7) it follows that the equations of motion (4.5) with \(\lambda \) given by (4.4) lead to \(\displaystyle {\frac{\mathrm {d} }{\mathrm {d} t}} \left\langle \varvec{\Omega }, \varvec{\xi }\right\rangle = 0\), so that \(\left\langle \varvec{\Omega }, \varvec{\xi }\right\rangle = c\), a constant that is not necessarily equal to 0. In other words, equations (4.5) and (4.4) must be supplemented by an additional condition selecting the value \(c = 0\). Therefore, a solution \(\left( \varvec{\Omega },\varvec{\xi }\right) \) to Suslov’s problem requires that \(\mathbf {q}\left( \varvec{\Omega },\varvec{\xi }\right) =\mathbf {0}\) and \(\left\langle \varvec{\Omega }(a), \varvec{\xi }(a) \right\rangle =0\), where \(t=a\) is the initial time.
On the invariance of solutions with respect to scaling of \(\varvec{\xi }\) In the classical formulation of Suslov’s problem, it is usually assumed that \(|\varvec{\xi }|=1\). When \(\varvec{\xi }(t)\) is allowed to change, the normalization of \(\varvec{\xi }\) becomes an issue that needs to be clarified. Indeed, suppose that \(\varvec{\Omega }(t)\) is a solution to (4.5) for a given \(\varvec{\xi }(t)\), so that \(\mathbf {q}\left( \varvec{\Omega },\varvec{\xi }\right) =\mathbf {0}\) and further assume that \(\left\langle \varvec{\Omega },\varvec{\xi }\right\rangle = 0\). Next, consider a smooth, scalar-valued function \(\pi (t)\) with \(\pi (t) \ne 0\) on the interval \(t \in [a,b]\), and consider the pair \(\left( \varvec{\Omega },\pi \varvec{\xi }\right) \). Then,
Hence, a solution \( \varvec{\Omega }(t)\) to (4.5) with \(\left\langle \varvec{\Omega },\varvec{\xi }\right\rangle =c = 0\) does not depend on the magnitude of \(\varvec{\xi }(t)\). As it turns out, this creates a degeneracy in the optimal control problem that has to be treated with care.
Energy conservation Multiplying both sides of (4.3) by \(\varvec{\Omega }\) gives the time derivative of the kinetic energy:
where we have denoted \(\left\langle \varvec{\Omega },\varvec{\xi }\right\rangle =c=\) const. Thus, if \(c=0\) (as is the case for Suslov’s problem), kinetic energy is conserved:
for some positive constant \(e_S\), so that \(\varvec{\Omega }\) lies on the surface of an ellipsoid, which we denote by E. This constant kinetic energy ellipsoid, determined by the rigid body’s inertia matrix \({\mathbb {I}}\) and the initial body angular velocity \(\varvec{\Omega }(a)=\varvec{\Omega }_a\), is denoted by
Integrating (4.9) with respect to time from a to b gives the change in kinetic energy:
Thus, \(\varvec{\Omega }(a)\) and \(\varvec{\Omega }(b)\) lie on the surface of the same ellipsoid if and only if \(c=0\) or \(\int _a^b \lambda \, \mathrm {d} t=0\). If \(c=0\), as is the case for Suslov’s problem, the conservation of kinetic energy holds for all choices of \(\varvec{\xi }\), constant or time-dependent. We note that if the vector \(\varvec{\xi }\) is constant in time and is an eigenvector of the inertia matrix \({\mathbb {I}}\), then there is an additional integral \(\frac{1}{2} \left\langle {\mathbb {I}} \varvec{\Omega }, {\mathbb {I}} \varvec{\Omega }\right\rangle \). However, for \(\varvec{\xi }(t)\) varying in time, which is the case studied here, this quantity is no longer conserved.
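These conservation statements are easy to check numerically. The sketch below integrates Suslov’s equations with a rotating \(\varvec{\xi }(t)\) and verifies that kinetic energy and \(\left\langle \varvec{\Omega },\varvec{\xi }\right\rangle =0\) are preserved when \(c=0\). The inertia values are illustrative, and the explicit multiplier formula used for \(\lambda \) is a reconstruction consistent with the constraint-preservation argument above, not a quotation of (4.4):

```python
import numpy as np

# Suslov's equations with time-varying xi(t):
#   I dOmega/dt = (I Omega) x Omega + lambda * xi,
# where lambda enforces d/dt <Omega, xi> = 0.  A multiplier consistent with
# this preservation (cf. the argument around (4.4), reconstructed here) is
#   lambda = -(<Omega, dxi/dt> + <xi, I^{-1}((I Omega) x Omega)>) / <xi, I^{-1} xi>.
d = np.array([1.0, 2.0, 3.0])          # principal moments of inertia (assumed values)

def xi(t):      return np.array([np.cos(t), np.sin(t), 0.0])
def xi_dot(t):  return np.array([-np.sin(t), np.cos(t), 0.0])

def omega_dot(t, w):
    torque = np.cross(d * w, w)                       # (I Omega) x Omega
    lam = -(w @ xi_dot(t) + xi(t) @ (torque / d)) / (xi(t) @ (xi(t) / d))
    return (torque + lam * xi(t)) / d                 # I^{-1}[(I Omega) x Omega + lambda xi]

def rk4(w, t0, t1, n=2000):
    # classical 4th-order Runge-Kutta integration of omega_dot
    h, t = (t1 - t0) / n, t0
    for _ in range(n):
        k1 = omega_dot(t, w);          k2 = omega_dot(t + h/2, w + h/2*k1)
        k3 = omega_dot(t + h/2, w + h/2*k2); k4 = omega_dot(t + h, w + h*k3)
        w, t = w + h/6*(k1 + 2*k2 + 2*k3 + k4), t + h
    return w

w0 = np.array([0.0, 1.0, 0.5])         # satisfies <Omega(0), xi(0)> = 0, so c = 0
wT = rk4(w0, 0.0, 2.0)
energy = lambda w: 0.5 * w @ (d * w)   # kinetic energy (1/2)<I Omega, Omega>
```

Up to integration error, `energy(wT)` equals `energy(w0)` and `wT @ xi(2.0)` vanishes, as the text predicts for \(c=0\).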
4.2 Controllability and Accessibility of Suslov’s Pure Equations of Motion
We shall now turn our attention to the problem of controlling Suslov’s problem by changing the vector \(\varvec{\xi }(t)\) in time. Before posing the optimal control problem, let us first consider the general question of controllability and accessibility using the Lie group approach to controllability as derived in Brockett (1972), Nijmeijer and Van der Schaft (2013), and Isidori (2013). Since for the constraint \(\left\langle \varvec{\Omega }, \varvec{\xi }\right\rangle =0\) all trajectories must lie on the energy ellipsoid (4.11), both the initial and terminal point of the trajectory must lie on the ellipsoid corresponding to the same energy. We shall therefore assume that the initial and terminal points, as well as the trajectory itself, lie on the ellipsoid (4.11). Before we proceed, let us remind the reader of the relevant definitions and theorems concerning controllability and accessibility, following Bloch (2003).
Definition 4.1
An affine nonlinear control system is a differential equation having the form
where M is a smooth n-dimensional manifold, \(x \in M\), \(u=\left( u_1,\ldots ,u_k \right) \) is a time-dependent, vector-valued map from \({\mathbb {R}}\) to a constraint set \(\Phi \subset {\mathbb {R}}^k\), and f and \(g_i\), \(i=1,\ldots ,k\), are smooth vector fields on M. The manifold M is said to be the state-space of the system, u is said to be the control, f is said to be the drift vector field, and \(g_i\), \(i=1,\ldots ,k\), are said to be the control vector fields. u is assumed to be piecewise smooth or piecewise analytic, and such a u is said to be admissible. If \(f \equiv 0\), the system (4.13) is said to be driftless; otherwise, the system (4.13) is said to have drift.
Definition 4.2
Let a be a fixed initial time. The system (4.13) is said to be controllable if for any pair of states \(x_a, x_b \in M\) there exists a terminal time \(b \ge a\) and an admissible control u defined on the time interval [a, b] such that there is a trajectory of (4.13) with \(x(a) = x_a\) and \(x(b) = x_b\).
Definition 4.3
Given \(x_a \in M\) and a time \(t \ge a\), \(R(x_a,t)\) is defined to be the set of all \(y \in M\) for which there exists an admissible control u defined on the time interval [a, t] such that there is a trajectory of (4.13) with \(x(a) = x_a\) and \(x(t) = y\). The reachable set from \(x_a\) at time \(b \ge a\) is defined to be
Definition 4.4
The accessibility algebra \({\mathcal {C}}\) of the system (4.13) is the smallest Lie algebra of vector fields on M that contains the vector fields f and \(g_i\), \(i=1,\ldots ,k\); that is, \({\mathcal {C}}=\text{ Lie } \, \{\mathbf {f},\mathbf {g}_1,\ldots ,\mathbf {g}_k \}\) is the span of all possible Lie brackets of f and \(g_i\), \(i=1,\ldots ,k\).
Definition 4.5
The accessibility distribution C of the system (4.13) is the distribution generated by the vector fields in \({\mathcal {C}}\); that is, given \(x_a \in M\), \(C(x_a) =\text{ Lie }_{x_a} \, \{\mathbf {f},\mathbf {g}_1,\ldots ,\mathbf {g}_k \}\) is the span at \(x_a\) of the vector fields in \({\mathcal {C}}\).
Definition 4.6
The system (4.13) is said to be accessible from \(x_a \in M\) if for every \(b > a\), \(R_b(x_a)\) contains a nonempty open set.
Theorem 4.7
If \(\mathrm {dim} \; C(x_a)=n\) for some \(x_a \in M\), then the system (4.13) is accessible from \(x_a\).
Theorem 4.8
Suppose the system (4.13) is analytic. If \(\mathrm {dim} \; C(x_a)=n \; \forall x_a \in M\) and \(f=0\), then the system (4.13) is controllable.
To apply the theory of controllability and accessibility to Suslov’s problem, we first need to rewrite the equations of motion for Suslov’s problem in the “affine nonlinear control” form
where \({\mathbf {x} }\) is the state variable and \(u_i\) are the controls. We denote the state of the system by \(\mathbf x \equiv \left[ \begin{array}{c} \varvec{\Omega }\\ \varvec{\xi }\\ \end{array} \right] \) and the control by \(\mathbf {u} \equiv {\dot{\varvec{\xi }}}\). Thus, the individual components of the state and control are \(x_1 = \Omega _1\), \(x_2 = \Omega _2\), \(x_3 = \Omega _3\), \(x_4 = \xi _1\), \(x_5 = \xi _2\), \(x_6 = \xi _3\), \(u_1 = {{\dot{\xi }}}_1\), \(u_2 = {{\dot{\xi }}}_2\), and \(u_3 = {{\dot{\xi }}}_3\). The equations of motion (4.5) can be expressed as
To correlate (4.16) with (4.15), the functions \(\mathbf {f}\) and \(\mathbf {g}\) in (4.15) are defined as
and
Here, \(\mathbf {f}(\mathbf {x}) \) is the drift vector field and \(\mathbf {g}_i(\mathbf {x})\), \(1 \le i \le 3\), are the control vector fields; \(\mathbf {0}_{3 \times 1} = \left( 0,0,0\right) ^T\) denotes the \(3 \times 1\) column vector of zeros and \(\mathbf {e}_i\), \(i=1,2,3\), denote the standard orthonormal basis vectors for \({\mathbb {R}}^3\). An alternative way to express each control vector field \(\mathbf {g}_i\), \(1 \le i \le 3\), is through the differential geometric notation
As noted in the previous section, the first three components, \(\varvec{\Omega }\), of the state \(\mathbf {x}\) solving (4.15) must lie on the ellipsoid E given in (4.11), under the assumption that
for some time a. As shown in the previous section, (4.20) implies that a solution of (4.15) satisfies \(\left\langle \varvec{\Omega }(t),\varvec{\xi }(t)\right\rangle =0\) for all t. Also, it is assumed that \(\varvec{\xi }\ne 0\) (i.e., \(\varvec{\xi }(t) \ne 0 \; \forall t\)). Hence, the state-space manifold is \(M = \left\{ \mathbf {x} \in {\mathbb {R}}^6 | \, \frac{1}{2} \left\langle {\mathbb {I}} \varvec{\Omega }, \, \varvec{\Omega }\right\rangle =e_S, \left\langle \varvec{\Omega },\varvec{\xi }\right\rangle =0,\right. \left. \varvec{\xi }\ne 0 \right\} \). Let \(K = {\mathbb {R}}^6 \backslash \ \mathbf {0}\), a 6-dimensional submanifold of \({\mathbb {R}}^6\). Note that \(M = \Phi ^{-1}(\mathbf {0}_{2 \times 1})\), where \(\Phi : K \rightarrow {\mathbb {R}}^2\) is defined by
The derivative of \(\Phi \) at \(\mathbf {x} \in K\), \(\left( \Phi _*\right) _\mathbf {x} : T_\mathbf {x} K \rightarrow T_{\Phi (\mathbf {x})} {\mathbb {R}}^2\), is
Since \(\left( \Phi _*\right) _\mathbf {x}\) has rank 2 for each \(\mathbf {x} \in K\), \(\Phi \) is by definition a submersion and \(M = \Phi ^{-1}(\mathbf {0}_{2 \times 1})\) is a closed embedded submanifold of K of dimension 4 by Corollary 8.9 of Lee (2003). Being an embedded submanifold of K, M is also an immersed submanifold of K (Lee 2003).
The tangent space to M at \(\mathbf {x} \in M\) is
Using (4.17), (4.18), and (4.22), it is easy to check that \( \left( \Phi _*\right) _\mathbf {x} \left( \mathbf {f}(\mathbf x) \right) = \mathbf {0}_{2 \times 1}\) and \( \left( \Phi _*\right) _\mathbf {x} \left( \mathbf {g}_i(\mathbf x) \right) = \mathbf {0}_{2 \times 1}\) for \(1 \le i \le 3\). Hence, \(\mathbf {f}(\mathbf x) \in T_{\mathbf {x}} M\) and \(\mathbf {g}_i(\mathbf x) \in T_{\mathbf {x}} M\) for \(1 \le i \le 3\) by Lemma 8.15 of Lee (2003). So \(\mathbf {f}, \mathbf {g}_1, \mathbf {g}_2,\) and \(\mathbf {g}_3\) are smooth vector fields on K which are also tangent to M. Since M is an immersed submanifold of K, \(\left[ \mathbf {X},\mathbf {Y}\right] \) is tangent to M if \(\mathbf {X}\) and \(\mathbf {Y}\) are smooth vector fields on K that are tangent to M, by Corollary 8.28 of Lee (2003). Hence, \(\text{ Lie }_{\mathbf {x}} \, \{\mathbf {f},\mathbf {g}_1,\mathbf {g}_2,\mathbf {g}_3 \} \subset T_{\mathbf {x}} M\) and therefore \(\text{ rank } \, \text{ Lie }_{\mathbf {x}} \, \{\mathbf {f},\mathbf {g}_1,\mathbf {g}_2,\mathbf {g}_3 \} \le \dim \, T_{\mathbf {x}} M = 4\).
For \(1 \le i,j \le 3\) and \(i \ne j\), the Lie bracket of the control vector field \(\mathbf {g}_i\) with the control vector field \(\mathbf {g}_j\) is computed as
recalling that \(\left\langle \varvec{\xi }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle = \sum _{i=1}^3 \frac{\xi _i^2}{d_i} \).
Next, to prove controllability and compute the appropriate prolongation, consider the \(6 \times 6\) matrix comprised of the columns of the vector fields \(\mathbf {g}_i(\mathbf {x})\) and their commutators \([\mathbf {g}_i(\mathbf {x}),\mathbf {g}_j(\mathbf {x})]\), projected to the basis of the space \((\partial _{\varvec{\Omega }}, \partial _{\varvec{\xi }})\):
where we have defined
In (4.25), \(I_{3 \times 3}\) denotes the \(3 \times 3\) identity matrix and \(\mathbf {0}_{3 \times 3}\) denotes the \(3 \times 3\) zero matrix. Since \(V \subset \text{ Lie }_{\mathbf {x}} \, \{\mathbf {f},\mathbf {g}_1,\mathbf {g}_2,\mathbf {g}_3 \}\), \( \text{ rank } \, V \le \text{ rank } \, \text{ Lie }_{\mathbf {x}} \, \{\mathbf {f},\mathbf {g}_1,\mathbf {g}_2,\mathbf {g}_3 \} \le \dim \, T_{\mathbf {x}} M = 4\). It will be shown that \( \text{ rank } \, V = 4\), so that \(\text{ rank } \, \text{ Lie }_{\mathbf {x}} \, \{\mathbf {f},\mathbf {g}_1,\mathbf {g}_2,\mathbf {g}_3 \} = \dim \, T_{\mathbf {x}} M = 4\).
Since the bottom 3 rows of the first 3 columns of V are \(I_{3 \times 3}\), the first 3 columns of V are linearly independent. Note that since \({\mathbb {I}}^{-1}\) is a diagonal matrix with positive diagonal entries, \(\text{ rank } \, \frac{{{\mathbb {I}}}^{-1}}{\left\langle \varvec{\xi }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle } A = \text{ rank } \, A\). If \(\text{ rank } \, \frac{{{\mathbb {I}}}^{-1}}{\left\langle \varvec{\xi }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle } A = \text{ rank } \, A >0\), each of the last 3 columns of V, if nonzero, is linearly independent of the first 3 columns of V since the bottom 3 rows of the first 3 columns are \(I_{3 \times 3}\) and the bottom 3 rows of the last 3 columns are \(\mathbf {0}_{3 \times 3}\). Hence, \(\text{ rank } \, V = 3+\text{ rank } \, A\). Since \(\text{ rank } \, V \le 4\), \(\text{ rank } \, A\) is 0 or 1.
The first matrix in the sum composing A in (4.26) has rank 2, since \(\varvec{\Omega }\ne \mathbf {0}\) (i.e., at least one component of \(\varvec{\Omega }\) is nonzero). Its three columns are each orthogonal to \(\varvec{\Omega }\), and since they have rank 2, they span the 2-dimensional plane in \({\mathbb {R}}^3\) orthogonal to \(\varvec{\Omega }\). Since \(\left\langle \varvec{\Omega },\varvec{\xi }\right\rangle =0\), \(\varvec{\xi }\) lies in this plane and hence in the span of those columns. Moreover, \(\varvec{\xi }\) and \(\varvec{\Omega }\times \varvec{\xi }\) form an orthogonal basis for this plane. Since the columns of the first matrix span the plane, at least one column, say the \(j^\mathrm {th}~(1 \le j \le 3)\), has a nonzero component parallel to \(\varvec{\Omega }\times \varvec{\xi }\). The second matrix in the sum composing A in (4.26) consists of 3 column vectors, each of which is a scalar multiple of \(\varvec{\xi }\). Hence, the \(j^\mathrm {th}\) column of A has a nonzero component parallel to \(\varvec{\Omega }\times \varvec{\xi }\). Thus, A has rank 1, V has rank 4, and \(\text{ rank } \, \text{ Lie }_{\mathbf {x}} \, \{\mathbf {f},\mathbf {g}_1,\mathbf {g}_2,\mathbf {g}_3 \} = \dim \, T_{\mathbf {x}} M = 4\). By Theorems 4.7 and 4.8, this implies that (4.15) is controllable or accessible, depending on whether \(\mathbf {f}\) vanishes. Thus, we have proved
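The rank computation admits a numerical spot check at a point of M. The sketch below assumes the control vector fields take the form \(\mathbf {g}_i = \left[ -\left( \Omega _i / \left\langle \varvec{\xi }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle \right) {\mathbb {I}}^{-1} \varvec{\xi }; \, \mathbf {e}_i \right] \), as implied by (4.5) with \(\mathbf {u}={\dot{\varvec{\xi }}}\) (an assumption, since the display (4.18) is not reproduced here), with illustrative inertia values:

```python
import numpy as np

# Spot-check rank Lie_x{g_1, g_2, g_3} = 4 at a point of M with <Omega, xi> = 0.
d = np.array([1.0, 2.0, 3.0])           # illustrative principal moments of inertia

def g(i, x):
    # assumed control vector field g_i(x) = [-(Omega_i/<xi,I^-1 xi>) I^-1 xi ; e_i]
    w, xi = x[:3], x[3:]
    top = -(w[i] / (xi @ (xi / d))) * (xi / d)
    return np.concatenate([top, np.eye(3)[i]])

def jacobian(F, x, h=1e-6):
    # central-difference Jacobian of a vector field F: R^6 -> R^6
    return np.column_stack([(F(x + h*e) - F(x - h*e)) / (2*h) for e in np.eye(6)])

def bracket(Fi, Fj, x):
    # Jacobi-Lie bracket [F_i, F_j] = DF_j . F_i - DF_i . F_j
    return jacobian(Fj, x) @ Fi(x) - jacobian(Fi, x) @ Fj(x)

w  = np.array([0.0, 1.0, 0.5])           # Omega != 0
xi = np.array([1.0, 0.0, 0.0])           # <Omega, xi> = 0 and xi != 0
x  = np.concatenate([w, xi])
gs = [lambda y, i=i: g(i, y) for i in range(3)]
V  = np.column_stack([g(0, x), g(1, x), g(2, x),
                      bracket(gs[0], gs[1], x),
                      bracket(gs[0], gs[2], x),
                      bracket(gs[1], gs[2], x)])
rank = np.linalg.matrix_rank(V, tol=1e-4)
```

At this point the matrix V has rank 4, matching \(\dim T_{\mathbf {x}} M\) in the argument above.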
Theorem 4.9
(On the controllability and accessibility of Suslov’s problem) Suppose we have Suslov’s problem \(\varvec{q}\left( \varvec{\Omega },\varvec{\xi }\right) =\mathbf {0}\) with the control variable \({\dot{\varvec{\xi }}}(t)\). Then,
-
1.
If \({\mathbb {I}} = c I_{3 \times 3}\) for a positive constant c, then \(\varvec{\Omega }\) lies on a sphere of radius \(\sqrt{2 e_S/c}\), \(\mathbf {f}=\mathbf {0}\) for all points in M, and (4.15) is driftless and controllable.
-
2.
If \({\mathbb {I}} \ne c I_{3 \times 3}\) for all positive constants c (i.e., at least two of the diagonal entries of \({\mathbb {I}}\) are unequal), then \(\varvec{\Omega }\) lies on a nonspherical ellipsoid, \(\mathbf {f} \ne \mathbf {0}\) at most points in M, and (4.15) has drift and is accessible.
4.3 Suslov’s Optimal Control Problem
Let us now turn our attention to the optimal control of Suslov’s problem by varying the direction \(\varvec{\xi }(t)\). The general theory was outlined in Sect. 3.2, so we will go through the computations briefly, while at the same time trying to make this section as self-contained as possible. Suppose it is desired to maneuver Suslov’s rigid body from a prescribed initial body angular velocity \(\varvec{\Omega }_a \in E\) at a prescribed initial time \(t=a\) to another prescribed terminal body angular velocity \(\varvec{\Omega }_b \in E\) at a fixed or free terminal time \(t=b,\) where \(b \ge a\), subject to minimizing some time-dependent cost function C over the duration of the maneuver (such as minimizing the energy of the control vector \(\varvec{\xi }\) or minimizing the duration \(b-a\) of the maneuver). Note that since a solution to Suslov’s problem conserves kinetic energy, it is always assumed that \(\varvec{\Omega }_a,\varvec{\Omega }_b \in E\). Thus, a time-varying control vector \(\varvec{\xi }\) and terminal time b are sought that generate a time-varying body angular velocity \(\varvec{\Omega }\), such that \(\varvec{\Omega }(a)=\varvec{\Omega }_a \in E\), \(\varvec{\Omega }(b)=\varvec{\Omega }_b \in E\), \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a) \right\rangle =0\), the pure equations of motion \(\mathbf {q}=\mathbf {0}\) are satisfied for \(a \le t \le b\), and \(\int _a^b C \mathrm {d}t\) is minimized.
The natural way to formulate this optimal control problem is:
The collection of constraints in (4.27) is actually overdetermined. To see this, recall that a solution to \(\mathbf {q}=0\), \(\varvec{\Omega }(a)=\varvec{\Omega }_a\), and \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a)\right\rangle =0\) sits on the constant kinetic energy ellipsoid E. If \(\left( \varvec{\Omega },\varvec{\xi }\right) \) satisfies \(\mathbf {q}=\mathbf {0}\), \(\varvec{\Omega }(a)=\varvec{\Omega }_a \in E\), and \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a)\right\rangle =0\), then \(\varvec{\Omega }(b) \in E\), a 2-d manifold. Thus, only two rather than three parameters of \(\varvec{\Omega }(b)\) need to be prescribed. So the constraint \(\varvec{\Omega }(b)=\varvec{\Omega }_b\) in (4.27) is overprescribed and can lead to singular Jacobians when trying to solve (4.27) numerically, especially via the indirect method. A numerically more stable formulation of the optimal control problem is:
and where \(\varvec{\phi }: E \rightarrow {\mathbb {R}}^2 \) is some parameterization of the 2-d manifold E. For example, \(\varvec{\phi }\) might map a point on E expressed in Cartesian coordinates to its azimuth and elevation in spherical coordinates. Using the properties of the dynamics, the problem (4.27) can be simplified further to read
which omits the constraint \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a)\right\rangle =0\). One can see that (4.27) and (4.29) are equivalent as follows. Suppose \(\left( \varvec{\Omega },\varvec{\xi }\right) \) satisfies \(\mathbf {q}=\mathbf {0}\), \(\varvec{\Omega }(a)=\varvec{\Omega }_a \in E\), and \(\varvec{\Omega }(b)=\varvec{\Omega }_b \in E\). Since \(\varvec{\Omega }_a,\varvec{\Omega }_b \in E\) have the same kinetic energy, i.e., \(T(a)=\frac{1}{2}\left\langle \varvec{\Omega }_a,{\mathbb {I}} \varvec{\Omega }_a \right\rangle =\frac{1}{2}\left\langle \varvec{\Omega }_b,{\mathbb {I}} \varvec{\Omega }_b \right\rangle =T(b)\), equation (4.12) shows that \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a)\right\rangle =0\) or \(\int _a^b \lambda \, \mathrm {d} t=0\). The latter possibility, \(\int _a^b \lambda \, \mathrm {d} t=0\), represents an additional constraint and thus is unlikely to occur. Thus, a solution of (4.29) should be expected to satisfy the omitted constraint \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a)\right\rangle =0\).
In what follows, we assume the following form of the cost function C in (4.29):
where \(\alpha \), \(\beta \), \(\gamma \), \(\eta \), and \(\delta \) are nonnegative constant scalars. The first term in (4.30), \(\frac{\alpha }{4} \left[ \left| \varvec{\xi }\right| ^2 -1 \right] ^2\), encourages the control vector \(\varvec{\xi }\) to have near unit magnitude. The second term in (4.30), \(\frac{\beta }{2} \left| {\dot{\varvec{\xi }}} \right| ^2\), encourages the control vector \(\varvec{\xi }\) to follow a minimum energy trajectory. The first term in (4.30) is needed because the magnitude of \(\varvec{\xi }\) does not affect a solution of \(\mathbf {q}=0\), and in the absence of the first term in (4.30), the second term in (4.30) will try to shrink \(\varvec{\xi }\) to \(\mathbf {0}\), causing numerical instability. An alternative to including the first term in (4.30) is to revise the formulation of the optimal control problem to include the path constraint \(\left| \varvec{\xi }\right| =1\). The third term in (4.30), \(\frac{\gamma }{2} \left| \varvec{\Omega }- \varvec{\Omega }_d \right| ^2\), encourages the body angular velocity \(\varvec{\Omega }\) to follow a prescribed, time-varying trajectory \(\varvec{\Omega }_d\). The fourth term in (4.30), \(\frac{\eta }{2} \left| {\dot{\varvec{\Omega }}} \right| ^2\), encourages the body angular velocity vector \(\varvec{\Omega }\) to follow a minimum energy trajectory. The final term in (4.30), \(\delta \), encourages a minimum time solution.
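The running cost (4.30) transcribes directly into code; the weight values below are placeholders, and \(\varvec{\Omega }_d\) is supplied by the user:

```python
import numpy as np

# Running cost C of (4.30): alpha/4 (|xi|^2-1)^2 + beta/2 |xi_dot|^2
#   + gamma/2 |Omega - Omega_d|^2 + eta/2 |Omega_dot|^2 + delta.
def cost(xi, xi_dot, omega, omega_dot, omega_d,
         alpha=1.0, beta=1.0, gamma=1.0, eta=1.0, delta=0.0):
    return (alpha / 4 * (xi @ xi - 1.0) ** 2              # keep |xi| near 1
            + beta / 2 * xi_dot @ xi_dot                  # minimum-energy control
            + gamma / 2 * np.sum((omega - omega_d) ** 2)  # track Omega_d
            + eta / 2 * omega_dot @ omega_dot             # minimum-energy Omega path
            + delta)                                      # minimum-time term

# With a unit xi, zero rates, and Omega = Omega_d, only the delta term survives.
c_val = cost(np.array([1.0, 0.0, 0.0]), np.zeros(3),
             np.ones(3), np.zeros(3), np.ones(3), delta=0.25)
```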
As in Sect. 4.2, using state-space terminology, the state is \(\mathbf x \equiv \left[ \begin{array}{c} \varvec{\Omega }\\ \varvec{\xi }\\ \end{array} \right] \) and the control is \(\mathbf {u} \equiv {\dot{\varvec{\xi }}}\) for the optimal control problem (4.29). It is always assumed that the control \(\mathbf {u} = {\dot{\varvec{\xi }}}\) is differentiable, and therefore continuous, or equivalently that \(\varvec{\xi }\) is twice differentiable.
4.4 Derivation of Suslov’s Optimally Controlled Equations of Motion
Following the method of Bryson (1975) and Hull (2013), to construct a control vector \(\varvec{\xi }\) and terminal time b solving (4.29), the pure equations of motion are added to the cost function through a time-varying Lagrange multiplier vector and the initial and terminal constraints are added using constant Lagrange multiplier vectors. A control vector \(\varvec{\xi }\) and terminal time b are sought that minimize the performance index
where \(\varvec{\rho }\) and \(\varvec{\nu }\) are constant Lagrange multiplier vectors enforcing the initial and terminal constraints \(\varvec{\Omega }(a)=\varvec{\Omega }_a\) and \(\varvec{\Omega }(b)=\varvec{\Omega }_b\) and \(\varvec{\kappa }\) is a time-varying Lagrange multiplier vector enforcing the pure equations of motion defined by \(\mathbf {q}=\mathbf {0}\) as given in (4.5). In the literature, the time-varying Lagrange multiplier vector used to adjoin the equations of motion to the cost function is often called the adjoint variable or the costate. Henceforth, the time-varying Lagrange multiplier vector is referred to as the costate.
The control vector \(\varvec{\xi }\) and terminal time b minimizing S are found by finding conditions under which the differential of S, \(\mathrm {d}S\), equals 0. The differential of S is defined as the first-order change in S with respect to changes in \(\varvec{\kappa }\), \(\varvec{\Omega }\), \(\varvec{\xi }\), \(\varvec{\rho }\), \(\varvec{\nu }\), and b. Assuming that the cost function is of the form \(C\left( \varvec{\Omega },{\dot{\varvec{\Omega }}},\varvec{\xi },{\dot{\varvec{\xi }}},t\right) \) and equating the differential of S to zero gives, after either some rather tedious direct calculations or an application of the results of Sect. 3.2, Suslov’s optimally controlled equations of motion:
for \(a \le t \le b\), the left boundary conditions
and the right boundary conditions
Using the first equation in (4.32), which is equivalent to (4.5), and the second equation in (4.34), the third equation in (4.34), corresponding to free terminal time, can be simplified, so that the right boundary conditions simplify to
Equations (4.32), (4.33), and (4.35) form a TPBVP. Observe that the unknowns in this TPBVP are \(\varvec{\kappa }\), \(\varvec{\Omega }\), \(\varvec{\xi }\), and b, while the constant Lagrange multiplier vectors \(\varvec{\rho }\) and \(\varvec{\nu }\) are irrelevant.
This application of Pontryagin’s minimum principle differs slightly from the classical treatment of optimal control theory reviewed in Sect. 2.3. Let us connect our derivation to that section. In the classical application, the Hamiltonian involves 6 costates \(\varvec{\pi }\in {\mathbb {R}}^6\) and is given by
whereas in our derivation above, the Hamiltonian involves only 3 costates \(\varvec{\kappa }\in {\mathbb {R}}^3\) and is given by
with \(L\left( \varvec{\Omega },\varvec{\xi },\varvec{u},t\right) =C\left( \varvec{\Omega },{\dot{\varvec{\Omega }}},\varvec{\xi },{\dot{\varvec{\xi }}},t\right) \), since \({\dot{\varvec{\Omega }}}\) is a function of \(\varvec{\Omega }\), \(\varvec{\xi }\), and \({\dot{\varvec{\xi }}}\) and since \(\varvec{u}= {\dot{\varvec{\xi }}}\).
It can be shown that the classical costates \(\varvec{\pi }\) can be obtained from the reduced costates \(\varvec{\kappa }\), derived here, via
Now consider the particular cost function (4.30) corresponding to the optimal control problem (4.29). For this cost function, the partial derivative of the Hamiltonian (4.36) with respect to the control \(\varvec{u}={\dot{\varvec{\xi }}}\) is
where we have defined for brevity
The second partial derivative of the Hamiltonian (4.36) with respect to the control \(\varvec{u}={\dot{\varvec{\xi }}}\) is
where \({\tilde{c}} =\frac{\eta }{\left\langle \varvec{\xi }, {\mathbb {I}}^{-1} \varvec{\xi }\right\rangle ^2} \varvec{\xi }^\mathsf {T} {\mathbb {I}}^{-2} \varvec{\xi }\) is a nonnegative scalar. Recall that it is assumed that \(\beta \ge 0\). If \(\beta =0\), then \(H_{\varvec{u}\varvec{u}}={\tilde{c}} \varvec{\Omega }\varvec{\Omega }^\mathsf {T}\) is singular since \(\varvec{\Omega }\varvec{\Omega }^\mathsf {T}\) is a rank 1 matrix. Hence, if \(H_{\varvec{u}\varvec{u}}\) is nonsingular, then \(\beta > 0\). Now suppose that \(\beta >0\). Part of the Sherman–Morrison formula (Dahlquist and Björck 1974) says that given an invertible matrix \(A \in {\mathbb {R}}^{n \times n}\) and \(\varvec{w},\varvec{v}\in {\mathbb {R}}^{n \times 1}\), \(A+\varvec{w}\varvec{v}^\mathsf {T}\) is invertible if and only if \(1+\varvec{v}^\mathsf {T}A^{-1} \varvec{w}\ne 0\). Letting \(A = \beta I\) and \(\varvec{w}=\varvec{v}=\sqrt{{\tilde{c}}}\varvec{\Omega }\), the Sherman–Morrison formula guarantees that \(H_{\varvec{u}\varvec{u}}\) is nonsingular if \(1+\frac{{\tilde{c}}}{\beta } \varvec{\Omega }^\mathsf {T} \varvec{\Omega }\ne 0\). But \(\frac{{\tilde{c}}}{\beta } \varvec{\Omega }^\mathsf {T} \varvec{\Omega }\ge 0\), so \(1+\frac{{\tilde{c}}}{\beta } \varvec{\Omega }^\mathsf {T} \varvec{\Omega }\ge 1\) and \(H_{\varvec{u}\varvec{u}}\) is nonsingular. Therefore, \(H_{\varvec{u}\varvec{u}}\) is nonsingular if and only if \(\beta >0\). Thus, the optimal control problem (4.29) is nonsingular if and only if \(\beta >0\). Since singular optimal control problems require careful analysis and solution methods, it is assumed for the remainder of this paper, except in Sect. 5.1, that \(\beta >0\). As explained in the paragraph after (4.30), \(\beta >0\) requires that \(\alpha >0\). So for the remainder of this paper, except in Sect. 5.1, it is assumed that \(\beta >0\) and \(\alpha >0\) when considering the optimal control problem (4.29).
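The nonsingularity argument for \(H_{\varvec{u}\varvec{u}} = \beta I + {\tilde{c}} \varvec{\Omega }\varvec{\Omega }^\mathsf {T}\) is easy to check numerically; the values of \({\tilde{c}}\), \(\beta \), and \(\varvec{\Omega }\) below are arbitrary:

```python
import numpy as np

# H_uu = beta*I + c_tilde * Omega Omega^T is singular iff beta = 0 (c_tilde >= 0).
rng = np.random.default_rng(0)
omega = rng.standard_normal(3)
c_tilde, beta = 0.7, 2.0                                   # arbitrary choices

H0 = 0.0 * np.eye(3) + c_tilde * np.outer(omega, omega)    # beta = 0: rank-1, singular
H1 = beta * np.eye(3) + c_tilde * np.outer(omega, omega)   # beta > 0: nonsingular

# Sherman-Morrison inverse with A = beta*I, w = v = sqrt(c_tilde)*Omega:
#   H1^{-1} = I/beta - (c_tilde/beta^2) Omega Omega^T / (1 + (c_tilde/beta)|Omega|^2)
inv_sm = (np.eye(3) / beta
          - (c_tilde / beta**2) * np.outer(omega, omega)
            / (1.0 + (c_tilde / beta) * (omega @ omega)))
```

Here `H0` has rank 1, while `inv_sm @ H1` recovers the identity, confirming the closed-form inverse guaranteed by the Sherman–Morrison formula.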
For the particular cost function (4.30), with \(\alpha >0\), \(\beta >0\), \(\gamma \ge 0\), \(\eta \ge 0\), and \(\delta \ge 0\), the optimally controlled equations of motion (4.32) defined on \(a\le t\le b\) become
the left boundary conditions (4.33) become
and the right boundary conditions (4.35) become
Equation (4.42) is an implicit system of ODEs, since \({\dot{\varvec{\kappa }}}\) depends on \({\ddot{\varvec{\Omega }}}\), which in turn depends on \({\ddot{\varvec{\xi }}}\), while \({\ddot{\varvec{\xi }}}\) depends on \({\dot{\varvec{\kappa }}}\). While one can in principle proceed to solve these equations as an implicit system of ODEs, an explicit expression for the highest derivatives can be found, which reveals possible singularities in the system. The system can be written explicitly as
where we have defined \(\dot{\mathbf{n}}_1\) as
\(\mathbf{g}\) as
and \(\mathbf{h}\) as
The ODEs (4.45) and the left and right boundary conditions (4.43)–(4.44) define a TPBVP for the solution to Suslov’s optimal control problem (4.29) using the cost function (4.30). We also note that while casting the optimal control problem as an explicit system of ODEs such as (4.45) brings it to a standard form amenable to numerical solution, it obscures the geometric background of the optimal control problem derived earlier in Sect. 3.2.
Remark 4.10
(On optimal solutions with switching structure and bang-bang control) It is worth noting that in our paper we allow the control \({\dot{\varvec{\xi }}}\) to be unbounded so that it may take arbitrary values in \({\mathbb {R}}^3\). In addition, note that at the end of the previous section, the control \({\dot{\varvec{\xi }}}\) is assumed to be differentiable and therefore continuous. However, if we were to impose a bound on the control such as \(|{\dot{\varvec{\xi }}}|\le M\) for a fixed \(|\varvec{\xi }|\), say \(|\varvec{\xi }|=1\), and permit \({\dot{\varvec{\xi }}}\) to be piecewise continuous, then the solutions to the optimal control problems tend to lead to bang-bang control obtained by piecing together solutions with \(|{\dot{\varvec{\xi }}}|=M\). The constraint \(|\varvec{\xi }|=1\) is equivalent to the constraint \(\left\langle \varvec{\xi },{\dot{\varvec{\xi }}} \right\rangle =0\) with the initial condition \(\left| \varvec{\xi }(a)\right| =1\). The constraint \(|{\dot{\varvec{\xi }}}|\le M\) is equivalent to the constraint \(M^2-|{\dot{\varvec{\xi }}}|^2-\theta ^2 = 0\), where \(\theta \) is a so-called slack variable. To incorporate these constraints, the Hamiltonian given in (4.36) must be amended to
where \(\varvec{\mu }= \begin{bmatrix} \mu _1 \\ \mu _2 \end{bmatrix} \in {\mathbb {R}}^2\) are new costates enforcing the new constraints and the control now consists of \(\varvec{u}= {\dot{\varvec{\xi }}}\) and \(\theta \). A solution that minimizes the optimal control problem with Hamiltonian (4.49) is determined from the necessary optimality conditions \(H_{\varvec{u}} = \mathbf {0}\) and \(H_\theta = -2 \mu _2 \theta = 0\). The latter condition implies that \(\mu _2=0\) or \(\theta =0\). If \(\mu _2 = 0\), the control \(\varvec{u}= {\dot{\varvec{\xi }}}\) is determined from \(H_{\varvec{u}} = \mathbf {0}\) and \(\theta \) is determined from \(\theta ^2 = M^2 - | {\dot{\varvec{\xi }}} |^2\). If \(\theta = 0\), the control \(\varvec{u}= {\dot{\varvec{\xi }}}\) is determined from \(| {\dot{\varvec{\xi }}} |=M\) and \(\mu _2\) is determined from \(H_{\varvec{u}} = \mathbf {0}\). The difficulty is determining the intervals on which \(\mu _2=0\) or \(\theta =0\); this is the so-called optimal switching structure. In this paper, this difficulty is avoided by assuming that the control \(\varvec{u}= {\dot{\varvec{\xi }}}\) is unbounded and differentiable rather than bounded and piecewise continuous. Instead of bounding the control \(\varvec{u}= {\dot{\varvec{\xi }}}\) through hard constraints, large magnitude controls are penalized by the term \(\frac{\beta }{2} \left| {\dot{\varvec{\xi }}} \right| ^2\) in the cost function (4.30).
5 Numerical Solution of Suslov’s Optimal Control Problem
5.1 Analytical Solution of a Singular Version of Suslov’s Optimal Control Problem
In what follows, we shall focus on the numerical solution of the optimal control problem (4.29) by solving (4.45), (4.43), and (4.44), with \(\alpha >0\), \(\beta >0\), \(\gamma \ge 0\), \(\eta \ge 0\), and \(\delta \ge 0\). As these equations represent a nonlinear TPBVP, having a good initial approximate solution is crucial for the convergence of numerical methods. Because of the complexity of the problem, numerical methods fail to converge unless the case considered is exceedingly simple. Instead, we employ a continuation procedure: we first solve a problem whose parameter values are chosen in such a way that an analytical solution of (4.29) can be found. Starting from this analytical solution, we continue the solution to the desired values of the parameters. As it turns out, this procedure enables the computation of rather complex trajectories, as illustrated by the numerical examples in Sect. 5.3.
To begin, let us consider a simplification of the optimal control problem (4.29). Suppose the terminal time is fixed to \(b=b_p\), \(\beta =0\), \(\eta =0\), and \(\delta =0\). In addition, suppose \(\varvec{\Omega }_d\) is replaced by \(\varvec{\Omega }_p\), where \(\varvec{\Omega }_p\) satisfies the following properties:
Property 5.1
\(\varvec{\Omega }_p\) is a differentiable function such that \(\varvec{\Omega }_p(a)=\varvec{\Omega }_a\) and \(\varvec{\Omega }_p(b_p)=\varvec{\Omega }_b\).
Property 5.2
\(\varvec{\Omega }_p\) lies on the constant kinetic energy manifold E, i.e., \(\left\langle {\mathbb {I}} \varvec{\Omega }_p(t), \varvec{\Omega }_p(t) \right\rangle =\left\langle {\mathbb {I}} \varvec{\Omega }_a , \varvec{\Omega }_a \right\rangle \) for all t, or equivalently \(\left\langle {\mathbb {I}} \varvec{\Omega }_p, {\dot{\varvec{\Omega }}}_p \right\rangle =0\).
Property 5.3
\(\varvec{\Omega }_p\) does not satisfy Euler’s equations at any time, i.e., \({\mathbb {I}} {\dot{\varvec{\Omega }}}_p(t) - \left[ {\mathbb {I}} \varvec{\Omega }_p(t) \right] \times \varvec{\Omega }_p(t) \ne \mathbf {0} \; \forall t \in [a,b_p]\).
Under these assumptions, (4.29) simplifies to
As discussed immediately after (4.41), (5.1) is a singular optimal control problem since \(\beta =0\). If there exists \(\varvec{\xi }_p\) such that \(\left| \varvec{\xi }_p \right| =1\) and \(\mathbf {q}\left( \varvec{\Omega }_p, \varvec{\xi }_p \right) =\mathbf {0}\), then \(\varvec{\xi }_p\) is a solution to the singular optimal control problem (5.1) provided Property 5.1 is satisfied. To wit, for such a \(\varvec{\xi }_p\) and given Property 5.1, take \(\varvec{\Omega }=\varvec{\Omega }_p\) and \(\varvec{\xi }=\varvec{\xi }_p\). Then, \(\mathbf {q}\left( \varvec{\Omega }, \varvec{\xi }\right) =\mathbf {q}\left( \varvec{\Omega }_p, \varvec{\xi }_p \right) =\mathbf {0}\), \(\varvec{\Omega }(a)=\varvec{\Omega }_p(a)=\varvec{\Omega }_a\), \(\varvec{\Omega }(b_p)=\varvec{\Omega }_p(b_p)=\varvec{\Omega }_b\), and \(\displaystyle \int _a^{b_p} \left[ \frac{\alpha }{4} \left[ \left| \varvec{\xi }\right| ^2 -1 \right] ^2+\frac{\gamma }{2} \left| \varvec{\Omega }- \varvec{\Omega }_p \right| ^2 \right] \, \mathrm {d} t =0\).
Now to construct such a \(\varvec{\xi }_p\), assume \(\varvec{\Omega }_p\) satisfies Properties 5.1–5.3. To motivate the construction of \(\varvec{\xi }_p\), also assume that \({\hat{\varvec{\xi }}}\) exists for which \(\mathbf {q}\left( \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right) =\mathbf {0}\), \({\hat{\varvec{\xi }}}(t) \ne \mathbf {0} \; \forall t \in \left[ a,b_p\right] \), and \(\left\langle \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right\rangle =c=0\). Since \(\left\langle \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right\rangle =c=0\), \(\mathbf {q}\left( \varvec{\Omega }_p,\pi {\hat{\varvec{\xi }}} \right) =\mathbf {0}\) for any rescaling \(\pi \) of \({\hat{\varvec{\xi }}}\). Letting \({\tilde{\varvec{\xi }}} \equiv \lambda \left( \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right) {\hat{\varvec{\xi }}} = {\mathbb {I}} {\dot{\varvec{\Omega }}}_p - \left( {\mathbb {I}} \varvec{\Omega }_p \right) \times \varvec{\Omega }_p\), \(\mathbf {q}\left( \varvec{\Omega }_p,{{\tilde{\varvec{\xi }}}} \right) =\mathbf {q}\left( \varvec{\Omega }_p,\lambda \left( \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right) {\hat{\varvec{\xi }}} \right) =\mathbf {0}\). Next, by Property 5.3 (i.e., \({\mathbb {I}} {\dot{\varvec{\Omega }}}_p(t) - \left[ {\mathbb {I}} \varvec{\Omega }_p(t) \right] \times \varvec{\Omega }_p(t) \ne \mathbf {0} \; \forall t \in \left[ a,b_p\right] \)), normalize \({{\tilde{\varvec{\xi }}}}\) to produce a unit magnitude control vector \(\varvec{\xi }_p\):
Again due to scale invariance of the control vector, \(\mathbf {q}\left( \varvec{\Omega }_p,\varvec{\xi }_p \right) =\mathbf {0}\).
One can note that this derivation of \(\varvec{\xi }_p\) possessing the special properties \(\mathbf {q}\left( \varvec{\Omega }_p,\varvec{\xi }_p \right) =\mathbf {0}\) and \(\left| \varvec{\xi }_p \right| =1\) relied on the existence of some \({\hat{\varvec{\xi }}}\) for which \(\mathbf {q}\left( \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right) =\mathbf {0}\), \({\hat{\varvec{\xi }}}(t) \ne \mathbf {0} \; \forall t \in \left[ a,b_p\right] \), and \(\left\langle \varvec{\Omega }_p,{\hat{\varvec{\xi }}} \right\rangle =c=0\). Given \(\varvec{\xi }_p\) defined by (5.2) and by Property 5.2 (i.e., \(\left\langle {\mathbb {I}} \varvec{\Omega }_p, {\dot{\varvec{\Omega }}}_p \right\rangle =0\)), it is trivial to check that \(\left\langle \varvec{\Omega }_p,\varvec{\xi }_p \right\rangle =0\), so that indeed \(\mathbf {q}\left( \varvec{\Omega }_p,\varvec{\xi }_p \right) =\mathbf {0}\) with
Thus, \(\varvec{\xi }_p\) defined by (5.2) is a solution of the singular optimal control problem (5.1). Moreover, \(\varvec{\xi }_p\) has the desirable property \(\left\langle \varvec{\Omega }_p,\varvec{\xi }_p \right\rangle =0\). The costate \(\varvec{\kappa }= \mathbf {0}\) satisfies the ODE TPBVP (4.45), (4.43)–(4.44) corresponding to the analytic solution pair \((\varvec{\Omega }_p,\varvec{\xi }_p)\).
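As a numerical sanity check of this construction, the following sketch forms \(\varvec{\xi }_p\) by normalizing \({\mathbb {I}} {\dot{\varvec{\Omega }}}_p - \left( {\mathbb {I}} \varvec{\Omega }_p \right) \times \varvec{\Omega }_p\) and verifies \(\left| \varvec{\xi }_p \right| =1\) and \(\left\langle \varvec{\Omega }_p,\varvec{\xi }_p \right\rangle =0\). The inertia values and the trajectory \(\varvec{\Omega }_p\) (a unit-sphere curve mapped through \({\mathbb {I}}^{-1/2}\), so that Property 5.2 holds by construction) are hypothetical stand-ins, not the paper's data from (5.4):

```python
import numpy as np

# hypothetical inertia matrix (the paper's I is given in (5.4))
I = np.diag([3.0, 2.0, 1.0])
I_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(I)))

def omega_p(t, c=5.0):
    # a curve on the constant kinetic energy ellipsoid <I w, w> = c^2,
    # obtained by mapping a unit-sphere curve u(t) through I^{-1/2}
    u = np.array([np.cos(t) * np.cos(0.3 * t),
                  np.sin(t) * np.cos(0.3 * t),
                  np.sin(0.3 * t)])
    return c * I_inv_sqrt @ u

def omega_p_dot(t, h=1e-6):
    # central-difference approximation of the time derivative of omega_p
    return (omega_p(t + h) - omega_p(t - h)) / (2 * h)

def xi_p(t):
    w, wdot = omega_p(t), omega_p_dot(t)
    xi_tilde = I @ wdot - np.cross(I @ w, w)   # numerator of (5.2)
    return xi_tilde / np.linalg.norm(xi_tilde) # nonzero by Property 5.3

for t in np.linspace(0.0, 10.0, 50):
    w = omega_p(t)
    # Property 5.2 (constant kinetic energy) and orthogonality <w, xi_p> = 0
    assert abs(w @ I @ w - 25.0) < 1e-9
    assert abs(w @ xi_p(t)) < 1e-4
```

The orthogonality follows exactly as in the text: \(\left\langle \varvec{\Omega }_p, \left( {\mathbb {I}} \varvec{\Omega }_p \right) \times \varvec{\Omega }_p \right\rangle =0\) identically, and \(\left\langle \varvec{\Omega }_p, {\mathbb {I}} {\dot{\varvec{\Omega }}}_p \right\rangle =0\) by Property 5.2; the small residual in the assertion is the finite-difference error.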
5.2 Numerical Solution of Suslov’s Optimal Control Problem via Continuation
Starting from the analytic solution pair \((\varvec{\Omega }_p,\varvec{\xi }_p)\) solving (5.1), the full optimal control problem can then be solved by continuation in \(\gamma \), \(\beta \), \(\eta \), and \(\delta \) using the following algorithm. We refer the reader to Allgower and Georg (2003) as a comprehensive reference on numerical continuation methods, as well as our discussion in Appendix. Consider the continuation cost function \(C_{\alpha ,\beta _c,\gamma _c,\eta _c,\delta _c}\), where \(\beta _c\), \(\gamma _c\), \(\eta _c\), and \(\delta _c\) are variables. If \(\gamma =0\), choose \(\beta _m\) such that \(0<\beta _m \ll \min \{\alpha ,\beta ,1\}\); otherwise if \(\gamma >0\), choose \(\beta _m\) such that \(0<\beta _m \ll \min \{\alpha ,\beta ,\gamma \}\). If the terminal time b is fixed, choose \(b_p=b\); otherwise, if the terminal time is free, choose \(b_p\) as explained below.
If \(\gamma =0\), choose \(\varvec{\Omega }_p\) to be some nominal function satisfying Properties 5.1–5.3, such as the projection of the line segment connecting \(\varvec{\Omega }_a\) to \(\varvec{\Omega }_b\) onto E and let \(b_p\) be the time such that \(\varvec{\Omega }_p(b_p)=\varvec{\Omega }_b\). For fixed terminal time \(b_p\), solve (4.29) with cost function \(C_{\alpha ,\beta _m,\gamma _c,0,0}\) by continuation in \(\gamma _c\), starting from \(\gamma _c = 1\) with the initial solution guess \((\varvec{\Omega }_p,\varvec{\xi }_p)\) and ending at \(\gamma _c = \gamma =0\) with the final solution pair \((\varvec{\Omega }_1,\varvec{\xi }_1)\).
If \(\gamma >0\) and \(\varvec{\Omega }_d\) does not satisfy Properties 5.1–5.3, choose \(\varvec{\Omega }_p\) to be some function “near” \(\varvec{\Omega }_d\) that does satisfy Properties 5.1–5.3 and let \(b_p\) be the time such that \(\varvec{\Omega }_p(b_p)=\varvec{\Omega }_b\). For fixed terminal time \(b_p\), solve (4.29) with cost function \(C_{\alpha ,\beta _m,\gamma _c,0,0}\) by continuation in \(\gamma _c\), starting from \(\gamma _c = 0\) with the initial solution guess \((\varvec{\Omega }_p,\varvec{\xi }_p)\) and ending at \(\gamma _c = \gamma \) with the final solution pair \((\varvec{\Omega }_1,\varvec{\xi }_1)\).
If \(\gamma >0\) and \(\varvec{\Omega }_d\) satisfies Properties 5.1–5.3, choose \(\varvec{\Omega }_p=\varvec{\Omega }_d\), let \(b_p\) be the time such that \(\varvec{\Omega }_d(b_p)=\varvec{\Omega }_b\), and construct the solution pair \((\varvec{\Omega }_1,\varvec{\xi }_1)\) with \(\varvec{\Omega }_1=\varvec{\Omega }_p\) and \(\varvec{\xi }_1 = \varvec{\xi }_p\).
For fixed terminal time \(b_p\), solve (4.29) with cost function \(C_{\alpha ,\beta _c,\gamma ,0,0}\) by continuation in \(\beta _c\), starting from \(\beta _c = \beta _m\) with the initial solution guess \((\varvec{\Omega }_1,\varvec{\xi }_1)\) and ending at \(\beta _c = \beta \) with the final solution pair \((\varvec{\Omega }_2,\varvec{\xi }_2)\). Next, for fixed terminal time \(b_p\), solve (4.29) with cost function \(C_{\alpha ,\beta ,\gamma ,\eta _c,0}\) by continuation in \(\eta _c\), starting from \(\eta _c =0\) with the initial solution guess \((\varvec{\Omega }_2,\varvec{\xi }_2)\) and ending at \(\eta _c=\eta \) with the final solution pair \((\varvec{\Omega }_3,\varvec{\xi }_3)\). If the terminal time is fixed, then this is the final solution. If the terminal time is free, solve (4.29) with cost function \(C_{\alpha ,\beta ,\gamma ,\eta ,\delta _c}\), letting the terminal time vary, by continuation in \(\delta _c\), starting from
with the initial solution guess \((\varvec{\Omega }_3,\varvec{\xi }_3,b_p)\) and ending at \(\delta _c=\delta \) with the final solution triple \((\varvec{\Omega }_4,\varvec{\xi }_4,b_4)\). If the terminal time is free, then this is the final solution.
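The essential mechanism of the algorithm above is warm-started parameter continuation: each problem in the chain is solved starting from the solution of the previous, slightly easier one. The following toy sketch illustrates only that mechanism; a scalar Newton iteration stands in for the ODE TPBVP solver, and the parameter p stands in for the cost coefficients \(\gamma _c\), \(\beta _c\), \(\eta _c\), \(\delta _c\). All names here are illustrative, not from the paper:

```python
def newton(f, df, x0, tol=1e-12, maxit=50):
    # basic Newton iteration, standing in for the boundary value solver
    x = x0
    for _ in range(maxit):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration failed to converge")

def continue_in(solve, p_start, p_end, x_start, steps=20):
    # sweep the continuation parameter from p_start to p_end,
    # warm-starting each solve with the previous solution
    x = x_start
    for k in range(1, steps + 1):
        p = p_start + (p_end - p_start) * k / steps
        x = solve(p, x)
    return x

# stand-in problem: root of f(x; p) = x^3 + p*x - 4, which is easy at p = 0
def solve(p, x0):
    return newton(lambda x: x**3 + p * x - 4.0,
                  lambda x: 3.0 * x**2 + p, x0)

x0 = 4.0 ** (1.0 / 3.0)          # analytic solution of the p = 0 problem
x_final = continue_in(solve, 0.0, 10.0, x0)
```

The analytic pair \((\varvec{\Omega }_p,\varvec{\xi }_p)\) plays the role of `x0` here: it anchors the chain of continuations at a problem whose solution is known exactly.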
5.3 Numerical Solution of Suslov’s Optimal Control Problem via the Indirect Method and Continuation
Suslov’s optimal control problem was solved numerically using the following inputs and setup. The rigid body’s inertia matrix is
The initial time is \(a=0\), and the terminal time b is free. The initial and terminal body angular velocities are \(\varvec{\Omega }_a=\varvec{\phi }(a)/2=[5, \, 0, \, 0]^\mathsf {T}\) and \(\varvec{\Omega }_b=\varvec{\phi }_\parallel (b_d) \approx [2.7541, \, -2.3109, \, -1.4983]^\mathsf {T}\), respectively, where \(\varvec{\phi }\) and \(\varvec{\phi }_\parallel \) are defined below in (5.5)–(5.7) and \(b_d=10\).
The desired body angular velocity \(\varvec{\Omega }_d\) (see Fig. 1) is approximately the projection of a spiral onto the constant kinetic energy ellipsoid E determined by the rigid body’s inertia matrix \({\mathbb {I}}\) and initial body angular velocity \(\varvec{\Omega }_a\) and defined in (4.11). Concretely, we aim to track a spiral-like trajectory \(\varvec{\Omega }_d\) on the constant kinetic energy ellipsoid E:
The setup for \(\varvec{\Omega }_d\) is to be understood as follows. The graph of \(\varvec{\phi }\) (5.5) defines a spiral in the plane \(x=10\). Given a nonzero vector \(\mathbf {v} \in {\mathbb {R}}^3\), the parallel projection operator \(\parallel \) (5.6) constructs the vector \(\mathbf {v}_\parallel \) that lies at the intersection between the ray \(R_{\mathbf {v}} = \left\{ t \mathbf {v} : t>0 \right\} \) and the ellipsoid E. The spiral \(\varvec{\phi }_\parallel \) defined by (5.7) is the projection of the spiral \(\varvec{\phi }\) onto the ellipsoid E; it begins at \(\varvec{\Omega }_a\) at time a and terminates at \(\varvec{\Omega }_b\) at time \(b_d\). Also, \(\sigma \) (5.8) is a sigmoid function, i.e., a smooth approximation of the unit step function, and s (5.9) is the time translation of \(\sigma \) to time \(b_d\). \(\varvec{\Omega }_d\) (5.10) uses the translated sigmoid function s to compute a weighted average of the projected spiral \(\varvec{\phi }_\parallel \) and \(\varvec{\Omega }_b\), so that \(\varvec{\Omega }_d\) follows the projected spiral \(\varvec{\phi }_\parallel \) for \(0\le t < b_d\), holds steady at \(\varvec{\Omega }_b\) for \(t>b_d\), and smoothly transitions between them at time \(b_d\). The coefficients of the cost function (4.30) are chosen to be \(\alpha = 1\), \(\beta = 0.1\), \(\gamma = 1\), \(\eta = 1 \, \mathrm {or} \, 0.01\), and \(\delta = 0.2\).
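A minimal sketch of this setup follows. The parallel projection scales \(\mathbf {v}\) by \(t = \sqrt{c^2 / \left\langle {\mathbb {I}} \mathbf {v}, \mathbf {v} \right\rangle }\) so that \(t\mathbf {v}\) lands on E, and a translated sigmoid blends the projected spiral into \(\varvec{\Omega }_b\). The inertia values, the spiral \(\varvec{\phi }\), and the sigmoid steepness are hypothetical stand-ins; the paper's definitions are (5.4)–(5.10):

```python
import math

I_diag = (3.0, 2.0, 1.0)   # hypothetical inertia; the paper's I is in (5.4)
# kinetic energy level c^2 = <I Omega_a, Omega_a> with Omega_a = (5, 0, 0)
c2 = sum(Ii * wi * wi for Ii, wi in zip(I_diag, (5.0, 0.0, 0.0)))

def project(v):
    # parallel projection: scale v along the ray R_v onto <I w, w> = c2
    t = math.sqrt(c2 / sum(Ii * vi * vi for Ii, vi in zip(I_diag, v)))
    return tuple(t * vi for vi in v)

def phi(t):
    # hypothetical stand-in for the planar spiral (5.5) in the plane x = 10
    r = 0.5 * t
    return (10.0, r * math.cos(t), r * math.sin(t))

def sigmoid(t, k=5.0):
    # smooth approximation of the unit step, as in (5.8)
    return 1.0 / (1.0 + math.exp(-k * t))

b_d = 10.0
Wb = project(phi(b_d))       # terminal body angular velocity on E

def omega_d(t):
    # sigmoid-weighted average of the projected spiral and Wb, as in (5.10):
    # follow the projected spiral before b_d, hold at Wb afterward
    w = sigmoid(t - b_d)
    p = project(phi(min(t, b_d)))
    return tuple((1 - w) * pi + w * bi for pi, bi in zip(p, Wb))
```

With these stand-ins, \(\varvec{\phi }(0)=(10,0,0)\) projects to \(\varvec{\Omega }_a=(5,0,0)\), mirroring the relation \(\varvec{\Omega }_a=\varvec{\phi }(a)/2\) in the text.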
The optimal control problem (4.29) was solved numerically via the indirect method, i.e., by numerically solving the ODE TPBVP (4.45), (4.43)–(4.44) through continuation in \(\beta \), \(\eta \), and \(\delta \) starting from the analytic solution to the singular optimal control problem (5.1), as outlined in Sect. 5.2. Because most ODE BVP solvers only solve problems defined on a fixed time interval, the ODE TPBVP (4.45), (4.43)–(4.44) was reformulated on the normalized time interval \(\left[ 0,1\right] \) through a change of variables, defining \(T \equiv b-a\) and the normalized time \(s \equiv \frac{t-a}{T}\); if the terminal time b is fixed, then T is a known constant, whereas if the terminal time b is free, then T is an unknown parameter that must be solved for in the ODE TPBVP. The finite difference automatic continuation solver acdc from the MATLAB package bvptwp was used to solve the ODE TPBVP by performing continuation in \(\beta \), \(\eta \), and \(\delta \), with the relative error tolerance set to 1e-8. The result of acdc was then passed through the MATLAB collocation solver sbvp using Gauss (rather than equidistant) collocation points, with the absolute and relative error tolerances set to 1e-8. sbvp was used to refine the solution provided by acdc because collocation exhibits superconvergence when solving regular (as opposed to singular) ODE TPBVPs using Gauss collocation points. To make acdc and sbvp execute efficiently, the ODEs were implemented in MATLAB in vectorized fashion. For accuracy and efficiency, the MATLAB software ADiGator was used to supply vectorized ODE Jacobians, constructed through automatic differentiation, to acdc and sbvp; for accuracy, the MATLAB Symbolic Math Toolbox was used to supply BC Jacobians, constructed through symbolic differentiation, to acdc and sbvp.
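The change of variables used here preserves the solution: an ODE \(\dot{\mathbf {y}}=\mathbf {F}(t,\mathbf {y})\) on \([a,b]\) becomes \(\mathbf {y}'(s)=T\,\mathbf {F}(a+sT,\mathbf {y})\) on \([0,1]\). The sketch below checks this on a simple stand-in right-hand side (\(\dot{y}=-y\), not the paper's system) with a fixed-step RK4 integrator; in the free-terminal-time case T would additionally be treated as an unknown of the BVP, which this fixed-T sketch does not show:

```python
import math

def rk4(f, y0, t0, t1, n=200):
    # classic fixed-step fourth-order Runge-Kutta integrator
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

a, b = 0.0, 10.0
f = lambda t, y: -y                # stand-in right-hand side

# direct integration on [a, b]
y_direct = rk4(f, 1.0, a, b)

# normalized time s = (t - a)/T with T = b - a, so dy/ds = T * f(a + s*T, y)
T = b - a
g = lambda s, y: T * f(a + s * T, y)
y_scaled = rk4(g, 1.0, 0.0, 1.0)   # same endpoint value, computed on [0, 1]
```

Both integrations return the same terminal value \(y(b)=e^{-(b-a)}\) up to discretization error, confirming that the rescaling only reparametrizes time.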
Figures 2 and 3 show the results for \(\eta =1\) and \(\eta =0.01\), respectively. The optimal terminal time is \(b=11.36\) for \(\eta =1\) and \(b=9.84\) for \(\eta =0.01\). Figures 2a and 3a show the optimal body angular velocity \(\varvec{\Omega }\), the desired body angular velocity \(\varvec{\Omega }_d\), and the projection \(\varvec{\xi }_\parallel \) of the control vector \(\varvec{\xi }\) onto the ellipsoid E. Recall that \(\gamma \), through the cost function term \(\frac{\gamma }{2} \left| \varvec{\Omega }- \varvec{\Omega }_d \right| ^2\), influences how closely the optimal body angular velocity \(\varvec{\Omega }\) tracks the desired body angular velocity \(\varvec{\Omega }_d\), while \(\eta \), through the cost function term \(\frac{\eta }{2} \left| {\dot{\varvec{\Omega }}} \right| ^2\), influences how closely the optimal body angular velocity \(\varvec{\Omega }\) tracks a minimum energy trajectory. For \(\gamma =1\), \(\frac{\gamma }{\eta }=1\) when \(\eta =1\) and \(\frac{\gamma }{\eta }=100\) when \(\eta =0.01\). As expected, comparing Figs. 2a and 3a, the optimal body angular velocity \(\varvec{\Omega }\) tracks the desired body angular velocity \(\varvec{\Omega }_d\) much more accurately for \(\eta =0.01\) than for \(\eta =1\). Figures 2b and 3b demonstrate that the numerical solutions preserve the nonholonomic orthogonality constraint \(\left\langle \varvec{\Omega }, \varvec{\xi }\right\rangle =0\) to machine precision. Figures 2c and 3c show that the magnitude \(\left| \varvec{\xi }\right| \) of the control vector \(\varvec{\xi }\) remains close to 1, as encouraged by the cost function term \(\frac{\alpha }{4} \left[ \left| \varvec{\xi }\right| ^2 -1 \right] ^2\). Figures 2d and 3d show the costates \(\varvec{\kappa }\). In Figs. 2a, 3a, 2d, and 3d, a green marker indicates the beginning of a trajectory, while a red marker indicates the end of a trajectory. In Fig. 3a, the yellow marker on the desired body angular velocity indicates \(\varvec{\Omega }_d(b)\), where \(b=9.84\) is the optimal terminal time for \(\eta =0.01\).
To investigate the stability of the controlled system, we perturbed the control \({\dot{\varvec{\xi }}}\) obtained from solving the optimal control ODE TPBVP and integrated the pure equations of motion (4.5) as an ODE IVP using this perturbed control. The resulting \(\varvec{\Omega }\) remains close to the \(\varvec{\Omega }\) corresponding to the solution of the optimal control ODE TPBVP with the unperturbed control. While a systematic stability analysis remains for future work, this indicates that the controlled system is stable, at least in the state variables \(\varvec{\Omega }\) and \(\varvec{\xi }\), under perturbations of the control \({\dot{\varvec{\xi }}}\).
Verification of a local minimum solution It is also desirable to verify that the numerical solutions obtained by our continuation indirect method do indeed provide a local minimum of the optimal control problem. Chapter 21 of Agrachev and Sachkov (2004) and Bonnard et al. (2007) provide sufficient conditions for a solution satisfying Pontryagin's minimum principle to be a local minimum; however, these conditions are quite technical, must be checked numerically rather than analytically, and may be investigated in future work. COTCOT and HamPath, also mentioned in Appendix, are numerical software packages that check these sufficient conditions numerically.
Due to the technicality of the sufficient conditions discussed in Agrachev and Sachkov (2004), Bonnard et al. (2007), we have resorted to a different numerical justification. More precisely, to validate that the solutions obtained by our optimal control procedure, or the so-called indirect method solutions, indeed correspond to local minima, we have fed the solutions obtained by our method into several different MATLAB direct method solvers as initial solution guesses. We provide a survey of the current state of direct method solvers for optimal control problems in Appendix.
Note that the indirect method only produces a solution satisfying the necessary conditions for a local minimum of (4.29), while the direct method produces a solution satisfying the necessary and sufficient conditions for a local minimum of a finite-dimensional approximation of (4.29). Thus, if the direct method solution is close to the indirect method solution, it may be concluded that the indirect method solution is indeed a local minimum of (4.29). The indirect method solutions were validated against the MATLAB direct method solvers GPOPS-II and FALCON.m. GPOPS-II uses pseudospectral collocation with hp-adaptive mesh refinement, relies on the IPOPT NLP solver, and can use ADiGator to supply vectorized, automatically computed Jacobians and Hessians. FALCON.m uses trapezoidal or backward Euler local collocation, also relies on the IPOPT NLP solver, and can use the MATLAB Symbolic Math Toolbox to supply symbolically computed Jacobians and Hessians. Both direct method solvers converged to solutions close to those provided by the indirect method, which is to be expected since the direct method solvers are solving a finite-dimensional approximation of (4.29). Thus, we are confident that the solutions found in this section indeed correspond to local minima of the optimal control problems.
6 Conclusions
We have derived the equations of motion for the optimal control of Suslov's problem and demonstrated the controllability of Suslov's problem by varying the nonholonomic constraint vector \(\varvec{\xi }\) in time. We have shown that the problem is controllable in the classical control theory sense. We have also demonstrated that an optimal control procedure, using continuation from an analytical solution, not only can reach the desired final state, but can also force the system to follow quite a complex trajectory, such as a spiral on the constant kinetic energy ellipsoid E. We have also investigated the sufficient conditions for a local minimum, and while we did not implement them, all the numerical evidence we have points to the solutions found being local minima.
The procedure outlined here opens up the possibility of controlling nonholonomic systems by continuous time variation of the constraint. We have carried out the analysis only for Suslov's problem, which we consider one of the most fundamental problems in nonholonomic mechanics. It would be interesting to generalize the optimal control theory derived here to the case of an arbitrary manifold. Of particular importance for controllability will be the dimensionality and geometry of the constraint versus that of the manifold. This will be addressed in future work.
Notes
Matlab Documentation Center. Optimization Toolbox, Constrained Optimization, fmincon.
References
Aceto, L., Mazzia, F., Trigiante, D.: The performances of the code TOM on the Holt problem. In: Antreich, K., Bulirsch, R., Gilg, A., Rentrop, P. (eds.) Modeling, Simulation, and Optimization of Integrated Circuits, pp. 349–360. Springer, Berlin (2003)
Agrachev, A., Sachkov, Y.: Control Theory from the Geometric Viewpoint. Springer, Berlin (2004)
Allgower, E.L., Georg, K.: Introduction to Numerical Continuation Methods, vol. 45. SIAM, Philadelphia (2003)
Arnold, V.I., Kozlov, V.V., Neishtadt, A.I.: Mathematical Aspects of Classical and Celestial Mechanics, 2nd edn. Springer, Berlin (1997)
Ascher, U.M., Spiteri, R.J.: Collocation software for boundary value differential-algebraic equations. SIAM J. Sci. Comput. 15(4), 938–952 (1994)
Ascher, U.M., Christiansen, J., Russell, R.D.: Algorithm 569: COLSYS: collocation software for boundary-value ODEs [D2]. ACM Trans. Math. Softw. (TOMS) 7(2), 223–229 (1981)
Ascher, U.M., Mattheij, R.M.M., Russell, R.D.: Numerical Solution of Boundary Value Problems for Ordinary Differential Equations, vol. 13. SIAM, Philadelphia (1994)
Auzinger, W., et al.: A collocation code for singular boundary value problems in ordinary differential equations. Numer. Algorithms 33(1–4), 27–39 (2003)
Bader, G., Ascher, U.M.: A new basis implementation for a mixed order boundary value ODE solver. SIAM J. Sci. Stat. Comput. 8(4), 483–500 (1987)
Bashir-Ali, Z., Cash, J.R., Silva, H.H.M.: Lobatto deferred correction for stiff two-point boundary value problems. Comput. Math. Appl. 36(10), 59–69 (1998)
Becerra, V.M.: Solving complex optimal control problems at no cost with PSOPT. In: 2010 IEEE International Symposium on Computer-Aided Control System Design, pp. 1391–1396. IEEE (2010)
Betts, J.T.: Survey of numerical methods for trajectory optimization. J. Guid. Control Dyn. 21(2), 193–207 (1998)
Betts, J.T.: Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, vol. 19. SIAM, Philadelphia (2010)
Betts, J.T.: Sparse optimization suite (SOS). In: Applied Mathematical Analysis, LLC. (Based on the Algorithms Published in Betts, JT, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming. SIAM Press, Philadelphia, PA. (2010).) (2013)
Biegler, L.T.: Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes, vol. 10. SIAM, Philadelphia (2010)
Birkisson, Á., Driscoll, T.A.: Automatic Linearity Detection (2013a). http://eprints.maths.ox.ac.uk/1672/
Birkisson, Á.: Numerical solution of nonlinear boundary value problems for ordinary differential equations in the continuous framework. Ph.D. thesis, University of Oxford (2013b)
Birkisson, Á., Driscoll, T.A.: Automatic Fréchet differentiation for the numerical solution of boundary-value problems. ACM Trans. Math. Softw. (TOMS) 38(4), 26 (2012)
Bloch, A.M.: Nonholonomic Mechanics and Control, vol. 24. Springer, New York (2003)
Bloch, A.M., et al.: Nonholonomic mechanical systems with symmetry. Arch. Ration. Mech. Anal. 136, 21–99 (1996)
Bloch, A.M., et al.: A geometric approach to the optimal control of nonholonomic mechanical systems. In: Bettiol, P., Cannarsa, P., Colombo, G., Motta, M., Rampazzo, F. (eds.) Analysis and Geometry in Control Theory and its Applications, pp. 35–64. Springer INdAM Series, Basel (2015a)
Bloch, A.M., Krupka, D., Zenkov, D.V.: The Helmholtz conditions and the method of controlled Lagrangians. In: Zenkov, D.V. (ed.) The Inverse Problem of the Calculus of Variations: Local and Global Theory, pp. 1–31. Atlantic Press, Minneapolis (2015b)
Boisvert, J.J., Muir, P.H., Spiteri, R.J.: py_bvp: a universal Python interface for BVP codes. In: Proceedings of the 2010 Spring Simulation Multiconference, p. 95. Society for Computer Simulation International (2010)
Boisvert, J.J., Muir, P.H., Spiteri, R.J.: A Runge–Kutta BVODE solver with global error and defect control. ACM Trans. Math. Softw. (TOMS) 39(2), 11 (2013)
Bonnans, F., et al.: BOCOP: User Guide (2014). http://bocop.org/
Bonnans, F., et al.: BocopHJB 1.0. 1–User Guide, PhD thesis. INRIA (2015)
Bonnard, B., Caillau, J.-B., Trélat, E.: Computation of conjugate times in smooth optimal control: the COTCOT algorithm. In: Proceedings of the 44th IEEE Conference on Decision and Control and European Control Conference 2005 (CDC-ECC’05), Séville, Spain. IEEE (2005)
Bonnard, B., Caillau, J.-B., Trélat, E.: Second order optimality conditions in the smooth case and applications in optimal control. ESAIM Control Optim. Calc. Var. 13(2), 207–236 (2007)
Borisov, A.V., Kilin, A.A., Mamaev, I.S.: Hamiltonicity and integrability of the Suslov problem. Regul. Chaotic Dyn. 16(1–2), 104–116 (2011)
Brauer, G.L., Cornick, D.E., Stevenson, R.: Capabilities and Applications of the Program to Optimize Simulated Trajectories (POST). Program Summary Document (1977)
Brockett, R.W.: System theory on group manifolds and coset spaces. SIAM J. Control 10(2), 265–284 (1972)
Brugnano, L., Trigiante, D.: A new mesh selection strategy for ODEs. Appl. Numer. Math. 24(1), 1–21 (1997)
Bryson, A.E.: Applied Optimal Control: Optimization, Estimation and Control. CRC Press, Boca Raton (1975)
Bryson, A.E.: Dynamic Optimization, vol. 1. Prentice Hall, Englewood Cliffs (1999)
Bullo, F., Lewis, A.D.: Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Texts in Applied Mathematics. Springer, New York (2005)
Byrd, R.H., Nocedal, J., Waltz, R.A.: KNITRO: an integrated package for nonlinear optimization. In: Di Pillo, G., Roma, M. (eds.) Large-Scale Nonlinear Optimization, pp. 35–59. Springer, Berlin (2006)
Caillau, J.-B., Cots, O., Gergaud, J.: Differential continuation for regular optimal control problems. Optim. Methods Softw. 27(2), 177–196 (2012)
Cash, J.R., Wright, M.H.: A deferred correction method for nonlinear two-point boundary value problems: implementation and numerical evaluation. SIAM J. Sci. Stat. Comput. 12(4), 971–989 (1991)
Cash, J.R., Mazzia, F.: A new mesh selection algorithm, based on conditioning, for two-point boundary value codes. J. Comput. Appl. Math. 184(2), 362–381 (2005)
Cash, J.R., Mazzia, F.: Hybrid mesh selection algorithms based on conditioning for two-point boundary value problems. JNAIAM J. Numer. Anal. Ind. Appl. Math. 1(1), 81–90 (2006)
Cash, J.R., Moore, G., Wright, R.W.: An automatic continuation strategy for the solution of singularly perturbed nonlinear boundary value problems. ACM Trans. Math. Softw. (TOMS) 27(2), 245–266 (2001)
Cash, J.R., et al.: The role of conditioning in mesh selection algorithms for first order systems of linear two point boundary value problems. J. Comput. Appl. Math. 185(2), 212–224 (2006)
Cash, J.R., et al.: Algorithm 927: the MATLAB code bvptwp. m for the numerical solution of two point boundary value problems. ACM Trans. Math. Softw. (TOMS) 39(2), 15 (2013)
Chen, Y.Q., Schwartz, A.L.: RIOTS95: a MATLAB toolbox for solving general optimal control problems and its applications to chemical processes (2002). http://www.schwartz-home.com/riots/
Chow, Y.T., et al.: Algorithm for overcoming the curse of dimensionality for certain non-convex Hamilton–Jacobi equations, projections and differential games. Ann. Math. Sci. Appl. (2016, to appear)
Cizniar, M., et al.: Dynopt–dynamic optimisation code for MATLAB. In: Technical Computing Prague 2005 (2005)
Community Portal for Automatic Differentiation (2016). http://www.autodiff.org/. Visited 10/08/2016
Cortés, J., Martínez, E.: Mechanical control systems on Lie algebroids. IMA J. Math. Control Inf. 21, 457–492 (2004)
Dahlquist, G., Björck, A.: Numerical Methods. Series in Automatic Computation. Prentice-Hall, Englewood Cliffs (1974)
Dankowicz, H., Schilder, F.: Recipes for Continuation. SIAM, Philadelphia (2013)
Darbon, J., Osher, S.: Algorithms for Overcoming the Curse of Dimensionality for Certain Hamilton–Jacobi Equations Arising in Control Theory and Elsewhere. arXiv preprint arXiv:1605.01799 (2016)
Doedel, E.J., et al.: AUTO-07P: Continuation and Bifurcation Software for Ordinary Differential Equations (2007)
Driscoll, T.A., Hale, N., Trefethen, L.N.: Chebfun Guide (2014). Pafnuty Publications. http://www.chebfun.org/docs/guide/
Enright, W.H., Muir, P.H.: Runge–Kutta software with defect control for boundary value ODEs. SIAM J. Sci. Comput. 17(2), 479–497 (1996)
Evans, L.C.: Partial Differential Equations. American Mathematical Society, Providence. ISBN: 9780821849743, 0821849743 (2010)
Falugi, P., Kerrigan, E., Van Wyk, E.: Imperial College London Optimal Control Software User Guide (ICLOCS). Department of Electrical and Electronic Engineering, Imperial College London, London (2010)
Gay-Balmaz, F., Putkaradze, V.: On noisy extensions of nonholonomic constraints. J. Nonlinear Sci. 26(6), 1571–1613 (2016). doi:10.1007/s00332-016-9313-x
Gerdts, M.: Optimal Control of ODEs and DAEs. Walter de Gruyter, Berlin (2012)
Gill, P.E., Wong, E.: Users Guide for SNCTRL. (2015). https://ccom.ucsd.edu/research/abstract.php?id=102
Gill, P.E., Murray, W., Saunders, M.A.: SNOPT: an SQP algorithm for large-scale constrained optimization. SIAM Rev. 47(1), 99–131 (2005)
Goh, C.J., Teo, K.L.: MISER: a FORTRAN program for solving optimal control problems. Adv. Eng. Softw. (1978) 10(2), 90–99 (1988)
Hale, N., Moore, D.R.: A sixth-order extension to the MATLAB package bvp4c of J. Kierzenka and L. Shampine (2008)
Holm, D.D.: Geometric mechanics: rotating, translating, and rolling. In: Geometric Mechanics. Imperial College Press, London. ISBN: 9781848167773 (2011)
Holmström, K., Göran, A., Edvall, M.M.: Users Guide for Tomlab 7 (2010). Tomlab Optimization, Vallentuna and Västerås. http://tomopt.com/tomlab/download/manuals.php
Houska, B., Ferreau, H.J., Diehl, M.: ACADO toolkit: an open-source framework for automatic control and dynamic optimization. Optim. Control Appl. Methods 32(3), 298–312 (2011)
Hull, D.G.: Optimal Control Theory for Applications. Springer, Berlin (2013)
Isidori, A.: Nonlinear Control Systems. Springer, Berlin (2013)
Jansch, C., Well, K.H., Schnepper, K.: GESOP-Eine Software Umgebung Zur Simulation Und Optimierung. In: Proceedings des SFB (1994)
Jean, F.: Control of Nonholonomic Systems: From Sub-Riemannian Geometry to Motion Planning. Springer, Cham (2014)
Kelly, M.P.: OptimTraj User’s Guide, Version 1.5. (2016). https://github.com/MatthewPeterKelly/OptimTraj/tree/master/docs/UsersGuide
Kierzenka, J., Shampine, L.F.: A BVP solver that controls residual and error. JNAIAM J. Numer. Anal. Ind. Appl. Math. 3(1–2), 27–41 (2008)
Kitzhofer, G., et al.: The new Matlab code bvpsuite for the solution of singular implicit BVPs. J. Numer. Anal. Ind. Appl. Math. 5, 113–134 (2010)
Koon, W.-S., Marsden, J.E.: Optimal control for holonomic and nonholonomic mechanical system with symmetry and Lagrangian reduction. SIAM J. Control Optim. 35, 901–929 (1997)
Kozlov, V.V.: On the integration theory of equations of nonholonomic mechanics. Regul. Chaotic Dyn. 7(2), 161–176 (2002)
Kunkel, P.: Differential-Algebraic Equations: Analysis and Numerical Solution. European Mathematical Society, Germany (2006)
Lee, J.M.: Smooth Manifolds. Springer, Berlin (2003)
Lewis, A.D.: Simple mechanical control systems with constraints. IEEE Trans. Automat. Control 45, 1420–1436 (2000)
Lewis, A.D., Murray, R.M.: Variational principles for constrained systems: theory and experiment. Int. J. Non Linear Mech. 30(6), 793–815 (1995)
Lewis, A.D., Murray, R.M.: Controllability of simple mechanical control systems. SIAM J. Control Optim. 35, 766–790 (1997)
Li, Z., Canny, J.F. (eds.): Nonholonomic Motion Planning. Springer, New York (1992)
Marsden, J.E., Ratiu, T.S.: Introduction to Mechanics and Symmetry: A Basic Exposition of Classical Mechanical Systems, vol. 17. Springer, Berlin (2013)
Mazzia, F., Sgura, I.: Numerical approximation of nonlinear BVPs by means of BVMs. Appl. Numer. Math. 42(1), 337–352 (2002)
Mazzia, F., Trigiante, D.: A hybrid mesh selection strategy based on conditioning for boundary value ODE problems. Numer. Algorithms 36(2), 169–187 (2004)
Mazzia, F., Cash, J.R., Soetaert, K.: Solving boundary value problems in the open source software R: package bvpSolve. Opusc. Math. 34(2), 387–403 (2014)
Nijmeijer, H., Van der Schaft, A.: Nonlinear Dynamical Control Systems. Springer, Berlin (2013)
Nikolayzik, T., Büskens, C., Gerdts, M.: Nonlinear large-scale optimization with WORHP. In: Proceedings of the 13th AIAA/ISSMO Multidisciplinary Analysis Optimization Conference, vol. 13, no. 15.09 (2010)
Oberle, H.J., Grimm, W.: BNDSCO: A Program for the Numerical Solution of Optimal Control Problems. Inst. für Angewandte Math. der Univ, Hamburg (2001)
Osborne, J., Zenkov, D.V.: Steering the Chaplygin sleigh by a moving mass. In: Proceedings of CDC 2005, vol. 44, pp. 1114–1118
Patterson, M.A., Rao, A.V.: GPOPS-II: a MATLAB software for solving multiple-phase optimal control problems using hp-adaptive Gaussian quadrature collocation methods and sparse nonlinear programming. ACM Trans. Math. Softw. (TOMS) 41(1), 1 (2014)
Poincaré, H.: Sur une forme nouvelle des équations de la mécanique. CR Acad. Sci 132, 369–371 (1901)
Rao, A.V.: A survey of numerical methods for optimal control. Adv. Astronaut. Sci. 135(1), 497–528 (2009)
Rao, A.V., et al.: Algorithm 902: GPOPS, a MATLAB software for solving multiple-phase optimal control problems using the Gauss pseudospectral method. ACM Trans. Math. Softw. (TOMS) 37(2), 22 (2010)
Rieck, M., et al.: FALCON.m User Guide. Institute of Flight System Dynamics, Technische Universität München, Munich (2016)
Ross, I.M.: Users Manual for DIDO: A MATLAB Application Package for Solving Optimal Control Problems. Elissar Global (2004). http://www.elissarglobal.com/academic/products/
Rutquist, P.E., Edvall, M.M.: Propt-matlab Optimal Control Software, vol. 260. Tomlab Optimization Inc, Vallentuna and Västerås (2010)
Shampine, L.F., Kierzenka, J., Reichelt, M.W.: Solving boundary value problems for ordinary differential equations in MATLAB with bvp4c. In: Tutorial notes, pp. 437–448 (2000)
Shampine, L.F., Muir, P.H., Xu, H.: A user-friendly Fortran BVP solver. JNAIAM 1(2), 201–217 (2006)
Suslov, G.K.: Theoretical Mechanics, vol. 3. Gostekhizdat, Moscow (1946)
The Numerical Algorithms Group (NAG). The NAG Library. www.nag.com
Vlases, W.G., et al.: Optimal trajectories by implicit simulation. In: Boeing Aerospace and Electronics, Technical Report WRDC-TR-90-3056, Wright-Patterson Air Force Base (1990)
von Stryk, O.: Users Guide for DIRCOL. Technische Universität Darmstadt, Darmstadt (2000)
Wächter, A., Biegler, L.T.: On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Math. Program. 106(1), 25–57 (2006)
Weinstein, M.J., Patterson, M.A., Rao, A.V.: Utilizing the algorithmic differentiation package ADiGator for solving optimal control problems using direct collocation. In: AIAA Guidance, Navigation, and Control Conference, p. 1085 (2015)
Weinstein, M.J., Rao, A.V.: Algorithm: ADiGator, a toolbox for the algorithmic differentiation of mathematical functions in MATLAB using source transformation via operator overloading. ACM Trans. Math. Softw. (2016, in revision). http://www.anilvrao.com/Publications/SubmittedJournalPublications/adigator-CALGO.pdf
Zenkov, D.V., et al.: Matching and stabilization of low dimensional nonholonomic systems. In: Proceedings of CDC 2000, vol. 39, pp. 1289–1295 (2000)
Zenkov, D.V., Bloch, A.M., Marsden, J.E.: Flat nonholonomic matching. In: Proceedings of ACC 2002, pp. 2812–2817 (2002)
Acknowledgements
This problem was suggested to us by Prof. D. V. Zenkov during a visit to the University of Alberta. Subsequent discussions and Prof. Zenkov's continued interest in this project are gratefully acknowledged. We also acknowledge fruitful discussions with Profs. A. M. Bloch, D. D. Holm, M. Leok, F. Gay-Balmaz, T. Hikihara, A. Lewis, and H. Yoshimura. There were many helpful exchanges with Prof. H. Oberle, Prof. L. F. Shampine (bvp4c and bvp5c), Prof. F. Mazzia (TOM and bvptwp), J. Willkomm (ADiMat), M. Weinstein (ADiGator), and Prof. A. Rao (GPOPS-II) concerning ODE BVP solvers, automatic differentiation software, and direct method solvers. Both authors were partially supported by the NSERC Discovery Grant and the University of Alberta Centennial Fund. In addition, Stuart Rogers was supported by the FGSR Graduate Travel Award, the IGR Travel Award, the GSA Academic Travel Award, and the AMS Fall Sectional Grad Student Travel Grant. We also thank Alberta Innovates Technology Futures (AITF) for providing support to the authors through the Alberta Centre for Earth Observation Sciences (CEOS).
Communicated by Anthony Bloch.
A Survey of Numerical Methods for Solving Optimal Control Problems: Dynamic Programming, the Direct Method, and the Indirect Method
There are three approaches to solving an optimal control problem: (1) dynamic programming, (2) the direct method, and (3) the indirect method. Bryson (1975) and Bryson (1999) present an introduction to dynamic programming. Betts (1998) and Rao (2009) are thorough survey articles on the direct and indirect methods. Gerdts (2012) is a recent treatise providing detailed descriptions of both the direct and indirect methods, Biegler (2010) is a comprehensive reference on the direct method, and Betts (2010) provides a comprehensive, modern treatment of the local collocation technique of the direct method.
In dynamic programming, a PDE called the Hamilton–Jacobi–Bellman equation (Evans 2010) is formulated and solved. However, due to the curse of dimensionality, solving this PDE is practical only for very simple problems, so very few numerical solvers implement dynamic programming for optimal control. For example, BOCOPHJB (Bonnans et al. 2015) is free C++ software implementing the dynamic programming approach. Darbon and Osher (2016) and Chow (2016) represent recent research that seeks to overcome the curse of dimensionality in certain special cases.
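Concretely, for a controlled ODE \(\dot{x}=f(x,u)\) with running cost \(L\), terminal cost \(\Phi\), and admissible controls \(u \in U\) (generic notation here, not that of the main text), the value function \(V(x,t)\) satisfies the Hamilton–Jacobi–Bellman equation

```latex
\frac{\partial V}{\partial t}(x,t)
  + \min_{u \in U}\left[ \nabla_x V(x,t) \cdot f(x,u) + L(x,u) \right] = 0,
\qquad V(x,T) = \Phi(x).
```

The curse of dimensionality arises because this PDE must be solved on a grid spanning the entire state space.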
Note that because the control function u is an unknown function of time, an optimal control problem is infinite-dimensional. In the direct method, the infinite-dimensional optimal control problem is approximated by a finite-dimensional nonlinear programming (NLP) problem by parameterizing the control function u as a finite linear combination of basis functions. In the sequential approach of the direct method, the state is reconstructed from a guess of the unknown coefficients for the control basis functions, the unknown parameters, the unknown initial states, and the unknown terminal time by multiple shooting. In the simultaneous approach of the direct method, the state is also parameterized as a finite linear combination of basis functions. In the direct method, the ODE, initial boundary conditions, terminal boundary conditions, and path constraints are represented as a system of algebraic inequalities, and the objective function is minimized subject to satisfying the system of algebraic inequalities, with the unknowns being the coefficients for the control and/or state basis functions, parameters, initial states, and the terminal time. In the Lagrange and Bolza formulations, the objective function is approximated via numerical quadrature. There are many NLP problem solvers available, such as IPOPT (Wächter and Biegler 2006), WORHP (Nikolayzik et al. 2010), SNOPT (Gill et al. 2005), KNITRO (Byrd et al. 2006), and \(\texttt{MATLAB}\)’s \(\texttt{fmincon}\); of these NLP problem solvers, only IPOPT and WORHP are free. Most direct method solvers utilize one of these NLP problem solvers.
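As an illustration of the sequential approach, the following sketch transcribes a toy problem into an NLP. The problem, discretization, and all names are our own (they do not come from any solver mentioned above), and SciPy is used purely for a self-contained example.

```python
import numpy as np
from scipy.optimize import minimize

# Sequential direct transcription of a toy problem (our own example):
#   minimize  ∫₀¹ u(t)² dt   subject to  x'' = u,
#   x(0) = x'(0) = 0,  x(1) = 1,  x'(1) = 0   (analytic optimal cost: 12).
N = 50                 # number of piecewise-constant control intervals
h = 1.0 / N

def simulate(u):
    """Reconstruct the state from the control guess by forward Euler."""
    x, v = 0.0, 0.0
    for uk in u:
        x, v = x + h * v, v + h * uk
    return x, v

def cost(u):
    return h * np.sum(u ** 2)          # quadrature of the running cost

def terminal(u):
    x, v = simulate(u)
    return [x - 1.0, v]                # terminal boundary conditions

# The infinite-dimensional problem is now a 50-variable NLP.
res = minimize(cost, np.zeros(N), constraints={"type": "eq", "fun": terminal})
```

As N grows, res.fun approaches the analytic optimal cost of 12.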
The packages RIOTS (Chen and Schwartz 2002), DYNOPT (Cizniar et al. 2005), ICLOCS (Falugi et al. 2010), GPOPS (Rao et al. 2010), FALCON.m (Rieck et al. 2016), and OptimTraj (Kelly 2016) are free MATLAB implementations of the direct method, while DIDO (Ross 2004), PROPT (Rutquist and Edvall 2010), and GPOPS-II (Patterson and Rao 2014) are commercial MATLAB implementations of the direct method. BOCOP (Bonnans et al. 2014), ACADO (Houska et al. 2011), and PSOPT (Becerra 2010) are free C++ implementations of the direct method. MISER (Goh and Teo 1988), DIRCOL (von Stryk 2000), SNCTRL (Gill and Wong 2015), OTIS (Vlases et al. 1990), and POST (Brauer et al. 1977) are free Fortran implementations of the direct method, while GESOP (Jansch et al. 1994) and SOS (Betts 2013) are commercial Fortran implementations of the direct method.
In the indirect method, Pontryagin’s minimum principle uses the calculus of variations to formulate necessary conditions for a minimum solution to the optimal control problem. These necessary conditions take the form of a differential algebraic equation (DAE) with boundary conditions; such a problem is called a DAE boundary value problem (BVP). In some cases, through algebraic manipulation, it is possible to convert the DAE to an ordinary differential equation (ODE), thereby producing an ODE BVP.
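In generic notation (not that of the main text), with control Hamiltonian \(H(x,\lambda,u) = \lambda^{\top} f(x,u) + L(x,u)\), the necessary conditions of Pontryagin's minimum principle read

```latex
\dot{x} = \frac{\partial H}{\partial \lambda} = f(x,u^{*}), \qquad
\dot{\lambda} = -\frac{\partial H}{\partial x}, \qquad
u^{*}(t) = \operatorname*{arg\,min}_{u \in U} H\bigl(x(t),\lambda(t),u\bigr),
```

together with the prescribed boundary conditions on the state and transversality conditions on the costate \(\lambda\); eliminating \(u^{*}\) from the dynamics is what produces the BVP.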
A DAE BVP can be solved numerically by multiple shooting, collocation, or quasilinearization (Kunkel 2006). bvpsuite (Kitzhofer et al. 2010) is a free MATLAB collocation DAE BVP solver. COLDAE (Ascher and Spiteri 1994) is a free Fortran quasilinearization DAE BVP solver, which solves each linearized problem via collocation. The commercial Fortran code SOS, mentioned previously, also has the capability to solve DAE BVPs arising from optimal control problems via multiple shooting or collocation.
An ODE BVP can be solved numerically by multiple shooting, Runge–Kutta methods, collocation (which is a special subset of Runge–Kutta methods), finite differences, or quasilinearization (Ascher et al. 1994). bvp4c (Shampine et al. 2000), bvp5c (Kierzenka and Shampine 2008), bvp6c (Hale and Moore 2008), and sbvp (Auzinger et al. 2003) are MATLAB collocation ODE BVP solvers; bvp4c and bvp5c come standard with MATLAB, while bvp6c and sbvp are free. bvptwp (Cash et al. 2013) is a free MATLAB ODE BVP solver package implementing 6 algorithms: twpbvp_m, twpbvpc_m, twpbvp_l, twpbvpc_l, acdc, and acdcc; acdc and acdcc perform automatic continuation. twpbvp_m and twpbvpc_m rely on Runge–Kutta methods, while the other 4 algorithms rely on collocation. TOM (Brugnano and Trigiante 1997; Mazzia and Sgura 2002; Aceto et al. 2003; Mazzia and Trigiante 2004; Cash et al. 2006) is a free MATLAB quasilinearization ODE BVP solver, which uses finite differences to solve each linearized problem. solvebvp (Birkisson 2013b; Birkisson and Driscoll 2012, 2013a) is a MATLAB quasilinearization ODE BVP solver available in the free MATLAB toolbox Chebfun (Driscoll et al. 2014); solvebvp uses spectral collocation to solve each linearized problem. COTCOT (Bonnard et al. 2005), HamPath (Caillau et al. 2012), and BNDSCO (Oberle and Grimm 2001) are free Fortran indirect method optimal control problem solvers that use multiple shooting to solve the ODE BVPs.
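As a minimal concrete example (our own toy problem, solved with SciPy's solve_bvp rather than the solvers named above), the indirect method applied to minimizing \(\int_0^1 u^2\,dt\) for \(\ddot{x}=u\) with fixed endpoints yields a linear ODE BVP in the state and costates:

```python
import numpy as np
from scipy.integrate import solve_bvp

# ODE BVP from applying Pontryagin's principle to a toy problem:
#   minimize ∫₀¹ u² dt,  x'' = u,  x(0) = x'(0) = 0,  x(1) = 1,  x'(1) = 0.
# Minimizing the Hamiltonian gives u = -lam2/2; the costates satisfy
# lam1' = 0 and lam2' = -lam1.
def odes(t, y):
    x1, x2, lam1, lam2 = y
    return np.vstack([x2, -lam2 / 2.0, np.zeros_like(lam1), -lam1])

def bc(ya, yb):
    return np.array([ya[0], ya[1], yb[0] - 1.0, yb[1]])  # boundary conditions

t = np.linspace(0.0, 1.0, 11)
sol = solve_bvp(odes, bc, t, np.zeros((4, t.size)))
u0 = -sol.sol(0.0)[3] / 2.0     # recovered control at t = 0
```

The analytic optimal control is u(t) = 6 - 12t, so u0 should be close to 6; note that the trivial zero initial guess suffices here only because this toy BVP is linear.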
MIRKDC (Enright and Muir 1996), BVP_SOLVER (Shampine et al. 2006), BVP_SOLVER-2 (Boisvert et al. 2013), TWPBVP (Cash and Wright 1991), and TWPBVPC (Cash and Mazzia 2005) are free Fortran Runge–Kutta method ODE BVP solvers. TWPBVPL (Bashir-Ali et al. 1998), TWPBVPLC (Cash and Mazzia 2006), ACDC (Cash et al. 2001), and ACDCC (Cash et al. 2013) are free Fortran collocation ODE BVP solvers. bvptwp, mentioned previously, is a MATLAB reimplementation of the Fortran solvers TWPBVP, TWPBVPC, TWPBVPL, TWPBVPLC, ACDC, and ACDCC. COLSYS (Ascher et al. 1981), COLNEW (Bader and Ascher 1987), and COLMOD (Cash et al. 2001) are free Fortran collocation quasilinearization ODE BVP solvers; COLMOD is an automatic continuation version of COLNEW. bvpSolve (Mazzia et al. 2014) is an R library that wraps the Fortran solvers TWPBVP, TWPBVPC, TWPBVPL, TWPBVPLC, ACDC, ACDCC, COLSYS, COLNEW, COLMOD, and COLDAE. py_bvp (Boisvert et al. 2010) is a Python library that wraps the Fortran solvers TWPBVPC, COLNEW, and BVP_SOLVER. The NAG Library (www.nag.com) is a commercial Fortran library consisting of several multiple shooting, collocation, and finite difference ODE BVP solvers. The Fortran solvers in the NAG Library are accessible from other languages (such as C, Python, MATLAB, and .NET) via wrappers.
The NLP solver utilized by a direct method and the ODE/DAE BVP solver utilized by an indirect method must compute Jacobians and/or Hessians (i.e., derivatives) of the functions involved in the optimal control problem. These derivatives may be approximated by finite differences, but for increased accuracy and in many cases increased efficiency, exact (to machine precision) derivatives are desirable. These exact derivatives may be computed through symbolic or automatic differentiation. Usually, a symbolic derivative evaluates much more rapidly than an automatic derivative; however, due to expression explosion, symbolic derivatives cannot always be obtained for very complicated functions. The Symbolic Math Toolbox and TomSym (Holmström et al. 2010) are commercial MATLAB toolboxes that compute symbolic derivatives. While the Symbolic Math Toolbox only computes un-vectorized symbolic derivatives, TomSym computes both un-vectorized and vectorized symbolic derivatives. ADiGator (Weinstein and Rao 2016; Weinstein et al. 2015) is a free MATLAB toolbox capable of computing both un-vectorized and vectorized automatic derivatives. Usually in MATLAB, a vectorized automatic derivative evaluates much more rapidly than an un-vectorized symbolic derivative (wrapped within a for loop). Only the MATLAB Symbolic Math Toolbox and ADiGator were utilized in this research. For other automatic differentiation packages available in many programming languages, see Community Portal for Automatic Differentiation (2016).
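The core idea behind forward-mode automatic differentiation can be sketched in a few lines with dual numbers. This illustrates the principle only; it is not how ADiGator or ADiMat are implemented (those use source transformation).

```python
# Minimal forward-mode automatic differentiation via dual numbers:
# each value carries its derivative, and arithmetic propagates both.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)  # product rule
    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1   # f'(x) = 6x + 2

x = Dual(2.0, 1.0)                 # seed the derivative direction
y = f(x)                           # y.val = f(2) = 17, y.dot = f'(2) = 14
```

Unlike a finite-difference approximation, y.dot is exact to machine precision.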
A dynamic programming solution satisfies necessary and sufficient conditions for a global minimum solution of an optimal control problem. A direct method solution satisfies necessary and sufficient conditions for a local minimum solution of a finite-dimensional approximation of an optimal control problem, while an indirect method solution only satisfies necessary conditions for a local minimum solution of an optimal control problem. Thus, the dynamic programming approach is the holy grail for solving an optimal control problem; however, as mentioned previously, dynamic programming is impractical due to the curse of dimensionality. Therefore, in practice only direct and indirect methods are used to solve optimal control problems.
Since the direct method solves a finite-dimensional approximation of the original optimal control problem, it is not as accurate as the indirect method. Moreover, the indirect method converges much more rapidly than the direct method. However, in addition to solving for the states and controls, the indirect method must also solve for the costates. Since the costates are unphysical, they are very difficult to guess initially. Thus, though the direct method may be slower and somewhat less accurate than the indirect method, it is much more robust to poor initial guesses of the states and controls; for this reason, the direct method tends to be the preferred method of solution in many practical applications.
In some cases, it is possible to surmount the problem of providing a good initial guess required to obtain convergence via the indirect method; in such cases, the indirect method will converge substantially faster than the direct method. If a solution of a simpler optimal control problem is known and if the simpler and original optimal control problems are related by a continuous parameter, it may be possible to perform numerical continuation in the parameter from the solution of the simpler optimal control problem to a solution of the original optimal control problem. In the literature, numerical continuation is also sometimes called the differential path following method or the homotopy method. Allgower and Georg (2003) is a comprehensive reference on numerical continuation methods. bvpsuite implements a continuation algorithm to solve DAE BVPs, where the continuation parameter may have turning points (i.e., the continuation parameter need not monotonically increase or decrease). COCO (Dankowicz and Schilder 2013), a free collection of MATLAB toolboxes, and AUTO (Doedel et al. 2007), free Fortran software, implement sophisticated algorithms for the numerical continuation (permitting turning points) of ODE BVPs. followpath, available in the MATLAB toolbox Chebfun, is able to utilize Chebfun’s quasilinearization ODE BVP solver solvebvp to solve ODE BVPs via continuation, where the continuation parameter may have turning points. acdc/ACDC, acdcc/ACDCC, and COLMOD implement continuation algorithms to solve ODE BVPs, but the continuation parameter is assumed to monotonically increase or decrease. HamPath, mentioned previously, is a free Fortran indirect method optimal control problem solver which uses continuation (permitting turning points) in concert with multiple shooting. Of these numerical continuation tools, only acdc and acdcc were used in this research.
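The basic warm-starting idea behind numerical continuation can be sketched on a scalar root-finding problem. This is our own toy example; production continuation codes such as acdc also adapt the step size and can handle turning points.

```python
import numpy as np
from scipy.optimize import fsolve

# Warm-started parameter continuation on a toy scalar problem:
# solve  eps*x^3 + x - 1 = 0  for eps = 5 by stepping eps up from 0,
# whose solution x = 1 is known in closed form.
def f(x, eps):
    return eps * x ** 3 + x - 1.0

x = 1.0                                  # exact root of the eps = 0 problem
for eps in np.linspace(0.0, 5.0, 21):
    x = fsolve(f, x, args=(eps,))[0]     # warm-start from the previous root
# x is now a root of the "hard" eps = 5 problem
```

Each solve starts close to its solution, so Newton-type iterations converge even when a cold start from an arbitrary guess might fail.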
In order to converge to the solution of the true optimal control problem rather than a finite-dimensional approximation, a direct method may use h, p, or hp methods. In the h method, the degree of the approximating polynomial on each mesh interval is held fixed, while the mesh is adaptively refined until the solution meets given error tolerances. In the p method, the mesh is held fixed, while the degree of the approximating polynomial on each mesh interval is adaptively increased until the solution meets given error tolerances. In the hp method, which is implemented by GPOPS-II, the mesh and the degree of the approximating polynomial on each mesh interval are adaptively refined until the solution meets given error tolerances.
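A toy version of the h method's mesh refinement loop, using midpoint deviation of linear interpolation as a stand-in for a real error estimator (our own simplification, not the scheme of any particular solver):

```python
import numpy as np

# Toy h-refinement loop: bisect every mesh interval whose local error
# estimate exceeds the tolerance, and repeat until the whole mesh passes.
f = lambda t: np.exp(-50.0 * (t - 0.5) ** 2)   # sharp feature near t = 0.5
mesh, tol = [0.0, 1.0], 1e-3

refined = True
while refined:
    refined = False
    new_mesh = [mesh[0]]
    for a, b in zip(mesh[:-1], mesh[1:]):
        mid = 0.5 * (a + b)
        # midpoint deviation of linear interpolation as the error estimate
        if abs(f(mid) - 0.5 * (f(a) + f(b))) > tol:
            new_mesh.append(mid)               # h-refinement: split interval
            refined = True
        new_mesh.append(b)
    mesh = new_mesh
# mesh points now cluster around the sharp feature at t = 0.5
```

A p method would instead keep the two original mesh points and raise the polynomial degree on the interval; an hp method combines both decisions per interval.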
We have used the indirect method to numerically solve Suslov’s optimal control problem, as it is vastly superior in speed compared to other methods and is capable of dealing with solutions having sharp gradients, if an appropriate BVP solver is utilized. This was made possible by constructing an analytical state and control solution to a singular optimal control problem and then by using continuation in the cost function coefficients to solve the actual optimal control problem. The singular optimal control problem can be solved analytically since the initial conditions \(\varvec{\xi }(a)\) are not prescribed, because the constraint \(\left\langle \varvec{\Omega }(a),\varvec{\xi }(a)\right\rangle =0\) is not explicitly enforced, and because it is easy to solve Suslov’s pure equations of motion for \(\varvec{\xi }\) in terms of \(\varvec{\Omega }\). Since the necessary conditions obtained by applying Pontryagin’s minimum principle to Suslov’s optimal control problem can be formulated as an ODE BVP, ODE rather than DAE BVP solvers were used. The MATLAB solvers bvp4c, bvp5c, bvp6c, sbvp, bvptwp, and TOM were used to solve the ODE BVP. sbvp and bvptwp, which have up to 8th-order accuracy, were found to be the most robust in solving the ODE BVP for Suslov’s optimal control problem; the other MATLAB solvers were found to be very inefficient, requiring many thousands of mesh points due to their lower accuracy. The numerical results presented in Sect. 5.3 were obtained via bvptwp’s automatic continuation solver acdc and sbvp.
Putkaradze, V., Rogers, S. Constraint Control of Nonholonomic Mechanical Systems. J Nonlinear Sci 28, 193–234 (2018). https://doi.org/10.1007/s00332-017-9406-1