Abstract
Numerical methods that preserve geometric invariants of the system, such as energy, momentum or the symplectic form, are called geometric integrators. In this paper we present a method to construct symplectic-momentum integrators for higher-order Lagrangian systems. Given a regular higher-order Lagrangian \(L:T^{(k)}Q\rightarrow {\mathbb {R}}\) with \(k\ge 1\), the resulting discrete equations define a generally implicit numerical integrator algorithm on \(T^{(k-1)}Q\times T^{(k-1)}Q\) that approximates the flow of the higher-order Euler–Lagrange equations for L. The algorithm equations are called higher-order discrete Euler–Lagrange equations and constitute a variational integrator for higher-order mechanical systems. The general idea for those variational integrators is to directly discretize Hamilton’s principle rather than the equations of motion in a way that preserves the invariants of the original system, notably the symplectic form and, via a discrete version of Noether’s theorem, the momentum map. We construct an exact discrete Lagrangian \(L_d^e\) using the locally unique solution of the higher-order Euler–Lagrange equations for L with boundary conditions. By taking the discrete Lagrangian as an approximation of \(L_d^e\), we obtain variational integrators for higher-order mechanical systems. We apply our techniques to optimal control problems since, given a cost function, the optimal control problem is understood as a second-order variational problem.
1 Introduction
This paper is concerned with the design of geometric integrators for higher-order variational systems. The study of higher-order variational systems has regularly attracted a lot of attention from the applied and theoretical points of view (see León and Rodrigues 1985 and references therein). Recently, there has been renewed interest in these systems due to new and relevant applications in optimal control for robotics or aeronautics, or the study of air traffic control and computational anatomy (Burnett et al. 2013; Colombo and Martín de Diego 2014; Crouch and Silva Leite 1995; Gay-Balmaz et al. 2012a, b, 2011; Hussein and Bloch 2004; Machado et al. 2010; Noakes et al. 1989).
A continuous higher-order system is modeled by a Lagrangian on a higher-order tangent bundle \(T^{(k)}Q\), that is, a function \(L:T^{(k)}Q \rightarrow {{\mathbb {R}}}\). The corresponding Euler–Lagrange equations are a system of implicit 2k-order differential equations. For most of these Lagrangian systems, explicit integration is too complicated to carry out directly, and generically it is not possible at all. In these cases, it is necessary to discretize the equations, taking approximations at several points in time over the interval of integration.
Among the different numerical integrators that one can derive for continuous higher-order systems, one of the most successful ideas is to discretize first the variational principle (instead of the equations of motion) and to derive the numerical method applying discrete calculus of variations (Marsden and West 2001; Veselov 1988; Wendlandt and Marsden 1997). The advantage of this procedure is that it automatically preserves some of the geometric structures involved, such as the symplectic form and the momentum, and it also exhibits good behavior of the associated energy. These methods have their roots in the optimal control literature of the 1960s (Jordan and Polak 1964).
In previous approaches (see, e.g., Benito et al. 2006; Colombo et al. 2012, 2013), the theory of discrete variational mechanics for higher-order systems was derived using a discrete Lagrangian \(L_d:Q^{k+1}\rightarrow {\mathbb {R}}\) where \(Q^{k+1}\) is the cartesian product of \(k+1\) copies of the configuration manifold Q. There, \(k+1\) points are used to approximate the positions and the higher-order velocities (such as the standard velocities, accelerations, jerks) and to represent in this way elements of the higher-order tangent bundle \(T^{(k)}Q\).
We will see in this paper that the most natural approach is to take a discrete Lagrangian \(L_d:T^{(k-1)}Q\times T^{(k-1)}Q \rightarrow {{\mathbb {R}}}\) since actually the discrete variational calculus is not based on the discretization of the Lagrangian itself, but on the discretization of the associated action. We will see that a suitable approximation of the action
\[\int _0^h L\left( q(t),\dot{q}(t),\ldots ,q^{(k)}(t)\right) \,dt\]
is given by a Lagrangian of the form \(L_d:T^{(k-1)}Q\times T^{(k-1)}Q \rightarrow {{\mathbb {R}}}\). Moreover, we will derive a particular choice of discrete Lagrangian which gives an exact correspondence between discrete and continuous systems, the exact discrete Lagrangian. For instance, if we take the Lagrangian \(L(q, \dot{q}, \ddot{q})=\frac{1}{2}\ddot{q}^2\), the corresponding exact discrete Lagrangian \(L_{d}^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\) is
\[L_{d}^{e}(q_0,v_0,q_h,v_h)=\int _0^h \frac{1}{2}\,\ddot{q}(t)^2\,dt,\]
where q(t) is the unique solution of the Euler–Lagrange equations for L verifying \(q(0)=q_0\), \(\dot{q}(0)=v_0\), \(q(h)=q_h\), \(\dot{q}(h)=v_h\) for h small enough (see Sect. 2).
Observe from the previous example that now this theory of variational integrators for higher-order systems is even simpler, since it fits directly into the standard discrete mechanics theory for a discrete Lagrangian of the form \(L_d:M\times M\rightarrow {{\mathbb {R}}}\) where \(M=T^{(k-1)}Q\). We will show that if the original Lagrangian is regular then so is the exact discrete Lagrangian, in the sense of Marsden and West (2001). Moreover, in the corresponding applications, for instance in optimal control theory or splines theory, typically we are dealing with initial and final boundary conditions which need not be discretized, in contrast to previously proposed methods (Bloch et al. 2009; Lee et al. 2008; Leok and Shingel 2012).
The paper is structured as follows. In Sect. 2, we show that a regular higher-order Lagrangian system has a unique solution for given nearby endpoint conditions using a direct variational proof of existence and uniqueness of the local boundary value problem, which employs a regularization procedure. In Sect. 3 we introduce the notion of exact discrete Lagrangian for higher-order systems and we design the construction of variational integrators for higher-order Lagrangian systems taking approximations of the exact discrete Lagrangian. We obtain the discrete Euler–Lagrange equations for a discrete Lagrangian defined in the cartesian product of two copies of \(T^{(k-1)}Q\). Section 4 is devoted to the study of the relation between the discrete and continuous dynamics. We show the relation between the discrete Legendre transformations and the continuous one, and we also show that the exact discrete Lagrangian associated with a higher-order regular Lagrangian is also regular. Finally, in Sect. 5, we apply our techniques to study optimal control problems for fully actuated mechanical systems.
2 Existence and Uniqueness of Solutions for the Boundary Value Problem
2.1 Higher-Order Tangent Bundles
First we recall some basic facts about the higher-order tangent bundle theory. For more details, see Crampin et al. (1986) and León and Rodrigues (1985).
Let Q be a differentiable manifold. We introduce the following equivalence relation in the set \(C^{k}(I, Q)\) of k-differentiable curves from the interval \(I\subseteq {\mathbb {R}}\) to Q, where \(0\in I\). By definition, two curves \(\gamma _1\) and \(\gamma _2\) belonging to \(C^{k}(I, Q)\) have contact of order k at \(q_0 = \gamma _1(0) = \gamma _2(0)\) if there is a local chart \((\varphi , U)\) of Q such that \(q_0 \in U\) and
\[\frac{d^s}{dt^s}\left( \varphi \circ \gamma _1\right) (t)\Big |_{t=0}=\frac{d^s}{dt^s}\left( \varphi \circ \gamma _2\right) (t)\Big |_{t=0}\]
for all \(s = 0,\dots ,k\). The equivalence class of a curve \(\gamma \) will be denoted by \([\gamma ]_0^{(k)}\). The set of equivalence classes will be denoted by \(T^{(k)}Q\) and it is not hard to show that it has a natural structure of differentiable manifold. Moreover, \( \tau _Q^k :T^{(k)} Q \rightarrow Q\) where \(\tau _Q^k \big ([\gamma ]_0^{(k)}\big ) = \gamma (0)\) is a fiber bundle called the tangent bundle of order k of Q. Clearly, \(T^{(1)} Q = TQ\).
From a local chart \(q^{(0)}=(q^i)\) on a neighborhood U of Q with \(i=1,\ldots ,n=\dim Q\), it is possible to induce local coordinates \((q^{(0)},q^{(1)},\dots ,q^{(k)})\) on \(T^{(k)}U=(\tau _Q^k)^{-1}(U)\equiv U\times ({\mathbb {R}}^{n})^{k}\). Sometimes we will resort to the usual notation \(q^{(0)}\equiv (q^{i})\), \(q^{(1)}\equiv (\dot{q}^i)\) and \(q^{(2)}\equiv (\ddot{q}^i)\).
There is a canonical embedding \(j_{k}:T^{(k)}Q\rightarrow T T^{(k-1)}Q\) defined as \(j_k([\gamma ]_0^{(k)})=[{\gamma }^{(k-1)}]_0^{(1)}\), where \({\gamma }^{(k-1)}\) is the lift of the curve \(\gamma \) to \(T^{(k-1)}Q\); that is, the curve \({\gamma }^{(k-1)}:I\rightarrow T^{(k-1)}Q\) is given by \(\gamma ^{(k-1)}(t)=[\gamma _t]_0^{(k-1)}\) where \(\gamma _t(s)=\gamma (t+s)\). In local coordinates,
\[j_k\left( q^{(0)},q^{(1)},\ldots ,q^{(k)}\right) =\left( q^{(0)},\ldots ,q^{(k-1)};q^{(1)},\ldots ,q^{(k)}\right) .\]
2.2 Hamilton’s Principle and Considerations about the Existence and Uniqueness of Solutions
Let \(L:T^{(k)}Q \rightarrow {\mathbb {R}}\) be a Lagrangian of order \(k\ge 1\), of class \(C^{k+1}\). Since our result will be local, we assume from now on that Q is an open subset of \({\mathbb {R}}^n\). Take coordinates \(\left( q^{(0)}, q^{(1)}, \dots , q^{(k)}\right) \) on \(T^{(k)}Q \equiv Q\times ({\mathbb {R}}^n)^k\) as before. We suppose that L is regular in the sense that the Hessian matrix
\[\left( \frac{\partial ^2 L}{\partial q^{(k)i}\,\partial q^{(k)j}}\right) _{i,j=1,\ldots ,n}\]
is a regular matrix. Let also \(h>0\) be given. We can formulate Hamilton’s principle as follows.
Variational Principle 1
Find a \(C^k\) curve \(q:[0,h] \rightarrow Q\) such that it is a critical point of the action
\[\int _0^h L\left( q(t),\dot{q}(t),\ldots ,q^{(k)}(t)\right) \,dt\]
among those curves whose first \(k-1\) derivatives are fixed at the endpoints, that is, with given values for \(q(0), \dot{q}(0),\,\dots ,\, q^{(k-1)}(0)\) and \(q(h), \dot{q}(h),\,\dots ,\, q^{(k-1)}(h)\).
Hamilton’s principle is a constrained problem in the Banach space \(C^k([0,h], {\mathbb {R}}^n)\). Now if q(t) is a solution to this problem that is not only \(C^k\) but \(C^{2k}\), then it satisfies the well-known kth-order Euler–Lagrange equations (see footnote 1)
\[\sum _{j=0}^{k}(-1)^{j}\frac{d^{j}}{dt^{j}}\frac{\partial L}{\partial q^{(j)}}=0. \tag{1}\]
For a regular Lagrangian, (1) can be written as an explicit 2k-order ordinary differential equation. Existence and uniqueness of solutions for the initial value problem can be guaranteed using basic ODE theory. Doing the same for the boundary value problem of finding a solution q(t) of (1) with given values for \(q(0), \dot{q}(0),\,\dots ,\, q^{(k-1)}(0)\) and \(q(h), \dot{q}(h),\,\dots ,\, q^{(k-1)}(h)\) requires different techniques. For instance, in Agarwal (1986, Ch. 9) it is shown that there exists a unique solution to an explicit 2k-order ODE with this kind of boundary conditions, for small enough h and close enough boundary values. See also Eldering (2012) (Appendix A) for results on the existence, uniqueness and smooth dependence on parameters of solutions of ODEs.
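As a worked illustration of why regularity yields an explicit ODE, consider the case \(k=2\) with \(n=1\). The block below is a sketch rather than part of the original argument; the example Lagrangian \(L=\frac{1}{2}\ddot{q}^2\) is the cubic-spline Lagrangian mentioned in the Introduction.

```latex
% Case k = 2: the Euler-Lagrange equations read
\[
\frac{\partial L}{\partial q}
-\frac{d}{dt}\frac{\partial L}{\partial \dot q}
+\frac{d^{2}}{dt^{2}}\frac{\partial L}{\partial \ddot q}=0 .
\]
% The chain rule isolates the highest derivative in the last term:
\[
\frac{d^{2}}{dt^{2}}\frac{\partial L}{\partial \ddot q}
=\frac{\partial^{2} L}{\partial \ddot q^{2}}\,q^{(4)}
+\bigl(\text{terms depending only on } q,\dot q,\ddot q,q^{(3)}\bigr).
\]
% If the Hessian \partial^2 L / \partial \ddot q^2 is invertible, one can solve for q^{(4)},
% which gives an explicit fourth-order ODE q^{(4)} = F(q, \dot q, \ddot q, q^{(3)}).
% Example: L = \tfrac12 \ddot q^{2} has Hessian 1 and
\[
\frac{\partial L}{\partial q}=0,\qquad
\frac{\partial L}{\partial \dot q}=0,\qquad
\frac{\partial L}{\partial \ddot q}=\ddot q
\quad\Longrightarrow\quad q^{(4)}=0 .
\]
```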
In principle, however, there could exist solutions to Hamilton’s variational principle that are \(C^k\) but not \(C^{2k}\), and thus do not satisfy (1). Therefore, uniqueness of solutions to the variational principle cannot yet be guaranteed. One possibility for avoiding this situation is stating Hamilton’s principle in the (smaller) \(C^{2k}\) context from the beginning. In this section we proceed differently, acknowledging the fact that the variational principle makes sense in the \(C^k\) setting. We prove local existence and uniqueness of \(C^k\) solutions to Hamilton’s principle from a direct variational point of view. We will see that these solutions turn out to be automatically \(C^{2k}\), so they satisfy the Euler–Lagrange equations a posteriori.
Our argument for the existence and uniqueness of solutions will involve a regularization procedure which follows closely the proof by Patrick (2006) for first-order Lagrangians; the formulas, of course, reduce to those in Patrick (2006) for order 1, but we introduce an additional modification using orthonormal polynomials. See also Buttazzo et al. (1998), Giaquinta and Hildebrandt (1996) for discussions on the regularity of extremals for variational problems.
2.3 Nonregularity of Hamilton’s Principle
We want to determine whether there exists a unique solution curve to Hamilton’s principle, given endpoint conditions that are close enough. The main obstacle for a straightforward affirmative answer is that the local boundary value problem as stated above is nonregular at \(h=0\). That is, the constraint function \(g:C^{k}([0,h], Q) \rightarrow ({\mathbb {R}}^{n})^{k}\times ({\mathbb {R}}^{n})^{k}\)
\[g(q)=\left( q(0),\dot{q}(0),\ldots ,q^{(k-1)}(0);\,q(h),\dot{q}(h),\ldots ,q^{(k-1)}(h)\right) \]
maps into the diagonal of \(T^{(k-1)}Q \times T^{(k-1)}Q\) for \(h=0\) and is not therefore a submersion. For \(h\ne 0\), the constraint function is a submersion.
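The loss of rank at \(h=0\) can be seen in a small finite-dimensional illustration. The sketch below is not taken from the original text: it assumes \(k=2\), \(n=1\), and restricts the endpoint map \(q\mapsto (q(0),\dot{q}(0),q(h),\dot{q}(h))\) to cubic polynomials \(q(t)=c_0+c_1t+c_2t^2+c_3t^3\).

```latex
% Endpoint data as a linear map of the coefficients (c_0, c_1, c_2, c_3):
\[
\begin{pmatrix} q(0)\\ \dot q(0)\\ q(h)\\ \dot q(h)\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
1 & h & h^{2} & h^{3}\\
0 & 1 & 2h & 3h^{2}
\end{pmatrix}
\begin{pmatrix} c_0\\ c_1\\ c_2\\ c_3\end{pmatrix},
\qquad
\det = 3h^{4}-2h^{4}=h^{4}.
\]
% For h \neq 0 the matrix is invertible, so any endpoint data can be reached.
% For h = 0 the last two rows repeat the first two: the rank drops to 2 and the
% image is the diagonal \{(q, v, q, v)\}.
```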
The approach consists in replacing this problem by an equivalent one that is regular at \(h=0\), and showing that locally there is a unique solution to the regularized problem.
2.4 Regularization
First we replace the space of curves on Q in the variational problem by the space of curves on \(T^{(k)}Q\) and include additional constraints. Denote an arbitrary curve by
\(t\in [0,h]\). Here we have modified our notation for coordinates on \(T^{(k)}Q\), using superscripts in square brackets to make a distinction with the actual derivatives of q(t).
Variational Principle 2
Find a curve \((q^{[0]}(t), q^{[1]} (t), \dots , q^{[k]} (t))\) on \(T^{(k)}Q\), with \(q^{[l]}\in C^{k-l}([0,h], {\mathbb {R}}^n)\), \(l=0, \dots , k\), such that it is a critical point of
subject to the constraints
where \((q^{[0]}_i, q^{[1]}_i, \dots , q^{[k-1]}_i)\), \(i=1,2\), are given points in \(T^{(k-1)}Q\).
Now reparameterize the curve by defining
For \(h>0\), the curve \((Q^{[0]}(u), \dots , Q^{[k]} (u))\) satisfies an equivalent variational problem as follows. Since h is a constant for each instance of the problem, we can use
as an objective function. The first set of constraints becomes
where \(j=0, \dots , k-1\).
The reparametrized variational principle is the following.
Variational Principle 3
Find a curve \(\left( Q^{[0]}(u), \dots , Q^{[k]} (u)\right) \) on \(T^{(k)}Q\), \(Q^{[l]}\in C^{k-l}([0,1], {\mathbb {R}}^n)\), \(l=0, \dots , k\), that is a critical point of
subject to the constraints
where \(j=0, \dots , k-1\), and \(\left( q^{[0]}_i, q^{[1]}_i, \dots , q^{[k-1]}_i\right) \), \(i=1,2\), are given points in \(T^{(k-1)}Q\).
The objective S does not depend on h, and the constraints are smooth through \(h=0\).
Remark 2.1
For \(h=0\), the constraints (2) imply that \(Q^{[0]}(u)\), ..., \(Q^{[k-1]} (u)\) remain constant, which restricts the possible values of the endpoint conditions in order to have a compatible set of constraints. More precisely, \(q^{[j]}_1=q^{[j]}_2\) for \(j=0,\ldots ,k-1\); otherwise there would be no curves satisfying the constraints. This kind of restriction also appears in the original variational principle 1. Moreover, the problem becomes the unconstrained problem of finding a curve \(Q^{[k]}(u)\in C^0([0,1],{\mathbb {R}}^{n})\) such that it is a critical point of
This means
Differentiating with respect to u, and using the fact that the Lagrangian is regular, we obtain that \(Q^{[k]}(u)\) is constant.
In preparation for the next step for regularization, let us solve the constraints (2) to get
This means that the functions \(Q^{[j]}(u)\), \(j=0, \dots , k-1\), can be expressed in terms of \(Q^{[j]}(0)\), ..., \(Q^{[k-1]}(0)\), the function \(Q^{[k]}(u)\) and h. For example, for \(k=2\) we have
For a general k, and for \(j=0, \dots , k-1\), an iterated change of order of integration yields
If the upper bound of summation is less than the lower bound, the sum is understood to be 0.
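The change of order of integration can be checked numerically. The following is a minimal sketch, assuming that the constraints (2) read \(dQ^{[j]}/du=h\,Q^{[j+1]}\) (consistent with Remark 2.1) and that the resulting kernel is \((u-s)^{k-j-1}/(k-j-1)!\), as in the polynomials \(a^{[k]}_j\) used later. The helper names are hypothetical, and \(Q^{[k]}\) is taken to be a scalar polynomial so that all integrals are exact.

```python
from fractions import Fraction
from math import comb, factorial

def antiderivative(p):
    """Coefficients (low to high) of u -> integral_0^u p(s) ds."""
    return [Fraction(0)] + [Fraction(c) / (i + 1) for i, c in enumerate(p)]

def evaluate(p, u):
    return sum(Fraction(c) * u**i for i, c in enumerate(p))

def nested(p, times, u):
    """Integrate p from 0 repeatedly `times` times, then evaluate at u."""
    for _ in range(times):
        p = antiderivative(p)
    return evaluate(p, u)

def kernel(p, times, u):
    """integral_0^u (u-s)^(times-1)/(times-1)! * p(s) ds, via binomial expansion."""
    m = times - 1
    total = Fraction(0)
    for r in range(m + 1):            # (u-s)^m = sum_r C(m,r) u^(m-r) (-s)^r
        for n, c in enumerate(p):     # integral_0^u s^(n+r) ds = u^(n+r+1)/(n+r+1)
            total += (comb(m, r) * (-1)**r * u**(m - r)
                      * Fraction(c) * u**(n + r + 1) / (n + r + 1))
    return total / factorial(m)

def lower_order_curve(j, k, Q0, Qk, h, u):
    """Q^{[j]}(u) from Q0 = [Q^{[0]}(0), ..., Q^{[k-1]}(0)] and the polynomial Qk."""
    taylor = sum(h**(i - j) * u**(i - j) / factorial(i - j) * Q0[i]
                 for i in range(j, k))
    return taylor + h**(k - j) * kernel(Qk, k - j, u)

# Example with k = 2, constant Q^{[2]} = 3, h = 1/2, evaluated at u = 1.
h, u = Fraction(1, 2), Fraction(1)
Q1_at_1 = lower_order_curve(1, 2, [Fraction(1), Fraction(2)], [3], h, u)
Q0_at_1 = lower_order_curve(0, 2, [Fraction(1), Fraction(2)], [3], h, u)
```

Using exact rational arithmetic avoids quadrature error, so the nested and single-integral forms can be compared with equality rather than a tolerance.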
Note that taking \(u=1\), the final endpoint data \((q^{[0]}_2, \dots , q^{[k-1]}_2)\) can now be written as
so we define
We will discuss the case \(h=0\) in Remark 2.2.
Now replace the curves and endpoint data by just \(Q^{[k]}(u)\), \((q^{[0]}_1, \dots , q^{[k-1]}_1)\), and \((z^{[0]}, \dots , z^{[k-1]})\), to get a new variational principle.
Variational Principle 4
Given h, \((q^{[0]}_1, \dots , q^{[k-1]}_1)\) and \((z^{[0]}, \dots , z^{[k-1]})\), find a continuous curve \(Q^{[k]}:[0,1] \rightarrow {\mathbb {R}}^n\) that is a critical point of
where \(Q^{[0]}(u)\), ..., \(Q^{[k-1]}(u)\) are defined as in (5) by
subject to the constraints
Observe that the constraint functions do not depend on h and are linear on the curve \(Q^{[k]}\). This variational principle is already regular through \(h=0\), as we will see when we proceed to find the solutions later.
Remark 2.2
The data \(q^{[0]}_1\), ..., \(q^{[k-1]}_1\), \(z^{[0]}\), ..., \(z^{[k-1]}\) can be transformed into the endpoint conditions for the variational principle 3 in a straightforward way, for any h, using (6) and (7). The converse (7) is possible only for \(h\ne 0\), in principle. However, if \(h=0\), let \((Q^{[0]}(u),\ldots ,Q^{[k]}(u))\) be a solution of the variational principle 3 with boundary conditions \((q_{1}^{[0]},\ldots , q^{[k-1]}_1)\) and \((q^{[0]}_2,\ldots ,q^{[k-1]}_2)\). Define \(z^{[j]}\) by the constraint in (4). Since \(Q^{[k]}\) is constant and \(\frac{(1-s)^{k-j-1}}{(k-j-1)!}>0\) in (0, 1), different values of \(Q^{[k]}\) correspond to different values of \(z^{[j]}\). Then \(Q^{[k]}\) is a solution of the variational principle 4 with boundary conditions \(q^{[0]}_1\), ..., \(q^{[k-1]}_1\), \(z^{[0]}\), ..., \(z^{[k-1]}\).
Finally, we will introduce a modification that will enable us to carry out the computations in the next section easily. Consider the inner product on \(C^0([0,1],{\mathbb {R}})\) given by
If \(f\in C^0([0,1],{\mathbb {R}})\) and \(V=(V_1, \dots , V_n)\in C^0([0,1],{\mathbb {R}}^n)\) we define the bilinear operation
Then the integrals appearing in the constraints in the variational principle 4 are \(\langle \langle a^{[k]}_j, Q^{[k]}\rangle \rangle \), where \(a^{[k]}_j\) are the polynomials
These form a basis of the space of polynomials of degree at most \(k-1\). Let us consider a basis \(b^{[k]}_j(s)\), \(j=0, \dots , k-1\), of the same space of polynomials consisting of orthonormal polynomials on [0, 1], and let \((\gamma ^{[k],i}_j)\), where \(i,j=0, \dots , k-1\), be the invertible real matrix such that \(a^{[k]}_j(s)=\gamma ^{[k],i}_jb^{[k]}_i(s)\). For example, for \(k=2\),
and we can take for instance the orthonormal basis
therefore,
Using this matrix, the constraints can be rewritten as
for \(j=0, \dots , k-1\). This allows us to reformulate the variational principle in an equivalent way by replacing the data \((z^{[0]}, \dots , z^{[k-1]})\) and constraints \(\langle \langle a^{[k]}_j,Q^{[k]}\rangle \rangle =z^{[j]}\) by new data \((w^{[0]}, \dots , w^{[k-1]})\) and constraints \(\langle \langle b^{[k]}_j,Q^{[k]}\rangle \rangle =w^{[j]}\), \(j=0, \dots , k-1\). The old and new data are related by
Variational Principle 5
Given h, \((q^{[0]}_1, \dots , q^{[k-1]}_1)\) and \((w^{[0]}, \dots , w^{[k-1]})\), find a continuous curve \(Q^{[k]}:[0,1] \rightarrow {\mathbb {R}}^n\) that is a critical point of
where \(Q^{[0]}(u)\), ..., \(Q^{[k-1]}(u)\) are defined by
subject to the constraints
2.5 Solution of the Regularized Problem
Next, we will study the existence and uniqueness of solutions associated with variational principle 5. We will show that the boundary value problem is well posed, and that even though the variational problem is posed on the space of \(C^{k}\) solutions, the extremizers are \(C^{2k}\) and hence satisfy the Euler–Lagrange equations.
We start the proof by showing the \(C^{k+1}\) differentiability of the action \(S_h\) for the variational principle 5. Next, we compute the gradient of \(S_h\) in order to solve the equation “gradient of \(S_h\) perpendicular to constraint space.” After introducing an orthogonal decomposition of the constraint space we obtain that \(S_h\) has a critical point on the constraint set if and only if the orthogonal projection of the gradient of \(S_h\) is 0 and hence we can find the stationary curve for the variational principle 5. Using the implicit function theorem we obtain existence and uniqueness of solutions for the variational principle 5. Finally, we reverse the regularization to obtain a unique \(C^{2k}\) solution of the original variational principle.
2.5.1 Step 1—\(C^{k+1}\) Differentiability of \(S_{h}\):
Let \(S_h\) be given as in the variational principle 5, regarded as a real-valued map defined on the Banach space \(C^0([0,1],{\mathbb {R}}^n)\) of curves \(Q^{[k]}(u)\). We can also consider its restriction to the Banach space \(C^k([0,1],{\mathbb {R}}^n)\). We are going to use the following lemma (Abraham et al. 1988).
Lemma 2.3
(Omega Lemma) Let E, F be Banach spaces, U open in E, and M a compact topological space. Let \(g:U \rightarrow F\) be a \(C^r\) map, \(r>0\). The map
is also \(C^r\), and \(D\Omega _g(f)\cdot h=[(Dg)\circ f]\cdot h\).
The objective \(S_h\) is the composition of the maps
where i is defined by \(Q^{[k]}(u) \mapsto (Q^{[0]}(u), \dots , Q^{[k]} (u))\). Here \(Q^{[0]}(u),\dots \), \(Q^{[k-1]} (u)\) stand for the right-hand sides of (9). Both i and \(\int \) are bounded affine and therefore \(C^\infty \). By the Omega Lemma, \(\Omega _L\) is \(C^{k+1}\) because L is \(C^{k+1}\), and therefore so is \(S_h\).
If we regard \(S_h\) as defined on \(C^k([0,1],{\mathbb {R}}^n)\), we should append the inclusion \(C^k([0,1],{\mathbb {R}}^n)\hookrightarrow C^0([0,1],{\mathbb {R}}^n)\) to the left side of the diagram above. This inclusion is \(C^\infty \) because it is linear and bounded (\(\Vert Q^{[k]}\Vert _{C^0}\le \Vert Q^{[k]}\Vert _{C^k}\) for all \(Q^{[k]}\)). Then \(S_h\) is \(C^{k+1}\) also as a map defined on \(C^k([0,1],{\mathbb {R}}^n)\). In order to cover both cases, from now on l will denote 0 or k interchangeably.
2.5.2 Step 2—Computing the Gradient of \(S_h\):
We need a suitable notion of the gradient of \(S_h\), in order to find where it is perpendicular to the constraint space. In order to do that, let us first compute \(\mathbf {d}S_h[Q^{[k]}(u)]\), for \(Q^{[k]}\) of class \(C^l\). The functions \(Q^{[0]}(u)\), ..., \(Q^{[k-1]}(u)\) are defined by (9). Since \(S_h\) is smooth, we will compute \(\mathbf {d}S_h\) using directional derivatives. For an arbitrary \(\delta Q^{[k]}\) of class \(C^l\), take a deformation \(Q^{[k]}_\epsilon (u)=Q^{[k]}(u)+\epsilon \delta Q^{[k]}(u)\) of \(Q^{[k]}(u)\). For \(j=0, \dots , k-1\), define the corresponding lower order curves as in (9) by
so \(Q^{[j]}_0(u)=Q^{[j]}(u)\) and
Denoting \(a^{[k]}_j(u,s)={(u-s)^{k-j-1}}/{(k-j-1)!}\) and \(Q(u)=(Q^{[0]}(u), \dots \), \(Q^{[k]} (u))\) for short, we have
For each \(u\in [0,1]\), the first factor in the integrand of the last expression is in \(({\mathbb {R}}^{n})^{*}\). If \(\sharp :({\mathbb {R}}^{n})^{*}\rightarrow {\mathbb {R}}^{n}\) denotes the index raising operator associated to the Euclidean inner product, define
Since \({\partial L}/{\partial q^{[0]}}\), ..., \({\partial L}/{\partial q^{[k]}}\) are \(C^{k}\) and the curve Q is \(C^l\) (\(l=0\) or \(l=k\)), \(\nabla S_h[Q^{[k]}(u)]\) is in \(C^l([0,1],{\mathbb {R}}^n)\). Then we have a vector field
which we call the gradient of \(S_h\). By the Omega Lemma, \(\nabla S_h\) is a \(C^k\) map.
2.5.3 Step 3—Orthogonal Decomposition of the Constraint Space and Critical Points of \(S_h\):
Let us now compute the tangent space to the constraint set. If we consider the inner product on \(C^{l}([0,1],{\mathbb {R}}^n)\) given by
then
The constraints \(g_j[Q^{[k]}(s)]:=\langle \langle b^{[k]}_j,Q^{[k]}\rangle \rangle =w^{[j]}\), \(j=0, \dots , k-1\), in the variational principle 5 are bounded and linear, and therefore \(C^\infty \), and the corresponding derivatives are the same functions \(g_j\). Define
so
is the tangent space to the constraint set. They are actually parallel since the constraints are linear. It is not difficult to show using the definitions that the space
of \({\mathbb {R}}^n\)-valued polynomials of degree at most \(k-1\) is indeed the \(\llbracket ,\rrbracket \)-orthogonal complement of E, which is then a split subspace (see “Appendix” for a proof). The orthogonal projection \(P:C^l([0,1],{\mathbb {R}}^n)=E\oplus E^\perp \rightarrow E\) is given by
Now \(S_h\) has a critical point on the constraint set (for any value of the constraints) if and only if the projection \(P\nabla S_h\) of \(\nabla S_h\) to the tangent space E of the constraint set is 0.
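The orthonormal basis \(b^{[k]}_j\), the matrix \((\gamma ^{[k],i}_j)\) and the projection onto the complement of the polynomials can be sketched numerically. The block below is an illustration under stated assumptions, not the authors' construction: it takes \(n=1\), uses the \(L^2\) inner product \(\int _0^1 f g\,ds\), builds the basis by Gram–Schmidt on the monomials (one possible choice of orthonormal basis), and assumes \(a^{[k]}_j(s)=(1-s)^{k-j-1}/(k-j-1)!\) and \(PV=V-\sum _j\langle b_j,V\rangle b_j\). Polynomials are stored as coefficient lists, lowest degree first.

```python
import math

def inner(p, q):
    """<p, q> = integral_0^1 p(s) q(s) ds for polynomial coefficient lists."""
    return sum(a * b / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def orthonormal_basis(k):
    """Gram-Schmidt on 1, s, ..., s^(k-1) over [0, 1]."""
    basis = []
    for d in range(k):
        p = [0.0] * d + [1.0]                      # monomial s^d
        for b in basis:
            c = inner(p, b)
            padded = b + [0.0] * (len(p) - len(b))
            p = [pi - c * bi for pi, bi in zip(p, padded)]
        norm = math.sqrt(inner(p, p))
        basis.append([pi / norm for pi in p])
    return basis

def project_onto_E(V, k):
    """Remove from the polynomial V its components along b_0, ..., b_{k-1}."""
    out = [float(c) for c in V] + [0.0] * max(0, k - len(V))
    for b in orthonormal_basis(k):
        c = inner(b, V)
        for i, bi in enumerate(b):
            out[i] -= c * bi
    return out

k = 2
b = orthonormal_basis(k)            # b_0 = 1, b_1 = sqrt(3) (2s - 1)
a = [[1.0, -1.0], [1.0]]            # a_0(s) = 1 - s, a_1(s) = 1 (assumed form)
gamma = [[inner(a_j, b_i) for b_i in b] for a_j in a]   # gamma[j][i]: a_j = sum_i gamma[j][i] b_i
PV = project_onto_E([0.0, 0.0, 0.0, 1.0], k)            # V(s) = s^3
```

For \(k=2\) this choice gives the shifted Legendre polynomials, and the projected curve `PV` is orthogonal to every polynomial of degree at most \(k-1\), which is the defining property of the space E.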
2.5.4 Step 4—Existence and Uniqueness for the Regularized Problem:
In order to find solutions to the variational principle 5, we solve
for \(Q^{[k]}_E\), near
This can be solved using the implicit function theorem by requiring that the partial derivative of \(P\nabla S_h(Q^{[k]})\) at the point \(Q^{[k]}=0\) with respect to the space E is a linear isomorphism. The variables \(w^{[0]}, \dots , w^{[k-1]}\), \(q^{[0]}_1, \dots , q^{[k-1]}_1\) and h are seen as parameters that can move in some neighborhood. Note that it is not necessary to solve for \(Q^{[k]}_{E^\perp }\) since it is completely determined by \(w^{[0]}, \dots , w^{[k-1]}\) using the constraint equations in variational principle 5.
In order to compute this partial derivative, take a deformation of \(Q^{[k]}=0\) of the form \(Q^{[k]}_\epsilon =\epsilon \delta Q^{[k]}_E\), where \(\delta Q^{[k]}_E\in E\). Recalling (10) and noting that \(h=0\), we have
Here the inner products vanish because \(\frac{\partial ^2 L}{\partial q^{[k]2}}(\bar{q}^{[0]}, \dots , \bar{q}^{[k-1]},0)\) is a constant matrix (that is, it does not depend on u) and \(\langle \langle b^{[j]},\delta Q^{[k]}_E\rangle \rangle =0\) for \(j=0, \dots , k-1\).
Then the derivative is precisely \(\frac{\partial ^2 L}{\partial q^{[k]2}}(\bar{q}^{[0]}, \dots , \bar{q}^{[k-1]},0)\), seen as a linear map from E into itself, and if L is regular then it is an isomorphism.
By the implicit function theorem, there are neighborhoods \(W_1\subseteq ({\mathbb {R}}^n)^k\times ({\mathbb {R}}^n)^k\times {\mathbb {R}}\) (with variables \((q^{[0]}_1, \dots , q^{[k-1]}_1;w^{[0]}, \dots , w^{[k-1]};h)\)) containing \((\bar{q}^{[0]}\), \(\dots , \bar{q}^{[k-1]};0, \dots , 0;0)\) and \(W_2^{l}\subseteq C^l([0,1],{\mathbb {R}}^n)\) containing the constant curve \(Q^{[k]}(u)=0\), and a \(C^{k}\) map \(\psi :W_1 \rightarrow W_2^{l}\) such that for each \((q^{[0]}_1, \dots , q^{[k-1]}_1;w^{[0]}\), \(\dots , w^{[k-1]};h)\in W_1\), the curve
is the unique critical point in \(W_2^{l}\) of the variational problem 5. Thus, \(\psi \) maps initial conditions, constraint values (which encode the final endpoint conditions for the original problem) and h into \(C^l\) curves.
Let us now consider the cases \(l=0\) and \(l=k\) separately. Taking \(l=k\), \(\psi \) has values in \(W_2^{k}\subseteq C^{k}([0,1],{\mathbb {R}}^n)\). Taking \(l=0\), \(\psi \) has values in \(W_2^{0}\subseteq C^{0}([0,1],{\mathbb {R}}^n)\). However, since \(C^{k}([0,1],{\mathbb {R}}^n)\subset C^{0}([0,1],{\mathbb {R}}^n)\), this \(\psi \) also provides the unique solution among the \(C^0\) curves in a \(C^0\)-open neighborhood of the curve \(u \mapsto 0\), say \(\{Q^{[k]}(u)\,|\,\Vert Q^{[k]}\Vert _0<\epsilon \}\).
2.5.5 Step 5—Reverse of the Regularization:
Let us now reverse the regularization in order to obtain a unique \(C^{2k}\) solution of the variational principle 1. Let \(h\ne 0\). For \((q_1,q_2)=((q_{1}^{[0]},\ldots ,q_{1}^{[k-1]}),(q_{2}^{[0]},\ldots ,q_{2}^{[k-1]}))\in ({\mathbb {R}}^{n})^{k}\times ({\mathbb {R}}^{n})^{k}\) the corresponding values of \(z^{[0]},\ldots ,z^{[k-1]}\) are given by (7) and the values of \(w^{[0]},\ldots ,w^{[k-1]}\) can be computed from (8) using the inverse matrix of \(\left( \gamma _{j}^{[k],i}\right) \). This defines a smooth function \((w^{[0]},\ldots ,w^{[k-1]})=\varpi (q_1,q_2,h)\). Note that the condition that \(q_1\) and \(q_2\) are close translates into the condition that \((w^{[0]},\ldots ,w^{[k-1]})\) is close to 0.
Let \(h>0\) be such that \((\bar{q}^{[0]}, \dots , \bar{q}^{[k-1]};0,\dots ,0;h)\in W_1\). Define
and for each \((q_1,q_2)=\left( (q^{[0]}_1, \dots , q^{[k-1]}_1),(q^{[0]}_2, \dots , q^{[k-1]}_2)\right) \in W_1\) define the curve \(Q^{[0]}_{(q_1,q_2)}(u)\) according to (5) as
Since \(\psi \) takes values in the \(C^k\) curves, \(Q^{[0]}_{(q_1,q_2)}(u)\) is \(C^{2k}\) by the reasoning leading to equation (5).
Now reparameterize with \(t=hu\) to get a \(C^{2k}\) curve
on Q, defined for \(t\in [0,h]\). This curve is the unique solution of the variational principle 1 with endpoint conditions \(q_1\) and \(q_2\).
This solution is \(C^{2k}\), and unique among the curves corresponding to \(Q^{[k]}\) continuous with \(\Vert Q^{[k]}\Vert _0<\epsilon \). These are the \(C^k\) curves q(t) on Q with \(\Vert q^{(k)}\Vert _0<\epsilon /h^k\), which are the \(C^k\) curves in some \(C^k\) neighborhood of the constant curve \(t \mapsto \bar{q}^{[0]}\).
3 The Exact Discrete Lagrangian and Discrete Equations for Second-Order Systems
Next, we will consider second-order Lagrangian systems, motivated by the study of optimal control problems. Let Q be a configuration manifold and let \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) be a regular Lagrangian.
Definition 3.1
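Given a small enough \(h>0\) (see footnote 2), the exact discrete Lagrangian \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\) is defined by
\[L_d^{e}(q_0,\dot{q}_0,q_1,\dot{q}_1)=\int _0^h L\left( q(t),\dot{q}(t),\ddot{q}(t)\right) \,dt,\]
where \(q:[0,h]\rightarrow Q\) is the unique solution of the Euler–Lagrange equations for the second-order Lagrangian L,
\[\frac{d^2}{dt^2}\frac{\partial L}{\partial \ddot{q}}-\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}+\frac{\partial L}{\partial q}=0,\]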
satisfying the boundary conditions \(q(0)=q_0,q(h)=q_1,\dot{q}(0)=\dot{q}_0\) and \(\dot{q}(h)=\dot{q}_1\).
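For the Lagrangian \(L=\frac{1}{2}\ddot{q}^2\) with \(Q={\mathbb {R}}\), the exact discrete Lagrangian can be written in closed form, because the solutions of \(q^{(4)}=0\) are cubics. The sketch below is an illustration and not taken from the text: the function names are hypothetical, and the closed form is derived here by integrating \(\frac{1}{2}\ddot{q}^2\) along the interpolating cubic. Exact rational arithmetic is used so that the two expressions can be compared with equality.

```python
from fractions import Fraction

def cubic_coefficients(q0, v0, q1, v1, h):
    """Return (b, a) for q(t) = q0 + v0*t + b*t**2 + a*t**3 such that
    q(h) = q1 and q'(h) = v1."""
    dq = q1 - q0
    a = (v0 + v1) / h**2 - 2 * dq / h**3
    b = 3 * dq / h**2 - (2 * v0 + v1) / h
    return b, a

def exact_Ld_integral(q0, v0, q1, v1, h):
    """Integral over [0, h] of (1/2) q''(t)^2 with q''(t) = 2b + 6a t."""
    b, a = cubic_coefficients(q0, v0, q1, v1, h)
    return 2 * b**2 * h + 6 * a * b * h**2 + 6 * a**2 * h**3

def exact_Ld_closed_form(q0, v0, q1, v1, h):
    """The same quantity expressed directly in the boundary data."""
    dq = q1 - q0
    return (6 * dq**2 / h**3
            - 6 * dq * (v0 + v1) / h**2
            + 2 * (v0**2 + v0 * v1 + v1**2) / h)

# Boundary data (q0, v0, q1, v1) and step h, chosen arbitrarily.
data = tuple(Fraction(x) for x in (1, 2, 3, -1)) + (Fraction(1, 2),)
```

Calling `exact_Ld_integral(*data)` and `exact_Ld_closed_form(*data)` returns the same rational number, which confirms that the closed form is the action along the interpolating cubic.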
Strictly speaking, the exact discrete Lagrangian is defined not on \(TQ\times TQ\) but on a neighborhood of the diagonal. For the sake of simplicity, we will not make this distinction. Our idea is to take a discrete Lagrangian \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\) as an approximation of \(L_{d}^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\), to construct variational integrators in the same way as in discrete mechanics (see Sect. 4). In other words, for given \(h>0\) we define \(L_d(q_0,v_0,q_1,v_1)\) as an approximation of the action integral along the exact solution curve segment q(t) with boundary conditions \(q(0)=q_0\), \(\dot{q}(0)=v_0\), \(q(h)=q_1\), and \(\dot{q}(h)=v_1\). For example, we can use the formula
\[L_{d}(q_0,v_0,q_1,v_1)=h\,L\left( \kappa (q_0,v_0,q_1,v_1),\chi (q_0,v_0,q_1,v_1),\zeta (q_0,v_0,q_1,v_1)\right) ,\]
where \(\kappa \), \(\chi \) and \(\zeta \) are functions of \((q_0,v_0,q_1,v_1)\in TQ\times TQ\) which approximate the configuration q(t), the velocity \(\dot{q}(t)\) and the acceleration \(\ddot{q}(t)\), respectively, in terms of the initial and final positions and velocities. We can also consider suitable linear combinations of discrete Lagrangians of this type, for instance, weighted averages of the type
or other combinations.
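A discrete Lagrangian of this kind can be assembled from any triple of approximations. The following is a minimal sketch, assuming a one-point quadrature of the form \(h\,L(\kappa ,\chi ,\zeta )\); the particular choices of \(\kappa \), \(\chi \) and \(\zeta \) below (midpoint position, finite differences for velocity and acceleration) are hypothetical and only one of many possibilities.

```python
def make_discrete_lagrangian(L, h, kappa, chi, zeta):
    """Build L_d(q0, v0, q1, v1) = h * L(kappa, chi, zeta) from a continuous
    second-order Lagrangian L(q, dq, ddq) and three approximation maps."""
    def L_d(q0, v0, q1, v1):
        args = (q0, v0, q1, v1)
        return h * L(kappa(*args), chi(*args), zeta(*args))
    return L_d

h = 0.1

# Continuous Lagrangian of the cubic-spline problem.
L = lambda q, dq, ddq: 0.5 * ddq**2

# Hypothetical approximations of position, velocity and acceleration.
kappa = lambda q0, v0, q1, v1: 0.5 * (q0 + q1)
chi = lambda q0, v0, q1, v1: (q1 - q0) / h
zeta = lambda q0, v0, q1, v1: (v1 - v0) / h

L_d = make_discrete_lagrangian(L, h, kappa, chi, zeta)

# Boundary data sampled from q(t) = t^2 on [0, h]; the exact action is 2h.
value = L_d(0.0, 0.0, h**2, 2 * h)
```

For data sampled from a quadratic, the finite-difference acceleration is exact, so this `L_d` reproduces the exact action; for general data it is only an approximation whose order depends on the choices made.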
For completeness, we will derive the discrete equations for the Lagrangian \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\), but these results are a direct translation of Marsden and West (2001) to our case.
Given the grid \(\{t_{k}=kh\mid k=0,\ldots ,N\}\), \(Nh=T\), define the discrete path space \({\mathcal {P}}_{d}(TQ):=\{(q_{d},v_{d}):\{t_{k}\}_{k=0}^{N}\rightarrow TQ\}\). We will identify a discrete trajectory \((q_{d},v_{d})\in {\mathcal {P}}_{d}(TQ)\) with its image \((q_{d},v_{d})=\{(q_{k},v_{k})\}_{k=0}^{N}\) where \((q_{k},v_{k}):=(q_{d}(t_{k}),v_{d}(t_{k}))\). The discrete action \({\mathcal {A}}_{d}:{\mathcal {P}}_{d}(TQ)\rightarrow {\mathbb {R}}\) along this sequence is calculated by summing the discrete Lagrangian evaluated at each pair of adjacent points of the discrete path, that is,
\[{\mathcal {A}}_{d}(q_{d},v_{d}):=\sum _{k=0}^{N-1}L_{d}(q_{k},v_{k},q_{k+1},v_{k+1}).\]
We would like to point out that the discrete path space is isomorphic to the smooth product manifold consisting of \(N+1\) copies of TQ, that the discrete action inherits the smoothness of the discrete Lagrangian, and that the tangent space \(T_{(q_{d},v_{d})}{\mathcal {P}}_{d}(TQ)\) at \((q_{d},v_{d})\) is the set of maps \(a_{(q_{d},v_{d})}:\{t_{k}\}_{k=0}^{N}\rightarrow TTQ\) such that \(\tau _{TQ}\circ a_{(q_{d},v_{d})}=(q_{d},v_{d})\) where \(\tau _{TQ}:TTQ\rightarrow TQ\) is the canonical projection.
Hamilton’s principle seeks discrete curves \(\{(q_{k},v_{k})\}_{k=0}^{N}\) that satisfy
\(\delta {\mathcal {A}}_{d}(q_{d},v_{d})=0\)
for all variations \(\{(\delta q_{k},\delta v_{k})\}_{k=0}^{N}\) vanishing at the endpoints. This is equivalent to the discrete Euler–Lagrange equations
\(D_{3}L_{d}(q_{k-1},v_{k-1},q_{k},v_{k})+D_{1}L_{d}(q_{k},v_{k},q_{k+1},v_{k+1})=0,\)
\(D_{4}L_{d}(q_{k-1},v_{k-1},q_{k},v_{k})+D_{2}L_{d}(q_{k},v_{k},q_{k+1},v_{k+1})=0,\qquad (11)\)
for \(1\le k\le N-1\).
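As a minimal numerical sketch of the discrete variational principle above, the following Python code evaluates a discrete action and the residuals of the discrete Euler–Lagrange equations by central finite differences, for \(Q={\mathbb {R}}\). The function names and the sample discrete Lagrangian `Ld_example` (built from second-order Taylor approximations of the acceleration) are assumptions made for illustration only.

```python
# Sketch for Q = R: discrete action and discrete Euler-Lagrange residuals.
# All names, and the sample discrete Lagrangian, are illustrative assumptions.

def discrete_action(Ld, path, h):
    """Sum Ld over adjacent points of the path [(q_0, v_0), ..., (q_N, v_N)]."""
    return sum(Ld(*path[k], *path[k + 1], h) for k in range(len(path) - 1))


def partial(f, args, i, eps=1e-6):
    """Central finite-difference estimate of the partial derivative in slot i (0-based)."""
    up, dn = list(args), list(args)
    up[i] += eps
    dn[i] -= eps
    return (f(*up) - f(*dn)) / (2.0 * eps)


def del_residual(Ld, prev, cur, nxt, h):
    """Return (D3 Ld(prev, cur) + D1 Ld(cur, nxt), D4 Ld(prev, cur) + D2 Ld(cur, nxt))."""
    f = lambda q0, v0, q1, v1: Ld(q0, v0, q1, v1, h)
    left, right = (*prev, *cur), (*cur, *nxt)
    r_q = partial(f, left, 2) + partial(f, right, 0)
    r_v = partial(f, left, 3) + partial(f, right, 1)
    return r_q, r_v


def Ld_example(q0, v0, q1, v1, h):
    """Assumed discrete Lagrangian for L = (1/2) qddot^2."""
    a0 = 2.0 * (q1 - q0 - h * v0) / h**2  # from q1 ~ q0 + h v0 + (h^2/2) a0
    a1 = 2.0 * (q0 - q1 + h * v1) / h**2  # from q0 ~ q1 - h v1 + (h^2/2) a1
    return 0.25 * h * (a0**2 + a1**2)


h = 0.1
# Samples (q, v) of the parabola q(t) = t^2, a solution of q'''' = 0.
parabola = [((k * h) ** 2, 2.0 * k * h) for k in range(4)]
residuals = del_residual(Ld_example, parabola[0], parabola[1], parabola[2], h)
```

For the sampled parabola both residuals vanish up to rounding, so this discrete path is a critical point of the discrete action for the assumed `Ld_example`.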
Given a solution \(\{q_{k}^{*},v_{k}^{*}\}_{k\in {\mathbb {Z}}}\) of equations (11) and assuming that the \(2n\times 2n\) matrix
is nonsingular, it is possible to define the (local) discrete flow \(F_{L_{d}}:{\mathcal {U}}_{k}\subset TQ\times TQ\rightarrow TQ\times TQ\) mapping \((q_{k-1},v_{k-1},q_{k},v_{k})\) to \((q_{k},v_{k},q_{k+1},v_{k+1})\) from (11), where \({\mathcal {U}}_{k}\) is a neighborhood of the point \((q_{k-1}^{*},v_{k-1}^{*},q_{k}^{*},v_{k}^{*})\). The symplecticity and momentum preservation of the discrete flow are derived in Marsden and West (2001).
Example 3.2
(Cubic splines) Let \(Q={\mathbb {R}}^{n}\) and \(L:T^{(2)}Q\equiv ({\mathbb {R}}^{n})^{3}\rightarrow {\mathbb {R}}\) be the second-order Lagrangian given by \(L(q,\dot{q},\ddot{q})=\frac{1}{2}\ddot{q}^{2}\).
It is well known that the solutions to the corresponding Euler–Lagrange equations \(q^{(4)}=0\) are the so-called cubic splines \(q(t)=at^{3}+bt^{2}+ct+d\), for \(a,b,c,d\in {\mathbb {R}}^{n}\). We define \(L_{d}:({\mathbb {R}}^{n}\times {\mathbb {R}}^{n})\times ({\mathbb {R}}^{n}\times {\mathbb {R}}^{n})\rightarrow {\mathbb {R}}\) as follows. Write
Given sufficiently close \((q_0,v_0),(q_1,v_1)\in TQ\) we can use equations (12) to obtain approximations of the acceleration of the exact solution joining these boundary conditions at time h, which we call
Then we define
Solving the discrete second-order Euler–Lagrange equations for this discrete Lagrangian, the evolution of the discrete trajectory is
In the following section we will continue this example and show some simulations.
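The following Python sketch illustrates how an integrator of this kind can be implemented for \(n=1\). It assumes the acceleration approximations \(a_0=2(q_1-q_0-hv_0)/h^2\) and \(a_1=2(q_0-q_1+hv_1)/h^2\), obtained from second-order Taylor expansions at each endpoint, and the discrete Lagrangian \(L_d=\frac{h}{4}(a_0^2+a_1^2)\). This particular choice and all function names are illustrative assumptions and may differ from the formulas of the example. For this choice, the discrete Euler–Lagrange equations reduce to the explicit update coded in `step`.

```python
# Sketch for Q = R and L = (1/2) qddot^2, under an assumed discrete Lagrangian.

def accel_approx(q0, v0, q1, v1, h):
    """Endpoint accelerations from second-order Taylor expansions (assumed form)."""
    a0 = 2.0 * (q1 - q0 - h * v0) / h**2  # from q1 ~ q0 + h v0 + (h^2/2) a0
    a1 = 2.0 * (q0 - q1 + h * v1) / h**2  # from q0 ~ q1 - h v1 + (h^2/2) a1
    return a0, a1


def discrete_lagrangian(q0, v0, q1, v1, h):
    """Assumed L_d = (h/2) * (L(a0) + L(a1)) with L(a) = a^2 / 2."""
    a0, a1 = accel_approx(q0, v0, q1, v1, h)
    return 0.25 * h * (a0**2 + a1**2)


def step(q_prev, v_prev, q_cur, v_cur, h):
    """Solve the discrete Euler-Lagrange equations for (q_next, v_next).

    With D1 L_d = (a1 - a0)/h, D2 L_d = -a0, D3 L_d = (a0 - a1)/h, D4 L_d = a1,
    the two equations give a0(next pair) = a1(previous pair) and
    a1(next pair) = 2 a1(previous pair) - a0(previous pair).
    """
    a0, a1 = accel_approx(q_prev, v_prev, q_cur, v_cur, h)
    q_next = q_cur + h * v_cur + 0.5 * h**2 * a1
    v_next = v_cur + 0.5 * h * (3.0 * a1 - a0)
    return q_next, v_next


def integrate(q0, v0, q1, v1, h, n_steps):
    """Iterate the two-step map starting from two consecutive points of TQ."""
    traj = [(q0, v0), (q1, v1)]
    for _ in range(n_steps):
        (qp, vp), (qc, vc) = traj[-2], traj[-1]
        traj.append(step(qp, vp, qc, vc, h))
    return traj


h = 0.1
# Start on the parabola q(t) = t^2, for which this sketch is exact.
trajectory = integrate(0.0, 0.0, h**2, 2.0 * h, h, 5)
```

A design remark: the update is explicit because the assumed \(L_d\) is quadratic in its arguments; for a general discrete Lagrangian the same equations would have to be solved with a nonlinear solver at each step.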
3.1 Discrete Legendre Transforms
We define the discrete Legendre transforms \({\mathbb {F}}^{+}L_{d},{\mathbb {F}}^{-}L_{d}:TQ\times TQ\rightarrow T^{*}TQ\), which map the space \(TQ\times TQ\) into \(T^{*}TQ\). These are given by
\({\mathbb {F}}^{+}L_{d}(q_{0},v_{0},q_{1},v_{1})=(q_{1},v_{1},D_{3}L_{d}(q_{0},v_{0},q_{1},v_{1}),D_{4}L_{d}(q_{0},v_{0},q_{1},v_{1})),\)
\({\mathbb {F}}^{-}L_{d}(q_{0},v_{0},q_{1},v_{1})=(q_{0},v_{0},-D_{1}L_{d}(q_{0},v_{0},q_{1},v_{1}),-D_{2}L_{d}(q_{0},v_{0},q_{1},v_{1})).\)
If both discrete fiber derivatives are local diffeomorphisms for nearby \((q_0,v_0)\) and \((q_1,v_1)\), then we say that \(L_d\) is regular.
Using the discrete Legendre transforms the discrete Euler–Lagrange equations (11) can be rewritten as
\({\mathbb {F}}^{+}L_{d}(q_{k-1},v_{k-1},q_{k},v_{k})={\mathbb {F}}^{-}L_{d}(q_{k},v_{k},q_{k+1},v_{k+1}).\)
It will be useful to note that
that is,
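The momentum-matching form of the discrete Euler–Lagrange equations can be checked numerically. The Python sketch below assumes the convention of Marsden and West (2001) translated to \(TQ\times TQ\), namely \({\mathbb {F}}^{+}L_{d}=(q_1,v_1,D_3L_d,D_4L_d)\) and \({\mathbb {F}}^{-}L_{d}=(q_0,v_0,-D_1L_d,-D_2L_d)\), and uses finite differences for the partial derivatives. The sample discrete Lagrangian `Ld`, the step size `H` and the function names are hypothetical choices for illustration.

```python
# Sketch for Q = R: discrete Legendre transforms by central finite differences.

H = 0.1  # time step used by the sample discrete Lagrangian (assumed value)


def Ld(q0, v0, q1, v1):
    """Assumed discrete Lagrangian for L = (1/2) qddot^2."""
    a0 = 2.0 * (q1 - q0 - H * v0) / H**2
    a1 = 2.0 * (q0 - q1 + H * v1) / H**2
    return 0.25 * H * (a0**2 + a1**2)


def grad(f, x, eps=1e-6):
    """Central finite-difference gradient of f at the point x (a tuple)."""
    g = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += eps
        dn[i] -= eps
        g.append((f(*up) - f(*dn)) / (2.0 * eps))
    return g


def legendre_plus(f, q0, v0, q1, v1):
    """F^+ L_d: base point (q1, v1) with momenta (D3 L_d, D4 L_d)."""
    d = grad(f, (q0, v0, q1, v1))
    return (q1, v1, d[2], d[3])


def legendre_minus(f, q0, v0, q1, v1):
    """F^- L_d: base point (q0, v0) with momenta (-D1 L_d, -D2 L_d)."""
    d = grad(f, (q0, v0, q1, v1))
    return (q0, v0, -d[0], -d[1])


# Three consecutive samples (q, v) of q(t) = t^2, which solve the discrete
# Euler-Lagrange equations of the assumed Ld.
p0, p1, p2 = [((k * H) ** 2, 2.0 * k * H) for k in range(3)]
plus = legendre_plus(Ld, *p0, *p1)
minus = legendre_minus(Ld, *p1, *p2)
```

Here `plus` and `minus` agree up to rounding, which is the statement that the momenta computed from the two adjacent pairs match at the common point \((q_1,v_1)\).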
Remark 3.3
It is easy to extend this framework to higher-order mechanical systems. Let \(L:T^{(\ell )}Q\rightarrow {\mathbb {R}}\) be a regular higher-order Lagrangian. Given a small enough \(h>0\), the exact discrete Lagrangian \(L_d^{e}:T^{(\ell -1)}Q\times T^{(\ell -1)}Q\rightarrow {\mathbb {R}}\) is defined by
\(L_{d}^{e}(q_{0}^{(0)},\ldots ,q_{0}^{(\ell -1)},q_{1}^{(0)},\ldots ,q_{1}^{(\ell -1)})=\int _{0}^{h}L(q(t),\dot{q}(t),\ldots ,q^{(\ell )}(t))\,\mathrm{{d}}t,\)
where \(q(t):I\subset {\mathbb {R}}\rightarrow Q\) is the unique solution of the Euler–Lagrange equations for the higher-order Lagrangian L,
\(\sum _{j=0}^{\ell }(-1)^{j}\frac{\mathrm{{d}}^{j}}{\mathrm{{d}}t^{j}}\left( \frac{\partial L}{\partial q^{(j)}}\right) =0,\)
satisfying the boundary conditions \(q(0)=q_0^{(0)},\dot{q}(0)=q_0^{(1)},\ldots ,q^{(\ell -1)}(0)=q_0^{(\ell -1)},q(h)=q_1^{(0)},\dot{q}(h)=q_1^{(1)},\ldots ,q^{(\ell -1)}(h)=q_1^{(\ell -1)}\).
The exact discrete Lagrangian is actually defined on a neighborhood of the diagonal of \(T^{(\ell -1)}Q\times T^{(\ell -1)}Q\). We take \(L_{d}:T^{(\ell -1)}Q\times T^{(\ell -1)}Q\rightarrow {\mathbb {R}}\) to be an approximation of \(L_{d}^{e}\) in order to construct variational integrators for higher-order mechanical systems.
Given a discrete path \(\{(q_k^{(0)},\dots ,q_k^{(\ell -1)})\in T^{(\ell -1)}Q\}|_{k=0}^N\), the corresponding discrete action is defined as
\({\mathcal {A}}_{d}=\sum _{k=0}^{N-1}L_{d}(q_{k}^{(0)},\ldots ,q_{k}^{(\ell -1)},q_{k+1}^{(0)},\ldots ,q_{k+1}^{(\ell -1)}).\)
Hamilton’s principle seeks discrete paths that satisfy \(\delta {\mathcal {A}}_d=0\) for all variations \(\{(\delta q_k^{(0)},\dots ,\delta q_k^{(\ell -1)})|_{k=0}^N\}\) vanishing at the endpoints \(k=0,N\). This is equivalent to the discrete higher-order Euler–Lagrange equations for \(L_{d}\):
\(D_{\ell +i}L_{d}(q_{k-1}^{(0)},\ldots ,q_{k-1}^{(\ell -1)},q_{k}^{(0)},\ldots ,q_{k}^{(\ell -1)})+D_{i}L_{d}(q_{k}^{(0)},\ldots ,q_{k}^{(\ell -1)},q_{k+1}^{(0)},\ldots ,q_{k+1}^{(\ell -1)})=0\)
for \(i=1,\dots ,\ell \) and \(k=1,\dots ,N-1\).
4 Relationship Between Discrete and Continuous Variational Systems
Let \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) be a regular Lagrangian and, for small enough \(h>0\), consider the exact discrete Lagrangian defined before, that is, a function \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\) given by
\(L_{d}^{e}(q_{0},\dot{q}_{0},q_{1},\dot{q}_{1})=\int _{0}^{h}L(q(t),\dot{q}(t),\ddot{q}(t))\,\mathrm{{d}}t,\)
where \(q:[0,h]\rightarrow Q\) is the unique solution of the Euler–Lagrange equations for the second-order Lagrangian L,
\(\frac{\mathrm{{d}}^{2}}{\mathrm{{d}}t^{2}}\left( \frac{\partial L}{\partial \ddot{q}}\right) -\frac{\mathrm{{d}}}{\mathrm{{d}}t}\left( \frac{\partial L}{\partial \dot{q}}\right) +\frac{\partial L}{\partial q}=0,\)
satisfying the boundary conditions \(q(0)=q_0,q(h)=q_1,\dot{q}(0)=\dot{q}_0\) and \(\dot{q}(h)=\dot{q}_1\).
The Legendre transformation associated to L is defined to be the map \({\mathbb {F}}L:T^{(3)}Q\rightarrow T^{*}TQ\) given by (see León and Rodrigues (1985))
We will see that there is a special relationship between the Legendre transform of a regular Lagrangian and the discrete Legendre transforms of the corresponding exact discrete Lagrangian \(L_d^{e}\).
Theorem 4.1
Let \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) be a regular Lagrangian and \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\), the corresponding exact discrete Lagrangian. Then L and \(L_d^{e}\) have Legendre transformations related by
where q(t) is a solution of the second-order Euler–Lagrange equations.
Proof
We begin by computing the derivatives of \(L_d^{e}\).
where we have used integration by parts and the fact that
Therefore,
Since q(t) is a solution of the Euler–Lagrange equations for \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\), the last term is zero. Therefore,
because
On the other hand,
Since q(t) is a solution of the Euler–Lagrange equations, the first term is zero, and using that
we have
Therefore,
With similar arguments, we can also prove that
and
and in consequence,
\(\square \)
In what follows we will study the relation between the regularity of the continuous Lagrangian, given by the Hessian matrix
and the regularity condition corresponding to the exact discrete Lagrangian \(L_d^e:TQ\times TQ\rightarrow {{\mathbb {R}}}\)
For the next theorem, we restrict ourselves to Lagrangians that can be written locally as
where \((g_{ij}(q))\) is a regular matrix for all q. It is also possible to write this condition intrinsically by using a metric, a connection, a one-form and a function. This covers the kind of Lagrangians that appear in interpolation problems (Gay-Balmaz et al. 2012a) and in optimal control problems with cost functionals of the form \(\frac{1}{2}\int _0^T\Vert u\Vert ^2\mathrm{{d}}t\), where u represents the control force applied to a system having a (first-order) Lagrangian of mechanical type (see Sect. 5).
Theorem 4.2
Let \(L:T^{(2)}Q \rightarrow {\mathbb {R}}\) be a regular Lagrangian of the type (16). For small enough \(h>0\), the corresponding exact discrete Lagrangian \(L_d^e:TQ\times TQ\rightarrow {{\mathbb {R}}}\) is also regular.
Proof
We will work locally. Given \(q_0\), \( \dot{q}_0\), \(q_1\), \( \dot{q}_1\), consider the curve q(t) that solves the Euler–Lagrange equations with those boundary values, as in the definition of \(L_d^e\). Using the Taylor expansions for q(t) and \( \dot{q}(t)\), we can write
for \(h\rightarrow 0\). By differentiating these expressions with respect to the parameters \(q_0\) and \(\dot{q}_0\), we get two systems of equations from which we find
Analogously,
Let us compute \(D_{13}L_d^e\). Denote by F the right-hand side of (15), so
Recall that \(q(0),\dot{q}(0),\ddot{q}(0), q^{(3)}(0)\) are obtained as the initial conditions for the higher-order Euler–Lagrange equations that correspond to the boundary conditions \(q(0),\dot{q}(0),q(h),\dot{q}(h)\). We have
Then
In the expression above, the derivatives are evaluated at the arguments corresponding to time 0 for each function. It is important to note that the first factor involves \(\ddot{q}(0)\) and \(q^{(3)}(0)\), which can blow up for \(h \rightarrow 0\), even in the simple case of cubic splines. However, for L of the type (16) we have
These expressions do not contain \(\ddot{q}\) or \(q^{(3)}\), so they are \(O(1)\) for \(h \rightarrow 0\). Therefore,
The remaining derivatives in \({\mathcal {W}}_d\) can be computed without using the special form (16) of the Lagrangian.
Seeing \({\mathcal {W}}_d\) as a block matrix, a well-known result from linear algebra leads us to
That is, for small enough h, if L is regular then \(L_{d}^{e}\) is regular. \(\square \)
In what follows we denote by \((TQ\times TQ)_2\) the subset of \((TQ\times TQ)\times (TQ\times TQ)\) given by
If \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) is a regular Lagrangian then the Euler–Lagrange equations for L give rise to a system of explicit fourth-order differential equations
Therefore, for a given h, it is possible to define the following map (see Agarwal (1986))
which maps \((q(0),\dot{q}(0),\ddot{q}(0),q^{(3)}(0))\in T^{(3)}Q\) into \((q(h),\dot{q}(h),\ddot{q}(h),q^{(3)}(h))\in T^{(3)}Q\). Therefore, from Theorem 4.1 we deduce the commutativity of the diagram in Fig. 1.
Definition 4.3
The discrete Hamiltonian flow \(\widetilde{F}_{L_d}:T^{*}TQ\rightarrow T^{*}TQ\) is defined as
Alternatively, it can also be defined as \(\widetilde{F}_{L_d}={\mathbb {F}}^{+}L_d\circ F_{L_d}\circ ({\mathbb {F}}^{+}L_d)^{-1}\).
Theorem 4.4
The diagram in Fig. 2 is commutative.
Proof
The central triangle is (14). The parallelogram on the left-hand side is commutative by (17), so the triangle on the left is commutative. The triangle on the right is the same as the triangle on the left, with shifted indices. Then the parallelogram on the right-hand side is commutative, which gives the equivalence stated in the definition of the discrete Hamiltonian flow. \(\square \)
Corollary 4.5
The following definitions of the discrete Hamiltonian map are equivalent
and have the coordinate expression \(\widetilde{F}_{L_{d}}:(q_0,\dot{q}_{0},p_0,\tilde{p}_0)\mapsto (q_1,\dot{q}_1,p_1,\tilde{p}_1)\), where we use the notation
Combining Theorem 4.1 with the diagram in Fig. 2 gives the commutative diagram shown in Fig. 3 for the exact discrete Lagrangian.
Here, \(F_{H}^{h}\) denotes the flow of the Hamiltonian vector field \(X_{H}\) associated with the Hamiltonian \(H:T^{*}TQ\rightarrow {\mathbb {R}}\) given by \(H=E_{L}\circ ({\mathbb {F}}L)^{-1}\) where \(E_{L}:T^{(3)}Q\rightarrow {\mathbb {R}}\) denotes the energy function associated to L (see León and Rodrigues 1985).
Theorem 4.6
Under these conditions we have that \(F_H^h = \widetilde{F}_{L_d^e}\).
Example 4.7
(Cubic splines (cont.)) Recall that in this example \(Q={\mathbb {R}}^n\) and \(L= \frac{1}{2}\ddot{q}^2\). Since the exact solutions for the second-order Euler–Lagrange equation for L can be found explicitly, it is easy to show that the discrete exact Lagrangian is
From the corresponding discrete second-order Euler–Lagrange equation, the evolution is
It is interesting to note that both this exact method and method (13) preserve the quantity
A simulation for method (13) is shown in Fig. 4.
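For a scalar configuration (\(n=1\)), the exact discrete Lagrangian of this example can be written down and tested directly. The Python sketch below uses the closed form obtained by integrating \(\frac{1}{2}\ddot{q}^2\) along the cubic Hermite interpolant of the boundary data, \(L_d^e=\frac{6}{h^3}(q_1-q_0)^2-\frac{6}{h^2}(q_1-q_0)(v_0+v_1)+\frac{2}{h}(v_0^2+v_0v_1+v_1^2)\), together with the explicit update that results from solving its discrete Euler–Lagrange equations. Both formulas were derived for this sketch and the function names are hypothetical, so they should be read as an illustration and not as a transcription of the displayed equations of the example.

```python
# Sketch for Q = R and L = (1/2) qddot^2: exact discrete Lagrangian and its flow.

def exact_Ld(q0, v0, q1, v1, h):
    """Action of the cubic joining (q0, v0) at t = 0 to (q1, v1) at t = h."""
    dq = q1 - q0
    return (6.0 * dq**2 / h**3
            - 6.0 * dq * (v0 + v1) / h**2
            + 2.0 * (v0**2 + v0 * v1 + v1**2) / h)


def hermite_accel(q0, v0, q1, v1, h, t):
    """Second derivative at time t of the cubic Hermite interpolant."""
    acc_start = (6.0 * (q1 - q0) - 2.0 * h * (2.0 * v0 + v1)) / h**2
    acc_end = (-6.0 * (q1 - q0) + 2.0 * h * (v0 + 2.0 * v1)) / h**2
    return acc_start + (acc_end - acc_start) * t / h


def action_by_simpson(q0, v0, q1, v1, h):
    """Simpson's rule for the action; exact here since qddot^2 is quadratic in t."""
    f = lambda t: 0.5 * hermite_accel(q0, v0, q1, v1, h, t) ** 2
    return h / 6.0 * (f(0.0) + 4.0 * f(0.5 * h) + f(h))


def exact_step(q_prev, v_prev, q_cur, v_cur, h):
    """Solution (q_next, v_next) of the discrete Euler-Lagrange equations of exact_Ld."""
    q_next = 5.0 * q_prev - 4.0 * q_cur + 2.0 * h * v_prev + 4.0 * h * v_cur
    v_next = 5.0 * v_prev + 8.0 * v_cur - 12.0 * (q_cur - q_prev) / h
    return q_next, v_next


h = 0.1
cubic = lambda t: t**3 - 2.0 * t + 1.0      # a sample solution of q'''' = 0
cubic_dot = lambda t: 3.0 * t**2 - 2.0      # its velocity
q2, v2 = exact_step(cubic(0.0), cubic_dot(0.0), cubic(h), cubic_dot(h), h)
```

The pair `(q2, v2)` coincides, up to rounding, with the values of the sample cubic and its derivative at \(t=2h\), in agreement with the fact that the exact discrete Lagrangian reproduces the continuous flow.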
4.1 Variational Error Analysis
Now we rewrite the result of Patrick (2006) and Marsden and West (2001) for the particular case of a Lagrangian \(L_d:TQ\times TQ\rightarrow {\mathbb {R}}\).
Definition 4.8
Let \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\) be a discrete Lagrangian. We say that \(L_{d}\) is a discretization of order r if there exist an open subset \(U_{1}\subset T^{(2)}Q\) with compact closure and constants \(C_1>0\), \(h_1>0\) so that
for all solutions q(t) of the second-order Euler–Lagrange equations with initial conditions \((q_0,\dot{q}_0,\ddot{q}_0)\in U_1\) and for all \(h\le h_1\).
Following Marsden and West (2001) and Patrick and Cuell (2009), we have the next result about the order of our variational integrator.
Theorem 4.9
If \(\widetilde{F}_{L_d}\) is the evolution map of an order r discretization \(L_d:TQ\times TQ\rightarrow {\mathbb {R}}\) of the exact discrete Lagrangian \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\), then
In other words, \(\widetilde{F}_{L_d}\) gives an integrator of order r for \(\widetilde{F}_{L_{d}^{e}}=F_{H}^{h}\).
Note that given a discrete Lagrangian \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\) its order can be calculated by expanding the expressions for \(L_d(q(0),\dot{q}(0),q(h),\dot{q}(h), h)\) in a Taylor series in h and comparing this to the same expansions for the exact Lagrangian. If the series agree up to r terms, then the discrete Lagrangian is of order r.
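The order can also be estimated numerically when the exact discrete Lagrangian is available. The Python sketch below does this for \(L=\frac{1}{2}\ddot{q}^2\) with \(Q={\mathbb {R}}\), assuming the usual convention that an order r discretization satisfies \(|L_d-L_d^e|\le Ch^{r+1}\) along exact solutions. The approximate discrete Lagrangian `approx_Ld`, the sample cubic and all names are assumptions chosen for illustration; the closed form in `exact_Ld` is the action of the cubic Hermite interpolant.

```python
import math

# Sketch: observed order of an assumed discrete Lagrangian for L = (1/2) qddot^2.

def exact_Ld(q0, v0, q1, v1, h):
    """Action of the cubic joining (q0, v0) at t = 0 to (q1, v1) at t = h."""
    dq = q1 - q0
    return (6.0 * dq**2 / h**3
            - 6.0 * dq * (v0 + v1) / h**2
            + 2.0 * (v0**2 + v0 * v1 + v1**2) / h)


def approx_Ld(q0, v0, q1, v1, h):
    """Assumed discretization based on Taylor approximations of the acceleration."""
    a0 = 2.0 * (q1 - q0 - h * v0) / h**2
    a1 = 2.0 * (q0 - q1 + h * v1) / h**2
    return 0.25 * h * (a0**2 + a1**2)


def sample_solution(t):
    """A cubic solution of q'''' = 0, returned as (q(t), qdot(t))."""
    return t**3 + 0.5 * t**2 - t, 3.0 * t**2 + t - 1.0


def local_error(h):
    """|L_d - L_d^e| evaluated on the exact solution between t = 0 and t = h."""
    q0, v0 = sample_solution(0.0)
    q1, v1 = sample_solution(h)
    return abs(approx_Ld(q0, v0, q1, v1, h) - exact_Ld(q0, v0, q1, v1, h))


def observed_exponent(h):
    """Estimate p in error ~ C h^p by halving the step."""
    return math.log(local_error(h) / local_error(0.5 * h)) / math.log(2.0)


exponent = observed_exponent(0.1)   # the order is then r = exponent - 1
```

For this sample the error behaves like \(h^{3}\), so under the stated convention the assumed `approx_Ld` would be a discretization of order 2.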
5 Application to Optimal Control of Mechanical Systems
In this section we will study how to apply our variational integrator to optimal control problems. We will study optimal control problems for fully actuated mechanical systems, and we will show how our methods can be applied to the optimal control of a robotic leg.
In the following we will assume that all the control systems are controllable, that is, for any two points \(q_0\) and \(q_f\) in the configuration space Q, there exists an admissible control u(t) defined on some interval [0, T] such that the system with initial condition \(q_0\) reaches the point \(q_f\) at time T (see Bloch 2003; Bullo and Lewis 2005 for example).
5.1 Optimal Control of Fully Actuated Systems
Let \(L:TQ\rightarrow {\mathbb {R}}\) be a regular Lagrangian and take local coordinates \((q^{A})\) on Q where \(1\le A\le n\). For this Lagrangian the controlled Euler–Lagrange equations are
\(\frac{\mathrm{{d}}}{\mathrm{{d}}t}\left( \frac{\partial L}{\partial \dot{q}^{A}}\right) -\frac{\partial L}{\partial q^{A}}=u_{A},\qquad (18)\)
where \(u=(u_{A})\in U\) and \(U\subset {\mathbb {R}}^{n}\), the set of control parameters, is an open subset of \({\mathbb {R}}^n\).
The optimal control problem consists in finding a trajectory of the state variables and control inputs \((q^{A}(t),u_{A}(t))\) satisfying (18), given initial and final conditions \((q^{A}(t_0),\dot{q}^{A}(t_0))\) and \((q^{A}(t_f),\dot{q}^{A}(t_f))\), respectively, and minimizing the cost function
\({\mathcal {J}}=\int _{t_0}^{t_f}C(q^{A},\dot{q}^{A},u_{A})\,\mathrm{{d}}t,\)
where \(C:TQ\times U\rightarrow {\mathbb {R}}\).
From (18) we can rewrite the cost function as a second-order Lagrangian \(\widetilde{L}:T^{(2)}Q\rightarrow {\mathbb {R}}\) given by
\(\widetilde{L}(q^{A},\dot{q}^{A},\ddot{q}^{A})=C\left( q^{A},\dot{q}^{A},\frac{\mathrm{{d}}}{\mathrm{{d}}t}\left( \frac{\partial L}{\partial \dot{q}^{A}}\right) -\frac{\partial L}{\partial q^{A}}\right) ,\)
replacing the controls by the Euler–Lagrange equations in the cost function (see Bloch 2003 for example).
Suppose that \(Q={\mathbb {R}}^n\). Then we can define a discretization of the Lagrangian \(\widetilde{L}:T^{(2)}Q\rightarrow {\mathbb {R}}\) by a discrete Lagrangian \(\widetilde{L}_d:TQ\times TQ\rightarrow {\mathbb {R}}\),
In the first term, we have computed an approximate value of the acceleration \(a_k\) by using the Taylor expansion \(q_{k+1}\approx q_k+hv_k+\frac{h^2}{2}a_k\). For the second term, we have approximated \(a_{k+1}\) using \(q_{k}\approx q_{k+1}-hv_{k+1}+\frac{h^2}{2} a_{k+1}\), as in Example 3.2.
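To make the construction concrete, the following Python sketch builds such a discrete Lagrangian for a hypothetical one-degree-of-freedom system with \(L=\frac{1}{2}m\dot{q}^2-V(q)\) and cost \(C=\frac{1}{2}u^2\), so that \(u=m\ddot{q}+V'(q)\) and \(\widetilde{L}=\frac{1}{2}(m\ddot{q}+V'(q))^2\). The two acceleration approximations are the ones described above; the equal weights \(h/2\) on the two terms, the pendulum potential used in the usage lines and all function names are assumptions made for illustration.

```python
import math

# Sketch: discrete cost Lagrangian for a fully actuated 1-DOF system (assumed setup).

def make_cost_lagrangian(m, dV):
    """Return L~(q, v, a) = (1/2) u^2 with u = m a + V'(q) from the controlled equations."""
    def L_tilde(q, v, a):
        u = m * a + dV(q)       # control needed to produce acceleration a at q
        return 0.5 * u * u      # v is unused for this velocity-independent cost
    return L_tilde


def discrete_cost_lagrangian(L_tilde, q0, v0, q1, v1, h):
    """(h/2) [L~(q0, v0, a0) + L~(q1, v1, a1)] with Taylor-based accelerations."""
    a0 = 2.0 * (q1 - q0 - h * v0) / h**2   # from q1 ~ q0 + h v0 + (h^2/2) a0
    a1 = 2.0 * (q0 - q1 + h * v1) / h**2   # from q0 ~ q1 - h v1 + (h^2/2) a1
    return 0.5 * h * (L_tilde(q0, v0, a0) + L_tilde(q1, v1, a1))


# Usage with a unit pendulum, V(q) = -cos(q), so V'(q) = sin(q).
pendulum_cost = make_cost_lagrangian(1.0, math.sin)
h = 0.1
cost_at_rest = discrete_cost_lagrangian(pendulum_cost, 0.0, 0.0, 0.0, 0.0, h)
cost_holding = discrete_cost_lagrangian(pendulum_cost, math.pi / 2, 0.0, math.pi / 2, 0.0, h)
```

Resting at the stable equilibrium costs nothing, while holding the pendulum horizontally requires \(u=1\) and gives a discrete cost of \(h/2\) per step, as expected from \(\int _0^h\frac{1}{2}u^2\,\mathrm{{d}}t\).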
Other natural possibilities for \(\widetilde{L}_d\) are, for instance,
or
Applying the results given in Sect. 3, we know that the minimizers of the cost function are obtained by solving the discrete second-order Euler–Lagrange equations
If the matrix
is regular, then one can define the discrete Lagrangian map to solve the optimal control problem.
Example 5.1
(Two-link manipulator) We consider the optimal control of a two-link manipulator which is a classical example studied in robotics (see, e.g., Murray et al. 1994 and Ober-Blöbaum et al. 2011). The two-link manipulator consists of two coupled (planar) rigid bodies with mass \(m_i\), length \(l_i\) and moments of inertia with respect to the joints \(J_i\), with \(i = 1, 2\), respectively.
Let \(\theta _1\) and \(\theta _2\) be the configuration angles measured as in Fig. 5. If we assume one end of the first link to be fixed in an inertial reference frame, the configuration of the system is locally specified by the coordinates \((\theta _1, \theta _2)\in {\mathbb {S}}^{1}\times {\mathbb {S}}^{1}\). The Lagrangian is given by the kinetic energy of the system minus the potential energy, that is,
where g is the constant gravitational acceleration.
Control torques \(u_{1}\) and \(u_{2}\) are applied at the base of the first link and at the joint between the two links. The equations of motion of the controlled system are
We look for trajectories \((\theta _{1}(t),\theta _{2}(t), u(t))\) of the state variables and control inputs for given initial and final conditions, that is, for given values of \((\theta _{1}(0),\theta _{2}(0), \dot{\theta }_1(0), \dot{\theta }_2(0))\) and \((\theta _1(T), \theta _2(T), \dot{\theta }_1(T), \dot{\theta }_2(T))\), and minimizing the cost functional
We construct the discrete Lagrangian \(\widetilde{L}_d:T({\mathbb {S}}^{1}\times {\mathbb {S}}^{1})\times T({\mathbb {S}}^{1}\times {\mathbb {S}}^{1})\rightarrow {\mathbb {R}}\), discretizing the Lagrangian \(\displaystyle {\widetilde{L}:T^{(2)}({\mathbb {S}}^{1}\times {\mathbb {S}}^{1})\rightarrow {\mathbb {R}}}\) given by
taking the same discretization as in equation (12) to approximate the acceleration and taking midpoint averages to approximate the position and velocity.
Figures 6 and 7 show the results from a numerical simulation of the method, taking the system from the stable mechanical equilibrium \((\theta _{1}(0),\theta _{2}(0), \dot{\theta }_1(0), \dot{\theta }_2(0))=(-\pi /2,0,0,0)\) to the unstable equilibrium \((\theta _{1}(T),\theta _{2}(T), \dot{\theta }_1(T), \dot{\theta }_2(T))=(\pi /2,0,0,0)\). We have used \(T=10\), \(N=1000\), \(m_1=0.375\), \(m_2=0.25\), \(l_1=1.5\), \(l_2=1\), \(J_1=\frac{m_1l_1^2}{3}\), \(J_2=\frac{m_2l_2^2}{3}\), and \(g=9.8\). A video of the simulation is available at www.youtube.com/watch?v=ZUUH0596a30. The algorithm generates a sequence of velocities as well as positions, but we represent only the positions in the figures.
We have also considered a different setting where the angle \(\theta _2\) is restricted to move between 0 and 170 degrees, inspired by an elbow joint. This range of motion is enforced by adding a continuous, piecewise linear function \(V(\theta _2)\) to the cost function, with slope \(-1000\) for \(\theta _2<0^\circ \), 0 for \(0^\circ<\theta _2<170^\circ \), and 1000 for \(\theta _2>170^\circ \). We simulated the optimal trajectory with the same endpoint conditions and physical parameters as above, with \(N=200\). A video of the resulting motion can be found at www.youtube.com/watch?v=OxOFHdT7emQ.
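A penalty of this kind can be sketched in a few lines of Python. Continuity fixes \(V\) only up to an additive constant, so the sketch assumes \(V=0\) inside the allowed range; it also assumes that angles are measured in radians and that the slopes are per radian, which the description above does not specify. The function name and its parameters are hypothetical.

```python
import math

# Sketch of a continuous, piecewise linear joint-limit penalty (assumed units: radians).

def elbow_penalty(theta2, slope=1000.0, lower=0.0, upper=math.radians(170.0)):
    """Zero inside [lower, upper]; grows linearly with the given slope outside."""
    if theta2 < lower:
        return slope * (lower - theta2)   # slope -1000 with respect to theta2
    if theta2 > upper:
        return slope * (theta2 - upper)   # slope +1000 with respect to theta2
    return 0.0


inside = elbow_penalty(1.0)                            # within the allowed range
below = elbow_penalty(-0.01)                           # slightly hyperextended
above = elbow_penalty(math.radians(170.0) + 0.002)     # slightly over-flexed
```

Since the penalty is only piecewise smooth, a solver based on derivatives of the discrete Lagrangian may need some care at the two kinks.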
6 Conclusions and Future Research
In this paper we design variational integrators for higher-order variational systems and apply them to optimal control problems. The general idea for those variational integrators is to directly discretize Hamilton’s principle rather than the equations of motion in a way that preserves the original system invariants, notably the symplectic form and, via a discrete version of Noether’s theorem, the momentum map.
We show that a regular higher-order Lagrangian system has a unique solution for given nearby endpoint conditions. The proof of existence and uniqueness for the local boundary value problem is direct and variational, uses a regularization procedure, and assumes only \(C^k\) differentiability (instead of \(C^{2k}\) as in standard ODE theory).
We have seen that a discrete Lagrangian function \(L_d:T^{(k-1)}Q\times T^{(k-1)}Q \rightarrow {{\mathbb {R}}}\) is to be understood as an approximation of the action \( \int ^h_0 L(q, \dot{q}, \ldots , q^{(k)})\, \mathrm{{d}}t\). Moreover, we derive a particular choice of discrete Lagrangian, the exact discrete Lagrangian, which gives an exact correspondence between discrete and continuous systems. We show that if the original Lagrangian is regular and of the type (16), then the exact discrete Lagrangian is also regular for small enough h, and we establish the relation between the discrete Legendre transformations and the continuous one.
As future research, we are interested in the construction of an exact discrete Lagrangian function for higher-order mechanical systems subject to higher-order constraints. The main point will be to show the existence and uniqueness of solutions for the boundary value problem for higher-order systems subject to higher-order constraints. Afterwards, one could define the exact discrete Lagrangian for constrained systems in a fashion similar to the one shown in this work. Since optimal control problems for the class of underactuated mechanical systems can be seen as constrained higher-order variational problems, the extension of the constructions given in this work can be useful for new developments in the field of geometric integration for optimal control problems. The case of optimal control of nonholonomic systems will also be developed.
Notes
For \(k=1\), recall writing \(\delta \dot{q}=\dot{(\delta q)}\) when deriving the Euler–Lagrange equations, assuming that q is \(C^2\).
By this we mean, from now on, that there exists \(h_0>0\) such that for all \(h\in (0,h_0)\) the definition or proof holds.
References
Abraham, R., Marsden, J.E., Ratiu, T.: Manifolds, Tensor Analysis, and Applications, vol. 75 of Applied Mathematical Sciences, 2nd edn. Springer, New York (1988). doi:10.1007/978-1-4612-1029-0
Agarwal, R.P.: Boundary Value Problems for Higher Order Differential Equations. World Scientific Publishing Co., Inc., Teaneck (1986). doi:10.1142/0266
Benito, R., de León, M., Martín de Diego, D.: Higher-order discrete Lagrangian mechanics. Int. J. Geom. Methods Mod. Phys. 3, 421–436 (2006). doi:10.1142/S0219887806001235
Bloch, A.M.: Nonholonomic Mechanics and Control, vol. 24 of Interdisciplinary Applied Mathematics. Springer, New York (2003). doi:10.1007/b97376. With the collaboration of J. Baillieul, P. Crouch and J. Marsden, With scientific input from P. S. Krishnaprasad, R. M. Murray and D. Zenkov, Systems and Control
Bloch, A.M., Hussein, I.I., Leok, M., Sanyal, A.K.: Geometric structure-preserving optimal control of a rigid body. J. Dyn. Control Syst. 15, 307–330 (2009). doi:10.1007/s10883-009-9071-2
Bullo, F., Lewis, A.D.: Geometric Control of Mechanical Systems, vol. 49 of Texts in Applied Mathematics. Springer, New York (2005). doi:10.1007/978-1-4899-7276-7. Modeling, analysis, and design for simple mechanical control systems
Burnett, C.L., Holm, D.D., Meier, D.M.: Inexact trajectory planning and inverse problems in the Hamilton-Pontryagin framework. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 469, 20130249 (2013). doi:10.1098/rspa.2013.0249
Buttazzo, G., Giaquinta, M., Hildebrandt, S.: One-Dimensional Variational Problems, an Introduction, vol. 15 of Oxford Lecture Series in Mathematics and its Applications. The Clarendon Press, Oxford University Press, New York (1998)
Colombo, L., Martín de Diego, D., Zuccalli, M.: On variational integrators for optimal control of mechanical control systems. Rev. R. Acad. Cienc. Exactas Fís. Nat. Ser. A Math. RACSAM 106, 161–171 (2012). doi:10.1007/s13398-011-0032-8
Colombo, L., Martín de Diego, D., Zuccalli, M.: Higher-order discrete variational problems with constraints. J. Math. Phys. 54, 093507 (2013). doi:10.1063/1.4820817
Colombo, L., Martín de Diego, D.: Higher-order variational problems on Lie groups and optimal control applications. J. Geom. Mech. 6, 451–478 (2014). doi:10.3934/jgm.2014.6.451
Crampin, M., Sarlet, W., Cantrijn, F.: Higher-order differential equations and higher-order Lagrangian mechanics. Math. Proc. Cambridge Philos. Soc. 99, 565–587 (1986). doi:10.1017/S0305004100064501
Crouch, P., Silva Leite, F.: The dynamic interpolation problem: on Riemannian manifolds, Lie groups, and symmetric spaces. J. Dynam. Control Syst. 1, 177–202 (1995). doi:10.1007/BF02254638
de León, M., Rodrigues, P.R.: Generalized Classical Mechanics and Field Theory, vol. 112 of North-Holland Mathematics Studies. North-Holland Publishing Co., Amsterdam (1985). A geometrical approach of Lagrangian and Hamiltonian formalisms involving higher order derivatives, Notes on Pure Mathematics, 102
Eldering, J.: Persistence of noncompact normally hyperbolic invariant manifolds in bounded geometry, PhD thesis, Universiteit Utrecht (2012)
Gay-Balmaz, F., Holm, D.D., Ratiu, T.S.: Higher order Lagrange-Poincaré and Hamilton-Poincaré reductions. Bull. Braz. Math. Soc. (N.S.) 42, 579–606 (2011). doi:10.1007/s00574-011-0030-7
Gay-Balmaz, F., Holm, D.D., Meier, D.M., Ratiu, T.S., Vialard, F.-X.: Invariant higher-order variational problems. Comm. Math. Phys. 309, 413–458 (2012). doi:10.1007/s00220-011-1313-y
Gay-Balmaz, F., Holm, D.D., Meier, D.M., Ratiu, T.S., Vialard, F.-X.: Invariant higher-order variational problems II. J. Nonlinear Sci. 22, 553–597 (2012). doi:10.1007/s00332-012-9137-2
Giaquinta, M., Hildebrandt, S.: Calculus of Variations I, vol. 310 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Berlin (1996)
Hussein, I.I., Bloch, A.M.: Dynamic interpolation on Riemannian manifolds: an application to interferometric imaging, In: Proceedings of the 2004 American control conference, pp. 685–690 (2004)
Jordan, B.W., Polak, E.: Theory of a class of discrete optimal control systems. J. Electron. Control (1) 17, 697–711 (1964)
Lee, T., Leok, M., McClamroch, N.H.: Optimal attitude control of a rigid body using geometrically exact computations on \({\rm SO}(3)\). J. Dyn. Control Syst. 14, 465–487 (2008). doi:10.1007/s10883-008-9047-7
Leok, M., Shingel, T.: Prolongation-collocation variational integrators. IMA J. Numer. Anal. 32, 1194–1216 (2012). doi:10.1093/imanum/drr042
Machado, L., Silva Leite, F., Krakowski, K.: Higher-order smoothing splines versus least squares problems on Riemannian manifolds. J. Dyn. Control Syst. 16, 121–148 (2010). doi:10.1007/s10883-010-9080-1
Marsden, J.E., West, M.: Discrete mechanics and variational integrators. Acta Numer. 10, 357–514 (2001). doi:10.1017/S096249290100006X
Murray, R.M., Li, Z.X., Sastry, S.S.: A mathematical introduction to robotic manipulation. CRC Press, Boca Raton (1994)
Noakes, L., Heinzinger, G., Paden, B.: Cubic splines on curved spaces. IMA J. Math. Control Inform. 6, 465–473 (1989). doi:10.1093/imamci/6.4.465
Ober-Blöbaum, S., Junge, O., Marsden, J.E.: Discrete mechanics and optimal control: an analysis. ESAIM Control Optim. Calc. Var. 17, 322–352 (2011). doi:10.1051/cocv/2010012
Patrick, G.W.: Lagrangian mechanics without ordinary differential equations. Rep. Math. Phys. 57, 437–443 (2006). doi:10.1016/S0034-4877(06)80030-3
Patrick, G.W., Cuell, C.: Error analysis of variational integrators of unconstrained Lagrangian systems. Numer. Math. 113, 243–264 (2009). doi:10.1007/s00211-009-0245-3
Veselov, A.P.: Integrable systems with discrete time, and difference operators. Funktsional. Anal. i Prilozhen. 22, 1–13 (1988). doi:10.1007/BF01077598
Wendlandt, J.M., Marsden, J.E.: Mechanical integrators derived from a discrete variational principle. Phys. D 106, 223–246 (1997). doi:10.1016/S0167-2789(97)00051-1
Acknowledgments
This work has been supported by MICINN (Spain) Grant MTM 2013-42870-P, ICMAT Severo Ochoa Project SEV-2011-0087 and IRSES-project “Geomech-246981.” The research of S. Ferraro has been supported by CONICET Argentina (PIP 2013-2015 GI 11220120100532CO), ANPCyT Argentina (PICT 2013-1302) and SGCyT UNS. We would like to thank the referee for the helpful comments.
Additional information
Communicated by Anthony Bloch.
Appendix: A Technical Result for Sect. 2
Let E be the kernel of g, where \(g=(g_0,\ldots ,g_{k-1}):C^{l}([0,1],{\mathbb {R}}^{n})\rightarrow ({\mathbb {R}}^{n})^{k}\) and \(g_{j}[\cdot ]=\langle \langle b_{j}^{[k]},\cdot \rangle \rangle \). In the context of Sect. 2.5, E is the tangent space of the constraint set defined using the linear constraints \(g_{j}\), and l is either 0 or k.
In this Appendix we show that the orthogonal complement of E is the space F of \({\mathbb {R}}^{n}\)-valued polynomials of degree at most \(k-1\),
where \(b_j^{[k]}\), \(j=0,\ldots ,k-1\), is a basis of the space of real-valued polynomials of degree at most \(k-1\) consisting of orthonormal polynomials on [0, 1].
Lemma 6.1
\(F=E^{\perp }\), where the orthogonal complement is taken with respect to the inner product \(\llbracket ,\rrbracket \) in \(C^{l}([0,1],{\mathbb {R}}^{n})\).
Proof
We will prove that E and F are orthogonal (with zero intersection) and that their sum is the whole space \(C^{l}([0,1],{\mathbb {R}}^{n})\).
Let \(e\in E\) and \(c^{j}b_{j}^{[k]}\in F\).
since \(e\in E={\text {Ker}}g\).
The fact that \(E\cap F=\{0\}\) can be obtained either by using that the inner product is nondegenerate or directly as follows. Take \(e\in E\cap F\), so \(e=c^{j}b_{j}^{[k]}\). For all \(j'\), we have \(0=g_{j'}[e]=\langle \langle b_{j'}^{[k]},c^{j}b_{j}^{[k]}\rangle \rangle =c^{j'}\), which means that \(e=0\).
Finally, take \(e\in C^{l}([0,1],{\mathbb {R}}^{n})\). Write
The third term is in F. The remaining part of the right-hand side is in E since for all \(j'\),
Therefore, \(C^{l}([0,1],{\mathbb {R}}^{n})=E+F\). From the first part of the proof, we obtain that there is an orthogonal decomposition \(C^{l}([0,1],{\mathbb {R}}^{n})=E\oplus F\). \(\square \)
Colombo, L., Ferraro, S. & Martín de Diego, D. Geometric Integrators for Higher-Order Variational Systems and Their Application to Optimal Control. J Nonlinear Sci 26, 1615–1650 (2016). https://doi.org/10.1007/s00332-016-9314-9
DOI: https://doi.org/10.1007/s00332-016-9314-9