1 Introduction

This paper is concerned with the design of geometric integrators for higher-order variational systems. The study of higher-order variational systems has long attracted attention from both the applied and the theoretical points of view (see León and Rodrigues 1985 and references therein). Recently, there has been renewed interest in these systems due to new and relevant applications in optimal control for robotics or aeronautics, or the study of air traffic control and computational anatomy (Burnett et al. 2013; Colombo and Martín de Diego 2014; Crouch and Silva Leite 1995; Gay-Balmaz et al. 2012a, b, 2011; Hussein and Bloch 2004; Machado et al. 2010; Noakes et al. 1989).

A continuous higher-order system is modeled by a Lagrangian on a higher-order tangent bundle \(T^{(k)}Q\), that is, a function \(L:T^{(k)}Q \rightarrow {{\mathbb {R}}}\). The corresponding Euler–Lagrange equations are a system of implicit differential equations of order 2k. For most of these Lagrangian systems, explicit integration is too complicated to carry out directly, or is generically not possible at all. In these cases, it is necessary to discretize the equations, taking approximations at several points in time over the interval of integration.

Among the different numerical integrators that one can derive for continuous higher-order systems, one of the most successful ideas is to discretize first the variational principle (instead of the equations of motion) and to derive the numerical method applying discrete calculus of variations (Marsden and West 2001; Veselov 1988; Wendlandt and Marsden 1997). The advantage of this procedure is that some of the geometric structures involved, such as the symplectic form or the momentum, are automatically preserved and, moreover, the associated energy exhibits good behavior. These methods have their roots in the optimal control literature of the 1960s (Jordan and Polak 1964).

In previous approaches (see, e.g., Benito et al. 2006; Colombo et al. 2012, 2013), the theory of discrete variational mechanics for higher-order systems was derived using a discrete Lagrangian \(L_d:Q^{k+1}\rightarrow {\mathbb {R}}\) where \(Q^{k+1}\) is the cartesian product of \(k+1\) copies of the configuration manifold Q. There, \(k+1\) points are used to approximate the positions and the higher-order velocities (such as the standard velocities, accelerations, jerks) and to represent in this way elements of the higher-order tangent bundle \(T^{(k)}Q\).

We will see in this paper that the most natural approach is to take a discrete Lagrangian \(L_d:T^{(k-1)}Q\times T^{(k-1)}Q \rightarrow {{\mathbb {R}}}\) since actually the discrete variational calculus is not based on the discretization of the Lagrangian itself, but on the discretization of the associated action. We will see that a suitable approximation of the action

$$\begin{aligned} \int ^h_0 L(q, \dot{q}, \ldots , q^{(k)})\; \mathrm{{d}}t \end{aligned}$$

is given by a Lagrangian of the form \(L_d:T^{(k-1)}Q\times T^{(k-1)}Q \rightarrow {{\mathbb {R}}}\). Moreover, we will derive a particular choice of discrete Lagrangian which gives an exact correspondence between discrete and continuous systems, the exact discrete Lagrangian. For instance, if we take the Lagrangian \(L(q, \dot{q}, \ddot{q})=\frac{1}{2}\ddot{q}^2\), the corresponding exact discrete Lagrangian \(L_{d}^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\) is

$$\begin{aligned} L^e_d(q_0, v_0, q_h, v_h)&=\int _0^hL(q(t), \dot{q}(t), \ddot{q}(t))\; \mathrm{{d}}t\\&=\frac{6}{h^3}(q_0-q_h)^2+\frac{6}{h^2}(q_0-q_h)(v_0+v_h)+\frac{2}{h}\left( v_0^2+v_0v_h+v_h^2\right) \end{aligned}$$

where q(t) is the unique solution of the Euler–Lagrange equations for L verifying \(q(0)=q_0\), \(\dot{q}(0)=v_0\), \(q(h)=q_h\), \(\dot{q}(h)=v_h\) for h small enough (see Sect. 2).
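As a quick sanity check of the closed-form expression above, the following Python sketch compares it with the action integral evaluated along the cubic polynomial matching the four boundary values. For this particular Lagrangian the Euler–Lagrange equation reduces to \(q^{(4)}=0\), so the extremal is a cubic. The function names and the use of exact rational arithmetic are illustrative choices, not part of the original construction.

```python
from fractions import Fraction

def exact_discrete_lagrangian(q0, v0, qh, vh, h):
    """Closed form of L_d^e for L = (1/2) * qddot**2, as quoted in the text."""
    return (6 * (q0 - qh) ** 2 / h**3
            + 6 * (q0 - qh) * (v0 + vh) / h**2
            + 2 * (v0**2 + v0 * vh + vh**2) / h)

def action_along_cubic(q0, v0, qh, vh, h):
    """Integrate (1/2) * qddot**2 over [0, h] along the interpolating cubic."""
    delta = qh - q0
    # Cubic q(t) = q0 + v0*t + c*t**2 + d*t**3 with q(h) = qh and qdot(h) = vh.
    c = (3 * delta - h * (2 * v0 + vh)) / h**2
    d = (-2 * delta + h * (v0 + vh)) / h**3
    # qddot(t) = 2c + 6dt, so the integral of (1/2) * qddot**2 is explicit.
    return 2 * c**2 * h + 6 * c * d * h**2 + 6 * d**2 * h**3

# Arbitrary sample data (q0, v0, qh, vh, h), chosen only for illustration.
data = [Fraction(x) for x in (1, 2, 3, -1, 2)]
print(exact_discrete_lagrangian(*data), action_along_cubic(*data))
```

Both functions should return the same rational number for any choice of boundary data with \(h\ne 0\).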

Observe from the previous example that the theory of variational integrators for higher-order systems now becomes even simpler, since it fits directly into the standard discrete mechanics theory for a discrete Lagrangian of the form \(L_d:M\times M\rightarrow {{\mathbb {R}}}\) where \(M=T^{(k-1)}Q\). We will show that if the original Lagrangian is regular then so is the exact discrete Lagrangian, in the sense of Marsden and West (2001). Moreover, in the corresponding applications, for instance in optimal control theory or spline theory, we typically deal with initial and final boundary conditions which are not necessarily discretized, in contrast to previously proposed methods (Bloch et al. 2009; Lee et al. 2008; Leok and Shingel 2012).

The paper is structured as follows. In Sect. 2, we show that a regular higher-order Lagrangian system has a unique solution for given nearby endpoint conditions using a direct variational proof of existence and uniqueness of the local boundary value problem, which employs a regularization procedure. In Sect. 3 we introduce the notion of exact discrete Lagrangian for higher-order systems and we design the construction of variational integrators for higher-order Lagrangian systems taking approximations of the exact discrete Lagrangian. We obtain the discrete Euler–Lagrange equations for a discrete Lagrangian defined in the cartesian product of two copies of \(T^{(k-1)}Q\). Section 4 is devoted to the study of the relation between the discrete and continuous dynamics. We show the relation between the discrete Legendre transformations and the continuous one, and we also show that the exact discrete Lagrangian associated with a higher-order regular Lagrangian is also regular. Finally, in Sect. 5, we apply our techniques to study optimal control problems for fully actuated mechanical systems.

2 Existence and Uniqueness of Solutions for the Boundary Value Problem

2.1 Higher-Order Tangent Bundles

First we recall some basic facts about the higher-order tangent bundle theory. For more details, see Crampin et al. (1986) and León and Rodrigues (1985).

Let Q be a differentiable manifold. We introduce the following equivalence relation in the set \(C^{k}(I, Q)\) of k-differentiable curves from the interval \(I\subseteq {\mathbb {R}}\) to Q, where \(0\in I\). By definition, two curves \(\gamma _1\) and \(\gamma _2\) belonging to \(C^{k}(I, Q)\) have contact of order k at \(q_0 = \gamma _1(0) = \gamma _2(0)\) if there is a local chart \((\varphi , U)\) of Q such that \(q_0 \in U\) and

$$\begin{aligned} \frac{\mathrm{d}^s}{\mathrm{d}t^s}\left( \varphi \circ \gamma _1(t)\right) {\Big |}_{t=0} = \frac{\mathrm{d}^s}{\mathrm{d}t^s} \left( \varphi \circ \gamma _2(t)\right) {\Big |}_{t=0}\; , \end{aligned}$$

for all \(s = 0,\dots ,k\). The equivalence class of a curve \(\gamma \) will be denoted by \([\gamma ]_0^{(k)}\). The set of equivalence classes will be denoted by \(T^{(k)}Q\) and it is not hard to show that it has a natural structure of differentiable manifold. Moreover, \( \tau _Q^k :T^{(k)} Q \rightarrow Q\) where \(\tau _Q^k \big ([\gamma ]_0^{(k)}\big ) = \gamma (0)\) is a fiber bundle called the tangent bundle of order k of Q. Clearly, \(T^{(1)} Q = TQ\).

From a local chart \(q^{(0)}=(q^i)\) on a neighborhood U of Q with \(i=1,\ldots ,n=\dim Q\), it is possible to induce local coordinates \((q^{(0)},q^{(1)},\dots ,q^{(k)})\) on \(T^{(k)}U=(\tau _Q^k)^{-1}(U)\equiv U\times ({\mathbb {R}}^{n})^{k}\). Sometimes we will resort to the usual notation \(q^{(0)}\equiv (q^{i})\), \(q^{(1)}\equiv (\dot{q}^i)\) and \(q^{(2)}\equiv (\ddot{q}^i)\).

There is a canonical embedding \(j_{k}:T^{(k)}Q\rightarrow T T^{(k-1)}Q\) defined as \(j_k([\gamma ]_0^{(k)})=[{\gamma }^{(k-1)}]_0^{(1)}\), where \({\gamma }^{(k-1)}\) is the lift of the curve \(\gamma \) to \(T^{(k-1)}Q\); that is, the curve \({\gamma }^{(k-1)}:I\rightarrow T^{(k-1)}Q\) is given by \(\gamma ^{(k-1)}(t)=[\gamma _t]_0^{(k-1)}\) where \(\gamma _t(s)=\gamma (t+s)\). In local coordinates,

$$\begin{aligned} j_k\left( q^{(0)},q^{(1)},q^{(2)},\dots ,q^{(k)}\right) =\left( q^{(0)},q^{(1)},\dots ,q^{(k-1)};q^{(1)}, q^{(2)},\dots ,q^{(k)}\right) \; . \end{aligned}$$

2.2 Hamilton’s Principle and Considerations about the Existence and Uniqueness of Solutions

Let \(L:T^{(k)}Q \rightarrow {\mathbb {R}}\) be a Lagrangian of order \(k\ge 1\), of class \(C^{k+1}\). Since our result will be local, we assume from now on that Q is an open subset of \({\mathbb {R}}^n\). Take coordinates \(\left( q^{(0)}, q^{(1)}, \dots , q^{(k)}\right) \) on \(T^{(k)}Q \equiv Q\times ({\mathbb {R}}^n)^k\) as before. We suppose that L is regular in the sense that the Hessian matrix

$$\begin{aligned} \left( \frac{\partial ^2 L}{\partial q^{(k)i}\partial q^{(k)j}}\right) \end{aligned}$$

is a regular matrix. Let also \(h>0\) be given. We can formulate Hamilton’s principle as follows.

Variational Principle 1

Find a \(C^k\) curve \(q:[0,h] \rightarrow Q\) such that it is a critical point of the action

$$\begin{aligned} S_h=\int _0^h L\left( q (t), \dot{q} (t), \dots , q^{(k)} (t)\right) \mathrm{{d}}t \end{aligned}$$

among those curves whose first \(k-1\) derivatives are fixed at the endpoints, that is, with given values for \(q(0), \dot{q}(0),\,\dots ,\, q^{(k-1)}(0)\) and \(q(h), \dot{q}(h),\,\dots ,\, q^{(k-1)}(h)\).

Hamilton’s principle is a constrained problem in the Banach space \(C^k([0,h], {\mathbb {R}}^n)\). Now if q(t) is a solution to this problem that is not only \(C^k\) but \(C^{2k}\), then it satisfies the well-known kth-order Euler–Lagrange equations

$$\begin{aligned} \sum _{j=0}^k(-1)^j \frac{\mathrm{d}^j}{\mathrm{{d}}t^j}\frac{\partial L}{\partial q^{(j)}}=0. \end{aligned}$$
(1)

For a regular Lagrangian, (1) can be written as an explicit 2k-order ordinary differential equation. Existence and uniqueness of solutions for the initial value problem can be guaranteed using basic ODE theory. Doing the same for the boundary value problem of finding a solution q(t) of (1) with given values for \(q(0), \dot{q}(0),\,\dots ,\, q^{(k-1)}(0)\) and \(q(h), \dot{q}(h),\,\dots ,\, q^{(k-1)}(h)\) requires different techniques. For instance, in Agarwal (1986, Ch. 9) it is shown that there exists a unique solution to an explicit 2k-order ODE with this kind of boundary conditions, for small enough h and close enough boundary values. See also Eldering (2012) (Appendix A) for results on the existence, uniqueness and smooth dependence on parameters of solutions of ODEs.

In principle, however, there could exist solutions to Hamilton’s variational principle that are \(C^k\) but not \(C^{2k}\), and thus do not satisfy (1). Therefore, uniqueness of solutions to the variational principle cannot yet be guaranteed. One possibility for avoiding this situation is stating Hamilton’s principle in the (smaller) \(C^{2k}\) context from the beginning. In this section we proceed differently, acknowledging the fact that the variational principle makes sense in the \(C^k\) setting. We prove local existence and uniqueness of \(C^k\) solutions to Hamilton’s principle from a direct variational point of view. We will see that these solutions turn out to be automatically \(C^{2k}\), so they satisfy the Euler–Lagrange equations a posteriori.

Our argument for the existence and uniqueness of solutions will involve a regularization procedure which follows closely the proof by Patrick (2006) for first-order Lagrangians; the formulas, of course, reduce to those in Patrick (2006) for order 1, but we introduce an additional modification using orthonormal polynomials. See also Buttazzo et al. (1998), Giaquinta and Hildebrandt (1996) for discussions on the regularity of extremals for variational problems.

2.3 Nonregularity of Hamilton’s Principle

We want to determine whether there exists a unique solution curve to Hamilton’s principle, given endpoint conditions that are close enough. The main obstacle for a straightforward affirmative answer is that the local boundary value problem as stated above is nonregular at \(h=0\). That is, the constraint function \(g:C^{k}([0,h], Q) \rightarrow ({\mathbb {R}}^{n})^{k}\times ({\mathbb {R}}^{n})^{k}\)

$$\begin{aligned} g:q( \cdot ) \mapsto \left( q (0), \dot{q} (0), \dots , q^{(k-1)} (0);q (h), \dot{q} (h), \dots , q^{(k-1)} (h)\right) \end{aligned}$$

maps into the diagonal of \(T^{(k-1)}Q \times T^{(k-1)}Q\) for \(h=0\) and is therefore not a submersion. For \(h\ne 0\), the constraint function is a submersion.

The approach consists in replacing this problem by an equivalent one that is regular at \(h=0\), and showing that locally there is a unique solution to the regularized problem.

2.4 Regularization

First we replace the space of curves on Q in the variational problem by the space of curves on \(T^{(k)}Q\) and include additional constraints. Denote an arbitrary curve by

$$\begin{aligned} \left( q (t)=q^{[0]} (t), q^{[1]} (t), \dots , q^{[k]} (t)\right) \in T^{(k)}Q \equiv Q\times ({\mathbb {R}}^n)^k, \end{aligned}$$

\(t\in [0,h]\). Here we have modified our notation for coordinates on \(T^{(k)}Q\), using superscripts in square brackets to make a distinction with the actual derivatives of q(t).

Variational Principle 2

Find a curve \((q^{[0]}(t), q^{[1]} (t), \dots , q^{[k]} (t))\) on \(T^{(k)}Q\), with \(q^{[l]}\in C^{k-l}([0,h], {\mathbb {R}}^n)\), \(l=0, \dots , k\), such that it is a critical point of

$$\begin{aligned} S_h=\int _0^hL\left( q^{[0]}(t), q^{[1]} (t), \dots , q^{[k]} (t)\right) \mathrm{{d}}t \end{aligned}$$

subject to the constraints

$$\begin{aligned} q^{[j+1]}(t)= \frac{\mathrm{d}q^{[j]}}{\mathrm{{d}}t}(t),\quad q^{[j]}(0)=q^{[j]}_1,\quad q^{[j]}(h)=q^{[j]}_2,\quad j=0, \dots , k-1, \end{aligned}$$

where \((q^{[0]}_i, q^{[1]}_i, \dots , q^{[k-1]}_i)\), \(i=1,2\), are given points in \(T^{(k-1)}Q\).

Now reparameterize the curve by defining

$$\begin{aligned} Q^{[j]}(u)=q^{[j]}(hu),\quad j=0,\dots ,k, \quad u\in [0,1]. \end{aligned}$$

For \(h>0\), the curve \((Q^{[0]}(u), \dots , Q^{[k]} (u))\) satisfies an equivalent variational problem as follows. Since h is a constant for each instance of the problem, we can use

$$\begin{aligned} \frac{1}{h} \int _0^hL\left( q^{[0]}(t), q^{[1]} (t), \dots , q^{[k]} (t)\right) \mathrm{{d}}t=\int _0^1L\left( Q^{[0]}(u), \dots , Q^{[k]} (u)\right) \mathrm{{d}}u \end{aligned}$$

as an objective function. The first set of constraints becomes

$$\begin{aligned} 0= \frac{\mathrm{d}q^{[j]}}{\mathrm{{d}}t}(t)-q^{[j+1]}(t)= \left( \frac{1}{h} \frac{\mathrm{d}Q^{[j]}}{\mathrm{{d}}u}(u)-Q^{[j+1]}(u) \right) _{u=t/h} \end{aligned}$$

where \(j=0, \dots , k-1\).

The reparametrized variational principle is the following.

Variational Principle 3

Find a curve \(\left( Q^{[0]}(u), \dots , Q^{[k]} (u)\right) \) on \(T^{(k)}Q\), \(Q^{[l]}\in C^{k-l}([0,1], {\mathbb {R}}^n)\), \(l=0, \dots , k\), that is a critical point of

$$\begin{aligned} S=\int _0^1L\left( Q^{[0]}(u), \dots , Q^{[k]} (u)\right) \mathrm{{d}}u, \end{aligned}$$

subject to the constraints

$$\begin{aligned} \frac{\mathrm{d}Q^{[j]}}{\mathrm{{d}}u}(u)&=hQ^{[j+1]}(u), \end{aligned}$$
(2)
$$\begin{aligned} Q^{[j]}(0)&=q^{[j]}_1, \end{aligned}$$
(3)
$$\begin{aligned} Q^{[j]}(1)&=q^{[j]}_2, \end{aligned}$$
(4)

where \(j=0, \dots , k-1\), and \(\left( q^{[0]}_i, q^{[1]}_i, \dots , q^{[k-1]}_i\right) \), \(i=1,2\), are given points in \(T^{(k-1)}Q\).

The objective S does not depend on h, and the constraints are smooth through \(h=0\).

Remark 2.1

For \(h=0\), the constraints (2) imply that \(Q^{[0]}(u)\), ..., \(Q^{[k-1]} (u)\) remain constant, which restricts the possible values of the endpoint conditions in order to have a compatible set of constraints. More precisely, \(q^{[j]}_1=q^{[j]}_2\) for \(j=0,\ldots ,k-1\); otherwise there would be no curves satisfying the constraints. This kind of restriction also appears in the original variational principle 1. Moreover, the problem becomes the unconstrained problem of finding a curve \(Q^{[k]}(u)\in C^0([0,1],{\mathbb {R}}^{n})\) such that it is a critical point of

$$\begin{aligned} \int _{0}^{1}L\left( q^{[0]},\ldots ,q^{[k-1]},Q^{[k]}(u)\right) \mathrm{{d}}u. \end{aligned}$$

This means

$$\begin{aligned} \frac{\partial L}{\partial q^{[k]}}\left( q^{[0]},q^{[1]},\ldots ,q^{[k-1]},Q^{[k]}(u)\right) =0. \end{aligned}$$

Differentiating with respect to u, and using the fact that the Lagrangian is regular, we obtain that \(Q^{[k]}(u)\) is constant.

In preparation for the next step for regularization, let us solve the constraints (2) to get

$$\begin{aligned} Q^{[j]}(u)=Q^{[j]}(0)+h\int _0^u Q^{[j+1]}(s)\,\mathrm{{d}}s,\quad j=0, \dots , k-1. \end{aligned}$$

This means that the functions \(Q^{[j]}(u)\), \(j=0, \dots , k-1\), can be expressed in terms of \(Q^{[j]}(0)\), ..., \(Q^{[k-1]}(0)\), the function \(Q^{[k]}(u)\) and h. For example, for \(k=2\) we have

$$\begin{aligned} Q^{[1]}(u)&=Q^{[1]}(0)+h\int _0^u Q^{[2]}(s)\,\mathrm{{d}}s,\\ Q^{[0]}(u)&=Q^{[0]}(0)+h\int _0^u Q^{[1]}(s)\,\mathrm{{d}}s\\&=Q^{[0]}(0)+huQ^{[1]}(0)+h^2\int _0^u\int _0^s Q^{[2]}(\tau )\,\mathrm{{d}}\tau \,\mathrm{{d}}s\\&=Q^{[0]}(0)+huQ^{[1]}(0)+h^2\int _0^u(u-\tau ) Q^{[2]}(\tau )\,\mathrm{{d}}\tau . \end{aligned}$$

For a general k, and for \(j=0, \dots , k-1\), an iterated change of order of integration yields

$$\begin{aligned} Q^{[j]}(u)=Q^{[j]}(0)+\sum _{i=1}^{k-j-1}\frac{h^iu^i}{i!}Q^{[j+i]}(0)+h^{k-j}\int _0^u \frac{(u-s)^{k-j-1}}{(k-j-1)!}Q^{[k]}(s)\,\mathrm{{d}}s. \end{aligned}$$
(5)

If the upper bound of summation is less than the lower bound, the sum is understood to be 0.
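Formula (5) is Taylor's formula with integral remainder written in the rescaled variable u. The Python sketch below checks it exactly for a polynomial curve q(t), using \(Q^{[i]}(u)=q^{(i)}(hu)\). The helper names, the test polynomial and the closed-form evaluation of the remainder integral through \(\int _0^u (u-s)^m s^r\,\mathrm{d}s = \frac{m!\,r!}{(m+r+1)!}u^{m+r+1}\) are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs, order=1):
    """Coefficients (lowest degree first) of the order-th derivative."""
    for _ in range(order):
        coeffs = [r * c for r, c in enumerate(coeffs)][1:]
    return coeffs

def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at t."""
    return sum(c * t**r for r, c in enumerate(coeffs))

def formula_5(coeffs, k, j, h, u):
    """Right-hand side of (5) for Q^{[i]}(u) = q^{(i)}(h*u); h, u are Fractions."""
    m = k - j - 1
    # Terms i = 0, ..., k-j-1 (the i = 0 term is Q^{[j]}(0) itself).
    taylor = sum(h**i * u**i / factorial(i) * evaluate(derivative(coeffs, j + i), 0)
                 for i in range(m + 1))
    # Q^{[k]}(s) = sum_r d_r * h**r * s**r; integrate against (u-s)**m / m!.
    top = derivative(coeffs, k)
    remainder = sum(d * h**r * Fraction(factorial(r), factorial(m + r + 1)) * u**(m + r + 1)
                    for r, d in enumerate(top))
    return taylor + h ** (k - j) * remainder

# Hypothetical test curve q(t) = 1 + 2t - 3t^2 + 4t^3 + 5t^4 - t^5.
q = [1, 2, -3, 4, 5, -1]
h, u = Fraction(1, 3), Fraction(3, 4)
for k in (1, 2, 3):
    for j in range(k):
        print(k, j, formula_5(q, k, j, h, u) == evaluate(derivative(q, j), h * u))
```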

Note that taking \(u=1\), the final endpoint data \((q^{[0]}_2, \dots , q^{[k-1]}_2)\) can now be written as

$$\begin{aligned} q^{[j]}_2=Q^{[j]}(1)=q^{[j]}_1+\sum _{i=1}^{k-j-1}\frac{h^i}{i!}q^{[j+i]}_1+h^{k-j}\int _0^1 \frac{(1-s)^{k-j-1}}{(k-j-1)!}Q^{[k]}(s)\,\mathrm{{d}}s, \end{aligned}$$
(6)

so we define

$$\begin{aligned} z^{[j]}=\int _0^1 \frac{(1-s)^{k-j-1}}{(k-j-1)!}Q^{[k]}(s)\,\mathrm{{d}}s=\frac{1}{h^{k-j}}\left( q^{[j]}_2- \sum _{i=0}^{k-j-1}\frac{h^i}{i!}q^{[j+i]}_1\right) . \end{aligned}$$
(7)

We will discuss the case \(h=0\) in Remark 2.2.
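To illustrate why the data \(z^{[j]}\) in (7) are well suited to the limit \(h\rightarrow 0\), the following Python sketch evaluates them for endpoint data sampled from a smooth curve. By Taylor's theorem, \(z^{[j]}\) then approaches \(q^{(k)}(0)/(k-j)!\) instead of blowing up. The choice \(q(t)=e^t\) with \(n=1\) and the function name are assumptions made only for this example.

```python
import math
from math import factorial

def regularized_endpoint_data(jet_start, jet_end, h):
    """z^{[j]} from (7); jets are lists [q, qdot, ..., q^{(k-1)}] at t = 0 and t = h."""
    k = len(jet_start)
    z = []
    for j in range(k):
        # Taylor polynomial sum_{i=0}^{k-j-1} h^i / i! * q_1^{[j+i]}.
        taylor = sum(h**i / factorial(i) * jet_start[j + i] for i in range(k - j))
        z.append((jet_end[j] - taylor) / h ** (k - j))
    return z

k = 2
for h in (1e-1, 1e-2, 1e-3):
    start = [1.0] * k            # all derivatives of exp at t = 0
    end = [math.exp(h)] * k      # all derivatives of exp at t = h
    print(h, regularized_endpoint_data(start, end, h))
```

For this curve and \(k=2\), the printed values should approach \(1/2\) and 1 as h decreases.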

Now replace the curves and endpoint data by just \(Q^{[k]}(u)\), \((q^{[0]}_1, \dots , q^{[k-1]}_1)\), and \((z^{[0]}, \dots , z^{[k-1]})\), to get a new variational principle.

Variational Principle 4

Given h, \((q^{[0]}_1, \dots , q^{[k-1]}_1)\) and \((z^{[0]}, \dots , z^{[k-1]})\), find a continuous curve \(Q^{[k]}:[0,1] \rightarrow {\mathbb {R}}^n\) that is a critical point of

$$\begin{aligned} {\mathcal {S}}=\int _0^1L\left( Q^{[0]}(u), \dots , Q^{[k]} (u)\right) \mathrm{{d}}u, \end{aligned}$$

where \(Q^{[0]}(u)\), ..., \(Q^{[k-1]}(u)\) are defined as in (5) by

$$\begin{aligned} Q^{[j]}(u)= & {} q^{[j]}_1+\sum _{i=1}^{k-j-1}\frac{h^iu^i}{i!}q^{[j+i]}_1+h^{k-j}\int _0^u \frac{(u-s)^{k-j-1}}{(k-j-1)!}Q^{[k]}(s)\,\mathrm{{d}}s,\\&j=0,\ldots , k-1 \end{aligned}$$

subject to the constraints

$$\begin{aligned} \int _0^1 \frac{(1-s)^{k-j-1}}{(k-j-1)!}Q^{[k]}(s)\,\mathrm{{d}}s=z^{[j]},\quad j=0, \dots , k-1. \end{aligned}$$

Observe that the constraint functions do not depend on h and are linear on the curve \(Q^{[k]}\). This variational principle is already regular through \(h=0\), as we will see when we proceed to find the solutions later.

Remark 2.2

The data \(q^{[0]}_1\), ..., \(q^{[k-1]}_1\), \(z^{[0]}\), ..., \(z^{[k-1]}\) can be transformed into the endpoint conditions for the variational principle 3 in a straightforward way, for any h, using (6) and (7). The converse (7) is possible only for \(h\ne 0\), in principle. However, if \(h=0\), let \((Q^{[0]}(u),\ldots ,Q^{[k]}(u))\) be a solution of the variational principle 3 with boundary conditions \((q_{1}^{[0]},\ldots , q^{[k-1]}_1)\) and \((q^{[0]}_2,\ldots ,q^{[k-1]}_2)\). Define \(z^{[j]}\) by the constraints in the variational principle 4. Since \(Q^{[k]}\) is constant and \(\frac{(1-s)^{k-j-1}}{(k-j-1)!}>0\) in (0, 1), different values of \(Q^{[k]}\) correspond to different values of \(z^{[j]}\). Then \(Q^{[k]}\) is a solution of the variational principle 4 with boundary conditions \(q^{[0]}_1\), ..., \(q^{[k-1]}_1\), \(z^{[0]}\), ..., \(z^{[k-1]}\).

Finally, we will introduce a modification that will enable us to carry out the computations in the next section easily. Consider the inner product on \(C^0([0,1],{\mathbb {R}})\) given by

$$\begin{aligned} \langle f, g \rangle = \int _0^1 f(s) g(s)\,\mathrm{{d}}s. \end{aligned}$$

If \(f\in C^0([0,1],{\mathbb {R}})\) and \(V=(V_1, \dots , V_n)\in C^0([0,1],{\mathbb {R}}^n)\) we define the bilinear operation

$$\begin{aligned} \langle \langle f, V \rangle \rangle = \int _0^1 f(s) V(s)\,\mathrm{{d}}s=\left( \langle f, V_1\rangle , \dots ,\langle f, V_n\rangle \right) \in {\mathbb {R}}^n. \end{aligned}$$

Then the integrals appearing in the constraints in the variational principle 4 are \(\langle \langle a^{[k]}_j, Q^{[k]}\rangle \rangle \), where \(a^{[k]}_j\) are the polynomials

$$\begin{aligned} a^{[k]}_j(s)=\frac{(1-s)^{k-j-1}}{(k-j-1)!},\quad j=0, \dots , k-1. \end{aligned}$$

These form a basis of the space of polynomials of degree at most \(k-1\). Let us consider a basis \(b^{[k]}_j(s)\), \(j=0, \dots , k-1\), of the same space of polynomials consisting of orthonormal polynomials on [0, 1], and let \((\gamma ^{[k],i}_j)\), where \(i,j=0, \dots , k-1\), be the invertible real matrix such that \(a^{[k]}_j(s)=\gamma ^{[k],i}_jb^{[k]}_i(s)\). For example, for \(k=2\),

$$\begin{aligned} a^{[2]}_0(s)=1-s,\quad a^{[2]}_1(s)=1, \end{aligned}$$

and we can take for instance the orthonormal basis

$$\begin{aligned} b^{[2]}_0(s)=\sqrt{3}(1-2s),\quad b^{[2]}_1(s)=1; \end{aligned}$$

therefore,

$$\begin{aligned} \gamma ^{[2],0}_0=\textstyle \frac{1}{2\sqrt{3}},\quad \gamma ^{[2],1}_0=\frac{1}{2},\quad \gamma ^{[2],0}_1=0,\quad \gamma ^{[2],1}_1=1. \end{aligned}$$
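The \(k=2\) example above can be checked numerically. The sketch below represents polynomials by coefficient lists, evaluates the inner product \(\langle \cdot ,\cdot \rangle \) on [0, 1] in closed form, and recovers the matrix entries as \(\gamma ^{[2],i}_j=\langle a^{[2]}_j, b^{[2]}_i\rangle \), which holds because the \(b^{[2]}_i\) are orthonormal. The helper name `inner` and the list representation are illustrative assumptions.

```python
import math

def inner(p, q):
    """<p, q> = integral over [0, 1] of p(s) q(s) ds; coefficients lowest degree first."""
    return sum(a * b / (i + j + 1) for i, a in enumerate(p) for j, b in enumerate(q))

r3 = math.sqrt(3.0)
a = [[1.0, -1.0], [1.0]]          # a_0(s) = 1 - s,            a_1(s) = 1
b = [[r3, -2.0 * r3], [1.0]]      # b_0(s) = sqrt(3) (1 - 2s), b_1(s) = 1

# Gram matrix of the b_i: should be the 2x2 identity up to rounding.
gram = [[inner(bi, bj) for bj in b] for bi in b]
# gamma[j][i] = <a_j, b_i>, the coefficient of b_i in a_j.
gamma = [[inner(aj, bi) for bi in b] for aj in a]

print(gram)
print(gamma)
```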

Using this matrix, the constraints can be rewritten as

$$\begin{aligned} z^{[j]}=\langle \langle a^{[k]}_j,Q^{[k]}\rangle \rangle =\gamma ^{[k],i}_j\langle \langle b^{[k]}_i(s),Q^{[k]}\rangle \rangle , \end{aligned}$$

for \(j=0, \dots , k-1\). This allows us to reformulate the variational principle in an equivalent way by replacing the data \((z^{[0]}, \dots , z^{[k-1]})\) and constraints \(\langle \langle a^{[k]}_j,Q^{[k]}\rangle \rangle =z^{[j]}\) by new data \((w^{[0]}, \dots , w^{[k-1]})\) and constraints \(\langle \langle b^{[k]}_j,Q^{[k]}\rangle \rangle =w^{[j]}\), \(j=0, \dots , k-1\). The old and new data are related by

$$\begin{aligned} \sum _{i=0}^{k-1} \gamma ^{[k],i}_j w^{[i]}=z^{[j]}. \end{aligned}$$
(8)

Variational Principle 5

Given h, \((q^{[0]}_1, \dots , q^{[k-1]}_1)\) and \((w^{[0]}, \dots , w^{[k-1]})\), find a continuous curve \(Q^{[k]}:[0,1] \rightarrow {\mathbb {R}}^n\) that is a critical point of

$$\begin{aligned} S_h=\int _0^1L\left( Q^{[0]}(u), \dots , Q^{[k]} (u)\right) \,\mathrm{{d}}u, \end{aligned}$$

where \(Q^{[0]}(u)\), ..., \(Q^{[k-1]}(u)\) are defined by

$$\begin{aligned} Q^{[j]}(u)=q^{[j]}_1+\sum _{i=1}^{k-j-1}\frac{h^iu^i}{i!}q^{[j+i]}_1+h^{k-j}\int _0^u \frac{(u-s)^{k-j-1}}{(k-j-1)!}Q^{[k]}(s)\,\mathrm{{d}}s, \end{aligned}$$
(9)

subject to the constraints

$$\begin{aligned} \int _0^1 b^{[k]}_j(s)Q^{[k]}(s)\,\mathrm{{d}}s=w^{[j]},\quad j=0, \dots , k-1. \end{aligned}$$

2.5 Solution of the Regularized Problem

Next, we will study the existence and uniqueness of solutions associated with variational principle 5. We will show that the boundary value problem is well posed, and that even though the variational problem is posed on the space of \(C^{k}\) curves, the extremizers are \(C^{2k}\) and hence satisfy the Euler–Lagrange equations.

We start the proof by showing the \(C^{k+1}\) differentiability of the action \(S_h\) for the variational principle 5. Next, we compute the gradient of \(S_h\) in order to solve the equation “gradient of \(S_h\) perpendicular to constraint space.” After introducing an orthogonal decomposition of the constraint space we obtain that \(S_h\) has a critical point on the constraint set if and only if the orthogonal projection of the gradient of \(S_h\) is 0 and hence we can find the stationary curve for the variational principle 5. Using the implicit function theorem we obtain existence and uniqueness of solutions for the variational principle 5. Finally, we reverse the regularization to obtain a unique \(C^{2k}\) solution of the original variational principle.

2.5.1 Step 1—\(C^{k+1}\) Differentiability of \(S_{h}\):

Let \(S_h\) be given as in the variational principle 5, regarded as a real-valued map defined on the Banach space \(C^0([0,1],{\mathbb {R}}^n)\) of curves \(Q^{[k]}(u)\). We can also consider its restriction to the Banach space \(C^k([0,1],{\mathbb {R}}^n)\). We are going to use the following lemma (Abraham et al. 1988).

Lemma 2.3

(Omega Lemma) Let E, F be Banach spaces, U open in E, and M a compact topological space. Let \(g:U \rightarrow F\) be a \(C^r\) map, \(r>0\). The map

$$\begin{aligned} \Omega _g:C^0(M,U) \rightarrow C^0(M,F)\quad \text {defined by}\quad \Omega _g(f)=g\circ f \end{aligned}$$

is also \(C^r\), and \(D\Omega _g(f)\cdot h=[(Dg)\circ f]\cdot h\).

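The objective \(S_h\) is the composition of the maps

$$\begin{aligned} C^0([0,1],{\mathbb {R}}^n)\xrightarrow {\;i\;}C^0([0,1],T^{(k)}Q)\xrightarrow {\;\Omega _L\;}C^0([0,1],{\mathbb {R}})\xrightarrow {\;\int \;}{\mathbb {R}}, \end{aligned}$$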

where i is defined by \(Q^{[k]}(u) \mapsto (Q^{[0]}(u), \dots , Q^{[k]} (u))\). Here \(Q^{[0]}(u),\dots \), \(Q^{[k-1]} (u)\) stand for the right-hand sides of (9). Both i and \(\int \) are bounded affine and therefore \(C^\infty \). By the Omega Lemma, \(\Omega _L\) is \(C^{k+1}\) because L is \(C^{k+1}\), and therefore so is \(S_h\).

If we regard \(S_h\) as defined on \(C^k([0,1],{\mathbb {R}}^n)\), we should append the inclusion \(C^k([0,1],{\mathbb {R}}^n)\hookrightarrow C^0([0,1],{\mathbb {R}}^n)\) to the left side of the diagram above. This inclusion is \(C^\infty \) because it is linear and bounded (\(\Vert Q^{[k]}\Vert _{C^0}\le \Vert Q^{[k]}\Vert _{C^k}\) for all \(Q^{[k]}\)). Then \(S_h\) is \(C^{k+1}\) also as a map defined on \(C^k([0,1],{\mathbb {R}}^n)\). In order to cover both cases, from now on l will denote 0 or k interchangeably.

2.5.2 Step 2—Computing the Gradient of \(S_h\):

We need a suitable notion of the gradient of \(S_h\), in order to find where it is perpendicular to the constraint space. In order to do that, let us first compute \(\mathbf {d}S_h[Q^{[k]}(u)]\), for \(Q^{[k]}\) of class \(C^l\). The functions \(Q^{[0]}(u)\), ..., \(Q^{[k-1]}(u)\) are defined by (9). Since \(S_h\) is smooth, we will compute \(\mathbf {d}S_h\) using directional derivatives. For an arbitrary \(\delta Q^{[k]}\) of class \(C^l\), take a deformation \(Q^{[k]}_\epsilon (u)=Q^{[k]}(u)+\epsilon \delta Q^{[k]}(u)\) of \(Q^{[k]}(u)\). For \(j=0, \dots , k-1\), define the corresponding lower order curves as in (9) by

$$\begin{aligned} Q^{[j]}_\epsilon (u)= q^{[j]}_1+\sum _{i=1}^{k-j-1}\frac{h^iu^i}{i!}q^{[j+i]}_1+h^{k-j}\int _0^u \frac{(u-s)^{k-j-1}}{(k-j-1)!} Q^{[k]}_\epsilon (s)\,\mathrm{{d}}s, \end{aligned}$$
(10)

so \(Q^{[j]}_0(u)=Q^{[j]}(u)\) and

$$\begin{aligned} \left. \frac{\mathrm{d}}{\mathrm{d}\epsilon }\right| _{\epsilon =0}Q^{[j]}_\epsilon (u)= h^{k-j}\int _0^u \frac{(u-s)^{k-j-1}}{(k-j-1)!}\delta Q^{[k]}(s)\,\mathrm{{d}}s. \end{aligned}$$

Denoting \(a^{[k]}_j(u,s)={(u-s)^{k-j-1}}/{(k-j-1)!}\) and \(Q(u)=(Q^{[0]}(u), \dots \), \(Q^{[k]} (u))\) for short, we have

$$\begin{aligned} \mathbf {d}S_h&[Q^{[k]}(u)]\cdot \delta Q^{[k]}(u)=\left. \frac{\mathrm{d}}{\mathrm{d}\epsilon }\right| _{\epsilon =0}\int _0^1L\left( Q^{[0]}_\epsilon (u), \dots , Q^{[k]}_\epsilon (u)\right) \mathrm{{d}}u\\&=\int _0^1 \left( \sum _{j=0}^{k-1} \frac{\partial L}{\partial q^{[j]}}(Q(u))h^{k-j}\int _0^u a^{[k]}_j(u,s) \delta Q^{[k]}(s)\,\mathrm{{d}}s+\frac{\partial L}{\partial q^{[k]}}(Q(u))\delta Q^{[k]}(u)\right) \mathrm{{d}}u\\&=\sum _{j=0}^{k-1}\int _0^1\int _s^1\frac{\partial L}{\partial q^{[j]}}(Q(u))h^{k-j} a^{[k]}_j(u,s) \delta Q^{[k]}(s)\,\mathrm{{d}}u\,\mathrm{{d}}s +\int _0^1 \frac{\partial L}{\partial q^{[k]}}(Q(u))\delta Q^{[k]}(u)\,\mathrm{{d}}u\\&=\sum _{j=0}^{k-1}\int _0^1\int _u^1\frac{\partial L}{\partial q^{[j]}}(Q(s))h^{k-j} a^{[k]}_j(s,u) \delta Q^{[k]}(u)\,\mathrm{{d}}s\,\mathrm{{d}}u +\int _0^1 \frac{\partial L}{\partial q^{[k]}}(Q(u))\delta Q^{[k]}(u)\,\mathrm{{d}}u\\&=\int _0^1\left( \sum _{j=0}^{k-1}\int _u^1\frac{\partial L}{\partial q^{[j]}}(Q(s))h^{k-j} a^{[k]}_j(s,u)\,\mathrm{d}s + \frac{\partial L}{\partial q^{[k]}}(Q(u))\right) \delta Q^{[k]}(u)\,\mathrm{{d}}u. \end{aligned}$$

For each \(u\in [0,1]\), the first factor in the integrand of the last expression is in \(({\mathbb {R}}^{n})^{*}\). If \(\sharp :({\mathbb {R}}^{n})^{*}\rightarrow {\mathbb {R}}^{n}\) denotes the index raising operator associated to the Euclidean inner product, define

$$\begin{aligned} \nabla S_h[Q^{[k]}(u)](u):=\left( \sum _{j=0}^{k-1}\int _u^1\frac{\partial L}{\partial q^{[j]}}(Q(s))h^{k-j} a^{[k]}_j(s,u)\,\mathrm{{d}}s + \frac{\partial L}{\partial q^{[k]}}(Q(u))\right) ^{\sharp }. \end{aligned}$$

Since \({\partial L}/{\partial q^{[0]}}\), ..., \({\partial L}/{\partial q^{[k]}}\) are \(C^{k}\) and the curve Q is \(C^l\) (\(l=0\) or \(l=k\)), the curve \(\nabla S_h[Q^{[k]}(u)]\) belongs to \(C^l([0,1],{\mathbb {R}}^n)\). We thus have a vector field

$$\begin{aligned} \nabla S_h:C^l([0,1],{\mathbb {R}}^n)\rightarrow C^{l}([0,1],{\mathbb {R}}^n) \end{aligned}$$

which we call the gradient of \(S_h\). By the Omega Lemma, \(\nabla S_h\) is a \(C^k\) map.
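The gradient formula can be tested numerically by comparing the pairing \(\llbracket \nabla S_h, \delta Q^{[k]}\rrbracket \) with a finite-difference directional derivative of \(S_h\). The following Python sketch does this for \(k=2\), \(n=1\) on a uniform grid with the trapezoid rule. The test Lagrangian, the sample curves and all parameter values are hypothetical choices made for this illustration, and the two numbers agree only up to discretization error.

```python
import math

def cumtrapz(values, du):
    """Cumulative trapezoid rule: integral from 0 to each grid point."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * du * (left + right))
    return out

def trapz(values, du):
    """Trapezoid rule over the whole list (gives 0.0 for a single point)."""
    return du * (sum(values) - 0.5 * (values[0] + values[-1]))

# Hypothetical test Lagrangian L(q0, q1, q2) and its partial derivatives.
def lagrangian(q0, q1, q2):
    return 0.5 * q2**2 + 0.5 * q1**2 - math.cos(q0)

def partials(q0, q1, q2):
    return math.sin(q0), q1, q2

def lower_curves(Q2, q_start, v_start, h, du):
    """Q^{[1]} and Q^{[0]} as in (9) for k = 2, obtained by integrating twice."""
    Q1 = [v_start + h * val for val in cumtrapz(Q2, du)]
    Q0 = [q_start + h * val for val in cumtrapz(Q1, du)]
    return Q0, Q1

def action(Q2, q_start, v_start, h, du):
    Q0, Q1 = lower_curves(Q2, q_start, v_start, h, du)
    return trapz([lagrangian(*p) for p in zip(Q0, Q1, Q2)], du)

def gradient(Q2, q_start, v_start, h, du):
    """Grid values of the gradient formula for k = 2."""
    Q0, Q1 = lower_curves(Q2, q_start, v_start, h, du)
    dL = [partials(*p) for p in zip(Q0, Q1, Q2)]
    n = len(Q2)
    grad = []
    for m in range(n):
        # Integrand on [u_m, 1]: dL/dq0 * h^2 * (s - u_m) + dL/dq1 * h.
        tail = [dL[i][0] * h**2 * (i - m) * du + dL[i][1] * h for i in range(m, n)]
        grad.append(trapz(tail, du) + dL[m][2])
    return grad

def check(n_points=201, h=0.7, q_start=0.3, v_start=-0.2, eps=1e-5):
    du = 1.0 / (n_points - 1)
    grid = [i * du for i in range(n_points)]
    Q2 = [math.sin(3.0 * u) + 0.5 for u in grid]   # sample curve Q^{[2]}
    dQ = [math.cos(2.0 * u) for u in grid]         # sample variation
    grad = gradient(Q2, q_start, v_start, h, du)
    pairing = trapz([g * d for g, d in zip(grad, dQ)], du)
    plus = action([a + eps * d for a, d in zip(Q2, dQ)], q_start, v_start, h, du)
    minus = action([a - eps * d for a, d in zip(Q2, dQ)], q_start, v_start, h, du)
    return pairing, (plus - minus) / (2.0 * eps)

pairing, finite_difference = check()
print(pairing, finite_difference)
```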

2.5.3 Step 3—Orthogonal Decomposition of the Constraint Space and Critical Points of \(S_h\):

Let us now compute the tangent space to the constraint set. If we consider the inner product on \(C^{l}([0,1],{\mathbb {R}}^n)\) given by

$$\begin{aligned} \llbracket V,W \rrbracket =\int _0^1 V(u)\cdot W(u)\,\mathrm{{d}}u, \end{aligned}$$

then

$$\begin{aligned} \mathbf {d}S_h[Q^{[k]}(u)]\cdot \delta Q^{[k]}(u)=\llbracket \nabla S_h[Q^{[k]}(u)],\delta Q^{[k]}(u)\rrbracket . \end{aligned}$$

The constraints \(g_j[Q^{[k]}(s)]:=\langle \langle b^{[k]}_j,Q^{[k]}\rangle \rangle =w^{[j]}\), \(j=0, \dots , k-1\), in the variational principle 5 are bounded and linear, and therefore \(C^\infty \), and the corresponding derivatives are the same functions \(g_j\). Define

$$\begin{aligned} g=(g_0, \dots , g_{k-1}):C^l([0,1],{\mathbb {R}}^n) \rightarrow ({\mathbb {R}}^n)^k \end{aligned}$$

so

$$\begin{aligned} E={\text {Ker}}g \subset C^l([0,1],{\mathbb {R}}^n) \end{aligned}$$

is the tangent space to the constraint set. In fact, since the constraints are linear, the constraint set is an affine subspace parallel to E. It is not difficult to show using the definitions that the space

$$\begin{aligned} E^\perp =\{c^jb^{[k]}_j \,|\, c^0, \dots , c^{k-1}\in {\mathbb {R}}^n\} \end{aligned}$$

of \({\mathbb {R}}^n\)-valued polynomials of degree at most \(k-1\) is indeed the \(\llbracket ,\rrbracket \)-orthogonal complement of E, which is then a split subspace (see “Appendix” for a proof). The orthogonal projection \(P:C^l([0,1],{\mathbb {R}}^n)=E\oplus E^\perp \rightarrow E\) is given by

$$\begin{aligned} P\left( \delta Q^{[k]}(u)\right) =\delta Q^{[k]}(u)-\sum _{j=0}^{k-1}\langle \langle b^{[k]}_j, \delta Q^{[k]}\rangle \rangle \, b^{[k]}_j. \end{aligned}$$

Now \(S_h\) has a critical point on the constraint set (for any value of the constraints) if and only if the projection \(P\nabla S_h\) of \(\nabla S_h\) to the tangent space E of the constraint set is 0.
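As a small illustration of the projection P, the Python sketch below applies it to a polynomial variation for \(k=2\), \(n=1\), using the orthonormal basis \(b^{[2]}_0(s)=\sqrt{3}(1-2s)\), \(b^{[2]}_1(s)=1\) from the earlier example. It then checks that the result satisfies the homogeneous constraints and that P is idempotent. Restricting to polynomials stored as coefficient lists is an assumption made to keep the integrals exact; the helper names are hypothetical.

```python
import math

def inner(p, q):
    """<p, q> = integral over [0, 1] of p(s) q(s) ds; coefficients lowest degree first."""
    return sum(a * b / (i + j + 1) for i, a in enumerate(p) for j, b in enumerate(q))

def project_onto_E(p, basis):
    """P(p) = p - sum_j <b_j, p> b_j for an orthonormal basis of E-perp."""
    out = list(p)
    for bj in basis:
        coeff = inner(bj, p)
        out += [0.0] * (len(bj) - len(out))   # pad if b_j has higher degree
        for i, c in enumerate(bj):
            out[i] -= coeff * c
    return out

r3 = math.sqrt(3.0)
basis = [[r3, -2.0 * r3], [1.0]]     # b_0(s) = sqrt(3) (1 - 2s), b_1(s) = 1

delta_Q = [0.0, 0.0, 0.0, 1.0]       # sample variation s^3
projected = project_onto_E(delta_Q, basis)

print(projected)                                  # coefficients of P(s^3)
print([inner(bj, projected) for bj in basis])     # constraint values, close to 0
```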

2.5.4 Step 4—Existence and Uniqueness for the Regularized Problem:

In order to find solutions to the variational principle 5, we solve

$$\begin{aligned} P\nabla S_h(Q^{[k]})=P\nabla S_h(Q^{[k]}_E\oplus Q^{[k]}_{E^\perp })=0 \end{aligned}$$

for \(Q^{[k]}_E\), near

$$\begin{aligned} Q^{[k]}=0,\quad w^{[0]}= \dots = w^{[k-1]}=0,\\ q^{[0]}_1=\bar{q}^{[0]}, \dots , q^{[k-1]}_1=\bar{q}^{[k-1]},\quad h=0. \end{aligned}$$

This can be solved using the implicit function theorem by requiring that the partial derivative of \(P\nabla S_h(Q^{[k]})\) at the point \(Q^{[k]}=0\) with respect to the space E is a linear isomorphism. The variables \(w^{[0]}, \dots , w^{[k-1]}\), \(q^{[0]}_1, \dots , q^{[k-1]}_1\) and h are seen as parameters that can move in some neighborhood. Note that it is not necessary to solve for \(Q^{[k]}_{E^\perp }\) since it is completely determined by \(w^{[0]}, \dots , w^{[k-1]}\) using the constraint equations in variational principle 5.

In order to compute this partial derivative, take a deformation of \(Q^{[k]}=0\) of the form \(Q^{[k]}_\epsilon =\epsilon \delta Q^{[k]}_E\), where \(\delta Q^{[k]}_E\in E\). Recalling (10) and noting that \(h=0\), we have

Here the inner products vanish because \(\frac{\partial ^2 L}{\partial q^{[k]2}}(\bar{q}^{[0]}, \dots , \bar{q}^{[k-1]},0)\) is a constant matrix (that is, it does not depend on u) and \(\langle \langle b^{[k]}_j,\delta Q^{[k]}_E\rangle \rangle =0\) for \(j=0, \dots , k-1\).

Then the derivative is precisely \(\frac{\partial ^2 L}{\partial q^{[k]2}}(\bar{q}^{[0]}, \dots , \bar{q}^{[k-1]},0)\), seen as a linear map from E into itself, and if L is regular then it is an isomorphism.

By the implicit function theorem, there are neighborhoods \(W_1\subseteq ({\mathbb {R}}^n)^k\times ({\mathbb {R}}^n)^k\times {\mathbb {R}}\) (with variables \((q^{[0]}_1, \dots , q^{[k-1]}_1;w^{[0]}, \dots , w^{[k-1]};h)\)) containing \((\bar{q}^{[0]}\), \(\dots , \bar{q}^{[k-1]};0, \dots , 0;0)\) and \(W_2^{l}\subseteq C^l([0,1],{\mathbb {R}}^n)\) containing the constant curve \(Q^{[k]}(u)=0\), and a \(C^{k}\) map \(\psi :W_1 \rightarrow W_2^{l}\) such that for each \((q^{[0]}_1, \dots , q^{[k-1]}_1;w^{[0]}\), \(\dots , w^{[k-1]};h)\in W_1\), the curve

$$\begin{aligned} Q^{[k]}=\psi (q^{[0]}_1, \dots , q^{[k-1]}_1;w^{[0]}, \dots , w^{[k-1]};h)\in C^l([0,1],{\mathbb {R}}^n) \end{aligned}$$

is the unique critical point in \(W_2^{l}\) of the variational problem 5. Thus, \(\psi \) maps initial conditions, constraint values (which encode the final endpoint conditions for the original problem) and h into \(C^l\) curves.

Let us now consider the cases \(l=0\) and \(l=k\) separately. Taking \(l=k\), \(\psi \) has values in \(W_2^{k}\subseteq C^{k}([0,1],{\mathbb {R}}^n)\). Taking \(l=0\), \(\psi \) has values in \(W_2^{0}\subseteq C^{0}([0,1],{\mathbb {R}}^n)\). However, since \(C^{k}([0,1],{\mathbb {R}}^n)\subset C^{0}([0,1],{\mathbb {R}}^n)\), this \(\psi \) also provides the unique solution among the \(C^0\) curves in a \(C^0\)-open neighborhood of the curve \(u \mapsto 0\), say \(\{Q^{[k]}(u)\,|\,\Vert Q^{[k]}\Vert _0<\epsilon \}\).

2.5.5 Step 5—Reverse of the Regularization:

Let us now reverse the regularization in order to obtain a unique \(C^{2k}\) solution of the variational principle 1. Let \(h\ne 0\). For \((q_1,q_2)=((q_{1}^{[0]},\ldots ,q_{1}^{[k-1]}),(q_{2}^{[0]},\ldots ,q_{2}^{[k-1]}))\in ({\mathbb {R}}^{n})^{k}\times ({\mathbb {R}}^{n})^{k}\) the corresponding values of \(z^{[0]},\ldots ,z^{[k-1]}\) are given by (7) and the values of \(w^{[0]},\ldots ,w^{[k-1]}\) can be computed from (8) using the inverse matrix of \(\left( \gamma _{j}^{[k],i}\right) \). This defines a smooth function \((w^{[0]},\ldots ,w^{[k-1]})=\varpi (q_1,q_2,h)\). Note that the condition that \(q_1\) and \(q_2\) are close translates into the condition that \((w^{[0]},\ldots ,w^{[k-1]})\) is close to 0.

Let \(h>0\) be such that \((\bar{q}^{[0]}, \dots , \bar{q}^{[k-1]};0,\dots ,0;h)\in W_1\). Define

$$\begin{aligned} \widetilde{W}_1=\{(q_1,q_2)\in ({\mathbb {R}}^n)^k \times ({\mathbb {R}}^n)^k\,|\, (q_1;\varpi (q_1,q_2,h);h)\in W_1\} \end{aligned}$$

and for each \((q_1,q_2)=\left( (q^{[0]}_1, \dots , q^{[k-1]}_1),(q^{[0]}_2, \dots , q^{[k-1]}_2)\right) \in \widetilde{W}_1\) define the curve \(Q^{[0]}_{(q_1,q_2)}(u)\) according to (5) as

$$\begin{aligned} Q^{[0]}_{(q_1,q_2)}(u)=\sum _{i=0}^{k-1}\frac{h^iu^i}{i!}q^{[i]}_1+h^{k}\int _0^u \frac{(u-s)^{k-1}}{(k-1)!} \psi \left( q_1;\varpi (q_1,q_2,h);h\right) (s)\,\mathrm{{d}}s. \end{aligned}$$

Since \(\psi \) takes values in the \(C^k\) curves, \(Q^{[0]}_{(q_1,q_2)}(u)\) is \(C^{2k}\) by the reasoning leading to equation (5).
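The reconstruction formula for \(Q^{[0]}_{(q_1,q_2)}(u)\) can be evaluated numerically once the curve \(\psi (\cdot )(s)\) is available. The sketch below is a scalar illustration only: the callable `Qk` stands in for \(\psi \left( q_1;\varpi (q_1,q_2,h);h\right) \), which we do not compute here, and the integral is approximated with composite Simpson quadrature. The function name and the parameter `steps` are ours.

```python
from math import factorial

def reconstruct(q1_jets, h, Qk, u, steps=200):
    """Evaluate sum_i (h u)^i / i! * q1^{[i]}
    + h^k * int_0^u (u - s)^(k-1) / (k-1)! * Qk(s) ds   (scalar case).

    q1_jets = [q1^{[0]}, ..., q1^{[k-1]}]; steps must be even (Simpson rule)."""
    k = len(q1_jets)
    taylor = sum((h * u) ** i / factorial(i) * q for i, q in enumerate(q1_jets))
    if u == 0:
        return taylor

    def g(s):
        return (u - s) ** (k - 1) / factorial(k - 1) * Qk(s)

    ds = u / steps
    total = g(0.0) + g(u)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(j * ds)
    return taylor + h ** k * total * ds / 3

# Example with k = 2 and a constant stand-in curve Qk(s) = 3.
value = reconstruct([1.0, 2.0], 0.5, lambda s: 3.0, 1.0)
```

For a constant `Qk` equal to \(c\) and \(k=2\), the formula reduces to \(q_1^{[0]}+hu\,q_1^{[1]}+h^2cu^2/2\), which gives a simple consistency check for the quadrature.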

Now reparameterize with \(t=hu\) to get a \(C^{2k}\) curve

$$\begin{aligned} q^{[0]}_{(q_1,q_2)}(t)=\sum _{i=0}^{k-1}\frac{t^i}{i!}q^{[i]}_1+h^{k}\int _0^{t/h} \frac{(t/h-s)^{k-1}}{(k-1)!} \psi \left( q_1;\varpi (q_1,q_2,h);h\right) (s)\,\mathrm{{d}}s \end{aligned}$$

on Q, defined for \(t\in [0,h]\). This curve is the unique solution of the variational principle 1 with endpoint conditions \(q_1\) and \(q_2\).

This solution is \(C^{2k}\), and unique among the curves corresponding to \(Q^{[k]}\) continuous with \(\Vert Q^{[k]}\Vert _0<\epsilon \). These are the \(C^k\) curves q(t) on Q with \(\Vert q^{(k)}\Vert _0<\epsilon /h^k\), which are the \(C^k\) curves in some \(C^k\) neighborhood of the constant curve \(t \mapsto \bar{q}^{[0]}\).

3 The Exact Discrete Lagrangian and Discrete Equations for Second-Order Systems

Next, we will consider second-order Lagrangian systems, motivated by the study of optimal control problems. Let Q be a configuration manifold and let \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) be a regular Lagrangian.

Definition 3.1

Given a small enough \(h>0\) (see Footnote 2), the exact discrete Lagrangian \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\) is defined by

$$\begin{aligned} L_d^{e}\left( q_0,\dot{q}_0,q_1,\dot{q}_1\right) =\int _{0}^{h}L\left( q(t),\dot{q}(t),\ddot{q}(t)\right) \mathrm{{d}}t, \end{aligned}$$

where \(q:[0,h]\rightarrow Q\) is the unique solution of the Euler–Lagrange equations for the second-order Lagrangian L,

$$\begin{aligned} \frac{\mathrm{d}^2}{\mathrm{{d}}t^2}\frac{\partial L}{\partial \ddot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}}+\frac{\partial L}{\partial q}=0, \end{aligned}$$

satisfying the boundary conditions \(q(0)=q_0,q(h)=q_1,\dot{q}(0)=\dot{q}_0\) and \(\dot{q}(h)=\dot{q}_1\).

Strictly speaking, the exact discrete Lagrangian is defined not on \(TQ\times TQ\) but on a neighborhood of the diagonal. For the sake of simplicity, we will not make this distinction. Our idea is to take a discrete Lagrangian \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\) as an approximation of \(L_{d}^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\), to construct variational integrators in the same way as in discrete mechanics (see Sect. 4). In other words, for given \(h>0\) we define \(L_d(q_0,v_0,q_1,v_1)\) as an approximation of the action integral along the exact solution curve segment q(t) with boundary conditions \(q(0)=q_0\), \(\dot{q}(0)=v_0\), \(q(h)=q_1\), and \(\dot{q}(h)=v_1\). For example, we can use the formula

$$\begin{aligned} L_d(q_0,v_0, q_1,v_1)=hL\left( \kappa (q_0,v_0,q_1,v_1),\chi (q_0,v_0,q_1,v_1),\zeta (q_0,v_0,q_1,v_1)\right) , \end{aligned}$$

where \(\kappa \), \(\chi \) and \(\zeta \) are functions of \((q_0,v_0,q_1,v_1)\in TQ\times TQ\) which approximate the configuration q(t), the velocity \(\dot{q}(t)\) and the acceleration \(\ddot{q}(t)\), respectively, in terms of the initial and final positions and velocities. We can also consider suitable linear combinations of discrete Lagrangians of this type, for instance, weighted averages of the type

$$\begin{aligned} L_{d}(q_0,v_0,q_1,v_1)=\frac{h}{2}L\left( q_0,v_0,\frac{v_1-v_0}{h}\right) +\frac{h}{2}L\left( q_1,v_1,\frac{v_1-v_0}{h}\right) , \end{aligned}$$

or other combinations.
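The two constructions above can be sketched in Python for a scalar configuration variable. This is only an illustration: the function names are ours, and the endpoint average is written with an overall factor \(h\), which we assume so that, like the one-point formula \(hL(\kappa ,\chi ,\zeta )\), it approximates the action over one step of length \(h\).

```python
def one_point_Ld(L, h, kappa, chi, zeta):
    """L_d = h * L(kappa, chi, zeta), with kappa, chi, zeta functions of
    (q0, v0, q1, v1) approximating position, velocity and acceleration."""
    def Ld(q0, v0, q1, v1):
        args = (q0, v0, q1, v1)
        return h * L(kappa(*args), chi(*args), zeta(*args))
    return Ld


def endpoint_average_Ld(L, h):
    """Average of L at both endpoints, using (v1 - v0) / h as acceleration.
    The overall factor h is an assumption (see the lead-in)."""
    def Ld(q0, v0, q1, v1):
        a = (v1 - v0) / h
        return 0.5 * h * (L(q0, v0, a) + L(q1, v1, a))
    return Ld
```

For instance, `one_point_Ld` could be called with the midpoint \(\kappa =(q_0+q_1)/2\), \(\chi =(q_1-q_0)/h\) and \(\zeta =(v_1-v_0)/h\); this particular choice is ours and is only one of many possibilities.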

For completeness, we will derive the discrete equations for the Lagrangian \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\), but these results are a direct translation of Marsden and West (2001) to our case.

Given the grid \(\{t_{k}=kh\mid k=0,\ldots ,N\}\), \(Nh=T\), define the discrete path space \({\mathcal {P}}_{d}(TQ):=\{(q_{d},v_{d}):\{t_{k}\}_{k=0}^{N}\rightarrow TQ\}\). We will identify a discrete trajectory \((q_{d},v_{d})\in {\mathcal {P}}_{d}(TQ)\) with its image \((q_{d},v_{d})=\{(q_{k},v_{k})\}_{k=0}^{N}\) where \((q_{k},v_{k}):=(q_{d}(t_{k}),v_{d}(t_{k}))\). The discrete action \({\mathcal {A}}_{d}:{\mathcal {P}}_{d}(TQ)\rightarrow {\mathbb {R}}\) along this sequence is calculated by summing the discrete Lagrangian evaluated at each pair of adjacent points of the discrete path, that is,

$$\begin{aligned} {\mathcal {A}}_d(q_{d},v_{d}):=\sum _{k=0}^{N-1}L_d(q_k,v_k,q_{k+1},v_{k+1}). \end{aligned}$$

We would like to point out that the discrete path space is isomorphic to the smooth product manifold which consists of \(N+1\) copies of TQ, the discrete action inherits the smoothness of the discrete Lagrangian, and the tangent space \(T_{(q_{d},v_{d})}{\mathcal {P}}_{d}(TQ)\) at \((q_{d},v_{d})\) is the set of maps \(a_{(q_{d},v_{d})}:\{t_{k}\}_{k=0}^{N}\rightarrow TTQ\) such that \(\tau _{TQ}\circ a_{(q_{d},v_{d})}=(q_{d},v_{d})\) where \(\tau _{TQ}:TTQ\rightarrow TQ\) is the canonical projection.

Hamilton’s principle seeks discrete curves \(\{(q_{k},v_{k})\}_{k=0}^{N}\) that satisfy

$$\begin{aligned} \delta \sum _{k=0}^{N-1}L_{d}(q_{k},v_{k},q_{k+1}, v_{k+1})=0 \end{aligned}$$

for all variations \(\{(\delta q_{k},\delta v_{k})\}_{k=0}^{N}\) vanishing at the endpoints. This is equivalent to the discrete Euler–Lagrange equations

$$\begin{aligned} D_3L_d(q_{k-1},v_{k-1},q_k,v_k)+D_1L_d(q_k,v_k,q_{k+1},v_{k+1})=0,\end{aligned}$$
(11a)
$$\begin{aligned} D_4L_d(q_{k-1},v_{k-1},q_k,v_k)+D_2L_d(q_k,v_k,q_{k+1},v_{k+1})=0, \end{aligned}$$
(11b)

for \(1\le k\le N-1\).
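As a minimal sketch, the left-hand sides of (11a) and (11b) can be evaluated numerically for any scalar discrete Lagrangian by approximating the partial derivatives \(D_iL_d\) with central finite differences. The helper names and the step `eps` are ours; the sample discrete Lagrangian is the one of Example 3.2 in the scalar case.

```python
def partial(f, args, i, eps=1e-6):
    """Central finite-difference approximation of the derivative of f
    with respect to its argument number i (0-based)."""
    up, dn = list(args), list(args)
    up[i] += eps
    dn[i] -= eps
    return (f(*up) - f(*dn)) / (2 * eps)


def del_residual(Ld, prev, cur, nxt):
    """Left-hand sides of (11a) and (11b) at
    prev = (q_{k-1}, v_{k-1}), cur = (q_k, v_k), nxt = (q_{k+1}, v_{k+1})."""
    left = tuple(prev) + tuple(cur)     # arguments of the first term
    right = tuple(cur) + tuple(nxt)     # arguments of the second term
    r_a = partial(Ld, left, 2) + partial(Ld, right, 0)   # D_3 + D_1
    r_b = partial(Ld, left, 3) + partial(Ld, right, 1)   # D_4 + D_2
    return r_a, r_b


def spline_Ld(h):
    """Discrete Lagrangian of Example 3.2, scalar case."""
    def Ld(q0, v0, q1, v1):
        return ((h * v1 + q0 - q1) ** 2 + (q1 - q0 - h * v0) ** 2) / h ** 3
    return Ld
```

A triple of consecutive points belongs to a discrete solution when both residuals vanish up to the finite-difference error. In practice one would solve these equations for the next point with a nonlinear solver, which is not shown here.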

Given a solution \(\{q_{k}^{*},v_{k}^{*}\}_{k\in {\mathbb {Z}}}\) of equations (11) and assuming that the \(2n\times 2n\) matrix

$$\begin{aligned} \left( \begin{array}{cc} D_{13}L_d(q_{k},v_{k},q_{k+1},v_{k+1}) &{} D_{14}L_d(q_{k},v_{k},q_{k+1},v_{k+1}) \\ D_{23}L_d(q_{k},v_{k},q_{k+1},v_{k+1}) &{} D_{24}L_d(q_{k},v_{k},q_{k+1},v_{k+1}) \\ \end{array} \right) \end{aligned}$$

is nonsingular, it is possible to define the (local) discrete flow \(F_{L_{d}}:{\mathcal {U}}_{k}\subset TQ\times TQ\rightarrow TQ\times TQ\) mapping \((q_{k-1},v_{k-1},q_{k},v_{k})\) to \((q_{k},v_{k},q_{k+1},v_{k+1})\) from (11) where \({\mathcal {U}}_{k}\) is a neighborhood of the point \((q_{k-1}^{*},v_{k-1}^{*},q_{k}^{*},v_{k}^{*})\). The symplecticity and momentum preservation of the discrete flow are derived in Marsden and West (2001).

Example 3.2

(Cubic splines) Let \(Q={\mathbb {R}}^{n}\) and \(L:T^{(2)}Q\equiv ({\mathbb {R}}^{n})^{3}\rightarrow {\mathbb {R}}\) be the second-order Lagrangian given by \(L(q,\dot{q},\ddot{q})=\frac{1}{2}\ddot{q}^{2}\).

It is well known that the solutions to the corresponding Euler–Lagrange equations \(q^{(4)}=0\) are the so-called cubic splines \(q(t)=at^{3}+bt^{2}+ct+d\), for \(a,b,c,d\in {\mathbb {R}}^{n}\). We define \(L_{d}:({\mathbb {R}}^{n}\times {\mathbb {R}}^{n})\times ({\mathbb {R}}^{n}\times {\mathbb {R}}^{n})\rightarrow {\mathbb {R}}\) as follows. Write

(12a)
(12b)

Given sufficiently close \((q_0,v_0),(q_1,v_1)\in TQ\) we can use equations (12) to obtain approximations of the acceleration, at the initial and final points, of the exact solution joining these boundary conditions in time h, which we call

$$\begin{aligned} a_0=\frac{2}{h^{2}}(q_1-q_0-hv_0)\hbox { and } a_1=\frac{2}{h^{2}}(q_0-q_1+hv_1). \end{aligned}$$

Then we define

$$\begin{aligned} L_{d}(q_0,v_0,q_1,v_1)= & {} \frac{h}{2}\left( L(q_0,v_0,a_0)+L(q_1,v_1,a_1)\right) \\= & {} \frac{(hv_1+q_0-q_1)^{2}}{h^3}+\frac{(-hv_0-q_0+q_1)^2}{h^{3}}. \end{aligned}$$

Solving the discrete second-order Euler–Lagrange equations for this discrete Lagrangian, the evolution of the discrete trajectory is

$$\begin{aligned}&q_{k+1}=q_{k-1}+2hv_k, \end{aligned}$$
(13a)
$$\begin{aligned}&v_{k+1}=v_{k-1}+4\left( v_{k}-\frac{q_{k}-q_{k-1}}{h}\right) . \end{aligned}$$
(13b)

In the following section we will continue this example and show some simulations.
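The recursion (13) is explicit, so it is straightforward to iterate. The sketch below does so for a scalar configuration variable; since (13) is linear, the same code applies componentwise in \({\mathbb {R}}^n\). The function names are ours, and the trajectory is started from two given consecutive points, which is an assumption of this illustration (a boundary value problem would need an additional solve for the second point).

```python
def step_13(q_prev, v_prev, q_cur, v_cur, h):
    """One step of (13): returns (q_{k+1}, v_{k+1})."""
    q_next = q_prev + 2 * h * v_cur                        # (13a)
    v_next = v_prev + 4 * (v_cur - (q_cur - q_prev) / h)   # (13b)
    return q_next, v_next


def integrate_13(q0, v0, q1, v1, h, n_steps):
    """Iterate (13) starting from the two points (q0, v0) and (q1, v1)."""
    traj = [(q0, v0), (q1, v1)]
    for _ in range(n_steps):
        (qp, vp), (qc, vc) = traj[-2], traj[-1]
        traj.append(step_13(qp, vp, qc, vc, h))
    return traj


# Sampling q(t) = t^2 at t = 0 and t = h gives the two starting points.
h = 0.5
trajectory = integrate_13(0.0, 0.0, h ** 2, 2 * h, h, 4)
```

A quick sanity check is that quadratic curves, which are particular solutions of \(q^{(4)}=0\), are reproduced by (13): substituting \(q_k=(kh)^2\), \(v_k=2kh\) into (13a) and (13b) gives identities.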

3.1 Discrete Legendre Transforms

We define the discrete Legendre transforms \({\mathbb {F}}^{+}L_{d},{\mathbb {F}}^{-}L_{d}:TQ\times TQ\rightarrow T^{*}TQ\), which map the space \(TQ\times TQ\) into \(T^{*}TQ\). These are given by

$$\begin{aligned} {\mathbb {F}}^{-}L_{d}(q_{0},v_0,q_1,v_1)&=\left( q_0,v_0,-D_{1}L_{d}(q_0,v_0,q_1,v_1),-D_{2}L_{d}(q_0,v_0,q_1,v_1)\right) ,\\ {\mathbb {F}}^{+}L_{d}(q_{0},v_0,q_1,v_1)&=\left( q_1,v_1,D_{3}L_{d}(q_0,v_0,q_1,v_1),D_{4}L_{d}(q_0,v_0,q_1,v_1)\right) . \end{aligned}$$

If both discrete fiber derivatives are local diffeomorphisms for nearby \((q_0,v_0)\) and \((q_1,v_1)\), then we say that \(L_d\) is regular.

Using the discrete Legendre transforms the discrete Euler–Lagrange equations (11) can be rewritten as

$$\begin{aligned} {\mathbb {F}}^{-}L_{d}(q_k,v_k,q_{k+1},v_{k+1})={\mathbb {F}}^{+}L_{d}(q_{k-1},v_{k-1},q_k,v_k). \end{aligned}$$

It will be useful to note that

$$\begin{aligned} {\mathbb {F}}^{-}L_{d}\circ F_{L_d}(q_0,v_0,q_1,v_1)&={\mathbb {F}}^{-}L_{d}(q_1,v_1,q_2,v_2)\\&=\left( q_1,v_1,-D_{1}L_{d}(q_1,v_1,q_2,v_2),-D_{2}L_{d}(q_1,v_1,q_2,v_2)\right) \\&=\left( q_1,v_1,D_{3}L_{d}(q_0,v_0,q_1,v_1),D_{4}L_{d}(q_0,v_0,q_1,v_1)\right) \\&={\mathbb {F}}^{+}L_{d}(q_0,v_0,q_1,v_1), \end{aligned}$$

that is,

$$\begin{aligned} {\mathbb {F}}^{+}L_{d}={\mathbb {F}}^{-}L_{d}\circ F_{L_{d}}. \end{aligned}$$
(14)
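The momentum matching expressed by (14) can be checked on the discrete Lagrangian of Example 3.2. The sketch below follows the convention used in the computation leading to (14): \({\mathbb {F}}^{-}L_d\) is based at \((q_0,v_0)\) with momenta \(-D_1L_d\), \(-D_2L_d\), and \({\mathbb {F}}^{+}L_d\) is based at \((q_1,v_1)\) with momenta \(D_3L_d\), \(D_4L_d\). The partial derivatives are written in closed form for the scalar case, and the function names are ours.

```python
from fractions import Fraction


def step_13(prev, cur, h):
    """One step of method (13) on pairs (q, v)."""
    (qp, vp), (qc, vc) = prev, cur
    return (qp + 2 * h * vc, vp + 4 * (vc - (qc - qp) / h))


def legendre_minus(p0, p1, h):
    """(q0, v0, -D1 L_d, -D2 L_d) for the L_d of Example 3.2."""
    (q0, v0), (q1, v1) = p0, p1
    A = h * v1 + q0 - q1
    B = q1 - q0 - h * v0
    return (q0, v0, -2 * (A - B) / h ** 3, 2 * B / h ** 2)


def legendre_plus(p0, p1, h):
    """(q1, v1, D3 L_d, D4 L_d) for the L_d of Example 3.2."""
    (q0, v0), (q1, v1) = p0, p1
    A = h * v1 + q0 - q1
    B = q1 - q0 - h * v0
    return (q1, v1, 2 * (B - A) / h ** 3, 2 * A / h ** 2)


# Exact rational arithmetic avoids any round-off in the comparison.
h = Fraction(1, 2)
prev = (Fraction(0), Fraction(1))
cur = (Fraction(3, 10), Fraction(-1, 5))
nxt = step_13(prev, cur, h)
matched = legendre_plus(prev, cur, h) == legendre_minus(cur, nxt, h)
```

Here `matched` compares the image of \((q_{k-1},v_{k-1},q_k,v_k)\) under \({\mathbb {F}}^{+}L_d\) with the image of \((q_k,v_k,q_{k+1},v_{k+1})\) under \({\mathbb {F}}^{-}L_d\), which is the content of the discrete Euler–Lagrange equations in momentum form.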

Remark 3.3

It is easy to extend this framework to higher-order mechanical systems. Let \(L:T^{(\ell )}Q\rightarrow {\mathbb {R}}\) be a regular higher-order Lagrangian. Given a small enough \(h>0\), the exact discrete Lagrangian \(L_d^{e}:T^{(\ell -1)}Q\times T^{(\ell -1)}Q\rightarrow {\mathbb {R}}\) is defined by

$$\begin{aligned} L_d^{e}\left( q_0^{(0)},q_0^{(1)},\ldots ,q_0^{(\ell -1)};q_1^{(0)},q_1^{(1)},\ldots ,q_1^{(\ell -1)}\right) =\int _{0}^{h}L\left( q(t),\dot{q}(t),\ldots ,q^{(\ell )}(t)\right) \mathrm{{d}}t, \end{aligned}$$

where \(q(t):I\subset {\mathbb {R}}\rightarrow Q\) is the unique solution of the Euler–Lagrange equations for the higher-order Lagrangian L,

$$\begin{aligned} \sum _{j=0}^{\ell }(-1)^j \frac{\mathrm{d}^j}{\mathrm{{d}}t^j}\frac{\partial L}{\partial q^{(j)}}=0, \end{aligned}$$

satisfying the boundary conditions \(q(0)=q_0^{(0)},\dot{q}(0)=q_0^{(1)},\ldots ,q^{(\ell -1)}(0)=q_0^{(\ell -1)},q(h)=q_1^{(0)},\dot{q}(h)=q_1^{(1)},\ldots ,q^{(\ell -1)}(h)=q_1^{(\ell -1)}\).

The exact discrete Lagrangian is actually defined on a neighborhood of the diagonal of \(T^{(\ell -1)}Q\times T^{(\ell -1)}Q\). We take \(L_{d}:T^{(\ell -1)}Q\times T^{(\ell -1)}Q\rightarrow {\mathbb {R}}\) to be an approximation of \(L_{d}^{e}\) in order to construct variational integrators for higher-order mechanical systems.

Given a discrete path \(\{(q_k^{(0)},\dots ,q_k^{(\ell -1)})\in T^{(\ell -1)}Q\}|_{k=0}^N\), the corresponding discrete action is defined as

$$\begin{aligned} {\mathcal {A}}_d:=\sum _{k=0}^{N-1}L_d\left( q_k^{(0)},\dots ,q_k^{(\ell -1)};q_{k+1}^{(0)},\dots ,q_{k+1}^{(\ell -1)}\right) . \end{aligned}$$

Hamilton’s principle seeks discrete paths that satisfy \(\delta {\mathcal {A}}_d=0\) for all variations \(\{(\delta q_k^{(0)},\dots ,\delta q_k^{(\ell -1)})|_{k=0}^N\}\) vanishing at the endpoints \(k=0,N\). This is equivalent to the discrete higher-order Euler–Lagrange equations for \(L_{d}\):

$$\begin{aligned}&D_{i+\ell }L_d\left( q_{k-1}^{(0)},\dots ,q_{k-1}^{(\ell -1)};q_{k}^{(0)},\dots ,q_{k}^{(\ell -1)}\right) \\&\quad +\,D_{i}L_d\left( q_{k}^{(0)},\dots ,q_{k}^{(\ell -1)};q_{k+1}^{(0)},\dots ,q_{k+1}^{(\ell -1)}\right) =0 \end{aligned}$$

for \(i=1,\dots ,\ell \) and \(k=1,\dots ,N-1\).

4 Relationship Between Discrete and Continuous Variational Systems

Let \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) be a regular Lagrangian and, for small enough \(h>0\), consider the exact discrete Lagrangian defined before, that is, a function \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} L_d^{e}\left( q_0,\dot{q}_0,q_1,\dot{q}_1\right) =\int _{0}^{h}L\left( q(t),\dot{q}(t),\ddot{q}(t)\right) \mathrm{{d}}t, \end{aligned}$$

where \(q:[0,h]\rightarrow Q\) is the unique solution of the Euler–Lagrange equations for the second-order Lagrangian L,

$$\begin{aligned} \frac{\mathrm{d}^2}{\mathrm{{d}}t^2}\frac{\partial L}{\partial \ddot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}}+\frac{\partial L}{\partial q}=0 \end{aligned}$$

satisfying the boundary conditions \(q(0)=q_0,q(h)=q_1,\dot{q}(0)=\dot{q}_0\) and \(\dot{q}(h)=\dot{q}_1\).

The Legendre transformation associated to L is defined to be the map \({\mathbb {F}}L:T^{(3)}Q\rightarrow T^{*}TQ\) given by (see León and Rodrigues (1985))

$$\begin{aligned} {\mathbb {F}}L(q,\dot{q},\ddot{q},{q}^{(3)})=\left( q,\dot{q},\frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial {L}}{\partial \ddot{q}},\frac{\partial L}{\partial \ddot{q}}\right) . \end{aligned}$$

We will see that there is a special relationship between the Legendre transform of a regular Lagrangian and the discrete Legendre transforms of the corresponding exact discrete Lagrangian \(L_d^{e}\).

Theorem 4.1

Let \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) be a regular Lagrangian and \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\), the corresponding exact discrete Lagrangian. Then L and \(L_d^{e}\) have Legendre transformations related by

$$\begin{aligned} {\mathbb {F}}^{-}L_d^{e}\left( q(0),\dot{q}(0),q(h),\dot{q}(h)\right)&={\mathbb {F}}L\left( q(0),\dot{q}(0),\ddot{q}(0), {q}^{(3)}(0)\right) \\ {\mathbb {F}}^{+}L_d^{e}\left( q(0),\dot{q}(0),q(h),\dot{q}(h)\right)&={\mathbb {F}}L\left( q(h),\dot{q}(h),\ddot{q}(h), {q}^{(3)}(h)\right) , \end{aligned}$$

where q(t) is a solution of the second-order Euler–Lagrange equations.

Proof

We begin by computing the derivatives of \(L_d^{e}\).

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial q_0}&=\int _{0}^{h}\left( \frac{\partial L}{\partial q}\frac{\partial q}{\partial q_0}+\frac{\partial L}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial q_0}+\frac{\partial L}{\partial \ddot{q}}\frac{\partial \ddot{q}}{\partial q_0}\right) \mathrm{{d}}t\\&=\int _0^{h}\left( \frac{\partial L}{\partial q}\frac{\partial q}{\partial q_0}+\frac{\partial L}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial q_0}-\left( \frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial \dot{q}}{\partial q_0}\right) \mathrm{{d}}t+\left( \frac{\partial L}{\partial \ddot{q}}\frac{\partial \dot{q}}{\partial q_0}\right) \Big |_{0}^{h}\\&=\int _0^{h}\left( \frac{\partial L}{\partial q}\frac{\partial q}{\partial q_0}+\left( \frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial \dot{q}}{\partial q_0}\right) \mathrm{{d}}t, \end{aligned}$$

where we have used integration by parts and the fact that

$$\begin{aligned} \frac{\partial \dot{q}}{\partial q_0}(0)=0\hbox { and } \frac{\partial \dot{q}}{\partial q_0}(h)=0. \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial q_0}=\left( \left( \frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial {q}}{\partial q_0}\right) \Big |_{0}^{h}+\int _{0}^{h}\left( \frac{\partial L}{\partial q}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}}+\frac{\mathrm{d}^2}{\mathrm{{d}}t^2}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial q}{\partial q_0}\,\mathrm{{d}}t. \end{aligned}$$

Since q(t) is a solution of the Euler–Lagrange equations for \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\), the last term is zero. Therefore,

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial q_0}=\left( \left( \frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial q}{\partial q_0}\right) \Big |_{0}^{h}= \left( -\frac{\partial L}{\partial \dot{q}}+\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) (q(0),\dot{q}(0),\ddot{q}(0), q^{(3)}(0)), \end{aligned}$$
(15)

because

$$\begin{aligned} \frac{\partial q}{\partial q_0}(0)={\text {Id}} \hbox { and } \frac{\partial q}{\partial q_0}(h)=0. \end{aligned}$$

On the other hand,

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial \dot{q}_0}&=\int _{0}^{h}\left( \frac{\partial L}{\partial q}\frac{\partial q}{\partial \dot{q}_0}+\frac{\partial L}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial \dot{q}_0}+\frac{\partial L}{\partial \ddot{q}}\frac{\partial \ddot{q}}{\partial \dot{q}_0}\right) \mathrm{{d}}t\\&=\int _0^{h}\left( \frac{\partial L}{\partial q}\frac{\partial q}{\partial \dot{q}_0}+\frac{\partial L}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial \dot{q}_0}-\left( \frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial \dot{q}}{\partial \dot{q}_0}\right) \mathrm{{d}}t+\left( \frac{\partial L}{\partial \ddot{q}}\frac{\partial \dot{q}}{\partial \dot{q}_0}\right) \Big |_{0}^{h}\\&=\int _0^{h}\left( \frac{\partial L}{\partial q}\frac{\partial q}{\partial \dot{q}_0}+\left( \frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial \dot{q}}{\partial \dot{q}_0}\right) \mathrm{{d}}t+\left( \frac{\partial L}{\partial \ddot{q}}\frac{\partial \dot{q}}{\partial \dot{q}_0}\right) \Big |_{0}^{h}\\&=\int _{0}^{h}\left( \frac{\partial L}{\partial q}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}}+\frac{\mathrm{d}^2}{\mathrm{{d}}t^2}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial q}{\partial \dot{q}_0}\mathrm{{d}}t+\frac{\partial L}{\partial \ddot{q}}\frac{\partial \dot{q}}{\partial \dot{q}_0}\Big |_{0}^{h}+\left( \frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \frac{\partial q}{\partial \dot{q}_0}\Big |_{0}^{h}. \end{aligned}$$

Since q(t) is a solution of the Euler–Lagrange equations, the first term is zero, and using that

$$\begin{aligned} \frac{\partial \dot{q}}{\partial \dot{q}_0}(0)={\text {Id}},\quad \frac{\partial \dot{q}}{\partial \dot{q}_0}(h)=0,\quad \frac{\partial q}{\partial \dot{q}_0}(0)=0, \hbox { and } \frac{\partial q}{\partial \dot{q}_0}(h)=0, \end{aligned}$$

we have

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial \dot{q}_0}=-\frac{\partial L}{\partial \ddot{q}}(q(0),\dot{q}(0),\ddot{q}(0)). \end{aligned}$$

Therefore,

$$\begin{aligned} {\mathbb {F}}^{-}L_d^{e}(q(0),\dot{q}(0),q(h),\dot{q}(h))&=\Big (q(0), \dot{q}(0),-\frac{\partial L_d^{e}}{\partial {q}_0}(q(0),\dot{q}(0),q(h),\dot{q}(h)),\\&\qquad -\frac{\partial L_d^{e}}{\partial \dot{q}_0}(q(0),\dot{q}(0),q(h),\dot{q}(h))\Big )\\&={\mathbb {F}}L(q(0),\dot{q}(0),\ddot{q}(0), q^{(3)}(0)). \end{aligned}$$

With similar arguments, we can also prove that

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial q_1}=\left( \frac{\partial L}{\partial \dot{q}}-\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}}\right) \left( q(h),\dot{q}(h),\ddot{q}(h), q^{(3)}(h)\right) \end{aligned}$$

and

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial \dot{q}_1}=\frac{\partial L}{\partial \ddot{q}}(q(h),\dot{q}(h),\ddot{q}(h)), \end{aligned}$$

and in consequence,

$$\begin{aligned} {\mathbb {F}}^{+}L_d^{e}\left( q(0),\dot{q}(0),q(h),\dot{q}(h)\right) ={\mathbb {F}}L\left( q(h),\dot{q}(h),\ddot{q}(h), q^{(3)}(h)\right) . \end{aligned}$$

\(\square \)
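Theorem 4.1 can be checked by hand in the cubic spline case \(L=\frac{1}{2}\ddot{q}^2\), where \({\mathbb {F}}L(q,\dot{q},\ddot{q},q^{(3)})=(q,\dot{q},-q^{(3)},\ddot{q})\) and the exact discrete Lagrangian has the closed form given in Example 4.7. The Python sketch below does this for a scalar cubic with rational coefficients. Because \(L_d^e\) is quadratic in its arguments, central differences in exact rational arithmetic give its partial derivatives without error. The function names and the sample coefficients are ours.

```python
from fractions import Fraction


def Lde(q0, v0, q1, v1, h):
    """Exact discrete Lagrangian for L = q''^2 / 2 (closed form of Example 4.7)."""
    d = q0 - q1
    return 6 * d * d / h ** 3 + 6 * d * (v0 + v1) / h ** 2 \
        + 2 * (v0 * v0 + v0 * v1 + v1 * v1) / h


def D(i, args, h, eps=Fraction(1, 8)):
    """Partial derivative of Lde in slot i (0-based); exact for quadratics."""
    up, dn = list(args), list(args)
    up[i] += eps
    dn[i] -= eps
    return (Lde(*up, h) - Lde(*dn, h)) / (2 * eps)


def cubic_jet(a, b, c, d, t):
    """(q, q', q'', q''') for q(t) = a t^3 + b t^2 + c t + d."""
    return (a * t ** 3 + b * t ** 2 + c * t + d,
            3 * a * t ** 2 + 2 * b * t + c,
            6 * a * t + 2 * b,
            6 * a)


def FL(q, dq, ddq, dddq):
    """Continuous Legendre transform for L = q''^2 / 2."""
    return (q, dq, -dddq, ddq)


coeffs = (Fraction(2), Fraction(-1), Fraction(3), Fraction(1))
h = Fraction(1, 2)
jet0, jeth = cubic_jet(*coeffs, 0), cubic_jet(*coeffs, h)
args = (jet0[0], jet0[1], jeth[0], jeth[1])
F_minus = (args[0], args[1], -D(0, args, h), -D(1, args, h))
F_plus = (args[2], args[3], D(2, args, h), D(3, args, h))
```

In this example `F_minus` should coincide with `FL(*jet0)` and `F_plus` with `FL(*jeth)`, as the theorem states.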

In what follows we will study the relation between the regularity of the continuous Lagrangian, given by the Hessian matrix

$$\begin{aligned} {\mathcal W}=\left( \frac{\partial ^2 L}{\partial \ddot{q}\; \partial \ddot{q}}\right) \end{aligned}$$

and the regularity condition corresponding to the exact discrete Lagrangian \(L_d^e:TQ\times TQ\rightarrow {{\mathbb {R}}}\)

$$\begin{aligned} {\mathcal W}_d=\left( \begin{array}{rr} D_{13}L_d^e&{}D_{14} L_d^e\\ D_{23}L_d^e&{}D_{24}L_d^e \end{array} \right) . \end{aligned}$$

For the next theorem, we restrict ourselves to Lagrangians that can be written locally as

$$\begin{aligned} L(q, \dot{q}, \ddot{q})= \frac{1}{2}g_{ij}(q) \ddot{q}^i \ddot{q}^j+ \ddot{q}^if_i(q, \dot{q})+ V(q, \dot{q}), \end{aligned}$$
(16)

where \((g_{ij}(q))\) is a regular matrix for all q. It is also possible to write this condition intrinsically by using a metric, a connection, a one-form and a function. This covers the kind of Lagrangians that appear in interpolation problems (Gay-Balmaz et al. 2012a) and in optimal control problems with cost functionals of the form \(\frac{1}{2}\int _0^T\Vert u\Vert ^2\mathrm{{d}}t\), where u represents the control force applied to a system having a (first-order) Lagrangian of mechanical type (see Sect. 5).

Theorem 4.2

Let \(L:T^{(2)}Q \rightarrow {\mathbb {R}}\) be a regular Lagrangian of the type (16). For small enough \(h>0\), the corresponding exact discrete Lagrangian \(L_d^e:TQ\times TQ\rightarrow {{\mathbb {R}}}\) is also regular.

Proof

We will work locally. Given \(q_0\), \( \dot{q}_0\), \(q_1\), \( \dot{q}_1\), consider the curve q(t) that solves the Euler–Lagrange equations with those boundary values, as in the definition of \(L_d^e\). Using the Taylor expansions for q(t) and \( \dot{q}(t)\), we can write

for \(h\rightarrow 0\). By differentiating these expressions with respect to the parameters \(q_0\) and \(\dot{q}_0\), we get two systems of equations from which we find

Analogously,

Let us compute \(D_{13}L_d^e\). Denote by F the right-hand side of (15), so

$$\begin{aligned} \frac{\partial L_d^{e}}{\partial q_0^i}(q(0),\dot{q}(0),q(h),\dot{q}(h))&= \left( -\frac{\partial L}{\partial \dot{q}^i}+\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \ddot{q}^i}\right) \left( q(0),\dot{q}(0),\ddot{q}(0), q^{(3)}(0)\right) \\&=F_i\left( q(0),\dot{q}(0),\ddot{q}(0), q^{(3)}(0)\right) . \end{aligned}$$

Recall that \(q(0),\dot{q}(0),\ddot{q}(0), q^{(3)}(0)\) are obtained as the initial conditions for the higher-order Euler–Lagrange equations that correspond to the boundary conditions \(q(0),\dot{q}(0),q(h),\dot{q}(h)\). We have

$$\begin{aligned} F_i=-\frac{\partial L}{\partial \dot{q}^i}+ \frac{\partial ^2 L}{\partial q^j \partial \ddot{q}^i} \dot{q}^j + \frac{\partial ^2 L}{\partial \dot{q}^j \partial \ddot{q}^i} \ddot{q}^j + \frac{\partial ^2 L}{\partial \ddot{q}^j \partial \ddot{q}^i} q^{(3)j}. \end{aligned}$$

Then

In the expression above, the derivatives are evaluated at the arguments corresponding to time 0 for each function. It is important to note that the first factor involves \(\ddot{q}(0)\) and \(q^{(3)}(0)\), which can blow up for \(h \rightarrow 0\), even in the simple case of cubic splines. However, for L of the type (16) we have

$$\begin{aligned} \frac{\partial ^2 L}{\partial \ddot{q}^k\partial \dot{q}^i}=\frac{\partial f_k}{\partial \dot{q}^i},\qquad \frac{\partial ^2 L}{\partial \dot{q}^k\partial \ddot{q}^i}=\frac{\partial f_i}{\partial \dot{q}^k},\qquad \frac{\mathrm{d}{\mathcal {W}}_{ik}}{\mathrm{{d}}t}= \frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial ^2 L}{\partial \ddot{q}^k\partial \ddot{q}^i}=\frac{\mathrm{d}}{\mathrm{{d}}t}g_{ik}=\frac{\partial g_{ik}}{\partial q^l} \dot{q}^l. \end{aligned}$$

These expressions do not contain \(\ddot{q}\) or \(q^{(3)}\), so they remain bounded for \(h \rightarrow 0\). Therefore,

The remaining derivatives in \({\mathcal {W}}_d\) can be computed without using the special form (16) of the Lagrangian.

Seeing \({\mathcal {W}}_d\) as a block matrix, a well-known result from linear algebra leads us to

That is, for small enough h, if L is regular then \(L_{d}^{e}\) is regular. \(\square \)
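For the cubic spline Lagrangian \(L=\frac{1}{2}\ddot{q}^2\), which is of the type (16) with \(g=1\), \(f=0\), \(V=0\), the matrix \({\mathcal W}_d\) can be computed directly from the closed form of \(L_d^e\) given in Example 4.7. The following sketch does so in the scalar case with exact mixed differences (exact because \(L_d^e\) is quadratic). The function names and the evaluation point are ours.

```python
from fractions import Fraction


def Lde(q0, v0, q1, v1, h):
    """Exact discrete Lagrangian for L = q''^2 / 2 (closed form of Example 4.7)."""
    d = q0 - q1
    return 6 * d * d / h ** 3 + 6 * d * (v0 + v1) / h ** 2 \
        + 2 * (v0 * v0 + v0 * v1 + v1 * v1) / h


def mixed(f, args, i, j, eps=Fraction(1, 4)):
    """Mixed second derivative of f in slots i != j; exact for quadratics."""
    def shifted(si, sj):
        x = list(args)
        x[i] += si * eps
        x[j] += sj * eps
        return f(*x)
    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps * eps)


def regularity_matrix(h, point):
    """W_d = [[D13, D14], [D23, D24]] of Lde at the given point."""
    f = lambda q0, v0, q1, v1: Lde(q0, v0, q1, v1, h)
    return [[mixed(f, point, 0, 2), mixed(f, point, 0, 3)],
            [mixed(f, point, 1, 2), mixed(f, point, 1, 3)]]


h = Fraction(1, 2)
point = (Fraction(1), Fraction(3), Fraction(5, 2), Fraction(7, 2))
W_d = regularity_matrix(h, point)
det_W_d = W_d[0][0] * W_d[1][1] - W_d[0][1] * W_d[1][0]
```

Differentiating the closed form by hand gives \(D_{13}L_d^e=-12/h^3\), \(D_{14}L_d^e=6/h^2\), \(D_{23}L_d^e=-6/h^2\) and \(D_{24}L_d^e=2/h\), so that \(\det {\mathcal W}_d=12/h^4\ne 0\) in this scalar example, in agreement with the theorem.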

In what follows we denote by \(({\textit{TQ}}\times {\textit{TQ}})_2\) the subset of \(({\textit{TQ}}\times {\textit{TQ}})\times ({\textit{TQ}}\times {\textit{TQ}})\) given by

$$\begin{aligned} ({\textit{TQ}}\times {\textit{TQ}})_2:=\{(q_0,\dot{q}_0,q_1,\dot{q}_1,\tilde{q}_1,\dot{\tilde{q}}_1,q_2,\dot{q}_2)\mid \bar{\pi }_2(q_0,\dot{q}_0,q_1,\dot{q}_1)=\bar{\pi }_1(\tilde{q}_1,\dot{\tilde{q}}_1,q_2,\dot{q}_2)\}. \end{aligned}$$

If \(L:T^{(2)}Q\rightarrow {\mathbb {R}}\) is a regular Lagrangian then the Euler–Lagrange equations for L give rise to a system of explicit fourth-order differential equations

$$\begin{aligned} q^{(4)}=\Psi (q,\dot{q},\ddot{q},q^{(3)}). \end{aligned}$$

Therefore, for a given h, it is possible to define the following map (see Agarwal (1986))

$$\begin{aligned} \Psi _{L}^{h}:T^{(3)}Q\rightarrow T^{(3)}Q \end{aligned}$$

which maps \((q(0),\dot{q}(0),\ddot{q}(0),q^{(3)}(0))\in T^{(3)}Q\) into \((q(h),\dot{q}(h),\ddot{q}(h),q^{(3)}(h))\in T^{(3)}Q\). Therefore, from Theorem 4.1 we deduce the commutativity of the diagram in Fig. 1.

Fig. 1 Correspondence between the discrete Legendre transforms and the continuous Hamiltonian flow

Definition 4.3

The discrete Hamiltonian flow is defined by \(\widetilde{F}_{L_d}:T^{*}TQ\rightarrow T^{*}TQ\) as

$$\begin{aligned} \widetilde{F}_{L_d}={\mathbb {F}}^{-}L_d\circ F_{L_d}\circ ({\mathbb {F}}^{-}L_d)^{-1}. \end{aligned}$$
(17)

Alternatively, it can also be defined as \(\widetilde{F}_{L_d}={\mathbb {F}}^{+}L_d\circ F_{L_d}\circ ({\mathbb {F}}^{+}L_d)^{-1}\).

Theorem 4.4

The diagram in Fig. 2 is commutative.

Fig. 2 Correspondence between the discrete Lagrangian and the discrete Hamiltonian maps

Proof

The central triangle is (14). The parallelogram on the left-hand side is commutative by (17), so the triangle on the left is commutative. The triangle on the right is the same as the triangle on the left, with shifted indices. Then the parallelogram on the right-hand side is commutative, which gives the equivalence stated in the definition of the discrete Hamiltonian flow. \(\square \)

Corollary 4.5

The following definitions of the discrete Hamiltonian map are equivalent

$$\begin{aligned} \widetilde{F}_{L_d}&={\mathbb {F}}^{+}L_d\circ F_{L_d}\circ ({\mathbb {F}}^{+}L_d)^{-1},\\ \widetilde{F}_{L_d}&={\mathbb {F}}^{-}L_d\circ F_{L_d}\circ ({\mathbb {F}}^{-}L_d)^{-1},\\ \widetilde{F}_{L_d}&={\mathbb {F}}^{+}L_d\circ ({\mathbb {F}}^{-}L_d)^{-1}, \end{aligned}$$

and have the coordinate expression \(\widetilde{F}_{L_{d}}:(q_0,\dot{q}_{0},p_0,\tilde{p}_0)\mapsto (q_1,\dot{q}_1,p_1,\tilde{p}_1)\), where we use the notation

$$\begin{aligned} p_0&=-D_1L_d(q_0,\dot{q}_0,q_1,\dot{q}_1),\\ \tilde{p}_0&=-D_2L_d(q_0,\dot{q}_0,q_1,\dot{q}_1),\\ p_1&=D_3L_d(q_0,\dot{q}_0,q_1,\dot{q}_1),\\ \tilde{p}_1&=D_4L_d(q_0,\dot{q}_0,q_1,\dot{q}_1). \end{aligned}$$
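For the exact discrete Lagrangian of the cubic spline example (closed form in Example 4.7), the coordinate expression of \(\widetilde{F}_{L_d}\) can be made fully explicit, because the equations \(p_0=-D_1L_d\), \(\tilde{p}_0=-D_2L_d\) are linear in \((q_1,\dot{q}_1)\). The sketch below is a scalar illustration with names of our choosing. It solves this \(2\times 2\) system by Cramer's rule and then evaluates \(p_1=D_3L_d\), \(\tilde{p}_1=D_4L_d\).

```python
from fractions import Fraction


def hamiltonian_step(q0, v0, p0, pt0, h):
    """Map (q0, v0, p0, pt0) to (q1, v1, p1, pt1) for the exact discrete
    Lagrangian of L = q''^2 / 2; pt stands for p-tilde."""
    # Linear system for (q1, v1) from p0 = -D1 L_d and pt0 = -D2 L_d.
    a11, a12 = 12 / h ** 3, -6 / h ** 2
    a21, a22 = 6 / h ** 2, -2 / h
    r1 = p0 + 12 * q0 / h ** 3 + 6 * v0 / h ** 2
    r2 = pt0 + 6 * q0 / h ** 2 + 4 * v0 / h
    det = a11 * a22 - a12 * a21
    q1 = (r1 * a22 - a12 * r2) / det
    v1 = (a11 * r2 - a21 * r1) / det
    # Momenta at the new point.
    d = q0 - q1
    p1 = -12 * d / h ** 3 - 6 * (v0 + v1) / h ** 2   # D3 L_d
    pt1 = 6 * d / h ** 2 + 2 * (v0 + 2 * v1) / h     # D4 L_d
    return q1, v1, p1, pt1


# Compare with the exact flow of q'''' = 0 along q(t) = a t^3 + b t^2 + c t + d,
# using (q, q', p, pt) = (q, q', -q''', q'') as in the Legendre transform.
a, b, c, d = Fraction(2), Fraction(-1), Fraction(3), Fraction(1)
h = Fraction(1, 2)
start = (d, c, -6 * a, 2 * b)
expected = (a * h ** 3 + b * h ** 2 + c * h + d,
            3 * a * h ** 2 + 2 * b * h + c,
            -6 * a,
            6 * a * h + 2 * b)
result = hamiltonian_step(*start, h)
```

In this example the discrete Hamiltonian map reproduces the time-\(h\) flow of the cubic exactly, which is what one expects from an exact discrete Lagrangian.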

Combining Theorem 4.1 with the diagram in Fig. 2 gives the commutative diagram shown in Fig. 3 for the exact discrete Lagrangian.

Fig. 3 Correspondence between the exact discrete Lagrangian and the continuous Hamiltonian flow

Here, \(F_{H}^{h}\) denotes the flow of the Hamiltonian vector field \(X_{H}\) associated with the Hamiltonian \(H:T^{*}TQ\rightarrow {\mathbb {R}}\) given by \(H=E_{L}\circ ({\mathbb {F}}L)^{-1}\) where \(E_{L}:T^{(3)}Q\rightarrow {\mathbb {R}}\) denotes the energy function associated to L (see León and Rodrigues 1985).

Theorem 4.6

Under these conditions we have that \(F_H^h = \widetilde{F}_{L_d^e}\).

Example 4.7

(Cubic splines (cont.)) Recall that in this example \(Q={\mathbb {R}}^n\) and \(L= \frac{1}{2}\ddot{q}^2\). Since the exact solutions of the second-order Euler–Lagrange equations for L can be found explicitly, it is easy to show that the exact discrete Lagrangian is

$$\begin{aligned} L_d^e(q_0,v_0,q_1,v_1)=\frac{6}{h^3}(q_0-q_1)^2+\frac{6}{h^2}(q_0-q_1)(v_0+v_1)+\frac{2}{h}\left( v_0^2+v_0v_1+v_1^2\right) . \end{aligned}$$

From the corresponding discrete second-order Euler–Lagrange equation, the evolution is

$$\begin{aligned} q_{k+1}&=5q_{k-1}-4q_k+2h(v_{k-1}+2v_k),\\ v_{k+1}&=v_{k-1}+\frac{2}{h}(q_{k-1}-2q_k+q_{k+1}). \end{aligned}$$

It is interesting to note that both this exact method and method (13) preserve the quantity

$$\begin{aligned} \varphi (q_k,v_k,q_{k+1},v_{k+1})=\frac{q_{k+1}-q_k}{h}-\frac{v_k+v_{k+1}}{2}. \end{aligned}$$

A simulation for method (13) is shown in Fig. 4.

Fig. 4 Left: simulation of the method (13) with \(q_0=(0,0)\), \(v_0=(10,10)\), \(q_N=(10,0)\), \(v_N=(10,20)\), \(N=21\), depicting the computed points and velocities in the xy-plane (velocities are scaled). Right: error in position and velocity for different values of h

4.1 Variational Error Analysis

Now we restate the results of Patrick (2006) and Marsden and West (2001) for the particular case of a Lagrangian \(L_d:TQ\times TQ\rightarrow {\mathbb {R}}\).

Definition 4.8

Let \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\) be a discrete Lagrangian. We say that \(L_{d}\) is a discretization of order r if there exist an open subset \(U_{1}\subset T^{(2)}Q\) with compact closure and constants \(C_1>0\), \(h_1>0\) so that

$$\begin{aligned} |L_{d}(q(0),\dot{q}(0),q(h),\dot{q}(h),h)-L_{d}^{e}(q(0),\dot{q}(0),q(h),\dot{q}(h),h)|\le C_{1}h^{r+1} \end{aligned}$$

for all solutions q(t) of the second-order Euler–Lagrange equations with initial conditions \((q_0,\dot{q}_0,\ddot{q}_0)\in U_1\) and for all \(h\le h_1\).

Following Marsden and West (2001) and Patrick and Cuell (2009), we have the next result about the order of our variational integrator.

Theorem 4.9

If \(\widetilde{F}_{L_d}\) is the evolution map of an order r discretization \(L_d:TQ\times TQ\rightarrow {\mathbb {R}}\) of the exact discrete Lagrangian \(L_d^{e}:TQ\times TQ\rightarrow {\mathbb {R}}\), then

$$\begin{aligned} \widetilde{F}_{L_d}=\widetilde{F}_{L_{d}^{e}}+{\mathcal {O}}(h^{r+1}). \end{aligned}$$

In other words, \(\widetilde{F}_{L_d}\) gives an integrator of order r for \(\widetilde{F}_{L_{d}^{e}}=F_{H}^{h}\).

Note that, given a discrete Lagrangian \(L_{d}:TQ\times TQ\rightarrow {\mathbb {R}}\), its order can be calculated by expanding \(L_d(q(0),\dot{q}(0),q(h),\dot{q}(h), h)\) in a Taylor series in h and comparing it with the same expansion for the exact discrete Lagrangian. If the series agree up to r terms, then the discrete Lagrangian is of order r.
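The order can also be estimated numerically by halving h and observing how fast the error of Definition 4.8 decays. The sketch below does this for \(L=\frac{1}{2}\ddot{q}^2\) with \(n=1\), comparing the exact discrete Lagrangian of Example 4.7 with a symmetric two-term discretization in which the accelerations are taken from the Taylor expansions \(q_1\approx q_0+hv_0+\frac{h^2}{2}a_0\) and \(q_0\approx q_1-hv_1+\frac{h^2}{2}a_1\). The initial data and the function names are illustrative assumptions.

```python
import math


def exact_Ld(q0, v0, q1, v1, h):
    """Exact discrete Lagrangian of L = (1/2) * qddot**2 (Example 4.7)."""
    return (6 / h**3 * (q0 - q1)**2
            + 6 / h**2 * (q0 - q1) * (v0 + v1)
            + 2 / h * (v0**2 + v0 * v1 + v1**2))


def approx_Ld(q0, v0, q1, v1, h):
    """Two-term discretization with Taylor-based accelerations."""
    a_start = 2 / h**2 * (q1 - q0 - h * v0)
    a_end = 2 / h**2 * (q0 - q1 + h * v1)
    return h / 2 * (0.5 * a_start**2) + h / 2 * (0.5 * a_end**2)


def endpoint_data(h, q0=0.3, v0=0.5, a0=1.0, j0=2.0):
    """Endpoints of the cubic solution with the given (assumed) initial data."""
    q1 = q0 + v0 * h + a0 * h**2 / 2 + j0 * h**3 / 6
    v1 = v0 + a0 * h + j0 * h**2 / 2
    return q0, v0, q1, v1


def error(h):
    """Absolute difference between the two discrete Lagrangians on a solution."""
    data = endpoint_data(h)
    return abs(approx_Ld(*data, h) - exact_Ld(*data, h))


def observed_exponent(h):
    """Estimate p in error(h) ~ C * h**p by halving the step."""
    return math.log(error(h) / error(h / 2), 2)


exponent = observed_exponent(0.1)
```

The estimated exponent corresponds to \(r+1\) in Definition 4.8, so a value close to 3 would indicate a discretization of order 2 for this particular Lagrangian.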

5 Application to Optimal Control of Mechanical Systems

In this section we will study how to apply our variational integrator to optimal control problems. We will study optimal control problems for fully actuated mechanical systems, and we will show how our methods can be applied to the optimal control of a robotic leg.

In the following we will assume that all the control systems are controllable, that is, for any two points \(q_0\) and \(q_f\) in the configuration space Q, there exists an admissible control u(t) defined on some interval [0, T] such that the system with initial condition \(q_0\) reaches the point \(q_f\) at time T (see Bloch 2003; Bullo and Lewis 2005 for example).

5.1 Optimal Control of Fully Actuated Systems

Let \(L:TQ\rightarrow {\mathbb {R}}\) be a regular Lagrangian and take local coordinates \((q^{A})\) on Q where \(1\le A\le n\). For this Lagrangian the controlled Euler–Lagrange equations are

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}^{A}}-\frac{\partial L}{\partial q^{A}}=u_{A}, \end{aligned}$$
(18)

where \(u=(u_{A})\in U\) are the control parameters and the control set \(U\subset {\mathbb {R}}^{n}\) is an open subset of \({\mathbb {R}}^n\).

The optimal control problem consists in finding a trajectory of the state variables and control inputs \((q^{A}(t),u_{A}(t))\) that satisfies (18), meets given initial and final conditions \((q^{A}(t_0),\dot{q}^{A}(t_0))\) and \((q^{A}(t_f),\dot{q}^{A}(t_f))\), respectively, and minimizes the cost function

$$\begin{aligned} {\mathcal {A}}=\int _{t_0}^{t_f} C(q^{A},\dot{q}^{A},u_{A})\mathrm{{d}}t, \end{aligned}$$

where \(C:TQ\times U\rightarrow {\mathbb {R}}\).

From (18) we can rewrite the cost function as a second-order Lagrangian \(\widetilde{L}:T^{(2)}Q\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} \widetilde{L}(q^{A},\dot{q}^{A},\ddot{q}^{A})=C\left( q^{A},\dot{q}^{A},\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}^{A}}-\frac{\partial L}{\partial q^{A}}\right) \end{aligned}$$

replacing the controls by the Euler–Lagrange equations in the cost function (see Bloch 2003 for example).
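As a simple illustration of this substitution (an assumed example, not one treated in the text), take \(Q={\mathbb {R}}\), a mechanical Lagrangian with mass \(m\) and potential \(V\), and a quadratic cost in the control:

$$\begin{aligned} L(q,\dot{q})&=\frac{1}{2}m\dot{q}^{2}-V(q),\qquad C(q,\dot{q},u)=\frac{1}{2}u^{2},\\ u&=\frac{\mathrm{d}}{\mathrm{{d}}t}\frac{\partial L}{\partial \dot{q}}-\frac{\partial L}{\partial q}=m\ddot{q}+V'(q),\\ \widetilde{L}(q,\dot{q},\ddot{q})&=\frac{1}{2}\left( m\ddot{q}+V'(q)\right) ^{2}. \end{aligned}$$

For \(m=1\) and \(V=0\) this reduces to the cubic spline Lagrangian \(\frac{1}{2}\ddot{q}^2\) of Example 4.7.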

Suppose that \(Q={\mathbb {R}}^n\). Then we can define a discretization of the Lagrangian \(\widetilde{L}:T^{(2)}Q\rightarrow {\mathbb {R}}\) by a discrete Lagrangian \(\widetilde{L}_d:TQ\times TQ\rightarrow {\mathbb {R}}\),

$$\begin{aligned} \widetilde{L}_d(q_{k},v_k,q_{k+1},v_{k+1})&=\frac{h}{2}\widetilde{L}\left( \frac{q_k+q_{k+1}}{2},\frac{v_k+v_{k+1}}{2},\frac{2}{h^2}(q_{k+1}-q_{k}-hv_{k})\right) \\&\quad \,\,+\,\frac{h}{2}\widetilde{L}\left( \frac{q_k+q_{k+1}}{2},\frac{v_k+v_{k+1}}{2},\frac{2}{h^2}(q_{k}-q_{k+1}+hv_{k+1})\right) . \end{aligned}$$

In the first term, we have computed an approximate value of the acceleration \(a_k\) by using the Taylor expansion \(q_{k+1}\approx q_k+hv_k+\frac{h^2}{2}a_k\). For the second term, we have approximated \(a_{k+1}\) using \(q_{k}\approx q_{k+1}-hv_{k+1}+\frac{h^2}{2} a_{k+1}\), as in Example 3.2.

Other natural possibilities for \(\widetilde{L}_d\) are, for instance,

$$\begin{aligned} \widetilde{L}_d(q_k,v_k,q_{k+1},v_{k+1})&=h\widetilde{L}\left( \frac{q_k+q_{k+1}}{2},\frac{q_{k+1}-q_k}{h},\frac{v_{k+1}-v_k}{h}\right) \end{aligned}$$

or

$$\begin{aligned} \widetilde{L}_d(q_k,v_k,q_{k+1},v_{k+1})&=\frac{h}{2}\widetilde{L}\left( q_k,v_k,\frac{v_{k+1}-v_{k}}{h}\right) +\frac{h}{2}\widetilde{L}\left( q_{k+1},v_{k+1},\frac{v_{k+1}-v_{k}}{h}\right) . \end{aligned}$$

Applying the results given in Sect. 3, we know that the minimizers of the cost function are obtained by solving the discrete second-order Euler–Lagrange equations

$$\begin{aligned} D_1\widetilde{L}_d(q_k,v_k,q_{k+1},v_{k+1})+D_3\widetilde{L}_d(q_{k-1},v_{k-1},q_k,v_k)&=0,\\ D_2\widetilde{L}_d(q_k,v_k,q_{k+1},v_{k+1})+D_4\widetilde{L}_d(q_{k-1},v_{k-1},q_{k},v_k)&=0. \end{aligned}$$
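A minimal sketch of how the residuals of these two equations might be evaluated for a scalar configuration is given below. The partial derivatives \(D_i\) are approximated by central finite differences, which is an assumption made for brevity (analytic or automatic derivatives would normally be preferable). As a test case the sketch uses the exact discrete Lagrangian of \(\frac{1}{2}\ddot{q}^2\) and three samples of the cubic \(q(t)=t^3-t\); all names and values are illustrative.

```python
def exact_Ld(q0, v0, q1, v1, h):
    """Exact discrete Lagrangian of L = (1/2) * qddot**2, used as a test case."""
    return (6 / h**3 * (q0 - q1)**2
            + 6 / h**2 * (q0 - q1) * (v0 + v1)
            + 2 / h * (v0**2 + v0 * v1 + v1**2))


def partial(f, args, index, eps=1e-4):
    """Central finite difference of f with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += eps
    down[index] -= eps
    return (f(*up) - f(*down)) / (2 * eps)


def del_residual(Ld, prev, cur, nxt, h):
    """Residuals of the two discrete second-order Euler-Lagrange equations.

    prev, cur, nxt are (q, v) pairs at steps k-1, k and k+1.
    """
    step_out = (cur[0], cur[1], nxt[0], nxt[1], h)    # arguments (q_k, v_k, q_{k+1}, v_{k+1})
    step_in = (prev[0], prev[1], cur[0], cur[1], h)   # arguments (q_{k-1}, v_{k-1}, q_k, v_k)
    r_q = partial(Ld, step_out, 0) + partial(Ld, step_in, 2)  # D_1 + D_3
    r_v = partial(Ld, step_out, 1) + partial(Ld, step_in, 3)  # D_2 + D_4
    return r_q, r_v


# Samples of q(t) = t**3 - t and its derivative at t = 0, 0.5, 1.
h = 0.5
points = [(t**3 - t, 3 * t**2 - 1) for t in (0.0, 0.5, 1.0)]
residual = del_residual(exact_Ld, points[0], points[1], points[2], h)

# Moving the last position off the solution should give a nonzero residual.
perturbed = (points[2][0] + 0.1, points[2][1])
residual_perturbed = del_residual(exact_Ld, points[0], points[1], perturbed, h)
```

In a boundary value setting, such residuals for all interior steps would be handed to a nonlinear solver with the endpoint data held fixed.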

If the matrix

$$\begin{aligned} \left( \begin{array}{cc} D_{13}\widetilde{L}_d &{} D_{14}\widetilde{L}_d \\ D_{23}\widetilde{L}_d &{} D_{24}\widetilde{L}_d \\ \end{array} \right) \end{aligned}$$

is regular, then one can define the discrete Lagrangian map to solve the optimal control problem.
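As an illustrative check of this regularity condition, consider the exact discrete Lagrangian of Example 4.7 with \(n=1\). Differentiating it directly gives

$$\begin{aligned} \left( \begin{array}{cc} D_{13}L_d^e &{} D_{14}L_d^e \\ D_{23}L_d^e &{} D_{24}L_d^e \\ \end{array} \right) =\left( \begin{array}{cc} -\frac{12}{h^3} &{} \frac{6}{h^2} \\ -\frac{6}{h^2} &{} \frac{2}{h} \\ \end{array} \right) ,\qquad \det =-\frac{24}{h^4}+\frac{36}{h^4}=\frac{12}{h^4}\ne 0, \end{aligned}$$

so the matrix is regular for every \(h>0\) in that case.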

Example 5.1

(Two-link manipulator) We consider the optimal control of a two-link manipulator which is a classical example studied in robotics (see, e.g., Murray et al. 1994 and Ober-Blöbaum et al. 2011). The two-link manipulator consists of two coupled (planar) rigid bodies with mass \(m_i\), length \(l_i\) and moments of inertia with respect to the joints \(J_i\), with \(i = 1, 2\), respectively.

Fig. 5

Two-link manipulator

Let \(\theta _1\) and \(\theta _2\) be the configuration angles measured as in Fig. 5. If we assume one end of the first link to be fixed in an inertial reference frame, the configuration of the system is locally specified by the coordinates \((\theta _1, \theta _2)\in {\mathbb {S}}^{1}\times {\mathbb {S}}^{1}\). The Lagrangian is given by the kinetic energy of the system minus the potential energy, that is,

$$\begin{aligned} L(q,\dot{q})= & {} \frac{1}{8}(m_{1}+4m_{2})l_{1}^{2}\dot{\theta }_{1}^{2}+\frac{1}{8}m_{2}l_{2}^{2}(\dot{\theta }_{1}+\dot{\theta }_{2})^{2}\\&+\,\frac{1}{2}m_{2}l_{1}l_{2}\cos (\theta _2)\dot{\theta }_{1}(\dot{\theta }_{1}+\dot{\theta }_2)+\frac{1}{2}J_{1}\dot{\theta }_{1}^{2}\\&+\,\frac{1}{2}J_{2}(\dot{\theta }_{1}+\dot{\theta }_{2})^{2}-g\left( \frac{1}{2}m_{1}l_{1}\sin \theta _{1}+m_{2}l_{1}\sin \theta _{1}+\frac{1}{2}m_{2}l_{2}\sin (\theta _1+\theta _{2})\right) , \end{aligned}$$

where g is the constant gravitational acceleration.

Control torques \(u_{1}\) and \(u_{2}\) are applied at the base of the first link and at the joint between the two links. The equations of motion of the controlled system are

$$\begin{aligned} u_{1}&=-\sin \theta _2l_1l_2m_2\dot{\theta }_{2}\dot{\theta }_{1}-\frac{1}{2}\sin \theta _{2}\dot{\theta }_{2}^{2}l_1l_2m_{2}+\frac{1}{2}m_2l_2\cos (\theta _1+\theta _2)g\\&\quad +\left( m_2g\cos \theta _1+\frac{1}{2}g\cos \theta _{1}m_1\right) l_{1}+\left( \frac{1}{4}m_2l_2^{2}+J_2+\frac{1}{2}\cos \theta _2l_1l_2m_2\right) \ddot{\theta }_{2}\\&\quad +\left( \cos \theta _2l_1l_2m_2+\left( \frac{m_1}{4}+m_2\right) l_{1}^{2}+\frac{m_{2}l_{2}^{2}}{4}+J_{1}+J_2\right) \ddot{\theta }_{1},\\ u_2&=\frac{1}{2}\sin \theta _2l_1l_2m_2\dot{\theta }_{1}^{2}+\left( \frac{1}{4}m_2l_2^{2}+J_2+\frac{1}{2}\cos \theta _{2}l_{1}l_{2}m_{2}\right) \ddot{\theta }_{1}\\&\quad +\frac{1}{2}m_{2}l_{2}\cos (\theta _1+\theta _2)g+\left( \frac{1}{4}m_{2}l_{2}^{2}+J_{2}\right) \ddot{\theta }_{2}. \end{aligned}$$

We look for trajectories \((\theta _{1}(t),\theta _{2}(t), u(t))\) of the state variables and control inputs for given initial and final conditions, that is, for given values of \((\theta _{1}(0),\theta _{2}(0), \dot{\theta }_1(0), \dot{\theta }_2(0))\) and \((\theta _1(T), \theta _2(T), \dot{\theta }_1(T), \dot{\theta }_2(T))\), and minimizing the cost functional

$$\begin{aligned} {\mathcal {A}} = \frac{1}{2}\int _{0}^{T} \left( u_{1}^{2}+u_{2}^{2}\right) \mathrm{{d}}t. \end{aligned}$$

We construct the discrete Lagrangian \(\widetilde{L}_d:T({\mathbb {S}}^{1}\times {\mathbb {S}}^{1})\times T({\mathbb {S}}^{1}\times {\mathbb {S}}^{1})\rightarrow {\mathbb {R}}\), discretizing the Lagrangian \(\displaystyle {\widetilde{L}:T^{(2)}({\mathbb {S}}^{1}\times {\mathbb {S}}^{1})\rightarrow {\mathbb {R}}}\) given by

$$\begin{aligned} \widetilde{L}(\theta _1,\theta _2,\dot{\theta }_1,\dot{\theta }_2,\ddot{\theta }_1,\ddot{\theta }_2)&=\frac{1}{2}\left[ -\sin \theta _2l_1l_2m_2\dot{\theta }_{2}\dot{\theta }_{1}-\frac{1}{2}\sin \theta _{2}\dot{\theta }_{2}^{2}l_1l_2m_{2}+\frac{1}{2}m_2l_2\cos (\theta _1+\theta _2)g\right. \\&\quad +\left( m_2g\cos \theta _1+\frac{1}{2}g\cos \theta _{1}m_1\right) l_{1}+\left( \frac{1}{4}m_2l_2^{2}+J_2+\frac{1}{2}\cos \theta _2l_1l_2m_2\right) \ddot{\theta }_{2}\\&\quad +\left. \left( \cos \theta _2l_1l_2m_2+\left( \frac{m_1}{4}+m_2\right) l_{1}^{2}+\frac{m_{2}l_{2}^{2}}{4}+J_{1}+J_2\right) \ddot{\theta }_{1}\right] ^{2}\\&\quad +\frac{1}{2}\left[ \frac{1}{2}\sin \theta _2l_1l_2m_2\dot{\theta }_{1}^{2}+\left( \frac{1}{4}m_2l_2^{2}+J_2+\frac{1}{2}\cos \theta _{2}l_{1}l_{2}m_{2}\right) \ddot{\theta }_{1} \right. \\&\quad +\left. \frac{1}{2}m_{2}l_{2}\cos (\theta _1+\theta _2)g+\left( \frac{1}{4}m_{2}l_{2}^{2}+J_{2}\right) \ddot{\theta }_{2} \right] ^{2} \end{aligned}$$

taking the same discretization as in equation (12) to approximate the acceleration and taking midpoint averages to approximate the position and velocity.
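The construction can be sketched in code as follows. This is a hedged illustration rather than the implementation used for the simulations: the torques are taken from the controlled equations of motion, the running cost is \(\frac{1}{2}(u_1^2+u_2^2)\), and the discrete Lagrangian combines midpoint averages with the two Taylor-based acceleration approximations. Working directly in angle coordinates and all function names are assumptions; the parameter values are those quoted for the simulation.

```python
import math

# Physical parameters quoted in the text for the simulation.
M1, M2 = 0.375, 0.25
L1, L2 = 1.5, 1.0
J1, J2 = M1 * L1**2 / 3, M2 * L2**2 / 3
G = 9.8


def torques(theta, omega, alpha):
    """Control torques (u1, u2) from the controlled equations of motion."""
    th1, th2 = theta
    w1, w2 = omega
    a1, a2 = alpha
    s2, c2 = math.sin(th2), math.cos(th2)
    c12 = math.cos(th1 + th2)
    coupling = 0.25 * M2 * L2**2 + J2 + 0.5 * c2 * L1 * L2 * M2
    u1 = (-s2 * L1 * L2 * M2 * w2 * w1
          - 0.5 * s2 * w2**2 * L1 * L2 * M2
          + 0.5 * M2 * L2 * c12 * G
          + (M2 * G * math.cos(th1) + 0.5 * G * math.cos(th1) * M1) * L1
          + coupling * a2
          + (c2 * L1 * L2 * M2 + (M1 / 4 + M2) * L1**2
             + M2 * L2**2 / 4 + J1 + J2) * a1)
    u2 = (0.5 * s2 * L1 * L2 * M2 * w1**2
          + coupling * a1
          + 0.5 * M2 * L2 * c12 * G
          + (0.25 * M2 * L2**2 + J2) * a2)
    return u1, u2


def cost_lagrangian(theta, omega, alpha):
    """Second-order Lagrangian: the running cost evaluated on the torques."""
    u1, u2 = torques(theta, omega, alpha)
    return 0.5 * (u1**2 + u2**2)


def discrete_lagrangian(q0, v0, q1, v1, h):
    """Midpoint averages plus two Taylor-based acceleration approximations."""
    q_mid = [(a + b) / 2 for a, b in zip(q0, q1)]
    v_mid = [(a + b) / 2 for a, b in zip(v0, v1)]
    acc_start = [2 / h**2 * (b - a - h * va) for a, va, b in zip(q0, v0, q1)]
    acc_end = [2 / h**2 * (a - b + h * vb) for a, b, vb in zip(q0, q1, v1)]
    return (h / 2 * cost_lagrangian(q_mid, v_mid, acc_start)
            + h / 2 * cost_lagrangian(q_mid, v_mid, acc_end))


# Holding the arm at rest in the downward equilibrium requires no torque.
rest = (-math.pi / 2, 0.0)
u_rest = torques(rest, (0.0, 0.0), (0.0, 0.0))
cost_rest = discrete_lagrangian(rest, (0.0, 0.0), rest, (0.0, 0.0), 0.01)
```

Resting at either equilibrium should give (numerically) zero torques and zero discrete cost, which is a convenient sanity check before passing the discrete Lagrangian to a solver.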

Fig. 6

Angles \(\theta _1\) and \(\theta _2\) for the optimal control of the two-link manipulator. Initially, the two links point downwards; at \(T=10\) they point upwards

Fig. 7

Evolution of the actual position of the two-link manipulator (detail for \(t\in [3,6]\)). Sections of this surface with the vertical plane \(t=t_0\) show the two links as they are positioned at time \(t_0\)

Figures 6 and 7 show the results of a numerical simulation of the method, taking the system from the stable mechanical equilibrium \((\theta _{1}(0),\theta _{2}(0), \dot{\theta }_1(0), \dot{\theta }_2(0))=(-\pi /2,0,0,0)\) to the unstable equilibrium \((\theta _{1}(T),\theta _{2}(T), \dot{\theta }_1(T), \dot{\theta }_2(T))=(\pi /2,0,0,0)\). We have used \(T=10\), \(N=1000\), \(m_1=0.375\), \(m_2=0.25\), \(l_1=1.5\), \(l_2=1\), \(J_1=\frac{m_1l_1^2}{3}\), \(J_2=\frac{m_2l_2^2}{3}\), and \(g=9.8\). A video of the simulation is available at www.youtube.com/watch?v=ZUUH0596a30. The algorithm generates a sequence of velocities as well as positions, but only the positions are represented in the figures.

We have also considered a different setting where the angle \(\theta _2\) is restricted to move between 0 and 170 degrees, inspired by an elbow joint. This range of motion is enforced by adding a continuous, piecewise linear function \(V(\theta _2)\) to the cost function, with slope \(-1000\) for \(\theta _2<0^\circ \), 0 for \(0^\circ<\theta _2<170^\circ \), and 1000 for \(\theta _2>170^\circ \). We simulated the optimal trajectory with the same endpoint conditions and physical parameters as above, with \(N=200\). A video of the resulting motion is available at www.youtube.com/watch?v=OxOFHdT7emQ.
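A penalty of this shape might be coded as follows. The text fixes only the slopes, so two things are assumed here: the function is taken to vanish on the allowed range, and the slopes are interpreted per radian. The function name is hypothetical.

```python
import math


def elbow_penalty(theta2, slope=1000.0, lower=0.0, upper=math.radians(170)):
    """Continuous piecewise linear penalty on the elbow angle (in radians).

    Assumed to be zero for lower <= theta2 <= upper and to grow linearly,
    with the given slope, outside that range.
    """
    below = max(lower - theta2, 0.0)   # amount by which theta2 undershoots
    above = max(theta2 - upper, 0.0)   # amount by which theta2 overshoots
    return slope * (below + above)


inside = elbow_penalty(math.radians(90))
below_range = elbow_penalty(-0.1)
above_range = elbow_penalty(math.radians(170) + 0.01)
```

Adding `elbow_penalty` evaluated at the midpoint angle to the running cost is one way such a term could enter a discrete Lagrangian; because the function is not differentiable at the two kinks, a solver may need some care there.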

6 Conclusions and Future Research

In this paper we design variational integrators for higher-order variational systems and apply them to optimal control problems. The general idea behind these variational integrators is to discretize Hamilton’s principle directly, rather than the equations of motion, in a way that preserves the invariants of the original system, notably the symplectic form and, via a discrete version of Noether’s theorem, the momentum map.

We show that a regular higher-order Lagrangian system has a unique solution for given nearby endpoint conditions. The proof of existence and uniqueness for the local boundary value problem is direct and variational; it uses a regularization procedure and assumes only \(C^k\) differentiability (instead of \(C^{2k}\) as in standard ODE theory).

We have seen that a discrete Lagrangian function \(L_d:T^{(k-1)}Q\times T^{(k-1)}Q \rightarrow {{\mathbb {R}}}\) provides an appropriate approximation of the action \( \int ^h_0 L(q, \dot{q}, \ldots , q^{(k)})\, \mathrm{{d}}t\). Moreover, we derive a particular choice of discrete Lagrangian which gives an exact correspondence between discrete and continuous systems, the exact discrete Lagrangian. We show that if the original Lagrangian is regular, then so is the exact discrete Lagrangian, and we establish the relation between the discrete Legendre transformations and the continuous one.

As future research, we are interested in the construction of an exact discrete Lagrangian function for higher-order mechanical systems subject to higher-order constraints. The main point will be to show the existence and uniqueness of solutions of the boundary value problem for such systems. One could then define the exact discrete Lagrangian for constrained systems in a fashion similar to the one shown in this work. Since optimal control problems for underactuated mechanical systems can be seen as constrained higher-order variational problems, extending the constructions given in this work may lead to new developments in geometric integration for optimal control problems. The case of optimal control of nonholonomic systems will also be addressed.