1 Introduction

This paper introduces the discrete Hamel formalism along with some of its applications. Besides being of purely theoretical interest, this development is motivated by the goal of restoring the concept of ideal constraints in the discrete setting and by an attempt to better understand the structural stability of variational and nonholonomic integrators. A loss of structural stability has recently been observed in [25, 26, 34].

Hamel’s formalism is a version of Lagrangian mechanics in which the velocity components are measured relative to a set of independent vector fields on the configuration space. These vector fields are not associated with configuration coordinates and therefore do not commute, leading to the so-called ‘bracket terms’ in the equations of motion.

One of the reasons for using Hamel’s formalism is that the Euler–Lagrange equations written in generalized coordinates, while universal, are not always the best tool for analyzing the dynamics of mechanical systems. For example, it is difficult to study the motion of the Euler top if the Euler–Lagrange equations (either intrinsically or in generalized coordinates) are used to represent the dynamics. On the other hand, the use of the angular velocity components relative to a body frame pioneered by Euler [13] results in a much simpler representation of dynamics. Euler’s approach led to the development of the Euler–Poincaré equations by Lagrange [24] for reasonably general Lagrangians on the rotation group and by Poincaré [35] for arbitrary Lie groups (see [27] for details and history). An extension of this formalism from Lie groups to arbitrary configuration manifolds was carried out by Hamel [16]. Hamel’s formalism is especially useful in nonholonomic mechanics. See e.g. [5, 30, 33] for the history and contemporary exposition of Hamel’s formalism.

Discrete Lagrangian mechanics is obtained by discretizing Hamilton’s variational principle. This approach leads to symplectic- and, for systems with symmetry, momentum-preserving integrators. By discretizing the Lagrange–d’Alembert principle, nonconservative forces (see Kane et al. [20] and Marsden and West [28]) and nonholonomic constraints (see Cortés and Martínez [12]) can be incorporated as well. Recall that, in the continuous-time setting, the dynamics of a Lagrangian system with nonholonomic constraints may be reformulated as the dynamics of an unconstrained system by adding the constraint reaction force. See Suslov [37] and Chetaev [11] for details and precise statements. However, as pointed out in Cortés and Martínez [12], the discretizations of these two representations, as a rule, are not the same, which makes the versions of the discrete Lagrange–d’Alembert principle of [20, 28] and [12] incompatible. In other words, the notion of an ideal constraint of continuous-time mechanics is not retained by the discretization of Cortés and Martínez.

Following the variational discretization approach, we develop discrete Hamel’s formalism by discretizing Hamilton’s principle for Hamel’s equations. The principal difficulty in extending this program to Hamel’s setting is caused by the bracket terms, as a discrete analogue of the Jacobi–Lie bracket is known only for left- or right-invariant vector fields on Lie groups (Moser and Veselov [32], Marsden, Pekarsky, and Shkoller [29], Bobenko and Suris [6, 7]). In this paper we resolve the bracket term discretization issue for systems on vector spaces.

When a continuous-time system is discretized, we first select the vector fields that are used to measure the velocity components, and then set up the discrete variational principle. In general, the outcome is a somewhat different discrete dynamical system than the outcome of the usual variational discretization procedure. Remarkably, a modification of our formalism for systems with nonholonomic constraints resolves, at least for Chaplygin systems, the ideal constraint issue of Cortés and Martínez. That is, the discrete Lagrange–d’Alembert principle for Hamel’s equations in the presence of nonholonomic constraints is identical to the discrete Lagrange–d’Alembert principle of Kane et al. [20] and Marsden and West [28] written after replacing the constraints with their reactions.

Our formalism also contributes to the study of structural stability of nonholonomic integrators. Recently, Lynch and Zenkov [25, 26] discovered that the nonholonomic integrator of Cortés and Martínez, in general, is not structure-preserving, as it is capable of changing the dimension and stability of manifolds of relative equilibria of continuous-time systems. A similar effect was observed in the holonomic setting in [34]. This lack of structural stability is a serious issue as it alters the α- and ω-limit sets, thus making the asymptotic dynamics of the integrator different from the asymptotic dynamics of the underlying continuous-time system. Such an integrator, in principle, is not suitable for long-term numerical simulations of continuous-time nonholonomic systems. Discrete Hamel’s equations are certain to preserve the manifolds of relative equilibria and their stability, and thus are a better candidate for good quality long-term integrators.

The paper is organized as follows: Continuous-time Lagrangian mechanics and Hamel’s formalism, Hamilton’s variational principle, and discrete mechanics are reviewed in Sections 2–4. Discrete Hamel’s formalism is introduced in Section 5. Applications of discrete Hamel’s formalism to nonholonomic mechanics and to global energy-momentum numerical integration of the spherical pendulum are presented in Sections 6 and 7.

2 Lagrangian Mechanics

Lagrangian mechanics provides a systematic approach to deriving the equations of motion and establishes the equivalence of force balance and variational principles.

2.1 The Euler–Lagrange Equations

A Lagrangian mechanical system is specified by a smooth manifold Q called the configuration space and a function \(L: TQ \rightarrow \mathbb{R}\) called the Lagrangian. In many cases, the Lagrangian is the kinetic minus potential energy of the system, with the kinetic energy defined by a Riemannian metric and the potential energy being a smooth function on the configuration space Q. If necessary, non-conservative forces can be introduced (e.g., gyroscopic forces that are represented by terms in L that are linear in the velocity), but this is not discussed in detail in this paper.

In local coordinates \(q = (q^{1},\ldots,q^{n})\) on the configuration space Q we write \(L = L(q,\dot{q})\). The dynamics is given by the Euler–Lagrange equations

$$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial L} {\partial \dot{q}^{i}} = \frac{\partial L} {\partial q^{i}},\quad i = 1,\ldots,n. }$$
(1)

These equations were originally derived by Lagrange [24] in 1788 by requiring that simple force balance be covariant, i.e. expressible in arbitrary generalized coordinates. A variational derivation of the Euler–Lagrange equations, namely Hamilton’s principle (see Theorem 1 below), came later in the work of Hamilton [17, 18] in 1834/35.

Let q(t), a ≤ t ≤ b, be a smooth curve in Q. A variation of the curve q(t) is a smooth map \(\beta: [a,b] \times [-\varepsilon,\varepsilon ] \rightarrow Q\) that satisfies the condition β(t, 0) = q(t). This variation gives rise to the vector field

$$\displaystyle{ \delta q(t) = \left.\frac{\partial \beta (t,s)} {\partial s} \right \vert _{s=0} }$$
(2)

along the curve q(t).

Theorem 1.

The following statements are equivalent:

  1. (i)

    The curve q(t), where a ≤ t ≤ b, is a critical point of the action functional

    $$\displaystyle{ \int _{a}^{b}L(q,\dot{q})\,\mathit{dt} }$$

    on the space of curves in Q connecting \(q_{a}\) to \(q_{b}\) on the interval a ≤ t ≤ b, where we choose variations of the curve q(t) that satisfy the condition \(\delta q(a) =\delta q(b) = 0\).

  2. (ii)

    The curve q(t) satisfies the Euler–Lagrange equations (1).

We point out here that this principle assumes that a variation of the curve q(t) induces the variation \(\delta \dot{q}(t)\) of its velocity according to the formula

$$\displaystyle{\delta \dot{q}(t):= \frac{d} {\mathit{dt}}\delta q(t).}$$

For more details and a proof, see e.g. [2, 27], and Theorem 2 below.

3 Lagrangian Mechanics in Non-coordinate Frames

In this section we discuss the continuous-time Hamel formalism and a relevant variational principle, following the exposition of [5].

3.1 The Hamel Equations

In many cases the Lagrangian and the equations of motion have a simpler structure when the velocity components are measured against a frame that is not necessarily induced by the system’s local configuration coordinates. An example of such a system is the rigid body.

Let \(q = (q^{1},\ldots,q^{n})\) be local coordinates on the configuration space Q and \(u_{i} \in TQ\), \(i = 1,\ldots,n\), be smooth independent local vector fields on Q defined in the same coordinate neighborhood hereafter denoted U. In certain cases, some or all of \(u_{i}\) can be chosen to be global vector fields on Q. The components of \(u_{i}\) relative to the coordinate-induced basis \(\partial /\partial q^{j}\) are written as \(\psi _{i}^{j}\); that is,

$$\displaystyle{u_{i}(q) =\psi _{ i}^{j}(q) \frac{\partial } {\partial q^{j}},}$$

where \(i,j = 1,\ldots,n\).

Let \(\xi = (\xi ^{1},\ldots,\xi ^{n}) \in \mathbb{R}^{n}\) be the components of the velocity vector \(\dot{q} \in TQ\) relative to the frame \(u(q) = (u_{1}(q),\ldots,u_{n}(q))\), i.e.,

$$\displaystyle{ \dot{q} = u(q)\, \cdot \,\xi, }$$
(3)

where, by definition,

$$\displaystyle{ u(q)\, \cdot \,\xi:=\xi ^{i}u_{ i}(q). }$$
(4)

When convenient, we reverse the order of factors in (4), i.e., we assume that

$$\displaystyle{u(q)\, \cdot \,\xi =\xi \, \cdot \,u(q).}$$

The Lagrangian of the system written in the local coordinates (q, ξ) on the velocity phase space TQ reads

$$\displaystyle{ l(q,\xi ):= L(q,u(q)\, \cdot \,\xi ). }$$
(5)

The coordinates (q, ξ) are a Lagrangian analogue of non-canonical variables in Hamiltonian dynamics.

Given two elements \(\xi,\zeta \in \mathbb{R}^{n}\), define the antisymmetric bracket operation \([\, \cdot \,,\, \cdot \,]_{q}: \mathbb{R}^{n} \times \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) by

$$\displaystyle{u(q)\, \cdot \,[\xi,\zeta ]_{q} =\big [u(q)\, \cdot \,\xi,u(q)\, \cdot \,\zeta \big],}$$

where [ ⋅ , ⋅ ] is the Jacobi–Lie bracket of vector fields on Q. That is, \([\xi,\zeta ]_{q}\) consists of the components of \([u_{i}\xi ^{i},u_{j}\zeta ^{j}](q)\) relative to the frame \(u_{1},\ldots,u_{n}\).

Therefore, each tangent space \(T_{q}U\) is isomorphic to the Lie algebra \(W_{q}:= (\mathbb{R}^{n},[\, \cdot \,,\, \cdot \,]_{q})\), and the tangent bundle TU is diffeomorphic to a Lie algebra bundle over U.

The dual of \([\, \cdot \,,\, \cdot \,]_{q}\) is, by definition, the operation \([\, \cdot \,,\, \cdot \,]_{q}^{{\ast}}: W_{q} \times W_{q}^{{\ast}}\rightarrow W_{q}^{{\ast}}\) given by

$$\displaystyle{\langle [\xi,\alpha ]_{q}^{{\ast}},\zeta \rangle:=\langle \alpha,[\xi,\zeta ]_{ q}\rangle.}$$

Define the structure functions \(c_{ij}^{a}(q)\) by the equations

$$\displaystyle{ [u_{i}(q),u_{j}(q)] = c_{ij}^{a}(q)u_{ a}(q), }$$

\(i,j,a = 1,\ldots,n\). These quantities vanish if and only if the vector fields \(u_{i}(q)\), \(i = 1,\ldots,n\), commute.
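As a hedged illustration of this definition, the following Python sketch approximates the Jacobi–Lie bracket by central differences and expands the result in the frame. The frame \(u_{1} = \partial /\partial x\), \(u_{2} = \partial /\partial y + x\,\partial /\partial z\), \(u_{3} = \partial /\partial z\) on \(\mathbb{R}^{3}\) and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
def frame(q):
    """Illustrative frame on R^3: u1 = d/dx, u2 = d/dy + x d/dz, u3 = d/dz."""
    x, y, z = q
    return [[1.0, 0.0, 0.0], [0.0, 1.0, x], [0.0, 0.0, 1.0]]

def frame_components(q, w):
    """Components of a vector w relative to the frame above (hand-coded inverse)."""
    x, y, z = q
    return [w[0], w[1], w[2] - x * w[1]]

def directional_derivative(field, q, direction, eps=1e-6):
    """Central-difference derivative of a vector-valued field along a direction."""
    qp = [a + eps * d for a, d in zip(q, direction)]
    qm = [a - eps * d for a, d in zip(q, direction)]
    return [(fp - fm) / (2 * eps) for fp, fm in zip(field(qp), field(qm))]

def structure_functions(q):
    """Return c[i][j][a] such that [u_i, u_j] = c_ij^a u_a at the point q."""
    n = len(q)
    u = frame(q)
    c = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # [u_i, u_j] = u_i[u_j] - u_j[u_i], computed componentwise.
            ui_uj = directional_derivative(lambda p: frame(p)[j], q, u[i])
            uj_ui = directional_derivative(lambda p: frame(p)[i], q, u[j])
            bracket = [a - b for a, b in zip(ui_uj, uj_ui)]
            c[i][j] = frame_components(q, bracket)
    return c

c = structure_functions([0.3, -1.2, 0.5])
print(c[0][1])  # components of [u1, u2] relative to (u1, u2, u3)
```

For this particular frame one has \([u_{1},u_{2}] = u_{3}\), so the only nonzero structure functions should be \(c_{12}^{3} = -c_{21}^{3} = 1\), up to finite-difference roundoff.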

Viewing \(u_{i}\) as vector fields on TQ whose fiber components equal 0, one defines the directional derivatives \(u_{i}[l]\) for a function \(l: TQ \rightarrow \mathbb{R}\) in the usual way. It is straightforward to show that

$$\displaystyle{u_{i}[l] =\psi _{ i}^{j} \frac{\partial l} {\partial q^{j}}.}$$

For a frame \(u = (u_{1},\ldots,u_{n})\), define u[l] by the formula

$$\displaystyle{u[l] = (u_{1}[l],\ldots,u_{n}[l]).}$$

The evolution of the variables (q, ξ) is governed by the Hamel equations

$$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial l} {\partial \xi ^{j}} = c_{ij}^{a}\xi ^{i} \frac{\partial l} {\partial \xi ^{a}} + u_{j}[l], }$$
(6)

coupled with equation (3). If \(u_{i} = \partial /\partial q^{i}\), equations (6) become the Euler–Lagrange equations (1). Equations (6) were introduced in [16] (see also [33] and [5] for details and some history).
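To make the bracket term in (6) concrete, here is a minimal Python sketch for the Euler top. It assumes, beyond what this section states, that the frame is a body frame on the rotation group with constant structure functions \(c_{ij}^{a} =\varepsilon _{ija}\) and that \(l = \tfrac{1}{2}\sum _{a}I_{a}(\xi ^{a})^{2}\) with principal moments of inertia \(I_{a}\). Under these assumptions \(u_{j}[l] = 0\), and (6) reduces to Euler's equations for the body angular momentum. The function names are hypothetical.

```python
def levi_civita(i, j, k):
    """Sign of the permutation (i, j, k) of (0, 1, 2); zero if an index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def hamel_rhs(xi, inertia):
    """Bracket term c_ij^a xi^i dl/dxi^a of (6), assuming c_ij^a = epsilon_ija."""
    p = [I * w for I, w in zip(inertia, xi)]  # dl/dxi for l = 0.5 * sum(I_a xi_a^2)
    return [
        sum(levi_civita(i, j, a) * xi[i] * p[a] for i in range(3) for a in range(3))
        for j in range(3)
    ]

inertia = [1.0, 2.0, 3.0]
xi = [0.4, -1.1, 0.7]
rate = hamel_rhs(xi, inertia)  # d/dt (dl/dxi) along a solution

# The kinetic energy l changes at the rate xi . d/dt(dl/dxi), which should vanish.
energy_rate = sum(w * r for w, r in zip(xi, rate))
print(rate, energy_rate)
```

The vanishing of `energy_rate` reflects the antisymmetry of the structure functions in their lower indices.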

3.2 Hamilton’s Principle for Hamel’s Equations

The variational derivation of Hamel’s equations in this section mostly follows [5]. We refer the readers to [27] for the related history of the development of variational principles for the Euler–Lagrange, Euler–Poincaré, and Hamel equations, and to [1] for the Hamilton–Pontryagin principle for the Hamel equations.

Theorem 2 (Zenkov, Bloch, and Marsden [5]).

Let \(L: TQ \rightarrow \mathbb{R}\) be a Lagrangian and l be its representation in local coordinates (q,ξ). Then, the following statements are equivalent:

  1. (i)

    The curve q(t), where a ≤ t ≤ b, is a critical point of the action functional

    $$\displaystyle{ \int _{a}^{b}L(q,\dot{q})\,\mathit{dt} }$$
    (7)

    on the space of curves in Q connecting \(q_{a}\) to \(q_{b}\) on the interval [a,b], where we choose variations of the curve q(t) that satisfy \(\delta q(a) =\delta q(b) = 0\).

  2. (ii)

    The curve q(t) satisfies the Euler–Lagrange equations

    $$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial L} {\partial \dot{q}} = \frac{\partial L} {\partial q}.}$$
  3. (iii)

    The curve (q(t),ξ(t)) is a critical point of the functional

    $$\displaystyle{ \int _{a}^{b}l(q,\xi )\,\mathit{dt} }$$
    (8)

    with respect to variations δξ, induced by the variations

    $$\displaystyle{ \delta q = u(q)\, \cdot \,\zeta \equiv u_{i}(q)\zeta ^{i}, }$$
    (9)

    and given by

    $$\displaystyle{ \delta \xi =\dot{\zeta } +[\xi,\zeta ]_{q}. }$$
    (10)

  4. (iv)

    The curve \((q(t),\xi (t))\) satisfies the Hamel equations

    $$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial l} {\partial \xi } = \left [\xi, \frac{\partial l} {\partial \xi } \right ]_{q}^{{\ast}} + u[l]}$$

    coupled with the equations \(\dot{q} = u(q)\, \cdot \,\xi \equiv \xi ^{i}u_{i}(q).\)

For the early development of these equations see [35] and [16].

Proof.

The equivalence of (i) and (ii) is proved by computing the variation of the action functional (7):

$$\displaystyle{\delta \int _{a}^{b}L(q,\dot{q})\,\mathit{dt} =\int _{ a}^{b}\left (\frac{\partial L} {\partial q} \delta q + \frac{\partial L} {\partial \dot{q}} \delta \dot{q}\right )\mathit{dt} =\int _{ a}^{b}\left (\frac{\partial L} {\partial q} - \frac{d} {\mathit{dt}} \frac{\partial L} {\partial \dot{q}} \right )\delta q\,\mathit{dt}.}$$

Recall that we denote the components of δ q(t) relative to the frame \(u(q(t)) = (u_{1}(q(t)),\ldots,u_{n}(q(t)))\) by \(\zeta (t) = (\zeta ^{1}(t),\ldots,\zeta ^{n}(t))\); that is,

$$\displaystyle{\delta q(t) = u(q(t))\, \cdot \,\zeta (t) \equiv u_{i}(q(t))\zeta ^{i}(t).}$$

To prove the equivalence of (i) and (iii), we first compute the quantities \(\delta \dot{q}\) and d(δ q)∕dt. Using the definition (2) of the field \(\delta q\), one concludes that

$$\displaystyle{ \delta u_{a}(q(t)) = \frac{\partial u_{a}(\beta (t,s))} {\partial s} \bigg\vert _{s=0} =\zeta \, \cdot \,u[u_{a}] =\delta q[u_{a}] \equiv \zeta ^{b}u_{ b}[u_{a}]. }$$
(11)

Similarly,

$$\displaystyle{ \frac{d} {\mathit{dt}}u_{b}(q(t)) =\dot{ q}[u_{b}] =\xi \, \cdot \,u[u_{b}] \equiv \xi ^{a}u_{ a}[u_{b}].}$$

Next,

$$\displaystyle\begin{array}{rcl} \delta \dot{q}& =& \delta u(q(t))\, \cdot \,\xi (t) + u(q(t))\, \cdot \,\delta \xi (t), {}\\ \frac{d(\delta q)} {\mathit{dt}} & =& \frac{\mathit{du}(q(t))} {\mathit{dt}} \, \cdot \,\zeta (t) + u(q(t))\, \cdot \,\dot{\zeta }(t). {}\\ \end{array}$$

Equivalently, in coordinates,

$$\displaystyle\begin{array}{rcl} \delta \dot{q}& =& \delta \left (\xi ^{i}(t)u_{ i}(q(t))\right ) =\delta \xi ^{i}(t)u_{ i}(q(t)) +\xi ^{i}(t)\frac{\partial u_{i}} {\partial q^{j}}\delta q^{j}, {}\\ \frac{d(\delta q)} {\mathit{dt}} & =& \frac{d} {\mathit{dt}}\left (\zeta ^{i}(t)u_{ i}(q(t))\right ) =\dot{\zeta } ^{i}(t)u_{ i}(q(t)) +\zeta ^{i}(t)\frac{\partial u_{i}} {\partial q^{j}}\dot{q}^{j}. {}\\ \end{array}$$

Since \(\delta \dot{q} = d(\delta q)/\mathit{dt}\), we obtain

$$\displaystyle\begin{array}{rcl} & & u(q(t))\, \cdot \,\big(\delta \xi (t) -\dot{\zeta } (t)\big) =\xi ^{i}(t)\zeta ^{j}(t)\big(u_{ i}(q(t))[u_{j}(q(t))] - u_{j}(q(t))[u_{i}(q(t))]\big) {}\\ & & \quad =\xi ^{i}(t)\zeta ^{j}(t)[u_{ i}(q(t)),u_{j}(q(t))] \equiv \big [u(q(t))\, \cdot \,\xi (t),u(q(t))\, \cdot \,\zeta (t)\big], {}\\ \end{array}$$

which implies formula (10).

To prove the equivalence of (iii) and (iv), we use the above formula and compute the variation of the functional (8):

$$\displaystyle\begin{array}{rcl} \delta \int _{a}^{b}l(q,\xi )\,\mathit{dt}& =& \int _{ a}^{b}\left ( \frac{\partial l} {\partial q}\delta q + \frac{\partial l} {\partial \xi } \delta \xi \right )\mathit{dt} {}\\ & =& \int _{a}^{b}\left (\zeta \, \cdot \,u[l] + \frac{\partial l} {\partial \xi } \left (\dot{\zeta }+[\xi,\zeta ]_{q(t)}\right )\right )\mathit{dt} {}\\ & =& \int _{a}^{b}\bigg(u[l] + \left [\xi, \frac{\partial l} {\partial \xi } \right ]_{q(t)}^{{\ast}}- \frac{d} {\mathit{dt}} \frac{\partial l} {\partial \xi } \bigg)\zeta \,\mathit{dt}. {}\\ \end{array}$$

The latter vanishes if and only if the Hamel equations are satisfied. □ 

3.3 Remarks on the Frame Selection

As discussed in [2, 3], and [5], constraints and symmetry naturally define subbundles of the velocity phase space TQ. For underactuated mechanical systems, the controlled directions define a subbundle of the momentum phase space \(T^{{\ast}}Q\). It may be beneficial to select a frame in such a way that suitable subframes of the frame and its dual span the mentioned subbundles. Such frames lead to a simpler representation of dynamics and clarify the structure of the mechanical system under consideration (subsystems, interconnections, etc.).

4 Discrete Mechanics

A discrete analogue of Lagrangian mechanics can be obtained by discretizing Hamilton’s principle; this approach underlies the construction of variational integrators. See Marsden and West [28], and references therein, for a more detailed discussion of discrete mechanics.

A key notion is that of the discrete Lagrangian, which is a map \(L^{d}: Q \times Q \rightarrow \mathbb{R}\) that approximates the action integral along an exact solution of the Euler–Lagrange equations joining the configurations \(q_{k},q_{k+1} \in Q\),

$$\displaystyle{ L^{d}(q_{ k},q_{k+1}) \approx \mathop{\mathrm{ext}}\nolimits _{q\in \mathcal{C}([0,h],Q)}\int _{0}^{h}L(q,\dot{q})\,\mathit{dt}, }$$

where \(\mathcal{C}([0,h],Q)\) is the space of curves q: [0, h] → Q with \(q(0) = q_{k}\), \(q(h) = q_{k+1}\), and \(\mathop{\mathrm{ext}}\nolimits\) denotes extremum.

In the discrete setting, the action integral of Lagrangian mechanics is replaced by an action sum

$$\displaystyle{ S^{d}(q_{ 0},q_{1},\ldots,q_{N}) =\sum _{ k=0}^{N-1}L^{d}(q_{ k},q_{k+1}), }$$

where \(q_{k} \in Q\), \(k = 0,1,\ldots,N\), is a finite sequence in the configuration space. The equations are obtained by the discrete Hamilton principle, which extremizes the discrete action given fixed endpoints \(q_{0}\) and \(q_{N}\). Taking the extremum over \(q_{1},\ldots,q_{N-1}\) gives the discrete Euler–Lagrange equations

$$\displaystyle{ D_{1}L^{d}(q_{ k},q_{k+1}) + D_{2}L^{d}(q_{ k-1},q_{k}) = 0 }$$
(12)

for \(k = 1,\ldots,N - 1\). Here and below, \(D_{i}F\) denotes the partial derivative of the function F with respect to its ith input. Equations (12) implicitly define the update map \(\varPhi: Q \times Q \rightarrow Q \times Q\), where \(\varPhi (q_{k-1},q_{k}) = (q_{k},q_{k+1})\) and Q × Q replaces the velocity phase space TQ of continuous-time Lagrangian mechanics.
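The update map defined by (12) can be sketched for a simple case. The example below assumes a one-dimensional harmonic oscillator, \(L = \tfrac{1}{2}\dot{q}^{2} -\tfrac{1}{2}q^{2}\), and a midpoint discrete Lagrangian; both choices, and the function names, are illustrative assumptions rather than part of the paper. For this quadratic Lagrangian, (12) is linear in \(q_{k+1}\) and can be solved in closed form.

```python
def discrete_lagrangian(q0, q1, h):
    """Midpoint discrete Lagrangian for L = v^2/2 - q^2/2 (illustrative choice)."""
    v = (q1 - q0) / h
    qm = 0.5 * (q0 + q1)
    return h * (0.5 * v * v - 0.5 * qm * qm)

def del_residual(q_prev, q_cur, q_next, h, eps=1e-6):
    """D_1 L^d(q_k, q_{k+1}) + D_2 L^d(q_{k-1}, q_k), by central differences."""
    d1 = (discrete_lagrangian(q_cur + eps, q_next, h)
          - discrete_lagrangian(q_cur - eps, q_next, h)) / (2 * eps)
    d2 = (discrete_lagrangian(q_prev, q_cur + eps, h)
          - discrete_lagrangian(q_prev, q_cur - eps, h)) / (2 * eps)
    return d1 + d2

def del_step(q_prev, q_cur, h):
    """Solve the discrete Euler-Lagrange equation (12) for q_{k+1} in closed form."""
    return 2.0 * q_cur * (1.0 - h * h / 4.0) / (1.0 + h * h / 4.0) - q_prev

h = 0.1
# Start on the discrete solution q_k = cos(k * theta) of the recurrence.
traj = [1.0, (1.0 - h * h / 4.0) / (1.0 + h * h / 4.0)]
for _ in range(200):
    traj.append(del_step(traj[-2], traj[-1], h))

worst = max(abs(del_residual(traj[k - 1], traj[k], traj[k + 1], h))
            for k in range(1, len(traj) - 1))
print(worst, max(abs(q) for q in traj))
```

The first printed number checks that the closed-form step satisfies (12) up to finite-difference error; the second shows that the amplitude stays bounded over the run, which is the behavior one expects from a variational integrator for this system.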

In the case that Q is a vector space, it may be convenient to use \((q_{k+1/2},v_{k,k+1})\), where \(q_{k+1/2} = \tfrac{1} {2}(q_{k} + q_{k+1})\) and \(v_{k,k+1} = \tfrac{1} {h}(q_{k+1} - q_{k})\), as a state of a discrete mechanical system. In such a representation, the discrete Lagrangian becomes a function of \((q_{k+1/2},v_{k,k+1})\), and the discrete Euler–Lagrange equations read

$$\displaystyle\begin{array}{rcl} & & \tfrac{1} {2}\big(D_{1}L^{d}(q_{ k-1/2},v_{k-1,k}) + D_{1}L^{d}(q_{ k+1/2},v_{k,k+1})\big) {}\\ & & \quad + \tfrac{1} {h}\big(D_{2}L^{d}(q_{ k-1/2},v_{k-1,k}) - D_{2}L^{d}(q_{ k+1/2},v_{k,k+1})\big) = 0. {}\\ \end{array}$$

These equations are equivalent to the variational principle

$$\displaystyle\begin{array}{rcl} \delta S^{d} =\sum _{ k=0}^{N-1}\big(D_{ 1}L^{d}(q_{ k+1/2},v_{k,k+1})\,\delta q_{k+1/2} + D_{2}L^{d}(q_{ k+1/2},v_{k,k+1})\,\delta v_{k,k+1}\big) = 0,\quad & &{}\end{array}$$
(13)

where the variations \(\delta q_{k+1/2}\) and \(\delta v_{k,k+1}\) are induced by the variations \(\delta q_{k}\) and are given by the formulae

$$\displaystyle{\delta q_{k+1/2} = \tfrac{1} {2}\big(\delta q_{k+1} +\delta q_{k}\big),\quad \delta v_{k,k+1} = \tfrac{1} {h}\big(\delta q_{k+1} -\delta q_{k}\big).}$$

The discrete Hamel formalism introduced below may be interpreted as a generalization of the representation (13) of discrete mechanics.

5 Discrete Hamel’s Equations

In the rest of the paper we assume that Q is a vector space. Start with a sequence of configurations \(\{q_{k}\}_{k=0}^{N}\). Given a parameter τ ∈ [0, 1], define the points \(q_{k+\tau }:= (1-\tau )q_{k} +\tau q_{k+1}\) for each 0 ≤ k ≤ N − 1. The velocity components relative to the frame u(q) at \(q_{k+\tau }\) are denoted \(\xi _{k,k+1} = (\xi _{k,k+1}^{1},\ldots,\xi _{k,k+1}^{n})\). Similar to [8, 22], the phase space for the suggested discretization of Hamel’s equation is the tangent bundle TQ. In local coordinates (q, ξ) on TQ, the discrete Lagrangian \(l^{d}: TQ \rightarrow \mathbb{R}\) reads \(l^{d} = l^{d}(q_{k+\tau },\xi _{k,k+1})\). To discretize a continuous-time system, we suggest the following procedure:

  1. (i)

    Select a frame u(q) and identify the continuous-time Lagrangian l(q, ξ), as in (5).

  2. (ii)

    Construct the discrete Lagrangian using the formula

    $$\displaystyle{l^{d}(q_{ k+\tau },\xi _{k,k+1}) = hl(q_{k+\tau },\xi _{k,k+1}).}$$

The action sum then is

$$\displaystyle{ s^{d} =\sum _{ k=0}^{N-1}l^{d}(q_{ k+\tau },\xi _{k,k+1}), }$$
(14)

which is an approximation of the action integral (8) of the continuous-time system.

Given τ ∈ [0, 1], define \(\zeta _{k+\tau }\) by the formula

$$\displaystyle{ \zeta _{k+\tau } = (1-\tau )\zeta _{k} +\tau \zeta _{k+1}. }$$
(15)

The quantities \(\zeta _{k}\), \(\zeta _{k+1}\), and \(\zeta _{k+\tau }\) will be used below to establish the discrete analogues of the variation formulae (9) and (10).

Define the discrete conjugate momentum by

$$\displaystyle{ \mu _{k,k+1}:= D_{2}l^{d}(q_{ k+\tau },\xi _{k,k+1}). }$$
(16)

Below, we use the notations

$$\displaystyle{u_{k+\tau }:= u(q_{k+\tau }),\ \ l_{k+\tau }^{d}:= l^{d}(q_{ k+\tau },\xi _{k,k+1}),\ \ u\big[l^{d}\big]_{ k+\tau }:= u\big[l^{d}\big](q_{ k+\tau },\xi _{k,k+1}),}$$

etc.

Theorem 3.

The sequence \(\big(q_{k+\tau },\xi _{k,k+1}\big) \in TQ\) satisfies the discrete Hamel equations

$$\displaystyle\begin{array}{rcl} & & \tfrac{1} {h}\big(\mu _{k-1,k} -\mu _{k,k+1}\big) +\tau u\big[l^{d}\big]_{ k-1+\tau } + (1-\tau )u\big[l^{d}\big]_{ k+\tau } \\ & & \quad +\tau \big [\xi _{k-1,k},\mu _{k-1,k}\big]_{q_{k-1+\tau }}^{{\ast}} + (1-\tau )\big[\xi _{ k,k+1},\mu _{k,k+1}\big]_{q_{k+\tau }}^{{\ast}} = 0{}\end{array}$$
(17)

if and only if

$$\displaystyle{\delta s^{d} =\delta \sum _{ k=0}^{N-1}l^{d}(q_{ k+\tau },\xi _{k,k+1}) = 0,}$$

where

$$\displaystyle\begin{array}{rcl} \delta q_{k+\tau }& =& u(q_{k+\tau })\, \cdot \,\zeta _{k+\tau },{}\end{array}$$
(18)
$$\displaystyle\begin{array}{rcl} \delta \xi _{k,k+1} = \tfrac{1} {h}\big(\zeta _{k+1} -\zeta _{k}\big) +\big [\xi _{k,k+1},\zeta _{k+\tau }\big]_{q_{k+\tau }}.& &{}\end{array}$$
(19)

Here \(\zeta _{0} =\zeta _{N} = 0\), and \(\zeta _{k+\tau }\) is defined in (15), \(k = 0,\ldots,N - 1\).

In order to obtain a complete system of equations, one supplements (17) with a discrete analogue of the kinematic equation \(\dot{q} = u(q)\, \cdot \,\xi\). There is a certain freedom in doing that. For now, we assume this discrete analogue to be

$$\displaystyle{\frac{\varDelta q_{k}} {h} = u_{k+\tau }\, \cdot \,\xi _{k,k+1}.}$$

We will use a different discretization of the kinematic equation to construct an integrator for the spherical pendulum in Section 7.

In the coordinate form, the discrete Hamel equations and the formulae for variations read

$$\displaystyle\begin{array}{rcl} & & \tfrac{1} {h}\big(\mu _{k-1,k;j} -\mu _{k,k+1;j}\big) +\tau u_{j}\big[l^{d}\big]_{ k-1+\tau } + (1-\tau )u_{j}\big[l^{d}\big]_{ k+\tau } {}\\ & & \quad +\tau c_{ij}^{a}(q_{ k-1+\tau })\xi _{k-1,k}^{i}\mu _{ k-1,k;a} + (1-\tau )c_{ij}^{a}(q_{ k+\tau })\xi _{k,k+1}^{i}\mu _{ k,k+1;a} = 0, {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \delta q_{k+\tau }^{i}& =& \psi _{ b}^{i}(q_{ k+\tau })\zeta _{k+\tau }^{b}, {}\\ \delta \xi _{k,k+1}^{b}& =& \tfrac{1} {h}\big(\zeta _{k+1}^{b} -\zeta _{ k}^{b}\big) + c_{ ij}^{b}(q_{ k+\tau })\,\xi _{k,k+1}^{i}\zeta _{ k+\tau }^{j}, {}\\ \end{array}$$

respectively.
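The coordinate form of the discrete Hamel equations can be evaluated directly. The Python sketch below computes their left-hand side for given states \((q_{k-1+\tau },\xi _{k-1,k})\) and \((q_{k+\tau },\xi _{k,k+1})\), using \(l^{d} = hl\). As a check with nonvanishing bracket terms, it assumes a free particle on \(\mathbb{R}^{3}\) with the frame \(u_{1} = \partial /\partial x\), \(u_{2} = \partial /\partial y + x\,\partial /\partial z\), \(u_{3} = \partial /\partial z\); this test system, the hand-coded derivatives of l, and all function names are illustrative assumptions. For this system a direct calculation shows that uniform straight-line motion, with ξ obtained from the discrete kinematic equation above, makes the residual vanish for any τ.

```python
def discrete_hamel_residual(prev, cur, h, tau, psi, c, dl_dq, dl_dxi):
    """Left-hand side of the coordinate form of the discrete Hamel equations.

    prev = (q_{k-1+tau}, xi_{k-1,k}), cur = (q_{k+tau}, xi_{k,k+1});
    psi(q)[j][i] = psi_j^i(q) and c(q)[i][j][a] = c_ij^a(q).
    """
    def pieces(q, xi):
        n = len(q)
        mu = [h * p for p in dl_dxi(q, xi)]  # mu = D_2 l^d with l^d = h * l
        grad = dl_dq(q, xi)
        fr, cq = psi(q), c(q)
        u_l = [h * sum(fr[j][i] * grad[i] for i in range(n)) for j in range(n)]
        bracket = [sum(cq[i][j][a] * xi[i] * mu[a]
                       for i in range(n) for a in range(n)) for j in range(n)]
        return mu, u_l, bracket

    mu0, ul0, br0 = pieces(*prev)
    mu1, ul1, br1 = pieces(*cur)
    return [(mu0[j] - mu1[j]) / h
            + tau * (ul0[j] + br0[j]) + (1 - tau) * (ul1[j] + br1[j])
            for j in range(len(mu0))]

# Illustrative data (assumptions): L = |dq/dt|^2 / 2 on R^3 and a frame with [u1, u2] = u3.
def psi(q):
    return [[1.0, 0.0, 0.0], [0.0, 1.0, q[0]], [0.0, 0.0, 1.0]]

def c(q):
    out = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    out[0][1][2], out[1][0][2] = 1.0, -1.0
    return out

def dl_dxi(q, xi):
    w = q[0] * xi[1] + xi[2]  # dz/dt expressed through (q, xi)
    return [xi[0], xi[1] + q[0] * w, w]

def dl_dq(q, xi):
    w = q[0] * xi[1] + xi[2]
    return [xi[1] * w, 0.0, 0.0]

def state(k, q0, v, h, tau):
    """(q_{k+tau}, xi_{k,k+1}) for the uniform motion q_k = q0 + k * h * v."""
    q = [a + (k + tau) * h * b for a, b in zip(q0, v)]
    return q, [v[0], v[1], v[2] - q[0] * v[1]]

h, tau, q0, v = 0.1, 0.25, [0.2, 0.1, -0.4], [0.3, -0.7, 0.5]
residual = discrete_hamel_residual(state(0, q0, v, h, tau), state(1, q0, v, h, tau),
                                   h, tau, psi, c, dl_dq, dl_dxi)
print(max(abs(r) for r in residual))  # should be at roundoff level
```

In an actual integrator the residual would be set to zero and solved, together with the discrete kinematic equation, for the unknown \((q_{k+1},\xi _{k,k+1})\); the sketch only evaluates the equations and does not implement such a solver.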

Remark.

Unlike the continuous-time case, the formulae for variations (18) and (19) cannot be derived in the manner presented in the proof of Theorem 2. The situation here is somewhat similar to the issue encountered and resolved by Chetaev in his work [10] on the equivalence of the Lagrange–d’Alembert and Gauss principles for systems with nonlinear nonholonomic constraints. Recall that Chetaev’s approach was to define variations in such a way that the two principles become equivalent.

Proof.

Using formulae (18) and (19) and computing the variation of the action sum (14), one obtains

$$\displaystyle\begin{array}{rcl} \delta s^{d}& =& \sum _{ k=0}^{N-1}D_{ 1}l^{d}(q_{ k+\tau },\xi _{k,k+1})\,\delta q_{k+\tau } + D_{2}l^{d}(q_{ k+\tau },\xi _{k,k+1})\,\delta \xi _{k,k+1} {}\\ & =& \sum _{k=0}^{N-1}\Big\langle D_{ 1}l_{k+\tau }^{d},u_{ k+\tau }\, \cdot \,\zeta _{k+\tau }\Big\rangle {}\\ & & \qquad +\Big\langle D_{2}l_{k+\tau }^{d},(\zeta _{ k+1} -\zeta _{k})/h +\big [\xi _{k,k+1},\zeta _{k+\tau }\big]_{q_{k+\tau }}\Big\rangle {}\\ & =& \sum _{k=1}^{N-1}\Big\langle \tfrac{1} {h}(\mu _{k-1,k} -\mu _{k,k+1}),\zeta _{k}\Big\rangle {}\\ & & \qquad +\sum _{k=0}^{N-1}\Big\langle u\big[l^{d}\big]_{ k+\tau } +\big [\xi _{k,k+1},\mu _{k,k+1}\big]_{q_{k+\tau }}^{{\ast}},(1-\tau )\zeta _{ k} +\tau \zeta _{k+1}\Big\rangle {}\\ & =& \sum _{k=1}^{N-1}\Big\langle \tfrac{1} {h}(\mu _{k-1,k} -\mu _{k,k+1}),\zeta _{k}\Big\rangle +\Big\langle \tau u\big[l^{d}\big]_{ k-1+\tau } + (1-\tau )u\big[l^{d}\big]_{ k+\tau },\zeta _{k}\Big\rangle {}\\ & & \qquad +\Big\langle \tau \big [\xi _{k-1,k},\mu _{k-1,k}\big]_{q_{k-1+\tau }}^{{\ast}} + (1-\tau )\big[\xi _{ k,k+1},\mu _{k,k+1}\big]_{q_{k+\tau }}^{{\ast}},\zeta _{ k}\Big\rangle. {}\\ \end{array}$$

Thus, vanishing of \(\delta s^{d}\) for arbitrary \(\zeta _{k},\ k = 1,\ldots,N - 1\), is equivalent to discrete Hamel’s equations (17). □

The formulae for variations (18) and (19) in the discrete setting are motivated by the following observations. First, recall that in the continuous-time setting the formula (10) for δ ξ follows from the formula

$$\displaystyle{ \delta (u\, \cdot \,\xi ) - \frac{d} {\mathit{dt}}(u\, \cdot \,\zeta ) = 0. }$$
(20)

A discrete analogue of δ(u ⋅ ξ) is relatively straightforward to obtain. Indeed, using the formula

$$\displaystyle{\delta q_{k+\tau } = u_{k+\tau }\, \cdot \,\zeta _{k+\tau } \equiv u_{k+\tau }\, \cdot \,\big((1-\tau )\zeta _{k} +\tau \zeta _{k+1}\big)}$$

and the interpretation of the operator δ as a directional derivative, just like in formula (11), one obtains

$$\displaystyle{\delta u_{k+\tau } =\big (\zeta _{k+\tau }\, \cdot \,u[u]\big)_{k+\tau },}$$

and therefore

$$\displaystyle\begin{array}{rcl} \delta (u_{k+\tau }\, \cdot \,\xi _{k,k+1})& =& \delta u_{k+\tau }\, \cdot \,\xi _{k,k+1} + u_{k+\tau }\, \cdot \,\delta \xi _{k,k+1} {}\\ & =& u_{k+\tau }\, \cdot \,\delta \xi _{k,k+1} +\big (\zeta _{k+\tau }\, \cdot \,u\big[\xi _{k,k+1}\, \cdot \,u\big]\big)_{k+\tau }. {}\\ \end{array}$$

However, a discrete analogue of the term \(\frac{d} {\mathit{dt}}(u\, \cdot \,\zeta )\) is not immediately available, as the operation of time differentiation is not intrinsically present in the discrete setting. A workaround that we suggest is to view the transition from \(q_{k}\) to \(q_{k+1}\) as a motion along a straight line segment at a uniform rate:

$$\displaystyle{ q_{k+\tau } = (1-\tau )q_{k} +\tau q_{k+1},\quad 0 \leq \tau \leq 1, }$$
(21)

so that \(q_{k+\tau } = q_{k}\) when τ = 0 and \(q_{k+\tau } = q_{k+1}\) when τ = 1. Since the time step is h, the analogue of continuous-time velocity is \(\varDelta q_{k}/h\), where \(\varDelta q_{k}:= q_{k+1} - q_{k}\). From (21),

$$\displaystyle{\frac{\varDelta q_{k}} {h} = \frac{1} {h} \frac{dq_{k+\tau }} {d\!\tau },}$$

leading to an interpretation of the operator

$$\displaystyle{\frac{1} {h} \frac{d} {d\!\tau }}$$

as a discrete analogue of time differentiation of continuous-time mechanics.

The discrete analogue of the term \(\frac{d} {\mathit{dt}}(u\, \cdot \,\zeta )\) thus is

$$\displaystyle\begin{array}{rcl} \frac{1} {h} \frac{d} {d\!\tau }\big(u_{k+\tau }\, \cdot \,\zeta _{k+\tau }\big)& =& \frac{1} {h} \frac{\mathit{du}_{k+\tau }} {d\!\tau } \, \cdot \,\zeta _{k+\tau } + u_{k+\tau }\, \cdot \,\frac{1} {h} \frac{d\zeta _{k+\tau }} {d\!\tau } {}\\ & =& u_{k+\tau }\, \cdot \,\frac{1} {h} \frac{d\zeta _{k+\tau }} {d\!\tau } +\big (\xi _{k,k+1}\, \cdot \,u\big[\zeta _{k+\tau }\, \cdot \,u\big]\big)_{k+\tau } {}\\ & =& u_{k+\tau }\, \cdot \,\frac{\zeta _{k+1} -\zeta _{k}} {h} +\big (\xi _{k,k+1}\, \cdot \,u\big[\zeta _{k+\tau }\, \cdot \,u\big]\big)_{k+\tau }. {}\\ \end{array}$$

Summarizing, the discrete analogue of (20) reads

$$\displaystyle{u_{k+\tau }\, \cdot \,\delta \xi _{k,k+1} = u_{k+\tau }\, \cdot \,\frac{\zeta _{k+1} -\zeta _{k}} {h} +\big [u\, \cdot \,\xi _{k,k+1},u\, \cdot \,\zeta _{k+\tau }\big]_{q_{k+\tau }},}$$

which implies formula (19) for variation δ ξ.

6 Hamel’s Formalism and Nonholonomic Integrators

In this section we study some of the structure-preserving properties of discrete Hamel’s formalism in the presence of velocity constraints.

6.1 The Lagrange–d’Alembert Principle

Assume now that there are velocity constraints imposed on the system. We confine our attention to constraints that are homogeneous in the velocity. Accordingly, we consider a configuration space Q and a distribution \(\mathcal{D}\) on Q that describes these constraints. Recall that a distribution \(\mathcal{D}\) is a collection of linear subspaces of the tangent spaces of Q; we denote these spaces by \(\mathcal{D}_{q} \subset T_{q}Q\), one for each q ∈ Q. A curve q(t) ∈ Q is said to satisfy the constraints if \(\dot{q}(t) \in \mathcal{D}_{q(t)}\) for all t. This distribution is, in general, nonintegrable; i.e., the constraints are, in general, nonholonomic.

Consider a Lagrangian \(L: TQ \rightarrow \mathbb{R}\). The equations of motion are given by the following Lagrange–d’Alembert principle.

Definition 1.

The Lagrange–d’Alembert equations of motion for the system are those determined by

$$\displaystyle{\delta \int _{a}^{b}L(q,\dot{q})\,\mathit{dt} = 0,}$$

where we choose variations δ q(t) of the curve q(t) that satisfy \(\delta q(a) =\delta q(b) = 0\) and \(\delta q(t) \in \mathcal{D}_{q(t)}\) for each t ∈ [a, b].

This principle is supplemented by the condition that the curve q(t) itself satisfies the constraints. Note that we take the variation before imposing the constraints; that is, we do not impose the constraints on the family of curves defining the variation. This is well known to be important to obtain the correct mechanical equations (see [23] and [3] for discussions and references).

6.2 Ideal Constraints

As discussed in e.g. Suslov [37] and Chetaev [11], it is assumed in classical mechanics that the constraints imposed on the system can be replaced with the reaction forces. This means that after the forces are imposed on the unconstrained system, the constraint distribution becomes a conditional invariant manifold of the forced unconstrained Lagrangian system whose dynamics on this invariant manifold is identical to that of the constrained system.

Definition 2.

Constraints (either holonomic or nonholonomic) are called ideal if their reaction forces at each q ∈ Q belong to the null space \(\mathcal{D}_{q}^{\circ }\subset T_{q}^{{\ast}}Q\) of \(\mathcal{D}_{q}\).

As shown in Suslov [37] and Chetaev [11], the reaction forces of ideal constraints are defined uniquely at each state \((q,\dot{q}) \in TQ\).

In summary, for a system subject to ideal constraints, the forced dynamics is equivalent to the Lagrange–d’Alembert principle. We refer the reader to books [37] and [11] for a more detailed exposition and history of the concept of ideal constraints.

6.3 The Constrained Hamel Equations

Given a system with velocity constraints, that is, a Lagrangian \(L: TQ \rightarrow \mathbb{R}\) and constraint distribution \(\mathcal{D}\), select the independent local vector fields

$$\displaystyle{u_{i}: Q \rightarrow TQ,\quad i = 1,\ldots,n,}$$

such that \(\mathcal{D}_{q} =\mathop{ \mathrm{span}}\nolimits \{u_{1}(q),\ldots,u_{m}(q)\}\), m < n. Each \(\dot{q} \in TQ\) can be uniquely written as

$$\displaystyle{ \dot{q} = u(q)\, \cdot \,\xi ^{\mathcal{D}} + u(q)\, \cdot \,\xi ^{\mathcal{U}}, }$$
(22)

where \(u(q)\, \cdot \,\xi ^{\mathcal{D}}\) is the component of \(\dot{q}\) along \(\mathcal{D}_{q}\) and \(u(q)\, \cdot \,\xi ^{\mathcal{U}}\) is the complementary component. Similarly, each \(a \in T^{{\ast}}Q\) can be uniquely decomposed as

$$\displaystyle{a = a_{\mathcal{D}}\, \cdot \,u^{{\ast}}(q) + a_{ \mathcal{U}}\, \cdot \,u^{{\ast}}(q),}$$

where \(a_{\mathcal{D}}\, \cdot \,u^{{\ast}}(q)\) is the component of a along the dual of \(\mathcal{D}_{q}\), where \(a_{\mathcal{U}}\, \cdot \,u^{{\ast}}(q)\) is the complementary component, and where \(u^{{\ast}}(q) \in T^{{\ast}}Q \times \ldots \times T^{{\ast}}Q\) denotes the dual frame of u(q). Using the decomposition (22), the constraints read

$$\displaystyle{ \xi =\xi ^{\mathcal{D}}\quad \text{or}\quad \xi ^{\mathcal{U}} = 0. }$$
(23)

Similar to (22), we write

$$\displaystyle{\delta q = u(q)\, \cdot \,\zeta = u(q)\, \cdot \,\zeta ^{\mathcal{D}} + u(q)\, \cdot \,\zeta ^{\mathcal{U}}.}$$

Recall that \(\delta q(t) \in \mathcal{D}_{q(t)}\), which is equivalent to

$$\displaystyle{ \zeta =\zeta ^{\mathcal{D}}\quad \text{or}\quad \zeta ^{\mathcal{U}} = 0. }$$
(24)

The Lagrange–d’Alembert principle in combination with (24) proves the following theorem:

Theorem 4.

The dynamics of a system with velocity constraints is represented by the constrained Hamel equations

$$\displaystyle{ \bigg( \frac{d} {\mathit{dt}} \frac{\partial l} {\partial \xi } -\bigg [\xi ^{\mathcal{D}}, \frac{\partial l} {\partial \xi } \bigg]_{q}^{{\ast}}- u[l]\bigg)_{ \mathcal{D}} = 0,\quad \xi ^{\mathcal{U}} = 0, }$$

coupled with the kinematic equation

$$\displaystyle{\quad \dot{q} = u(q)\, \cdot \,\xi ^{\mathcal{D}}.}$$

The constrained Lagrangian is the restriction of the Lagrangian to the constraint distribution. Thus, using Hamel’s formalism, the constrained Lagrangian reads

$$\displaystyle{l_{c}\left (q,\xi ^{\mathcal{D}}\right ) = l\left (q,\xi ^{\mathcal{D}},0\right ) \equiv l(q,\xi )\vert _{\xi ^{ \mathcal{U}}=0}.}$$

It is straightforward to check that an alternative form of the constrained Hamel equations is

$$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial l_{c}} {\partial \xi ^{\mathcal{D}}} -\bigg (\bigg[\xi ^{\mathcal{D}}, \frac{\partial l} {\partial \xi } \bigg]_{q}^{{\ast}}\bigg)_{ \mathcal{D}}- u_{\mathcal{D}}[l_{c}] = 0,\quad \xi ^{\mathcal{U}} = 0. }$$
(25)

6.4 Continuous-Time Chaplygin Systems

As an important special case, consider commutative Chaplygin systems, which are nonholonomic systems with a commutative symmetry group H, \(\dim H = n - m\), and subject to the condition that at each q ∈ Q the tangent space \(T_{q}Q\) is the direct sum of the fiber of the constraint distribution and the tangent space to the orbit \(\mathop{\mathrm{Orb}}\nolimits _{H}(q)\) of H through q:

$$\displaystyle{ T_{q}Q = \mathcal{D}_{q} \oplus T_{q}\!\mathop{ \mathrm{Orb}}\nolimits _{H}(q). }$$
(26)

To avoid technical difficulties, assume that the group H acts freely and properly on the configuration space Q, so that \(\pi: Q \rightarrow Q/H\) is a principal fiber bundle, where π is the projection. Elements of Q/H and H are denoted x and s, respectively.

Following [3], define an Ehresmann connection by requiring that \(\mathcal{D}_{q}\) and \(T_{q}\!\mathop{ \mathrm{Orb}}\nolimits _{H}(q)\) are the horizontal and vertical spaces at q ∈ Q, respectively. These spaces are denoted \(H_{q}\) and \(V_{q}\).

In other words, the nonholonomic kinematic constraints provide an Ehresmann connection on the principal bundle \(\pi: Q \rightarrow Q/H\). Under the assumptions made above, the equations of motion drop to the reduced space \(\mathcal{D}/H\), which in this special case is the same as T(Q/H).

Recall that an Ehresmann connection A on a bundle Q is a vertical-valued one-form that is a projection; i.e., \(A_{q}: T_{q}Q \rightarrow V _{q}\) is a linear map for each q ∈ Q and A(v) = v for all \(v \in V_{q}\). In the bundle coordinates (x, s) introduced above, the form A reads

$$\displaystyle{ A =\omega ^{a} \frac{\partial } {\partial s^{a}},\quad \text{where}\quad \omega ^{a}(q) = A_{\alpha }^{a}(x)\,\mathit{dx}^{\alpha } + \mathit{ds}^{a}, }$$
(27)

where \(\alpha = 1,\ldots,m\) and \(a = m + 1,\ldots,n\). Recall also that the horizontal space \(H_{q} =\ker A_{q}\), so that \(T_{q}Q = H_{q} \oplus V _{q}\), in full agreement with (26).

The curvature of A is the vertical-valued two-form defined by

$$\displaystyle{B(X,Y ) = -A([\mathop{\mathrm{hor}}\nolimits X,\mathop{\mathrm{hor}}\nolimits Y ]),}$$

where \(\mathop{\mathrm{hor}}\nolimits X\) and \(\mathop{\mathrm{hor}}\nolimits Y\) are the horizontal parts of the vectors \(X,Y \in T_{q}Q\). In the bundle coordinates (x, s),

$$\displaystyle{B(X,Y ) = B_{\alpha \beta }^{a}X^{\alpha }Y ^{\beta } \frac{\partial } {\partial s^{a}},}$$

where

$$\displaystyle{ B_{\alpha \beta }^{a} = \frac{\partial A_{\alpha }^{a}} {\partial x^{\beta }} -\frac{\partial A_{\beta }^{a}} {\partial x^{\alpha }}. }$$

Recall that the constrained Lagrangian is the restriction of the Lagrangian onto the constraint distribution: \(L_{c} = L\vert _{\mathcal{D}}\). For Chaplygin systems, L and \(L_{c}\) naturally reduce to functions on TQ/H and \(\mathcal{D}/H\), respectively. In the bundle coordinates (x, s), this simply means that L is independent of s (see Footnote 3), i.e., \(L = L(x,\dot{x},\dot{s})\), and the constrained Lagrangian reads

$$\displaystyle{ L_{c}(x,\dot{x}) = L(x,\dot{x},-A(x)\,\dot{x}). }$$

The equations of motion for Chaplygin systems,

$$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial L_{c}} {\partial \dot{x}} -\frac{\partial L_{c}} {\partial x} =\bigg\langle \frac{\partial L} {\partial \dot{s}},\mathop{\mathrm{\mathbf{i}}}\nolimits _{\dot{x}}B\bigg\rangle, }$$
(28)

or, in coordinates,

$$\displaystyle{ \frac{d} {\mathit{dt}} \frac{\partial L_{c}} {\partial \dot{x}^{\alpha }} -\frac{\partial L_{c}} {\partial x^{\alpha }} = -\frac{\partial L} {\partial \dot{s}^{a}}B_{\alpha \beta }^{a}\dot{x}^{\beta },}$$

\(\alpha,\beta = 1,\ldots,m\), \(a = m + 1,\ldots,n\), were first derived, through a coordinate calculation, by Chaplygin in [9]. They are called the Chaplygin equations.
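As an illustration, the coordinate form of the Chaplygin equations can be checked on a small example that is not taken from the text: a unit-mass particle with base coordinates (x, y), fiber coordinate s, Lagrangian \(L = \tfrac{1}{2}(\dot{x}^{2} +\dot{y}^{2} +\dot{s}^{2})\), and constraint \(\dot{s} = y\dot{x}\), so that \(A_{x} = -y\), \(A_{y} = 0\). The sketch below is written in Python under these assumptions; the function names are hypothetical. It evaluates the accelerations from the Chaplygin equations and compares them with the standard Lagrange multiplier computation for the same system.

```python
def chaplygin_accel(y, vx, vy):
    """Accelerations (ax, ay) from the coordinate Chaplygin equations.

    Assumed example: L_c = ((1 + y^2) vx^2 + vy^2) / 2 with A_x = -y, A_y = 0,
    hence B_xy = dA_x/dy - dA_y/dx = -1 and B_yx = +1.
    """
    b_xy, b_yx = -1.0, 1.0
    p_s = y * vx  # dL/d(sdot) evaluated on the constraint sdot = y * vx
    # Right-hand sides  -p_s * B_{alpha beta} * v^beta
    rhs_x = -p_s * b_xy * vy
    rhs_y = -p_s * b_yx * vx
    # Left-hand sides:
    #   d/dt[(1 + y^2) vx]        = (1 + y^2) ax + 2 y vy vx
    #   d/dt[vy] - dL_c/dy        = ay - y vx^2
    ax = (rhs_x - 2.0 * y * vy * vx) / (1.0 + y * y)
    ay = rhs_y + y * vx * vx
    return ax, ay


def multiplier_accel(y, vx, vy):
    """Same accelerations from the multiplier form of Lagrange-d'Alembert.

    The constraint one-form is ds - y dx, so xddot = -lam * y, yddot = 0,
    sddot = lam; differentiating sdot = y * xdot gives lam below.
    """
    lam = vx * vy / (1.0 + y * y)
    return -lam * y, 0.0


if __name__ == "__main__":
    print(chaplygin_accel(0.3, 1.2, -0.7))
    print(multiplier_accel(0.3, 1.2, -0.7))
```

In this example both routes give \((1 + y^{2})\ddot{x} + y\dot{x}\dot{y} = 0\) and \(\ddot{y} = 0\), which is the agreement one expects for ideal constraints.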

Following [30], we now obtain equations (28) using Hamel’s formalism. Recall that connection (27) is defined by the constraint distribution. Equivalently, the constraints read

$$\displaystyle{\dot{s} + A(x)\,\dot{x} = 0.}$$

Associated with the constraint distribution are the vector fields

$$\displaystyle{ u_{\alpha } =\mathop{ \mathrm{hor}}\nolimits \partial _{x^{\alpha }} = \partial _{x^{\alpha }} - A_{\alpha }^{a}\partial _{ s^{a}}\quad \text{and}\quad u_{a} = \partial _{s^{a}}. }$$
(29)

Using this frame,

$$\displaystyle{\dot{q} =\dot{ x}^{\alpha }u_{\alpha } + (\dot{s}^{a} + A_{\alpha }^{a}\dot{x}^{\alpha })u_{ a},}$$

\(\alpha = 1,\ldots,m\), \(a = m + 1,\ldots,n\), or, equivalently,

$$\displaystyle{\xi ^{\mathcal{D}} =\dot{ x},\quad \xi ^{\mathcal{U}} =\dot{ s} + A(x)\,\dot{x},\quad \dot{q} = u_{ \mathcal{D}}\, \cdot \,\xi ^{\mathcal{D}} + u_{ \mathcal{U}}\, \cdot \,\xi ^{\mathcal{U}},}$$

and

$$\displaystyle{ l(x,\xi ) = L\left (x,\xi ^{\mathcal{D}},\xi ^{\mathcal{U}}- A(x)\xi ^{\mathcal{D}}\right ),\quad l_{ c}\left (x,\xi ^{\mathcal{D}}\right ) = L\left (x,\xi ^{\mathcal{D}},-A(x)\xi ^{\mathcal{D}}\right ). }$$
(30)

Evaluating the Jacobi–Lie brackets of the fields (29), one obtains

$$\displaystyle{[u_{\alpha },u_{\beta }] =\Bigg (\frac{\partial A_{\alpha }^{a}} {\partial x^{\beta }} -\frac{\partial A_{\beta }^{a}} {\partial x^{\alpha }} \Bigg) \frac{\partial } {\partial s^{a}} \equiv B_{\alpha \beta }^{a} \frac{\partial } {\partial s^{a}},\quad [u_{\alpha },u_{a}] = [u_{a},u_{b}] = 0,}$$

which implies

$$\displaystyle{\bigg(\bigg[\xi ^{\mathcal{D}}, \frac{\partial l} {\partial \xi } \bigg]_{q}^{{\ast}}\bigg)_{ \mathcal{D}} =\bigg\langle \frac{\partial L} {\partial \dot{s}},\mathop{\mathrm{\mathbf{i}}}\nolimits _{\dot{x}}B\bigg\rangle,}$$

and thus (28) are just the constrained Hamel equations (25). Recall that B is the curvature of the form A.
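The bracket computation above can be checked numerically. The following Python sketch assumes a hypothetical connection with a two-dimensional base and a one-dimensional fiber, \(A_{1} = -x^{2}\), \(A_{2} = (x^{1})^{2}\) (superscripts on x are indices in the first expression and a square in the second); none of these choices come from the text. It builds the frame (29), evaluates the Jacobi–Lie brackets by central differences, and compares the vertical component of \([u_{1},u_{2}]\) with the curvature coefficient \(B_{12}\).

```python
def connection(x):
    # Hypothetical connection coefficients (A_1, A_2) at the base point x = (x1, x2).
    return [-x[1], x[0] ** 2]


def frame(q):
    # q = (x1, x2, s). Returns u_1, u_2 (horizontal fields) and u_3 (vertical field).
    a = connection(q[:2])
    return [[1.0, 0.0, -a[0]], [0.0, 1.0, -a[1]], [0.0, 0.0, 1.0]]


def directional_derivative(field, q, v, eps=1e-6):
    # Central difference of the vector-valued function `field` at q along v.
    qp = [qi + eps * vi for qi, vi in zip(q, v)]
    qm = [qi - eps * vi for qi, vi in zip(q, v)]
    return [(fp - fm) / (2.0 * eps) for fp, fm in zip(field(qp), field(qm))]


def bracket(i, j, q):
    # Jacobi-Lie bracket [u_i, u_j] = (u_i . grad) u_j - (u_j . grad) u_i.
    ui, uj = frame(q)[i], frame(q)[j]
    d_uj = directional_derivative(lambda p: frame(p)[j], q, ui)
    d_ui = directional_derivative(lambda p: frame(p)[i], q, uj)
    return [a - b for a, b in zip(d_uj, d_ui)]


def curvature_12(x, eps=1e-6):
    # B_12 = dA_1/dx^2 - dA_2/dx^1, again by central differences.
    da1_dx2 = (connection([x[0], x[1] + eps])[0]
               - connection([x[0], x[1] - eps])[0]) / (2.0 * eps)
    da2_dx1 = (connection([x[0] + eps, x[1]])[1]
               - connection([x[0] - eps, x[1]])[1]) / (2.0 * eps)
    return da1_dx2 - da2_dx1


if __name__ == "__main__":
    q = [0.5, 0.3, 0.1]
    print(bracket(0, 1, q), curvature_12(q[:2]))
```

For this assumed connection the only nonzero bracket is \([u_{1},u_{2}]\), which is vertical, while brackets involving the vertical field vanish, in line with the formula for the brackets of the fields (29).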

An important remark is that, from Chaplygin’s perspective, equations (28) are the Euler–Lagrange equations on the configuration space Q/H subject to a nonconservative force

$$\displaystyle{\bigg\langle \frac{\partial L} {\partial \dot{s}},\mathop{\mathrm{\mathbf{i}}}\nolimits _{\dot{x}}B\bigg\rangle.}$$

This force may be interpreted as a shape component of the constraint reaction.

Another important remark is that \(\dot{x}^{\alpha }\) in the classical literature are viewed as the reduced configuration velocities, whereas from the point of view of Hamel’s formalism \(\dot{x}^{\alpha }\) represent the velocity components along the non-commuting fields \(u_{\alpha }\), \(\alpha = 1,\ldots,m\).

6.5 Discrete Nonholonomic Systems

Discrete nonholonomic systems (nonholonomic integrators) were introduced by Cortés and Martínez in [12].

Let Q be a configuration space. According to Cortés and Martínez, a discrete nonholonomic mechanical system on Q is characterized by:

  (i) A discrete Lagrangian \(L^{d}: Q \times Q \rightarrow \mathbb{R}\);

  (ii) A constraint distribution \(\mathcal{D}\) on Q;

  (iii) A discrete constraint manifold \(\mathcal{D}^{d} \subset Q \times Q\) which has the same dimension as \(\mathcal{D}\) and satisfies the condition \((q,q) \in \mathcal{D}^{d}\) for all q ∈ Q.

The dynamics is given by the following discrete Lagrange–d’Alembert principle (see [12]):

$$\displaystyle{\sum _{k=1}^{N-1}\Big(D_{ 1}L^{d}(q_{ k},q_{k+1}) + D_{2}L^{d}(q_{ k-1},q_{k})\Big)\delta q_{k} = 0,\ \delta q_{k} \in \mathcal{D}_{q_{k}},\ (q_{k},q_{k+1}) \in \mathcal{D}^{d}.}$$

As pointed out in [14, 15], the discrete constraint manifold should be carefully selected when a continuous-time nonholonomic system is discretized. For the details on the properties of discrete nonholonomic systems we refer the reader to papers [12, 14, 15, 31]. In a recent paper [22], a somewhat different approach to discretizing nonholonomic systems has been suggested.

Cortés and Martínez also study the dynamics of discrete Chaplygin systems. In particular, given a continuous-time Chaplygin system, they discretize the Euler–Lagrange equations with constraint reactions, and conclude that, in general, the resulting discrete system is inconsistent with the outcome of their discrete Lagrange–d’Alembert principle. In other words, the concept of ideal constraints is not acknowledged by their discretization procedure.

Lynch and Zenkov [25, 26] proved that the discrete dynamics defined by the Lagrange–d’Alembert principle of Cortés and Martínez may lack structural stability. For example, it is possible for the discretization of a continuous-time Chaplygin system to change the dimension and/or stability of manifolds of relative equilibria of the said continuous-time system.

Below, we shall show that a different definition of the discrete Lagrange–d’Alembert principle exists that is free of the aforementioned issues. In particular, the dimension and stability of manifolds of relative equilibria are kept intact if this new version of the Lagrange–d’Alembert principle is utilized.

6.6 Hamel’s Formalism for Discrete Nonholonomic Systems

Recall that the Lagrange–d’Alembert principle for continuous-time nonholonomic systems assumes that the variation of action is carried out before imposing the constraints. The outcome is the constrained Hamel equations, as discussed in Section 6.3. In a similar manner, we accept that the dynamics of a discrete nonholonomic system is determined by the discrete Lagrange–d’Alembert principle, obtained by first taking the variation of the discrete action (14) using variations (18) and (19) subject to the discrete analogue of (24), and then imposing the discrete constraints. We emphasize that the definition of the discrete Lagrange–d’Alembert principle given here is not the same as the definition of Cortés and Martínez reproduced in Section 6.5.

In the continuous-time setting, the constraints are represented by formula (23). We thus suggest that, under the same assumptions on the frame selection as in Section 6.3, the discrete constraints are

$$\displaystyle{ \xi _{k,k+1} =\xi _{ k,k+1}^{\mathcal{D}}\quad \text{or}\quad \xi _{ k,k+1}^{\mathcal{U}} = 0. }$$

The discrete analogue of (24) is

$$\displaystyle{ \zeta _{k} =\zeta _{ k}^{\mathcal{D}}\quad \text{or}\quad \zeta _{ k}^{\mathcal{U}} = 0. }$$

Arguing as in Section 6.3, one proves the discrete analogue of Theorem 4:

Theorem 5.

The dynamics of a discrete system with velocity constraints is given by the constrained discrete Hamel equations

$$\displaystyle\begin{array}{rcl} & & \tfrac{1} {h}\big(\mu _{k-1,k} -\mu _{k,k+1}\big)_{\mathcal{D}} +\big (\tau u\big[l^{d}\big]_{ k-1+\tau } + (1-\tau )u\big[l^{d}\big]_{ k+\tau }\big)_{\mathcal{D}} \\ & &\qquad +\big (\tau \big[\xi _{k-1,k}^{\mathcal{D}},\mu _{ k-1,k}\big]_{q_{k-1+\tau }}^{{\ast}} + (1-\tau )\big[\xi _{ k,k+1}^{\mathcal{D}},\mu _{ k,k+1}\big]_{q_{k+\tau }}^{{\ast}}\big)_{ \mathcal{D}} = 0,{}\end{array}$$
(31)

where \(\mu _{k,k+1}\) is given by formula (16).

Of special interest is the value \(\tau = 1/2\), in which case one verifies that the order of approximation of (31) is 2.

6.7 Discrete Chaplygin Systems

Given a continuous-time Chaplygin system, we construct its discretization by utilizing the discrete Hamel formalism. Using the frame (29) and the continuous-time Lagrangians (30) introduced in Section 6.4, the discrete Lagrangian and the discrete constrained Lagrangian read

$$\displaystyle\begin{array}{rcl} l^{d}(x_{ k+\tau },\xi _{k,k+1})& =& hl(x_{k+\tau },\xi _{k,k+1}), {}\\ l_{c}^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right )& =& l^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) \equiv hl_{ c}\left (x_{k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ). {}\\ \end{array}$$

The dynamics is then given by equation (31), with

$$\displaystyle{(\mu _{k,k+1})_{\mathcal{D}} = D_{2}l_{c}^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) \equiv D_{ 2}l^{d}(x_{ k+\tau },\xi _{k,k+1})\vert _{\xi _{k,k+1}^{\mathcal{U}}=0}}$$

and \(\mu _{k,k+1}\) defined as in (16).

We now convert the dynamics into a discrete analogue of the Chaplygin equations (28). Following the general discretization procedure, we obtain the formulae

$$\displaystyle{\xi _{k,k+1}^{\mathcal{D}} =\varDelta x_{ k}/h,\quad \xi _{k,k+1}^{\mathcal{U}} =\varDelta s_{ k}/h + A(x_{k+\tau })\varDelta x_{k}/h.}$$

Then, invoking (30), it is straightforward to see that

$$\displaystyle\begin{array}{rcl} l^{d}(x_{ k+\tau },\xi _{k,k+1})& =& hl(x_{k+\tau },\xi _{k,k+1}) \\ & =& hL\left (x_{k+\tau },\xi _{k,k+1}^{\mathcal{D}},\xi _{ k,k+1}^{\mathcal{U}}- A(x_{ k+\tau })\xi _{k,k+1}^{\mathcal{D}}\right ){}\end{array}$$
(32)

and

$$\displaystyle\begin{array}{rcl} l_{c}^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right )& =& l^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) = hl_{ c}\left (x_{k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) \\ & =& hL_{c}\left (x_{k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) = hL_{ c}(x_{k+\tau },\varDelta x_{k}/h) \\ & & \qquad = hL(x_{k+\tau },\varDelta x_{k}/h,-A(x_{k+\tau })\varDelta x_{k}/h),{}\end{array}$$
(33)

where \(L(x,\dot{x},\dot{s})\) is the Lagrangian of the continuous-time Chaplygin system. From formulae (32), (33), and (29), one obtains

$$\displaystyle\begin{array}{rcl} \mu _{k,k+1}& =& D_{2}l^{d}(x_{ k+\tau },\xi _{k,k+1}), {}\\ (\mu _{k,k+1})_{\mathcal{D}}& =& D_{2}l_{c}^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) {}\\ & =& hD_{2}L_{c}\left (x_{k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) = hD_{ 2}L_{c}(x_{k+\tau },\varDelta x_{k}/h), {}\\ (\mu _{k,k+1})_{\mathcal{U}}& =& D_{3}l^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}},\xi _{ k,k+1}^{\mathcal{U}}\right ) {}\\ & =& hD_{3}L\left (x_{k+\tau },\xi _{k,k+1}^{\mathcal{D}},\xi _{ k,k+1}^{\mathcal{U}}- A(x_{ k+\tau })\xi _{k,k+1}^{\mathcal{D}}\right ) {}\\ & =& hD_{3}L(x_{k+\tau },\varDelta x_{k}/h,\varDelta s_{k}/h). {}\\ \end{array}$$

Next, since we utilize the frame (29) just like in the continuous-time setting, the formula

$$\displaystyle\begin{array}{rcl} & & \Big(\big[\xi _{k,k+1}^{\mathcal{D}},\mu _{ k,k+1}\big]_{q_{k+\tau }}^{{\ast}}\Big)_{ \mathcal{D}} =\Big\langle \mu _{k,k+1},\mathop{\mathrm{\mathbf{i}}}\nolimits _{\xi _{k,k+1}^{\mathcal{D}}}B_{q_{k+\tau }}\Big\rangle {}\\ & & \qquad \qquad \qquad =\Big\langle (\mu _{k,k+1})_{\mathcal{U}},\mathop{\mathrm{\mathbf{i}}}\nolimits _{\xi _{k,k+1}^{\mathcal{D}}}B_{q_{k+\tau }}\Big\rangle =\big\langle (\mu _{k,k+1})_{\mathcal{U}},\mathop{\mathrm{\mathbf{i}}}\nolimits _{\varDelta x_{k}/h}B_{q_{k+\tau }}\big\rangle {}\\ & & \qquad \qquad \qquad \qquad \qquad \qquad =\big\langle hD_{3}L(x_{k+\tau },\varDelta x_{k}/h,-A(x_{k+\tau })\varDelta x_{k}/h),\mathop{\mathrm{\mathbf{i}}}\nolimits _{\varDelta x_{k}/h}B_{q_{k+\tau }}\big\rangle {}\\ \end{array}$$

is established with the aid of the arguments of Section 6.4. To keep the formulae shorter, we write the latter expression as

$$\displaystyle{\big\langle hD_{3}L,\mathop{\mathrm{\mathbf{i}}}\nolimits _{\varDelta x_{k}/h}B\big\rangle _{k+\tau }.}$$

Finally,

$$\displaystyle\begin{array}{rcl} \big(u[l^{d}](q_{ k+\tau },\xi _{k,k+1})\big)_{\mathcal{D}}& =& D_{1}l^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) {}\\ & =& D_{1}l_{c}^{d}\left (x_{ k+\tau },\xi _{k,k+1}^{\mathcal{D}}\right ) = hD_{ 1}L_{c}(x_{k+\tau },\varDelta x_{k}/h). {}\\ \end{array}$$

Summarizing, the dynamics of the discrete Chaplygin system reads

$$\displaystyle\begin{array}{rcl} & & \tfrac{1} {h}\big((D_{2}L_{c})_{k+\tau } - (D_{2}L_{c})_{k-1+\tau }\big) =\tau (D_{1}L_{c})_{k-1+\tau } + (1-\tau )(D_{1}L_{c})_{k+\tau }\quad \\ & & \qquad +\tau \big\langle D_{3}L,\mathop{\mathrm{\mathbf{i}}}\nolimits _{\varDelta x_{k-1}/h}B\big\rangle _{k-1+\tau } + (1-\tau )\big\langle D_{3}L,\mathop{\mathrm{\mathbf{i}}}\nolimits _{\varDelta x_{k}/h}B\big\rangle _{k+\tau }, {}\end{array}$$
(34)

where \((D_{i}L_{c})_{k+\tau }:= D_{i}L_{c}(x_{k+\tau },\varDelta x_{k}/h).\) Remarkably, the discrete Chaplygin equations (34) are identical to the discretization of continuous-time Chaplygin equations (28) viewed as forced Euler–Lagrange dynamics. For more details on this latter discretization of the Chaplygin equations see [12] and [26].
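To make the structure of (34) concrete, the following Python sketch applies it with \(\tau = 1/2\) to an assumed example that is not taken from the text: a unit-mass particle with base coordinates (x, y), fiber coordinate s, \(L = \tfrac{1}{2}(\dot{x}^{2} +\dot{y}^{2} +\dot{s}^{2})\), and constraint \(\dot{s} = y\dot{x}\) (so \(A_{x} = -y\), \(A_{y} = 0\), \(B_{xy} = -1\)). For this example the y-component of (34) reduces to \(\varDelta y_{k} = \varDelta y_{k-1}\), and the x-component is linear in \(\varDelta x_{k}/h\), so each step is explicit. The function names are hypothetical.

```python
import math


def discrete_chaplygin_step(y_mid_prev, vx_prev, w, h):
    """One step of (34), tau = 1/2, for the assumed particle example.

    y_mid_prev : midpoint value y_{k-1/2}
    vx_prev    : Delta x_{k-1} / h
    w          : Delta y / h, constant along the discrete flow in this example
    Returns (y_{k+1/2}, Delta x_k / h).
    """
    y_mid = y_mid_prev + h * w
    # x-component of (34), multiplied by h:
    #   (1 + y_+^2) v_k - (1 + y_-^2) v_{k-1} = (h/2) (y_- v_{k-1} w + y_+ v_k w)
    lhs = 1.0 + y_mid ** 2 - 0.5 * h * y_mid * w
    rhs = (1.0 + y_mid_prev ** 2 + 0.5 * h * y_mid_prev * w) * vx_prev
    return y_mid, rhs / lhs


def simulate_particle(y_mid0, vx0, w, h, n_steps):
    # Iterate the step map and record the midpoint values and x-velocities.
    ys, vxs = [y_mid0], [vx0]
    for _ in range(n_steps):
        y_next, vx_next = discrete_chaplygin_step(ys[-1], vxs[-1], w, h)
        ys.append(y_next)
        vxs.append(vx_next)
    return ys, vxs


if __name__ == "__main__":
    ys, vxs = simulate_particle(-1.0, 1.0, 1.0, 0.01, 200)
    # In continuous time, xdot * sqrt(1 + y^2) is conserved for this example.
    drift = max(abs(v * math.sqrt(1.0 + y * y) - vxs[0] * math.sqrt(1.0 + ys[0] ** 2))
                for y, v in zip(ys, vxs))
    print(drift)
```

In continuous time this example conserves \(\dot{x}\sqrt{1 + y^{2}}\); the discrete flow reproduces this quantity only approximately, with an error that shrinks as h is reduced. Setting w = 0 gives a discrete solution with constant y and constant \(\varDelta x_{k}\).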

6.8 Stability

In this section we link up stability of relative equilibria of Chaplygin systems with structural stability of nonholonomic integrators.

Consider a commutative Chaplygin system characterized by the Lagrangian L and constraint distribution \(\mathcal{D}\), as discussed in Section 6.4. Assume that the dynamics of the Chaplygin system (28) is invariant with respect to the action of a commutative group G on Q/H (see Footnote 4). Often such a situation is the result of the original system being invariant with respect to the semidirect product \(G\mathop{\circledS }H\) of the groups G and H. The elements of the group G are denoted g, and we assume that the action of G on Q/H is free and proper, so that Q/H has the structure of a principal fiber bundle with the structure group G. Thus, locally, there exist the bundle coordinates x = (r, g) on Q/H.

Under certain assumptions (see e.g. [21] and [39]), the dynamics has a manifold (whose dimension equals \(\dim G\)) of relative equilibria. These relative equilibria are the solutions of (28) that in the bundle coordinates (r, g) read

$$\displaystyle{ r = r_{e},\quad \dot{g} =\eta _{e}. }$$

As established in Karapetyan [21], some of these relative equilibria may be partially asymptotically stable. Karapetyan justifies stability using the center manifold stability analysis, which, for nonholonomic systems under consideration, reduces to verifying that the nonzero spectrum of linearization of (28) at the relative equilibrium of interest belongs to the left half-plane (see Footnote 5).

Partially asymptotically stable relative equilibria are a part of the ω-limit set of dynamics (28). Similarly, relative equilibria that become partially asymptotically stable after the time reversal are a part of the α-limit set of dynamics (28).

It is important for a long-term numerical integrator to preserve the manifold of relative equilibria and their stability types. Indeed, if the limit sets of an integrator are different from the limit sets of the continuous-time dynamics, this integrator will not adequately simulate the continuous-time dynamics over long time intervals.

As shown in [25, 26], the discrete Lagrange–d’Alembert principle of Cortés and Martínez may produce discretizations that fail to preserve the manifold of relative equilibria. For instance, it may change the dimension of this manifold, thus changing the structure of the limit sets. Informally, the origin of this effect can be explained as follows: The discrete Lagrange–d’Alembert principle of Cortés and Martínez is capable of introducing reactions that correspond to non-ideal constraints. A typical example would be a reaction force with a dissipative component, whose discrete counterpart causes the aforementioned changes of relative equilibria.

A relative equilibrium of a discrete Chaplygin system (34) with commutative symmetry is a solution

$$\displaystyle{r_{k} = \text{const},\quad \varDelta g_{k} = \text{const}.}$$

Assume now that \(\tau = 1/2\) in equations (34). Let h > 0 be the time step.

Theorem 6 (Lynch and Zenkov [25, 26]).

Discretization (34) (see Footnote 6) preserves the manifold of relative equilibria of the continuous-time Chaplygin system; that is, \(r_{k} = r_{e}\), \(\varDelta g_{k} = h\eta _{e}\) is a relative equilibrium of the discretization (34) if and only if \(r = r_{e}\), \(\dot{g} =\eta _{e}\) is a relative equilibrium of the continuous-time system. The conditions for partial asymptotic stability of the equilibria of the continuous-time system and of its discretization are the same.

Summarizing, the discrete Lagrange–d’Alembert principle proposed in this paper ensures the necessary conditions for structural stability of the associated nonholonomic integrator.

7 The Spherical Pendulum

Here we outline the results of Zenkov, Leok, and Bloch [40] on the applications of the discrete Hamel formalism to the energy-momentum-preserving integrator for the spherical pendulum.

7.1 The Spherical Pendulum as a Degenerate Rigid Body

Consider a spherical pendulum whose length is r and mass is m. We view the pendulum as a point mass moving on the sphere of radius r centered at the origin of \(\mathbb{R}^{3}\). The development here is based on the representation

$$\displaystyle\begin{array}{rcl} \dot{\boldsymbol{\mu }} =\boldsymbol{\mu } \times \boldsymbol{\xi } + mg\,\boldsymbol{\gamma } \times \boldsymbol{ a},& &{}\end{array}$$
(35)
$$\displaystyle\begin{array}{rcl} \dot{\boldsymbol{\gamma }} =\boldsymbol{\gamma } \times \boldsymbol{\xi }& &{}\end{array}$$
(36)

of the pendulum’s dynamics; that is, the pendulum is viewed as a rigid body rotating about a fixed point. This rigid body is of course degenerate, with the inertia tensor \(\mathcal{I} =\mathop{ \mathrm{diag}}\nolimits \{mr^{2},mr^{2},0\}\). Here \(\boldsymbol{\xi }\) is the angular velocity of the pendulum, \(\boldsymbol{\mu }\) is its angular momentum, \(\boldsymbol{\gamma }\) is the unit vertical vector (and thus the constraint \(\|\boldsymbol{\gamma }\|= 1\) is imposed), and \(\boldsymbol{a}\) is the vector from the origin to the center of mass, which for the pendulum is its bob, all written relative to the body frame. Throughout the rest of the paper, the boldface characters represent three-dimensional vectors. The kinetic and potential energies of the pendulum are

$$\displaystyle{K = \tfrac{1} {2}\langle \boldsymbol{\mu },\boldsymbol{\xi }\rangle \equiv \tfrac{1} {2}\langle \mathop{ \mathrm{\mathcal{I}}}\nolimits \boldsymbol{\xi },\boldsymbol{\xi }\rangle,\quad U = mg\langle \boldsymbol{\gamma },\boldsymbol{a}\rangle \equiv mgr\gamma ^{3},}$$

and the Lagrangian reads

$$\displaystyle{ l(\boldsymbol{\xi },\boldsymbol{\gamma }) = \tfrac{1} {2}\langle \mathop{ \mathrm{\mathcal{I}}}\nolimits \boldsymbol{\xi },\boldsymbol{\xi }\rangle -mg\langle \boldsymbol{\gamma },\boldsymbol{a}\rangle. }$$
(37)

This Lagrangian is invariant with respect to rotations about \(\boldsymbol{\gamma }\), and therefore the vertical component of the spatial angular momentum is conserved.

There are two independent components in the vector equation (35). We emphasize that the representation (35) and (36) of the dynamics of the pendulum, though redundant, eliminates the use of local coordinates on the sphere, such as spherical coordinates. Spherical coordinates, while being a nice theoretical tool, introduce artificial singularities at the north and south poles. That is, the equations of motion written in spherical coordinates have denominators vanishing at the poles, but this has nothing to do with the physics of the problem and is solely caused by the geometry of the spherical coordinates. Thus, the use of spherical coordinates in calculations is not advisable.

Another important remark is that the length of the vector \(\boldsymbol{\gamma }\) is a conservation law of equations (35) and (36), and thus adding the constraint \(\|\boldsymbol{\gamma }\|= 1\) does not result in a system of differential-algebraic equations. The latter are known to be a nontrivial object for numerical integration.

Equations (35) and (36) may be interpreted in a number of ways. In the above, we viewed them as the dynamics of a degenerate rigid body. Since the moment of inertia relative to the direction of the vector \(\boldsymbol{a}\) is zero, the third component of the body angular momentum vanishes,

$$\displaystyle{\mu _{3} = \frac{\partial l} {\partial \xi ^{3}} = 0,}$$

and thus there are only two nontrivial equations in (35). Thus, one needs five equations to capture the pendulum dynamics. This reflects the fact that rotations about the direction of the pendulum have no influence on the pendulum’s motion.

The dynamics then can be simplified by setting

$$\displaystyle{ \xi ^{3} = 0, }$$
(38)

which leads to an interpretation of equations (35) and (36) as the dynamics of the heavy Suslov top (see Footnote 7) with a rotationally-invariant inertia tensor and constraint (38).

Summarizing, the dynamics becomes

$$\displaystyle{ \dot{\boldsymbol{\mu }}= mg\boldsymbol{\gamma } \times \boldsymbol{ a},\quad \dot{\boldsymbol{\gamma }} =\boldsymbol{\gamma } \times \boldsymbol{\xi },\quad \langle \boldsymbol{\xi },\boldsymbol{a}\rangle = 0. }$$
(39)

These equations are in fact the constrained Hamel equations, the reconstruction equation, and the constraint, written in the redundant configuration coordinates \(\boldsymbol{\gamma }= (\gamma ^{1},\gamma ^{2},\gamma ^{3})\); see [40] for details. Recall that the length of \(\boldsymbol{\gamma }\) is the conservation law, so that the constraint \(\|\boldsymbol{\gamma }\|= 1\) does not need to be imposed, but the appropriate level set of the conservation law has to be selected.

Our discretization is based on this point of view, i.e., the discrete dynamics will be written in the form of discrete Hamel’s equations. The discrete dynamics will possess the discrete version of the conservation law \(\|\boldsymbol{\gamma }\|= \text{const}\), so that the algorithm should be capable, in theory, of preserving the length of \(\boldsymbol{\gamma }\) up to machine precision.

7.2 Variational Discretization for the Spherical Pendulum

The integrator for the spherical pendulum is constructed by discretizing equations (39).

Let the positive real constant h be the time step. Applying the mid-point rule to (37), the discrete Lagrangian is computed to be

$$\displaystyle{ l^{d}(\boldsymbol{\xi }_{ k,k+1},\boldsymbol{\gamma }_{k+1/2}) = \tfrac{h} {2} \langle \mathop{ \mathrm{\mathcal{I}}}\nolimits \boldsymbol{\xi }_{k,k+1},\boldsymbol{\xi }_{k,k+1}\rangle - hU(\boldsymbol{\gamma }_{k+1/2}). }$$

Here \(\boldsymbol{\xi }_{k,k+1} = (\xi _{k,k+1}^{1},\xi _{k,k+1}^{2},0)\) is the discrete analogue of the angular velocity \(\boldsymbol{\xi }= (\xi ^{1},\xi ^{2},0)\) and \(\boldsymbol{\gamma }_{k+1/2} = \frac{1} {2}(\boldsymbol{\gamma }_{k+1} +\boldsymbol{\gamma } _{k})\). The discrete dynamics then reads

$$\displaystyle\begin{array}{rcl} \tfrac{1} {h}\mathop{ \mathrm{\mathcal{I}}}\nolimits \big(\boldsymbol{\xi }_{k,k+1} -\boldsymbol{\xi }_{k-1,k}\big) = \tfrac{1} {2}mg\big(\boldsymbol{\gamma }_{k+1/2} +\boldsymbol{\gamma } _{k-1/2}\big) \times \boldsymbol{ a},& &{}\end{array}$$
(40)
$$\displaystyle\begin{array}{rcl} \tfrac{1} {h}\big(\boldsymbol{\gamma }_{k+1/2} -\boldsymbol{\gamma }_{k-1/2}\big) = \tfrac{1} {2}\big(\boldsymbol{\gamma }_{k+1/2} +\boldsymbol{\gamma } _{k-1/2}\big) \times \tfrac{1} {2}\big(\boldsymbol{\xi }_{k,k+1} +\boldsymbol{\xi } _{k-1,k}\big).& &{}\end{array}$$
(41)

We reiterate that there is a certain flexibility in setting up the discrete analogue (41) of the continuous-time reconstruction equation (36). Our choice may be justified in a number of ways, one of them being energy conservation by the discrete dynamics.

The structure-preserving properties of the proposed integrator for the spherical pendulum are summarized in the following theorem.

Theorem 7 (Zenkov, Leok, and Bloch [40]).

The discrete spherical pendulum dynamics (40) and (41) preserves the energy, the vertical component of the spatial angular momentum, and the length of \(\boldsymbol{\gamma }\).

We refer the readers to [40] for the proof and details.
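The conservation properties can be explored with a short Python sketch. It is not the implementation of [40]. It assumes that the torque term in (40) is averaged over the two midpoint values of \(\boldsymbol{\gamma }\) (the factor 1/2 that the case \(\tau = 1/2\) of (31) gives), that \(\boldsymbol{a} = (0,0,r)\), that g = 9.8, and that the implicit equations are solved by a plain fixed-point iteration. For a given \(\boldsymbol{\xi }_{k,k+1}\), equation (41) is linear in \(\boldsymbol{\gamma }_{k+1/2}\) and is solved in closed form. All function names are hypothetical.

```python
import math

# Parameters: mass, length, gravity (assumed value), time step.
M, R, G_ACC, H = 1.0, 9.8, 9.8, 0.2


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def step(xi_prev, gam_prev, iters=30):
    """Map (xi_{k-1,k}, gamma_{k-1/2}) to (xi_{k,k+1}, gamma_{k+1/2})."""
    xi_new, gam_new = xi_prev, gam_prev
    c = H * G_ACC / (2.0 * R)
    for _ in range(iters):
        s = tuple(p + q for p, q in zip(gam_new, gam_prev))
        # (40) with a = (0, 0, R): (s x a) = (R s_2, -R s_1, 0), xi^3 stays zero.
        xi_new = (xi_prev[0] + c * s[1], xi_prev[1] - c * s[0], 0.0)
        # (41) rewritten as gam_new + w x gam_new = gam_prev - w x gam_prev.
        w = tuple(0.25 * H * (p + q) for p, q in zip(xi_new, xi_prev))
        rhs = tuple(p - q for p, q in zip(gam_prev, cross(w, gam_prev)))
        w_rhs, w_x_rhs = dot(w, rhs), cross(w, rhs)
        denom = 1.0 + dot(w, w)
        gam_new = tuple((rhs[i] + w_rhs * w[i] - w_x_rhs[i]) / denom
                        for i in range(3))
    return xi_new, gam_new


def energy(xi, gam):
    return 0.5 * M * R * R * dot(xi, xi) + M * G_ACC * R * gam[2]


def vertical_momentum(xi, gam):
    # <I xi, gamma> with I = diag(m r^2, m r^2, 0) and xi^3 = 0.
    return M * R * R * dot(xi, gam)


def simulate(n_steps):
    # Initial data (42), (43), used here as the first interval and midpoint values.
    xi = (0.6, 0.0, 0.0)
    gam = (0.3, 0.2, -math.sqrt(1.0 - 0.3 ** 2 - 0.2 ** 2))
    history = [(xi, gam)]
    for _ in range(n_steps):
        xi, gam = step(xi, gam)
        history.append((xi, gam))
    return history


if __name__ == "__main__":
    hist = simulate(1000)
    e0 = energy(*hist[0])
    print(max(abs(energy(x, g) - e0) for x, g in hist))
```

The closed-form solve of (41) is a Cayley-type rotation, so the length of \(\boldsymbol{\gamma }\) is kept up to roundoff at every iterate, while energy and momentum conservation rely on the fixed-point iteration having converged.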

7.3 Simulations

Here we present simulations of the dynamics of the spherical pendulum using the integrator constructed in Section 7.2. For simulations, we select the parameters of the system and the time step to be

$$\displaystyle{m =\mathrm{ 1\;\mathrm{\mathrm{k}\mathrm{g}}},\quad r =\mathrm{ 9.8\;\mathrm{m}},\quad h =\mathrm{.2\;\mathrm{s}}.}$$

The trajectory of the bob of the pendulum with the initial conditions

$$\displaystyle\begin{array}{rcl} \xi _{0}^{1} =\mathrm{.6\;\mathrm{\mathrm{rad}\mathrm{/}\mathrm{s}}},\qquad \qquad \xi _{ 0}^{2} =\mathrm{ 0\;\mathrm{\mathrm{rad}\mathrm{/}\mathrm{s}}},& &{}\end{array}$$
(42)
$$\displaystyle\begin{array}{rcl} \gamma _{0}^{1} =\mathrm{.3\;\mathrm{m}},\qquad \qquad \gamma _{ 0}^{2} =\mathrm{.2\;\mathrm{m}},\qquad \qquad \gamma _{ 0}^{3} =\mathrm{ -\sqrt{1 -\big (\mbox{ $\gamma $} _{ 0}^{1}\big)^{2} -\big (\mbox{ $\gamma $}_{0}^{2}\big)^{2}}\;\mathrm{m}}& &{}\end{array}$$
(43)

is shown in Figure 1a. As expected, it reveals the quasiperiodic nature of the pendulum’s dynamics.

Figure 1b shows a trajectory of the pendulum that crosses the equator. This simulation demonstrates the global nature of the algorithm and is suggestive of the geometric conservation properties of the method.

Fig. 1 Trajectories of the pendulum calculated with the Hamel integrator. (a) The pendulum’s trajectory on \(S^{2}\) for initial conditions (42) and (43). (b) The pendulum’s trajectory on \(S^{2}\) that crosses the equator

Theoretically, if one solves the nonlinear equations exactly, and in the absence of numerical roundoff error, the Hamel variational integrator should exactly preserve the length constraint and the energy. In practice, Figure 2a demonstrates that \(\|\boldsymbol{\gamma }\|\) stays within about \(10^{-14}\) of unit length after 10,000 iterations. Figure 2b demonstrates numerical energy conservation: the energy error is about \(5 \cdot 10^{-15}\) after 10,000 iterations. Indeed, one notices that the energy error tracks the length error of the simulation, which is presumably due to the relationship between the length of the pendulum and the potential energy of the pendulum. The drift in both appears to be due to accumulation of numerical roundoff error, and could possibly be reduced through the use of compensated summation techniques.

Fig. 2 Numerical properties of the Hamel integrator for the pendulum. (a) Preservation of the length of \(\boldsymbol{\gamma }\). (b) Conservation of energy
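For readers unfamiliar with compensated summation, the following Python sketch shows the classical Kahan scheme next to a naive running sum. It is a generic illustration of the technique, not a modification of the integrator studied here, and the function names are hypothetical. The reference value is computed with `math.fsum` from the standard library.

```python
import math


def naive_sum(values):
    # Plain left-to-right accumulation; roundoff errors add up step by step.
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    # Kahan compensated summation: `comp` carries the low-order bits that
    # were lost when the previous term was added to `total`.
    total = 0.0
    comp = 0.0
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total


if __name__ == "__main__":
    increments = [0.1] * 10 ** 6
    reference = math.fsum(increments)
    print(abs(naive_sum(increments) - reference))
    print(abs(kahan_sum(increments) - reference))
```

In an integrator, the same idea would be applied to the running updates of the state variables, at the cost of storing one compensation term per accumulated quantity.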

For the comparison of the Hamel integrator with simulations using the generalized Störmer–Verlet method and the RATTLE method see [40]. We point out here that the energy error for the Hamel integrator is smaller than those of the Störmer–Verlet and RATTLE methods.

8 Conclusions

This paper introduced the discrete Hamel formalism and demonstrated its utility in nonholonomic mechanics. Future work will include further study of the properties of this formalism, as well as the development of discrete Hamel’s formalism on manifolds in general, and on Lie groups and homogeneous spaces as important special cases. It would also be interesting to relate the discrete Hamel formalism to the results of Iglesias et al. [19].