1 Introduction

In numerical ordinary differential equations (ODEs), it is known that all B-series methods (including Runge–Kutta methods) preserve linear invariants, while only certain ones preserve quadratic invariants. Linear invariants arising in physical systems include mass, charge, and linear momentum; quadratic invariants include angular momentum and other momentum maps, as well as the canonical symplectic form for Hamiltonian systems. See Hairer, Lubich, and Wanner [15] and references therein.

However, for partial differential equations (PDEs) describing time evolution, it is desirable for a numerical integrator to preserve not only global invariants but also local conservation laws. For instance, the evolution may preserve total mass (a global invariant), but the mass in a particular region may change by flowing through the boundary of the region (a local conservation law). Another example is the canonical multisymplectic conservation law for Hamiltonian PDEs, which is a quadratic local conservation law for the variational equation. Focusing only on global invariants overlooks this more granular, local form of conservativity.

This paper develops a new framework for the preservation of such properties by numerical integrators. We do so by answering a much more general question: When does a numerical integrator preserve the evolution of certain classes of observables (e.g., linear, quadratic), even when those observables are not invariants? This includes not only global invariants, as previously studied, but also local conservation laws and other balance laws encountered in both conservative and dissipative dynamical systems.

The main idea of our approach is summarized as follows. Suppose \( y = y (t) \) evolves in a (finite- or infinite-dimensional) Banach space Y according to \( {\dot{y}} = f (y) \). Given a functional \( F \in C ^1 (Y) \), the chain rule implies that \( z = F (y) \) evolves according to \( {\dot{z}} = F ^\prime (y) f (y) \). Now, if \(\Phi \) is a numerical integrator, let \( \Phi _f :y _0 \mapsto y _1 \) denote its application to the original system \( {\dot{y}} = f (y) \), and let \( \Phi _g :( y _0 , z _0 ) \mapsto ( y _1, z _1 ) \) denote its application to the augmented system

$$\begin{aligned} {\dot{y}} = f (y) , \qquad {\dot{z}} = F ^\prime (y) f (y) , \end{aligned}$$
(1)

corresponding to the vector field \( g ( y , z ) = \bigl ( f (y) , F ^\prime (y) f (y) \bigr ) \). We say that \(\Phi \) is F-functionally equivariant if \( \Phi _g \) preserves the relation \( z = F (y) \), i.e., \( \Phi _g :\bigl ( y _0, F ( y _0 ) \bigr ) \mapsto \bigl ( y _1 , F ( y _1 ) \bigr ) \), for all vector fields f on Y. In other words, the following diagram commutes:

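[Figure a: commutative diagram expressing \( ( \mathrm {id}, F ) \circ \Phi _f = \Phi _g \circ ( \mathrm {id}, F ) \).]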

This is weaker than equivariance in the usual sense, since the diagram need only commute for (1), not arbitrary \((\mathrm {id}, F)\)-related vector fields. Preserving invariants becomes the special case where the augmented equation reads \( {\dot{z}} = 0 \) and the integrator leaves z constant.

We develop a theory of functional equivariance and show that it provides a useful tool kit for understanding the behavior of (especially affine and quadratic) observables, including local conservation laws and multisymplecticity. The paper is organized as follows:

  • Section 2 characterizes the functional equivariance of a large class of numerical integrators, including B-series methods, and explores some consequences for both conservative and non-conservative dynamical systems. The main result, Theorem 2.9, shows that a method is functionally equivariant for a class of observables if and only if it preserves invariants in that class. In particular, all B-series methods are affine functionally equivariant, and those preserving quadratic invariants are quadratic functionally equivariant.

  • Section 3 applies this framework to local conservation laws for PDEs and spatially semidiscretized PDEs. In particular, affine/quadratic functionally equivariant numerical integrators are seen to preserve discrete-time local conservation laws for affine/quadratic observables.

  • Section 4 applies this framework to the multisymplectic conservation law for canonical Hamiltonian PDEs and spatially semidiscretized PDEs. We show that multisymplectic semidiscretization in space, followed by a symplectic integrator in time, yields a multisymplectic method in spacetime. We also show that hybrid finite elements may be used for multisymplectic semidiscretization, generalizing the results of McLachlan and Stern [22] to time-evolution problems.

  • Finally, Sect. 5 extends the results from the class of methods considered in Sect. 2 to additive and partitioned methods, including additive/partitioned Runge–Kutta methods and splitting/composition methods.

We remark that many of the results, particularly in Sects. 2 and 5, are obtained using only the equivariance properties of methods with respect to affine maps, rather than representing them in terms of trees or Runge–Kutta tableaux. In particular, Theorem 2.12 gives a new, tree-free proof that B-series methods are closed under differentiation, while Theorem 5.20 generalizes this to additive and partitioned methods.

2 Functional Equivariance

2.1 Basic Definitions and Results

Let \( \Phi \) be a one-step numerical integrator, whose application to a vector field \( f \in {\mathfrak {X}} (Y) \) with time-step size \(\Delta t\) gives a map \( \Phi _{\Delta t,f} :Y \rightarrow Y \), \( y _0 \mapsto y _1 \). All the methods we will consider have \( \Phi _{ \Delta t, f } = \Phi _{ 1, \Delta t f } \), so it suffices to consider integrator maps \( \Phi _f := \Phi _{ 1, f } \) with unit time step. When we refer to a numerical integrator, we mean the entire collection of maps \( \Phi = \bigl \{ \Phi _f : f \in {\mathfrak {X}} (Y) ,\ Y \text { a Banach space} \bigr \} \) (see Footnote 1).

Remark 2.1

While this definition covers a large class of numerical integrators, including B-series methods, other classes of methods require additional data besides f in order to define an integrator map, e.g., an additive decomposition of f or a partitioning of Y. In Sect. 5, we will discuss how the results of this section generalize to such methods, including additive/partitioned Runge–Kutta methods and splitting/composition methods.

The main class of numerical integrators of this type that we will consider is that of B-series methods, which McLachlan, Modin, Munthe-Kaas, and Verdier [18] proved to be completely characterized by the following property of affine equivariance.

Definition 2.2

Given an affine map \( A :Y \rightarrow U \), a pair of vector fields \( f \in {\mathfrak {X}} (Y) \) and \( g \in {\mathfrak {X}} (U) \) is A-related if \( A ^\prime \circ f = g \circ A \). A numerical integrator \(\Phi \) is affine equivariant if \( A \circ \Phi _f = \Phi _g \circ A \) for all A-related f and g, all affine maps A, and all Banach spaces Y and U.

Remark 2.3

This is consistent with the definition of affine equivariance in [18]. We distinguish it from the weaker definition in Munthe-Kaas and Verdier [23], where the condition above is required only for affine isomorphisms rather than all affine maps.

Definition 2.4

Given a Gâteaux differentiable map \( F :Y \rightarrow Z \) and \( f \in {\mathfrak {X}} (Y) \), define \( g \in {\mathfrak {X}} ( Y \times Z ) \) by \( g ( y, z ) = \bigl ( f (y) , F ^\prime (y) f (y) \bigr ) \). We say that a numerical integrator \(\Phi \) is F-functionally equivariant if \( (\mathrm {id}, F ) \circ \Phi _f = \Phi _g \circ ( \mathrm {id}, F ) \) for all \( f \in {\mathfrak {X}} (Y) \). That is, if \( \Phi _f :y _0 \mapsto y _1 \), then \( \Phi _g :\bigl ( y _0, F ( y _0 ) \bigr ) \mapsto \bigl ( y _1 , F ( y _1 ) \bigr ) \). Given a class of maps \({\mathcal {F}}\), we say that \(\Phi \) is \( {\mathcal {F}} \)-functionally equivariant if this holds for all \( F \in {\mathcal {F}} ( Y, Z ) \) and all Banach spaces Y and Z.

This is a slight generalization of the situation considered in the introduction: Z may now be any Banach space rather than \( {\mathbb {R}} \), and F is only required to be Gâteaux differentiable rather than \( C ^1 \). Note that \(g \in {\mathfrak {X}} ( Y \times Z ) \) is precisely the vector field corresponding to the augmented system (1).

Example 2.5

(Runge–Kutta methods) An s-stage Runge–Kutta method has the form

$$\begin{aligned} Y _i&= y _0 + \Delta t \sum _{ j = 1 } ^s a _{ i j } f ( Y _j ) , \qquad i = 1, \ldots , s ,\\ y _1&= y _0 + \Delta t \sum _{ i = 1 } ^s b _i f ( Y _i ) , \end{aligned}$$

where \( a _{ i j } \) and \( b _i \) are given coefficients defining the method. When this method is applied to the augmented system (1), we augment the method by

$$\begin{aligned} z _1 = z _0 + \Delta t \sum _{ i = 1 } ^s b _i F ^\prime ( Y _i ) f ( Y _i ) . \end{aligned}$$

Note that the internal stages \( Z _1, \ldots , Z _s \) are not needed, since the augmented vector field depends only on y. Hence, for a Runge–Kutta method, F-functional equivariance says that

$$\begin{aligned} F ( y _1 ) = F ( y _0 ) + \Delta t \sum _{ i = 1 } ^s b _i F ^\prime ( Y _i ) f ( Y _i ) . \end{aligned}$$

In particular, if F is an invariant of f, so that \( F ^\prime (y) f (y) = 0 \) for all \(y \in Y \), then the terms of this sum vanish, and we get \( F ( y _1 ) = F ( y _0 ) \).
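As a rough numerical illustration of this identity, the following sketch compares two one-stage Runge–Kutta methods on a linear vector field \( f (y) = A y \): the implicit midpoint method, which preserves quadratic invariants, and the explicit Euler method, which does not. The matrices `A` and `S`, the covector `w`, the step size, and all function names are hypothetical choices made for this example; neither observable is an invariant of f. One would expect the defect \( |F ( y _1 ) - z _1 |\) to vanish up to rounding for the affine observable with both methods, and for the quadratic observable only with the implicit midpoint method.

```python
import numpy as np

# Hypothetical test data: a linear field f(y) = A y and two observables.
A = np.array([[0.0, 1.0], [-2.0, -0.3]])
S = np.array([[2.0, 0.5], [0.5, 1.0]])   # symmetric; F_quad(y) = y^T S y / 2
w = np.array([1.0, -3.0])                # F_aff(y) = w^T y + 4
y0 = np.array([1.0, 0.5])
dt = 0.1


def midpoint_step(y, dt):
    """Implicit midpoint (a_11 = 1/2, b_1 = 1); returns (y1, stage Y_1)."""
    eye = np.eye(len(y))
    y1 = np.linalg.solve(eye - 0.5 * dt * A, (eye + 0.5 * dt * A) @ y)
    return y1, 0.5 * (y + y1)


def euler_step(y, dt):
    """Explicit Euler (a_11 = 0, b_1 = 1); returns (y1, stage Y_1 = y0)."""
    return y + dt * (A @ y), y.copy()


def F_quad(y):
    return 0.5 * y @ S @ y


def dF_quad(y, v):
    return y @ S @ v          # F'(y) v for the quadratic observable


def F_aff(y):
    return w @ y + 4.0


def dF_aff(y, v):
    return w @ v              # F'(y) v for the affine observable


def defect(step, F, dF):
    """|F(y1) - z1| with z1 = F(y0) + dt * b_1 * F'(Y_1) f(Y_1)."""
    y1, Y1 = step(y0, dt)
    z1 = F(y0) + dt * dF(Y1, A @ Y1)
    return abs(F(y1) - z1)


for name, step in [("implicit midpoint", midpoint_step),
                   ("explicit Euler", euler_step)]:
    print(name,
          "| affine defect:", defect(step, F_aff, dF_aff),
          "| quadratic defect:", defect(step, F_quad, dF_quad))
```

The linear solve in `midpoint_step` is only valid because f is linear here; a nonlinear vector field would require an iterative solver for the stage equation.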

Proposition 2.6

Every affine equivariant method is affine functionally equivariant.

Proof

If \( F :Y \rightarrow Z \) is an affine map, then so is \( (\mathrm {id}, F ) :Y \rightarrow Y \times Z \). Since the vector fields f and g in Definition 2.4 are \( ( \mathrm {id}, F ) \)-related, the conclusion follows by affine equivariance. \(\square \)

Remark 2.7

The converse is generally not true. For instance, the map \( y _0 \mapsto y _0 + ( f {\text {div}} f ) ( y _0 ) \), which is defined for finite-dimensional Y, is seen to be affine functionally equivariant but is not affine equivariant except in the weaker sense mentioned in Remark 2.3. This is an example of an “aromatic” series that is not a B-series, cf. Munthe-Kaas and Verdier [23].

Since we are concerned for the time being with affine equivariant numerical integrators, it is natural to make the following assumptions on \({\mathcal {F}}\). These assumptions allow \({\mathcal {F}}\) to contain not only affine maps but also quadratic or higher-degree polynomial maps.

Assumption 2.8

The class of maps \({\mathcal {F}}\) satisfies the following:

  • \({\mathcal {F}} (Y, Y) \) contains the identity map for all Y;

  • \( {\mathcal {F}} ( Y, Z ) \) is a vector space for all Y and Z;

  • \( {\mathcal {F}} \) is invariant under composition with affine maps, in the following sense: If \( A :Y \rightarrow U \) and \( B :V \rightarrow Z \) are affine and \( F \in {\mathcal {F}} ( U, V ) \), then \( B \circ F \circ A \in {\mathcal {F}} ( Y, Z ) \).

As noted in the introduction, preservation of invariants may be seen as a special case of functional equivariance, so one might expect the latter property to be strictly stronger than the former. Perhaps surprisingly, our first main result shows that they are equivalent.

Theorem 2.9

Let \({\mathcal {F}}\) satisfy Assumption 2.8. A numerical integrator \(\Phi \) preserves \({\mathcal {F}}\)-invariants if and only if it is \({\mathcal {F}}\)-functionally equivariant.

Proof

\(( \Rightarrow ) \) Suppose \(\Phi \) preserves \({\mathcal {F}}\)-invariants. Given \( F \in {\mathcal {F}} ( Y, Z ) \), it follows from Assumption 2.8 that \( G ( y, z ) = F (y) - z \) is in \( {\mathcal {F}} ( Y \times Z , Z ) \). This is an invariant of the augmented vector field \(g(y, z ) = \bigl ( f (y) , F ^\prime (y) f (y) \bigr ) \), since \( G ^\prime ( y, z ) g ( y, z ) = F ^\prime (y) f (y) - F ^\prime (y) f (y) = 0 \). Hence, preservation of \({\mathcal {F}}\)-invariants implies \( \Phi _g :( y _0, z _0 ) \mapsto ( y _1 , z _1 ) \) satisfies \( G ( y _1, z _1 ) = G ( y _0 , z _0 ) \), i.e., \( F ( y _1 ) - z _1 = F ( y _0 ) - z _0 \). In particular, \( z _0 = F ( y _0 ) \) implies \( z _1 = F ( y _1 ) \).

\( ( \Leftarrow ) \) Conversely, suppose \(\Phi \) is \({\mathcal {F}}\)-functionally equivariant. If \(F \in {\mathcal {F}} ( Y, Z ) \) is an invariant of \(f \in {\mathfrak {X}} ( Y ) \), then the augmented vector field is \( g = ( f, 0 ) \), and \({\mathcal {F}}\)-functional equivariance implies \( \Phi _g :\bigl ( y _0, F ( y _0 ) \bigr ) \mapsto \bigl ( y _1 , F ( y _1 ) \bigr ) \). However, any constant functional \( y \mapsto c \in Z \) is also in \( {\mathcal {F}} ( Y , Z ) \) and has the same augmented vector field g, so \( \Phi _g :( y _0, c ) \mapsto ( y _1 , c ) \) for all \( c \in Z \). Hence, \( F ( y _1 ) = F ( y _0 ) \). \(\square \)

Corollary 2.10

For B-series methods, the following statements hold:

  (a) Every B-series method is affine functionally equivariant.

  (b) B-series methods preserving quadratic invariants (e.g., Gauss–Legendre collocation methods) are quadratic functionally equivariant.

  (c) No B-series method is cubic functionally equivariant.

Proof

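Part (a) follows from Proposition 2.6, since B-series methods are affine equivariant. Parts (b) and (c) follow from Theorem 2.9, applied to the classes of quadratic and cubic maps, since no B-series method preserves arbitrary cubic invariants (Chartier and Murua [9], Iserles, Quispel, and Tse [16]). \(\square \)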

2.2 Strong Equivariance vs. Functional Equivariance

There is a stronger notion of F-equivariance, based on a straightforward generalization of Definition 2.2 to nonlinear maps \( F :Y \rightarrow U \). Two vector fields \( f \in {\mathfrak {X}} (Y) \) and \( g \in {\mathfrak {X}} (U) \) are F-related if \( F ^\prime (y) f (y) = ( g \circ F ) (y) \) for all \( y \in Y \), and \(\Phi \) is F-equivariant if \( F \circ \Phi _f = \Phi _g \circ F \) whenever this is the case.

To illustrate the distinction with functional equivariance, we now show that the implicit midpoint method is not quadratic equivariant in this stronger sense, although Corollary 2.10(b) tells us that it is quadratic functionally equivariant. Let \( F :{\mathbb {R}} \rightarrow {\mathbb {R}} , y \mapsto y ^2 \), and observe that the vector fields

$$\begin{aligned} f (y) = -y , \qquad g (u) = -2 u , \end{aligned}$$

are F-related. Applying the implicit midpoint method with time step size \(\Delta t = 1 \) gives

$$\begin{aligned} y _1 = \frac{ 1 }{ 3 } y _0 , \qquad u _1 = 0 . \end{aligned}$$

Since \( u _1 \ne ( y _1 ) ^2 \) for \( y _0 \ne 0 \), the method is not F-equivariant. On the other hand, applying the method to the augmented equation \( {\dot{z}} = F ^\prime (y) f (y) = - 2 y ^2 \) with \( z _0 = ( y _0 ) ^2 \) gives

$$\begin{aligned} z _1 = ( y _0 ) ^2 - 2 \biggl ( \frac{ y _0 + y _1 }{ 2 } \biggr ) ^2 = ( y _0 ) ^2 - 2 \biggl ( \frac{ y _0 + \frac{ 1 }{ 3 } y _0 }{ 2 } \biggr ) ^2 = \frac{ 1 }{ 9 } (y _0) ^2 = ( y _1 ) ^2 , \end{aligned}$$

which illustrates that the method is F-functionally equivariant.
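The arithmetic in this example can be checked exactly with rational numbers. The sketch below is only a verification aid: the helper `midpoint_linear` and the initial value `y0` are hypothetical, and the closed-form update is specific to scalar linear equations \( {\dot{y}} = \lambda y \).

```python
from fractions import Fraction


def midpoint_linear(lam, y0, dt):
    """Implicit midpoint for y' = lam * y.

    Solves y1 = y0 + dt * lam * (y0 + y1) / 2 for y1 in closed form.
    """
    return y0 * (1 + dt * lam / 2) / (1 - dt * lam / 2)


y0 = Fraction(5, 7)          # arbitrary nonzero initial value
dt = Fraction(1)

# f(y) = -y on Y, and the F-related field g(u) = -2u started at u0 = F(y0).
y1 = midpoint_linear(Fraction(-1), y0, dt)
u1 = midpoint_linear(Fraction(-2), y0 ** 2, dt)

# Augmented equation z' = F'(y) f(y) = -2 y^2, evaluated at the midpoint stage.
Y_stage = (y0 + y1) / 2
z1 = y0 ** 2 + dt * (-2) * Y_stage ** 2

print("y1 =", y1, " u1 =", u1, " z1 =", z1, " y1^2 =", y1 ** 2)
```

The difference between the two computations is where the nonlinearity enters: `u1` uses the midpoint of \( u _0 \) and \( u _1 \), whereas `z1` uses the square of the midpoint of \( y _0 \) and \( y _1 \).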

Essentially, functional equivariance requires only that \(\Phi \) commute with particular pairs of related vector fields, while strong equivariance requires that it commute with all such pairs.

2.3 Affine Equivariance and Closure under Differentiation

In addition to invariants and observables that depend on y itself, we are often interested in those that depend on variations of y. We say that \(\eta \) is a variation of y if \( ( y, \eta ) \in Y \times Y \) satisfy

$$\begin{aligned} {\dot{y}} = f (y) , \qquad {\dot{\eta }} = f ^\prime (y) \eta , \end{aligned}$$
(2)

whose flow is the derivative of the flow of \( f \in {\mathfrak {X}} (Y) \). An especially important example is the canonical symplectic form for Hamiltonian ODEs, which can be understood as a quadratic invariant depending on two variations of y.

Definition 2.11

A numerical method \(\Phi \) is said to be closed under differentiation if the method applied to (2) is the derivative of \( \Phi _f \), i.e., \( ( y _1 , \eta _1 ) = \bigl ( \Phi _f ( y _0 ), \Phi ^\prime _f ( y _0 ) \eta _0 \bigr ) \).

Bochev and Scovel [5] showed that Runge–Kutta methods are closed under differentiation, from which it follows that those preserving quadratic invariants are symplectic integrators. The same argument can be applied to B-series methods, where closure under differentiation can be established by showing that it holds for all trees [15, Theorem VI.7.1]. Here, we present a new, tree-free proof that uses only affine equivariance, and which will readily generalize to additive and partitioned methods in Sect. 5.

Theorem 2.12

Affine equivariant numerical integrators are closed under differentiation.

Proof

Given \( f \in {\mathfrak {X}} (Y) \), consider the system

$$\begin{aligned} {\dot{x}} = f (x) , \qquad {\dot{y}} = f (y) , \end{aligned}$$

corresponding to \( f \times f \in {\mathfrak {X}} ( Y \times Y ) \). Since \( f \times f \) is A-related to f, where A is either of the projections \( ( x, y ) \mapsto x \) or \( ( x, y ) \mapsto y \), it follows that \( \Phi _{ f \times f } = \Phi _f \times \Phi _f \). Now, let \( \epsilon > 0 \) and take \( z = F(x,y) = ( x - y ) / \epsilon \), giving the augmented system

$$\begin{aligned} {\dot{x}} = f (x) , \qquad {\dot{y}} = f (y) , \qquad {\dot{z}} = \frac{ f (x) - f (y) }{ \epsilon } . \end{aligned}$$

By Proposition 2.6, applying \(\Phi \) to this system yields

$$\begin{aligned} x _1 = \Phi _f ( x _0 ) , \qquad y _1 = \Phi _f ( y _0 ) , \qquad z _1 = \frac{ \Phi _f ( x _0 ) - \Phi _f ( y _0 ) }{ \epsilon } . \end{aligned}$$

Finally, let \( x _0 = y _0 + \epsilon \eta _0 \) and take the limit as \( \epsilon \rightarrow 0 \). \(\square \)
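The construction in this proof can be imitated numerically. The following sketch, which assumes the classical fourth-order Runge–Kutta method and a pendulum vector field (both chosen only for illustration), applies the method to the variational system (2) and compares the resulting \( \eta _1 \) with the difference quotient \( \bigl ( \Phi _f ( y _0 + \epsilon \eta _0 ) - \Phi _f ( y _0 ) \bigr ) / \epsilon \) for a small fixed \(\epsilon \). The two should agree up to an error of order \(\epsilon \).

```python
import numpy as np


def rk4_step(f, y, dt):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def f(y):
    """Pendulum vector field on Y = R^2, y = (q, p) (hypothetical example)."""
    return np.array([y[1], -np.sin(y[0])])


def df(y, eta):
    """Directional derivative f'(y) eta."""
    return np.array([eta[1], -np.cos(y[0]) * eta[0]])


def g(w):
    """Variational system (2) on Y x Y, with w = (y, eta)."""
    y, eta = w[:2], w[2:]
    return np.concatenate([f(y), df(y, eta)])


y0 = np.array([0.7, -0.2])
eta0 = np.array([1.0, 0.5])
dt = 0.1

# Method applied to the variational system.
w1 = rk4_step(g, np.concatenate([y0, eta0]), dt)
y1, eta1 = w1[:2], w1[2:]

# Difference quotient of the method applied to f alone, as in the proof.
eps = 1e-6
quotient = (rk4_step(f, y0 + eps * eta0, dt) - rk4_step(f, y0, dt)) / eps

print("eta1     =", eta1)
print("quotient =", quotient)
print("max difference:", np.max(np.abs(eta1 - quotient)))
```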

Corollary 2.13

Let \(\Phi \) be an affine equivariant numerical integrator preserving \({\mathcal {F}}\)-invariants. Given \( F \in {\mathcal {F}} ( Y \times Y , Z ) \), define \( g \in {\mathfrak {X}} ( Y \times Y \times Z ) \) by

$$\begin{aligned} g ( y, \eta , z ) = \Bigl ( f (y) , f ^\prime (y) \eta , F ^\prime ( y, \eta ) \bigl ( f (y) , f ^\prime (y) \eta \bigr ) \Bigr ). \end{aligned}$$

Then \( \Phi _g \bigl ( y _0 , \eta _0, F ( y _0 , \eta _0 ) \bigr ) = \bigl ( y _1 , \eta _1 , F ( y _1, \eta _1 ) \bigr ) \), where \( y _1 = \Phi _f ( y _0 ) \) and \( \eta _1 = \Phi _f ^\prime ( y _0 ) \eta _0 \).

Proof

Apply Theorems 2.9 and 2.12. \(\square \)

Remark 2.14

It is trivial to extend Corollary 2.13 to the case where F depends on two or more variations of y, e.g., \( F = F ( y, \xi , \eta ) \) where \( {\dot{\xi }} = f ^\prime (y) \xi \) and \( {\dot{\eta }} = f ^\prime (y) \eta \).

2.4 Examples

Before discussing applications to conservation laws for PDEs, which will be the subject of Sect. 3, we first illustrate some examples of functional equivariance for numerical ODEs.

2.4.1 Hamiltonian Systems

Suppose Y is equipped with a Poisson bracket \( \{ \cdot , \cdot \} \). Given \( H :Y \rightarrow {\mathbb {R}} \), the corresponding Hamiltonian vector field \( f \in {\mathfrak {X}} (Y) \) is determined by the condition \( {\dot{F}} = \{ F , H \} \) for \( F :Y \rightarrow {\mathbb {R}} \). That is, the augmented system (1) can be written

$$\begin{aligned} {\dot{y}} = f (y) , \qquad {\dot{z}} = \{ F, H \} (y) . \end{aligned}$$

Hence, if \(\Phi \) is \({\mathcal {F}}\)-functionally equivariant, then applying \(\Phi \) to this system gives a “discrete version” of \( {\dot{F}} = \{ F, H \} \) for \( F \in {\mathcal {F}} ( Y, {\mathbb {R}} ) \). For a Runge–Kutta method, this has the form

$$\begin{aligned} F ( y _1 ) = F ( y _0 ) + \Delta t \sum _{ i = 1 } ^s b _i \{ F, H \} ( Y _i ) . \end{aligned}$$

This holds for any Poisson bracket, not just the canonical bracket or those for which the Poisson tensor is constant. Preservation of \({\mathcal {F}}\)-invariants is the special case where \( \{ F, H \} = 0 \).

2.4.2 Canonical Hamiltonian Systems with Damping/Forcing

Let \( Y = {\mathbb {R}} ^{ 2 n } \) with canonical coordinates \( y = ( q, p ) \). Consider the system

$$\begin{aligned} {\dot{q}} = \nabla _p H (q,p) , \qquad {\dot{p}} = - \nabla _q H (q,p) - c p , \end{aligned}$$
(3)

where H is a Hamiltonian and \( c \ge 0 \) a constant parameter. If H has the special form \( H ( q, p ) = \frac{1}{2} p ^T M ^{-1} p + V (q) \), where M is a positive definite mass matrix and V is a potential energy function, then energy is dissipated according to \( \frac{\mathrm {d}}{\mathrm {d}t} H ( q, p ) = - c p ^T M ^{-1} p \), and the parameter c dictates the rate of dissipation.

If V is also quadratic, then so is H, and hence any quadratic functionally equivariant method \(\Phi \) yields a discrete version of this dissipation law. If \(\Phi \) is a Runge–Kutta method preserving quadratic invariants, then this has the form

$$\begin{aligned} H ( q _1 , p _1 ) = H ( q _0 , p _0 ) - \Delta t \sum _{ i = 1 } ^s b _i c P _i ^T M ^{-1} P _i . \end{aligned}$$

There is no reason to restrict to linear damping: if we replace the damping term \( - c p \) in (3) by an arbitrary forcing term \( \phi ( q, p ) \), then we obtain

$$\begin{aligned} H(q _1, p _1 ) = H(q _0, p _0) + \Delta t \sum _{ i = 1 } ^s b _i P _i ^T M ^{-1} \phi ( Q _i , P _i ) , \end{aligned}$$
(4)

where the sum on the right-hand side approximates the work done by \(\phi \). When V is not quadratic, the identities above generally do not hold. However, since the kinetic energy functional \(\frac{1}{2} p^T M^{-1} p\) is quadratic, we still get the weaker identity

$$\begin{aligned} \frac{1}{2} p _1 ^T M ^{-1} p _1 = \frac{1}{2} p _0 ^T M ^{-1} p _0 + \Delta t \sum _{ i = 1 } ^s b _i P _i ^T M ^{-1} \bigl [ - \nabla V ( Q _i ) + \phi ( Q _i , P _i ) \bigr ] , \end{aligned}$$

where the sum approximates work done by both conservative and non-conservative forces.
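For a concrete check of the discrete dissipation law with linear damping, the sketch below applies the implicit midpoint method to (3) with a quadratic Hamiltonian. The mass and stiffness matrices, the damping constant, the step size, and the initial data are all hypothetical values chosen for illustration. Because the system is linear, the implicit stage can be found with a single linear solve, and the stage momentum is \( P _1 = \frac{1}{2} ( p _0 + p _1 ) \).

```python
import numpy as np

n = 2
Minv = np.diag([1.0, 0.5])                  # inverse mass matrix, M = diag(1, 2)
K = np.array([[2.0, -1.0], [-1.0, 2.0]])    # V(q) = q^T K q / 2
c = 0.4                                     # damping constant
dt = 0.25

# Linear vector field of (3): d/dt (q, p) = A (q, p).
A = np.block([[np.zeros((n, n)), Minv], [-K, -c * np.eye(n)]])


def H(y):
    """Quadratic Hamiltonian H(q, p) = p^T M^{-1} p / 2 + q^T K q / 2."""
    q, p = y[:n], y[n:]
    return 0.5 * p @ Minv @ p + 0.5 * q @ K @ q


def midpoint_step(y):
    """Implicit midpoint for the linear system; returns (y1, stage)."""
    eye = np.eye(2 * n)
    y1 = np.linalg.solve(eye - 0.5 * dt * A, (eye + 0.5 * dt * A) @ y)
    return y1, 0.5 * (y + y1)


y = np.array([1.0, -0.5, 0.3, 0.8])
energies = [H(y)]
max_defect = 0.0
for _ in range(200):
    y1, stage = midpoint_step(y)
    P = stage[n:]
    # Right-hand side of the discrete dissipation law with s = 1, b_1 = 1.
    predicted = H(y) - dt * c * (P @ Minv @ P)
    max_defect = max(max_defect, abs(H(y1) - predicted))
    y = y1
    energies.append(H(y))

# Energy should be nonincreasing, up to a small rounding tolerance.
monotone = all(e1 <= e0 + 1e-14 for e0, e1 in zip(energies, energies[1:]))
print("largest defect in the dissipation law:", max_defect)
print("energy nonincreasing:", monotone)
```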

2.4.3 Monotone Observables

Suppose \( F \in {\mathcal {F}} (Y, {\mathbb {R}} ) \) is such that \( F ^\prime (y) f (y) \le 0 \), so F(y) is monotone decreasing. If \(\Phi \) is an \({\mathcal {F}}\)-functionally equivariant Runge–Kutta method with \( b _i \ge 0 \), then

$$\begin{aligned} F ( y _1 ) = F ( y _0 ) + \Delta t \sum _{ i = 1 } ^s b _i F ^\prime ( Y _i ) f ( Y _i ) \le F ( y _0 ) , \end{aligned}$$

so F is also monotone decreasing along the numerical solution. Conversely, any method with this monotonicity property also preserves \({\mathcal {F}}\)-invariants, and is thus \({\mathcal {F}}\)-functionally equivariant, since F is an invariant when \( \pm F \) are both monotone decreasing.

Remark 2.15

For Runge–Kutta methods, the additional condition \( b _i \ge 0 \) is needed to get monotonicity. Functional equivariance alone is not sufficient. We are not aware of a more general version of this condition for arbitrary B-series methods.

An immediate consequence is the known B-stability of Runge–Kutta methods preserving quadratic invariants with \( b _i \ge 0 \). If Y is a Hilbert space with inner product \( \langle \cdot , \cdot \rangle \), consider

$$\begin{aligned} {\dot{x}} = f (x) , \qquad {\dot{y}} = f (y) , \end{aligned}$$

on \( Y \times Y \), and let \( F ( x, y ) = \frac{1}{2} \Vert x - y \Vert ^2 \). Then \( F ^\prime ( x, y ) \bigl ( f (x) , f (y) \bigr ) = \bigl \langle x - y , f (x) - f (y) \bigr \rangle \le 0 \) implies \( F (x _1 , y _1 ) \le F ( x _0 , y _0 ) \), i.e., \( \Vert x _1 - y _1 \Vert \le \Vert x _0 - y _0 \Vert \). This is precisely the condition for B-stability, cf. Butcher [8] and Burrage and Butcher [7].

Another immediate application is to the dissipative systems in Sect. 2.4.2, when H is quadratic. If \( \phi \) is a dissipative force, in the sense that \( p ^T M ^{-1} \phi ( q, p ) \le 0 \) for all \( ( q, p ) \in {\mathbb {R}} ^{ 2 n } \), then (4) implies \( H ( q _1 , p _1 ) \le H ( q _0 , p _0 ) \), i.e., the quadratic energy is monotone decreasing along the numerical solution.
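The B-stability estimate discussed in this subsection can also be observed numerically. The sketch below uses the scalar vector field \( f (y) = - y ^3 \), which satisfies \( \langle x - y , f (x) - f (y) \rangle \le 0 \), and compares the implicit midpoint method (\( b _1 = 1 \ge 0 \), preserves quadratic invariants) with the explicit Euler method at a deliberately large step size. The vector field, the Newton iteration for the stage equation, the step size, and the sample pairs are all assumptions made for this example.

```python
def midpoint_step(y0, dt, iters=50):
    """Implicit midpoint for f(y) = -y**3.

    The stage Y solves Y + (dt / 2) * Y**3 - y0 = 0, found here by Newton's
    method started at y0; the update is y1 = y0 + dt * f(Y) = 2 * Y - y0.
    """
    Y = y0
    for _ in range(iters):
        residual = Y + 0.5 * dt * Y ** 3 - y0
        Y -= residual / (1.0 + 1.5 * dt * Y ** 2)
    return 2.0 * Y - y0


def euler_step(y0, dt):
    """Explicit Euler for f(y) = -y**3."""
    return y0 - dt * y0 ** 3


dt = 1.0
pairs = [(2.0, 0.0), (1.5, -1.0), (0.3, 0.1), (-2.0, 3.0)]

for x0, y0 in pairs:
    d0 = abs(x0 - y0)
    d_mid = abs(midpoint_step(x0, dt) - midpoint_step(y0, dt))
    d_eul = abs(euler_step(x0, dt) - euler_step(y0, dt))
    print(f"initial {d0:.4f} | midpoint {d_mid:.4f} | Euler {d_eul:.4f}")
```

With these sample pairs, the midpoint distances should never exceed the initial distances, while explicit Euler expands at least the first pair, since it maps \( 2 \mapsto -6 \) and \( 0 \mapsto 0 \) when \( \Delta t = 1 \).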

2.4.4 Symplectic and Conformal Symplectic Systems

Suppose that \(\omega \) is a continuous bilinear form on Y. Let \(\xi \) and \(\eta \) each be variations of y, so that \( ( y, \xi , \eta ) \in Y \times Y \times Y \) satisfy

$$\begin{aligned} {\dot{y}} = f (y) , \qquad {\dot{\xi }} = f ^\prime (y) \xi , \qquad {\dot{\eta }} = f ^\prime (y) \eta . \end{aligned}$$

Then \( \omega ( \xi , \eta ) \) is a quadratic functional of this augmented system, evolving according to

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \omega ( \xi , \eta ) = \omega \bigl ( f ^\prime (y) \xi , \eta \bigr ) + \omega \bigl ( \xi , f ^\prime (y) \eta \bigr ) = (L _f \omega ) _y ( \xi , \eta ) , \end{aligned}$$

where \( (L _f \omega ) _y \) is the Lie derivative along f of \(\omega \) at y [1, Theorem 6.4.1].

If \(\Phi \) preserves quadratic invariants, then we may apply quadratic functional equivariance to describe the numerical evolution of \(\omega \). Taking \( g \in {\mathfrak {X}} ( Y \times Y \times Y \times {\mathbb {R}} ) \) to be

$$\begin{aligned} g ( y, \xi , \eta , z ) = \bigl ( f (y) , f ^\prime (y) \xi , f ^\prime (y) \eta , ( L _f \omega ) _y ( \xi , \eta ) \bigr ) , \end{aligned}$$

it follows from Corollary 2.13 and Remark 2.14 that

$$\begin{aligned} \Phi _g \bigl ( y _0 , \xi _0 , \eta _0, \omega ( \xi _0, \eta _0 ) \bigr ) = \bigl ( y _1, \xi _1 , \eta _1 , \omega ( \xi _1, \eta _1 ) \bigr ) , \end{aligned}$$
(5)

where \( y _1 = \Phi _f ( y _0 ) \), \( \xi _1 = \Phi _f ^\prime ( y _0 ) \xi _0 \), and \( \eta _1 = \Phi _f ^\prime ( y _0 ) \eta _0 \). Furthermore, this implies

$$\begin{aligned} \omega ( \xi _1 , \eta _1 ) = \omega \bigl ( \Phi _f ^\prime ( y _0 ) \xi _0, \Phi _f ^\prime ( y _0 ) \eta _0 \bigr ) = ( \Phi _f ^*\omega ) _{y _0} ( \xi _0, \eta _0 ) , \end{aligned}$$

where \( ( \Phi _f ^*\omega ) _{ y _0 } \) is the pullback of \(\omega \) by \( \Phi _f \) at \( y _0 \). For a Runge–Kutta method preserving quadratic invariants, (5) takes the form

$$\begin{aligned} \omega (\xi _1 , \eta _1 ) = \omega ( \xi _0 , \eta _0 ) + \Delta t \sum _{ i = 1 } ^s b _i ( L _f \omega ) _{ Y _i } ( \Xi _i , \mathrm {H} _i ) , \end{aligned}$$
(6)

and the sum on the right-hand side expresses the difference between \( \Phi _{\Delta t f} ^*\omega \) and \( \omega \) at \( y _0 \).

In particular, suppose that \(\omega \) is antisymmetric and nondegenerate, so that \( ( Y , \omega ) \) is a symplectic vector space. If f is a symplectic vector field, satisfying \( L _f \omega = 0 \), then we recover the result of Bochev and Scovel [5] that if \(\Phi \) preserves quadratic invariants, then \( \Phi _f ^*\omega = \omega \), i.e., \(\Phi \) is a symplectic integrator. An interesting generalization is the case of conformal symplectic vector fields, satisfying \( L _f \omega = - c \omega \) for some constant c, of which (3) is a canonical example; see McLachlan and Perlmutter [17]. In this case, (6) becomes

$$\begin{aligned} \omega (\xi _1 , \eta _1 ) = \omega ( \xi _0 , \eta _0 ) - \Delta t \sum _{ i = 1 } ^s b _i c \omega ( \Xi _i , \mathrm {H} _i ) , \end{aligned}$$

which can be seen as an approximate conformal symplecticity relation. However, \( \Phi _{ \Delta t f } ^*\omega \) generally does not equal \( e ^{ - c \Delta t } \omega \) exactly unless \( c = 0 \); see McLachlan and Quispel [19, Example 7] for a counterexample when \(\Phi \) is the implicit midpoint method.
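Both statements can be checked on a small linear example. The sketch below takes (3) with \( n = 1 \) and \( H ( q, p ) = \frac{1}{2} ( q ^2 + p ^2 ) \), so that \( f (y) = A y \) and the derivative of the implicit midpoint map is a constant matrix `R`. The values of `c` and `dt` and the two variations are hypothetical. It verifies the one-stage form of the approximate conformal symplecticity relation and shows that the contraction factor of \(\omega \) differs from \( e ^{ - c \Delta t } \).

```python
import numpy as np

c, dt = 0.5, 0.5
A = np.array([[0.0, 1.0], [-1.0, -c]])    # damped oscillator, y = (q, p)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # canonical symplectic matrix


def omega(xi, eta):
    """Canonical symplectic form omega(xi, eta) = xi^T J eta."""
    return xi @ J @ eta


# For a linear field, implicit midpoint is y1 = R y0, and Phi_f'(y0) = R.
eye = np.eye(2)
R = np.linalg.solve(eye - 0.5 * dt * A, eye + 0.5 * dt * A)

xi0 = np.array([1.0, 0.2])
eta0 = np.array([-0.4, 0.7])
xi1, eta1 = R @ xi0, R @ eta0

# Midpoint stages of the two variations.
Xi = 0.5 * (xi0 + xi1)
Eta = 0.5 * (eta0 + eta1)

lhs = omega(xi1, eta1)
rhs = omega(xi0, eta0) - dt * c * omega(Xi, Eta)
ratio = lhs / omega(xi0, eta0)            # equals det(R) in two dimensions

print("relation defect:", abs(lhs - rhs))
print("numerical factor:", ratio, " exact factor:", np.exp(-c * dt))
```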

Remark 2.16

The arguments above apply without modification if \(\omega \) is a vector-valued bilinear form, i.e., a continuous bilinear map \( Y \times Y \rightarrow Z \) for some Banach space Z.

3 Application to Conservation Laws in PDEs

In this section, we apply the general results of Sect. 2 to local conservation laws in time-evolution PDEs. We also consider discrete conservation laws in numerical PDEs, when semidiscretization in space is combined with numerical integration in time.

3.1 General Approach and Examples

Let \( {\dot{y}} = f (y) \) correspond to a time-dependent system of PDEs on a domain \(\Omega \), where the Banach space Y is a function space (or product of function spaces) on \(\Omega \). Suppose that solutions satisfy a local conservation law, in the form

$$\begin{aligned} {\dot{\rho }} = - {\text {div}} J , \end{aligned}$$
(7)

where \(\rho \) and J depend on y. The notation is deliberately suggestive of Maxwell’s equations, where \(\rho \) is charge density, J is current density, and (7) is local conservation of charge.

From Theorem 2.9 and Corollary 2.10, we immediately obtain a powerful general statement about preservation of local conservation laws under numerical integration. If \( \rho = F (y) \), where \( F \in {\mathcal {F}} ( Y , Z ) \) and Z is an appropriate space of densities, then \( F ^\prime (y) f (y) = - {\text {div}} J (y) \), and thus an \({\mathcal {F}}\)-functionally equivariant integrator satisfies a discrete-time version of (7). For instance, a Runge–Kutta method preserving \({\mathcal {F}}\)-invariants satisfies

$$\begin{aligned} \rho _1 = \rho _0 - \Delta t \sum _{ i = 1 } ^s b _i {\text {div}} J ( Y _i ) . \end{aligned}$$

We note that, while \(\rho \) is required to be related to y by a functional in \({\mathcal {F}}\) (e.g., \(\rho \) is affine or quadratic in y), no such restriction is placed on J. In particular, all B-series methods preserve affine local conservation laws, while those preserving quadratic invariants also preserve quadratic local conservation laws. In the case of symplectic Runge–Kutta methods, Frasca-Caccia and Hydon [12, Section 3.1] recently proved this by a direct computation, whereas here it is seen as a particular instance of quadratic functional equivariance.

In addition to the differential form of the conservation law (7), we may also integrate over a compact subdomain \( K \subset \Omega \) and apply the divergence theorem to get

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \int _K \rho = - \int _{ \partial K } J \cdot {\widehat{n}} , \end{aligned}$$
(8)

where \( {\widehat{n}} \) denotes the outer unit normal to K. This may be seen as an integral form of the conservation law (7). In this case, if \( \int _K \rho = F (y) \) with \( F \in {\mathcal {F}} ( Y, {\mathbb {R}} ) \), then an \({\mathcal {F}}\)-functionally equivariant method satisfies a discrete-time version of (8). In the case of a Runge–Kutta method preserving \( {\mathcal {F}} \)-invariants, this has the form

$$\begin{aligned} \int _K \rho _1 = \int _K \rho _0 - \Delta t \sum _{ i = 1 } ^s b _i \int _{ \partial K } J ( Y _i ) \cdot {\widehat{n}} . \end{aligned}$$
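To see what this integral identity looks like in a computation, the sketch below uses a setting that is not taken from the text: a conservative finite-volume semidiscretization of the inviscid Burgers equation on a periodic grid, with a simple averaged interface flux, integrated by the classical fourth-order Runge–Kutta method. Here \( \rho = y \) is linear in y, and the mass in a block of cells K is a linear functional, so one would expect the change in mass over one step to equal the \( b _i \)-weighted sum of stage fluxes through the two faces of K. All names, the flux formula, and the grid parameters are hypothetical.

```python
import numpy as np

N = 64
dx = 1.0 / N
x = (np.arange(N) + 0.5) * dx
dt = 0.002


def interface_flux(y):
    """J[i]: assumed numerical flux through the right face of cell i."""
    return 0.25 * (y ** 2 + np.roll(y, -1) ** 2)


def f(y):
    """Semidiscrete vector field: minus the discrete divergence of J."""
    J = interface_flux(y)
    return -(J - np.roll(J, 1)) / dx


# Classical RK4 coefficients a_ij and b_i.
a = [[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]]
b = [1 / 6, 1 / 3, 1 / 3, 1 / 6]


def rk_step(y0):
    """One explicit RK step; returns y1 and the internal stages Y_i."""
    stages, slopes = [], []
    for i in range(4):
        Yi = y0 + dt * sum(a[i][j] * slopes[j] for j in range(i))
        stages.append(Yi)
        slopes.append(f(Yi))
    y1 = y0 + dt * sum(bi * ki for bi, ki in zip(b, slopes))
    return y1, stages


lo, hi = 10, 30   # K is the union of cells lo, ..., hi - 1


def mass(y):
    """Discrete analogue of the integral of rho over K."""
    return dx * np.sum(y[lo:hi])


def boundary_flux(y):
    """Net outward flux through the boundary of K."""
    J = interface_flux(y)
    return J[hi - 1] - J[lo - 1]


y0 = 1.0 + 0.5 * np.sin(2 * np.pi * x)
y1, stages = rk_step(y0)

predicted = mass(y0) - dt * sum(bi * boundary_flux(Yi)
                                for bi, Yi in zip(b, stages))
print("defect in the local balance:", abs(mass(y1) - predicted))
```

The same check works for any choice of `lo` and `hi` with `1 <= lo < hi <= N`, because the interior fluxes telescope.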

Example 3.1

Maxwell’s equations in \({\mathbb {R}}^3\) consist of the vector evolution equations

$$\begin{aligned} {\dot{B}} = - {\text {curl}} E , \qquad \dot{ {D} } = {\text {curl}} H - J , \end{aligned}$$

along with the scalar constraint equations

$$\begin{aligned} {\text {div}} B = 0 , \qquad {\text {div}} {D} = \rho , \end{aligned}$$

and the constitutive relations \( {D} = \epsilon E \) and \( B = \mu H \). Here, E and H are the electric and magnetic fields, D and B are the electric and magnetic flux densities, \(\epsilon \) and \(\mu \) are the electric permittivity and magnetic permeability tensors, and \(\rho \) and J are charge and current density.

Taking the divergence of the first evolution equation, we see that \( {\text {div}} B \) is a local invariant, so the constraint \( {\text {div}} B = 0 \) is preserved by the evolution. Next, interpreting the constraint \( {\text {div}} {D} = \rho \) to define \(\rho \) as a function of D, we see that taking the divergence of the second evolution equation gives the local conservation law \( {\dot{\rho }} = - {\text {div}} J \). Since \( {\text {div}} B \) and \( {\text {div}} {D} \) are both linear in \( y = ( B, {D} ) \), any B-series method will preserve the constraint \( {\text {div}} B = 0 \), together with a discrete version of the conservation law relating \(\rho \) and J.

Example 3.2

Consider the nonlinear Schrödinger (NLS) equation,

$$\begin{aligned} i {\dot{u}} + \Delta u = \phi \bigl ( |u |^2 \bigr ) u . \end{aligned}$$

A direct computation shows that solutions satisfy

$$\begin{aligned} \frac{ \partial }{ \partial t } \frac{1}{2} |u |^2 = \Im ( {\bar{u}} \,i {\dot{u}} ) = \Im ( - {\bar{u}} \Delta u ) = \Im \bigl ( - {\text {div}} ( {\bar{u}} {\text {grad}} u ) \bigr ) = - {\text {div}} \Im ( {\bar{u}} {\text {grad}} u ) , \end{aligned}$$

which is a local conservation law for the quadratic functional \( F (u) = \frac{1}{2} |u |^2 \). Since the implicit midpoint method is quadratic functionally equivariant, it follows that this conservation law is preserved, in the sense that

$$\begin{aligned} \frac{1}{2} |u _1 |^2 = \frac{1}{2} |u _0 |^2 - \Delta t {\text {div}} \Im \biggl ( \frac{ {\bar{u}} _0 + {\bar{u}} _1 }{ 2 } {\text {grad}} \frac{ u _0 + u _1 }{2} \biggr ). \end{aligned}$$

More generally, for any quadratic functionally equivariant Runge–Kutta method,

$$\begin{aligned} \frac{1}{2} |u _1 |^2 = \frac{1}{2} |u _0 |^2 - \Delta t \sum _{ i = 1 } ^s b _i {\text {div}} \Im ( {\bar{U}} _i {\text {grad}} U _i ) . \end{aligned}$$

Example 3.3

Consider the wave equation \( \ddot{ u } = \Delta u \), written as the first-order system

$$\begin{aligned} {\dot{u}} = p , \qquad {\dot{p}} = \Delta u . \end{aligned}$$

If \( y = ( u, p ) \) is a solution, then

$$\begin{aligned} \frac{ \partial }{ \partial t } \frac{1}{2} \bigl ( p ^2 + |{\text {grad}} u |^2 \bigr ) = p {\dot{p}} + {\text {grad}} u \cdot {\text {grad}} {\dot{u}} = p \Delta u + {\text {grad}} u \cdot {\text {grad}} p = {\text {div}} ( p {\text {grad}} u ) , \end{aligned}$$

which is a local conservation law for the quadratic functional \( F (u, p) = \frac{1}{2} \bigl ( p ^2 + |{\text {grad}} u |^2 \bigr ) \). Hence, applying any quadratic functionally equivariant Runge–Kutta method gives

$$\begin{aligned} \frac{1}{2} \bigl ( (p _1) ^2 + |{\text {grad}} u _1 |^2 \bigr ) = \frac{1}{2} \bigl ( (p _0) ^2 + |{\text {grad}} u _0 |^2 \bigr ) + \Delta t \sum _{ i = 1 } ^s b _i {\text {div}} ( P _i {\text {grad}} U _i ) . \end{aligned}$$

3.2 Discrete Conservation Laws in Numerical PDEs

In practice, of course, we will not be applying numerical integrators to infinite-dimensional function spaces. Rather, we typically first semidiscretize in space (e.g., using a finite difference, finite volume, or finite-element scheme), yielding a system \( {\dot{y}} _h = f _h ( y _h ) \) on a finite-dimensional vector space \( Y _h \) to which we can apply a numerical integrator.

Suppose the spatial discretization scheme preserves a semidiscrete conservation law \( {\dot{\rho }} _h = - {\text {div}} _h J _h \), where \( {\text {div}} _h \) is some “discrete divergence” operator. Then it follows from the previous section that, if \( \rho _h = F _h ( y _h ) \) for some \( F _h \in {\mathcal {F}} ( Y _h , Z _h ) \), then applying an \({\mathcal {F}}\)-functionally equivariant numerical integrator yields a fully discrete conservation law corresponding to (7). We illustrate this with a few examples, which are semidiscretized versions of those considered in the previous section.

Example 3.4

Nédélec [24] introduced a finite-element semidiscretization of Maxwell’s equations, in which E and B are approximated by piecewise-polynomial vector fields \( E _h \in H ( {\text {curl}} ; \Omega ) \) and \( B _h \in H ( {\text {div}}; \Omega ) \). This method may be written as

$$\begin{aligned} {\dot{B}} _h = - {\text {curl}} E _h , \qquad \int _\Omega \dot{ {D}} _h \cdot v _h = \int _\Omega ( H _h \cdot {\text {curl}} v _h - J _h \cdot v _h ) , \end{aligned}$$

where \( {D} _h = \epsilon E _h \), \( H _h = \mu ^{-1} B _h \), and \( v _h \) is any vector field from the same space as \( E _h \).

Taking the divergence of the first equation gives \( {\text {div}} {\dot{B}} _h = 0 \), so the constraint \( {\text {div}} B _h = 0 \) is preserved by the evolution. For the second, when \( v _h = -{\text {grad}} \phi _h \) for a piecewise-polynomial scalar field \( \phi _h \), we get

$$\begin{aligned} -\int _\Omega \dot{ {D} } _h \cdot {\text {grad}} \phi _h = \int _\Omega J _h \cdot {\text {grad}} \phi _h , \end{aligned}$$

which we may write as \( {\text {div}} _h \dot{ {D} } _h = - {\text {div}} _h J _h \). Thus, taking \( \rho _h = {\text {div}} _h {D} _h \) implies the semidiscrete charge conservation law \( {\dot{\rho }} _h = - {\text {div}} _h J _h \). (See Berchenko-Kogan and Stern [4] for a hybridization of Nédélec’s method that preserves a stronger form of this conservation law, using \( {\text {div}} \) rather than \( {\text {div}} _h \).) Since \( {\text {div}} B _h \) and \( {\text {div}} _h {D} _h \) are linear in \( y _h = ( B _h , {D} _h ) \), any B-series method will preserve \( {\text {div}} B _h = 0 \) exactly and give a discrete-time version of the charge conservation law relating \( \rho _h \) and \( J _h \).

Example 3.5

For the one-dimensional NLS equation, the finite-difference semidiscretization

$$\begin{aligned} i {\dot{u}} _k + \frac{ u _{ k + 1 } - 2 u _k + u _{ k -1 } }{ h ^2 } = \phi \bigl ( |u _k |^2 \bigr ) u _k \end{aligned}$$

satisfies the semidiscrete local conservation law

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t} \frac{1}{2} |u _k |^2 = - \frac{ 1 }{ h } \Biggl [ \Im \Biggl ( \biggl (\frac{ {\bar{u}} _k + {\bar{u}} _{ k + 1 } }{ 2 } \biggr ) \biggl ( \frac{ u _{ k + 1 } - u _k }{ h } \biggr ) \Biggr ) - \Im \Biggl ( \biggl ( \frac{ {\bar{u}} _{k-1} + {\bar{u}} _k }{ 2 } \biggr ) \biggl ( \frac{ u _k - u _{k-1} }{ h } \biggr ) \Biggr ) \Biggr ], \end{aligned}$$

where the right-hand side is a difference of midpoint approximations to \( \Im ( {\bar{u}} \,\partial _x u ) \). Hence, a discrete-time version of this conservation law is preserved by any B-series method that preserves quadratic invariants.
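As a numerical sanity check of this statement, the following is a minimal sketch using the implicit midpoint method on a periodic grid. The choices \( \phi (s) = s \) (cubic NLS), the grid size, the time step, the initial data, and all function names are assumptions made for illustration; the implicit stage is solved by a simple fixed-point iteration, which only converges when \( \Delta t \) is small relative to \( h ^2 \).

```python
import numpy as np

def nls_rhs(u, h):
    """Semidiscrete NLS on a periodic grid, solved for du/dt, with phi(s) = s (assumed)."""
    lap = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / h**2
    # i u' + lap = |u|^2 u  is equivalent to  u' = i (lap - |u|^2 u)
    return 1j * (lap - np.abs(u) ** 2 * u)

def flux(u, h):
    """Midpoint approximation to Im(conj(u) * du/dx) at the interfaces k + 1/2."""
    up = np.roll(u, -1)
    return np.imag(0.5 * (np.conj(u) + np.conj(up)) * (up - u) / h)

def midpoint_stage(u0, dt, h, tol=1e-13, max_iter=500):
    """Solve U = u0 + (dt/2) f(U) by fixed-point iteration (needs dt small vs. h^2)."""
    U = u0.copy()
    for _ in range(max_iter):
        U_new = u0 + 0.5 * dt * nls_rhs(U, h)
        converged = np.max(np.abs(U_new - U)) < tol
        U = U_new
        if converged:
            break
    return U

# Illustrative parameters and initial data (assumptions, not from the text).
N = 32
h = 2.0 * np.pi / N
dt = 0.005
x = h * np.arange(N)
u0 = (1.0 + 0.3 * np.cos(x)) * np.exp(1j * x)

U = midpoint_stage(u0, dt, h)   # internal stage U = (u0 + u1) / 2
u1 = 2.0 * U - u0               # implicit midpoint update

# Discrete-time local conservation law, checked pointwise in k:
# |u1_k|^2/2 - |u0_k|^2/2 = -(dt/h) * (F_{k+1/2}(U) - F_{k-1/2}(U))
lhs = 0.5 * np.abs(u1) ** 2 - 0.5 * np.abs(u0) ** 2
F = flux(U, h)
rhs = -(dt / h) * (F - np.roll(F, 1))
residual = np.max(np.abs(lhs - rhs))
print("max residual of discrete conservation law:", residual)
```

The residual should sit at the level of the nonlinear solver tolerance rather than at the level of the time-discretization error, since the identity holds exactly for the midpoint stage.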

Example 3.6

For the one-dimensional wave equation, consider the finite-difference semidiscretization

$$\begin{aligned} {\dot{u}} _k = p _k , \qquad {\dot{p}} _k = \frac{ u _{ k + 1 } - 2 u _k + u _{ k -1 } }{ h ^2 } . \end{aligned}$$

If we define

$$\begin{aligned} \rho _k = \frac{1}{2} p _k ^2 + \frac{ 1 }{ 4 } \biggl ( \frac{ u _{ k + 1 } - u _k }{ h } \biggr ) ^2 + \frac{ 1 }{ 4 } \biggl ( \frac{ u _k - u _{k-1} }{ h } \biggr ) ^2 , \end{aligned}$$

which is a finite-difference approximation to \( \frac{1}{2} \bigl ( p ^2 + ( \partial _x u ) ^2 \bigr ) \), then a short calculation gives the semidiscrete conservation law

$$\begin{aligned} {\dot{\rho }} _k = \frac{ 1 }{ h } \Biggl [ \biggl ( \frac{ p _k + p _{ k + 1 } }{ 2 } \biggr ) \biggl ( \frac{ u _{ k + 1 } - u _k }{ h } \biggr ) - \biggl ( \frac{ p _{ k -1 } + p _k }{ 2 } \biggr ) \biggl ( \frac{ u _k - u _{ k -1 } }{ h } \biggr ) \Biggr ], \end{aligned}$$

where the right-hand side is a difference of midpoint approximations to \( p \, \partial _x u \). As in the previous example, a discrete-time version of this conservation law is therefore preserved by any B-series method that preserves quadratic invariants.
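The following sketch illustrates this claim numerically for the implicit midpoint method and contrasts it with forward Euler, which does not preserve quadratic invariants. The periodic grid, the parameters, the initial data, and the helper names `density`, `flux`, and `residual` are assumptions chosen for illustration; since the system is linear, the implicit step is computed with a direct linear solve.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the text).
N = 64
h = 2.0 * np.pi / N
dt = 0.05
x = h * np.arange(N)

I = np.eye(N)
Z = np.zeros((N, N))
# Periodic second-difference matrix: (lap @ u)_k = (u_{k+1} - 2 u_k + u_{k-1}) / h^2
lap = (np.roll(I, 1, axis=1) - 2.0 * I + np.roll(I, -1, axis=1)) / h**2
A = np.block([[Z, I], [lap, Z]])  # d/dt (u, p) = A (u, p)

def density(y):
    """rho_k = p_k^2/2 + (forward difference)^2/4 + (backward difference)^2/4."""
    u, p = y[:N], y[N:]
    fwd = (np.roll(u, -1) - u) / h
    bwd = (u - np.roll(u, 1)) / h
    return 0.5 * p**2 + 0.25 * fwd**2 + 0.25 * bwd**2

def flux(y):
    """Midpoint approximation to p * du/dx at the interfaces k + 1/2."""
    u, p = y[:N], y[N:]
    return 0.5 * (p + np.roll(p, -1)) * (np.roll(u, -1) - u) / h

def residual(y0, y1, Y):
    """Defect in rho(y1) = rho(y0) + (dt/h) (G_{k+1/2}(Y) - G_{k-1/2}(Y)) for a one-stage method."""
    G = flux(Y)
    return np.max(np.abs(density(y1) - density(y0) - (dt / h) * (G - np.roll(G, 1))))

y0 = np.concatenate([np.sin(x), np.cos(2.0 * x)])

# Implicit midpoint: (I - dt/2 A) y1 = (I + dt/2 A) y0, with stage Y = (y0 + y1)/2.
E = np.eye(2 * N)
y1_mid = np.linalg.solve(E - 0.5 * dt * A, (E + 0.5 * dt * A) @ y0)
res_midpoint = residual(y0, y1_mid, 0.5 * (y0 + y1_mid))

# Forward Euler: y1 = y0 + dt A y0, with stage Y = y0.
y1_euler = y0 + dt * (A @ y0)
res_euler = residual(y0, y1_euler, y0)

print("implicit midpoint residual:", res_midpoint)
print("forward Euler residual:    ", res_euler)
```

For forward Euler, the defect equals \( \Delta t ^2 \) times the density evaluated on \( f ( y _0 ) \), so it is of order \( \Delta t ^2 \) rather than roundoff.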

3.3 Remarks on Quadratic Conservation Laws Arising from Point Symmetries

Conservation laws with quadratic densities are common in partial differential and differential-difference equations because of their association with linear symmetries of Hamiltonian PDEs. (See, e.g., Olver [25].) However, not all such symmetries are easily preserved under semidiscretization. We focus here on affine point symmetries, those arising from actions on the field variables.

For example, the one-dimensional NLS equation may be written in the form

$$\begin{aligned} i {\dot{u}} = \frac{ \delta }{ \delta {\bar{u}} } \int _\Omega \Bigl ( |\partial _x u |^2 + V \bigl ( |u |^2 \bigr ) \Bigr ), \end{aligned}$$

where \( V ^\prime = \phi \) and \( \delta / \delta {\bar{u}} \) is the variational derivative with respect to \( {\bar{u}} \). The integrand \( {\mathcal {H}} = |\partial _x u |^2 + V \bigl ( |u |^2 \bigr ) \) is called the Hamiltonian density. Observe that \({\mathcal {H}}\) is invariant under the diagonal U(1) action \( (u, \partial _x u ) \mapsto ( e ^{ i \alpha } u , e ^{ i \alpha } \partial _x u ) \), where \( \alpha \in {\mathbb {R}} \cong {\mathfrak {u}} (1) \). This point symmetry leads to the local conservation law for \( \rho = \frac{1}{2} |u |^2 \) in Example 3.2. More generally, any Hamiltonian density of the form \( {\mathcal {H}} = {\mathcal {H}} \bigl ( |u |^2 , |\partial _x u |^2 , {\bar{u}} \,\partial _x u \bigr ) \) has the same point symmetry, and hence \( i {\dot{u}} = \frac{ \delta }{ \delta {\bar{u}} } \int _\Omega {\mathcal {H}} \) has a local conservation law for \( \rho = \frac{1}{2} |u |^2 \).

Similarly, the one-dimensional semidiscretized NLS equation in Example 3.5 can be written

$$\begin{aligned} i {\dot{u}} _k = \frac{ \partial }{ \partial {\bar{u}} _k } \sum _\ell \biggl ( \Bigl |\frac{ u _{ \ell + 1 } - u _\ell }{ h } \Bigr |^2 + \frac{ V \bigl ( |u _\ell |^2 \bigr ) + V \bigl ( |u _{\ell +1} |^2 \bigr ) }{ 2 } \biggr ), \end{aligned}$$

where the summand can be viewed as a discrete Hamiltonian density \( {\mathcal {H}} _h \). The invariance of \( {\mathcal {H}} _h \) under the point symmetry \( u _\ell \mapsto e ^{ i \alpha } u _\ell \), \( u _{ \ell + 1 } \mapsto e ^{ i \alpha } u _{ \ell + 1 } \), yields the semidiscrete local conservation law for \( \rho _k = \frac{1}{2} |u _k |^2 \) obtained in Example 3.5. More generally, we get such a local conservation law whenever the discrete Hamiltonian density has the form \( {\mathcal {H}} _h = {\mathcal {H}} _h \bigl ( |u _\ell |^2 , |u _{ \ell + 1 } |^2 , {\bar{u}} _\ell u _{ \ell + 1 } \bigr ) \).

A related example involves orthogonal (rather than unitary) point symmetry. Suppose \( u (t, x) \) and its conjugate momentum \( p (t, x) \) both take values in \({\mathbb {R}}^3\), and let \( A \in O (3) \) act by \( z = ( u , p, \partial _x u , \partial _x p ) \mapsto ( A u, A p, A \partial _x u , A \partial _x p ) \). Then any Hamiltonian density that depends only on the 10 invariants \( z _i \cdot z _j \), \( 1 \le i \le j \le 4 \), is O(3) invariant and thus has a local conservation law for \( \rho = u \times p \). Like the U(1) point symmetry discussed above, this O(3) point symmetry is preserved under a wide class of lattice semidiscretizations, which have corresponding semidiscrete quadratic conservation laws.

By contrast with point symmetries, symmetries that involve spatial translations are typically broken by semidiscretization. However, special semidiscretizations can be constructed that preserve versions of the associated conservation laws, although these are generally not symplectic. An example is provided by the Korteweg–de Vries equation

$$\begin{aligned} \partial _t u = \partial _x ( \alpha u ^2 ) + \nu \partial _x ^3 u , \end{aligned}$$

which has a local conservation law with \( \rho = u ^2 \). The semidiscretization

$$\begin{aligned} {\dot{u}} _k = \frac{ \alpha }{ 2 h } \bigl [ \theta ( u _{ k + 1 } ^2 - u _{ k -1 } ^2 ) + 2 ( 1 - \theta ) u _k ( u _{ k + 1 } - u _{ k -1 } ) \bigr ] + \frac{ \nu }{ 2 h ^3 } ( u _{ k + 2 } - 2 u _{ k + 1 } + 2 u _{ k -1 } - u _{ k - 2 } ) \end{aligned}$$

has a semidiscrete conservation law with density \( \rho _k = u _k ^2 \) only for the parameter \( \theta = 2/3 \) (Ascher and McLachlan [3]).
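A quick numerical check of the role of \( \theta \) is sketched below. It tests only the summed (global) consequence of the local law on a periodic grid, namely that \( \frac{\mathrm {d}}{\mathrm {d}t} \sum _k u _k ^2 = 2 \sum _k u _k {\dot{u}} _k \) vanishes for arbitrary grid data when \( \theta = 2/3 \). The periodic boundary conditions, the values \( \alpha = \nu = 1 \), the random test data, and the function names are assumptions made for illustration.

```python
import numpy as np

def kdv_rhs(u, h, theta, alpha=1.0, nu=1.0):
    """Right-hand side of the theta-family semidiscretization on a periodic grid."""
    up1, um1 = np.roll(u, -1), np.roll(u, 1)
    up2, um2 = np.roll(u, -2), np.roll(u, 2)
    nonlinear = alpha / (2.0 * h) * (
        theta * (up1**2 - um1**2) + 2.0 * (1.0 - theta) * u * (up1 - um1)
    )
    dispersive = nu / (2.0 * h**3) * (up2 - 2.0 * up1 + 2.0 * um1 - um2)
    return nonlinear + dispersive

def total_density_rate(u, h, theta):
    """d/dt sum_k u_k^2 = 2 sum_k u_k * du_k/dt along the semidiscrete flow."""
    return 2.0 * np.sum(u * kdv_rhs(u, h, theta))

# Arbitrary (non-smooth) grid data, so that no cancellation comes from smoothness.
N = 32
h = 2.0 * np.pi / N
rng = np.random.default_rng(0)
u = rng.standard_normal(N)

rate_two_thirds = total_density_rate(u, h, 2.0 / 3.0)
rate_zero = total_density_rate(u, h, 0.0)
rate_one = total_density_rate(u, h, 1.0)

print("theta = 2/3:", rate_two_thirds)
print("theta = 0:  ", rate_zero)
print("theta = 1:  ", rate_one)
```

The dispersive term telescopes for every \( \theta \); the nonlinear term contributes a multiple of \( ( 3 \theta - 2 ) \sum _k ( u _k u _{ k + 1 } ^2 - u _k ^2 u _{ k + 1 } ) \), which is why only \( \theta = 2/3 \) gives a vanishing rate for generic data.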

Frasca-Caccia and Hydon [12] give general techniques for constructing finite-difference semidiscretizations preserving several local conservation laws—linear, quadratic, or otherwise—with many examples. When such methods are used in conjunction with B-series methods for time integration, it follows from Theorem 2.9 and Corollary 2.10 that affine local conservation laws are always preserved in a discrete sense, while quadratic local conservation laws are preserved by any B-series method that preserves quadratic invariants.

4 Multisymplectic Integrators

In this section, we apply the foregoing theory to the multisymplectic conservation law for canonical Hamiltonian PDEs and its preservation by numerical integrators. Since this is a quadratic local conservation law depending on variations of solutions, it follows that B-series methods preserving quadratic invariants also preserve a discrete-time version of the multisymplectic conservation law. Furthermore, we discuss techniques for spatial semidiscretization that preserve a semidiscrete multisymplectic conservation law, reviewing some known results for finite-difference semidiscretization and introducing new results for finite-element semidiscretization. Consequently, when such methods are used in conjunction with B-series methods preserving quadratic invariants, the resulting method will satisfy a fully discrete multisymplectic conservation law.

4.1 Canonical Hamiltonian PDEs

Before discussing the canonical Hamiltonian formalism for time-evolution PDEs, we first quickly recall the stationary (time-independent) case, following the treatment in McLachlan and Stern [22].

Given a spatial domain \( \Omega \subset {\mathbb {R}}^m \) with coordinates \( x = ( x ^1 , \ldots , x ^m ) \), let \( u :\Omega \rightarrow {\mathbb {R}}^n \) and \( \sigma :\Omega \rightarrow {\mathbb {R}} ^{ m n } \) be unknown fields. The de Donder–Weyl equations [11, 30] for a Hamiltonian \( H :\Omega \times {\mathbb {R}}^n \times {\mathbb {R}} ^{ m n } \rightarrow {\mathbb {R}} \), \( H = H ( x, u, \sigma ) \), are

$$\begin{aligned} \partial _\mu u ^i = \frac{ \partial H }{ \partial \sigma _i ^\mu } , \qquad - \partial _\mu \sigma _i ^\mu = \frac{ \partial H }{ \partial u ^i } , \end{aligned}$$
(9)

where \( \mu = 1 , \ldots , m \) and \( i = 1 , \ldots , n \). Here and henceforth, we use the Einstein index convention of summing over repeated indices; for instance, \( \partial _\mu \sigma _i ^\mu \) has an implied sum over \(\mu \) and therefore corresponds to the divergence of the vector field \( \sigma _i \).

Now, for time-dependent problems, we let \(u = u ^i ( t, x ) \) and \( \sigma = \sigma _i ^\mu ( t, x ) \) depend on \(t \in ( t _0, t _1 ) \), and we introduce an additional unknown field \( p = p _i ( t, x ) \). The de Donder–Weyl equations for \( H :( t _0, t _1 ) \times \Omega \times {\mathbb {R}}^n \times {\mathbb {R}}^n \times {\mathbb {R}} ^{ m n } \rightarrow {\mathbb {R}} \), \( H = H ( t, x, u, p, \sigma ) \), are then given by

$$\begin{aligned} {\dot{u}} ^i = \frac{ \partial H }{ \partial p _i } , \qquad \partial _\mu u ^i = \frac{ \partial H }{ \partial \sigma _i ^\mu } , \qquad - ( {\dot{p}} _i + \partial _\mu \sigma _i ^\mu ) = \frac{ \partial H }{ \partial u ^i } . \end{aligned}$$
(10)

Note that (10) is simply (9) in \( (m + 1) \)-dimensional spacetime, where we have adopted the special notation \( t = x ^0 \) and \( p _i = \sigma _i ^0 \). Moreover, the special case \( m = 0 \) recovers canonical Hamiltonian mechanics on \( {\mathbb {R}} ^{ 2 n } \).

For \( m > 0 \), the de Donder–Weyl equations are not in the form \( {\dot{y}} = f (y) \), since we have expressions for \( {\dot{u}} \) and \( {\dot{p}} \) but not \( {\dot{\sigma }} \). To deal with this, we assume that the second equation of (10) defines \(\sigma \) as an implicit function of t, x, u, p, and \( {\text {grad}} u \). By the implicit function theorem, this is true (at least locally) if the \( m n \times m n \) matrix \( \partial ^2 H / ( \partial \sigma _i ^\mu \partial \sigma _j ^\nu ) \) is nondegenerate. Therefore, we may eliminate the second equation and substitute this expression for \(\sigma \) into the other two equations. Assuming the Hamiltonian does not depend on t, this gives a system of the form \( {\dot{y}} = f (y) \) with \( y = ( u, p ) \).

Example 4.1

Let \( n = 1 \), so that u and p are scalar fields and \(\sigma \) is a vector field on \(\Omega \), and take \( H = \frac{1}{2} \bigl ( p ^2 - |\sigma |^2 \bigr ) \). Then the de Donder–Weyl equations are

$$\begin{aligned} {\dot{u}} = p , \qquad {\text {grad}} u = - \sigma , \qquad - ({\dot{p}} + {\text {div}} \sigma ) = 0 . \end{aligned}$$

Eliminating the second equation and substituting \( \sigma = - {\text {grad}} u \) into the third, we obtain the first-order form of the wave equation with \( y = ( u, p ) \), as in Example 3.3.

4.2 The Multisymplectic Conservation Law

For Hamiltonian ODEs, the symplectic conservation law is a statement about variations of solutions to Hamilton’s equations. Similarly, for Hamiltonian PDEs, the multisymplectic conservation law is a statement about variations of solutions to the de Donder–Weyl equations.

Definition 4.2

Let \( ( u, p, \sigma ) \) be a solution to (10). A (first) variation of \( ( u, p , \sigma ) \) is a solution \( ( v, r, \tau ) \) to the linearized problem

$$\begin{aligned} \begin{aligned} {\dot{v}} ^i&{}={}&\frac{ \partial ^2 H }{ \partial p _i \partial u ^j } v ^j&{}+{}&\frac{ \partial ^2 H }{ \partial p _i \partial p _j } r _j&{}+{}&\frac{ \partial ^2 H }{ \partial p _i \partial \sigma _j ^\nu } \tau _j ^\nu ,\\ \partial _\mu v ^i&{}={}&\frac{ \partial ^2 H }{ \partial \sigma _i ^\mu \partial u ^j } v ^j&{}+{}&\frac{ \partial ^2 H }{ \partial \sigma _i ^\mu \partial p _j } r _j&{}+{}&\frac{ \partial ^2 H }{ \partial \sigma _i ^\mu \partial \sigma _j ^\nu } \tau _j ^\nu ,\\ - ( {\dot{r}} _i + \partial _\mu \tau _i ^\mu )&{}={}&\frac{ \partial ^2 H }{ \partial u ^i \partial u ^j } v ^j&{}+{}&\frac{ \partial ^2 H }{ \partial u ^i \partial p _j } r _j&{}+{}&\frac{ \partial ^2 H }{ \partial u ^i \partial \sigma _j ^\nu } \tau _j ^\nu , \end{aligned} \end{aligned}$$

where the Hessians on the right-hand side are evaluated at \( ( t, x, u, p, \sigma ) \).

On the space \( {\mathbb {R}} ^n \times {\mathbb {R}}^n \times {\mathbb {R}} ^{ m n } \ni ( u, p, \sigma ) \), we now define the canonical 2-forms \( \omega ^0 = \mathrm {d} u ^i \wedge \mathrm {d} p _i \) and \( \omega ^\mu = \mathrm {d} u ^i \wedge \mathrm {d} \sigma _i ^\mu \) for \( \mu = 1 , \ldots , m \). The multisymplectic conservation law states that, for any pair of variations \( ( v, r, \tau ) \) and \( ( v ^\prime , r ^\prime , \tau ^\prime ) \), we have

$$\begin{aligned} \partial _t \Bigl ( \omega ^0 \bigl ( ( v, r, \tau ) , (v ^\prime , r ^\prime , \tau ^\prime ) \bigr ) \Bigr ) = - \partial _\mu \Bigl ( \omega ^\mu \bigl ( ( v, r, \tau ) , (v ^\prime , r ^\prime , \tau ^\prime ) \bigr ) \Bigr ) , \end{aligned}$$

that is,

$$\begin{aligned} \partial _t ( v ^i r _i ^\prime - v ^{ \prime i } r _i ) = - \partial _\mu ( v ^i \tau _i ^{ \prime \mu } - v ^{ \prime i } \tau _i ^\mu ) . \end{aligned}$$

The proof is simply a calculation, using the symmetry of the Hessian. We abbreviate the multisymplectic conservation law as
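For completeness, the calculation can be sketched as follows. Write \( w = ( v, r, \tau ) \) and \( w ^\prime = ( v ^\prime , r ^\prime , \tau ^\prime ) \), and let \( D ^2 H ( w, w ^\prime ) \) denote the full Hessian of H in \( ( u, p, \sigma ) \) applied to the pair \( ( w, w ^\prime ) \); this shorthand is introduced here only for illustration. Expanding with the product rule and then substituting the linearized equations of Definition 4.2 into each bracket gives

$$\begin{aligned} \partial _t ( v ^i r _i ^\prime - v ^{ \prime i } r _i ) + \partial _\mu ( v ^i \tau _i ^{ \prime \mu } - v ^{ \prime i } \tau _i ^\mu )&= \bigl [ {\dot{v}} ^i r _i ^\prime + \partial _\mu v ^i \, \tau _i ^{ \prime \mu } - ( {\dot{r}} _i + \partial _\mu \tau _i ^\mu ) v ^{ \prime i } \bigr ] \\&\quad - \bigl [ {\dot{v}} ^{ \prime i } r _i + \partial _\mu v ^{ \prime i } \, \tau _i ^\mu - ( {\dot{r}} _i ^\prime + \partial _\mu \tau _i ^{ \prime \mu } ) v ^i \bigr ] \\&= D ^2 H ( w, w ^\prime ) - D ^2 H ( w ^\prime , w ) = 0 , \end{aligned}$$

where the last equality is the symmetry of the Hessian.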

$$\begin{aligned} {\dot{\omega }} ^0 = - \partial _\mu \omega ^\mu , \end{aligned}$$
(11)

with the understanding that both sides are evaluated on variations of solutions to (10). In the special case \( m = 0 \), we recover the usual symplectic conservation law for Hamiltonian ODEs. As with the conservation laws in Sect. 3.1, we may also integrate (11) over a compact subdomain \( K \subset \Omega \) and apply the divergence theorem to get

$$\begin{aligned} \int _K {\dot{\omega }} ^0 \,\mathrm {d}^m x = - \int _{ \partial K } \omega ^\mu \,\mathrm {d}^{m-1} x_\mu , \end{aligned}$$
(12)

which is an integral form of the multisymplectic conservation law. Here, \( \mathrm {d} ^m x {:}{=}\mathrm {d} x ^1 \wedge \cdots \wedge \mathrm {d} x ^m \) is the standard Euclidean volume form on \( {\mathbb {R}}^m \) and \( \,\mathrm {d}^{m-1} x_\mu {:}{=}\iota _{ e _\mu } \,\mathrm {d}^m x \) is its interior product with the \(\mu \)th standard basis vector. Again, we interpret (12) to mean that the equality holds when both sides are evaluated on arbitrary variations of solutions.

Remark 4.3

If \(\Omega \) is compact, and boundary conditions are chosen so that \( \omega ^\mu \,\mathrm {d}^{m-1} x_\mu = 0 \) on \( \partial \Omega \), then taking \( K = \Omega \) in (12) gives \( \int _\Omega {\dot{\omega }} ^0 \,\mathrm {d}^m x = 0 \). This may be interpreted as invariance of the symplectic form \( \omega = \int _\Omega \omega ^0 \,\mathrm {d}^m x \) on the infinite-dimensional phase space Y.

4.3 Discrete-Time Multisymplectic Conservation Laws for Numerical Integrators

As before, assume that the Hamiltonian does not depend on t and that \(\sigma \) is an implicit function of the remaining variables, so that (10) can be reduced to \( {\dot{y}} = f (y) \) with \( y = ( u, p ) \). It follows that variations evolve according to \( {\dot{\eta }} = f ^\prime (y) \eta \) with \( \eta = ( v, r ) \), where \(\tau = \sigma ^\prime (y) \eta \). Hence, (11) may be seen as a quadratic conservation law involving variations of y, and the results of Sect. 2 immediately apply to give the following.

Theorem 4.4

Suppose that (10) can be written as \( {\dot{y}} = f (y) \) with \( y = ( u , p ) \), where \(\sigma \) is an implicit function of the other variables. Let \( \xi = ( v, r ) \) and \( \eta = ( v ^\prime , r ^\prime ) \), with \( \tau = \sigma ^\prime (y) \xi \) and \( \tau ^\prime = \sigma ^\prime (y) \eta \), and define the augmented vector field

$$\begin{aligned} g ( y, \xi , \eta , z ) = \biggl ( f (y) , f ^\prime (y) \xi , f ^\prime (y) \eta , - \partial _\mu \Bigl ( \omega ^\mu \bigl ( ( v, r, \tau ), ( v ^\prime , r ^\prime , \tau ^\prime ) \bigr ) \Bigr ) \biggr ) . \end{aligned}$$

If \(\Phi \) is an affine equivariant method preserving quadratic invariants, then

$$\begin{aligned} \Phi _g \Bigl ( y _0 , \xi _0, \eta _0, \omega ^0 \bigl ( ( v _0, r _0, \tau _0 ) , ( v _0 ^\prime , r _0 ^\prime , \tau _0 ^\prime ) \bigr ) \Bigr ) = \Bigl ( y _1 , \xi _1, \eta _1 , \omega ^0 \bigl ( ( v _1 , r _1 , \tau _1 ) , ( v _1 ^\prime , r _1 ^\prime , \tau _1 ^\prime ) \bigr ) \Bigr ), \end{aligned}$$

where \( y _1 = \Phi _f ( y _0 ) \), \( \xi _1 = \Phi _f ^\prime ( y _0 ) \xi _0 \), and \( \eta _1 = \Phi _f ^\prime ( y _0 ) \eta _0 \).

Proof

The key observation is that \( \omega ^0 \bigl ( ( v , r , \tau ) , ( v ^\prime , r ^\prime , \tau ^\prime ) \bigr ) = v ^i r _i ^\prime - v ^{ \prime i } r _i \) is quadratic in \(\xi \) and \(\eta \) alone, so it is not affected by the (possibly nonlinear) dependence of \(\sigma \) and its variations on the other variables. Hence, the result follows from (5) and Remark 2.16. \(\square \)

Corollary 4.5

For a Runge–Kutta method preserving quadratic invariants, we have

$$\begin{aligned}&\omega ^0 \bigl ( ( v _1 , r _1, \tau _1 ) , ( v _1 ^\prime , r _1 ^\prime , \tau _1 ^\prime ) \bigr ) \\&\quad = \omega ^0 \bigl ( ( v _0, r _0, \tau _0 ) , ( v _0 ^\prime , r _0 ^\prime , \tau _0 ^\prime ) \bigr ) - \Delta t \sum _{ i = 1 } ^s b _i \partial _\mu \Bigl ( \omega ^\mu \bigl ( ( V _i , R _i, T _i ), (V _i ^\prime , R _i ^\prime , T _i ^\prime ) \bigr ) \Bigr ). \end{aligned}$$

This may be written equivalently as

$$\begin{aligned} (\mathrm {d} u _1 ) ^j \wedge ( \mathrm {d} p _1 ) _j = ( \mathrm {d} u _0 ) ^j \wedge ( \mathrm {d} p _0 ) _j - \Delta t \sum _{ i = 1 } ^s b _i \partial _\mu \bigl ( ( \mathrm {d} U _i ) ^j \wedge ( \mathrm {d} \Sigma _i ) _j ^\mu \bigr ) . \end{aligned}$$

4.4 Multisymplectic Semidiscretization on Rectangular Grids

If \(\Omega \) is a Cartesian product of intervals, equipped with a rectangular finite-difference grid, there is a substantial literature on spatial semidiscretizations for which a semidiscrete multisymplectic conservation law holds. We refer the reader in particular to the following (non-exhaustive) list of references: Reich [26], Bridges and Reich [6], Ryland and McLachlan [27], McLachlan, Ryland, and Sun [21]. These semidiscretization schemes generally apply a symplectic Runge–Kutta or partitioned Runge–Kutta method in each of the spatial directions. In light of Sect. 2, the semidiscrete multisymplectic conservation law may be seen as resulting from m applications of (6) or its generalization to partitioned methods in Sect. 5.
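As a simple illustration of a fully discrete multisymplectic conservation law on a grid, the sketch below uses the finite-difference semidiscretization of the one-dimensional wave equation from Example 3.6 (rather than one of the schemes from the cited references) together with the implicit midpoint method in time. The interface quantities \( \tau _{ k + 1/2 } = - ( v _{ k + 1 } - v _k ) / h \) and \( \omega ^1 _{ k + 1/2 } = v _k \tau ^\prime _{ k + 1/2 } - v _k ^\prime \tau _{ k + 1/2 } \), the periodic grid, the parameters, and all names are assumptions made for this example. Because the equation is linear, variations evolve by the same map as solutions.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the text).
N = 48
h = 2.0 * np.pi / N
dt = 0.1

I = np.eye(N)
Z = np.zeros((N, N))
lap = (np.roll(I, 1, axis=1) - 2.0 * I + np.roll(I, -1, axis=1)) / h**2
A = np.block([[Z, I], [lap, Z]])  # variations obey d/dt (v, r) = A (v, r)

# Implicit midpoint map M = (I - dt/2 A)^{-1} (I + dt/2 A).
E = np.eye(2 * N)
M = np.linalg.solve(E - 0.5 * dt * A, E + 0.5 * dt * A)

def omega0(xi, eta):
    """omega^0 at grid point k: v_k r'_k - v'_k r_k."""
    v, r = xi[:N], xi[N:]
    vp, rp = eta[:N], eta[N:]
    return v * rp - vp * r

def omega1(xi, eta):
    """Assumed discrete omega^1 at interface k + 1/2: v_k tau'_{k+1/2} - v'_k tau_{k+1/2}."""
    v, vp = xi[:N], eta[:N]
    tau = -(np.roll(v, -1) - v) / h
    taup = -(np.roll(vp, -1) - vp) / h
    return v * taup - vp * tau

# Two arbitrary variations xi = (v, r) and eta = (v', r').
rng = np.random.default_rng(1)
xi0 = rng.standard_normal(2 * N)
eta0 = rng.standard_normal(2 * N)
xi1, eta1 = M @ xi0, M @ eta0
Xi, Eta = 0.5 * (xi0 + xi1), 0.5 * (eta0 + eta1)  # midpoint stages

# Fully discrete law: omega0_k(1) = omega0_k(0) - (dt/h) (W_{k+1/2} - W_{k-1/2}).
W = omega1(Xi, Eta)
defect = omega0(xi1, eta1) - omega0(xi0, eta0) + (dt / h) * (W - np.roll(W, 1))
residual = np.max(np.abs(defect))
print("max residual of discrete multisymplectic law:", residual)
```

Summing the identity over k on the periodic grid makes the interface terms telescope, which recovers preservation of the discrete symplectic form \( \sum _k \omega ^0 _k \) by the midpoint map.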

In one dimension of space, Sun and Xing [29] have recently investigated multisymplectic semidiscretization using discontinuous Galerkin finite-element methods.

4.5 Multisymplectic Semidiscretization with Hybrid Finite Element Methods

In McLachlan and Stern [22], we developed a framework for multisymplectic discretization of time-independent Hamiltonian PDEs by hybrid finite element methods, including hybridizable discontinuous Galerkin methods (cf. Cockburn, Gopalakrishnan, and Lazarov [10]). In this section, we show that those same methods may be used for semidiscretization of time-dependent Hamiltonian PDEs, and that a semidiscrete multisymplectic conservation law holds. Consequently, when combined with a symplectic numerical integrator for time discretization, the resulting method satisfies a fully discrete multisymplectic conservation law in spacetime. Unlike the methods discussed in the previous section, these methods may be applied to unstructured meshes on non-rectangular domains.

Suppose that \(\Omega \subset {\mathbb {R}}^m \) is polyhedral, and let \( {\mathcal {T}} _h \) be a simplicial triangulation of \(\Omega \) by m-simplices \( K \in {\mathcal {T}} _h \), where \( {\mathcal {E}} _h = \bigcup _{ K \in {\mathcal {T}} _h } \partial K \) denotes the set of \( ( m -1 ) \)-dimensional facets. We specify finite-element spaces

$$\begin{aligned} V (K)&\subset \bigl [ H ^1 (K) \bigr ] ^n , \qquad V {:}{=}\prod _{ K \in {\mathcal {T}} _h } V (K) ,\\ \Sigma (K)&\subset \bigl [ H ^1 (K) \bigr ] ^{ m n } , \qquad \Sigma {:}{=}\prod _{ K \in {\mathcal {T}} _h } \Sigma (K) , \end{aligned}$$

along with spaces of approximate boundary traces on \( {\mathcal {E}} _h \),

$$\begin{aligned} {\widehat{V}} \subset \bigl [ L ^2 ( {\mathcal {E}} _h ) \bigr ] ^n , \qquad {\widehat{V}} _0 {:}{=}\bigl \{ {\widehat{v}} \in {\widehat{V}} : {\widehat{v}} |_{ \partial \Omega } = 0 \bigr \} . \end{aligned}$$

The de Donder–Weyl equations (10) are then approximated by the weak problem: Find \( \bigl ( u (t) , \sigma (t) , p (t) , {\widehat{u}} (t) \bigr ) \in V \times \Sigma \times V ^*\times {\widehat{V}} \) satisfying

$$\begin{aligned} \int _K {\dot{u}} ^i r _i \,\mathrm {d}^m x= & {} \int _K \frac{ \partial H }{ \partial p _i } r _i \,\mathrm {d}^m x , \quad \forall r \in V ^*(K) , \end{aligned}$$
(13a)
$$\begin{aligned} 0= & {} \int _K \biggl ( u ^i \partial _\mu \tau _i ^\mu + \frac{ \partial H }{ \partial \sigma _i ^\mu } \tau _i ^\mu \biggr ) \,\mathrm {d}^m x - \int _{ \partial K } {\widehat{u}} ^i \tau _i ^\mu \,\mathrm {d}^{m-1} x_\mu , \quad \forall \tau \in \Sigma (K) , \nonumber \\ \end{aligned}$$
(13b)
$$\begin{aligned} \int _K {\dot{p}} _i v ^i \,\mathrm {d}^m x= & {} \int _K \biggl ( \sigma _i ^\mu \partial _\mu v ^i - \frac{ \partial H }{ \partial u ^i } v ^i \biggr ) \,\mathrm {d}^m x - \int _{ \partial K } {\widehat{\sigma }} _i ^\mu v ^i \,\mathrm {d}^{m-1} x_\mu , \quad \forall v \in V (K) ,\nonumber \\ \end{aligned}$$
(13c)

for all \( K \in {\mathcal {T}} _h \), together with the conservativity condition

$$\begin{aligned} 0= & {} \sum _{ K \in {\mathcal {T}} _h } \int _{ \partial K } {\widehat{\sigma }} _i ^\mu {\widehat{v}} ^i \,\mathrm {d}^{m-1} x_\mu , \quad \forall {\widehat{v}} \in {\widehat{V}} _0 . \end{aligned}$$
(13d)

Here, \( {\widehat{\sigma }} \) is determined by \( u , \sigma , {\widehat{u}} \) through a specified numerical flux function; see Cockburn, Gopalakrishnan, and Lazarov [10] and McLachlan and Stern [22] for further details. Equations (13a)–(13c) are derived by multiplying (10) by test functions, integrating by parts over K, and replacing the boundary traces of u and \(\sigma \) by the approximate traces \( {\widehat{u}} \) and \( {\widehat{\sigma }} \). Under appropriate nondegeneracy assumptions, equations (13a) and (13c) define the dynamics of \( y _h = ( u , p ) \) on \( Y _h {:}{=}V \times V ^*\), where \( \sigma \), \( {\widehat{u}} \), and \( {\widehat{\sigma }} \) are implicit functions of \( y _h \).

We may then consider variations of solutions to (13), along with a corresponding semidiscrete multisymplectic conservation law in the integral form (12). The following is a straightforward generalization of Lemma 2 in [22].

Theorem 4.6

If (13a)–(13c) hold on \( K \in {\mathcal {T}} _h \), then

$$\begin{aligned} \int _K \partial _t ( \mathrm {d} u ^i \wedge \mathrm {d} p _i ) \,\mathrm {d}^m x= & {} - \int _{ \partial K } ( \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu ) \,\mathrm {d}^{m-1} x_\mu \\&+ \int _{ \partial K } \bigl ( \mathrm {d} ( {\widehat{u}} ^i - u ^i ) \wedge \mathrm {d} ( {\widehat{\sigma }} _i ^\mu - \sigma _i ^\mu ) \bigr ) \,\mathrm {d}^{m-1} x_\mu . \end{aligned}$$

Consequently, the semidiscrete multisymplectic conservation law

$$\begin{aligned} \int _K \partial _t ( \mathrm {d} u ^i \wedge \mathrm {d} p _i ) \,\mathrm {d}^m x = - \int _{ \partial K } ( \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu ) \,\mathrm {d}^{m-1} x_\mu \end{aligned}$$
(14)

holds on \( K \in {\mathcal {T}} _h \) if and only if \( \int _{ \partial K } \bigl ( \mathrm {d} ( {\widehat{u}} ^i - u ^i ) \wedge \mathrm {d} ( {\widehat{\sigma }} _i ^\mu - \sigma _i ^\mu ) \bigr ) \,\mathrm {d}^{m-1} x_\mu = 0 \).

Proof

We may rewrite (13a)–(13c) as

$$\begin{aligned} \int _K {\dot{u}} ^i \,\mathrm {d} p _i \,\mathrm {d}^m x&= \int _K \frac{ \partial H }{ \partial p _i } \,\mathrm {d} p _i \,\mathrm {d}^m x ,\\ 0&= \int _K \biggl ( u ^i \,\mathrm {d} ( \partial _\mu \sigma _i ^\mu ) + \frac{ \partial H }{ \partial \sigma _i ^\mu } \,\mathrm {d} \sigma _i ^\mu \biggr ) \,\mathrm {d}^m x - \int _{ \partial K } {\widehat{u}} ^i \,\mathrm {d} \sigma _i ^\mu \,\mathrm {d}^{m-1} x_\mu , \\ \int _K {\dot{p}} _i \,\mathrm {d} u ^i \,\mathrm {d}^m x&= \int _K \biggl ( \sigma _i ^\mu \mathrm {d} (\partial _\mu u ^i) - \frac{ \partial H }{ \partial u ^i } \,\mathrm {d} u ^i \biggr ) \,\mathrm {d}^m x - \int _{ \partial K } {\widehat{\sigma }} _i ^\mu \,\mathrm {d} u ^i \,\mathrm {d}^{m-1} x_\mu . \end{aligned}$$

Adding the first two equations, subtracting the third, and taking the exterior derivative on both sides, we get

$$\begin{aligned}&\int _K \partial _t ( \mathrm {d} u ^i \wedge \mathrm {d} p _i ) \,\mathrm {d}^m x \\&\quad = \int _K \bigl ( \partial _\mu ( \mathrm {d} u ^i \wedge \mathrm {d} \sigma _i ^\mu ) + \mathrm {d} \mathrm {d} H \bigr ) \,\mathrm {d}^m x - \int _{ \partial K } ( \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} \sigma _i ^\mu + \mathrm {d} u ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu ) \,\mathrm {d}^{m-1} x_\mu \\&\quad = \int _{ \partial K } ( \mathrm {d} u ^i \wedge \mathrm {d} \sigma _i ^\mu - \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} \sigma _i ^\mu - \mathrm {d} u ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu ) \,\mathrm {d}^{m-1} x_\mu \\&\quad = - \int _{ \partial K } ( \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu ) \,\mathrm {d}^{m-1} x_\mu + \int _{ \partial K } \bigl ( \mathrm {d} ( {\widehat{u}} ^i - u ^i ) \wedge \mathrm {d} ( {\widehat{\sigma }} _i ^\mu - \sigma _i ^\mu ) \bigr ) \,\mathrm {d}^{m-1} x_\mu , \end{aligned}$$

where the second equality uses \( \mathrm {d} \mathrm {d} H = 0 \) and the divergence theorem. \(\square \)
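The third equality in the proof is purely algebraic. As a worked step, expanding the wedge product of the differences by bilinearity gives

$$\begin{aligned} \mathrm {d} ( {\widehat{u}} ^i - u ^i ) \wedge \mathrm {d} ( {\widehat{\sigma }} _i ^\mu - \sigma _i ^\mu ) = \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu - \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} \sigma _i ^\mu - \mathrm {d} u ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu + \mathrm {d} u ^i \wedge \mathrm {d} \sigma _i ^\mu , \end{aligned}$$

so that

$$\begin{aligned} \mathrm {d} u ^i \wedge \mathrm {d} \sigma _i ^\mu - \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} \sigma _i ^\mu - \mathrm {d} u ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu = - \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu + \mathrm {d} ( {\widehat{u}} ^i - u ^i ) \wedge \mathrm {d} ( {\widehat{\sigma }} _i ^\mu - \sigma _i ^\mu ) . \end{aligned}$$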

In Section 4 of [22], it is proved that several families of hybrid finite-element methods, including hybridized mixed methods (RT-H and BDM-H), nonconforming methods (NC-H), discontinuous Galerkin methods (LDG-H and IP-H), and continuous Galerkin methods (CG-H) satisfy the condition \( \int _{ \partial K } \bigl ( \mathrm {d} ( {\widehat{u}} ^i - u ^i ) \wedge \mathrm {d} ( {\widehat{\sigma }} _i ^\mu - \sigma _i ^\mu ) \bigr ) \,\mathrm {d}^{m-1} x_\mu = 0 \) of Theorem 4.6. Therefore, when these methods are applied to (13), they satisfy the semidiscrete multisymplectic conservation law (14) on each \( K \in {\mathcal {T}} _h \).

If the numerical flux satisfies the so-called strong conservativity condition \( \llbracket {\widehat{\sigma }} \rrbracket = 0 \), which is stronger than (13d), then the multisymplectic conservation law (14) may also be strengthened so that it holds for arbitrary unions of simplices. This holds for all of the methods mentioned in the previous paragraph except CG-H. The following is a straightforward generalization of Theorem 3 in [22].

Theorem 4.7

If a strongly conservative method satisfies (14), then for all \( {\mathcal {K}} \subset {\mathcal {T}} _h \),

$$\begin{aligned} \int _{ \bigcup {\mathcal {K}} } \partial _t ( \mathrm {d} u ^i \wedge \mathrm {d} p _i ) \,\mathrm {d}^m x = - \int _{ \partial ( \overline{ \bigcup {\mathcal {K}} }) } ( \mathrm {d} {\widehat{u}} ^i \wedge \mathrm {d} {\widehat{\sigma }} _i ^\mu ) \,\mathrm {d}^{m-1} x_\mu . \end{aligned}$$

Proof

Sum (14) over \( K \in {\mathcal {K}} \), using \( \llbracket {\widehat{\sigma }} \rrbracket = 0 \) to cancel the contributions of internal facets. \(\square \)

Remark 4.8

In the situation considered in Remark 4.3, taking \( {\mathcal {K}} = {\mathcal {T}} _h \) implies conservation of the symplectic form \( \int _\Omega \mathrm {d} u ^i \wedge \mathrm {d} p _i \,\mathrm {d}^m x \) on \( Y _h \). This generalizes a result of Sánchez et al. [28], which states that semidiscretization of the acoustic wave equation by LDG-H is symplectic.

5 Generalization to Additive and Partitioned Methods

In the preceding sections, we have developed a theory of functional equivariance for a class of numerical integrators, including B-series methods, and applied it to local conservation laws for PDEs. This section extends the functional equivariance theory from Sect. 2 to two larger classes of numerical integrators: additive methods and partitioned methods. It follows that, when these methods are applied to PDEs satisfying local conservation laws, the results of Sects. 3 and 4 may also be extended to these classes of methods.

5.1 Additive Methods

We now consider integrators applied to a vector field \( f \in {\mathfrak {X}} (Y) \) after it has been additively decomposed as \( f = f ^{ [1] } + \cdots + f ^{ [N] } \). Specifically, we have in mind additive Runge–Kutta and NB-series methods (cf. Araújo, Murua, and Sanz-Serna [2]), as well as splitting and composition methods (cf. McLachlan and Quispel [20]).

Denote the application of a method \(\Phi \) to a decomposed vector field \( f = f ^{ [1] } + \cdots + f ^{ [N] } \) by \( \Phi _{ f ^{ [1] } , \ldots , f ^{ [N] } } \). By an additive numerical integrator, we mean the entire collection of maps \( \Phi = \bigl \{ \Phi _{ f ^{ [1] } , \ldots , f ^{ [N] } } : f ^{ [1] } , \ldots , f ^{ [N] } \in {\mathfrak {X}} (Y) ,\ Y \text { a Banach space} \bigr \} \). We begin by extending the definitions of affine equivariance and functional equivariance to such methods.

Definition 5.1

An additive numerical integrator \(\Phi \) is N-affine equivariant if \( A \circ \Phi _{ f ^{ [1] } , \ldots , f ^{ [N] } } = \Phi _{ g ^{ [1] } , \ldots , g ^{ [N ]} } \circ A \) whenever \( f ^{ [\nu ] } \in {\mathfrak {X}} (Y) \) and \( g ^{ [\nu ] } \in {\mathfrak {X}} (U) \) are A-related for all \( \nu = 1 , \ldots , N \), all affine maps \(A :Y \rightarrow U \), and all Banach spaces Y and U.

Definition 5.2

Given a Gâteaux differentiable map \( F :Y \rightarrow Z \) and \( f ^{ [1] } , \ldots , f ^{ [N] } \in {\mathfrak {X}} (Y) \), define \( g ^{ [1] }, \ldots , g ^{ [N] } \in {\mathfrak {X}} ( Y \times Z ) \) by \( g ^{ [\nu ] } ( y, z ) = \bigl ( f ^{[\nu ]} (y) , F ^\prime (y) f ^{[\nu ]} (y) \bigr ) \) for \( \nu = 1 , \ldots , N \). We say that an additive numerical integrator \(\Phi \) is F-functionally equivariant if \( (\mathrm {id}, F ) \circ \Phi _{ f ^{ [1] } , \ldots , f ^{ [N] } } = \Phi _{ g ^{ [1] } , \ldots , g ^{ [N] } } \circ ( \mathrm {id}, F ) \) for all \( f ^{[1]} , \ldots , f^{[N]} \in {\mathfrak {X}} (Y) \) and \( {\mathcal {F}} \)-functionally equivariant if this holds for all \( F \in {\mathcal {F}} ( Y, Z ) \) and all Banach spaces Y and Z.

Proposition 5.3

Every N-affine equivariant method is affine functionally equivariant.

Proof

The proof is essentially identical to that for Proposition 2.6. If F is affine, then so is \( ( \mathrm {id}, F ) \), and the vector fields \( f ^{ [\nu ] } \) and \( g ^{ [\nu ] } \) are \( ( \mathrm {id}, F ) \)-related for all \( \nu = 1 , \ldots , N \). \(\square \)

Example 5.4

(additive Runge–Kutta methods) An s-stage additive Runge–Kutta (ARK) method has the form

$$\begin{aligned} Y _i&= y _0 + \Delta t \sum _{ \nu = 1 } ^N \sum _{ j = 1 } ^s a _{ i j } ^{[\nu ] }f ^{[\nu ]} ( Y _j ) , \qquad i = 1, \ldots , s ,\\ y _1&= y _0 + \Delta t \sum _{ \nu = 1 } ^N \sum _{ i = 1 } ^s b _i ^{[\nu ]} f ^{[\nu ]} ( Y _i ) , \end{aligned}$$

and F-functional equivariance is the condition

$$\begin{aligned} F ( y _1 ) = F ( y _0 ) + \Delta t \sum _{ \nu = 1 } ^N \sum _{ i = 1 } ^s b _i ^{[\nu ]} F ^\prime ( Y _i ) f ^{[\nu ]} ( Y _i ) . \end{aligned}$$

If F is an invariant, then we have \( F ^\prime ( Y _i ) f ( Y _i ) = 0 \) but generally \( F ^\prime ( Y _i ) f ^{ {[\nu ]} } ( Y _i ) \ne 0 \) for \( N > 1 \), so the sum on the right-hand side need not vanish. However, if \( b _i ^{[\nu ]} = b _i \) is independent of \(\nu \), then it does vanish, and we obtain \( F ( y _1 ) = F ( y _0 ) \) as in Example 2.5. This illustrates that an ARK method may be functionally equivariant but not invariant preserving (even for affine maps) unless some additional condition is satisfied.
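The role of the weights can be checked numerically. The following Python sketch is an illustration only and is not taken from the source: it assumes a hypothetical decomposition \( f = f ^{[1]} + f ^{[2]} \) on \( {\mathbb {R}} ^2 \) with affine invariant \( F (y) = y ^1 + y ^2 \), and two illustrative explicit 2-stage tableaux. The names `ark_step`, `g`, `f1`, `f2`, `b_equal`, and `b_unequal` are our own.

```python
import numpy as np

def ark_step(fs, As, bs, y0, dt):
    """One step of an explicit s-stage ARK method; returns y1 and the stages Y_i."""
    s = len(bs[0])
    Y = []
    for i in range(s):
        Yi = y0.copy()
        for f, A in zip(fs, As):
            for j in range(i):  # explicit method: only earlier stages contribute
                Yi = Yi + dt * A[i, j] * f(Y[j])
        Y.append(Yi)
    y1 = y0.copy()
    for f, b in zip(fs, bs):
        for i in range(s):
            y1 = y1 + dt * b[i] * f(Y[i])
    return y1, Y

# Hypothetical pieces: F'(y) f(y) = 0 for f = f1 + f2, but F'(y) f^[nu](y) != 0.
g = lambda y: np.sin(y[0]) + y[1] ** 2
f1 = lambda y: np.array([g(y), 0.0])
f2 = lambda y: np.array([0.0, -g(y)])
F = lambda y: y[0] + y[1]        # affine functional
dF = lambda y, v: v[0] + v[1]    # its derivative F'(y) applied to v

A = np.array([[0.0, 0.0], [1.0, 0.0]])  # same stage coefficients for both pieces
b_equal = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]
b_unequal = [np.array([0.5, 0.5]), np.array([0.25, 0.75])]

y0, dt = np.array([0.3, 0.7]), 0.1
results = {}
for name, bs in [("equal", b_equal), ("unequal", b_unequal)]:
    y1, Y = ark_step([f1, f2], [A, A], bs, y0, dt)
    # Right-hand side of the F-functional equivariance condition.
    rhs = F(y0) + dt * sum(b[i] * dF(Y[i], f(Y[i]))
                           for f, b in zip([f1, f2], bs) for i in range(2))
    results[name] = (F(y1) - F(y0), F(y1) - rhs)
    print(name, "drift in F:", results[name][0],
          "equivariance defect:", results[name][1])
```

Under these assumptions, the equivariance defect should be at roundoff level for both weight choices, while the drift in \( F \) should be at roundoff level only when the weights coincide.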

Proposition 5.5

Additive Runge–Kutta methods are N-affine equivariant. Furthermore, an ARK method preserves affine invariants if \( b _i ^{[\nu ]} = b _i \) is independent of \(\nu \).

Proof

Suppose \( f ^{ [\nu ] } \) and \( g ^{ [\nu ] } \) are A-related for \( \nu = 1 , \ldots , N \). Then

$$\begin{aligned} A ( Y _i )&= A ( y _0 ) + A ^\prime ( Y _i - y _0 ) \\&= A ( y _0 ) + \Delta t \sum _{ \nu = 1 } ^N \sum _{ j = 1 } ^s a _{ i j } ^{[\nu ] } ( A ^\prime \circ f ^{[\nu ]} ) ( Y _j ) \\&= A ( y _0 ) + \Delta t \sum _{ \nu = 1 } ^N \sum _{ j = 1 } ^s a _{ i j } ^{[\nu ] } g ^{[\nu ]} \bigl ( A ( Y _j ) \bigr ) , \end{aligned}$$

for \( i = 1 , \ldots , s \), and similarly,

$$\begin{aligned} A ( y _1 ) = A ( y _0 ) + \Delta t \sum _{ \nu = 1 } ^N \sum _{ i = 1 } ^s b _i ^{[\nu ]} g ^{[\nu ]} \bigl ( A ( Y _i ) \bigr ) . \end{aligned}$$

This shows that \( A ( y _1 ) = ( A \circ \Phi _f ) ( y _0 ) = ( \Phi _g \circ A ) ( y _0 ) \), so \( \Phi \) is N-affine equivariant. Finally, if \( b _i ^{[\nu ]} = b _i \) is independent of \(\nu \), then Proposition 5.3 and Example 5.4 show that \(\Phi \) preserves affine invariants. \(\square \)

Remark 5.6

It is straightforward to show that, in fact, all NB-series methods are N-affine equivariant. (This includes, e.g., generalized additive Runge–Kutta methods, whose symplecticity conditions were recently investigated by Günther, Sandu, and Zanna [13].) The proof is, essentially, to repeatedly differentiate the A-relatedness condition \( A ^\prime \circ f ^{ [\nu ] } = g ^{ [\nu ] } \circ A \), obtaining a relation between the elementary differentials.

Theorem 5.7

Let \({\mathcal {F}}\) satisfy Assumption 2.8. An additive numerical integrator \(\Phi \) preserves \({\mathcal {F}}\)-invariants if and only if it is \({\mathcal {F}}\)-functionally equivariant and preserves affine invariants.

Proof

\( ( \Rightarrow ) \) Suppose \(\Phi \) preserves \({\mathcal {F}}\)-invariants. The proof of \({\mathcal {F}}\)-functional equivariance is essentially identical to that in Theorem 2.9, and preservation of affine invariants follows from the fact that \({\mathcal {F}}\) contains affine maps by Assumption 2.8.

\( ( \Leftarrow ) \) Conversely, suppose that \(\Phi \) is \({\mathcal {F}}\)-functionally equivariant and preserves affine invariants. If \( F \in {\mathcal {F}} ( Y , Z ) \) is an invariant of \( f \in {\mathfrak {X}} ( Y ) \), then \( g ^{ [\nu ] } (y, z ) = \bigl ( f ^{[\nu ]} (y) , F ^\prime (y) f ^{ [\nu ] } (y) \bigr ) \) is the corresponding decomposition of \( g = ( f, 0 ) \). By \({\mathcal {F}}\)-functional equivariance, we have \( \Phi _{ g ^{ [1] } , \ldots , g ^{ [N] } } :\bigl ( y _0, F ( y _0 ) \bigr ) \mapsto \bigl ( y _1, F ( y _1 ) \bigr ) \). Finally, since \( G ( y, z ) = z \) is an affine invariant of g, it is preserved by \( \Phi _{ g ^{ [1] } , \ldots , g ^{ [N] } } \), and thus \( F ( y _0 ) = F ( y _1 ) \). \(\square \)

Example 5.8

Let \({\mathcal {F}}\) be the class of quadratic maps. It follows that an additive numerical integrator preserves quadratic invariants if and only if it is quadratic functionally equivariant and preserves affine invariants. For ARK methods, a sufficient condition is that \( b _i ^{ [\nu ] } = b _i \) be independent of \(\nu \) and \( b _i ^{ [\nu ] } a _{ i j } ^{[\mu ]} + b _j ^{[\mu ]} a _{ j i } ^{[\nu ]} = b _i ^{ [\nu ] } b _j ^{ [\mu ] } \) for all i, j, \(\mu \), \(\nu \). The proof is identical to that for symplecticity of ARK methods, cf. Araújo, Murua, and Sanz-Serna [2, Theorem 7].

Splitting methods take \( \Phi _{ f ^{ [1] }, \ldots , f ^{ [N] } } \) to be a composition of exact flows \( \varphi _{ \tau f ^{ [\nu ] } } \), i.e.,

$$\begin{aligned} \Phi _{ f ^{ [1] } , \ldots , f ^{ [N] } } = \varphi _{ \tau _s f ^{ [\nu _s] } } \circ \cdots \circ \varphi _{ \tau _1 f ^{ [\nu _1] } } , \end{aligned}$$

where consistency requires \( \sum _{ \nu _i = \nu } \tau _i = 1 \) for all \( \nu = 1 , \ldots , N \). For \( N = 2 \), the two most elementary splitting methods are the Lie–Trotter splitting \( \varphi _{ f ^{ [1] } } \circ \varphi _{ f ^{ [2] } } \) and the Strang splitting \( \varphi _{ \frac{1}{2} f ^{ [2] } } \circ \varphi _{ f ^{ [1] } } \circ \varphi _{ \frac{1}{2} f ^{ [2] } } \), where \(\varphi \) denotes the exact time-1 flow. Since the exact flow is equivariant (and hence functionally equivariant) with respect to all maps F, the chain rule implies that this is also true of splitting methods. As a consequence of Theorem 5.7, we get the following negative result for splitting methods.

Corollary 5.9

Any splitting method that preserves affine invariants equals the exact flow.

Proof

Since splitting methods are equivariant with respect to all maps, Theorem 5.7 implies that any splitting method preserving affine invariants preserves all invariants. To see that this must be the exact flow, consider the vector field \( ( f, 1 ) \in {\mathfrak {X}} ( Y \times {\mathbb {R}} ) \), which augments \( {\dot{y}} = f (y) \) by the equation \( {\dot{t}} = 1 \). The exact solution is \( y (t) = \varphi _{ t f } (y _0) \), so \( F ( y, t ) = y - \varphi _{ t f } (y _0) \) is an invariant of \( ( f, 1 ) \). Therefore, \( F ( y _1 , 1 ) = F ( y _0 , 0 ) = 0 \), which says that \( y _1 = \varphi _f ( y _0 ) \). \(\square \)
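The distinction between functional equivariance and invariant preservation for splitting methods can be seen in a small computation. The following Python sketch is ours, not the source's: it assumes a hypothetical linear example on \( {\mathbb {R}} ^2 \) with components \( (u, v) \), pieces \( f ^{[1]} = ( v, 0 ) \) and \( f ^{[2]} = ( 0, - v ) \), and affine invariant \( F ( u, v ) = u + v \) of \( f = f ^{[1]} + f ^{[2]} \). Both pieces have closed-form flows, so Strang splitting can be applied to the augmented system exactly.

```python
import math

def flow1(t, state):
    """Exact time-t flow of g^[1] = (f^[1], F' f^[1]) = (v, 0, v)."""
    u, v, z = state
    return (u + t * v, v, z + t * v)

def flow2(t, state):
    """Exact time-t flow of g^[2] = (f^[2], F' f^[2]) = (0, -v, -v)."""
    u, v, z = state
    v_new = v * math.exp(-t)
    return (u, v_new, z + (v_new - v))

def strang(h, state):
    """Strang splitting: half step of piece 2, full step of piece 1, half step of piece 2."""
    return flow2(h / 2, flow1(h, flow2(h / 2, state)))

def exact(h, state):
    """Exact flow of the unsplit augmented field (f, 0) with f = (v, -v)."""
    u, v, z = state
    v_new = v * math.exp(-h)
    return (u + v - v_new, v_new, z)

F = lambda u, v: u + v
h = 0.1
start = (0.3, 0.7, F(0.3, 0.7))

u1, v1, z1 = strang(h, start)
ue, ve, ze = exact(h, start)

equivariance_defect = z1 - F(u1, v1)   # z tracks F(y) along the numerical flow
invariant_drift = F(u1, v1) - start[2]  # but F(y) itself is not conserved
print("equivariance defect:", equivariance_defect)
print("drift of F under Strang:", invariant_drift)
print("drift of F under exact flow:", F(ue, ve) - start[2])
```

In this example the augmented variable \( z \) should agree with \( F ( y _1 ) \) up to roundoff, as functional equivariance requires, while \( F \) itself drifts under the splitting. This is consistent with Corollary 5.9, since the Strang step here differs from the exact flow.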

5.2 Partitioned Methods

We finally consider partitioned methods, which are based on a partitioning \( Y = Y ^{ [1] } \oplus \cdots \oplus Y ^{ [N] } \). In particular, we have in mind partitioned Runge–Kutta and P-series methods (cf. Hairer [14]). These are closely related to the methods in the previous section, except the vector field decomposition \( f = f ^{ [1] } + \cdots + f ^{ [N] } \) is uniquely specified by the partitioning of Y, i.e., \( f ^{ [\nu ] } (y) \in Y ^{ [\nu ] } \) for all \( y \in Y \) and \( \nu = 1 , \ldots , N \). For this reason, we write the flow of such a method as \( \Phi _f \) rather than \( \Phi _{ f ^{ [1] } , \ldots , f ^{ [N] } } \). By a partitioned numerical integrator, we mean the entire collection of maps \( \Phi = \bigl \{ \Phi _f : f \in {\mathfrak {X}} ( Y ),\ Y = \bigoplus _{ \nu = 1 } ^N Y ^{[\nu ]} \text { a partitioned Banach space} \bigr \} \).

Definition 5.10

Given partitioned spaces \( Y = \bigoplus _{ \nu = 1 } ^N Y ^{[\nu ]} \) and \( U = \bigoplus _{ \nu = 1 } ^N U ^{ [\nu ] } \), we say that \( A :Y \rightarrow U \) is a P-affine map if it decomposes as \( A = \bigoplus _{ \nu = 1 } ^N A ^{ [\nu ] } \), where each \( A ^{ [\nu ] } :Y ^{ [\nu ] } \rightarrow U ^{ [\nu ] } \) is affine. A partitioned numerical integrator \(\Phi \) is P-affine equivariant if \( A \circ \Phi _f = \Phi _g \circ A \) whenever \( f ^{ [\nu ] } \) and \( g ^{ [\nu ] } \) are A-related for all \( \nu = 1 , \ldots , N \), all P-affine maps A, all partitionings, and all Banach spaces Y and U.

Example 5.11

If we partition \( U = {\mathbb {R}} \) into \( U ^{ [\mu ] } = {\mathbb {R}} \) and \( U ^{ [\nu ] } = \{ 0 \} \) for \( \nu \ne \mu \), then the P-affine functionals are those depending only on \( Y ^{ [\mu ] } \). Affine functionals depending on more than one component \( Y ^{ [\nu ] } \) cannot be P-affine for any partitioning of \({\mathbb {R}}\). In particular, if we take \( Y = {\mathbb {R}}^2 = \bigl ( {\mathbb {R}} \times \{ 0 \} \bigr ) \oplus \bigl ( \{ 0 \} \times {\mathbb {R}} \bigr ) \), then:

  • \( ( q, p ) \mapsto q \) is P-affine for the partitioning \( U = {\mathbb {R}} \oplus \{ 0 \} \);

  • \( ( q, p ) \mapsto p \) is P-affine for the partitioning \( U = \{0 \} \oplus {\mathbb {R}} \);

  • \( ( q, p ) \mapsto q + p \) is never P-affine.

Proposition 5.12

If an additive numerical integrator \( \Psi \) is N-affine equivariant, then the partitioned numerical integrator \(\Phi \) defined by \(\Phi _f = \Psi _{ f ^{ [1] } , \ldots , f ^{ [N] } } \) is P-affine equivariant.

Proof

This follows immediately from the definitions, since P-affine maps are affine. \(\square \)

Example 5.13

(partitioned Runge–Kutta methods) An s-stage partitioned Runge–Kutta (PRK) method is just the application of an ARK method to a partitioned space, as in Proposition 5.12, where \(\Phi \) is the PRK method and \( \Psi \) is the ARK method. As an immediate corollary of this proposition, all PRK methods are P-affine equivariant.

The definition of F- and \({\mathcal {F}}\)-functional equivariance is the same as in Definition 2.4, where given \( Y = \bigoplus _{ \nu = 1 } ^N Y ^{[\nu ]} \) and \( Z = \bigoplus _{ \nu = 1 } ^N Z ^{[\nu ]} \), we partition \( Y \times Z = \bigoplus _{ \nu = 1 } ^N ( Y ^{[\nu ]} \times Z ^{ [\nu ] } ) \). However, the methods being considered are not necessarily equivariant with respect to all affine maps, so Assumption 2.8 is too restrictive on \({\mathcal {F}}\). We therefore replace it with the following, which just replaces “affine” by “P-affine” for specified partitions.

Assumption 5.14

Assume that:

  • \({\mathcal {F}} (Y,Y) \) contains the identity map for all \(Y = \bigoplus _{ \nu = 1 } ^N Y ^{[\nu ]}\);

  • \( {\mathcal {F}} ( Y, Z ) \) is a vector space for all \(Y = \bigoplus _{ \nu = 1 } ^N Y ^{[\nu ]}\) and \(Z=\bigoplus _{ \nu = 1 } ^N Z ^{[\nu ]}\);

  • \( {\mathcal {F}} \) is invariant under composition with P-affine maps, in the following sense: If \( A :Y \rightarrow U \) and \( B :V \rightarrow Z \) are P-affine and \( F \in {\mathcal {F}} ( U, V ) \), then \( B \circ F \circ A \in {\mathcal {F}} ( Y, Z ) \), for all \(Y = \bigoplus _{ \nu = 1 } ^N Y ^{[\nu ]}\), \(Z = \bigoplus _{ \nu = 1 } ^N Z ^{[\nu ]}\), \(U = \bigoplus _{ \nu = 1 } ^N U ^{[\nu ]}\), and \(V = \bigoplus _{ \nu = 1 } ^N V ^{[\nu ]}\).

Theorem 5.15

Let \({\mathcal {F}}\) satisfy Assumption 5.14. A partitioned numerical integrator \(\Phi \) preserves \({\mathcal {F}}\)-invariants if and only if it is \({\mathcal {F}}\)-functionally equivariant.

Proof

The proof is formally identical to that for Theorem 2.9. \(\square \)

Example 5.16

Let \({\mathcal {F}}\) be the class of P-affine maps. It follows that all P-affine equivariant methods preserve P-affine invariants. In particular, by Example 5.11, affine invariants \( F :Y \rightarrow {\mathbb {R}} \) depending only on a single component \( Y ^{ [\mu ] } \) are preserved.

Example 5.17

Let \({\mathcal {F}}\) be the class of all affine maps, irrespective of partitioning. It follows that P-affine equivariant methods preserve affine invariants if and only if they are affine functionally equivariant. For PRK methods, as for ARK methods, this holds if \( b _i ^{ [\nu ] } = b _i \) is independent of \(\nu \). (See Example 5.4 and Proposition 5.5.)

Example 5.18

Let \({\mathcal {F}}\) be the class of quadratic maps that are at most bilinear with respect to the partition, i.e., terms may be bilinear in \( y ^{ [\mu ] } \) and \( y ^{ [\nu ] } \) for \( \mu \ne \nu \). For PRK methods, a sufficient condition for \({\mathcal {F}}\)-invariant preservation, and thus for \({\mathcal {F}}\)-functional equivariance, is that \( b _i ^{ [\nu ] } = b _i \) be independent of \(\nu \) and \( b _i ^{ [\nu ] } a _{ i j } ^{[\mu ]} + b _j ^{[\mu ]} a _{ j i } ^{[\nu ]} = b _i ^{ [\nu ] } b _j ^{ [\mu ] } \) for all i, j, and \( \mu \ne \nu \). This is a straightforward generalization of the \( N = 2 \) case, cf. Hairer, Lubich, and Wanner [15, Theorem IV.2.4].

Example 5.19

Let \({\mathcal {F}}\) be the class of all quadratic maps, irrespective of partitioning. For PRK methods, as for ARK methods, a sufficient condition for quadratic invariant preservation, and thus for quadratic functional equivariance, is that \( b _i ^{ [\nu ] } = b _i \) be independent of \(\nu \) and \( b _i ^{ [\nu ] } a _{ i j } ^{[\mu ]} + b _j ^{[\mu ]} a _{ j i } ^{[\nu ]} = b _i ^{ [\nu ] } b _j ^{ [\mu ] } \) for all i, j, \(\mu \), \(\nu \). (See Example 5.8.)

5.3 Closure under Differentiation and (Multi)Symplecticity

Finally, we generalize Theorem 2.12, which allows the functional equivariance results for N-affine and P-affine equivariant methods to be applied to observables depending on variations.

Theorem 5.20

N-affine and P-affine equivariant methods are closed under differentiation.

Proof

The proof is essentially the same as that of Theorem 2.12, although we need to specify how \(\Phi \) is applied to the augmented system

$$\begin{aligned} {\dot{x}} = f (x) , \qquad {\dot{y}} = f (y) , \qquad {\dot{z}} = \frac{ f (x) - f (y) }{ \epsilon } . \end{aligned}$$

We simply use the same decomposition or partition for each of the three parts. Specifically, if \(\Phi \) is N-affine equivariant, then we decompose

$$\begin{aligned} f (x)&= \sum _{ \nu = 1 } ^N f ^{ [\nu ] } (x) ,\\ f (y)&= \sum _{ \nu = 1 } ^N f ^{ [\nu ] } (y) ,\\ \frac{ f (x) - f (y) }{ \epsilon }&= \sum _{ \nu = 1 } ^N \frac{ f ^{ [\nu ] } (x) - f ^{ [\nu ] } (y) }{ \epsilon } , \end{aligned}$$

while if \(\Phi \) is P-affine equivariant, we partition \( Y \times Y \times Y = \bigoplus _{ \nu = 1 } ^N ( Y ^{ [\nu ] } \times Y ^{ [\nu ] } \times Y ^{ [\nu ] } ) \). The proof then proceeds as in Theorem 2.12. \(\square \)
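The key step, that the difference quotient \( z = ( x - y ) / \epsilon \) is propagated consistently when each slot uses the same decomposition, can be checked numerically. The Python sketch below is an illustration under our own assumptions: the pieces `f1`, `f2` on \( {\mathbb {R}} ^2 \), the simple explicit 2-stage ARK tableau in `step`, and the helper `lift` are hypothetical and not taken from the source.

```python
import numpy as np

def step(f1, f2, y0, dt):
    """Illustrative explicit 2-stage ARK method:
    Y_1 = y0, Y_2 = y0 + dt f1(Y_1), y1 = y0 + dt (f1(Y_1) + f2(Y_2))."""
    Y2 = y0 + dt * f1(y0)
    return y0 + dt * (f1(y0) + f2(Y2))

# Hypothetical nonlinear decomposition on Y = R^2.
f1 = lambda y: np.array([y[1], 0.0])
f2 = lambda y: np.array([0.0, -np.sin(y[0])])

eps, dt = 1e-4, 0.1
y0 = np.array([0.3, 0.7])
z0 = np.array([1.0, -2.0])
x0 = y0 + eps * z0  # so that z0 = (x0 - y0) / eps

def lift(f):
    """Piece of the augmented field on Y x Y x Y, with the same piece used in each slot."""
    def G(w):
        x, y = w[0:2], w[2:4]
        return np.concatenate([f(x), f(y), (f(x) - f(y)) / eps])
    return G

w1 = step(lift(f1), lift(f2), np.concatenate([x0, y0, z0]), dt)
x1, y1, z1 = w1[0:2], w1[2:4], w1[4:6]

# The z-slot should reproduce the difference quotient of the two numerical solutions.
defect = np.max(np.abs(z1 - (x1 - y1) / eps))
# The x- and y-slots should agree with the method applied to the original system.
slot_error = max(np.max(np.abs(x1 - step(f1, f2, x0, dt))),
                 np.max(np.abs(y1 - step(f1, f2, y0, dt))))
print("difference-quotient defect:", defect, "slot error:", slot_error)
```

Because division by \( \epsilon \) amplifies roundoff, the defect is expected to be small relative to \( 1 / \epsilon \) times machine precision rather than exactly zero.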

Therefore, the results on symplecticity and multisymplecticity of affine equivariant methods preserving quadratic invariants also hold for N-affine and P-affine equivariant methods preserving quadratic invariants. Moreover, since the canonical symplectic form \( \omega = \mathrm {d} q ^i \wedge \mathrm {d} p _i \) and multisymplectic form \( \omega ^0 = \mathrm {d} u ^i \wedge \mathrm {d} p _i \) are bilinear on \( Y = V \times V ^*\), it suffices for an \( N = 2 \) partitioned method to preserve only bilinear invariants, as in Example 5.18. This includes widely used symplectic PRK methods such as Störmer/Verlet and the Lobatto IIIA–IIIB pair (Hairer, Lubich, and Wanner [15, Sections IV.2 and VI.4]), as well as compositions of these methods.
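As a concrete check of the difference between the bilinear condition of Example 5.18 and the full quadratic condition of Example 5.19, the following Python sketch evaluates both for Störmer/Verlet written as a 2-stage PRK method. The tableau entered below is the standard Lobatto IIIA–IIIB pair of order 2 as we recall it, so it should be verified against [15] before being relied on. The helper name `defect` is our own.

```python
import numpy as np

# Stoermer/Verlet as a 2-stage PRK method (assumed tableau; verify against [15]).
A = {
    1: np.array([[0.0, 0.0], [0.5, 0.5]]),  # a_ij^[1] (trapezoidal rule)
    2: np.array([[0.5, 0.0], [0.5, 0.0]]),  # a_ij^[2]
}
b = np.array([0.5, 0.5])  # weights b_i, independent of nu

def defect(mu, nu):
    """Matrix with entries b_i a_ij^[mu] + b_j a_ji^[nu] - b_i b_j."""
    return b[:, None] * A[mu] + (b[:, None] * A[nu]).T - np.outer(b, b)

# Condition of Example 5.18: required only for mu != nu.
bilinear_ok = all(np.allclose(defect(mu, nu), 0.0)
                  for mu in (1, 2) for nu in (1, 2) if mu != nu)
# Condition of Example 5.19: required for all mu, nu.
quadratic_ok = all(np.allclose(defect(mu, nu), 0.0)
                   for mu in (1, 2) for nu in (1, 2))

print("bilinear condition holds:", bilinear_ok)
print("full quadratic condition holds:", quadratic_ok)
```

With this tableau, the condition for \( \mu \ne \nu \) is satisfied while the condition for \( \mu = \nu \) is not. Since both conditions are only sufficient, this does not by itself prove anything about general quadratic invariants, but it is consistent with the statement that an \( N = 2 \) partitioned method need only preserve bilinear invariants to be symplectic.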

6 Concluding Remarks

We conclude by posing a natural question for future investigation: Which numerical integrators are affine functionally equivariant? Here is a summary of some related results that have been mentioned throughout this paper:

  • B-series methods are precisely the affine equivariant methods [18], so by Proposition 2.6, they are included among the affine functionally equivariant methods.

  • Aromatic B-series methods are precisely the affine isomorphism equivariant methods [23]. Since only isomorphisms are considered, the series coefficients may vary depending on \( \dim Y \). If the series coefficients are constant across dimensions, then the method is affine functionally equivariant, as in Remark 2.7. Conversely, variable-coefficient methods cannot be affine functionally equivariant, since y would then evolve differently between the original and augmented systems.

  • As shown in Example 5.17, partitioned methods may also be affine functionally equivariant, e.g., a PRK method with \( b _i ^{[\nu ]} = b _i \) independent of \(\nu \). However, such methods are generally not affine isomorphism equivariant, e.g., if \( a _{ i j } ^{[\nu ]} \) varies with \(\nu \), so affine functional equivariance need not imply affine isomorphism equivariance.

Figure 1 depicts these relationships among the different classes of “equivariant” methods.

Fig. 1 The landscape of equivariant methods