1 Introduction

Many systems of interest in science and engineering occur in a domain with disparate length scales [10, e.g.]: often a fine structure is modulated on a much larger scale [22, e.g.]. Such disparate scales usually are a major challenge in computational simulations [32, p.14, e.g.]. Two classic examples are Taylor–Couette flow [16, e.g.] and Bénard convection [33, e.g.]. Often the fine-scale detail is crucial to the accurate modelling of a multiscale system. But with multiple length scales, a simulation resolving the physical fine scale is not only prohibitively inefficient and severely constrained by memory limitations; it is also arduous to analyse simulation data generated at a scale much smaller than the scale of interest. This article further develops a unified mathematical approach and theory to reduce the full set of nonlinear governing equations to a simplified evolution equation, with quantified error, enabling more efficient simulations and analysis. This theoretical methodology should be able to better justify and illuminate many extant long-wave and homogenisation theories [9, 25, e.g.].

We focus on multiscale systems whose physical domain is ‘large’ in multiple dimensions, but has a relatively ‘thin’ cross-section in the other dimensions. As a specific example, Sect. 1.1 takes as prescribed a variant of the integrated boundary layer pdes for a thin liquid film of Newtonian fluid spreading over a planar rotating surface [7], and rigorously derives a simpler lubrication model of the nonlinear flow of the film [25, 36, §II.K, e.g.]. Appendix A lists computer algebra code for deriving this lubrication model, with the code written to be readily adaptable to a wide range of systems. Thin liquid flows are important in biology, physics, and engineering, as well as in the environment. They may be of common liquids such as water or oil, or of more rheologically complex fluids, and display interesting nonlinear wave patterns [19, 25, 36]. Other examples of systems amenable to our methodology include flood and tsunami modelling [20, 24], pattern formation [9, 35], wave interactions [13], elastic shells [18, 21], and microstructured materials [31].

Section 2 defines the generic nonlinear pde system to which the methodology applies, and defines the ‘large’ but ‘thin’ multiscale domain on which the system evolves. In pattern evolution problems, and in homogenization problems, a ‘thin’ domain variable is the phase of the underlying small scale pattern [28, §3.3, e.g.]. A first step in the reduction of such pdes was taken by Mielke [22], but the analysis required solutions to exist for all \({\mathbb {R}}\)-time and for all \({\mathbb {R}}^m\)-space which excludes initial/boundary value problems on finite domains. Here we analyse the dynamics of a general pde in a thin cross-section of the large, but finite, domain by constructing a multivariate Taylor expansion for the local spatial structures and analysing the evolution of the coefficients. Being the union of local-space-time modelling means the approach is valid everywhere outside of boundary layers [27, e.g.] and initial transients [26, e.g.]. Section 3 details how to capture the emergent behaviour of the system at every chosen cross-section via constructing a set of generalised eigenvectors which span the centre subspace. Based upon these eigenvectors, the system’s centre manifold, on which the slow system evolves, is parametrised and a nonlinear pde derived for the emergent slow evolution. The order of the multivariate Taylor expansion determines the order of accuracy of the derived slow evolution, and Lagrange’s Remainder Theorem provides a novel exact expression for the error of this pde for the emergent slow evolution.

This article significantly extends the approach previously developed to derive the emergent evolution of nonlinear pde systems on a large one-dimensional physical domain [28] to domains with multiple large dimensions. Based upon the methodology established for linear systems [30], here we extend the approach to nonlinear pdes with multiple large dimensions. Our approach is distinctly different to many other methods which derive large-scale models in that here no asymptotic limit is required for the scale separation between the large domain and the thin cross-section [17, 23, e.g.]. Specifically, we have no requirement that a scale separation parameter (often named \(\epsilon \)) must be asymptotically small; we only require that such a small-large scale separation exists so that we can establish centre and stable subspaces (Assumption 3), and then our approach is valid at finite scale separation.

1.1 Example of a rotating shallow fluid flow

As an example application of some of the results, consider the flow of a shallow layer of fluid on a solid flat rotating substrate, such as in spin coating [25, 36, §II.K, e.g.] or large-scale shallow water waves [11, 15, e.g.]. Let \(\vec {x}=(x_1,x_2)\) parametrise location on the rotating substrate, and let the fluid layer have thickness \(h(\vec {x}, t)\) and move with depth-averaged horizontal velocity \(\vec {v}(\vec {x}, t)=(v_1, v_2)\). We take as given (with its simplified physics) that the (non-dimensional) governing set of pdes is the nonlinear system

$$\begin{aligned} \frac{\partial h}{\partial t}&=-{\nabla }\cdot (h\vec {v}), \end{aligned}$$
(1a)
$$\begin{aligned} \frac{\partial \vec {v}}{\partial t}&=\begin{bmatrix} -b &{} f\\ -f &{}-b\end{bmatrix}\vec {v}\\&\quad -(\vec {v}\cdot \nabla )\vec {v}-g{\nabla }h+\nu \nabla ^2\vec {v}\,, \end{aligned}$$
(1b)

where \(b\) represents viscous bed drag, \(f\) is the Coriolis coefficient, \(g\) is the acceleration due to gravity, \(\nu \) is the kinematic viscosity, and we neglect surface tension. The pdes (1) are similar to those used by Dellar and Salmon [11, Eq. (79)], but with only one component of the Coriolis force, and the addition of viscous drag and viscosity, and also similar to that used by Hereman [15, Eqs. (22)–(24)], but here with a flat substrate.

For such a shallow fluid flow, horizontal gradients \(\nabla \) of quantities are relatively small [10, e.g.]. Then the flow driven by variations of film thickness, \(\nabla h\), is approximately balanced in (1b) by the rotation and the substrate drag, which gives the leading-order approximate velocity field

$$\begin{aligned} \vec {v}\approx \frac{-g}{b^2+f^2}\begin{bmatrix} b&{}f\\ -f&{}b \end{bmatrix}\nabla h\,. \end{aligned}$$

Substituting this balance in the conservation of mass equation (1a) derives the single component, ‘lubrication’, model

$$\begin{aligned} \frac{\partial h}{\partial t}\approx \frac{gb}{b^2+f^2}\nabla \cdot (h\nabla h). \end{aligned}$$
(2)
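The two steps above, solving the linear balance for \(\vec {v}\) and substituting into (1a), can be checked numerically. The following is a minimal sketch in Python, not the article's Reduce code: the test profile `h`, the sample point, and all function names are assumptions chosen only for illustration. It confirms that the Coriolis (\(f\)) part of the flux is divergence free, so that only the drag coefficient \(b\) survives in the numerator of (2).

```python
import math

# Hypothetical smooth test profile for the film thickness (an assumption,
# not taken from the article), with its exact gradient and Laplacian.
def h(x, y):
    return 1.0 + 0.3 * math.sin(x) * math.cos(2.0 * y)

def grad_h(x, y):
    return (0.3 * math.cos(x) * math.cos(2.0 * y),
            -0.6 * math.sin(x) * math.sin(2.0 * y))

def lap_h(x, y):
    return -1.5 * math.sin(x) * math.cos(2.0 * y)

def solve_balance(x, y, b, f, g):
    """Solve [[-b, f], [-f, -b]] v = g grad(h) by Cramer's rule."""
    hx, hy = grad_h(x, y)
    det = b * b + f * f
    v1 = (g * hx * (-b) - f * g * hy) / det
    v2 = (-b * g * hy - (-f) * g * hx) / det
    return v1, v2

def closed_form_velocity(x, y, b, f, g):
    """Velocity from the displayed balance: -g/(b^2+f^2) [[b, f], [-f, b]] grad(h)."""
    hx, hy = grad_h(x, y)
    c = -g / (b * b + f * f)
    return c * (b * hx + f * hy), c * (-f * hx + b * hy)

def minus_div_flux(x, y, b, f, g, d=1e-4):
    """Right-hand side of (1a), -div(h v), by central differences."""
    def flux(px, py):
        v1, v2 = closed_form_velocity(px, py, b, f, g)
        return h(px, py) * v1, h(px, py) * v2
    d1 = (flux(x + d, y)[0] - flux(x - d, y)[0]) / (2.0 * d)
    d2 = (flux(x, y + d)[1] - flux(x, y - d)[1]) / (2.0 * d)
    return -(d1 + d2)

def lubrication_rhs(x, y, b, f, g):
    """Right-hand side of (2), using div(h grad h) = |grad h|^2 + h lap(h)."""
    hx, hy = grad_h(x, y)
    return g * b / (b * b + f * f) * (hx * hx + hy * hy + h(x, y) * lap_h(x, y))
```

At any sample point, `minus_div_flux` and `lubrication_rhs` should agree to within the finite-difference error, whatever the value of `f`.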

Having just one component, we can use the pde model (2) as a simpler description of the shallow fluid dynamics. But pde (2) is an approximation to the ‘original’ pde (1), and so three outstanding questions are: can we find a rigorous error? can the analysis be extended to higher order? and can such an approach apply generally? Our answer to all three questions is yes.

Returning to the original system of pdes (1) and defining the system field \(u(\vec {x},t)=(h, v_1, v_2)\), we rewrite system (1) as one nonlinear equation, while also grouping terms of like order:

$$\begin{aligned} \frac{\partial u}{\partial t}&=\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} -b &{} f\\ 0 &{} -f &{} -b\end{bmatrix}u +\begin{bmatrix}0 &{} 0 &{} 0\\ -g &{} 0 &{} 0\\ 0 &{} 0 &{} 0\end{bmatrix}\partial _{x_1} u \nonumber \\&\quad +\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0\\ -g &{} 0 &{} 0\end{bmatrix}\partial _{x_2} u \end{aligned}$$
(3a)
$$\begin{aligned}&\quad +\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} \nu &{} 0\\ 0 &{} 0 &{} \nu \end{bmatrix}\left[ \partial _{x_1}^2u+\partial _{x_2}^2u\right] \end{aligned}$$
(3b)
$$\begin{aligned}&\quad +\begin{bmatrix} (\partial _{x_1} u)^T{\mathfrak {M}}_{10,0}u\\ (\partial _{x_1} u)^T{\mathfrak {M}}_{10,1}u\\ (\partial _{x_1} u)^T{\mathfrak {M}}_{10,2}u \end{bmatrix} +\begin{bmatrix} (\partial _{x_2} u)^T{\mathfrak {M}}_{01,0}u\\ (\partial _{x_2} u)^T{\mathfrak {M}}_{01,1}u\\ (\partial _{x_2} u)^T{\mathfrak {M}}_{01,2}u \end{bmatrix}, \end{aligned}$$
(3c)

where matrices

$$\begin{aligned} {\mathfrak {M}}_{10,0}&=\begin{bmatrix}0 &{} -1 &{} 0\\ -1 &{} 0 &{} 0\\ 0 &{} 0 &{} 0\end{bmatrix},\quad {\mathfrak {M}}_{10,1}=\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} -1 &{} 0\\ 0 &{} 0 &{} 0\end{bmatrix},\quad \nonumber \\&\quad {\mathfrak {M}}_{10,2}=\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0\\ 0 &{} -1 &{} 0\end{bmatrix}, \nonumber \\ {\mathfrak {M}}_{01,0}&=\begin{bmatrix}0 &{} 0 &{} -1\\ 0 &{} 0 &{} 0\\ -1 &{} 0 &{} 0\end{bmatrix},\quad {\mathfrak {M}}_{01,1}=\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} 0 &{} -1\\ 0 &{} 0 &{} 0\end{bmatrix},\quad \nonumber \\&\quad {\mathfrak {M}}_{01,2}=\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} 0 &{} 0\\ 0 &{} 0 &{} -1\end{bmatrix}. \end{aligned}$$
(3d)
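As a sanity check on the matrices (3d), the quadratic forms in (3c) should reproduce the nonlinear terms \(-\nabla \cdot (h\vec {v})\) and \(-(\vec {v}\cdot \nabla )\vec {v}\) of system (1). The sketch below, in Python with hypothetical names (`M10`, `M01`, `quad`, and the argument ordering are illustrative assumptions), compares the two forms for arbitrary values of \(u\), \(\partial _{x_1}u\) and \(\partial _{x_2}u\).

```python
# Matrices of (3d): M10[c] and M01[c] act in component c of du/dt,
# with component order (h, v1, v2).
M10 = [
    [[0, -1, 0], [-1, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, -1, 0]],
]
M01 = [
    [[0, 0, -1], [0, 0, 0], [-1, 0, 0]],
    [[0, 0, 0], [0, 0, -1], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, 0, -1]],
]

def quad(d, M, u):
    """Quadratic form d^T M u for 3-vectors d, u and a 3x3 matrix M."""
    return sum(d[i] * M[i][j] * u[j] for i in range(3) for j in range(3))

def nonlinear_from_matrices(u, dx1, dx2):
    """The three components of (3c); dx1, dx2 are the x1- and x2-derivatives of u."""
    return [quad(dx1, M10[c], u) + quad(dx2, M01[c], u) for c in range(3)]

def nonlinear_direct(u, dx1, dx2):
    """The same terms written out from (1): -div(h v) and -(v . grad) v."""
    hh, v1, v2 = u
    h_1, v1_1, v2_1 = dx1   # derivatives with respect to x1
    h_2, v1_2, v2_2 = dx2   # derivatives with respect to x2
    return [
        -(h_1 * v1 + hh * v1_1) - (h_2 * v2 + hh * v2_2),
        -(v1 * v1_1 + v2 * v1_2),
        -(v1 * v2_1 + v2 * v2_2),
    ]
```

Because both functions are polynomial in their arguments, agreement at a few generic points is strong evidence that (3c) and (3d) encode the nonlinearity of (1) as intended.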

The system’s first two lines (3a) and (3b) are the linear part, whereas the third line (3c) contains the nonlinear quadratic terms which encode inertial acceleration. When the nonlinear terms (3c) are negligible, such as for very viscous fluids with low Reynolds numbers, the method described by Roberts and Bunder [30] derives a slow linear pde approximation of (3), but the analysis and results of that article do not account for nonlinear effects, such as (3c). Herein we further develop the methodology and theory to account for nonlinear effects.

For the example shallow fluid flow described by (3), the ‘large’ domain is some ‘physical’ subset of the \(x_1x_2\)-plane, and the ‘thin’ cross-section is the three components of \(u=(h,v_1,v_2)\). The aim is to capture the long, slow behaviour of the original \(u(\vec {x},t)\) field in a one-component slow field \(U(\vec {x},t)\) (instead of the three components which describe the thin cross-section), and to construct a pde for \(U(\vec {x},t)\) which is correct to some order \(N\) in spatial derivatives, with known error. In general, higher orders \(N\) are potentially able to capture more extreme spatial fluctuations but may not be structurally stable, and so we address up to some low–moderate order \(N\). This restricts \(U(\vec {x},t)\) to describing long, relatively gradual, spatial variations of the original field \(u(\vec {x},t)\). Section 3.2 defines the slow field \(U(\vec {x},t)\) in the general case and proves that it describes the behaviour of the original microscale field on the slowly evolving centre manifold (or slow manifold, as is the case in this shallow fluid example).

To determine the nature of the slow field \(U\) we first seek to understand the ‘slow’ and ‘fast’ evolution of the lowest-order linearisation of the system field \(u\) (Assumption 3 elaborates the general case). The linear dynamics of \(u\) are dominantly characterised by the lowest order linear term in (3), the term \({\mathfrak {L}}_{{\vec {0}}}u\) where here

$$\begin{aligned} {\mathfrak {L}}_{{\vec {0}}}=\begin{bmatrix}0 &{} 0 &{} 0\\ 0 &{} -b &{} f\\ 0 &{} -f &{} -b\end{bmatrix}. \end{aligned}$$

The eigenvalues of \({\mathfrak {L}}_{{\vec {0}}}\) indicate that \(u\) evolves on a one-dimensional slow subspace (one zero eigenvalue) and a two-dimensional stable subspace (two eigenvalues, \(-b\pm {\text {i}}f\), with negative real part). The stable part of \(u\) decays relatively quickly, roughly like \(e^{-bt}\), whereas the slow component of \(u\), namely \(h\), evolves on the one-dimensional slow subspace. In the notation introduced by Assumption 3 and Sect. 3, there exists a slow subspace of dimension \(m=1\) with right and left eigenvectors \(V^{{\vec {0}}}= Z^{{\vec {0}}} =(1,0,0)\), and eigenvalue \(A_{{\vec {0}}} =0\) .
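The spectral claims above are easy to confirm directly. The following minimal Python sketch (function names are illustrative assumptions) checks that \((1,0,0)\) is both a right and a left eigenvector of \({\mathfrak {L}}_{{\vec {0}}}\) with eigenvalue zero, and that \((0,1,\pm {\text {i}})\) are eigenvectors for the eigenvalues \(-b\pm {\text {i}}f\).

```python
def L0(b, f):
    """The lowest-order linear operator of the shallow fluid example."""
    return [[0.0, 0.0, 0.0], [0.0, -b, f], [0.0, -f, -b]]

def matvec(A, v):
    """Right multiplication A v."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def vecmat(z, A):
    """Left multiplication z^T A."""
    return [sum(z[i] * A[i][j] for i in range(3)) for j in range(3)]

def right_residual(A, lam, v):
    """Largest component of A v - lam v; zero for an exact eigenpair."""
    Av = matvec(A, v)
    return max(abs(Av[i] - lam * v[i]) for i in range(3))

def left_residual(A, lam, z):
    """Largest component of z^T A - lam z^T; zero for an exact left eigenpair."""
    zA = vecmat(z, A)
    return max(abs(zA[j] - lam * z[j]) for j in range(3))
```

For drag \(b>0\) the two complex eigenvalues have real part \(-b\), which is the decay rate \(e^{-bt}\) quoted for the stable subspace.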

Once the lowest-order linear dynamics of the system field \(u\) are known from matrix \({\mathfrak {L}}_{{\vec {0}}}\), we construct generalised eigenvectors \({{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\), for \(|\vec {n}|\leqslant N\) , which span the spatially-local slow subspace of the linear system (3a) and (3b) to the specified order of accuracy \(N\). This order of accuracy N is that of a local multivariate Taylor expansion of the field \(u(\vec {x},t)\). The advantages of such a Taylor expansion are that not only does it provide a straightforward way to increase the order of accuracy \(N\), but also Lagrange’s Remainder Theorem provides a rigorous error term. Section 3.1 discusses the general construction of the eigenvectors \({{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\), which in essence detail spatially-local out-of-equilibrium structures, and then Sect. 3.2 models the full nonlinear system to derive the slow manifold pde.

Appendix A lists computer algebra Reduce code which applies the general theory of Sects. 2 and 3 to determine the slow pde for any order \(N\) of the Taylor expansion, thus constructing pdes which describe the slow \(U=h\) field evolution of the shallow fluid dynamics to various orders of accuracy. See Sects. 2 and 3 for details that justify (2) for order \(N=2\), and that for order \(N=3\) justify that the slow pde is the 2D advection-diffusion pde

$$\begin{aligned} \frac{\partial h}{\partial t}\approx \sum _{|\vec {\ell }_1|=0}^3\sum _{|\vec {\ell }_{2}|=0}^{|\vec {\ell }_1|}a_{\vec {\ell }_1\vec {\ell }_{2}}(\partial _{\vec {x}}^{\vec {\ell }_1}h) (\partial _{\vec {x}}^{\vec {\ell }_{2}}h)\,, \end{aligned}$$
(4a)

with multi-indices \(\vec {\ell }_{1,2}\in {\mathbb {N}}^2_0\), symmetry \(a_{\vec {\ell }_1\vec {\ell }_2}=a_{\vec {\ell }_2\vec {\ell }_1}\) when \(|\vec {\ell }_1|=|\vec {\ell }_2|\), and nonzero constant coefficients

$$\begin{aligned} a_{(01)(01)}&=a_{(10)(10)}=a_{(02)(00)}=a_{(20)(00)}=\frac{bg}{b^2+f^2}\,,\nonumber \\ a_{(03)(01)}&=a_{(30)(10)}=a_{(12)(10)}=a_{(21)(01)}=\frac{\nu g(b^2-f^2)}{(b^2+f^2)^2}\,,\nonumber \\ a_{(03)(10)}&=a_{(21)(10)}= -a_{(30)(01)}=-a_{(12)(01)}\nonumber \\&= 2\nu f\frac{bg}{(b^2+f^2)^2}\,. \end{aligned}$$
(4b)

The four \(a_{\vec {\ell }_1\vec {\ell }_{2}}\) coefficients equal to \(bg/(b^2+f^2)\) correspond to the \(N=2\) approximation (2), but the rigorous derivation from Appendix A on the slow subspace empowers a far richer description of the slow dynamics in the \(x_1x_2\)-plane. Further, Sect. 3.2 provides an exact expression (44) for the error of slow pdes such as (4a); Appendix A.3 details how components of this error are constructed for this shallow fluid flow.

Executing the computer algebra code in Appendix A to obtain slow pdes of higher orders is straightforward. Although the computational time increases rapidly with N, in principle the code is applicable to any order N. For example, the slow pde for order \(N=4\) is

$$\begin{aligned} \frac{\partial h}{\partial t}\approx {}&{}\sum _{|\vec {\ell }_1|=0}^4\sum _{|\vec {\ell }_{2}|=0}^{|\vec {\ell }_1|} a_{\vec {\ell }_1\vec {\ell }_{2}}(\partial _{\vec {x}}^{\vec {\ell }_1}h) (\partial _{\vec {x}}^{\vec {\ell }_{2}}h)\,, \end{aligned}$$
(5a)

with the same nonzero constant coefficients \(a_{\vec {\ell }_1\vec {\ell }_{2}}\) given in (4b), as well as

$$\begin{aligned} a_{(40)(00)}=a_{(04)(00)}=\tfrac{1}{2} a_{(22)(00)} =\frac{\nu g(b^2-f^2)}{(b^2+f^2)^2}\,. \end{aligned}$$
(5b)

For small enough damping, \(b<f\), the fourth order hyperdiffusion in (5a) makes the model structurally unstable. As is often necessary, for \(b<f\) one would then regularise the model as in the Benjamin, Bona, and Mahony [4] regularised long wave equation. Notwithstanding such practical regularisation, our derived error expression (44) applies and is useful for as long as the spatial gradients in the solutions to (5a) remain small enough.
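To make the sign change at \(b=f\) concrete, the sketch below evaluates the coefficient formulas quoted in (4b) and (5b). It is a plain Python illustration, not the article's Reduce code; the function name and the dictionary keys are hypothetical labels introduced here.

```python
def slow_pde_coefficients(b, f, g, nu):
    """Evaluate the coefficient formulas quoted in (4b) and (5b).

    The dictionary keys are illustrative labels, not notation from the article.
    """
    d = b * b + f * f
    second = b * g / d                              # first line of (4b)
    third_even = nu * g * (b * b - f * f) / d**2    # second line of (4b)
    third_odd = 2.0 * nu * f * b * g / d**2         # third line of (4b)
    return {
        "order2": second,
        "order3_b2_minus_f2": third_even,
        "order3_2bf": third_odd,
        "a40_00": third_even,                       # (5b)
        "a04_00": third_even,                       # (5b)
        "a22_00": 2.0 * third_even,                 # (5b): a40 = a22 / 2
    }
```

For example, with \(b=1\), \(f=5\), \(g=1\) and \(\nu =0.5\) the second order coefficient is \(1/26\approx 0.0385\), whereas the fourth order coefficient \(a_{(40)(00)}=-3/169\approx -0.0178\) is negative because \(b<f\); swapping the values of \(b\) and \(f\) makes it positive.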

Figure 1 plots finite difference simulations of the height \(h\) for: the original model (1) on a \(75\times 75\) grid; the order two slow pde (2) (lubrication model) on a \(15\times 15\) grid; and the order four slow pde (5) on a \(15\times 15\) grid. This coarser resolution of the slow models is appropriate as these pdes are only accurate for slower variations in space. The example square spatial domain is of width \(2\pi \) with periodic boundary conditions, and the non-dimensional parameters are drag \(b=1\), Coriolis coefficient \(f=5\), gravity \(g=1\) and viscosity \(\nu =0.5\). We choose a large Coriolis coefficient \(f\) as this produces relatively large fourth order coefficients (5b) [although smaller than the second order coefficients shown in the first line of (4b)] and enhances the differences between the order two and order four simulations. For the original model simulation, the initial condition of the height \(h\) is a Gaussian peak, wider in the \(x_1\) direction than the \(x_2\) direction, plus a uniformly distributed random component chosen from the interval [0, 0.5] (Fig. 1, top left). The mean of this initial condition in \(5\times 5\) blocks provides the initial conditions for both the order two and order four simulations (Fig. 1, top middle and top right). Figure 1 shows that the fluid gradually diffuses across the domain and the two coarse simulations provide reasonable approximations of full model diffusion, although the diffusion of the second order simulation is less accurate.

Fig. 1. Simulations of the fluid height \(h\) over a square domain of width \(2\pi \) with periodic boundary conditions for: (left column) original pde (1) on a \(75\times 75\) grid; (middle column) order two slow pde (2) on a \(15\times 15\) grid; and (right column) order four slow pde (5) on a \(15\times 15\) grid. Plots in the same row are simulated at the same time, and all plots are coloured according to the same colour bar (shown below the row of plots). At any one time, all three plots have the same contour levels, but plots at different times have different contour levels.
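For readers who wish to experiment, the following is a minimal sketch of a finite difference simulation of the order two slow pde (2) on a \(15\times 15\) periodic grid. It is not the code used for Fig. 1: it assumes a simple forward Euler time step rather than an adaptive solver, a noise-free Gaussian initial condition whose widths and base level are arbitrary choices, and hypothetical function names. It uses the identity \(\nabla \cdot (h\nabla h)=\tfrac{1}{2}\nabla ^2(h^2)\) so that the discretisation conserves fluid volume on the periodic domain.

```python
import math

def initial_height(n):
    """Assumed initial condition: a Gaussian peak, wider in x1 than in x2,
    on a base layer of depth 0.2, over a periodic square of width 2*pi."""
    dx = 2.0 * math.pi / n
    return [[0.2 + math.exp(-((i * dx - math.pi) ** 2 / 2.0
                              + (j * dx - math.pi) ** 2 / 0.5))
             for j in range(n)] for i in range(n)]

def lubrication_step(h, dt, dx, D):
    """One forward Euler step of dh/dt = D div(h grad h) = (D/2) lap(h^2),
    using the five-point Laplacian with periodic wrap-around."""
    n = len(h)
    q = [[0.5 * v * v for v in row] for row in h]
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lap = (q[(i + 1) % n][j] + q[(i - 1) % n][j]
                   + q[i][(j + 1) % n] + q[i][(j - 1) % n]
                   - 4.0 * q[i][j]) / dx ** 2
            new[i][j] = h[i][j] + dt * D * lap
    return new

def simulate(h, t_end=6.0, dt=0.05, b=1.0, f=5.0, g=1.0):
    """Integrate the lubrication model (2) from the grid h up to time t_end."""
    D = g * b / (b * b + f * f)          # coefficient in (2)
    dx = 2.0 * math.pi / len(h)
    for _ in range(int(round(t_end / dt))):
        h = lubrication_step(h, dt, dx, D)
    return h
```

With the default parameters the time step is well inside the explicit stability limit of roughly \(\Delta x^2/(4Dh_{\max })\), so the peak height decays monotonically while the total volume is preserved to rounding error. Extending the sketch to the order four pde (5) would need wider stencils and is not attempted here.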

Figure 2 plots the velocity components of the original model (1) simulation at different times and reveals the dynamics of the flow. All initial velocity components \(v_1\) and \(v_2\) are normally distributed random numbers with mean zero and standard deviation 0.001 (Fig. 2, first row). The system quickly evolves into the expected rotational motion induced by the Coriolis force, with \(v_1\) dominating \(v_2\) due to the asymmetry in the initial fluid height. Although Fig. 2 shows that the velocity plays an important role in explaining the dynamics of this fluid flow model (1), Fig. 1 shows that approximation models such as the fourth order slow pde (5) accurately predict the slow behaviour of the height \(h\) without requiring any simulation of the velocity.

Fig. 2. Simulations of the fluid velocity components (left) \(v_1\) and (right) \(v_2\) over a square domain of width \(2\pi \) on a \(75\times 75\) grid and with periodic boundary conditions. These velocity simulations correspond to the \(h\) simulation in the left column of Fig. 1. Plots in the same row are simulated at the same time, and all plots are coloured according to the same colour bar (shown below the row of plots).

To further explore the accuracy and efficiency of the diffusion in the two coarse approximation simulations of Fig. 1, Fig. 3 plots the fluid height h of the original model (1) averaged over the \(5\times 5\) grid in the centre of the domain, and the fluid heights of both the order two slow pde (2) and the order four slow pde (5) in the centre of the domain. In general, the fourth order simulation provides a good approximation for the mean height of the full simulation, particularly for times \(t\gtrsim 1\) , and at \(t=6\) the absolute error of the order four simulation is \(3\times 10^{-3}\). In contrast, the second order simulation decays too slowly and produces a mean peak height which is generally larger than the mean peak height of the original pde, with an absolute error of \(5\times 10^{-2}\) at \(t=6\) . Figure 3 was created using Matlab’s ode solver ode45, and for \(t\in [0,6]\) the original model simulation required 4217 time steps, whereas the order two simulation only required 121 and the fourth order required 405, demonstrating that our approximate pdes provide a significant improvement in efficiency when restricted to using explicit methods.

Fig. 3. Mean peak heights of Fig. 1 simulations. For times \(t\gtrsim 1\) the order four simulation (Fig. 1, right) provides a good estimation of the expected mean peak height (Fig. 1, left), but the order two simulation (Fig. 1, middle) decays too slowly.

The following sections develop theoretical support for the derivation of nonlinear slow pdes such as (2), (4) and (5), and also derive the novel general exact algebraic expression (44) for the error in such approximate pdes.

2 Local expansion of general nonlinear dynamics

The local expansion developed here builds on that of Roberts and Bunder [30, §3] by extending it to nonlinear systems. Similarly, Roberts and Bunder [30, §3] generalised the procedure of Roberts [28, §2] from a system with one large dimension and any number of significantly thin dimensions, to a system with some finite number of large dimensions and any number of thin dimensions. For completeness, we here present a full and detailed derivation which simplifies to the derivations of Roberts and Bunder [30, §3] for linear systems and of Roberts [28, §2] for linear systems with a one-dimensional ‘large’ domain. Thus, necessarily, some of the detailed expressions here reproduce those of earlier work, because the basis for the new nonlinear theory herein is the previously established linear theory.

Consider some multiscale spatial domain \({\mathbb {X}}\times {\mathbb {Y}}\) where \({\mathbb {X}}\) is some open domain of large macroscale extent in space, and \({\mathbb {Y}}\) is a ‘relatively small’ microscale domain (in some Hilbert space). We analyse the dynamics of some field u within the multiscale spatial domain \({\mathbb {X}}\times {\mathbb {Y}}\) and determine the emergent behaviour of this field on the macroscale; that is, we aim to derive a description, over some time interval \({\mathbb {T}}\), of the long, slow u field dynamics on the macroscale domain \({\mathbb {X}}\) while accounting for the fine details in the microscale domain \({\mathbb {Y}}\) in an ‘averaged’, ‘homogenised’ or ‘slaved’ sense [28, 30]. As the domain \({\mathbb {Y}}\) is a small cross-section of the full domain of the system, a description of the large-scale behaviour should not involve fluctuations across \({\mathbb {Y}}\) as dynamic variables.

We consider the field \(u(\vec {x},y,t)\) in a given Hilbert space \({\mathbb {U}}\) (finite or infinite dimensional), where \(u:{\mathbb {X}}\times {\mathbb {Y}}\times {\mathbb {T}}\rightarrow {\mathbb {U}}\) is a function of \(M\)-dimensional position \(\vec {x}\in {\mathbb {X}}\subseteq {\mathbb {R}}^M\), cross-sectional point \(y\in {\mathbb {Y}}\), and time \(t\in {\mathbb {T}}\subseteq {\mathbb {R}}\). The derivatives in the large space dimensions \(\vec {x}\) are crucial to organising the analysis so we introduce the multivariate (mixed) derivative

$$\begin{aligned} \partial _{\vec {x}}^{\vec {k}}:=\frac{\partial ^{|\vec {k}|}}{\partial x_1^{k_1}\partial x_2^{k_2}\cdots \partial x_M^{k_M}} \end{aligned}$$

for multi-indices \(\vec {k}=(k_1,\ldots ,k_M)\in {\mathbb {N}}_0^M\) (as usual, the set of natural numbers \({\mathbb {N}}_0:=\{0,1,2,\ldots \}\)). This multivariate derivative is of order \(|\vec {k}|=k_1+k_2+\cdots +k_M\) . We consider the class of problems where the field \(u(\vec {x},y,t)\) satisfies some specified nonlinear pde in the form

$$\begin{aligned} \frac{\partial u}{\partial t}={\mathfrak {L}}[u]+f[u] =\sum _{|\vec {k}|=0}^{\infty } {\mathfrak {L}}_{\vec {k}} \partial _{\vec {x}}^{\vec {k}} u +f[u], \end{aligned}$$
(6)

where \(f[u]:{\mathbb {U}}\rightarrow {\mathbb {U}}\) is a ‘strictly’ nonlinear function of field u and its derivatives, the \({\mathfrak {L}}_{\vec {k}}\) are linear operators (in \(y\)), and where the apparently infinite sum over all possible multi-indices \(\vec {k}\) in the pde (6) is finite in applications because we assume that only a finite number of operators \({\mathfrak {L}}_{\vec {k}}\) are non-zero.

In application to fluid or heat convection the nonlinear term is the quadratic \(f[u]=\vec {u}\cdot \vec {\nabla }\vec {u}\), whereas for the Ginzburg–Landau equation \(f[u]=u^3\). Consequently we consider nonlinearities that are sums of products of cognate factors.

Assumption 1

The nonlinear function \(f[u]\) may be written as, or usefully approximated as, a sum of products of \(u\) and its derivatives:

$$\begin{aligned} f[u]=\sum _j f^j[u], \quad \text {where}\quad f^j[u]=c_j(y)\prod _{i=1}^{P_j} \partial _{\vec {x}}^{\vec {p}^j_i} u(\vec {x},y,t), \end{aligned}$$
(7)

for \(P_j\) the order of each nonlinear term, and for some M-dimensional index \(\vec {p}_i^j\). (Sometimes we detail the case when \(f[u]\) has only one term in its sum.)

Roberts [28] considered systems such as pde (6) on the multiscale domain \({\mathbb {X}}\times {\mathbb {Y}}\) for one-dimensional \({\mathbb {X}}\subset {\mathbb {R}}\), and by analysing the dynamics of the system at each cross-sectional station \(X\in {\mathbb {X}}\), constructed a reduced pde for the slowly varying dynamics. The construction relied on a Taylor expansion of the field \(u\) to order \(N\), with the expansion made exact by including Lagrange’s Remainder term in the derivation. Analysis of the Taylor coefficients then reveals the slow behaviour of the system near \(X\in {\mathbb {X}}\) within a centre manifold. Then a projection of the \(u\) field pde onto this centre manifold, generalised to the union over all stations \(X\in {\mathbb {X}}\), defines the slow pde within domain \({\mathbb {X}}\), and a projection of Lagrange’s Remainder determines the error of the pde. Roberts and Bunder [30] analogously considered linear systems on the multiscale domain \({\mathbb {X}}\times {\mathbb {Y}}\), but generalised to \(M\)-dimensional \({\mathbb {X}}\subset {\mathbb {R}}^M\). Here we further generalise these earlier developments to the class of nonlinear pdes (6) that have multiple macroscale dimensions.

2.1 Large scales modulate local Taylor coefficients

This section adapts the approach of Roberts and Bunder [30] to the additional complication of nonlinear effects \(f[u]\). Both the field u and the nonlinearity \(f[u]\) are written as Taylor expansions with Lagrange Remainder terms [34, (1.27), e.g.]. These remainder terms ensure the analysis of the dynamics of the system is exact.

First, one chooses a fixed order, denoted by \(N\), for the Taylor series. The analysis begins as in earlier work [30, §3.1], but with subtle differences due to restrictions caused by the nonlinearity that require careful restatement here. For example, the analysis requires some assumptions about the smoothness of u. For \(k_{\max }\) denoting the largest magnitude derivative in the linear term of pde (6), for \(p_i^j\) representing all magnitude derivatives in the nonlinear term (7), and for Taylor expansion to order \(N\), the field \(u\) must be in differentiability class \(C^{N+\max (p_i^j,k_{\max })}\).

At every cross-section station \(\vec {X}\in {\mathbb {X}}\subset {\mathbb {R}}^M\), every point in the cross-section \(y\in {\mathbb {Y}}\), and every time \(t\in {\mathbb {T}}\), we expand the field \(u\) as an \(N\)th order Taylor multinomial about \(\vec {x}=\vec {X}\) :

$$\begin{aligned} u(\vec {x},y,t)= & {} \sum _{|\vec {n}|=0}^{N-1}u^{(\vec {n})}(\vec {X},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\nonumber \\&+\sum _{|\vec {n}|=N}u^{(\vec {n})}(\vec {X},\vec {x},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\,, \end{aligned}$$
(8a)

where we define the multi-index factorial \(\vec {n}!:=n_1!n_2!\cdots n_M!\) , the multi-index magnitude \(|\vec {n}|:=n_1+n_2+\cdots +n_M\) , the multi-index power \(\vec {x}^{\vec {n}}:=x_1^{n_1}x_2^{n_2}\cdots x_M^{n_M}\), and where

  • in the first sum, for \(|\vec {n}|<N\) , we define the coefficients \(u^{(\vec {n})}:{\mathbb {X}}\times {\mathbb {Y}}\times {\mathbb {T}}\rightarrow {\mathbb {U}}\) to be

    $$\begin{aligned} u^{(\vec {n})}(\vec {X},y,t):=\partial _{\vec {x}}^{\vec {n}}u\big |_{\vec {x}=\vec {X}}\,; \end{aligned}$$
    (8b)
  • and in the second sum, for \(|\vec {n}|=N\) , by Lagrange’s Remainder Theorem for multivariate Taylor series, we define the coefficients \(u^{(\vec {n})}:{\mathbb {X}}\times {\mathbb {X}}\times {\mathbb {Y}}\times {\mathbb {T}}\rightarrow {\mathbb {U}}\) to be

    $$\begin{aligned} u^{(\vec {n})}(\vec {X},\vec {x},y,t):=N\int _0^1(1-s)^{N-1}\partial _{\vec {x}}^{\vec {n}}u\big |_{\vec {X}+s(\vec {x}-\vec {X})}\,ds\,. \end{aligned}$$
    (8c)
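The exactness of the expansion (8a) with the integral coefficients (8c) can be confirmed numerically. The following Python sketch does so for \(M=2\) and \(N=3\), under several assumptions made only for illustration: a test field \(u=\exp (x_1+2x_2)\) with no \(y\) or \(t\) dependence (chosen because its derivatives are known in closed form), composite Simpson quadrature for the integral over \(s\), and hypothetical function names.

```python
import math
from itertools import product

def multi_indices(M, order):
    """All multi-indices n in N_0^M with |n| equal to the given order."""
    return [n for n in product(range(order + 1), repeat=M) if sum(n) == order]

def d_u(n, x):
    """Mixed derivative of the assumed test field u = exp(x1 + 2 x2)."""
    return 2.0 ** n[1] * math.exp(x[0] + 2.0 * x[1])

def simpson(g, a, b, m=400):
    """Composite Simpson rule on [a, b] with an even number m of intervals."""
    step = (b - a) / m
    total = g(a) + g(b)
    for k in range(1, m):
        total += (4.0 if k % 2 else 2.0) * g(a + k * step)
    return total * step / 3.0

def taylor_with_remainder(x, X, N):
    """Return (polynomial part, remainder part) of expansion (8a) about X."""
    d = (x[0] - X[0], x[1] - X[1])

    def monomial(n):
        # (x - X)^n / n! in multi-index notation
        return (d[0] ** n[0] * d[1] ** n[1]
                / (math.factorial(n[0]) * math.factorial(n[1])))

    # First sum of (8a): coefficients (8b) are derivatives evaluated at X.
    poly = sum(d_u(n, X) * monomial(n)
               for order in range(N) for n in multi_indices(2, order))

    def remainder_coefficient(n):
        # Coefficient (8c): N * integral of (1-s)^(N-1) d^n u along the segment.
        return N * simpson(
            lambda s: (1.0 - s) ** (N - 1)
            * d_u(n, (X[0] + s * d[0], X[1] + s * d[1])), 0.0, 1.0)

    rem = sum(remainder_coefficient(n) * monomial(n)
              for n in multi_indices(2, N))
    return poly, rem
```

For a point \(\vec {x}\) at a finite distance from the station \(\vec {X}\), the polynomial part alone differs visibly from \(u(\vec {x})\), whereas the polynomial plus remainder agrees with \(u(\vec {x})\) to quadrature accuracy.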

The Taylor expansion of the nonlinear term \(f[u]\) in pde (6) is expressed in the same way as for the field \(u\); that is, we do not expand \(f[u]\) in a series in \(u\), but instead expand \(f[u(\vec {x},y,t)]\) in a series in \((\vec {x}-\vec {X})\) to order \({N}\). As Assumption 1 specifies that \(f[u]\) is a sum of a product of linear functions, (7), the smoothness requirements for constructing the Taylor expansion of \(f[u]\) are satisfied by \(u\) being of class \(C^{N+\max (p_i^j,k_{\max })}\). The \(N\)th order Taylor multinomial of \(f[u]\) in \(\vec {x}\) about \(\vec {x}=\vec {X}\) is

$$\begin{aligned} f[u(\vec {x},y,t)]&=\sum _{|\vec {n}|=0}^{N-1}f^{(\vec {n})}(\vec {X},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\nonumber \\&\quad +\sum _{|\vec {n}|=N}f^{(\vec {n})}(\vec {X},\vec {x},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\,, \end{aligned}$$
(9a)

where we define the coefficients

  • for \(|\vec {n}|<N\) ,

    $$\begin{aligned} f^{(\vec {n})}(\vec {X},y,t):=\left. \partial _{\vec {x}}^{\vec {n}}f[u(\vec {x},y,t)]\right| _{\vec {x}=\vec {X}}\,; \end{aligned}$$
    (9b)
  • and, for \(|\vec {n}|=N\) ,

    $$\begin{aligned} f^{(\vec {n})}(\vec {X},\vec {x},y,t) :=N\int _0^1(1-s)^{N-1} \left. \partial _{\vec {x}}^{\vec {n}}f[u(\vec {x},y,t)]\right| _{\vec {X}+s(\vec {x}-\vec {X})}\,ds\,. \end{aligned}$$
    (9c)

The Taylor coefficients \(f^{(\vec {n})}:{\mathbb {U}}\rightarrow {\mathbb {U}}\) of the nonlinearity are, in principle, functions of the u field Taylor coefficients \(u^{(\vec {k})}\) with \(|\vec {k}|\leqslant N\) . They may be obtained by substituting the Taylor expansion (8a) of the field u into equations (9b) and (9c). For example, in the case of nonlinearity \(f[u]\) being only one term, a direct substitution of (8a) into the nonlinear form (7) gives

$$\begin{aligned} f[u]&=c(y)\prod _{i=1}^{P}\partial _{\vec {x}}^{\vec {p}_i} \left[ \sum _{|\vec {n}|=0}^{N-1}u^{(\vec {n})}(\vec {X},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\right. \nonumber \\&\quad \left. +\sum _{|\vec {n}|=N}u^{(\vec {n})}(\vec {X},\vec {x},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\right] \nonumber \\&=c(y)\prod _{i=1}^{P}\left[ \sum _{|\vec {n}|=0}^{N-1-|\vec {p}_i|}u^{(\vec {n}+\vec {p}_i)}(\vec {X},y,t)\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\nonumber \right. \\&\quad +\sum _{|\vec {n}|=N-|\vec {p}_i|}\sum _{\vec {m}={\vec {0}}}^{\vec {p}_i}\nonumber \\&\quad \left. \left( {\begin{array}{c}\vec {p}_i\\ \vec {m}\end{array}}\right) \partial _{\vec {x}}^{\vec {m}}u^{(\vec {n}+\vec {p}_i)}(\vec {X},\vec {x},y,t)\frac{(\vec {x}-\vec {X})^{(\vec {n}+\vec {m})}}{(\vec {n}+\vec {m})\text {!}}\right] , \end{aligned}$$
(10)

where, for sums over multi-indices such as \(\sum _{\vec {m}=\vec {k}}^{\vec {\ell }}\) we require that \(k_i\leqslant m_i\leqslant \ell _i\) for each component \(i=1,2,\ldots , M\) .
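The multi-index conventions used throughout (magnitude, factorial, power, and componentwise-bounded sums) translate directly into a few helper functions. The Python sketch below is illustrative only; all function names are assumptions and it makes no attempt at efficiency.

```python
from itertools import product
from math import factorial

def mi_magnitude(n):
    """|n| = n_1 + n_2 + ... + n_M."""
    return sum(n)

def mi_factorial(n):
    """n! = n_1! n_2! ... n_M!."""
    result = 1
    for ni in n:
        result *= factorial(ni)
    return result

def mi_power(x, n):
    """x^n = x_1^{n_1} x_2^{n_2} ... x_M^{n_M}."""
    result = 1
    for xi, ni in zip(x, n):
        result *= xi ** ni
    return result

def mi_of_order(M, order):
    """All n in N_0^M with |n| equal to order, as in sums over |n| = N."""
    return [n for n in product(range(order + 1), repeat=M)
            if mi_magnitude(n) == order]

def mi_box(k, l):
    """All m with k_i <= m_i <= l_i for every component i,
    as in sums written from m = k to l."""
    return list(product(*(range(ki, li + 1) for ki, li in zip(k, l))))
```

As a usage example, the multinomial theorem \(\sum _{|\vec {n}|=N}\frac{N!}{\vec {n}!}\vec {x}^{\vec {n}}=(x_1+\cdots +x_M)^N\) provides a quick consistency check of these helpers: with \(\vec {x}=(2,3)\) and \(N=3\) the sum evaluates to \(5^3=125\).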

As in the linear case [30, Eq. (19)], the multivariate Taylor multinomial (8a) of a field u gives, after some rearrangement, that the \(\vec {\ell }\)th spatial derivative

$$\begin{aligned} \partial _{\vec {x}}^{\vec {\ell }} u={}&{}\sum _{|\vec {n}|=0}^{N-|\vec {\ell }|-1}u^{(\vec {n}+\vec {\ell })}\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}} \nonumber \\ {}&{} +\sum _{|\vec {n}|=N}\sum _{\vec {m}=(\vec {n}-\vec {\ell })^{\oplus }}^{\vec {n}}\left( {\begin{array}{c}\vec {\ell }\\ \vec {n}-\vec {m}\end{array}}\right) \partial _{\vec {x}}^{\vec {m}+\vec {\ell }-\vec {n}}\nonumber \\&\quad u^{(\vec {n})}\frac{(\vec {x}-\vec {X})^{\vec {m}}}{\vec {m}\text {!}}\,, \end{aligned}$$
(11)

where, appearing here and elsewhere in the limits of some sums, \((\vec {k})^{\oplus }\) denotes the multi-index vector with \(i\)th component \(\max (k_i,0)\), thus ensuring all multi-index components are non-negative in the sums. Using some details given for the linear case [30], substitute (11) into the nonlinear pde (6), and after rearrangement we derive that this nonlinear pde becomes

$$\begin{aligned}&\sum _{|\vec {n}|=0}^{N-|\vec {\ell }|-1}\frac{\partial u^{(\vec {n}+\vec {\ell })}}{\partial t}\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}}\nonumber \\&\qquad +\sum _{|\vec {n}|=N}\sum _{\vec {m}=(\vec {n}-\vec {\ell })^{\oplus }}^{\vec {n}}\left( {\begin{array}{c}\vec {\ell }\\ \vec {n}-\vec {m}\end{array}}\right) \partial _{\vec {x}}^{\vec {m}+\vec {\ell }-\vec {n}}\frac{\partial u^{(\vec {n})}}{\partial t}\frac{(\vec {x}-\vec {X})^{\vec {m}}}{\vec {m}\text {!}} \nonumber \\&\quad = \sum _{|\vec {k}|=0}^{\infty } {\mathfrak {L}}_{\vec {k}} \sum _{|\vec {n}|=0}^{N-|\vec {\ell }+\vec {k}|-1}u^{(\vec {n}+\vec {\ell }+\vec {k})}\frac{(\vec {x}-\vec {X})^{\vec {n}}}{\vec {n}\text {!}} \nonumber \\&\qquad +\sum _{|\vec {k}|=0}^{\infty } {\mathfrak {L}}_{\vec {k}} \sum _{|\vec {n}|=N}\sum _{\vec {m}=(\vec {n}-\vec {\ell }-\vec {k})^{\oplus }}^{\vec {n}}\left( {\begin{array}{c}\vec {\ell }+\vec {k}\\ \vec {n}-\vec {m}\end{array}}\right) \partial _{\vec {x}}^{\vec {m}+\vec {\ell }+\vec {k}-\vec {n}}\nonumber \\&\quad u^{(\vec {n})}\frac{(\vec {x}-\vec {X})^{\vec {m}}}{\vec {m}\text {!}}\nonumber \\&\qquad +\partial _{\vec {x}}^{\vec {\ell }}f[u]\,. \end{aligned}$$
(12)

As the multivariate Taylor multinomial (8a) is exact, for all stations \(\vec {X}\in {\mathbb {X}}\) and \(\vec {x}\in {\mathbb {X}}\) , equation (12) is exact for every \(\vec {x}\in \chi (\vec {X})\), where \(\chi (\vec {X})\) is an open subset of \({\mathbb {X}}\) such that for all points \(\vec {x}\in \chi (\vec {X})\) the convex combination \(\vec {X}+s(\vec {x}-\vec {X})\in \chi (\vec {X})\) for every \(0\leqslant s\leqslant 1\) ; this condition ensures that when we take the limit \(\vec {x}\rightarrow \vec {X}\) , \(\vec {x}\) will always remain inside \(\chi (\vec {X})\subset {\mathbb {X}}\) and \((\vec {x}-\vec {X})\rightarrow {\vec {0}}\) .
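As an informal check of the rearrangement behind (11), the following Python sketch verifies the underlying Leibniz identity in one dimension (\(M=1\)) for a single remainder-type term \(w(z)z^n/n!\) with \(z=x-X\). The helper names (`plus`, `lhs`, `rhs`) and the test polynomial are illustrative assumptions of ours and are not part of the derivation; polynomials are stored as coefficient lists in \(z\).

```python
from fractions import Fraction
from math import comb, factorial

def plus(k):
    """Componentwise max(k_i, 0): the (k)^oplus operation on a multi-index."""
    return tuple(max(ki, 0) for ki in k)

def deriv(p, order=1):
    """Differentiate the polynomial p = [c0, c1, ...] in z, `order` times."""
    for _ in range(order):
        p = [j * c for j, c in enumerate(p)][1:] or [Fraction(0)]
    return p

def mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    """Sum of two polynomials of possibly different length."""
    size = max(len(p), len(q))
    p = p + [Fraction(0)] * (size - len(p))
    q = q + [Fraction(0)] * (size - len(q))
    return [a + b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def monomial(m):
    """Coefficient list of z^m / m!."""
    return [Fraction(0)] * m + [Fraction(1, factorial(m))]

def lhs(w, n, ell):
    """d^ell/dz^ell of w(z) z^n / n!, computed directly."""
    return trim(deriv(mul(w, monomial(n)), ell))

def rhs(w, n, ell):
    """The same derivative via the double-sum form used in (11), with M = 1."""
    total = [Fraction(0)]
    (lower,) = plus((n - ell,))
    for m in range(lower, n + 1):
        term = mul(deriv(w, m + ell - n), monomial(m))
        total = add(total, [comb(ell, n - m) * c for c in term])
    return trim(total)

w = [1, 2, 3, 4]                      # hypothetical w(z) = 1 + 2z + 3z^2 + 4z^3
for ell in range(7):
    assert lhs(w, 3, ell) == rhs(w, 3, ell)
```

The lower limit `plus((n - ell,))` is what keeps every exponent of \(z\) non-negative when \(\ell>n\).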

Now take the limit \(\vec {x}\rightarrow \vec {X}\) in equation (12) so that all terms with factors of \((\vec {x}-\vec {X})\) vanish. For simplicity, and unless otherwise specified, hereafter \(u^{(\vec {n})}\) denotes \(u^{(\vec {n})}(\vec {X},y,t)\) when \(|\vec {n}|<N\) and denotes \(u^{(\vec {n})}(\vec {X},\vec {X},y,t)\) when \(|\vec {n}|=N\) . Similarly for the nonlinearity: \(f^{(\vec {n})}\) denotes \(f^{(\vec {n})}(\vec {X},y,t)\) when \(|\vec {n}|<N\) and denotes \({f^{(\vec {n})}(\vec {X},\vec {X},y,t)}\) when \(|\vec {n}|=N\) . Further, interchange the \({\vec {n}}\) and \({\vec {\ell }}\) multi-indices in (12). Then the nonlinear pde (6) implies that for every station \({\vec {X}\in {\mathbb {X}}}\), every point in the cross-section \(y\in {\mathbb {Y}}\), and every time \(t\in {\mathbb {T}}\),

$$\begin{aligned}&\frac{\partial u^{(\vec {n})}}{\partial t}=\sum _{|\vec {k}|=0}^{N-|\vec {n}|} {\mathfrak {L}}_{\vec {k}} u^{(\vec {n}+\vec {k})}+f^{(\vec {n})}+r_{\vec {n}}\,,\nonumber \\&\quad \quad \text {for every }|\vec {n}|\leqslant N\,, \end{aligned}$$
(13a)

where the remainder

$$\begin{aligned} r_{\vec {n}}=\sum _{|\vec {k}|\geqslant 1}\sum _{\begin{array}{c} |\vec {\ell }|=N\\ \vec {\ell }\,\lneqq \,\vec {n}+\vec {k} \end{array}}{\mathfrak {L}}_{\vec {k}}\left( {\begin{array}{c}\vec {k}+\vec {n}\\ \vec {\ell }\end{array}}\right) \left[ \partial _{\vec {x}}^{\vec {k}+\vec {n}-\vec {\ell }}u^{(\vec {\ell })}(\vec {X},\vec {x},y,t)\right] _{\vec {x}=\vec {X}}\,, \end{aligned}$$
(13b)

where in comparing indices we define that \(\lneqq \) means \(\leqslant \) for each component, but excluding exact equality of the two multi-indices. The second term on the right-hand side of (12) (when \({|\vec {k}|\geqslant 1}\)) determines the remainder (13b). Since multi-index \({\vec {n}\in {\mathbb {N}}_0^M}\) and \({|\vec {n}|\leqslant N}\), there are thus \({{\mathcal {N}}}:=\left( {\begin{array}{c}N+M\\ M\end{array}}\right) \) coupled odes (13a).
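To make the index bookkeeping concrete, the following Python sketch enumerates the multi-indices with \(|\vec {n}|\leqslant N\), confirms the count \({\mathcal {N}}\), and lists the indices \(\vec {\ell }\) that enter the inner sum of the remainder (13b) for one choice of \(\vec {n}\) and \(\vec {k}\). The function names and the values \(M=2\), \(N=3\) are hypothetical choices made only for illustration.

```python
from itertools import product
from math import comb

def multi_indices(M, N):
    """All n in N_0^M with |n| = n_1 + ... + n_M <= N."""
    return [n for n in product(range(N + 1), repeat=M) if sum(n) <= N]

def strictly_below(l, q):
    """The relation l 'lneqq' q: l_i <= q_i for every i, but l != q."""
    return all(a <= b for a, b in zip(l, q)) and l != q

def remainder_indices(n, k, N):
    """Multi-indices l with |l| = N and l lneqq n + k, as in the sum of (13b)."""
    target = tuple(a + b for a, b in zip(n, k))
    return [l for l in multi_indices(len(n), N)
            if sum(l) == N and strictly_below(l, target)]

M, N = 2, 3
count = len(multi_indices(M, N))                   # number of coupled equations (13a)
assert count == comb(N + M, M)
contributing = remainder_indices((1, 0), (2, 1), N)  # indices l for n=(1,0), k=(2,1)
```

For these values `count` is 10 and only \(\vec {\ell }=(2,1)\) and \((3,0)\) contribute, since every other \(\vec {\ell }\) with \(|\vec {\ell }|=3\) exceeds \(\vec {n}+\vec {k}=(3,1)\) in its second component.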

For every multi-index \(\vec {n}\), \({|\vec {n}|\leqslant N}\), the \({u^{(\vec {n})}}\) terms in equation (13a) are evaluated at station \({\vec {X}}\), but the spatial derivatives of \({u^{(\vec {n})}(\vec {X},\vec {x},y,t)}\) with \({|\vec {n}|=N}\) that appear in the remainder term \({r_{\vec {n}}}\) (13b) couple the dynamics at station \({\vec {X}}\) to dynamics of the system along the line joining fixed station \(\vec {X}\) to variable position \({\vec {x}}\), that is, the dynamics at \({\vec {X}}\) are coupled to the dynamics at points in the neighbourhood \({\chi (\vec {X})}\). This dependence of derivatives of \({u^{(\vec {n})}(\vec {X},\vec {x},y,t)}\) on the dynamics at points in \({\chi (\vec {X})}\) is directly seen from an application of the integral mean value theorem on equation (8c). By this theorem, there exists some \({{{\hat{s}}}}\in (0,1)\) such that

$$\begin{aligned} u^{(\vec {n})}(\vec {X},\vec {x},y,t)&=N\partial _{\vec {x}}^{\vec {n}}u\big |_{\vec {X}+{{{\hat{s}}}}(\vec {x}-\vec {X})}\int _0^1(1-s)^{N-1}\,ds\\&=\partial _{\vec {x}}^{\vec {n}}u\big |_{\vec {X}+{{{\hat{s}}}}(\vec {x}-\vec {X})}\,, \end{aligned}$$

and \({\vec {X}+{{{\hat{s}}}}(\vec {x}-\vec {X})\in \chi (\vec {X})}\). Spatial derivatives of \({u^{(\vec {n})}(\vec {X},\vec {x},y,t)}\) retain dependence on \({{{\hat{s}}}}\), and thus on the dynamics about \({\vec {X}}\), even when evaluated at \({\vec {x}=\vec {X}}\). In contrast, \({u^{(\vec {n})}(\vec {X},\vec {X},y,t)=\partial _{\vec {x}}^{\vec {n}}u\big |_{\vec {X}}}\) is independent of \({{{\hat{s}}}}\). Whereas an \({{{\hat{s}}}}\in (0,1)\) must exist for each \({u^{(\vec {n})}(\vec {X},\vec {x},y,t)}\), these \({{{\hat{s}}}}\) are generally not determined by users, and so we view gradients of \({u^{(\vec {n})}(\vec {X},\vec {x},y,t)}\) as ‘uncertain’. We therefore classify the remainders \({r_{\vec {n}}}\) as uncertain forcing which couple the local dynamics at \({\vec {X}}\) to the dynamics in its neighbourhood, and thereby to the global dynamics over \({\mathbb {X}}\).
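A small worked case may help fix ideas. The Python sketch below assumes, as the display above suggests, that (8c) defines \(u^{(\vec {n})}(\vec {X},\vec {x},y,t)\) for \(|\vec {n}|=N\) as the weighted average \(N\int _0^1(1-s)^{N-1}\partial _{\vec {x}}^{\vec {n}}u\big |_{\vec {X}+s(\vec {x}-\vec {X})}\,ds\). It takes the hypothetical one-dimensional field \(u=x^3\) with \(N=2\), evaluates that average exactly, and recovers the corresponding \({{\hat{s}}}\); the names `weighted_mean`, `X` and `h` are ours.

```python
from fractions import Fraction
from math import comb

def weighted_mean(g_coeffs, N):
    """Return N * int_0^1 (1-s)^(N-1) g(s) ds for g(s) = sum_j g_coeffs[j] * s**j.

    Uses the Beta integral N * int_0^1 (1-s)^(N-1) s^j ds = 1 / comb(N + j, j).
    """
    return sum(Fraction(c) / comb(N + j, j) for j, c in enumerate(g_coeffs))

N = 2
X, h = Fraction(2), Fraction(1, 2)        # hypothetical station X and offset h = x - X

# For u = x^3 the second derivative along the segment is 6*(X + s*h).
second_derivative = [6 * X, 6 * h]
u2 = weighted_mean(second_derivative, N)  # plays the role of u^{(2)}(X, x)

# The mean value theorem point: solve 6*(X + s_hat*h) = u2 for s_hat.
s_hat = (u2 - 6 * X) / (6 * h)

# The Taylor multinomial with this top coefficient reproduces u exactly.
taylor = X**3 + 3 * X**2 * h + u2 * h**2 / 2
assert taylor == (X + h)**3
```

Here \(u^{(2)}(X,x)=6X+2(x-X)\), so \({{\hat{s}}}=1/3\), and the gradient \(\partial _x u^{(2)}=2\) is non-zero at \(x=X\) even though \(u^{(2)}(X,X)=6X\) involves only the station itself.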

The nonlinear \({f^{(\vec {n})}}\) may also contain ‘uncertain’ gradients of \({u^{(\vec {n})}(\vec {X},\vec {x},y,t)}\), depending on the particular nonlinearity. For example, for the case of a single-term nonlinearity we obtain the last line of equation (10) which contains spatial derivatives up to order \({\vec {p}_i}\) of \({u^{(\vec {n}+\vec {p}_i)}(\vec {X},\vec {x},y,t)}\) where \({|\vec {n}+\vec {p}_i|=N}\) . So, if at least one \({\vec {p}_i>{\vec {0}}}\) , the nonlinear term contains uncertain gradients which couple the dynamics at \({\vec {X}}\) to the dynamics in \({\chi (\vec {X})}\). Section 2.2 explicitly identifies these uncertain gradients in the nonlinearity \({f^{(\vec {n})}}\).

2.2 Generating multinomial and PDE

We now pack all the multivariate Taylor coefficients \({u^{(\vec {n})}}\) together into a generating function (multinomial) in order to handle all the pdes together, and also connect with established heuristic methodologies. The details here extend those for linear systems [30, §3.2]. For every station \({\vec {X}\in {\mathbb {X}}}\) and time \(t\in {\mathbb {T}}\) consider the field \(u\) in terms of a local Taylor multinomial (8a) about the cross-section \({\vec {x}=\vec {X}}\) . In terms of the indeterminate \({\vec {\xi }\in {\mathbb {R}}^M}\), define the generating multinomial

$$\begin{aligned} {{\tilde{u}}}(\vec {X},t):=\sum _{|\vec {n}|=0}^{N-1}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}} u^{(\vec {n})}(\vec {X},y,t) +\sum _{|\vec {n}|=N}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}u^{(\vec {n})}(\vec {X},\vec {X},y,t)\,, \end{aligned}$$
(14)

where this generating multinomial \({{\tilde{u}}}\), through its range denoted by \({\mathbb {U}}_N\), is implicitly a function of the indeterminate \(\vec {\xi }\) and the cross-sectional variable \(y\). That is, the generating multinomial is a map \({{\tilde{u}}}:{\mathbb {X}}\times {\mathbb {T}}\rightarrow {\mathbb {U}}_N\) into the vector space \({\mathbb {U}}_N:={\mathbb {U}}\otimes _t{\mathbb {G}}_N\), where \({\mathbb {G}}_N\) denotes the space of multinomials in \(\vec {\xi }\) of degree at most \(N\), and where \(\otimes _t\) represents the vector space tensor product. The generating operator

$$\begin{aligned} {\mathcal {G}}:=\left[ \sum _{|\vec {n}|=0}^{N}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}\partial _{\vec {x}}^{\vec {n}}\right] _{\vec {x}=\vec {X}}\,, \end{aligned}$$
(15)

acts to convert the original field \({u(\vec {x},y,t)}\) into the generating multinomial \({{{\tilde{u}}}(\vec {X},t)={\mathcal {G}}u(\vec {x},y,t)}\) . The generating operator (15) similarly converts the nonlinear term of pde (6) into a multinomial in \({\vec {\xi }}\),

$$\begin{aligned} {{\tilde{f}}}[{{\tilde{u}}}]&:={\mathcal {G}}f[u(\vec {x},y,t)] =\sum _{|\vec {n}|=0}^{N-1}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}} f^{(\vec {n})}(\vec {X},y,t) \nonumber \\&\quad +\sum _{|\vec {n}|=N}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}f^{(\vec {n})}(\vec {X},\vec {X},y,t), \end{aligned}$$
(16)

with \({{\tilde{f}}}[{{\tilde{u}}}]:{\mathbb {U}}_N\rightarrow {\mathbb {U}}_N\) appearing as the nonlinear term in (17) of the next Proposition 2.

We introduce the generating multinomial \({{\tilde{u}}}\) because it is more convenient to deal with one multinomial and one pde than the \({{\mathcal {N}}}\) Taylor coefficients \({u^{(\vec {n})}}\) and the \({{\mathcal {N}}}\) differential equations (13) derived in the previous section. Roberts and Bunder [30, §3.2] constructed a similar multinomial \({{\tilde{u}}}\) and pde via the generating operator \({\mathcal {G}}\); however, here we make new special provisions for the nonlinear term \(f[u]\). Although the compact form of multinomial \({{\tilde{u}}}\) is useful, a more important property is that the dynamics of \({{\tilde{u}}}\) and the original field \({u(\vec {x},y,t)}\) are equivalent up to a known difference, as described by Proposition 2.

Proposition 2

(cf. Roberts and Bunder [30, Prop. 2]) Let \({u(\vec {x},y,t)}\) be governed by the specified nonlinear pde (6). Then the dynamics at every locale \({\vec {X}\in {\mathbb {X}}\subset {\mathbb {R}}^M}\), every \(y\in {\mathbb {Y}}\), and every \(t\in {\mathbb {T}}\), are equivalently governed by the nonlinear pde

$$\begin{aligned} \frac{\partial {{\tilde{u}}}}{\partial t}={{{\tilde{{\mathcal {L}}}}}}{{\tilde{u}}}+{{\tilde{f}}}[{{\tilde{u}}}]+{{\tilde{r}}}[u]\,, \end{aligned}$$
(17)

for the generating function multinomial \({{{\tilde{u}}}(\vec {X},y,t)}\) defined in (14), the ‘uncertain’ forcing \({{\tilde{r}}}[u]\) given by (21), the nonlinear \({{\tilde{f}}}[{{\tilde{u}}}]\) defined by (16), and the operator

$$\begin{aligned} {{{\tilde{{\mathcal {L}}}}}}:=\sum _{|\vec {k}|=0}^{N}{\mathfrak {L}}_{\vec {k}}\partial _{\vec {\xi }}^{\vec {k}}\,. \end{aligned}$$
(18)

To establish Proposition 2, we first show that the multinomial \({{{\tilde{u}}}(\vec {X},y,t)}\) (14) satisfies pde (17). To construct a pde for \({{\tilde{u}}}\), take the time derivative of (14) and replace \({\frac{\partial u^{(\vec {n})}}{\partial t}}\) using (13a):

$$\begin{aligned} \frac{\partial {{\tilde{u}}}}{\partial t} ={}&{}\sum _{|\vec {n}|=0}^{N}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}} \left[ \sum _{|\vec {k}|=0}^{N-|\vec {n}|}{\mathfrak {L}}_{\vec {k}}u^{(\vec {n}+\vec {k})}\right] \nonumber \\&+\sum _{|\vec {n}|=0}^N\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}f^{(\vec {n})} +\sum _{|\vec {n}|=0}^N\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}r_{\vec {n}} \nonumber \\ ={}&{}\sum _{|\vec {k}|=0}^{N}{\mathfrak {L}}_{\vec {k}}\partial _{\vec {\xi }}^{\vec {k}}{{\tilde{u}}}+{{\tilde{f}}}[{{\tilde{u}}}]+\sum _{|\vec {n}|=0}^N\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}r_{\vec {n}}\,, \end{aligned}$$
(19)

where in the first term the \({\vec {n}}\) and \({\vec {k}}\) sums are exchanged, and we then simplify this term using the useful identity

$$\begin{aligned} \partial _{\vec {\xi }}^{\vec {k}}{{\tilde{u}}}=\sum _{|\vec {n}|=0}^{N-|\vec {k}|}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}u^{(\vec {n}+\vec {k})}, \end{aligned}$$
(20)

obtained from derivatives of the generating multinomial (14) with respect to \({\vec {\xi }}\). The above pde (19) is precisely pde (17) of Proposition 2 with forcing ‘remainder’

$$\begin{aligned} {{\tilde{r}}}[u]&=\sum _{|\vec {n}|=0}^N\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}r_{\vec {n}} \nonumber \\&=\sum _{|\vec {k}|\geqslant 1}{\mathfrak {L}}_{\vec {k}}\sum _{|\vec {n}|=0}^N\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}\sum _{\begin{array}{c} |\vec {\ell }|=N\\ \vec {\ell }\,\lneqq \,\vec {n}+\vec {k} \end{array}} \left( {\begin{array}{c}\vec {k}+\vec {n}\\ \vec {\ell }\end{array}}\right) \nonumber \\&\quad \left[ \partial _{\vec {x}}^{\vec {k}+\vec {n}-\vec {\ell }}u^{(\vec {\ell })}(\vec {X},\vec {x},y,t) \right] _{\vec {x}=\vec {X}}\,, \end{aligned}$$
(21)

upon using expression (13b) for \({r_{\vec {n}}}\).
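The identity (20) used above is easy to confirm numerically. The following Python sketch does so in one dimension (\(M=1\)) with exact rational arithmetic; the list `taylor` holds hypothetical values standing in for the coefficients \(u^{(0)},\ldots ,u^{(N)}\) at a fixed station, and the function names are ours.

```python
from fractions import Fraction
from math import factorial

def generating_poly(taylor):
    """Coefficients in xi of sum_n xi^n/n! * taylor[n], as in (14) with M = 1."""
    return [Fraction(c, factorial(n)) for n, c in enumerate(taylor)]

def d_xi(p, k):
    """Differentiate the polynomial p = [c0, c1, ...] in xi, k times."""
    for _ in range(k):
        p = [j * c for j, c in enumerate(p)][1:]
    return p

taylor = [5, -3, 7, 2, 11]     # hypothetical u^(0), ..., u^(N) with N = 4
N = len(taylor) - 1

# Identity (20): the k-th xi-derivative of u~ is the generating multinomial
# of the shifted coefficients u^(k), ..., u^(N), which has degree N - k.
for k in range(N + 1):
    assert d_xi(generating_poly(taylor), k) == generating_poly(taylor[k:])
```

Each \(\vec {\xi }\)-derivative thus shifts the list of Taylor coefficients by one place and lowers the degree by one, which is why the \(\vec {k}\) sum in (19) terminates at \(|\vec {k}|=N-|\vec {n}|\).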

The second task for establishing Proposition 2 is to show that the generating pde (17) and the original pde (6) describe the same dynamics at every locale \({\vec {X}\in {\mathbb {X}}\subset {\mathbb {R}}^M}\). We do this by providing a more physical interpretation of the generating operator \({\mathcal {G}}\) and the generating multinomial \({{{\tilde{u}}}(\vec {X},t)}\), beyond just a convenient way to pack the Taylor coefficients of \({u(\vec {x},y,t)}\).

Consider the Taylor expansion of some general function \({g(\vec {x})\in C^{N+1}}\) at \({\vec {x}=\vec {X}+\vec {\xi }}\) about \({\vec {x}=\vec {X}}\) :

$$\begin{aligned}&[g(\vec {x})]_{\vec {x}=\vec {X}+\vec {\xi }}=g(\vec {X}+\vec {\xi })=\sum _{|\vec {n}|=0}^N\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}\partial _{\vec {x}}^{\vec {n}}g(\vec {X})+R_N(g)\nonumber \\&\quad ={\mathcal {G}}g(\vec {x})+{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big ), \end{aligned}$$
(22)

where \(R_N(g)\) is the order \(N\) Lagrange remainder term of \({g(\vec {X}+\vec {\xi })}\) [30]. So, \({{\mathcal {G}}g(\vec {x})}\) evaluates \({g(\vec {X}+\vec {\xi })}\) correct to \({{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big )}\). Similarly, \({{{\tilde{u}}}(\vec {X},t)={\mathcal {G}}u(\vec {x},y,t)}\) evaluates \({u(\vec {X}+\vec {\xi },y,t)}\) correct to \({{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big )}\). We interpret \({{{\tilde{u}}}(\vec {X},t)}\) as the projection of \({u(\vec {x},y,t)}\) at \({\vec {x}=\vec {X}+\vec {\xi }}\) onto the space \({\mathbb {U}}_N={\mathbb {U}}\otimes _t{\mathbb {G}}_N\), with \({{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big )}\) interpreted not as an error but as the difference between \({u(\vec {X}+\vec {\xi },y,t)}\) and its projection onto \({\mathbb {U}}_N\) [30]. Since \({\mathcal {G}}\) commutes with the temporal derivative, \({{\mathcal {G}}\frac{\partial u(\vec {x},y,t)}{\partial t}=\frac{\partial {{\tilde{u}}}(\vec {X},t)}{\partial t}}\), the derivative \({\frac{\partial {{\tilde{u}}}(\vec {X},t)}{\partial t}}\) is equivalent to the Taylor expansion of \({\frac{\partial u(\vec {x},y,t)}{\partial t}}\) at \({\vec {x}=\vec {X}+\vec {\xi }}\) correct to \({{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big )}\). Therefore the generating pde (17) for multinomial \({{{\tilde{u}}}(\vec {X},t)}\) is equivalent to the pde (6) for \({u(\vec {x},y,t)}\) evaluated at \({\vec {x}=\vec {X}+\vec {\xi }}\) correct to \({{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big )}\). Thus the dynamics of pde (17) are identical to the dynamics of pde (6) at every \({\vec {x}=\vec {X}\in {\mathbb {X}}}\). This completes the proof of Proposition 2.
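The projection interpretation of the generating operator (15) can be seen in a minimal one-dimensional Python sketch. The polynomial \(g(x)=x^4-2x+1\), the station \(X=3\), and the function names are hypothetical choices for illustration; the sketch computes the coefficients of \({\mathcal {G}}g\) in powers of \(\xi \) and compares them with the exact expansion of \(g(X+\xi )\).

```python
from fractions import Fraction
from math import factorial

def poly_eval(p, x):
    """Evaluate the polynomial p = [c0, c1, ...] at x."""
    return sum(c * x**j for j, c in enumerate(p))

def poly_deriv(p):
    """Derivative of the polynomial p = [c0, c1, ...]."""
    return [j * c for j, c in enumerate(p)][1:]

def generate(p, X, N):
    """Coefficients in xi of G g = sum_{n<=N} xi^n/n! g^(n)(X), with M = 1."""
    out = []
    for n in range(N + 1):
        out.append(Fraction(poly_eval(p, X), factorial(n)))
        p = poly_deriv(p)
    return out

g = [1, -2, 0, 0, 1]             # hypothetical g(x) = x^4 - 2x + 1
truncated = generate(g, 3, 2)    # projection onto multinomials of degree <= 2
exact = generate(g, 3, 4)        # degree of g reached: full expansion of g(3 + xi)
assert truncated == exact[:3]
```

With \(N=2\) the generating multinomial is \(76+106\xi +54\xi ^2\), and the discarded part \(12\xi ^3+\xi ^4\) is exactly the \({\mathcal {O}}(|\xi |^{3})\) difference between \(g(X+\xi )\) and its projection.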

The dynamics of the original nonlinear pde (6) for field \(u\) are equivalent to the dynamics of the nonlinear pde (17) for the degree-\(N\) multinomial \({{\tilde{u}}}\) (14); furthermore, the two pdes are symbolically the same with \(u\leftrightarrow {{\tilde{u}}}\) and \({\vec {x}\leftrightarrow \vec {\xi }}\), except for a forcing term. The advantage of the multinomial form is that the derivatives \({\partial _{\vec {\xi }}}\) operate only on \({\mathbb {G}}_N\), that is, on multinomials of degree at most \(N\) in \({\vec {\xi }\in {\mathbb {R}}^M}\), and are thus bounded in \({\mathbb {G}}_N\). In contrast, the derivatives \({\partial _{\vec {x}}}\) in the original pde are potentially unbounded (e.g., for \(u\) rapidly oscillating or containing irrational functions). The slowly varying modelling of Sect. 3.2 takes advantage of the near symbolic equivalence between pde (6) and pde (17) with \(u\leftrightarrow {{\tilde{u}}}\) and \({\vec {x}\leftrightarrow \vec {\xi }}\).
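The boundedness of \({\partial _{\vec {\xi }}}\) on \({\mathbb {G}}_N\) is perhaps easiest to see in matrix form. The Python sketch below, for \(M=1\) and a hypothetical \(N=3\), represents \(\partial _\xi \) in the basis \(\xi ^n/n!\), \(n=0,\ldots ,N\), where it is a shift matrix; the basis choice and helper names are our own assumptions.

```python
def shift_matrix(N):
    """Matrix of d/dxi acting on coefficient vectors in the basis xi^n / n!.

    Since d/dxi (xi^n / n!) = xi^(n-1) / (n-1)!, the only non-zero entries
    are ones on the first superdiagonal.
    """
    size = N + 1
    return [[1 if col == row + 1 else 0 for col in range(size)]
            for row in range(size)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, power):
    """A raised to a non-negative integer power."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(power):
        result = matmul(result, A)
    return result

N = 3
D = shift_matrix(N)
max_row_sum = max(sum(abs(entry) for entry in row) for row in D)  # induced max-norm
highest = matpow(D, N)        # still non-zero: maps xi^N/N! to the constant 1
beyond = matpow(D, N + 1)     # identically zero on multinomials of degree <= N
```

In this basis the operator has unit norm and is nilpotent, \(\partial _\xi ^{N+1}=0\), which is also why the sum in the operator (18) stops at \(|\vec {k}|=N\).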

We now expand the nonlinear term (16) of pde (17) explicitly in terms of the generating multinomial \({{\tilde{u}}}\) and nonlinear ‘uncertain’ terms involving gradients of \({u^{(\vec {n})}}\) with \({|\vec {n}|=N}\). Section 3.2 makes use of this expansion to simplify the remainder term \(\rho \) of the slow pde, and Appendix A applies the expansion in the construction of the slow pde for the fluid flow example discussed in Sect. 1.1.

The nonlinear term (16) in the generating pde (17), expanded according to the form (7) in Assumption 1, is

$$\begin{aligned} {{\tilde{f}}}[{{\tilde{u}}}]=&\sum _j\sum _{|\vec {n}|=0}^{N}\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}\left[ \partial _{\vec {x}}^{\vec {n}} c_j(y)\prod _{i=1}^{P_j} \partial _{\vec {x}}^{\vec {p}_i^j} u(\vec {x},y,t)\right] _{\vec {x}=\vec {X}} \nonumber \\ ={}&{}\sum _j\sum _{|\vec {n}|=0}^{N}\sum _{\sum _{i=1}^{P_j}\vec {m}_i=\vec {n}} c_j(y)\prod _{i=1}^{P_j}\frac{\vec {\xi }^{\vec {m}_i}}{\vec {m}_i\text {!}}\left[ \partial _{\vec {x}}^{\vec {m}_i}\partial _{\vec {x}}^{\vec {p}_i^j} u(\vec {x},y,t)\right] _{\vec {x}=\vec {X}} \nonumber \\ ={}&{}\sum _j\sum _{|\vec {n}|=0}^{N}\sum _{\begin{array}{c} \sum _{i=1}^{P_j}\vec {m}_i=\vec {n}\\ |\vec {m}_i+\vec {p}_i^j|\leqslant N \end{array}} c_j(y)\prod _{i=1}^{P_j}\frac{\vec {\xi }^{\vec {m}_i}}{\vec {m}_i\text {!}} \left[ \partial _{\vec {x}}^{\vec {m}_i+\vec {p}_i^j} u(\vec {x},y,t)\right] _{\vec {x}=\vec {X}} \nonumber \\&\quad +\sum _j\sum _{|\vec {n}|=0}^{N}\sum _{\begin{array}{c} \sum _{i=1}^{P_j}\vec {m}_i=\vec {n}\\ \exists |\vec {m}_i+\vec {p}_i^j|> N \end{array}} c_j(y)\prod _{i=1}^{P_j}\frac{\vec {\xi }^{\vec {m}_i}}{\vec {m}_i\text {!}}\left[ \partial _{\vec {x}}^{\vec {m}_i+\vec {p}_i^j} u(\vec {x},y,t)\right] _{\vec {x}=\vec {X}}\,. \end{aligned}$$
(23)

The components with \({|\vec {m}_i+\vec {p}_i^j|>N}\) in the last term on the right hand side are ‘uncertain’, similar to the uncertain forcing \({{\tilde{r}}}[u]\) (21), although in the special case where all \({\vec {p}_i^j={\vec {0}}}\) , no such uncertain nonlinear terms exist. For the uncertain gradients, consider expansion (11) with \({|\vec {\ell }|>N}\) evaluated at \({\vec {x}=\vec {X}}\) ,

$$\begin{aligned} \partial _{\vec {x}}^{\vec {\ell }} [u(\vec {x},y,t)]_{\vec {x}=\vec {X}}= \sum _{|\vec {n}|=N,\, \vec {n}\,\lneqq \,\vec {\ell }}\left( {\begin{array}{c}\vec {\ell }\\ \vec {n}\end{array}}\right) \left[ \partial _{\vec {x}}^{\vec {\ell }-\vec {n}}u^{(\vec {n})}(\vec {X},\vec {x},y,t)\right] _{\vec {x}=\vec {X}}\,. \end{aligned}$$

Using this expansion for the uncertain terms, as well as (8b) and (20) and Assumption 1, we rewrite the nonlinear term (23) as

$$\begin{aligned} {{\tilde{f}}}[{{\tilde{u}}}] ={}&{}\sum _j\sum _{|\vec {n}|=0}^{N}\sum _{\begin{array}{c} \sum _{i=1}^{P_j}\vec {m}_i=\vec {n}\\ |\vec {m}_i+\vec {p}_i^j|\leqslant N \end{array}} c_j(y)\prod _{i=1}^{P_j}\frac{\vec {\xi }^{\vec {m}_i}}{\vec {m}_i\text {!}}\left[ \partial _{\vec {\xi }}^{\vec {m}_i+\vec {p}_i^j}{{\tilde{u}}}\right] _{\vec {\xi }={\vec {0}}} \nonumber \\{}&{} +\sum _j\sum _{|\vec {n}|=0}^{N}\sum _{\begin{array}{c} \sum _{i=1}^{P_j}\vec {m}_i=\vec {n}\\ \exists |\vec {m}_i+\vec {p}_i^j|> N \end{array}} c_j(y)\prod _{i=1}^{P_j}\frac{\vec {\xi }^{\vec {m}_i}}{\vec {m}_i\text {!}}f_i^j [u,{{\tilde{u}}}]\,, \end{aligned}$$
(24)

where in the second term \(f_i^j[u,{{\tilde{u}}}]\) is either a function of the generating multinomial \({{\tilde{u}}}\) or of the uncertain gradients of the original field \(u\),

$$\begin{aligned} f_i^j[u,{{\tilde{u}}}]:={\left\{ \begin{array}{ll} \left[ \partial _{\vec {\xi }}^{\vec {m}_i+\vec {p}_i^j}{{\tilde{u}}}\right] _{\vec {\xi }={\vec {0}}} &{} \text {for }|\vec {m}_i+\vec {p}_i^j|\leqslant N\,,\\ \sum \limits _{\begin{array}{c} |\vec {k}|=N\\ \vec {k}\,\lneqq \, \vec {m}_i+\vec {p}_i^j \end{array}} \left( {\begin{array}{c}\vec {m}_i+\vec {p}_i^j\\ \vec {k}\end{array}}\right) \left[ \partial _{\vec {x}}^{\vec {m}_i+\vec {p}_i^j-\vec {k}} u^{(\vec {k})}(\vec {X},\vec {x},y,t) \right] _{\vec {x}=\vec {X}}&\text {for }|\vec {m}_i+\vec {p}_i^j|> N\,. \end{array}\right. } \end{aligned}$$

Thus in the Taylor expansion (16),

$$\begin{aligned} f^{(\vec {n})}={}&{}\vec {n}!\sum _j\sum _{\begin{array}{c} \sum _{i=1}^{P_j}\vec {m}_i=\vec {n}\\ |\vec {m}_i+\vec {p}_i^j|\leqslant N \end{array}} c_j(y)\prod _{i=1}^{P_j}\frac{1}{\vec {m}_i!} \left[ \partial _{\vec {\xi }}^{\vec {m}_i+\vec {p}_i^j}{{\tilde{u}}}\right] _{\vec {\xi }={\vec {0}}} \nonumber \\ {}&{}+\vec {n}!\sum _j\sum _{\begin{array}{c} \sum _{i=1}^{P_j}\vec {m}_i=\vec {n}\\ \exists |\vec {m}_i+\vec {p}_i^j|> N \end{array}} c_j(y)\prod _{i=1}^{P_j}\frac{1}{\vec {m}_i!}f_i^j [u,{{\tilde{u}}}]\,, \end{aligned}$$
(25)

where the second term contains all uncertain gradients.
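To illustrate how the index sets in (25) separate, the following Python sketch treats one dimension (\(M=1\)) and a single hypothetical nonlinear term \(u\,\partial _xu\), that is, \(P=2\) with \(p_1=0\) and \(p_2=1\). For each \(n\leqslant N\) it lists the splittings \(m_1+m_2=n\) and sorts them according to whether any \(m_i+p_i\) exceeds \(N\). The function names and the value \(N=2\) are assumptions made for the example.

```python
from itertools import product

def compositions(n, parts):
    """All tuples (m_1, ..., m_parts) of non-negative integers summing to n."""
    return [m for m in product(range(n + 1), repeat=parts) if sum(m) == n]

def classify(p, N):
    """Split the index sets of (25) for one term c * prod_i d^{p_i} u, M = 1.

    Returns {n: (certain, uncertain)}: 'certain' splittings have every
    m_i + p_i <= N, 'uncertain' ones have at least one m_i + p_i > N.
    """
    out = {}
    for n in range(N + 1):
        certain, uncertain = [], []
        for m in compositions(n, len(p)):
            orders = [mi + pi for mi, pi in zip(m, p)]
            (certain if max(orders) <= N else uncertain).append(m)
        out[n] = (certain, uncertain)
    return out

with_gradient = classify([0, 1], 2)      # hypothetical term u * du/dx with N = 2
without_gradient = classify([0, 0], 2)   # term u * u: all p_i = 0
```

For \(u\,\partial _xu\) with \(N=2\) the only uncertain contribution is \((m_1,m_2)=(0,2)\) in \(f^{(2)}\), which requires a third derivative of \(u\); for \(u^2\) the uncertain lists are all empty, in line with the special case where every \(\vec {p}_i^j={\vec {0}}\).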

3 A slow nonlinear model emerges

Section 3.1 constructs the eigenspace which describes the emergent slow dynamics of the generating pde (17) by analysing a linearisation of the pde (17). We then show how this eigenspace and associated eigenvalues capture the slow dynamics of original nonlinear pde (6). It is possible to determine the slow dynamics of pde (6) from the eigenspace of the linearised Taylor coefficient pdes (13a), without introducing the generating multinomial and generating function, but then one must explicitly deal with \({{\mathcal {N}}}\) Taylor coefficients and their coupled \({{\mathcal {N}}}\) pdes, as seen in the linear example of Roberts and Bunder [30, §2.2]. Employing the linear eigenspace as a foundation to describe the dynamics of a nonlinear system is justified by centre manifold theory [3, 6, 14, e.g.] which assures us that generically the stability properties of a linear system with centre-stable dynamics persist under nonlinear perturbations and time-dependent forcing. Section 3.2 extends the analysis to the nonlinear pde (17) to describe the slow dynamics of the nonlinear system, including coupling between the centre and stable subspaces via uncertain terms in both the forcing and the nonlinear term.

The slow dynamics are characterised by a set of generalised eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) which span the centre subspace on which the slow dynamics of \({{\tilde{u}}}\) evolve. As these generalised eigenvectors are determined from the linearised pde, they are the same as those we determined [30, §3.3] for linear pdes. However, there we constructed the generalised eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) via a two-step process, firstly constructing generalised eigenvectors for the linearised version of the original pde (6) (i.e., for \(f[u]=0\)), and then mapping these eigenvectors into \({\mathbb {U}}_N\) [30, Eq. (37) and Lemma 6]. Here we show how to construct the generalised eigenvectors in \({\mathbb {U}}_N\) directly from the generating pde (17).

To analyse the eigenspace and determine the slow dynamics of the linearised generating pde (17), we apply Assumption 3 which describes the eigenspace of \({{\mathfrak {L}}_{{\vec {0}}}}\), the lowest order operator in \({{{\tilde{{\mathcal {L}}}}}}\) (as was also assumed in the linear case by Roberts and Bunder [30], and is needed here to clearly define quantities). However, the conditions of Assumption 3 are not strictly necessary for the extraction of a slow model; for example, here we derive the slow model after assuming the Hilbert space \({\mathbb {U}}\) is a centre-stable subspace, but an analogous derivation is possible when \({\mathbb {U}}\) is a slow-stable subspace (the shallow fluid example of Sect. 1.1 is on a slow-stable subspace and the code in Appendices A.2 and A.3 permits either slow-stable or centre-stable dynamics).

Assumption 3

We assume the following for the primary case of purely centre-stable dynamics.

  1. The Hilbert space \({\mathbb {U}}\) is the direct sum of two closed \({{\mathfrak {L}}_{{\vec {0}}}}\)-invariant subspaces, \({\mathbb {E}}_c^0\) and \({\mathbb {E}}_s^0\), and the corresponding restrictions of \({{\mathfrak {L}}_{{\vec {0}}}}\) generate strongly continuous semigroups [2, 12].

  2. The operator \({{\mathfrak {L}}_{{\vec {0}}}}\) has a discrete spectrum of eigenvalues \(\lambda _1,\lambda _2,\ldots \) (repeated according to multiplicity) with corresponding linearly independent (possibly generalised) eigenvectors \({v_1^{{\vec {0}}},v_2^{{\vec {0}}},\ldots }\) that are complete (\({{\mathbb {U}}={\text {span}}\{v_1^{{\vec {0}}},v_2^{{\vec {0}}},\ldots \}}\)).

  3. The first \(m\) eigenvalues \(\lambda _1,\ldots ,\lambda _m\) of \({{\mathfrak {L}}_{{\vec {0}}}}\) all have real part satisfying \(|\mathfrak {R}\lambda _j|\leqslant \alpha \), and hence the \(m\)-dimensional centre subspace is \({{\mathbb {E}}_c^0={\text {span}}\{v_1^{{\vec {0}}},\ldots ,v_m^{{\vec {0}}}\}}\) [8, Chap. 4, e.g.].

  4. All other eigenvalues \(\lambda _{m+1},\lambda _{m+2},\ldots \) have negative real part, well separated from the centre eigenvalues, namely \(\mathfrak {R}\lambda _j\leqslant -\beta <-N\alpha \) for \(j=m+1,m+2,\ldots \), and so the stable subspace is \({{\mathbb {E}}_s^0={\text {span}}\{v_{m+1}^{{\vec {0}}},v_{m+2}^{{\vec {0}}},\ldots \}}\). For clarity, say the number of stable eigenvalues is \(m'\), so that the stable subspace \({\mathbb {E}}^0_s\) is \(m'\)-dimensional, although the number of stable eigenvalues may be infinite, \(m'\rightarrow \infty \).
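For a finite list of eigenvalues, the spectral conditions of Assumption 3 can be checked mechanically. The Python sketch below is only an illustration under that finite-dimensional assumption: the function names, the sample spectrum, and the values of \(\alpha \) and \(N\) are hypothetical, and \(\beta \) is taken as the smallest decay rate among the stable eigenvalues.

```python
def split_spectrum(eigenvalues, alpha):
    """Separate eigenvalues into centre (|Re| <= alpha) and stable (Re < -alpha)."""
    centre = [lam for lam in eigenvalues if abs(lam.real) <= alpha]
    rest = [lam for lam in eigenvalues if abs(lam.real) > alpha]
    if any(lam.real > 0 for lam in rest):
        raise ValueError("an eigenvalue has Re > alpha: not centre-stable")
    return centre, rest

def gap_is_sufficient(stable, alpha, N):
    """Check the separation Re(lambda_j) <= -beta < -N*alpha of Assumption 3."""
    if not stable:
        return True
    beta = min(-lam.real for lam in stable)   # slowest decay among stable modes
    return beta > N * alpha

# Hypothetical spectrum of the lowest-order operator.
spectrum = [0j, 0.01j, -1 + 0j, -2.5 + 1j]
alpha = 0.05
centre, stable = split_spectrum(spectrum, alpha)
ok_low_order = gap_is_sufficient(stable, alpha, N=3)     # beta = 1 > 0.15
ok_high_order = gap_is_sufficient(stable, alpha, N=30)   # beta = 1 < 1.5
```

The second call fails the test: for a fixed spectral gap, the requirement \(\beta >N\alpha \) limits the truncation order \(N\) for which the assumption holds.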

For convenience, Definition 4 packs the \(m\) eigenvectors which span the centre subspace \({\mathbb {E}}^0_c\) of \({{\mathfrak {L}}_{{\vec {0}}}}\) into one matrix \({V^{{\vec {0}}}}\), and similarly packs the eigenvalues into the matrix \({A_{{\vec {0}}}}\) (being linear concepts, these are the same as for the linear case).

Definition 4

Assumption 3 identifies a subset of \(m\) eigenvectors of \({{\mathfrak {L}}_{{\vec {0}}}}\) which span the centre subspace \({\mathbb {E}}^0_c\subset {\mathbb {U}}\).

  • With these eigenvectors define

    $$\begin{aligned} V^{\vec {0}}:=\begin{bmatrix} v_1^{\vec {0}}&v_2^{\vec {0}}&\cdots&v_m^{\vec {0}}\end{bmatrix} \in {\mathbb {U}}^{1\times m}. \end{aligned}$$
  • Since the centre subspace is an invariant space of \({{\mathfrak {L}}_{{\vec {0}}}}\), define complex matrix \({A_{{\vec {0}}}\in {\mathbb {C}}^{m\times m}}\) to be such that \({{\mathfrak {L}}_{{\vec {0}}} V^{{\vec {0}}} =V^{{\vec {0}}} A_{{\vec {0}}}}\) (often \({A_{{\vec {0}}}}\) will be in Jordan form, but it is not necessarily so).

  • Use \(\langle \cdot ,\cdot \rangle \) to also denote the inner product on the Hilbert space \({\mathbb {U}}\), \(\langle \cdot ,\cdot \rangle :{\mathbb {U}}\times {\mathbb {U}}\rightarrow {\mathbb {C}}\) , the field of complex numbers.

    Interpret this inner product when acting on two matrices/vectors with elements in \({\mathbb {U}}\) as the matrix/vector of the corresponding elementwise inner products. For example, for \({Z^{{\vec {0}}}, V^{{\vec {0}}}\in {\mathbb {U}}^{1\times m}}\) , \({\langle Z^{{\vec {0}}},V^{{\vec {0}}}\rangle \in {\mathbb {C}}^{m\times m}}\).

  • Define \({Z^{{\vec {0}}}\in {\mathbb {U}}^{1\times m}}\) to have \(m\) linearly independent columns which are the \(m\) left eigenvectors of \({{\mathfrak {L}}_{{\vec {0}}}}\), ordered such that the \(j\)th columns of \({V^{{\vec {0}}}}\) and \({Z^{{\vec {0}}}}\) have the same eigenvalue and normalised such that \({\langle Z^{\vec {0}},V^{\vec {0}}\rangle =I_m}\) .
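A finite-dimensional analogue may clarify Definition 4. In the Python sketch below the space \({\mathbb {U}}\) is replaced by \({\mathbb {R}}^3\) with the usual dot product, and the matrix `L` is a hypothetical stand-in for \({{\mathfrak {L}}_{{\vec {0}}}}\) with a double zero eigenvalue and one stable eigenvalue \(-1\). Here "left eigenvectors" is read as \(Z^TL=A_{{\vec {0}}}Z^T\), which is our assumption for this real example.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]

L = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, -1]]     # hypothetical operator: eigenvalues 0, 0, -1
V = [[1, 0],
     [0, 1],
     [0, 0]]         # columns: generalised centre eigenvectors v_1, v_2
Z = [[1, 0],
     [0, 1],
     [-1, 1]]        # columns: left vectors z_1 = (1,0,-1), z_2 = (0,1,1)
A0 = [[0, 1],
      [0, 0]]        # Jordan block for the double zero eigenvalue

invariance = matmul(L, V) == matmul(V, A0)                          # L V = V A0
left_relation = matmul(transpose(Z), L) == matmul(A0, transpose(Z))  # Z^T L = A0 Z^T
gram = matmul(transpose(Z), V)                                      # should be I_2
```

In this example \(A_{{\vec {0}}}\) is a Jordan block rather than a diagonal matrix, and the normalisation \(\langle Z^{{\vec {0}}},V^{{\vec {0}}}\rangle =I_m\) fixes the left vectors uniquely given the right ones.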

Section 3.1 uses the centre subspace eigenvectors \({V^{{\vec {0}}}}\) of \({{\mathfrak {L}}_{{\vec {0}}}}\) to generate a set of generalised eigenvectors of \({{{\tilde{{\mathcal {L}}}}}}\) which describe the slow dynamics of linear pde \(\partial _t{{\tilde{u}}}={{{\tilde{{\mathcal {L}}}}}}{{\tilde{u}}}\) confined to the centre subspace.

3.1 Generalised eigenvectors span the centre subspace

We invoke Assumption 3 to construct a set of eigenvectors (possibly generalised) which span the centre subspace \({\mathbb {E}}_c^N\subset {\mathbb {U}}_N\) of the linear operator \({{{\tilde{{\mathcal {L}}}}}}\) (18). These eigenvectors capture the slow behaviour of the linear pde

$$\begin{aligned} \frac{\partial {{\tilde{u}}}}{\partial t}={{{\tilde{{\mathcal {L}}}}}}{{\tilde{u}}}\,, \end{aligned}$$
(26)

which is the linearisation of the generating pde (17), with neglected forcing.

For \({0<|\vec {n}|\leqslant N}\) , we construct the generalised eigenvector \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\in {\mathbb {U}}^{1\times m}\otimes _t{\mathbb {G}}_N={\mathbb {U}}_N^{1\times m}}\) from the following recurrence relations, beginning with \({{{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}=V^{{\vec {0}}}}\) ,

$$\begin{aligned} A_{\vec {n}}:= \sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}\big \langle Z^{{\vec {0}}},{\mathfrak {L}}_{\vec {k}} {{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}}\big \rangle _{\vec {\xi }={\vec {0}}}\,, \end{aligned}$$
(27a)
$$\begin{aligned} {\mathfrak {L}}_{{\vec {0}}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}} A_{{\vec {0}}}= -\sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}{\mathfrak {L}}_{\vec {k}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}} + \sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}} A_{\vec {k}}\,, \end{aligned}$$
(27b)
$$\begin{aligned} \big \langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\big \rangle =\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}} I_m\,. \end{aligned}$$
(27c)

The m rows of all \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) with \({|\vec {n}|\leqslant N}\) form a subset of \({\mathbb {U}}_N\) with \(m{{\mathcal {N}}}\) elements. Here we show that these \(m{{\mathcal {N}}}\) elements are generalised eigenvectors of \({{{\tilde{{\mathcal {L}}}}}}\) which span the centre subspace \({\mathbb {E}}_c^N\). To do this we show that the \(m{{\mathcal {N}}}\) elements are linearly independent and that the generalised eigenvector equation \({{{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}A_{{\vec {0}}}}\) only produces linear combinations of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}}\) with \({{\vec {0}}\leqslant \vec {k}<\vec {n}}\) .
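As a minimal illustration of the recurrence (27), the Python sketch below takes the smallest non-trivial setting: \({\mathbb {U}}={\mathbb {R}}^2\), one large dimension (\(M=1\)), truncation \(N=1\), and one centre mode (\(m=1\)). The matrices `L0` and `L1` are hypothetical stand-ins for \({{\mathfrak {L}}_{0}}\) and \({{\mathfrak {L}}_{1}}\); because `L0` is diagonal here, equation (27b) is solved by a single division, which would not suffice in general.

```python
from fractions import Fraction

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inner(z, v):
    """Real dot product, standing in for the inner product on U."""
    return sum(a * b for a, b in zip(z, v))

# Hypothetical data: eigenvalue 0 (centre) and -1 (stable) of L0.
L0 = [[0, 0], [0, -1]]
L1 = [[2, 5], [3, 7]]
V0, Z0, A0 = [1, 0], [1, 0], 0

# (27a): A_1 = <Z0, L1 V0>.
A1 = inner(Z0, matvec(L1, V0))

# (27b): L0 V1 - V1 A0 = -L1 V0 + V0 A1, where V1 = xi*V0 + W and,
# by (27c), <Z0, W> = 0.
rhs = [-a + A1 * b for a, b in zip(matvec(L1, V0), V0)]
assert rhs[0] == 0                      # solvability: no centre component
W = [0, Fraction(rhs[1], L0[1][1])]     # stable component of the correction

# Apply L~ = L0 + L1 d/dxi to V1 = W + xi*V0, coefficient by coefficient.
const_part = [a + b for a, b in zip(matvec(L0, W), matvec(L1, V0))]
linear_part = matvec(L0, V0)
```

Since \(A_0=0\) here, the result \({{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {V}}}}}}^{1}-{{{\tilde{{\mathcal {V}}}}}}^{1}A_0\) has constant part \(A_1{{{\tilde{{\mathcal {V}}}}}}^{0}\) and no part linear in \(\xi \), so it is a multiple of the lower-order vector \({{{\tilde{{\mathcal {V}}}}}}^{0}\) alone.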

The recurrence relations (27) appear at first sight to be those presented by Roberts and Bunder [30, Eq. (37)], but there is an important difference. The recurrence relations (27) are expressed in terms of the generalised eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\in {\mathbb {U}}^{1\times m}_N}\), whereas the recurrence relations of Roberts and Bunder [30] are in terms of distinctly different generalised eigenvectors \({V^{\vec {n}}\in {\mathbb {U}}^{1\times m}}\), which are later converted into generalised eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) [30, Lemma 6]. Here, by working only with the generalised eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\), our notation, and hence our derivation, is significantly simpler than that of Roberts and Bunder [30].

The inner product (27c) ensures that \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}=\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}V^{{\vec {0}}}+{{\tilde{V}}}^{\vec {n}}}\) for some \({{{\tilde{V}}}^{\vec {n}}\in {\mathbb {U}}_N^{1\times m}}\) such that \({\big \langle Z^{{\vec {0}}},{{\tilde{V}}}^{\vec {n}}\big \rangle =0_m}\) for all \({|\vec {n}|>0}\) . Further, since the \({\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}V^{{\vec {0}}}}\) part of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) gives zero in the left hand side of (27b), the objective of (27b) is to determine the \({{{\tilde{V}}}^{\vec {n}}}\) part of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\). As the right hand side of (27b) is a function of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}}\) with \({{\vec {0}}\leqslant \vec {k}<\vec {n}}\) , we conclude that \({{{\tilde{V}}}^{\vec {n}}}\) has order of \({\vec {\xi }}\) no larger than the order of these \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}}\) . Since we know that \({V^{{\vec {0}}}}\) is independent of \({\vec {\xi }}\), for \({|\vec {n}|=1}\) equation (27b) ensures that \({{{\tilde{V}}}^{\vec {n}}}\) is independent of \({\vec {\xi }}\) and the highest order of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) when \({|\vec {n}|=1}\) must be \({\vec {\xi }^{\vec {n}}}\). By induction we conclude that for any \({\vec {n}}\), \({{{\tilde{V}}}^{\vec {n}}}\) is of order \({\vec {k}}\) in \({\vec {\xi }}\), where \({{\vec {0}}\leqslant \vec {k}<\vec {n}}\) , and \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) is of order \({\vec {n}}\) in \({\vec {\xi }}\). Thus \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) is an \({\vec {n}}\)th order multinomial in \({\mathbb {U}}_N^{1\times m}\) and for all \({|\vec {n}|\leqslant N}\) we have \({{\mathcal {N}}}\) linearly independent \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\).
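
As an illustration of this structure, consider a sketch that assumes a single large dimension, so that the multi-index \(\vec n\) reduces to a scalar \(n\) and \(\vec \xi\) to a scalar \(\xi\) (an assumption made here only for illustration). Up to \(n=2\) the argument above then gives

$$\begin{aligned} {\tilde{\mathcal {V}}}^{0}=V^{0}\,,\qquad {\tilde{\mathcal {V}}}^{1}=\xi V^{0}+{\tilde{V}}^{1}\,,\qquad {\tilde{\mathcal {V}}}^{2}=\tfrac{1}{2}\xi ^2V^{0}+{\tilde{V}}^{2}\,, \end{aligned}$$

with \({\tilde{V}}^{1}\) independent of \(\xi\), with \({\tilde{V}}^{2}\) at most linear in \(\xi\), and with \(\langle Z^{0},{\tilde{V}}^{1}\rangle =\langle Z^{0},{\tilde{V}}^{2}\rangle =0_m\). The leading powers \(1\), \(\xi\) and \(\tfrac{1}{2}\xi ^2\) are distinct, which is why these three multinomials are linearly independent.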

Now consider the rows of each \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\). Since

$$\begin{aligned} {{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}=\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}V^{{\vec {0}}}+{{\tilde{V}}}^{\vec {n}}=\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}\begin{bmatrix} v_1^{\vec {0}}&v_2^{\vec {0}}&\cdots&v_m^{\vec {0}}\end{bmatrix}+{{\tilde{V}}}^{\vec {n}}\,, \end{aligned}$$

with linearly independent eigenvectors \({v_j^{{\vec {0}}}}\) for \(j=1,\ldots , m\), and since \({\langle Z^{{\vec {0}}},{{\tilde{V}}}^{\vec {n}}\rangle =0_m}\), the m elements of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) are linearly independent. Therefore, the \(m\) elements of all \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) with \({|\vec {n}|\leqslant N}\) form a set of \(m{{\mathcal {N}}}\) linearly independent elements of \({\mathbb {U}}_N\).

To show that the rows of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) are generalised eigenvectors of \({{{\tilde{{\mathcal {L}}}}}}\) in the centre subspace \({\mathbb {E}}_c^N\), consider

$$\begin{aligned} {{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}A_{{\vec {0}}}&=\sum _{|\vec {k}|=0}^N{\mathfrak {L}}_{\vec {k}}\partial _{\vec {\xi }}^{\vec {k}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}A_{{\vec {0}}}\nonumber \\&=\sum _{0\leqslant \vec {k}\leqslant \vec {n}}{\mathfrak {L}}_{\vec {k}}\partial _{\vec {\xi }}^{\vec {k}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}A_{{\vec {0}}}\quad \text {since }{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\text { is order}~\vec {n}\text { in}~\vec {\xi }\nonumber \\&=\sum _{0\leqslant \vec {k}\leqslant \vec {n}}{\mathfrak {L}}_{\vec {k}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}A_{{\vec {0}}} \quad \text {from Lemma}\,5\nonumber \\&=\sum _{0< \vec {k}\leqslant \vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}}A_{\vec {k}} \quad \text {from rearranging}\,(27\mathrm{b}).\end{aligned}$$
(28)

The final line of (28) contains only \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}}\) with \({{\vec {0}}\leqslant \vec {k}<\vec {n}}\), and thus the rows of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) are generalised eigenvectors of rank \({\vec {n}}\) with eigenvalues in matrix \({A_{{\vec {0}}}}\). Since the rows of all \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) with \({|\vec {n}|\leqslant N}\) provide \(m{{\mathcal {N}}}\) linearly independent generalised eigenvectors of \({{{\tilde{{\mathcal {L}}}}}}\) with eigenvalues contained in \({A_{{\vec {0}}}}\), these \(m{{\mathcal {N}}}\) eigenvectors must span the centre subspace \({\mathbb {E}}_c^N\).

Lemma 5

For generalised eigenvector \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) constructed from recurrence relations (27), derivatives of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) with respect to \({\vec {\xi }}\) satisfy \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {m}}}\) for \({{\vec {0}}<\vec {m}\leqslant \vec {n}}\) .

Since \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}=\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}V^{{\vec {0}}}+{{\tilde{V}}}^{\vec {n}}}\) with \({{{\tilde{V}}}^{\vec {n}}}\) of order less than \({\vec {n}}\) in \({\vec {\xi }}\), we have \({\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}={{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}}\), in agreement with Lemma 5 when \({\vec {m}=\vec {n}}\). However, to prove Lemma 5 we need only prove the \({|\vec {m}|=1}\) case for general \({\vec {n}}\), as the additive property of derivative powers (Footnote 3) then ensures the lemma is true for every \({{\vec {0}}<\vec {m}\leqslant \vec {n}}\).
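
For example, in a sketch with two large dimensions (chosen here purely for illustration), the multi-index \(\vec m=(2,1)\) decomposes into three first-order steps. Applying the \(|\vec m|=1\) case at each step to \({\tilde{\mathcal {V}}}^{(3,2)}\) gives

$$\begin{aligned} \partial _{\vec {\xi }}^{(2,1)}{\tilde{\mathcal {V}}}^{(3,2)}&=\partial _{\vec {\xi }}^{(1,0)}\partial _{\vec {\xi }}^{(1,0)}\partial _{\vec {\xi }}^{(0,1)}{\tilde{\mathcal {V}}}^{(3,2)}\\&=\partial _{\vec {\xi }}^{(1,0)}\partial _{\vec {\xi }}^{(1,0)}{\tilde{\mathcal {V}}}^{(3,1)}\\&=\partial _{\vec {\xi }}^{(1,0)}{\tilde{\mathcal {V}}}^{(2,1)}\\&={\tilde{\mathcal {V}}}^{(1,1)}\,, \end{aligned}$$

which is \({\tilde{\mathcal {V}}}^{\vec n-\vec m}\) with \(\vec n-\vec m=(3,2)-(2,1)=(1,1)\), as the lemma asserts.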

We prove Lemma 5 by induction. For \({|\vec {n}|=1}\) , since \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}=\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}V^{{\vec {0}}}+{{\tilde{V}}}^{\vec {n}}}\) with \({{{\tilde{V}}}^{\vec {n}}}\) independent of \({\vec {\xi }}\), we know that \({\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}=V^{{\vec {0}}}={{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}}\) , and thus Lemma 5 is true for every \({|\vec {n}|=1}\) . Now assume that \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {k}-\vec {m}}}\) with \({|\vec {m}|=1}\) is true for all \({{\vec {0}}<\vec {k}\leqslant \vec {n}}\) . Then, for \({|\vec {m}|=1}\) , replace \({\vec {n}}\) with \({\vec {n}+\vec {m}}\) in (27b) and take the \({\vec {m}}\)th derivative with respect to \({\vec {\xi }}\),

$$\begin{aligned}&{\mathfrak {L}}_{{\vec {0}}}(\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}})-(\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}}) A_{{\vec {0}}}\\&\quad = -\sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}+\vec {m}}{\mathfrak {L}}_{\vec {k}}(\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}-\vec {k}})\\&\qquad + \sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}+\vec {m}}(\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}-\vec {k}}) A_{\vec {k}}\\&\quad = -\sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}{\mathfrak {L}}_{\vec {k}}(\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}-\vec {k}})\\&\qquad + \sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}(\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}-\vec {k}}) A_{\vec {k}}\\&\quad =-\sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}{\mathfrak {L}}_{\vec {k}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}} + \sum _{0<|\vec {k}|, \vec {k}\leqslant \vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}} A_{\vec {k}}\\&\quad ={\mathfrak {L}}_{{\vec {0}}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}} A_{{\vec {0}}}\,, \end{aligned}$$

where in the fourth and fifth lines we recall that the highest \({\vec {\xi }}\) order of \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}-\vec {k}}}\) is \({\vec {n}+\vec {m}-\vec {k}}\) , so to take the \({\vec {m}}\)th derivative we must have \({\vec {k}\leqslant \vec {n}}\) ; in the sixth line we apply the assumption that \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {k}-\vec {m}}}\) for all \({{\vec {0}}<\vec {k}\leqslant \vec {n}}\); and the seventh line comes from equation (27b) . On comparing the first and last lines we see that \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}\propto V^{{\vec {0}}}}\) , but since

$$\begin{aligned} \partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}}-{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}&=\partial _{\vec {\xi }}^{\vec {m}}\left( \frac{\vec {\xi }^{(\vec {n}+\vec {m})}}{(\vec {n}+\vec {m})\text {!}}V^{{\vec {0}}}+{{\tilde{V}}}^{\vec {n}+\vec {m}}\right) -\frac{\vec {\xi }^{\vec {n}}}{\vec {n}\text {!}}V^{{\vec {0}}}-{{\tilde{V}}}^{\vec {n}}\nonumber \\&=\partial _{\vec {\xi }}^{\vec {m}}{{\tilde{V}}}^{\vec {n}+\vec {m}}-{{\tilde{V}}}^{\vec {n}}\,, \end{aligned}$$

and we know \({\langle Z^{{\vec {0}}},{{\tilde{V}}}^{\vec {n}}\rangle =0}\) for every \({|\vec {n}|>0}\) , then \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) . So, if \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {k}-\vec {m}}}\) with \({|\vec {m}|=1}\) is true for every \({{\vec {0}}<\vec {k}\leqslant \vec {n}}\) and \({1\leqslant |\vec {n}|<N}\) , then \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) is also true. Since \({\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}={{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}}\) is true when \({|\vec {n}|=1}\) , \({\partial _{\vec {\xi }}^{\vec {m}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}+\vec {m}}={{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) must be true for every \({|\vec {n}|>0}\) when \({|\vec {m}|=1}\) . Finally, because derivative orders are additive, Lemma 5 must be true for every \({{\vec {0}}<\vec {m}\leqslant \vec {n}}\) .
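
To see what Lemma 5 imposes in the simplest setting, consider a sketch with a single large dimension, so that \(\vec n\) and \(\vec \xi\) reduce to scalars \(n\) and \(\xi\) (an illustrative assumption). Write \({\tilde{\mathcal {V}}}^{1}=\xi V^{0}+{\tilde{V}}^{1}\) and \({\tilde{\mathcal {V}}}^{2}=\tfrac{1}{2}\xi ^2V^{0}+{\tilde{V}}^{2}\), with \({\tilde{V}}^{1}\) independent of \(\xi\). Lemma 5 with \(n=2\) and \(m=1\) then requires

$$\begin{aligned} \partial _{\xi }{\tilde{\mathcal {V}}}^{2}=\xi V^{0}+\partial _{\xi }{\tilde{V}}^{2}={\tilde{\mathcal {V}}}^{1}=\xi V^{0}+{\tilde{V}}^{1} \quad \Longrightarrow \quad \partial _{\xi }{\tilde{V}}^{2}={\tilde{V}}^{1} \quad \Longrightarrow \quad {\tilde{V}}^{2}=\xi {\tilde{V}}^{1}+C\,, \end{aligned}$$

for some \(C\in {\mathbb {U}}^{1\times m}\) independent of \(\xi\). The symbol \(C\) is introduced here only for illustration; since \(\langle Z^{0},{\tilde{V}}^{1}\rangle =0_m\), the orthogonality condition \(\langle Z^{0},{\tilde{V}}^{2}\rangle =0_m\) reduces to \(\langle Z^{0},C\rangle =0_m\).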

3.2 Slow field and PDE

In this section we complete our primary aim, which is to model the slow dynamics of the original field \({u(\vec {x},y,t)}\). To do this, we project \({u(\vec {x},y,t)}\) onto the centre subspace \({\mathbb {E}}^0_c\) and define this projection as the slow field \({U(\vec {x},t)=\langle Z^{{\vec {0}}},u(\vec {x},y,t)\rangle \in {\mathbb {C}}^m}\). The aim of this section is to construct a pde for \({U(\vec {x},t)}\) with an exact error term. For the shallow fluid flow example of Sect. 1.1, pdes of different order are (4a) and (5a), but the error term is new.

The slow field \({U(\vec {x},t)}\) evaluated at station \({\vec {x}=\vec {X}}\) is equivalent to \({\langle Z^{{\vec {0}}},u(\vec {X}+\vec {\xi },y,t)\rangle }\) evaluated at \({\vec {\xi }={\vec {0}}}\), and since \({{{\tilde{u}}}(\vec {X},t)=u(\vec {X}+\vec {\xi },y,t)+{\mathcal {O}}\big (|\vec {\xi }|^{N+1}\big )}\) (Sect. 2.2 and equation (22)) the slow field is also equivalent to \({\langle Z^{{\vec {0}}},{{\tilde{u}}}(\vec {X},t)\rangle }\) evaluated at \({\vec {\xi }={\vec {0}}}\). We expand \({{{\tilde{u}}}(\vec {X},t)}\) in terms of the centre modes \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) and the analogous stable modes, and then project this parameterisation onto the centre subspace \({\mathbb {E}}^0_c\) to obtain the slow field \({U(\vec {X},t)}\).

Since we project \({{{\tilde{u}}}(\vec {X},t)}\) onto \({\mathbb {E}}^0_c\), the stable modes may at first seem superfluous in the expansion of \({{{\tilde{u}}}(\vec {X},t)}\). However, while the stable modes decay exponentially rapidly, resulting in the emergence of the evolution of \({u(\vec {x},y,t)}\) on the centre subspace, through the nonlinearity these stable modes are not generally negligible and their influence must be accounted for in \({U(\vec {X},t)}\).

We define \({{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}\in {\mathbb {U}}^{1\times m'}\otimes _t {\mathbb {G}}_N={\mathbb {U}}_N^{1\times m'}}\) as the generalised eigenvectors which span the stable subspace \({\mathbb {E}}_s^N\subset {\mathbb {U}}_N\) of \({{{\tilde{{\mathcal {L}}}}}}\). The full set of generalised eigenvectors, \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) and \({{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}}\), span \({\mathbb {U}}_N\) of \({{{\tilde{{\mathcal {L}}}}}}\), fully parameterising the field \({{\tilde{u}}}\). Many of the properties of the \({{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}}\) are analogous to those of the \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) and can be established by proofs similar to those presented in Sect. 3.1. Therefore, here we only briefly comment on those properties of \({{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}}\) which are required for the analysis of generating multinomial \({{\tilde{u}}}\), and ultimately the slow field \({U(\vec {x},t)}\).

For the lowest order case \({\vec {n}={\vec {0}}}\), the centre subspace eigenvector \({{{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}=V^{{\vec {0}}}}\) satisfies \({{\mathfrak {L}}_{{\vec {0}}}{{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}={{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}A_{{\vec {0}}}}\) (Definition 4) where matrix \({A_{{\vec {0}}}\in {\mathbb {C}}^{m\times m}}\) has centre eigenvalues \(\lambda _1,\ldots ,\lambda _m\) (Assumption 3), and similarly there is a stable subspace eigenvector \({{{{\tilde{{\mathcal {W}}}}}}^{{\vec {0}}}=\begin{bmatrix} v_{m+1}^{\vec {0}}&v_{m+2}^{\vec {0}}&\cdots \end{bmatrix}\in {\mathbb {U}}^{1\times m'}}\) which satisfies \({{\mathfrak {L}}_{{\vec {0}}}{{{\tilde{{\mathcal {W}}}}}}^{{\vec {0}}}={{{\tilde{{\mathcal {W}}}}}}^{{\vec {0}}}B_{{\vec {0}}}}\) for some matrix \({B_{{\vec {0}}}\in {\mathbb {C}}^{m'\times m'}}\) with stable eigenvalues \(\lambda _{m+1},\lambda _{m+2},\ldots ,\lambda _{m+m'}\) (recall from Assumption 3 that \(m'\) may be infinite). Together, \({{{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}}\) and \({{{{\tilde{{\mathcal {W}}}}}}^{{\vec {0}}}}\) span the Hilbert space \({\mathbb {U}}\) of \({{\mathfrak {L}}_{{\vec {0}}}}\).

For convenience, define matrix \({{{{\tilde{{\mathcal {V}}}}}}=[{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}]}\) where the columns of \({{{\tilde{{\mathcal {V}}}}}}\) are the centre subspace eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\). The ordering of the \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) in \({{{\tilde{{\mathcal {V}}}}}}\) is according to the magnitude \({|\vec {n}|}\), so that the first column is \({{{{\tilde{{\mathcal {V}}}}}}^{{\vec {0}}}}\). From (28), \({{{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}=\sum _{{\vec {0}}\leqslant \vec {k}\leqslant \vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}-\vec {k}}A_{\vec {k}}}\) and so we define block upper triangular matrix \({\mathcal {A}}\) such that \({{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {V}}}}}}={{{\tilde{{\mathcal {V}}}}}}{\mathcal {A}}\) . The upper block triangular matrix \({\mathcal {A}}\) consists of \({{\mathcal {N}}}\times {{\mathcal {N}}}\) blocks with \(0_{m}\) below the diagonal, \({A_{{\vec {0}}}\in {\mathbb {C}}^{m\times m}}\) along the main diagonal, and the \({(\vec {k},\vec {n})}\) block above the diagonal (that is, for \({\vec {n}>\vec {k}}\)) is \({A_{\vec {n}-\vec {k}}\in {\mathbb {C}}^{m\times m}}\) . The centre eigenvalues of \({{{\tilde{{\mathcal {L}}}}}}\) are the eigenvalues of the blocks along the diagonal of \({\mathcal {A}}\), namely, the m eigenvalues of \({A_{{\vec {0}}}}\), \(\lambda _1,\ldots ,\lambda _m\), repeated \({{\mathcal {N}}}\) times.
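
For instance, in a sketch assuming a single large dimension and truncation \(N=2\) (so that there are three indices \(n=0,1,2\), taking \({\mathcal {N}}\) to count the multi-indices with \(|\vec n|\leqslant N\)), the matrix is

$$\begin{aligned} {\mathcal {A}}=\begin{bmatrix} A_0&A_1&A_2\\ 0_m&A_0&A_1\\ 0_m&0_m&A_0 \end{bmatrix}, \end{aligned}$$

and the three block columns of \({\tilde{\mathcal {L}}}{\tilde{\mathcal {V}}}={\tilde{\mathcal {V}}}{\mathcal {A}}\) read

$$\begin{aligned} {\tilde{\mathcal {L}}}{\tilde{\mathcal {V}}}^{0}={\tilde{\mathcal {V}}}^{0}A_0\,,\qquad {\tilde{\mathcal {L}}}{\tilde{\mathcal {V}}}^{1}={\tilde{\mathcal {V}}}^{0}A_1+{\tilde{\mathcal {V}}}^{1}A_0\,,\qquad {\tilde{\mathcal {L}}}{\tilde{\mathcal {V}}}^{2}={\tilde{\mathcal {V}}}^{0}A_2+{\tilde{\mathcal {V}}}^{1}A_1+{\tilde{\mathcal {V}}}^{2}A_0\,, \end{aligned}$$

in agreement with (28). The diagonal blocks are all \(A_0\), so the eigenvalues of this \({\mathcal {A}}\) are those of \(A_0\) repeated three times.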

Similarly to matrix \({\mathcal {A}}\) which satisfies \({{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {V}}}}}}={{{\tilde{{\mathcal {V}}}}}}{\mathcal {A}}\), we define matrix \({\mathcal {B}}\) such that \({{{\tilde{{\mathcal {L}}}}}}{{{\tilde{{\mathcal {W}}}}}}={{{\tilde{{\mathcal {W}}}}}}{\mathcal {B}}\). Analogous to \({\mathcal {A}}\), \({\mathcal {B}}\) is upper block triangular with \({{\mathcal {N}}}\times {{\mathcal {N}}}\) blocks in \({\mathbb {C}}^{m'\times m'}\), with block \({B_{{\vec {0}}}}\) along the main diagonal, and \({B_{\vec {n}-\vec {k}}}\) the \({(\vec {k},\vec {n})}\) block above the diagonal. Thus the stable eigenvalues of \({{{\tilde{{\mathcal {L}}}}}}\) are the eigenvalues of \({\mathcal {B}}\), which must be \(\lambda _{m+1},\lambda _{m+2},\ldots \) repeated \({{\mathcal {N}}}\) times. Recall that these stable eigenvalues all have real part \(\leqslant -\beta <-N\alpha \), whereas the real parts of the centre eigenvalues have magnitude \(\leqslant \alpha \) (Assumption 3).

We capture the full centre-stable dynamics of the linear generating pde (26) on \({\mathbb {U}}_N\) with

$$\begin{aligned} {{\tilde{u}}}(\vec {X},t)=\sum _{|\vec {n}|=0}^N ({{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}U_{\vec {n}}+{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}S_{\vec {n}})={{{\tilde{{\mathcal {V}}}}}}{\mathcal {U}}+{{{\tilde{{\mathcal {W}}}}}}{\mathcal {S}}\,, \end{aligned}$$
(29)

for parameters \({{\mathcal {U}}=[U_{\vec {n}}]}\) and \({{\mathcal {S}}=[S_{\vec {n}}]}\) with \({U_{\vec {n}}\in {\mathbb {C}}^m}\) and \({S_{\vec {n}}\in {\mathbb {C}}^{m'}}\). As \({{\tilde{r}}}[u],{{\tilde{f}}}[{{\tilde{u}}}]\in {\mathbb {U}}_N\) and the generalised eigenvectors, \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) and \({{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}}\), span \({\mathbb {U}}_N\), the forcing and nonlinear terms are uniquely parameterised in terms of these generalised eigenvectors,

$$\begin{aligned} {{\tilde{r}}}[u]={{{\tilde{{\mathcal {V}}}}}}r_c(t)+{{{\tilde{{\mathcal {W}}}}}}r_s(t)\,,\quad {{\tilde{f}}}[{{\tilde{u}}}]={{{\tilde{{\mathcal {V}}}}}}f_c(t)+{{{\tilde{{\mathcal {W}}}}}}f_s(t)\,, \end{aligned}$$
(30)

where \({r_c=[r_c^{\vec {n}}]}\) and \({f_c=[f_c^{\vec {n}}]}\), and similarly for \(r_s\) and \(f_s\), with \({r^{\vec {n}}_c,f^{\vec {n}}_c\in {\mathbb {C}}^m}\) and \({r^{\vec {n}}_s,f^{\vec {n}}_s\in {\mathbb {C}}^{m'}}\). We substitute the expansion of \({{\tilde{u}}}\) (29) into the nonlinear generating pde (17) and separate the forcing and nonlinear terms into centre and stable components (30). From Sect. 3.1 we know that the centre subspace eigenvectors \({{{{\tilde{{\mathcal {V}}}}}}^{\vec {n}}}\) are linearly independent, and similarly the stable subspace eigenvectors \({{{{\tilde{{\mathcal {W}}}}}}^{\vec {n}}}\) are linearly independent. Therefore, we separate the centre and stable components of the pde to obtain

$$\begin{aligned}&\frac{\partial {\mathcal {U}}}{\partial t}={\mathcal {A}}{\mathcal {U}}+f_c(t)+r_c(t), \end{aligned}$$
(31a)
$$\begin{aligned}&\frac{\partial {\mathcal {S}}}{\partial t}={\mathcal {B}}{\mathcal {S}}+f_s(t)+r_s(t). \end{aligned}$$
(31b)

A general solution of the pde for the stable parameter \({\mathcal {S}}\) (31b) is

$$\begin{aligned} {\mathcal {S}}(t)&=e^{{\mathcal {B}}t}{\mathcal {S}}(0)+\int _0^t e^{{\mathcal {B}}(t-\tau )}[f_s(\tau )+r_s(\tau )]d\tau \nonumber \\&=e^{{\mathcal {B}}t}{\mathcal {S}}(0)+e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]\,, \end{aligned}$$
(32)

with convolution \(h(t)\star g(t):=\int _0^t h(t-\tau )g(\tau )\,d\tau \) . As all eigenvalues of \({\mathcal {B}}\) have real part \(\leqslant -\beta <-N\alpha \), for some decay rate \(\gamma \in (\alpha ,\beta )\) ,

$$\begin{aligned} {\mathcal {S}}(t)=e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]+{\mathcal {O}}\big (e^{-\gamma t}\big )\,. \end{aligned}$$
(33)

This solution for \({\mathcal {S}}(t)\) shows that, after a sufficiently long time, the forcing and nonlinear terms dominate \({\mathcal {S}}\) through the convolution, thus showing how the forcing and nonlinear terms couple the centre and stable solutions through \(f_{s/c}\) and \(r_{s/c}\) and why the influence of the stable modes is not negligible.
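
A scalar caricature makes this concrete. As a sketch, suppose (purely for illustration) that \({\mathcal {S}}\) is a scalar, that \({\mathcal {B}}=-\beta \) with \(\beta >0\), and that the combined forcing \(f_s+r_s\) equals a constant \(c\). Then (32) evaluates to

$$\begin{aligned} {\mathcal {S}}(t)=e^{-\beta t}{\mathcal {S}}(0)+\int _0^t e^{-\beta (t-\tau )}c\,d\tau =e^{-\beta t}{\mathcal {S}}(0)+\frac{c}{\beta }\big (1-e^{-\beta t}\big )\rightarrow \frac{c}{\beta }\quad \text {as }t\rightarrow \infty \,. \end{aligned}$$

The initial condition \({\mathcal {S}}(0)\) is forgotten at rate \(\beta \), whereas the forced part \(c/\beta \) persists for all time.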

We now construct a pde for the slow field \({U(\vec {X},t)}\) by considering both its temporal and spatial derivatives in terms of the centre-stable dynamics. Since \({U(\vec {X},t)=\langle Z^{{\vec {0}}},{{\tilde{u}}}(\vec {X},t)\rangle _{\vec {\xi }={\vec {0}}}}\) at station \({\vec {x}=\vec {X}}\),

$$\begin{aligned}&\frac{\partial U(\vec {X},t)}{\partial t}\nonumber \\&\quad =\left\langle Z^{{\vec {0}}},\frac{\partial {{\tilde{u}}}(\vec {X},t)}{\partial t} \right\rangle _{\vec {\xi }={\vec {0}}}\nonumber \\&\quad =\left\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}\frac{\partial {\mathcal {U}}}{\partial t} \right\rangle _{\vec {\xi }={\vec {0}}}+\left\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\frac{\partial {\mathcal {S}}}{\partial t} \right\rangle _{\vec {\xi }={\vec {0}}} \nonumber \\&\quad =\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}{\mathcal {A}}{{\mathcal {U}}}\rangle _{\vec {\xi }={\vec {0}}}+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}f_c(t)\rangle _{\vec {\xi }={\vec {0}}}+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}r_c(t)\rangle _{\vec {\xi }={\vec {0}}}\nonumber \\&\qquad +\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}{\mathcal {B}}{\mathcal {S}}\rangle _{\vec {\xi }={\vec {0}}}+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}f_s(t)\rangle _{\vec {\xi }={\vec {0}}}\nonumber \\&\qquad +\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}r_s(t)\rangle _{\vec {\xi }={\vec {0}}}\quad \text {from }(31)\nonumber \\&\quad =\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}\rangle _{\vec {\xi }={\vec {0}}}{\mathcal {A}}{{\mathcal {U}}}+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}\rangle _{\vec {\xi }={\vec {0}}}f_c(t)\nonumber \\&\qquad +\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_c(t)\nonumber \\&\qquad +\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}f_s(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)\nonumber \\&\qquad +\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}{\mathcal {B}}e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)] +{\mathcal {O}}\big (e^{-\gamma t}\big ) \quad \text {from }(33). \end{aligned}$$
(34)

From (27c), the inner product \({\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}\rangle _{\vec {\xi }=0}=\begin{bmatrix} I_m&0_m&\cdots&0_m\end{bmatrix}}\). Also, because of the upper block triangular structure of \({\mathcal {A}}\), the \({\vec {k}}\) element of \({\mathcal {A}}{\mathcal {U}}\) is

$$\begin{aligned}{}[{\mathcal {A}}{\mathcal {U}}]_{\vec {k}}=\sum _{\vec {n}\geqslant \vec {k},|\vec {n}|\leqslant N}A_{\vec {n}-\vec {k}}U_{\vec {n}}=\sum _{|\vec {n}|=0}^{N-|\vec {k}|}A_{\vec {n}}U_{\vec {n}+\vec {k}}\,. \end{aligned}$$

Therefore, in the first term on the right hand side of (34) we only retain the \({\vec {k}={\vec {0}}}\) element of \({\mathcal {A}}{\mathcal {U}}\), and in the second and third terms we only retain \({f_c^{{\vec {0}}}}\) and \({r_c^{{\vec {0}}}}\), respectively. So now,

$$\begin{aligned} \frac{\partial U(\vec {X},t)}{\partial t}={}&{}\sum _{|\vec {n}|=0}^NA_{\vec {n}}U_{\vec {n}}+f_c^{{\vec {0}}}(t)+r_c^{{\vec {0}}}(t)\nonumber \\ {}&{}+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}f_s(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)\nonumber \\ {}&{}+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}{\mathcal {B}}e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]+{\mathcal {O}}\big (e^{-\gamma t}\big )\,. \end{aligned}$$
(35)

Now consider the order  \({\vec {n}}\) spatial derivative of the slow field,

$$\begin{aligned}{}[\partial _{\vec {x}}^{\vec {n}}U(\vec {x},t)]_{\vec {x}=\vec {X}}={}&{}\langle Z^{{\vec {0}}},\partial _{\vec {x}}^{\vec {n}}u(\vec {x},y,t)\rangle _{\vec {x}=\vec {X}}\nonumber \\ ={}&{}\langle Z^{{\vec {0}}},u^{(\vec {n})}\rangle \nonumber \\ ={}&{}\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{\tilde{u}}}(\vec {X},t)\rangle _{\vec {\xi }={\vec {0}}}\quad \text {from } (20)\nonumber \\ ={}&{}\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {V}}}}}}\rangle _{\vec {\xi }={\vec {0}}}{\mathcal {U}}+\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}{\mathcal {S}}\,. \end{aligned}$$
(36)

From Lemma 5, \({\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}}\rangle _{\vec {\xi }={\vec {0}}}=\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}^{\vec {k}-\vec {n}}\rangle _{\vec {\xi }={\vec {0}}}}\) which, from (27c), equals the identity \(I_m\) if \({\vec {k}=\vec {n}}\) , but zero otherwise. Thus, in the first inner product of (36), only the \({U_{\vec {n}}}\) element of \({\mathcal {U}}\) remains. Then, in the second inner product of (36), substitute the solution of the stable parameter \({\mathcal {S}}\) (33). The spatial derivative is now

$$\begin{aligned}{}[\partial _{\vec {x}}^{\vec {n}}U(\vec {x},t)]_{\vec {x}=\vec {X}}&=U_{\vec {n}}+\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]\nonumber \\&\quad +{\mathcal {O}}\big (e^{-\gamma t}\big )\,. \end{aligned}$$
(37)

Combining (35) and (37),

$$\begin{aligned}&\frac{\partial U(\vec {X},t)}{\partial t}=\sum _{|\vec {n}|=0}^NA_{\vec {n}}\partial _{\vec {x}}^{\vec {n}}U(\vec {X},t) \nonumber \\&\quad +f_c^{{\vec {0}}}(t)+r_c^{{\vec {0}}}(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}f_s(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t) \nonumber \\&\quad +\left[ \langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle {\mathcal {B}}-\sum _{|\vec {n}|=0}^NA_{\vec {n}}\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {W}}}}}}\rangle \right] _{\vec {\xi }={\vec {0}}}e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)] \nonumber \\&\quad +{\mathcal {O}}\big (e^{-\gamma t}\big )\,. \end{aligned}$$
(38)
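
The intermediate step behind this combination can be sketched explicitly. Rearranging (37) for \(U_{\vec {n}}\), and writing \(\partial _{\vec {x}}^{\vec {n}}U(\vec {X},t)\) for the derivative evaluated at \(\vec {x}=\vec {X}\),

$$\begin{aligned} U_{\vec {n}}=\partial _{\vec {x}}^{\vec {n}}U(\vec {X},t)-\langle Z^{\vec {0}},\partial _{\vec {\xi }}^{\vec {n}}{\tilde{\mathcal {W}}}\rangle _{\vec {\xi }=\vec {0}}\,e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]+{\mathcal {O}}\big (e^{-\gamma t}\big )\,, \end{aligned}$$

and substituting into the sum in (35) gives

$$\begin{aligned} \sum _{|\vec {n}|=0}^NA_{\vec {n}}U_{\vec {n}}=\sum _{|\vec {n}|=0}^NA_{\vec {n}}\partial _{\vec {x}}^{\vec {n}}U(\vec {X},t)-\sum _{|\vec {n}|=0}^NA_{\vec {n}}\langle Z^{\vec {0}},\partial _{\vec {\xi }}^{\vec {n}}{\tilde{\mathcal {W}}}\rangle _{\vec {\xi }=\vec {0}}\,e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]+{\mathcal {O}}\big (e^{-\gamma t}\big )\,. \end{aligned}$$

The second sum merges with the convolution term of (35) to form the bracketed coefficient in (38).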

Whereas equation (38) symbolically resembles a pde, it is strictly a differential-integral equation which couples the dynamics at each station \({\vec {X}}\) via the ‘uncertain’ gradient terms and the stable parameter \({\mathcal {S}}(t)\), which is dependent on the history convolution integrals (32). To obtain a slow pde without this coupling to different stations, such as pde (4a) used in the shallow fluid flow example, we retain all terms which do not couple to different stations (i.e., no dependence on derivatives \({\partial _{\vec {x}}u^{(\vec {n})}}\) with \({|\vec {n}|=N}\) and no dependence on \({\mathcal {S}}(t)\)) and relegate all other terms to a remainder. The second last line of (38) contains a convolution, so is part of the remainder, and the two forcing terms \({r_c^{{\vec {0}}}(t)}\) and \({\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)}\) are dependent on uncertain gradients, so are also in the remainder. In contrast, the nonlinear terms \({f^{{\vec {0}}}_c(t)}\) and \(f_s(t)\) contain parts which we want to retain in the slow pde, as well as terms which should be in the remainder.

For specific cases, removing the remainder components from the nonlinear terms in (38) is achieved using (25), as shown in Appendix A.2. Here, for the general case, we show that the nonlinear terms which are retained in the slow pde must take a particular form. First, separate the nonlinear terms in the second line of (38) into two parts,

$$\begin{aligned} f_c^{{\vec {0}}}(t)=f_{c}^{{\vec {0}}}(\vec {X},t)+f_{c,r}^{{\vec {0}}}(t)\,,\quad \langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}f_s(t)=f_{s}(\vec {X},t)+f_{s,r}(t)\,, \end{aligned}$$
(39)

where \({f_{c}^{{\vec {0}}}(\vec {X},t)\in {\mathbb {C}}^m}\) and \({f_{s}^{{\vec {0}}}(\vec {X},t)\in {\mathbb {C}}^m}\) contain no uncertain terms and no dependence on \({\mathcal {S}}\) (so are retained in the slow pde), and where \({f_{c,r}^{{\vec {0}}}(t)}\) and \(f_{s,r}(t)\) contain all uncertain terms and \({\mathcal {S}}\) dependent terms. The \({f_{c,r}^{{\vec {0}}}(t)}\) and \(f_{s,r}(t)\), as well as the convolutions and forcing terms in (38), are not retained in the slow pde. As the nonlinear function \(f[u]\) in the original pde (6) is a sum of nonlinear terms \(f^j[u]\) (Assumption 1), the nonlinear \({f_{c}^{{\vec {0}}}(\vec {X},t)}\) and \({f_{s}(\vec {X},t)}\) are also a sum of nonlinear terms indexed by integer j,

$$\begin{aligned} f_c^{{\vec {0}}}(\vec {X},t)=\sum _jf_c^{j{\vec {0}}}(\vec {X},t)\,, \quad f_s^{{\vec {0}}}(\vec {X},t)=\sum _jf_s^{j{\vec {0}}}(\vec {X},t)\,, \end{aligned}$$
(40)

where \({f_c^{j{\vec {0}}}(\vec {X},t)\in {\mathbb {C}}^m}\) and \({f_s^{j{\vec {0}}}(\vec {X},t)\in {\mathbb {C}}^m}\). As \(f^j[u]\) is of order \(P_j\) in u and its derivatives (Assumption 1), \({f_c^{j{\vec {0}}}}\) and \({f_s^{j{\vec {0}}}}\) must be of order \(P_j\) in \({U_{\vec {n}}\in {\mathbb {C}}^m}\) for all \({|\vec {n}|\leqslant N}\) (Footnote 4). So, in general, the kth element, \(k=1,\ldots ,m\), of \({f_{c}^{j{\vec {0}}}+f_{s}^{j{\vec {0}}}}\) must have the form

$$\begin{aligned}&\sum _{\begin{array}{c} |\vec {\ell }_1|,\ldots ,|\vec {\ell }_{P_j}|=0\\ |\vec {\ell }_1|\geqslant |\vec {\ell }_2|\geqslant \cdots \geqslant |\vec {\ell }_{P_j}| \end{array}}^N \vec {a}^{jT}_{k\vec {\ell }_{1}\vec {\ell }_{2}\ldots \vec {\ell }_{P_j}}U_{\vec {\ell }_1}\nonumber \\&\quad \otimes U_{\vec {\ell }_2}\otimes \cdots \otimes U_{\vec {\ell }_{P_j}}\,, \end{aligned}$$
(41)

for some constant vector \({\vec {a}^j_{k\vec {\ell }_{1}\vec {\ell }_{2}\ldots \vec {\ell }_{P_j}}\in {\mathbb {C}}^{m^{P_j}}}\) and where \(\otimes \) represents the usual Kronecker product, for which \({U_{\vec {\ell }_p}\otimes U_{\vec {\ell }_q}\in {\mathbb {C}}^{m^2}}\) and \({U_{\vec {\ell }_1}\otimes U_{\vec {\ell }_2}\otimes \cdots \otimes U_{\vec {\ell }_{P_j}}\in {\mathbb {C}}^{m^{P_j}}}\) (Footnote 5). On replacing all \({U_{\vec {\ell }}}\) with the spatial derivatives \({\partial _{\vec {x}}^{\vec {\ell }}U}\), as shown in (37), the kth coordinate of the m-dimensional nonlinear term retained in the slow pde must have the form

$$\begin{aligned}&[f_{c}^{{\vec {0}}}(\vec {X},t)+f_{s}(\vec {X},t)]_k \nonumber \\&\quad =\sum _j\sum _{\begin{array}{c} |\vec {\ell }_1|,\ldots ,|\vec {\ell }_{P_j}|=0\\ |\vec {\ell }_1|\geqslant |\vec {\ell }_2|\geqslant \cdots \geqslant |\vec {\ell }_{P_j}| \end{array}}^N \vec {a}^{jT}_{k\vec {\ell }_{1}\vec {\ell }_{2}\ldots \vec {\ell }_{P_j}}(\partial _{\vec {x}}^{\vec {\ell }_1}U) \otimes (\partial _{\vec {x}}^{\vec {\ell }_2}U) \otimes \cdots \otimes (\partial _{\vec {x}}^{\vec {\ell }_{P_j}}U)\,. \end{aligned}$$
(42)
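
As a sketch of the form (42), assume (purely for illustration) a single large dimension, a scalar slow field (\(m=1\)), truncation \(N=1\), and a single quadratic nonlinearity (\(P_j=2\)). The ordering constraint \(|\vec {\ell }_1|\geqslant |\vec {\ell }_2|\) leaves the three index pairs \((\ell _1,\ell _2)=(0,0),(1,0),(1,1)\), and for \(m=1\) the Kronecker product is ordinary multiplication, so the sum in (42) takes the form

$$\begin{aligned} a_{00}\,U^2+a_{10}\,(\partial _xU)\,U+a_{11}\,(\partial _xU)^2\,, \end{aligned}$$

with hypothetical scalar coefficients \(a_{00}\), \(a_{10}\) and \(a_{11}\). Which of these coefficients are nonzero depends on the particular system being modelled.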

Now, on replacing arbitrary station \({\vec {X}}\) with \({\vec {x}\in {\mathbb {X}}}\) , the slow pde determined from the differential-integral equation (38) is

$$\begin{aligned}&\frac{\partial U(\vec {x},t)}{\partial t}=\sum _{|\vec {n}|=0}^NA_{\vec {n}}\partial _{\vec {x}}^{\vec {n}}U(\vec {x},t)\nonumber \\&\quad +\sum _j\sum _{\begin{array}{c} |\vec {\ell }_1|,\ldots ,|\vec {\ell }_{P_j}|=0\\ |\vec {\ell }_1|\geqslant |\vec {\ell }_2|\geqslant \cdots \geqslant |\vec {\ell }_{P_j}| \end{array}}^N \vec {a}^{jT}_{\vec {\ell }_{1}\vec {\ell }_{2}\ldots \vec {\ell }_{P_j}}(\partial _{\vec {x}}^{\vec {\ell }_1}U) \otimes (\partial _{\vec {x}}^{\vec {\ell }_2}U) \otimes \cdots \otimes (\partial _{\vec {x}}^{\vec {\ell }_{P_j}}U)+\rho \,, \end{aligned}$$
(43)

with \({\vec {a}^{jT}_{\vec {\ell }_{1}\vec {\ell }_{2}\ldots \vec {\ell }_{P_j}}(\partial _{\vec {x}}^{\vec {\ell }_1}U)\otimes \cdots \otimes (\partial _{\vec {x}}^{\vec {\ell }_{P_j}}U)}\) the m-dimensional vector with elements \(k=1,2,\ldots ,m\) defined by (42), and with remainder

$$\begin{aligned} \rho =&f_{c,r}^{{\vec {0}}}(t)+r_c^{{\vec {0}}}(t)+f_{s,r}(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)\nonumber \\&+\left[ \langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle {\mathcal {B}}-\sum _{|\vec {n}|=0}^NA_{\vec {n}}\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {W}}}}}}\rangle \right] _{\vec {\xi }={\vec {0}}}e^{{\mathcal {B}}t}\star [f_s(t)+r_s(t)]\nonumber \\&+{\mathcal {O}}\big (e^{-\gamma t}\big )\,. \end{aligned}$$
(44)

Analogous slow pdes were derived by Roberts [28] (equation (22)) and Roberts and Bunder [30] (equation (51)), but without the nonlinear terms.
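To make the structure of the slow pde (43) concrete, the following sketch evaluates its right-hand side in the simplest setting: a scalar field (\(m=1\), so every Kronecker product reduces to an ordinary product), one large space dimension, quadratic nonlinearity only (\(P_j=2\)), and the remainder \(\rho \) neglected. The function name `slow_pde_rhs`, the dictionary `quad`, and the finite-difference derivatives are assumptions made for illustration, not part of the code in Appendix A.

```python
import numpy as np

def slow_pde_rhs(U, dx, A, quad):
    """Right-hand side of a scalar, one-dimensional pde of the form (43).

    A[n] multiplies the n-th x-derivative of U, for n = 0..N.
    quad[(l1, l2)], with l1 >= l2, multiplies (d^l1 U)(d^l2 U).
    The remainder rho is neglected.
    """
    N = len(A) - 1
    # Build derivatives of orders 0..N by repeated finite differences.
    derivs = [np.asarray(U, dtype=float)]
    for _ in range(N):
        derivs.append(np.gradient(derivs[-1], dx))
    # Linear terms: sum over n of A_n d^n U / dx^n.
    rhs = sum(A[n] * derivs[n] for n in range(N + 1))
    # Quadratic terms: products of two derivatives of order at most N.
    for (l1, l2), coeff in quad.items():
        rhs = rhs + coeff * derivs[l1] * derivs[l2]
    return rhs

# Usage: a Burgers-like choice, U_t = U_xx - U U_x, on a uniform grid.
x = np.linspace(0.0, 2.0 * np.pi, 401)
rhs = slow_pde_rhs(np.sin(x), x[1] - x[0],
                   A=[0.0, 0.0, 1.0], quad={(1, 0): -1.0})
```

Repeated differencing loses accuracy near the ends of the grid, so a practical implementation would likely use a spectral or higher-order stencil; the sketch is meant only to show how the coefficients \(A_{\vec {n}}\) and \(\vec {a}^{j}\) enter.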

Simplifications of the remainder \(\rho \) (44) are possible when the order N is chosen to be higher than the order of the spatial derivatives in the original pde (6). The original pde contains linear operators \({{\mathfrak {L}}_{\vec {k}}\partial _{\vec {x}}^{\vec {k}}}\) for \({\vec {k}}\) satisfying \({0\leqslant |\vec {k}|< \infty }\), but in practice there will be an upper limit on \({|\vec {k}|}\), say \(k_{\max }\) (often \(k_{\max }=1,2\)—the example of Sect. 1.1 has \(k_{\max }=1\)). Assume that \(N>k_{\max }\) and consider the uncertain linear terms in \(\rho \):

$$\begin{aligned} r^{{\vec {0}}}_c(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)&=\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}r_c(t)+{{{\tilde{{\mathcal {W}}}}}}r_s(t)\rangle _{\vec {\xi }={\vec {0}}}\\&=\langle Z^{{\vec {0}}},{{\tilde{r}}}[u]\rangle _{\vec {\xi }={\vec {0}}}\,. \end{aligned}$$

Since \({Z^{{\vec {0}}}}\) is independent of \({\vec {\xi }}\), we need only consider \({\vec {\xi }={\vec {0}}}\) in \({{\tilde{r}}}[u]\) (21). When \({\vec {\xi }={\vec {0}}}\) the right-hand side of (21) requires \({|\vec {n}|=0}\), \({|\vec {\ell }|=N}\) and \({\vec {\ell }\lneqq \vec {k}}\), but since \({N>k_{\max }\geqslant |\vec {k}|}\) we can never satisfy both \({|\vec {\ell }|=N}\) and \({\vec {\ell }\lneqq \vec {k}}\). So, when \(N>k_{\max }\) we have \({{{\tilde{r}}}[u]_{\vec {\xi }={\vec {0}}}=0}\) and \({r_c^{{\vec {0}}}(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)=0}\). Similarly, consider the projection of the nonlinear term

$$\begin{aligned} f^{{\vec {0}}}_c(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}f_s(t)&=\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {V}}}}}}f_c(t)+{{{\tilde{{\mathcal {W}}}}}}f_s(t)\rangle _{\vec {\xi }={\vec {0}}}\\&=\langle Z^{{\vec {0}}},{{\tilde{f}}}[{{\tilde{u}}}]\rangle _{\vec {\xi }={\vec {0}}}\,, \end{aligned}$$

and then expand \({{\tilde{f}}}[{{\tilde{u}}}]\) using (24). If \(N\) is chosen to be larger than the order of every spatial derivative in the nonlinear term, that is, \(N>p_i^j\) for every \(i=1,2,\ldots ,P_j\) and for every nonlinear term \(j\), then

$$\begin{aligned} {{\tilde{f}}}[{{\tilde{u}}}]_{\vec {\xi }={\vec {0}}}=\sum _jc_j(y)\prod _{i=1}^{P_j} \left[ \partial _{\vec {\xi }}^{\vec {p}_i^j}{{\tilde{u}}}\right] _{\vec {\xi }={\vec {0}}} \end{aligned}$$

contains no uncertain terms. So, when separating the nonlinear terms according to (30) and (39), \({f^{{\vec {0}}}_{c,r}(t)}\) and \(f_{s,r}(t)\) contain all \({\mathcal {S}}\) dependence and any convolution terms, but no uncertain terms.

We have shown that \(N>\max (p_i^j,k_{\max })\) removes the uncertain terms from \({f^{{\vec {0}}}_{c,r}(t)}\) and \(f_{s,r}(t)\) and sets \({r^{{\vec {0}}}_c(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)=0}\), but this does not remove all uncertain terms from the remainder \(\rho \) (44). Uncertain terms are still present in the remainder because \(f_s(t)\) and \(r_s(t)\) appear in the convolution in the second line of (44).
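The index-counting step behind \({r^{{\vec {0}}}_c(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)=0}\) can be checked by brute force. The sketch below assumes that \({\vec {\ell }\lneqq \vec {k}}\) means \(\vec {\ell }\leqslant \vec {k}\) componentwise with \(\vec {\ell }\ne \vec {k}\); under that reading \({|\vec {\ell }|<|\vec {k}|\leqslant k_{\max }<N}\), so no multi-index can also satisfy \({|\vec {\ell }|=N}\). The function names are hypothetical.

```python
from itertools import product

def strictly_below(k):
    """All multi-indices ell with ell <= k componentwise and ell != k."""
    ranges = [range(ki + 1) for ki in k]
    return [ell for ell in product(*ranges) if ell != tuple(k)]

def uncertain_indices(k, N):
    """Multi-indices ell strictly below k that also have |ell| = N."""
    return [ell for ell in strictly_below(k) if sum(ell) == N]

def all_k_up_to(k_max, dims):
    """All multi-indices k in `dims` dimensions with |k| <= k_max."""
    return [k for k in product(range(k_max + 1), repeat=dims)
            if sum(k) <= k_max]

# With two large dimensions and k_max = 2, choosing N = 3 > k_max
# leaves no admissible ell for any k.
found = [ell for k in all_k_up_to(2, 2) for ell in uncertain_indices(k, 3)]
```

For contrast, a truncation order that does not exceed \(k_{\max }\) generally leaves admissible indices: `uncertain_indices((1, 1), 1)` returns the two multi-indices `(0, 1)` and `(1, 0)`.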

Section 1.1 presents the example of a shallow fluid flow on a rotating substrate and, with computer algebra code provided in Appendix A, constructs slow pdes of the form given in (43), as shown in equations (4a) and (5a), for \(N=3\) and \(N=4\), respectively. Appendix A.3 calculates parts of the remainder \(\rho \), such as \({\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle {\mathcal {B}}}\) and \({A_{\vec {n}}\langle Z^{{\vec {0}}},\partial _{\vec {\xi }}^{\vec {n}}{{{\tilde{{\mathcal {W}}}}}}\rangle }\), and shows that, since the order is sufficiently large (\(N>\max (p_i^j,k_{\max })=1\)), we have \({r^{{\vec {0}}}_c(t)+\langle Z^{{\vec {0}}},{{{\tilde{{\mathcal {W}}}}}}\rangle _{\vec {\xi }={\vec {0}}}r_s(t)=0}\). Although the Appendix is written to support the example presented in Sect. 1.1, only Appendix A.1 is specific to this example; the code in Appendices A.2 and A.3 is written in a general format so as to be readily adaptable to a large number of systems.

4 Conclusion

This article further develops a general theory to support practical approximations of slow variations in space. This methodology was initially developed by Roberts [28] for linear systems in one-dimensional space, and then extended by Roberts and Bunder [30] to linear systems in multi-dimensional space. We here extend theoretical support to nonlinear systems of pdes in spatial domains that are large in multiple dimensions. In addition, we substantially simplify the derivations of Roberts [28] and Roberts and Bunder [30] by directly constructing the generating pde (Sect. 3.1), rather than applying an intermediary step which later requires conversion to the generating pde. As is illustrated by our realistic example concerning the flow of a layer of fluid on a rotating substrate, the significant advantages of the theoretical methodology are:

  • the approach is readily applicable to a wide range of systems, as illustrated by the general theory provided in Sects. 2 and 3;

  • higher order pdes are obtained in a straightforward manner by increasing the order N of the Taylor expansion; and

  • every resulting slow pde has a well-defined error, with a derived algebraic form which can be bounded in applications.

In the general theory, we make some assumptions about the structure of the nonlinear microscale system and its dynamics. Assumption 1 requires that the nonlinearity in the original microscale pde should be a sum of products of the unknown field and its derivatives, and Assumption 3 requires centre-stable dynamics. The key requirement for the presented methodology is the persistence of the centre manifold of the linear system (described in Sect. 3.1) when perturbed by nonlinearities and time-dependent forcing, thus justifying the importance of the eigenspace of the linearised system to the full nonlinear microscale system (Sect. 3.2). As other invariant manifolds are often similarly persistent, we expect that the methodology is not restricted to the centre-stable dynamics required by Assumption 3. Indeed, we show that other invariant manifolds are possible with the fluid flow example in Sect. 1.1, which has slow-stable dynamics. Furthermore, this fluid flow example has a more complex nonlinear structure than that required by Assumption 1. The Reduce Algebra code presented in Appendix A is designed for this fluid flow example, but is written so as to be adaptable to other systems, including those with different nonlinear structures.

Future research will aim to further generalise the methodology. Of particular interest are stochastic dynamics [1, 29], the derivation of boundary conditions for the slowly varying model from microscale boundary conditions [22, 27, 33], and non-local operators [i.e., beyond the local operators \({{\mathfrak {L}}_{\vec {k}}\partial ^{\vec {k}}_{\vec {x}}}\) in (6)] [5].