1 Introduction

Differential algebra is concerned with structural properties of systems of ordinary and partial differential equations (ODEs and PDEs) and provides algorithms for their analysis [1, 31]. The properties unveiled by these algorithms correspond to intrinsic properties of the solutions of the system. At the same time, these algorithms isolate equations of interest via elimination, transform systems into normal forms [8], describe singularities [24], allow one to investigate control-theoretic properties [22, 23], or detect the size of solution sets [17, 18, 20].

Usually, PDEs come with additional information about the evaluation of functions. For example, in inverse problems, parameters in differential equations are estimated from data points; in theoretical and numerical methods for PDEs, boundary conditions, i.e., evaluations of functions on manifolds, ensure well-posedness. Data points and boundary conditions have rarely been addressed by algebraic means, with the exception of the modeling of boundary conditions by integro-differential operators [35, 38].

Seemingly disconnected from these algebraic algorithms are Gaussian Processes (GPs) [34], a general regression technique which arises as the limit of large neural networks [29] and generalizes linear (ridge) regression, Kriging, and many spline models. GPs describe probability distributions on function spaces. As such,

(1) they can be conditioned on observations given as data points using Bayes’ rule in closed form, which avoids overfitting,

(2) they admit an extensive dictionary between their mathematical properties and their covariance functions, which allows one to prescribe intended behavior,

(3) they form the maximum entropy prior distribution under the assumption of finite mean and variance of the unknown behavior, and

(4) the class of GPs is closed under various operations like conditioning, marginalization, and linear operators.

They are typically used in applications where data is scarce or expensive to produce, e.g., in active learning [50], biology [11], anomaly detection [3], or engineering [45]. The mean function of the posterior is used for regression and the variance quantifies uncertainty. In that sense, GPs allow one to deal with data, noise, and uncertainty in a way algebraic algorithms usually cannot.

The inclusion of algebraic methods for differential equations into covariance functions of GPs began with divergence-free and curl-free vector fields [25, 40] and was extended to electromagnetic fields [43, 47] and strain fields [14]. These approaches were formalized in [15], building on [39]. Then, [19] used Gröbner bases and worked out the necessity of systems being controllable. Boundary conditions were added to the setup in [21], restricted to simple polynomial boundaries.

In this paper, we develop algebraic algorithms suitable for this framework to deal with analytic boundary conditions. These algorithms might take

(i) parametrizable linear systems of differential equations,

(ii) assumptions on the solutions of the differential equations, e.g., smoothness,

(iii) various forms of boundary conditions specified by analytic functions, and

(iv) (noisy or noiseless) evaluations of functions at finitely many points

as inputs. They yield a probability distribution on the solution space of the differential equation given by a GP, which has the above properties (1)–(4).

Our approach is as follows. We construct a first parametrization of the solution set of the system of differential equations by finding a matrix whose row nullspace is generated by the equations of the given system. We take a second parametrization of the boundary condition. Then, we construct a parametrization of the intersection of the images of these two parametrizations. Algorithmically, this requires Gröbner bases over a Weyl algebra enlarged by various analytic functions, for which we develop the necessary theory and algorithms. After this symbolic approach, numeric algorithms incorporate measurement data into the GP.

In this setup, ODEs are trivial, both algebraically, as parametrizable linear systems of ODEs with constant or variable coefficients are isomorphic to free systems due to the Jacobson form [12], and also from the stochastic point of view, as boundary conditions in ODEs can be modelled by conditioning on data points [16]. Hence, we focus on PDEs.

From the point of view of machine learning, the results of this paper allow one to incorporate information into the covariance structure of a GP prior. This prior is supported by solutions of the differential equation and the boundary conditions. In particular, scarce measurement data can refine and improve this prior knowledge, instead of being needed to learn it.

The contributions of this paper can be summarized as follows:

(a) we develop Gröbner basis algorithms for Weyl algebras over certain rings of analytic functions (cf. Sects. 5 and 6),

(b) we study boundary conditions parametrized by analytic functions, in particular how they constrain GPs (cf. Sect. 7), and

(c) we construct GP priors for solution sets of PDEs including boundary conditions (cf. Sect. 8).

2 Gaussian Processes

A Gaussian Process (GP) \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(\mu ,k)\) defines a probability distribution on the evaluations of functions \(D\rightarrow {\mathbb {R}}^\ell \) where \(D\subseteq {\mathbb {R}}^d\equiv {\mathbb {R}}^{1\times d}\) such that function values \(g(x_1),\ldots ,g(x_n)\) at points \(x_1,\ldots ,x_n\in D\) are jointly (multivariate) Gaussian. A GP g is specified by a mean function \(\mu :D\rightarrow {\mathbb {R}}^\ell :x\mapsto E(g(x))\) and a positive semidefinite covariance function

$$\begin{aligned} k: D\times D\longrightarrow {\mathbb {R}}^{\ell \times \ell }_{\succeq 0}: (x,x') \longmapsto E\left( (g(x)-\mu (x))(g(x')-\mu (x'))^T\right) \text{. } \end{aligned}$$

Any finite set of evaluations of g follows the multivariate Gaussian distribution

$$\begin{aligned} \begin{bmatrix}g(x_1)\\ \vdots \\ g(x_n)\end{bmatrix} \sim \mathcal {N}\left( \begin{bmatrix}\mu (x_1)\\ \vdots \\ \mu (x_n)\end{bmatrix},\begin{bmatrix} k(x_1,x_1) &{} \ldots &{} k(x_1,x_n) \\ \vdots &{} \ddots &{} \vdots \\ k(x_n,x_1) &{} \ldots &{} k(x_n,x_n) \end{bmatrix}\right) . \end{aligned}$$

Now, one knows where a function value g(x) is supposed to be (mean \(\mu (x)\)), how uncertain we are about g(x) (variance \(k(x,x)\)), and how two function values \(g(x_1)\) and \(g(x_2)\) are related (covariance \(k(x_1,x_2)\)). GPs are popular functional priors in Bayesian inference due to their maximum entropy property [13].

Assume the probabilistic regression model \(y=g(x)\) for a GP \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(0,k)\). Normalizing the data to mean zero justifies assuming a prior mean function zero. Conditioning the GP on training data points \((x_i,y_i)\in D\times {\mathbb {R}}^{1\times \ell }\) for \(i=1,\ldots ,n\) by Bayes’ theorem yields the posterior

$$\begin{aligned} p(\;g(x)=y\;|\;g(x_i)=y_i\;)=\frac{p(\;g(x_i)=y_i\;|\;g(x)=y\;)}{p(\;g(x_i)=y_i\;)}\cdot p(\;g(x)=y\;)\text{, } \end{aligned}$$

where i always runs from 1 to n. All of these distributions are multivariate Gaussian. Hence, the posterior \(p(\;g(x)=y\;|\;g(x_i)=y_i\;)\) is again a GP and can be computed in closed form via linear algebra:

$$\begin{aligned} {{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}\Big ( \quad \quad \quad x&\mapsto yk(X,X)^{-1}k(X,x),\nonumber \\ (x,x')&\mapsto k(x,x')-k(x,X)k(X,X)^{-1}k(X,x')\quad \Big ){,} \end{aligned}$$
(1)

where \(y\in {\mathbb {R}}^{1\times \ell n}\) denotes the row vector obtained by concatenating the \(y_i\) and \(k(x,X)\in {\mathbb {R}}^{\ell \times \ell n}\) resp. \(k(X,x)\in {\mathbb {R}}^{\ell n\times \ell }\) resp. \(k(X,X)\in {\mathbb {R}}^{\ell n\times \ell n}_{\succeq 0}\) denote the (covariance) matrices obtained by concatenating the blocks \(k(x,x_j)\) resp. \(k(x_j,x)\) resp. \(k(x_i,x_j)\) to a matrix. In case of noisy data \((y_i)_j\), one adds the noise variance \(var((y_i)_j)\) to the \(((i-1)\ell +j)\)-th diagonal entry of \(k(X,X)\). The Cholesky decomposition improves numerical stability regarding the inversion of the positive definite matrix \(k(X,X)\) [34]. In the posterior (1), the mean function can be used as a regression model and its variance as model uncertainty.
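For concreteness, the following minimal sketch (ours, assuming NumPy; the paper itself contains no such listing) implements the posterior (1) for a scalar (\(\ell =1\)) GP with the squared exponential covariance (3), noisy data, and a Cholesky decomposition as recommended above:

```python
# Minimal sketch of the closed-form GP posterior (1) for ell = 1.
import numpy as np

def k(x, xp):
    """Squared exponential covariance (3) for scalar inputs."""
    return np.exp(-0.5 * (x - xp) ** 2)

def gp_posterior(X, y, noise_var, x_star):
    """Posterior mean and variance at x_star given data (X, y)."""
    K = k(X[:, None], X[None, :]) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)                 # stable inversion of k(X,X)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    k_star = k(x_star, X)                     # cross-covariances k(x,X)
    mean = k_star @ alpha                     # y k(X,X)^{-1} k(X,x)
    v = np.linalg.solve(L, k_star)
    var = k(x_star, x_star) - v @ v           # k(x,x') - k(x,X) k(X,X)^{-1} k(X,x')
    return mean, var

X, y = np.array([-2.0, 2.0]), np.array([-1.0, 1.0])   # the data of Fig. 1 (left)
print(gp_posterior(X, y, 0.1**2, 0.0))
```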

The class of GPs is closed under linear operators under mild assumptions; e.g., the derivative of a GP with differentiable realizations is again a GP.

Fig. 1.

Left: a regression plot (mean and the \(2\sigma \) confidence bands) of a GP with mean zero and squared exponential covariance function conditioned on the points \((-2,-1)\) and \((2,1)\) with noise variance \(0.1^2\). Right: the GP is additionally conditioned on derivative 1 with noise \(0.1^2\) at both data points.


Given a set of functions \(G\subseteq Y^D\) and a map \(b:Y\rightarrow Z\), the pushforward is \( b_*G=\{b\circ f\mid f\in G\}\subseteq Z^D \). The pushforward of a stochastic process \(g:D\rightarrow Y\) by \(b:Y\rightarrow Z\) is defined as

$$\begin{aligned} b_*g:D\longrightarrow Z: d\longmapsto (b\circ g)(d). \end{aligned}$$

Lemma 1

([21, Lemma 2.2]). Let \(\mathcal {F}\) and \({\mathcal {G}}\) be spaces of functions defined on a set \(D\) with product \(\sigma \)-algebra of the function evaluations. Let \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(\mu (x),k(x,x'))\) with realizations in \(\mathcal {F}\) and \(B:\mathcal {F}\rightarrow {\mathcal {G}}\) a linear, measurable operator which commutes with expectation w.r.t. the measure induced by g on \(\mathcal {F}\) and by \(B_*g\) on \({\mathcal {G}}\). Then, the pushforward \(B_*g\) of g under B is a GP with

$$\begin{aligned} B_*g = {{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(B\mu (x),Bk(x,x')(B')^T) \text{, } \end{aligned}$$

where \(B'\) denotes the operation of B on functions with argument \(x'\).

Example 1

Let \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(0,k(x,x'))\) be a GP with realizations (a.s.) in the set \(C^1({\mathbb {R}},{\mathbb {R}})\) of differentiable functions. The pushforward GP

$$\begin{aligned} \begin{bmatrix}\frac{\partial }{\partial x}\end{bmatrix}_*g := {{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}\left( 0,\frac{\partial ^2}{\partial x\partial x'}k(x,x')\right) \end{aligned}$$

describes derivatives of the GP g [6, §5.2]. The one-argument derivative \(\frac{\partial }{\partial x}k(x,x')\) yields the cross-covariance between a function evaluation \(g(x')\) of g at \(x'\in {\mathbb {R}}\) and the derivative \((\begin{bmatrix}\frac{\partial }{\partial x}\end{bmatrix}_*g)(x)\) evaluated at \(x\in {\mathbb {R}}\). We use this to include data of derivatives into a model in Fig. 1.
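The following sketch (ours) numerically reproduces the construction behind Fig. 1 (right): it conditions the GP jointly on function values and derivative values, using the closed-form derivatives of \(k(x,x')=\exp (-\frac{1}{2}(x-x')^2)\) as (cross-)covariances:

```python
# Conditioning a squared exponential GP on function and derivative values.
import numpy as np

k    = lambda x, xp: np.exp(-0.5 * (x - xp) ** 2)      # Cov(g(x),  g(x'))
k_d  = lambda x, xp: (x - xp) * k(x, xp)               # Cov(g(x),  g'(x'))
d_k  = lambda x, xp: -(x - xp) * k(x, xp)              # Cov(g'(x), g(x'))
d_kd = lambda x, xp: (1 - (x - xp) ** 2) * k(x, xp)    # Cov(g'(x), g'(x'))

X   = np.array([-2.0, 2.0])
obs = np.array([-1.0, 1.0, 1.0, 1.0])   # g(-2) = -1, g(2) = 1, g'(+-2) = 1

# Joint covariance of (g(X), g'(X)) with noise variance 0.1^2 as in Fig. 1.
K = np.block([[k  (X[:, None], X[None, :]), k_d (X[:, None], X[None, :])],
              [d_k(X[:, None], X[None, :]), d_kd(X[:, None], X[None, :])]])
K += 0.1 ** 2 * np.eye(4)

def posterior_mean(x):                   # posterior mean of g at x
    k_star = np.concatenate([k(x, X), k_d(x, X)])
    return k_star @ np.linalg.solve(K, obs)

print(posterior_mean(0.0))
```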

3 Solution Sets of Operator Equations

This section discusses how GPs describe the real vector space \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\), a candidate set of solutions of linear differential equations, and how such GPs interact with linear operators. Assume that \(D\subset {\mathbb {R}}^d\) is compact and \(\mathcal {F}\) is endowed with the usual Fréchet topology generated by the separating family

$$\begin{aligned} \Vert f\Vert _{a}:= \sup _{\begin{array}{c} i\in {\mathbb {Z}}_{\ge 0}^d\\ |i|\le a \end{array}} \sup _{x\in D}\ \left| \frac{\partial ^{|i|}}{\partial x^i}f(x)\right| \end{aligned}$$
(2)

of seminorms for all \(a\in {\mathbb {Z}}_{\ge 0}\), where \(i=(i_1,\ldots ,i_d)\in {\mathbb {Z}}_{\ge 0}^d\) is a multi-index with \(|i|=i_1+\ldots +i_d\). The squared exponential covariance function

$$\begin{aligned} k_\mathcal {F}:{\mathbb {R}}^d\times {\mathbb {R}}^d\longrightarrow {\mathbb {R}}:(x_i,x_j)\longmapsto \exp \left( -\frac{1}{2}\sum _{a=1}^d(x_{i,a}-x_{j,a})^2\right) \end{aligned}$$
(3)

induces an adapted GP prior in \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\).

Proposition 1

The scalar GP \(g_\mathcal {F}={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(0,k_\mathcal {F})\) has realizations dense (a.s.) in \(\mathcal {F}\) with respect to the Fréchet topology defined by Eq. (2).

Proof

We show that the realizations of \(g_\mathcal {F}\) are densely contained in \(\mathcal {F}\) in three steps: first, the realizations are contained in \(\mathcal {F}\), i.e., smooth; second, the elements of the reproducing kernel Hilbert space (RKHS) \({\mathcal {H}}(g_\mathcal {F})\) of the GP \(g_\mathcal {F}\) are realizations; and third, the RKHS \({\mathcal {H}}(g_\mathcal {F})\) is dense in \(\mathcal {F}\).

First, we show that the realizations of \(g_\mathcal {F}\) lie in \(\mathcal {F}\). They are continuously differentiable, as \(k_\mathcal {F}\) is twice continuously differentiable [6, (9.2.2)]. We continue inductively, as the covariance \(\frac{\partial ^2}{\partial x\partial x'}k_\mathcal {F}(x,x')\) of the derivative of \(g_\mathcal {F}\) is again smooth.

For the second step, we note that \(C^\infty (D,{\mathbb {R}})\) is a Radon space as \(D\) is compact; hence \(g_\mathcal {F}\) induces a Radon measure on \(\mathcal {F}\). For any Radon measure, \({\mathcal {H}}(g_\mathcal {F})\) is contained in the topological support of the measure induced by \(g_\mathcal {F}\) by [4, Thm. 3.6.1]. For this, \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\) needs to be locally convex, which it is, being Fréchet.

For the third step, by [41, Prop. 4], \({\mathcal {H}}(g_\mathcal {F})\) is continuously contained in \(\mathcal {F}\) and dense by [41, Thm. 12, Prop. 42] or [41, after proof of Cor. 38].   \(\square \)

The following three \({\mathbb {R}}\)-algebras R model linear operator equations by making \(\mathcal {F}\) a left R-module. Sections 5 and 6 introduce Gröbner bases for such rings.

Example 2

The polynomial ring \(R={\mathbb {R}}[\partial _{x_1},\ldots ,\partial _{x_d}]\) models linear PDEs with constant coefficients, where \(\partial _{x_i}\) acts on \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\) via partial derivative with respect to \(x_i\).

Example 3

Let \(f_1,\ldots ,f_n\in \mathcal {F}\) be functions. The ring \(R={\mathbb {R}}[f_1,\ldots ,f_n]\) is commutative and models boundary conditions by multiplication, see Sect. 7.

Example 4

Let \(F\subseteq \mathcal {F}\) be an \({\mathbb {R}}\)-algebra closed under partial derivatives. To combine linear differential equations with boundary conditions, consider the Weyl algebra \(R={\mathbb {R}}[F]\langle \partial _{x_1},\ldots ,\partial _{x_d}\rangle \). The non-commutative relation \(\partial _{x_i}f=f\partial _{x_i}+\frac{\partial f}{\partial x_i}\) represents the product rule of differentiation for \(f\in F\) and \(1\le i\le d\).
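As a quick sanity check (ours, using SymPy, with the hypothetical choice \(f=\exp (x^2)\in F\)), both sides of the Ore relation act identically on a test function:

```python
# The Ore relation d_x f = f d_x + df/dx encodes the product rule.
import sympy as sp

x = sp.symbols('x')
f = sp.exp(x**2)                     # an element of F closed under d/dx
g = sp.Function('g')(x)              # generic test function

lhs = sp.diff(f * g, x)                          # action of d_x after f
rhs = f * sp.diff(g, x) + sp.diff(f, x) * g      # action of f d_x + df/dx
print(sp.simplify(lhs - rhs))                    # 0
```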

Operators defined over these three rings satisfy the assumptions of Lemma 1: multiplication commutes with expectation, and the dominated convergence theorem implies that expectation commutes with derivatives, as realizations of \(g_\mathcal {F}\) are continuously differentiable. Furthermore, these rings act continuously on \(\mathcal {F}\): the Fréchet topology makes differentiation continuous by construction, and multiplication by elements of \(\mathcal {F}\) is bounded as D is compact, which implies continuity in the Fréchet space \(\mathcal {F}\). In particular, we have the following:

Corollary 1

Let \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\) be the space of smooth functions defined on a compact set \(D\subset {\mathbb {R}}^d\). Let \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(\mu (x),k(x,x'))\) with realizations in \(\mathcal {F}^{\ell ''}\) and \(B:\mathcal {F}^{\ell ''}\rightarrow \mathcal {F}^\ell \) a linear operator over one of the operator rings in Examples 2, 3, or 4. Then, the pushforward GP \(B_*g\) is again Gaussian with

$$\begin{aligned} B_*g = {{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(B\mu (x),Bk(x,x')(B')^T) \text{, } \end{aligned}$$

where \(B'\) denotes the operation of B on functions with argument \(x'\).

4 Parametrizations

We consider solution sets of linear differential equations, how to parametrize them by a suitable matrix B, and how to thereby describe them by a GP \(B_*g\). Let \(R\) be one of the rings from the previous section, \(\mathcal {F}\) the left \(R\)-module \(C^{\infty }(D, {\mathbb {R}})\), and \(A\in R^{\ell '\times \ell }\). Define the solution set \({{\,\mathrm{sol}\,}}_\mathcal {F}(A):=\{f\in \mathcal {F}^{\ell \times 1}\mid Af=0\}\) of A. We say that a GP is in a function space if its realizations are a.s. contained in said space. We first describe the interplay of GPs and solution sets of operators.

Lemma 2

([19, Lemma 2.2]). Let \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(\mu ,k)\) be a GP in \(\mathcal {F}^{\ell \times 1}\). Then g is a GP in the solution set \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) of \(A\in R^{\ell '\times \ell }\) if and only if both \(\mu \) is contained in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) and \(A_*(g-\mu )\) is the constant zero process.

This lemma motivates how to construct GPs with realizations in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\): find a \(B\in R^{\ell \times \ell ''}\) with \(AB=0\) [15]. Then, taking any GP \(g={{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(0,k)\) in \(\mathcal {F}^{\ell ''\times 1}\), the realizations of \(B_*g\) are (possibly strictly) contained in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\), as \(A_*(B_*g)=(AB)_*g=0_*g=0\). One prefers B large enough that \(B_*g\) approximates all solutions in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\), i.e., the realizations of \(B_*g\) should be dense in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\). Call \(B\in R^{\ell \times \ell ''}\) a parametrization of \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) if \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)=B\mathcal {F}^{\ell ''\times 1}\). Such a parametrization does not always exist, e.g., for the matrix \(A=\begin{bmatrix}\partial _{x_1}\end{bmatrix}\).

Proposition 2

([21, Proposition 3.5]). Let \(B\in R^{\ell \times \ell ''}\) be a parametrization of \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\). Let \(g_\mathcal {F}^{\ell ''\times 1}\) be the GP of \(\ell ''\) i.i.d. copies of \(g_\mathcal {F}\), the GP with squared exponential covariance \(k_\mathcal {F}\) (3). Then, \(B_*g_\mathcal {F}^{\ell ''\times 1}\) has realizations dense in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\).

We summarize how to algorithmically decide whether a parametrization exists and how to compute it in the positive case. Computations directly over the space of functions \(\mathcal {F}\) are infeasible. Hence, we compute over R instead. Inferring results over \(\mathcal {F}\) is possible once \(\mathcal {F}\) is an injective R-module, i.e., \({{\,\mathrm{Hom}\,}}_R(-,\mathcal {F})\) is exact. Luckily, for PDEs with constant coefficients we have the following:

Theorem 1

([7, 26] [31, §(54)]). Let \(R={\mathbb {R}}[\partial _{x_1},\ldots ,\partial _{x_d}]\) be as in Example 2 and \(D\subset {\mathbb {R}}^d\) convex. Then, \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\) is an injective R-module.

With this in mind, we recall the construction of parametrizations.

Theorem 2

([49, Thm. 2] [31, §7.(24)] [5, 32, 33, 37]). Let \(R\) be a ring and \(\mathcal {F}\) an injective left \(R\)-module. Let \(A\in R^{\ell '\times \ell }\). Let B be the right nullspace of A and \(A'\) the left nullspace of B. Then \({{\,\mathrm{sol}\,}}_\mathcal {F}(A')\) is the largest subset of \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) that is parametrizable, B parametrizes \({{\,\mathrm{sol}\,}}_\mathcal {F}(A')\), and \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) is parametrizable if and only if the rows of A and \(A'\) generate the same row module, i.e., if all rows of \(A'\) are contained in the row module generated by A.

Gröbner bases make Theorem 2 effective, as they allow one to compute the right nullspace B of A and the left nullspace \(A'\) of B, and to decide whether the rows of \(A'\) are contained in the row module generated by A over R. We have the following criterion.

Theorem 3

([31, §7.(21)]). A system \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) is parametrizable if and only if it is controllable. If \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\) is not parametrizable, then the solution set \({{\,\mathrm{sol}\,}}_\mathcal {F}(A')\) is the subset of controllable behaviors in \({{\,\mathrm{sol}\,}}_\mathcal {F}(A)\), where \(A'\) is defined as in Theorem 2.

Solution sets of differential equations and polynomial boundary conditions can be intersected [21].

Theorem 4

([21, Theorem 5.2]). Let \(B_1\in R^{\ell \times \ell _1'}\) and \(B_2\in R^{\ell \times \ell _2'}\). Denote by \(C:=\begin{bmatrix} C_1 \\ C_2\end{bmatrix}\in R^{(\ell _1'+\ell _2')\times m}\) the right nullspace of the matrix \(B:=\begin{bmatrix} B_1&B_2\end{bmatrix}\in R^{\ell \times (\ell _1'+\ell _2')}\). Then \(B_1C_1=-B_2C_2\) parametrizes the intersection \(B_1\mathcal {F}^{\ell _1'}\cap B_2\mathcal {F}^{\ell _2'}\).

Here, \(B_1\) might be a matrix of differential operators and \(B_2\) a matrix of polynomial functions, and we consider both matrices over a common ring R.

5 Rings of Differential Operators over Differential Algebras

We have considered parametrizations by differential operators and in Sect. 7 we consider parametrizations of boundary conditions by analytic functions. For their combination in Sect. 8, we now extend classical Gröbner and Janet bases.

Let \(D\subset {\mathbb {R}}^d\) be connected and denote by \(\delta _1, \ldots , \delta _d\) the commuting derivations in the coordinate directions of \({\mathbb {R}}^d\). Let \(K\) be a differential algebra over the real numbers \({\mathbb {R}}\) generated by analytic functions \(f_1,\ldots ,f_r:D\rightarrow {\mathbb {R}}\). For algorithmic reasons assume that \(K\) is finitely presented as a differential algebra over \({\mathbb {R}}\) as

$$ K\, = \, {\mathbb {R}}\{ f_1, \ldots , f_r \} \, \cong \, {\mathbb {R}}\{ F_1, \ldots , F_r \} / P, $$

where \(P\) is a prime differential ideal of \({\mathbb {R}}\{ F_1, \ldots , F_r \}\), generated by

$$\begin{aligned} \{ \, \delta _j F_i - g_{i,j} \mid i = 1, \ldots , r, \, j = 1, \ldots , d \, \}, \end{aligned}$$
(4)

where \(g_{i,j} \in {\mathbb {R}}[F_1, \ldots , F_r]\) are (non-differential) polynomials in \(F_1, \ldots , F_r\), and the above isomorphism is given by \(f_i\mapsto F_i + P\). In particular, the generators \(f_1, \ldots , f_r\) of \(K\) are algebraically independent over \({\mathbb {R}}\). Then \(K\) is isomorphic to \({\mathbb {R}}[f_1, \ldots , f_r]\) as an \({\mathbb {R}}\)-algebra, and \(K\) is Noetherian, factorial, and a GCD domain.

Example 5

For the differential algebra \(K= {\mathbb {Q}}\{ x, y, \exp (x^2+y^2-1) \}\) with derivations \(\delta _1 = \partial /\partial x\), \(\delta _2 = \partial /\partial y\) we have \(K\cong {\mathbb {Q}}\{ F_1, F_2, F_3 \} / P\), where

$$\begin{aligned} \delta _1 F_1 - 1,&\delta _2 F_1,&\delta _1 F_2,&\delta _2 F_2 - 1,&\delta _1 F_3 - 2 F_1 F_3,&\delta _2 F_3 - 2 F_2 F_3 \end{aligned}$$

generate the prime differential ideal P such that

$$\begin{aligned} {\mathbb {Q}}\{ F_1, F_2, F_3 \} \longrightarrow K: F_1 \longmapsto x, \, F_2 \longmapsto y, \, F_3 \longmapsto \exp (x^2+y^2-1) \end{aligned}$$

is an epimorphism of differential algebras over \({\mathbb {Q}}\) mapping precisely \(P\) to zero.

Definition 1

Let the ring of differential operators \(R= K\langle \partial _1, \ldots , \partial _d \rangle \) be the iterated Ore extension of \(K\) defined by

$$ \begin{array}{rclll} \partial _i \, a &{} = &{} a \, \partial _i + \delta _i(a), \quad &{} \quad a \in K, &{} \quad i = 1, \ldots , d,\\ \partial _i \, \partial _j &{} = &{} \partial _j \, \partial _i, &{} &{} \quad i, j = 1, \ldots , d. \end{array} $$

Remark 1

The ring \(R\) is (left) Noetherian, because \(K\) is Noetherian (cf., e.g., [28, Thm. 1.2.9 (iv)]). Moreover, \(R\) has the left Ore property, i.e., every pair of non-zero elements of \(R\) has a non-zero common left multiple [28, Thm. 2.1.15], which, in particular, implies the existence of a skew field of fractions of \(R\).

We define the set of monomials of \(R\) as

$$ {{\,\mathrm{Mon}\,}}(R) = \{ \, f_1^{\alpha _1} \ldots f_r^{\alpha _r} \, \partial _1^{\beta _1} \ldots \partial _d^{\beta _d} \mid \alpha _1, \ldots , \alpha _r, \beta _1, \ldots , \beta _d \in {\mathbb {Z}}_{\ge 0} \, \}. $$

It is a basis of \(R\) as an \({\mathbb {R}}\)-vector space: every \(p \in R\) has a unique representation

$$\begin{aligned} p \, = \, \sum _{m \in {{\,\mathrm{Mon}\,}}(R)} c_m \, m, \qquad \qquad (*) \end{aligned}$$

where \(c_m \in {\mathbb {R}}\) and only finitely many \(c_m\) are non-zero.

A monomial ordering < on \(R\) is a total ordering on \({{\,\mathrm{Mon}\,}}(R)\) satisfying

$$ \begin{array}{ll} f_1^0 \ldots f_r^0 \, \partial _1^0 \ldots \partial _d^0 = 1< m &{} \quad \text{ for } \text{ all } 1 \ne m \in {{\,\mathrm{Mon}\,}}(R),\\ m_1< m_2 \quad \Rightarrow \quad f_i \, m_1< f_i \, m_2 &{} \quad \text{ for } \text{ all } m_1, m_2 \in {{\,\mathrm{Mon}\,}}(R), \, i = 1, \ldots , r,\\ m_1< m_2 \quad \Rightarrow \quad m_1 \, \partial _j < m_2 \, \partial _j &{} \quad \text{ for } \text{ all } m_1, m_2 \in {{\,\mathrm{Mon}\,}}(R), \, j = 1, \ldots , d. \end{array} $$

For every \(0 \ne p \in R\) the <-greatest monomial m occurring with non-zero coefficient \(c_m\) in the representation (*) of p is called the leading monomial of p and is denoted by \({{\,\mathrm{lm}\,}}(p)\). Its coefficient \(c_m\) is called the leading coefficient of p and is denoted by \({{\,\mathrm{lc}\,}}(p)\). For a subset S of \(R\) we let \({{\,\mathrm{lm}\,}}(S) = \{ \, {{\,\mathrm{lm}\,}}(s) \mid 0 \ne s \in S \, \}\).

Example 6

The weighted degree-reverse-lexicographical ordering < with weights \(w=(w_1, \ldots , w_{r+d})\in {\mathbb {Q}}_{>0}^{r+d}\) (weighted deg-rev-lex) is defined by

$$\begin{aligned} f_1^{\alpha _1} \ldots f_r^{\alpha _r} \partial _1^{\alpha _{r+1}} \ldots \partial _d^{\alpha _{r+d}} < f_1^{\alpha '_1} \ldots f_r^{\alpha '_r} \partial _1^{\alpha '_{r+1}} \ldots \partial _d^{\alpha '_{r+d}} \\ \iff \quad \Bigg ({-}\sum _{i=1}^{r+d} w_i \alpha _i, \alpha _{r+d}, \ldots , \alpha _1\Bigg ) >_{\mathrm{lex}} \Bigg ({-}\sum _{i=1}^{r+d} w_i \alpha '_i, \alpha '_{r+d}, \ldots , \alpha '_1\Bigg ), \end{aligned}$$

where \(\alpha _i, \alpha '_i\in {\mathbb {Z}}_{\ge 0}\) and \(>_{\mathrm{lex}}\) compares tuples lexicographically.
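As an illustration, the following few lines of Python (our transcription, not from the paper) implement this comparison literally on exponent vectors \((\alpha _1,\ldots ,\alpha _{r+d})\):

```python
# Weighted deg-rev-lex of Example 6 on exponent vectors of length r + d.
def degrevlex_key(alpha, w):
    """Key such that m1 < m2 iff key(m1) > key(m2) lexicographically."""
    wdeg = sum(wi * ai for wi, ai in zip(w, alpha))
    return (-wdeg, *reversed(alpha))

def less(alpha1, alpha2, w):
    """The ordering < of Example 6."""
    return degrevlex_key(alpha1, w) > degrevlex_key(alpha2, w)

# r = 2, d = 1, unit weights: f1*d1 < f2^2 (equal degree, rev-lex tie-break).
print(less((1, 0, 1), (0, 2, 0), (1, 1, 1)))   # True
```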

Example 7

We let the elimination ordering < on \(R\) (eliminating \(\partial _1\), ..., \(\partial _d\)) be

$$ \begin{array}{ll} &{} f_1^{\alpha _1} \ldots f_r^{\alpha _r} \, \partial _1^{\beta _1} \ldots \partial _d^{\beta _d} \, < \, f_1^{\alpha '_1} \ldots f_r^{\alpha '_r} \, \partial _1^{\beta '_1} \ldots \partial _d^{\beta '_d}\\[1em] \iff &{} \displaystyle \Big ( \, \partial _1^{\beta _1} \ldots \partial _d^{\beta _d} \, \prec _{\partial } \, \partial _1^{\beta '_1} \ldots \partial _d^{\beta '_d} \qquad \text{ or }\\[1em] &{} \displaystyle \quad \partial _1^{\beta _1} \ldots \partial _d^{\beta _d} \, = \, \partial _1^{\beta '_1} \ldots \partial _d^{\beta '_d} \quad \text{ and } \quad f_1^{\alpha _1} \ldots f_r^{\alpha _r} \, \prec _f \, f_1^{\alpha '_1} \ldots f_r^{\alpha '_r} \, \Big ), \end{array} $$

where \(\alpha _i\), \(\alpha '_i\), \(\beta _j\), \(\beta '_j \in {\mathbb {Z}}_{\ge 0}\) and where \(\prec _{\partial }\) and \(\prec _f\) are the deg-rev-lex ordering on the polynomial algebras \({\mathbb {Q}}[\partial _1, \ldots , \partial _d]\) and \({\mathbb {Q}}[f_1, \ldots , f_r]\), respectively.

Assumption 1

The monomial ordering < on \(R\) is chosen such that the leading monomial of

$$ \partial _j \, f_i \, = \, f_i \, \partial _j + \delta _j(f_i) \, = \, f_i \, \partial _j + g_{i,j} $$

with respect to < is \(f_i \, \partial _j\), for all \(i = 1, \ldots , r\) and \(j = 1, \ldots , d\). (Recall that \(f_i \, \partial _j + g_{i,j}\) is the representation (*) of \(\partial _j \, f_i\) taking the generators (4) of the prime differential ideal \(P\) into account.)

In what follows, we make Assumption 1, which is met if < is a degree-reverse-lexicographical ordering with weights \((v_1, \ldots , v_r, w_1, \ldots , w_d)\) satisfying

$$ w_j \, \ge \, \max _{i=1, \ldots , r} \left( \, \sum _{k=1}^r v_k \deg _{f_k}(g_{i,j}) - v_i \, \right) \qquad \text{ for } \text{ all } j = 1, \ldots , d, $$

or if < is an elimination ordering as in Example 7.

Before introducing Janet bases for left ideals of \(R\), we recall the concept of Janet division, which we formulate for ideals of the free commutative semigroup \(({\mathbb {Z}}_{\ge 0})^{r+d}\) in our context. Note that if \(I\) is a non-zero left ideal of \(R\), then the exponent vectors \((\alpha _1, \ldots , \alpha _r, \beta _1, \ldots , \beta _d)\) of all elements of \({{\,\mathrm{lm}\,}}(I)\) form an ideal of \(({\mathbb {Z}}_{\ge 0})^{r+d}\), due to the definition of a monomial ordering and Assumption 1. The bijection between \({{\,\mathrm{Mon}\,}}(R)\) and \(({\mathbb {Z}}_{\ge 0})^{r+d}\) can be chosen, e.g., as

$$ \varepsilon :{{\,\mathrm{Mon}\,}}(R) \longrightarrow ({\mathbb {Z}}_{\ge 0})^{r+d}: f_1^{\alpha _1} \ldots f_r^{\alpha _r} \, \partial _1^{\beta _1} \ldots \partial _d^{\beta _d} \longmapsto (\beta _1, \ldots , \beta _d, \alpha _1, \ldots , \alpha _r), $$

which is the bijection we usually work with.

Recall that every ideal of \(({\mathbb {Z}}_{\ge 0})^{r+d}\) is finitely generated; moreover, it has a unique minimal generating set. For \(k \in \{ 1, \ldots , r+d \}\) we denote by \({\textbf {1}}_k\) the multi-index with 1 in position k and 0 elsewhere. Following M. Janet (cf., e.g., [36]) we make the following definition in terms of exponent vectors.

Definition 2

Let \(A \subset ({\mathbb {Z}}_{\ge 0})^{r+d}\) be finite and \(\alpha = (\alpha _1, \ldots , \alpha _{r+d}) \in A\). Then \(\varepsilon ^{-1}({\textbf {1}}_k)\) is said to be multiplicative for the monomial \(\varepsilon ^{-1}(\alpha )\) if and only if

$$ \alpha _k \, = \, \max \{ \, \alpha '_k \mid (\alpha '_1, \ldots , \alpha '_{r+d}) \in A \text{ with } \alpha '_1 = \alpha _1, \, \ldots , \, \alpha '_{k-1} = \alpha _{k-1} \, \}. $$

Let \(M \subset {{\,\mathrm{Mon}\,}}(R)\) be finite. Then for every \(m \in M\) we obtain a partition \(\mu (m, M) \uplus \overline{\mu }(m, M)\) of \(\{ f_1, \ldots , f_r, \partial _1, \ldots , \partial _d \}\), where each element of \(\mu (m, M)\) is multiplicative for m and each element of \(\overline{\mu }(m, M)\) is non-multiplicative for m.

Example 8

Let \(r = 2\), \(d = 1\), \(M = \{ \, f_1 \, f_2^2, \, f_1^2 \, f_2, \, f_2 \, \partial _1^2, \, f_1 \, \partial _1^2 \, \}\). Using the above bijection \(\varepsilon \) we obtain

$$ \begin{array}{l} \mu (f_1 \, f_2^2, M) = \{ \, f_2 \, \}, \qquad \mu (f_1^2 \, f_2, M) = \{ \, f_1, f_2 \, \},\\ \mu (f_2 \, \partial _1^2, M) = \{ \, \partial _1, f_2 \, \}, \qquad \mu (f_1 \, \partial _1^2, M) = \{ \, \partial _1, f_1, f_2 \, \}. \end{array} $$
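The following sketch (ours) implements Definition 2 on exponent vectors and reproduces the multiplicative variables of Example 8 under the bijection \(\varepsilon \) (derivatives first, then the \(f_i\)):

```python
# Multiplicative variables in the sense of Janet (Definition 2).
def multiplicative(alpha, A):
    """Indices k whose variable is multiplicative for alpha within A."""
    mult = []
    for k in range(len(alpha)):
        cls = [b for b in A if b[:k] == alpha[:k]]    # same first k entries
        if alpha[k] == max(b[k] for b in cls):
            mult.append(k)
    return mult

# Example 8 with r = 2, d = 1; eps maps f1^a f2^b d1^c to (c, a, b).
names = ['d1', 'f1', 'f2']
M = {'f1*f2^2': (0, 1, 2), 'f1^2*f2': (0, 2, 1),
     'f2*d1^2': (2, 0, 1), 'f1*d1^2': (2, 1, 0)}
for mon, alpha in M.items():
    print(mon, [names[k] for k in multiplicative(alpha, list(M.values()))])
# f1*f2^2 ['f2'], f1^2*f2 ['f1','f2'], f2*d1^2 ['d1','f2'], f1*d1^2 ['d1','f1','f2']
```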

Definition 3

Let \(M \subset {{\,\mathrm{Mon}\,}}(R)\) be finite. We define two supersets of M in \({{\,\mathrm{Mon}\,}}(R)\) as follows:

$$ \varepsilon (\langle \, M \, \rangle ) \, = \, \bigcup _{m \in M} \Big ( \varepsilon (m) + ({\mathbb {Z}}_{\ge 0})^{r+d} \Big ), \qquad \varepsilon ([ \, M \, ]) \, = \, \biguplus _{m \in M} \Big ( \varepsilon (m) + \textstyle \sum _{v \in \mu (m, M)} {\mathbb {Z}}_{\ge 0} \, \varepsilon (v) \Big ), $$

where the latter union is disjoint by construction of Janet division. The set M of monomials is said to be Janet complete if \([ \, M \, ] = \langle \, M \, \rangle \).

Any finite subset M of \({{\,\mathrm{Mon}\,}}(R)\) has a unique smallest (finite) Janet complete superset, which we call the Janet completion of M [36, Subsect. 2.1.1].

Definition 4

Let \(I\) be a non-zero left ideal of \(R\). Using the notation of Definition 3, a finite generating set \(G \subset R\setminus \{ 0 \}\) for \(I\) is called a Gröbner basis for \(I\) with respect to the monomial ordering < if \(\langle \, {{\,\mathrm{lm}\,}}(G) \, \rangle = {{\,\mathrm{lm}\,}}(I)\). If moreover, \({{\,\mathrm{lm}\,}}(G)\) is Janet complete, i.e., \([ \, {{\,\mathrm{lm}\,}}(G) \, ] = \langle \, {{\,\mathrm{lm}\,}}(G) \, \rangle = {{\,\mathrm{lm}\,}}(I)\), then G is called a Janet basis for \(I\) with respect to <.

Assumption 1 facilitates a multivariate polynomial division in \(R\).

Remark 2

Suppose \(L \subset R\setminus \{ 0 \}\) is finite and \({{\,\mathrm{lm}\,}}(L)\) is Janet complete. Let \(p_1 \in R\setminus \{ 0 \}\). If \({{\,\mathrm{lm}\,}}(p_1) \in [ \, {{\,\mathrm{lm}\,}}(L) \, ]\), then there exists a unique \(p_2 \in L\) such that

$$ {{\,\mathrm{lm}\,}}(p_1) \, = \, f_1^{\phi _1} \ldots f_r^{\phi _r} \, {{\,\mathrm{lm}\,}}(p_2) \, \partial _1^{\psi _1} \ldots \partial _d^{\psi _d} $$

for certain \(\phi _i\), \(\psi _j \in {\mathbb {Z}}_{\ge 0}\), where \(\phi _i = 0\) if \(f_i \not \in \mu ({{\,\mathrm{lm}\,}}(p_2), {{\,\mathrm{lm}\,}}(L))\) and \(\psi _j = 0\) if \(\partial _j \not \in \mu ({{\,\mathrm{lm}\,}}(p_2), {{\,\mathrm{lm}\,}}(L))\). Therefore, subtracting \({{\,\mathrm{lc}\,}}(p_1) \, f_1^{\phi _1} \ldots f_r^{\phi _r} \, \partial _1^{\psi _1} \ldots \partial _d^{\psi _d} \, p_2\) from \({{\,\mathrm{lc}\,}}(p_2) \, p_1\) yields either zero or an element of \(R\) whose leading monomial is less than \({{\,\mathrm{lm}\,}}(p_1)\). Since a monomial ordering < does not admit infinitely descending chains of monomials, this reduction procedure always terminates.

Iterated reduction, as just defined, modulo a Gröbner basis or a Janet basis for the left ideal \(I\) allows one to decide membership in \(I\).

Proposition 3

Let G be a Gröbner basis or a Janet basis for the left ideal \(I\) of \(R\) with respect to any monomial ordering <, and let \(p \in R\). Then we have \(p \in I\) if and only if the remainder of reduction of p modulo G is zero.

Remark 3

Given a finite generating set L for a non-zero left ideal \(I\) of \(R\) and a monomial ordering < as above, a Janet basis for \(I\) with respect to < can be computed in finitely many steps [36]. After a preliminary pairwise reduction of the elements of L, ensuring that their leading monomials are pairwise different and that \(\varepsilon ({{\,\mathrm{lm}\,}}(L))\) is the unique minimal generating set of the ideal of \(({\mathbb {Z}}_{\ge 0})^{r+d}\) it generates, multiplicative and non-multiplicative variables are determined for each leading monomial (with respect to \({{\,\mathrm{lm}\,}}(L)\)) and L is replaced by its Janet completion. Reduction of left multiples of elements of L by non-multiplicative variables may yield non-zero remainders in \(I\). Augmenting L by such remainders strictly enlarges the ideal \(\varepsilon ({{\,\mathrm{lm}\,}}(L))\) of \(({\mathbb {Z}}_{\ge 0})^{r+d}\). Since every ascending chain of such ideals becomes stationary after finitely many steps, iterating these steps yields a generating set G for \(I\) whose left multiples by non-multiplicative variables reduce to zero modulo G, i.e., a Janet basis for \(I\) with respect to <.

6 Module-Theoretic Constructions

The techniques of Sect. 5 can be extended to effectively deal with finitely presented left (and right) \(R\)-modules and module homomorphisms between them.

Let \(R\) be as in the previous section and \(q \in {\mathbb {N}}\). We choose the standard basis \(e_1\), ..., \(e_q\) of the free left \(R\)-module \(R^{1 \times q}\) and define the set of monomials

$$ {{\,\mathrm{Mon}\,}}(R^{1 \times q}) = \{ \, f_1^{\alpha _1} \ldots f_r^{\alpha _r} \, \partial _1^{\beta _1} \ldots \partial _d^{\beta _d} \, e_k \mid \alpha _i, \beta _j \in {\mathbb {Z}}_{\ge 0}, \, k = 1, \ldots , q \, \}. $$

Then every element of \(R^{1 \times q}\) has a unique representation as in (*), where \({{\,\mathrm{Mon}\,}}(R)\) is replaced by \({{\,\mathrm{Mon}\,}}(R^{1 \times q})\). By generalizing the notion of monomial ordering defined in Sect. 5 to total orderings on \({{\,\mathrm{Mon}\,}}(R^{1 \times q})\), one can extend the reduction procedure described in Remark 2, and indeed any algorithm computing Gröbner or Janet bases for left ideals of \(R\), to one that computes Gröbner or Janet bases for submodules \(R^{1 \times p} A\) of \(R^{1 \times q}\), where \(A \in R^{p \times q}\). In particular, membership in such a submodule can be decided by reduction, and therefore computations with residue classes in \(R^{1 \times q} / R^{1 \times p} A\) can be performed effectively.

We recall some relevant monomial orderings on \(R^{1 \times q}\).

Example 9

A monomial ordering \(\prec \) on \(R\) can be extended to monomial orderings < on \(R^{1 \times q}\) in different ways, for example, by defining

$$ m_1 \, e_k \, < \, m_2 \, e_l \quad \iff \quad \Big ( \, m_1 \prec m_2 \quad \text{ or } \quad \big ( \, m_1 = m_2 \quad \text{ and } \quad k > l \, \big ) \, \Big ) $$

(“term-over-position”), or by defining

$$ m_1 \, e_k \, < \, m_2 \, e_l \qquad \iff \qquad \Big ( \, k > l \quad \text{ or } \quad \big ( \, k = l \quad \text{ and } \quad m_1 \prec m_2 \, \big ) \, \Big ) $$

(“position-over-term”), where \(m_1\), \(m_2 \in {{\,\mathrm{Mon}\,}}(R)\) and k, \(l \in \{ 1, \ldots , q \}\).

Example 10

Let \(s \in \{ 1, \ldots , q-1 \}\) and \(\prec _1\), \(\prec _2\) be monomial orderings on \(R^{1 \times s}\) and \(R^{1 \times (q-s)}\), with standard bases \(e_1\), ..., \(e_s\) and \(e_{s+1}\), ..., \(e_q\), respectively. A monomial ordering < on \(R^{1 \times q}\) eliminating \(e_1\), ..., \(e_s\) is defined by

$$ \begin{array}{rcl} m_1 \, e_k \,< \, m_2 \, e_l &{} \iff &{} \Big ( \, l \le s < k \quad \, \text{ or }\\[1em] &{} &{} \quad \big ( \, k \le s \quad \text{ and } \quad l \le s \quad \text{ and } \quad m_1 \, e_k \, \prec _1 \, m_2 \, e_l \, \big ) \quad \, \text{ or }\\[1em] &{} &{} \quad \big ( \, k> s \quad \text{ and } \quad l > s \quad \text{ and } \quad m_1 \, e_k \, \prec _2 \, m_2 \, e_l \, \big ) \, \Big )\,, \end{array} $$

where \(m_1\), \(m_2 \in {{\,\mathrm{Mon}\,}}(R)\) and k, \(l \in \{ 1, \ldots , q \}\).

Remark 4

Let \(\varphi :R^{1 \times a} \rightarrow R^{1 \times b}\) be a homomorphism of left \(R\)-modules, represented by a matrix \(A\in R^{a \times b}\). A Janet basis for the nullspace of \(\varphi \) can be computed as follows. Join the two standard bases of \(R^{1 \times b}\) and \(R^{1 \times a}\) to obtain the basis \(e_1\), ..., \(e_b\), \(e_{b+1}\), ..., \(e_{b+a}\) of \(R^{1 \times b} \oplus R^{1 \times a} \cong R^{1 \times (b+a)}\). Let < be a monomial ordering on \(R^{1 \times (b+a)}\) as defined in Example 10 for \(q = b+a\), \(s = b\) and certain \(\prec _1\) and \(\prec _2\), i.e., eliminating \(e_1\), ..., \(e_b\). Then let \(J_0\) be a Janet basis, with respect to <, for the submodule of \(R^{1 \times (b+a)}\) generated by the rows of the matrix \((A\quad I_a) \in R^{a \times (b+a)}\), where \(I_a\) is the identity matrix. Now \(J := \{ \, w \in R^{1 \times a} \mid (0, w) \in J_0 \, \}\) is a Janet basis for the nullspace of \(\varphi \) with respect to \(\prec _2\) (cf. also [37, Ex. 3.10], [36, Ex. 3.1.27]).

Remark 5

An involution \(\theta : R\rightarrow R\) of \(R\) allows one to reduce computations with right \(R\)-modules to computations with left \(R\)-modules. More precisely, if we have \(\theta (r_1 + r_2) = \theta (r_1) + \theta (r_2)\) and \(\theta (r_1 \, r_2) = \theta (r_2) \, \theta (r_1)\) and \(\theta (\theta (r)) = r\) for all \(r_1\), \(r_2\), \(r \in R\), then any right \(R\)-module M is turned into a left \(R\)-module \(\widetilde{M} := M\) (as abelian groups) via \(r \, m := m \, \theta (r)\), where \(r \in R\), \(m \in \widetilde{M}\), and vice versa. The involution \(\theta \) is extended to matrices by (cf. also [37, Rem. 3.11])

$$ \theta (A) := (\theta ((A^{tr})_{i,j}))_{1 \le i \le q, 1 \le j \le p} \in R^{q \times p}, \qquad \quad A \in R^{p \times q}. $$

Since for \(A \in R^{p \times q}\), \(B \in R^{q \times r}\) we have \(A \, B = 0\) if and only if \(\theta (B) \, \theta (A) = 0\), the computation of nullspaces of homomorphisms of right \(R\)-modules is reduced to the situation described in Remark 4. For \(R\) introduced in Definition 1 we choose

$$ \theta : R\rightarrow R, \quad \theta |_{K} := \mathrm{id}_{K}, \quad \theta (\partial _j) := -\partial _j, \quad j = 1, \ldots , d. $$
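As a small illustration of Remarks 4 and 5 (a worked instance we add here; it is not part of the original text), consider \(A = \begin{bmatrix} \partial _x&\partial _y \end{bmatrix} \in R^{1\times 2}\). Then

$$ \theta (A) \, = \, \begin{bmatrix} -\partial _x \\ -\partial _y \end{bmatrix}, \qquad w \, = \, \begin{bmatrix} \partial _y&-\partial _x \end{bmatrix} \quad \text{ satisfies } \quad w \, \theta (A) \, = \, -\partial _y \partial _x + \partial _x \partial _y \, = \, 0, $$

so \(B = \theta (w) = \begin{bmatrix} -\partial _y \\ \partial _x \end{bmatrix}\) is a right nullspace of A; indeed \(A B = -\partial _x \partial _y + \partial _y \partial _x = 0\). Up to sign, this is the parametrization of divergence-free fields used in Sect. 8.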

7 Parametrizing Boundary Conditions

This section constructs parametrizations of functions satisfying certain boundary conditions, independent of the parametrization of differential equations.

We restrict ourselves to boundary conditions parametrized by analytic functions for two reasons. First, this allows algebraic algorithms. Second, due to the limiting behaviour of GPs when conditioning on more and more data points, closed sets of functions are preferable, see Theorem 6. For approximate resp. asymptotic resp. partially unknown boundary conditions for GPs see [42] resp. [44] resp. [10]. For a theoretical approach to endowing RKHSs with boundary information see [30].

Let again \(\mathcal {F}=C^\infty (D,{\mathbb {R}})\) with the Fréchet topology from (2) be the set of smooth functions on a compact \(D\subset {\mathbb {R}}^d\), let \(K= {\mathbb {R}}\{ f_1, \ldots , f_r \}\) be generated by analytic functions \(f_i: D\rightarrow {\mathbb {R}}\), and let \(R\supseteq K\) be the Ore extension of \(K\) from Definition 1.

This section is based on two theorems. The first describes the topological closure of modules of smooth functions via their Taylor expansions. Denote by \(T_p\) the Taylor series of a (vector or matrix of) smooth function(s) around a point \(p\in D\).

Theorem 5

(Whitney’s Spectral Theorem; [48], [46, V Theorem 1.3]). An \(\mathcal {F}\)-module \(M\le C^\infty (D,{\mathbb {R}})^\ell \) has topological closure \(\overline{M}=\bigcap _{p\in D} T^{-1}_p(T_p(M))\).

The second theorem specifies that analytic functions generate closed modules.

Theorem 6

([27, Theorem 4], [46, VI Theorem 1.1]). Let C be an \(m\times n\)-matrix of analytic functions on \(D\subset {\mathbb {R}}^d\) and \(\phi \in \left( C^\infty (D,{\mathbb {R}})\right) ^m\). Then there is a \(\psi \in \left( C^\infty (D,{\mathbb {R}})\right) ^n\) with \(\phi =C\cdot \psi \) if and only if for all \(p\in D\), \(T_p(\phi )\) is an \({\mathbb {R}}[[x_1-p_1,\ldots ,x_d-p_d]]\)-linear combination of the columns of \(T_p(C)\).

7.1 Boundary Conditions for Function Values of Single Functions

We begin by parametrizing functions which are zero on an analytic set M, e.g., Dirichlet boundary conditions, which prescribe values at the boundary \(\partial D\).

We define boundaries \(M\subseteq D\) implicitly via

$$ M \, = \, {\mathcal {V}}(I) \, := \, \{ \, p \in D \mid b(p) = 0 \text{ for } \text{ all } b \in I \, \}, $$

where \(I\trianglelefteq K\) is an ideal of equations. For any analytic set \(M\subseteq D\) we have \(M={\mathcal {V}}({\mathcal {I}}(M))\), where \({\mathcal {I}}(M)=\{b\in \mathcal {F}\mid b(m)=0 \text{ for } \text{ all } m\in M\}\subseteq \mathcal {F}\) is the (closed and radical) ideal of functions vanishing on M. If I is radical (it is automatically closed by Theorem 6, as it is generated by analytic functions), then \({\mathcal {I}}({\mathcal {V}}(I))=I\). Hence, any set of analytic functions defined on \(D\) which generates a radical ideal parametrizes the functions vanishing on its zero set. More formally:

Proposition 4

Let \(B'\in K^{1\times \ell }\) be a row of analytic functions whose entries generate a radical \(\mathcal {F}\)-ideal \(I=B'\mathcal {F}^{\ell }\le \mathcal {F}\) of smooth functions. Then, I is the set \(\left\{ f\in \mathcal {F}\mid f_{|{\mathcal {V}}(I)}=0\right\} \) of smooth functions vanishing at \({\mathcal {V}}(I)\).

Proof

The condition \(f_{|{\mathcal {V}}(I)}=0\) restricts the zeroth order Taylor coefficients by homogeneous equations. All functions satisfying such restrictions are contained in the closure \(\overline{I}\) of I by Whitney’s Spectral Theorem 5. The \(\mathcal {F}\)-module parametrization \(I=B'\mathcal {F}^{\ell }\) uses analytic functions as generators, which ensures that the ideal I is already equal to its closure \(\overline{I}\) by Theorem 6.    \(\square \)

We now compare constructions of rows \(B'\) of functions in Proposition 4.

Example 11

Functions \(\mathcal {F}=C^\infty ([0,1]^d,{\mathbb {R}})\) with Dirichlet boundary conditions \(f(\partial D)=0\) at the boundary of the domain \(D=[0,1]^d\) are parametrized by

$$\begin{aligned} B_1'=\begin{bmatrix} \prod _{i=1}^dx_i(x_i-1) \end{bmatrix} \end{aligned}$$
(5)

over \(K={\mathbb {R}}\{x_1,\ldots ,x_d\}={\mathbb {R}}[x_1,\ldots ,x_d]\), by

$$\begin{aligned} B_2'=\begin{bmatrix} 1-\exp \left( (-1)^{d+1}\cdot \frac{\prod _{i=1}^dx_i(x_i-1)}{\delta }\right) \end{bmatrix} \end{aligned}$$
(6)

over \(K={\mathbb {R}}\{\exp (x_1^2),\exp (x_1),x_1,\ldots ,\exp (x_d^2),\exp (x_d),x_d\}\), or by

$$\begin{aligned} B_3'=\begin{bmatrix} \sqrt{\prod _{i=1}^d\left( 1+\frac{\exp \left( -\frac{x_i^2}{\delta }\right) -2\exp \left( -\frac{x_i^2-x_i+1}{\delta }\right) +\exp \left( -\frac{(x_i-1)^2}{\delta }\right) }{\exp \left( -\frac{1}{\delta }\right) -1}\right) }~ \end{bmatrix} \end{aligned}$$
(7)

for any \(\delta >0\). See [9, Section 3] for the special case \(d=2\) in (5). For practical differences of these formalizations of boundary conditions see Remark 6.

Block diagonal matrices parametrize boundary conditions for a vector of \(\ell >1\) functions. Also, restrictions on sets of higher codimension can be defined.

Example 12

The following three matrices \(\begin{bmatrix} 1-\exp \left( -\frac{|x|}{\delta }\right)&1-\exp \left( -\frac{|y|}{\delta }\right) \end{bmatrix}\), \(\begin{bmatrix} 1-\exp \left( -\frac{\sqrt{x^2+y^2}}{\delta }\right) \end{bmatrix}\), and \(\begin{bmatrix} 1-\exp \left( -\frac{x^2+y^2}{\delta }\right) \end{bmatrix}\) parametrize functions \(f\in \mathcal {F}=C^\infty ({\mathbb {R}}^3,{\mathbb {R}})\) with \(f(0,0,z)=0\). The last parametrization is analytic.

7.2 Boundary Conditions for Derivatives and Vectors

Boundary conditions with vanishing derivatives can be constructed using multiplicities in the (no longer radical) ideal. The proof of the following proposition again follows from Theorems 5 and 6, in a similar way to Proposition 4.

Proposition 5

Let \(B'\in K^{\ell \times \ell '}\) be a matrix of analytic functions whose columns generate an \(\mathcal {F}\)-module \(M=B'\mathcal {F}^{\ell '}\le \mathcal {F}^\ell \) of smooth functions. Then,

$$ M \, = \, B'\mathcal {F}^{\ell '} \, = \, \bigcap _{p\in D} T_p^{-1}\!\left( T_p(B') \, {\mathbb {R}}[[x_1-p_1,\ldots ,x_d-p_d]]^{\ell '\times 1}\right) $$

is the closed set of smooth functions sharing the same vanishing lower order Taylor coefficients as the columns of \(B'\).

Example 13

Functions \(\mathcal {F}=C^\infty ([0,1]^d,{\mathbb {R}})\) with Dirichlet boundary conditions \(f(\partial D)=0\) and Neumann boundary condition \(\frac{\partial f}{\partial n}(\partial D)=0\) for n the normal to the boundary \(\partial D\) of the domain \(D=[0,1]^d\) are parametrized by

$$\begin{aligned} B'= \begin{bmatrix} 1-\exp \left( (-1)^{d+1}\cdot \frac{\prod _{i=1}^dx_i^2(x_i-1)^2}{\delta }\right) \end{bmatrix}\text{, } \end{aligned}$$
(8)

constructed by squaring the exponent from the parametrization in (6), or

$$\begin{aligned} B'=\begin{bmatrix} \prod _{i=1}^d\left( 1+\frac{\exp \left( -\frac{x_i^2}{\delta }\right) -2\exp \left( -\frac{x_i^2-x_i+1}{\delta }\right) +\exp \left( -\frac{(x_i-1)^2}{\delta }\right) }{\exp \left( -\frac{1}{\delta }\right) -1}\right) \end{bmatrix}\text{, } \end{aligned}$$
(9)

constructed by squaring the parametrization (7), for any \(\delta >0\).

Remark 6

In applications, the non-polynomial parametrizations from Examples 11 and 13 are more suitable. We demonstrate the effect by pushforward GPs obtained from these parametrizations in Fig. 2.

The polynomial pushforward from Example 11 yields the variance \(x^2\cdot (x-1)^2\), which varies strongly over the input interval [0, 1]. The analytic pushforwards from Example 11 also set the variance to zero at the boundary, but quickly return to the original variance and never exceed it. The speed of returning to the original variance can be controlled via the parameter \(\delta \).
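The following numeric sketch (ours) makes this comparison concrete for \(d=1\) and \(\delta =\frac{1}{100}\); the prior variance of the pushforward GP at x is \(b(x)^2\cdot k_\mathcal {F}(x,x)=b(x)^2\) for a multiplication operator b taken from (5), (6), or (7):

```python
# Prior variances b(x)^2 of the pushforwards (5)-(7) for d = 1, delta = 1/100.
import numpy as np

delta = 0.01
b5 = lambda x: x * (x - 1)                          # Eq. (5), polynomial
b6 = lambda x: 1 - np.exp(x * (x - 1) / delta)      # Eq. (6) with d = 1
def b7(x):                                          # Eq. (7) with d = 1
    num = (np.exp(-x**2 / delta) - 2 * np.exp(-(x**2 - x + 1) / delta)
           + np.exp(-(x - 1)**2 / delta))
    return np.sqrt(1 + num / (np.exp(-1 / delta) - 1))

for x in (0.0, 0.05, 0.5, 1.0):
    print(x, b5(x)**2, b6(x)**2, b7(x)**2)
# All variances vanish at the boundary {0, 1}; towards the middle of [0, 1]
# the analytic pushforwards (6) and (7) return to variance 1, while (5) does not.
```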

Fig. 2.

GPs, represented by their mean function and two standard deviations. Upper left: a GP g with mean zero and squared exponential covariance function. Upper middle: the pushforward of the GP g by \(x\cdot (x-1)\) has a strong global influence. Upper right resp. lower left: the pushforward of the GP g by (6) resp. (7). Lower middle resp. right: the pushforwards of the GP g by (8) resp. (9) set the function and its derivative to zero at the boundary. We set \(\delta :=\frac{1}{100}\).

Fig. 3.

The mean fields of the GP for divergence-free fields in the interior of \(y^2=\sin (x)^4\) from Example 14, conditioned on the data \((-1,0)\) at \((\frac{\pi }{2},0)\) (left) and on \((1,0)\) resp. \((0,1)\) at \((\frac{\pi }{4},0)\) resp. \((\frac{\pi }{2},0)\) (right). The data is plotted artificially larger in gray. The flow at the analytic boundary is zero.

Fig. 4.

The mean field of the GP for divergence-free fields from Example 15, conditioned on \(v=(0,-1)\) at \((0,1)\). The flow at the left and right boundary is zero; at the bottom resp. top there is flow into resp. out of the region. The data point resp. the inhomogeneous boundary conditions are plotted artificially larger in gray resp. dark gray.

8 Examples

Now, we intersect (Theorem 4) solution sets of differential equations and analytic boundary conditions (Sect. 7) using the algorithms from Sects. 5 and 6.

Example 14

Consider divergence-free fields in the region in \({\mathbb {R}}^2\) bounded by the curve \(y^2=\sin (x)^4\), i.e., the zero set of \(f:=y^2-\sin (x)^4\), for \(x\in [0,\pi ]\). Hence, consider

$$A=\begin{bmatrix} \partial _x&\partial _y \end{bmatrix}, B_1=\begin{bmatrix} \partial _y \\ -\partial _x \end{bmatrix}, B_2=\begin{bmatrix} f &{} 0 \\ 0 &{} f \end{bmatrix}.$$

The matrix \(C=\begin{bmatrix} f^2 \\ \partial _y f \\ -\partial _x f \end{bmatrix}\) from Theorem 4 yields the parametrization

$$\begin{bmatrix} \partial _yf^2 \\ -\partial _xf^2 \end{bmatrix}=\begin{bmatrix} f^2\partial _y+4\cdot f\cdot y \\ -f^2\partial _x+8\cdot f\cdot \sin (x)^3\cos (x) \end{bmatrix}$$

and the pushforward covariance

$$\begin{aligned} k\cdot \overline{f}\cdot \begin{bmatrix} 16y_1y_2+4\delta _y\cdot (f_1y_2-f_2y_1)-(\delta _y^2-1)\cdot \overline{f} &{} (f_1\delta _y+4y_1)\cdot {\text {fsc}}_2 \\ (f_2\delta _y-4y_2)\cdot {\text {fsc}}_1 &{} {\text {fsc}}_1\cdot {\text {fsc}}_2 -\overline{f}\cdot (2\delta _x^2-1) \end{bmatrix} \end{aligned}$$

of the squared exponential covariance function \(k=\exp (-\frac{1}{2}((x_1-x_2)^2+(y_1-y_2)^2))\), where \(f_1=f(x_1,y_1)\), \(f_2=f(x_2,y_2)\), \(\overline{f}=f_1\cdot f_2\), \(\delta _x=x_1-x_2\), \(\delta _y=y_1-y_2\), \({\text {sc}}(x)=8\sin (x)^3\cos (x)\), \({\text {fsc}}_1={\text {sc}}(x_1)-\delta _x\cdot f_1\), and \({\text {fsc}}_2={\text {sc}}(x_2)+\delta _x\cdot f_2\). For an illustration of this covariance see Fig. 3.
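The following SymPy sketch (ours, not from the paper) verifies the two decisive properties of this parametrization symbolically: the parametrized field is divergence-free for every potential g, and both components contain the factor f and hence vanish on the boundary \(f=0\):

```python
# Symbolic verification for Example 14.
import sympy as sp

x, y = sp.symbols('x y')
f = y**2 - sp.sin(x)**4
g = sp.Function('g')(x, y)                 # arbitrary smooth potential

v1 = sp.diff(f**2 * g, y)                  # first component of the field
v2 = -sp.diff(f**2 * g, x)                 # second component of the field
print(sp.simplify(sp.diff(v1, x) + sp.diff(v2, y)))                 # divergence: 0
print(sp.simplify(v1 - f * (2*sp.diff(f, y)*g + f*sp.diff(g, y))))  # 0: v1 = f*(...)
print(sp.simplify(v2 + f * (2*sp.diff(f, x)*g + f*sp.diff(g, x))))  # 0: v2 = -f*(...)
```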

Example 15

Consider divergence-free fields in the compact domain D bounded by \(-\frac{\pi }{2}\le y\le \frac{\pi }{2}\) and \(3\sin (y)\le x\le 3\sin (y)+2\). Hence, consider

$$A=\begin{bmatrix} \partial _x&\partial _y \end{bmatrix}, B_1=\begin{bmatrix} \partial _y \\ -\partial _x \end{bmatrix}, B_2=\begin{bmatrix} f &{} 0 \\ 0 &{} f \end{bmatrix}$$

for \(f=(y-\frac{\pi }{2})\cdot (y+\frac{\pi }{2})\cdot (x-3\sin (y))\cdot (x-3\sin (y)-2)\). As the first entry in the column C is \(f^2\), such fields can be parametrized by \(\begin{bmatrix} \partial _yf^2 \\ -\partial _xf^2 \end{bmatrix}\). Pushing forward the squared exponential covariance function yields a covariance too big to display.

To encode non-zero boundary conditions we use a non-zero mean. Using the potential \(p:=-\frac{1}{4}\cdot (3\sin (y)-x+3)\cdot (3\sin (y)-x)^2\) yields the divergence-free mean

$$\begin{aligned} \mu := \begin{bmatrix} -\frac{9}{4} \cdot \cos (y) \cdot (3 \sin (y)-x+2) \cdot (3 \sin (y)-x)\\ -\frac{3}{4} \cdot (3 \sin (y)-x+2) \cdot (3 \sin (y)-x) \end{bmatrix} = \begin{bmatrix} \partial _y\\ -\partial _x \end{bmatrix} p \end{aligned}$$

which has zero flow through the left and right boundaries and non-zero flow through the top and bottom. The GP \({{\,\mathrm{\mathcal{G}\mathcal{P}}\,}}(\mu ,k)\) hence models divergence-free fields in D with no flow through the sinusoidal boundaries on the left and right, but flow into D from the bottom and out of D at the top of the region. See Fig. 4 for a demonstration.
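Again, a short SymPy check (ours) confirms that \(\mu \) is divergence-free and vanishes on the left and right boundary curves \(x=3\sin (y)\) and \(x=3\sin (y)+2\):

```python
# Symbolic verification of the mean field of Example 15.
import sympy as sp

x, y = sp.symbols('x y')
p = -sp.Rational(1, 4) * (3*sp.sin(y) - x + 3) * (3*sp.sin(y) - x)**2
mu1, mu2 = sp.diff(p, y), -sp.diff(p, x)
print(sp.simplify(sp.diff(mu1, x) + sp.diff(mu2, y)))       # divergence: 0
for bx in (3*sp.sin(y), 3*sp.sin(y) + 2):                   # boundary curves
    print(sp.simplify(mu1.subs(x, bx)), sp.simplify(mu2.subs(x, bx)))   # 0 0
```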