1 Introduction

This paper provides the mathematical foundation for polynomial diffusions on a large class of state spaces in \({\mathbb {R}}^{d}\). A polynomial diffusion is characterized by having a linear drift and quadratic diffusion function. In consequence, moments are given in closed form. Such processes represent an extension of the affine class. They play an important role in a growing range of applications in finance, including financial market models of interest rates, credit risk, stochastic volatility, and commodities and electricity.

An arbitrage-free financial market model is determined by a state price density, i.e., a positive semimartingale \(\zeta\) defined on a filtered probability space \((\varOmega,{\mathcal {F}},{\mathcal {F}}_{t},{\mathbb {P}})\). The model price \(\varPi (t,T)\) at time \(t\) of any time \(T\) cash-flow \(C_{T}\) is given by

$$ \varPi(t,T) = \frac{1}{\zeta_{t}} {\mathbb {E}}\left[ \zeta_{T} C_{T}\,\big|\, {\mathcal {F}} _{t}\right]. $$
(1.1)

We may interpret ℙ as the historical measure, or more generally as an auxiliary measure possibly different from, but equivalent to, the historical measure. A polynomial diffusion model consists of a polynomial diffusion \(X\) as factor process, along with a positive polynomial \(p\) on the state space. The state price density is specified by \(\zeta_{t} = \mathrm{e}^{-\alpha t}p(X_{t})\), where \(\alpha\) is a real parameter chosen to control the lower bound on implied interest rates. We let the time \(T\) cash-flow of a security be given by \(C_{T}=q(X_{T})\) for some polynomial \(q\). The polynomial property of \(X\) along with the elementary fact that \(p q\) is a polynomial implies that \(\varPi(t,T)\) becomes a rational function in \(X_{t}\) with coefficients given in closed form in terms of a matrix exponential. Polynomial diffusion models thus yield closed form expressions for any security with cash-flows specified as polynomial functions of \(X\), which makes them universally applicable in finance. This includes financial market models for interest rates (with \(C_{T}=1\)), credit risk in a doubly stochastic framework (with \(C_{T}\) the conditional survival probability), stochastic volatility (with \(C_{T}\) the spot variance), and commodities and electricity (with \(C_{T}\) the spot price).

While polynomial diffusions have appeared in the literature since Wong [48], so far no existence and uniqueness theory has been available beyond the scalar case. This paper fills this gap and thus provides the mathematical foundation for polynomial diffusion models in finance.

Our main uniqueness result (Theorem 4.2) is based on the classical theory of the moment problem. Since the mixed moments of all finite-dimensional marginal distributions of a polynomial diffusion are uniquely determined by its generator (Theorem 3.1 and Corollary 3.2), uniqueness follows whenever these moments determine the underlying distribution. This is often true, for instance in the affine case or when the state space is compact, or more generally if exponential moments exist; Theorem 3.3 provides sufficient conditions. There are, however, situations where the moment problem approach fails. We therefore provide two additional results based on Yamada–Watanabe type arguments, which give uniqueness in the one-dimensional case (Theorem 4.3) as well as when the process dynamics exhibits a certain hierarchical structure (Theorem 4.4). These uniqueness results do not depend on the geometry of the state space.

In order to study existence, we assume that the state space is a basic closed semialgebraic set, i.e., the nonnegativity set of a finite family of polynomials. Existence reduces to a stochastic invariance problem that we solve under suitable geometric and algebraic conditions on the state space (Theorem 5.3). We also study boundary attainment. In applications, it is frequently of interest to know whether the trajectories of a given process may hit the boundary of the state space. In particular, simulating trajectories becomes a much more delicate task if the boundary is attained; see Lord et al. [35]. We present sufficient conditions for both attainment and non-attainment that are tight (Theorem 5.7).

A semialgebraic state space is a natural choice for at least three reasons. First, positive semidefiniteness of the quadratic diffusion matrix boils down to nonnegativity constraints on polynomials. Second, polynomial diffusion models in finance involve polynomials that are required to be positive on the state space. And third, semialgebraic sets turn out to be an ideal setting for employing tools from real algebraic geometry to verify the hypotheses of our existence and boundary attainment results.

We give a detailed analysis of some specific semialgebraic state spaces that do and will play an important role in financial applications, and that illustrate the scope of polynomial diffusions. Specifically, we consider certain quadric sets including the unit ball \(\{x\in{\mathbb {R}}^{d}: \| x\| \le1\}\), the product space \([0,1]^{m}\times{\mathbb {R}}^{n}_{+}\), and the unit simplex \(\{x\in{\mathbb {R}}^{d}_{+}: x_{1}+\cdots+x_{d}=1\}\). We also elaborate on polynomial diffusion models in finance, and show how to specify novel stochastic models for interest rates, stochastic volatility, and stock markets.

Polynomial processes have been studied in various degrees of generality by several authors, for instance Wong [48], Mazet [38], Zhou [49], Forman and Sørensen [24], among others. The first systematic accounts treating the time-homogeneous Markov jump-diffusion case are Cuchiero [9] and Cuchiero et al. [10]. The use of polynomial diffusions in financial modeling goes back at least to the early 2000s. Zhou [49] used one-dimensional polynomial (jump-)diffusions to build short rate models that were fitted to data using a generalized method-of-moments approach, relying crucially on the ability to compute moments efficiently. A short rate model based on the Jacobi process was presented by Delbaen and Shirakawa [15], and Larsen and Sørensen [33] used the same process for exchange rate modeling. The multidimensional Jacobi process was studied by Gouriéroux and Jasiak [27], who constructed a stock price model with smooth transitions of drift and volatility regimes. More recently, polynomial diffusions have featured in the context of financial applications in several papers; see Filipović et al. [23, 21] for models of the term structure of variance swap rates and interest rates, respectively, and Cuchiero et al. [10] for variance reduction for option pricing and hedging, among other applications. There are several reasons for moving beyond the affine class. In particular, nontrivial dynamics on compact state spaces become a possibility, which together with the polynomial property fits well with polynomial expansion techniques; see also Filipović et al. [20]. Also on non-compact state spaces, one can achieve richer dynamics than in the affine case. Examples of non-affine polynomial processes include multidimensional Jacobi or Fisher–Wright processes (Ethier [18], Gouriéroux and Jasiak [27]), Pearson diffusions (Forman and Sørensen [24]), and Dunkl processes (Dunkl [17], Gallardo and Yor [25]).

The rest of the paper is structured as follows. In Sect. 2, we define polynomial diffusions. Section 3 is concerned with power and exponential moments. In Sect. 4, we discuss uniqueness. In Sect. 5, we treat existence and boundary attainment. Section 6 contains examples of semialgebraic state spaces. Section 7 outlines various polynomial diffusion models in finance. For the sake of readability, most proofs are given in Appendices A–I. Some basic notions from algebraic geometry are reviewed in Appendix J.

We end this introduction with some notational conventions that will be used throughout this paper. For a function \(f:{\mathbb {R}}^{d}\to{\mathbb {R}}\), we write \(\{ f=0\}\) for the set \(\{x\in{\mathbb {R}}^{d}:f(x)=0\}\). A polynomial \(p\) on \({\mathbb {R}} ^{d}\) is a map \({\mathbb {R}}^{d}\to{\mathbb {R}}\) of the form \(\sum_{\alpha}c_{\alpha}x_{1}^{\alpha _{1}}\cdots x_{d}^{\alpha_{d}}\), where the sum runs over all multi-indices \(\alpha=(\alpha_{1},\ldots,\alpha_{d})\in{\mathbb {N}}^{d}_{0}\) and only finitely many of the coefficients \(c_{\alpha}\) are nonzero. Such a representation is unique. The degree of \(p\) is the number \(\deg p=\max\{\alpha _{1}+\cdots+\alpha_{d} : c_{\alpha}\ne0\}\). We let \({\mathrm {Pol}}({\mathbb {R}}^{d})\) denote the ring of all polynomials on \({\mathbb {R}}^{d}\), and \({\mathrm {Pol}}_{n}({\mathbb {R}}^{d})\) the subspace consisting of polynomials of degree at most \(n\). Let \(E\) be a subset of \({\mathbb {R}}^{d}\). A polynomial on  \(E\) is the restriction \(p=q|_{E}\) to \(E\) of a polynomial \(q\in{\mathrm{Pol}}({\mathbb {R}}^{d})\). Its degree is \(\deg p=\min \{\deg q : p=q|_{E}, q\in{\mathrm{Pol}}({\mathbb {R}}^{d})\}\). We let \({\mathrm {Pol}}(E)\) denote the ring of polynomials on \(E\), and \({\mathrm{Pol}}_{n}(E)\) the subspace of polynomials on \(E\) of degree at most \(n\). Both \({\mathrm{Pol}}_{n}({\mathbb {R}}^{d})\) and \({\mathrm{Pol}}_{n}(E)\) are finite-dimensional real vector spaces, but if there are nontrivial polynomials that vanish on \(E\), their dimensions will be different. If \(E\) has a nonempty interior, then \({\mathrm{Pol}}_{n}({\mathbb {R}}^{d})\) and \({\mathrm{Pol}} _{n}(E)\) can be identified. The set of real symmetric \(d\times d\) matrices is denoted \({\mathbb {S}}^{d}\), and the subset of positive semidefinite matrices is denoted \({\mathbb {S}}^{d}_{+}\).
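
To make the notation concrete, the following minimal sketch (ours, in Python with sympy; the helper name `monomial_basis` is not from the paper) enumerates a monomial basis of \({\mathrm{Pol}}_{n}({\mathbb {R}}^{d})\) for \(d=2\), \(n=3\) and checks that its dimension equals \(\binom{n+d}{d}\). When \(E\) has nonempty interior, the same monomials also represent a basis of \({\mathrm{Pol}}_{n}(E)\).

```python
# Minimal sketch (not from the paper): a monomial basis of Pol_n(R^d).
from itertools import product
from math import comb

import sympy as sp

def monomial_basis(variables, n):
    """Return all monomials x^alpha with |alpha| <= n, in a fixed order."""
    basis = []
    for alpha in product(range(n + 1), repeat=len(variables)):
        if sum(alpha) <= n:
            m = sp.Integer(1)
            for v, k in zip(variables, alpha):
                m *= v**k
            basis.append(m)
    return basis

x1, x2 = sp.symbols("x1 x2")
H = monomial_basis([x1, x2], 3)
assert len(H) == comb(3 + 2, 2)  # dim Pol_3(R^2) = 10
print(H)
```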

2 Definition of polynomial diffusions

Throughout this paper, we fix maps \(a: {\mathbb {R}}^{d}\to{\mathbb {S}}^{d}\) and \(b:{\mathbb {R}} ^{d}\to {\mathbb {R}}^{d}\) with

$$ \text{$a_{ij}\in{\mathrm{Pol}}_{2}({\mathbb {R}}^{d})$ and $b_{i}\in {\mathrm{Pol}}_{1}({\mathbb {R}}^{d})$ for all $i,j$} $$
(2.1)

and a state space \(E\subseteq{\mathbb {R}}^{d}\). Our goal is to investigate the following issues:

  (a) For a suitable class of state spaces \(E\), find conditions on \(a\), \(b\), \(E\) that guarantee the existence of an \(E\)-valued solution to the stochastic differential equation

    $$ {\mathrm{d}} X_{t} = b(X_{t}) {\,\mathrm{d}} t + \sigma(X_{t}) {\,\mathrm{d}} W_{t} $$
    (2.2)

    for some \(d\)-dimensional Brownian motion \(W\) and some continuous function \(\sigma:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d\times d}\) with \(\sigma\sigma^{\top}=a\) on \(E\). We shall consider the class of basic closed semialgebraic sets \(E\), defined using polynomial equalities and inequalities.

  (b) Find conditions for uniqueness in law for \(E\)-valued solutions to (2.2). By this we mean that for any \(x\in E\) and any \(E\)-valued solutions \(X\) and \(X'\) to (2.2) with \(X_{0}=X_{0}'=x\), possibly with different driving Brownian motions, \(X\) and \(X'\) have the same law.

  (c) Find conditions for a solution to (2.2) to attain the boundary of \(E\).

  (d) Find large parametric classes of \(a\), \(b\), \(E\) for which (2.2) admits a solution.

Investigating these issues is motivated by the fact that diffusions as in (2.2) admit closed form conditional moments and have broad applications in finance, as we shall see below.

We consider the partial differential operator \({\mathcal {G}}\) given by

$$ {\mathcal {G}}f = \frac{1}{2}\operatorname{Tr}( a \nabla^{2} f) + b^{\top}\nabla f. $$
(2.3)

In view of (2.1), \({\mathcal {G}}\) maps \({\mathrm {Pol}}_{n}({\mathbb {R}}^{d})\) to itself for each \(n\in{\mathbb {N}}\). As we work on a state space \(E\subseteq {\mathbb {R}}^{d}\), we now refine this property. We say that \({\mathcal {G}}\) is well defined on \({\mathrm{Pol}} (E)\) if \({\mathcal {G}}f=0\) on \(E\) for any \(f\in{\mathrm {Pol}}({\mathbb {R}}^{d})\) with \(f=0\) on \(E\). In this case, \({\mathcal {G}}\) is well defined as an operator on \({\mathrm{Pol}}(E)\). This always holds if \(E\) has a nonempty interior.
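
As a quick illustration of this mapping property, here is a symbolic sketch (ours, with arbitrarily chosen coefficients \(a\) and \(b\) satisfying (2.1), not an example from the paper): it applies \({\mathcal {G}}\) to a polynomial and checks that the degree does not increase.

```python
# Sketch: apply Gf = 1/2 Tr(a Hess f) + b . grad f symbolically and check that
# the degree does not increase (a quadratic and b affine, as in (2.1)).
import sympy as sp

x, y = sp.symbols("x y")

# Illustrative coefficients (our choice, not from the paper):
a = sp.Matrix([[x*(1 - x), -x*y],
               [-x*y,      y*(1 - y)]])
b = sp.Matrix([sp.Rational(1, 2) - x, sp.Rational(1, 3) - y])

def generator(f):
    grad = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])
    return sp.expand(sp.Rational(1, 2) * (a * sp.hessian(f, (x, y))).trace()
                     + (b.T * grad)[0])

f = x**3 * y + 2 * x * y**2 - y**3             # a polynomial of degree 4
Gf = generator(f)
assert sp.Poly(Gf, x, y).total_degree() <= 4   # G maps Pol_4 to itself
print(Gf)
```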

Definition 2.1

The operator \({\mathcal {G}}\) is called polynomial on \(E\) if it is well defined on \({\mathrm{Pol}}(E)\), and thus maps \({\mathrm{Pol}}_{n}(E)\) to itself for each \(n\in {\mathbb {N}}\). In this case, we call any \(E\)-valued solution to (2.2) a polynomial diffusion on \(E\).

It is a simple matter to verify that any second order partial differential operator that maps \({\mathrm{Pol}}_{n}(E)\) to itself for each \(n\in {\mathbb {N}}\) is necessarily of the form (2.1) and (2.3) on \(E\).

Lemma 2.2

Let \(\widetilde{\mathcal {G}}f = \frac{1}{2}\operatorname{Tr}( \widetilde{a} \nabla ^{2} f) + \widetilde{b}^{\top}\nabla f\) be a partial differential operator for some maps \(\widetilde{a}: {\mathbb {R}}^{d}\to{\mathbb {S}}^{d}\) and \(\widetilde{b}:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d}\). Assume \(\widetilde{\mathcal {G}}\) is well defined on \({\mathrm{Pol}}(E)\). Then the following are equivalent:

  (i) \(\widetilde{\mathcal {G}}\) maps \({\mathrm{Pol}}_{n}(E)\) to itself for each \(n\in{\mathbb {N}}\).

  (ii) \(\widetilde{\mathcal {G}}\) maps \({\mathrm{Pol}}_{n}(E)\) to itself for \(n\in \{ 1,2\}\).

  (iii) The components of \(\widetilde{a}\) and \(\widetilde{b}\) restricted to \(E\) lie in \({\mathrm{Pol}}_{2}(E)\) and \({\mathrm {Pol}}_{1}(E)\), respectively.

In this case, \(\widetilde{a}\) and \(\widetilde{b}\) restricted to \(E\) are uniquely determined by the action of \(\widetilde{\mathcal {G}}\) on \({\mathrm{Pol}}_{2}(E)\).

Proof

The implications \(\mbox{(i)}\Rightarrow\mbox{(ii)}\) and \(\mbox{(iii)}\Rightarrow\mbox{(i)}\) are immediate, and the implication \(\mbox{(ii)}\Rightarrow\mbox{(iii)}\) follows upon applying \(\widetilde {\mathcal {G}}\) to the monomials of degree one and two. In particular, this pins down \(\widetilde{a}\) and \(\widetilde{b}\) on \(E\), and thus also establishes the last part of the lemma. □

In the one-dimensional case \(d=1\), one can classify all polynomial diffusions on intervals \(E\). Indeed, one has \(a(x)=a+\alpha x+Ax^{2}\) and \(b(x)=b+\beta x\) for some scalars \(a,\alpha,A,b,\beta\), and \(E=\{x\in{\mathbb {R}}:a(x)\ge0\}\). See Forman and Sørensen [24] and Filipović et al. [23] for details.

The multidimensional case is less trivial. For example, let \(d=2\), \(E={\mathbb {R}} \times\{0\}\), and consider the operator \({\mathcal {G}}f(x,y)=\frac {1}{2}\partial_{xx}f(x,y)+\partial_{y} f(x,y)\). This operator is not well defined on \({\mathrm{Pol}}(E)\), since the polynomial \(f(x,y)=y\) vanishes on \(E\), but \({\mathcal {G}}f(x,y)=1\). On the other hand, \({\mathcal {G}}\) is the generator of the diffusion \({\mathrm{d}} X_{t}=({\mathrm{d}} B_{t},{\,\mathrm{d}} t)\), where \(B\) is a one-dimensional Brownian motion. This process immediately leaves \(E\) for any starting point \(x\in E\). If, however, an \(E\)-valued solution to (2.2) exists for any starting point \(x\in E\), then \({\mathcal {G}}\) is well defined on \({\mathrm{Pol}}(E)\). This follows from the following basic positive maximum principle.

Lemma 2.3

Consider \(f\in C^{2}({\mathbb {R}}^{d})\) and suppose \({\overline{x}}\in E\) is a maximizer of \(f\) over  \(E\). If (2.2) admits an \(E\)-valued solution with \(X_{0}=\overline{x}\), then \({\mathcal {G}}f({\overline{x}})\le0\).

Proof

Let \(X\) be an \(E\)-valued solution to (2.2) with \(X_{0}=\overline{x}\), and assume for contradiction that \({\mathcal {G}}f({\overline{x}})>0\). By the definition of a global maximizer, \(f(x)\le f({\overline{x}})\) for all \(x\in E\). Let \(\tau=\inf\{t\ge0: {\mathcal {G}}f(X_{t})\le0\}\), and note that \(\tau>0\). Then for \(t\in(0,\tau)\), we have \(f(X_{t})\le f({\overline{x}})\) and \({\mathcal {G}}f(X_{t})>0\), which implies

$$ f(X_{t\wedge\tau}) - f({\overline{x}}) - \int_{0}^{t\wedge\tau} {\mathcal {G}} f(X_{s}){\,\mathrm{d}} s < 0 $$

for all \(t>0\). However, the left-hand side is a local martingale starting from zero, and a nonpositive local martingale started at zero must vanish identically. This contradiction proves that \({\mathcal {G}}f({\overline{x}})\le0\). □

Regarding uniqueness, it is crucial to restrict attention to \(E\)-valued solutions. To illustrate what can otherwise go wrong, consider the stochastic differential equation \({\mathrm{d}} X_{t} = -2\sqrt{X_{t}^{-}}{\,\mathrm{d}} t+ 2\sqrt {X_{t}^{+}}{\,\mathrm{d}} W_{t}\), which is well known to have a unique \({\mathbb {R}}_{+}\)-valued solution: the zero-dimensional squared Bessel process. However, this stochastic differential equation admits other solutions that do not remain in \({\mathbb {R}}_{+}\), for example \(X_{t} = Y_{t}{\boldsymbol{1}_{\{t\le \tau\}}} - (t-\tau)^{2}{\boldsymbol{1}_{\{t>\tau\}}}\), where \(Y\) is a zero-dimensional squared Bessel process with \(Y_{0}\ge0\) and \(\tau=\inf\{t:Y_{t}=0\}\). Here \(\tau\) is finite almost surely.

Note that in Definition 2.1, we require neither uniqueness of solutions to (2.2), nor that \({\mathcal {G}}\) be the generator of a Markov process on \(E\). There are two reasons for this. First, existence of \(E\)-valued solutions to (2.2) does not in itself imply that those solutions are Markovian. Second, in the context of Markov processes, the polynomial property holds if and only if the corresponding semigroup leaves \({\mathrm{Pol}}_{n}(E)\) invariant for each \(n\in {\mathbb {N}}\). However, this fact, properly phrased, does not require the Markov property. Only Itô calculus is needed. This observation is crucial for our approach to proving uniqueness. Finally, we remark that a polynomial diffusion that is also a Markov process is a “polynomial process” in the terminology of Cuchiero et al. [10], with vanishing killing rate and no jumps.

3 Power and exponential moments

Throughout this section, we assume that \({\mathcal {G}}\) is polynomial on \(E\) and let \(X\) be an \(E\)-valued solution to (2.2) realized on a filtered probability space \((\varOmega,{\mathcal {F}},{\mathcal {F}}_{t},{\mathbb {P}})\).

For any \(n\in{\mathbb {N}}\), we let \(N=N(n,E)\) denote the dimension of \(\mathrm{Pol}_{n}(E)\). We fix a basis of polynomials \(h_{1},\dots,h_{N}\) for \(\mathrm{Pol}_{n}(E)\) and write

$$ H(x) = \big(h_{1}(x),\dots,h_{N}(x)\big)^{\top}. $$

Then for each \(p\in{\mathrm{Pol}}_{n}(E)\), there exists a unique vector \(\vec{p}\in {\mathbb {R}}^{{N}}\) such that

$$ p(x)=H(x)^{\top}\vec{p}. $$
(3.1)

The restriction of \({\mathcal {G}}\) to \({\mathrm{Pol}}_{n}(E)\) has a unique matrix representation \(G\in{\mathbb {R}}^{{N}\times{N}}\), characterized by the property that \(G \vec{p}\) is the coordinate vector of \({\mathcal {G}}p\) whenever \(\vec{p}\) is the coordinate vector of \(p\). That is, we have

$$ {\mathcal {G}}p(x) = H(x)^{\top}G \vec{p}. $$
(3.2)

We now show that \({\mathbb {E}}[p(X_{T}) \,|\, {\mathcal {F}}_{t}]\) is indeed well defined as a polynomial function of \(X_{t}\). Recall that we do not assume uniqueness of solutions to (2.2), and we do not require \(X\) to be Markov. The proof is given in Appendix B.

Theorem 3.1

If \({\mathbb {E}}[\|X_{0}\|^{2n}]<\infty\), then for any \(p\in{\mathrm {Pol}}_{n}(E)\) with coordinate representation  \(\vec{p}\in{\mathbb {R}}^{{N}}\), we have

$$ {\mathbb {E}}[p(X_{T}) \,|\, {\mathcal {F}}_{t}] = H(X_{t})^{\top}\mathrm{e}^{(T-t)G} \vec{p}, \qquad t\le T. $$
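
To illustrate Theorem 3.1 numerically, the following sketch (ours) takes the square-root diffusion \({\mathrm{d}} X_{t}=(b+\beta X_{t}){\,\mathrm{d}} t+\sigma\sqrt{X_{t}}{\,\mathrm{d}} W_{t}\) on \(E={\mathbb {R}}_{+}\) with illustrative parameters, builds the matrix \(G\) of \({\mathcal {G}}\) on \({\mathrm{Pol}}_{2}(E)\) in the monomial basis \(H(x)=(1,x,x^{2})^{\top}\), and recovers the conditional mean from \(H(X_{t})^{\top}\mathrm{e}^{(T-t)G}\vec{p}\), checking it against the well-known closed-form mean of the square-root process.

```python
# Sketch (ours): conditional moments of a square-root (CIR) process via Theorem 3.1,
# using the monomial basis H(x) = (1, x, x^2) of Pol_2(R_+).
import numpy as np
from scipy.linalg import expm

b, beta, sigma = 0.06, -0.5, 0.25        # illustrative parameters (b > 0, beta < 0)

# Columns of G hold the coordinates of G(1), G(x), G(x^2):
#   G1 = 0,  Gx = b + beta*x,  Gx^2 = (sigma^2 + 2b)*x + 2*beta*x^2.
G = np.array([[0.0, b,    0.0           ],
              [0.0, beta, sigma**2 + 2*b],
              [0.0, 0.0,  2*beta        ]])

def H(x):
    return np.array([1.0, x, x**2])

def cond_moment(x, tau, p_vec):
    """E[p(X_{t+tau}) | X_t = x] = H(x)^T exp(tau*G) p_vec."""
    return H(x) @ expm(tau * G) @ p_vec

x0, tau = 0.04, 2.0
mean = cond_moment(x0, tau, np.array([0.0, 1.0, 0.0]))        # p(x) = x
mean_closed = x0*np.exp(beta*tau) + (b/beta)*(np.exp(beta*tau) - 1.0)
assert np.isclose(mean, mean_closed)
second = cond_moment(x0, tau, np.array([0.0, 0.0, 1.0]))      # p(x) = x^2
print(mean, second)
```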

The following result is a direct consequence of Theorem 3.1. Its statement and proof use standard multi-index notation: For a multi-index \({\mathbf {k}}=(k_{1},\ldots,k_{d})\in{\mathbb {N}}^{d}_{0}\), we write \(|{\mathbf {k}} |=k_{1}+\cdots+k_{d}\) and \(x^{\mathbf {k}}=x_{1}^{k_{1}}\cdots x_{d}^{k_{d}}\).

Corollary 3.2

For any time points \(0\le t_{1}<\cdots<t_{m}\) and for any multi-indices \({\mathbf {k}} (1), \ldots, {\mathbf {k}}(m)\) such that

$$ {\mathbb {E}}\big[ \|X_{0}\|^{2|{\mathbf {k}}(1)|+\cdots+2|{\mathbf {k}}(m)|}\big] < \infty, $$

the expectation \({\mathbb {E}}[ X_{t_{1}}^{{\mathbf {k}}(1)} \cdots X_{t_{m}}^{{\mathbf {k}}(m)} ]\) is uniquely determined by \({\mathcal {G}}\) and the law of \(X_{0}\).

Proof

We prove the result for \(m=2\); the general case follows by iteration. Set \({\mathbf {j}}={\mathbf {k}}(1)\), \({\mathbf {k}}={\mathbf {k}}(2)\), and \(n=|{\mathbf {j}}|+|{\mathbf {k}}|\). Since \({\mathbb {E}} [\| X_{0}\|^{2|{\mathbf {k}}|}]<\infty\), Theorem 3.1 yields \(X_{t_{1}}^{\mathbf {j}}{\mathbb {E}}[X_{t_{2}}^{{\mathbf {k}}}\,|\,{\mathcal {F}}_{t_{1}}]=p(X_{t_{1}})\) for some polynomial \(p\in {\mathrm{Pol}}_{n}(E)\) whose coordinate representation \(\vec{p}\) only depends on \(G\). Since \({\mathbb {E}}[\|X_{0}\|^{2n}]<\infty\), another application of Theorem 3.1 yields

$$ {\mathbb {E}}[X_{t_{1}}^{\mathbf {j}}X_{t_{2}}^{\mathbf {k}}] = {\mathbb {E}}\big[ {\mathbb {E}}[p(X_{t_{1}})\,|\,{\mathcal {F}} _{0}] \big] = {\mathbb {E}}[H(X_{0})^{\top}\mathrm{e}^{t_{1}G} \vec{p} ]. $$

This proves the corollary. □

We next provide conditions under which \(X_{T}\) admits finite exponential moments. This result will be used in connection with proving uniqueness in Theorem 4.2 below, but is also of interest on its own for applications in finance. Its proof is given in Appendix C.

Theorem 3.3

If

$$ {\mathbb {E}}\big[ \mathrm{e}^{\delta\|X_{0}\|} \big] < \infty\qquad \textit{for some } \delta>0 $$
(3.3)

and the diffusion coefficient satisfies the linear growth condition

$$ \|a(x)\| \le C(1+\|x\|) \qquad\textit{for all }x\in E $$
(3.4)

for some constant \(C\), then for each \(t\ge0\), there exists \(\varepsilon >0\) with \({\mathbb {E}}[ \mathrm{e}^{\varepsilon\|X_{t}\|}] < \infty\).

4 Uniqueness

Throughout this section, we assume that \({\mathcal {G}}\) is polynomial on \(E\). We present three results regarding uniqueness in law for \(E\)-valued solutions to (2.2). Recall that this notion of uniqueness pertains to deterministic initial conditions, as defined under (b) in Sect. 2.

The first result relies on the fact that the joint moments of all finite-dimensional marginal distributions of a polynomial diffusion are uniquely determined by \({\mathcal {G}}\); see Corollary 3.2. Thus uniqueness in law follows if the finite-dimensional marginal distributions are the only ones with these moments. This property is known as determinacy in the literature on the moment problem, a classical topic in mathematics; references include Stieltjes [45], Akhiezer [3], Berg et al. [5], Schmüdgen [43], Stoyanov [46], Kleiber and Stoyanov [32] and many others.

Lemma 4.1

Let \(X\) be an \(E\)-valued solution to (2.2). If for each \(t\ge 0\), there exists \(\varepsilon>0\) with \({\mathbb {E}}[\exp(\varepsilon\| X_{t}\| )]<\infty\), then any \(E\)-valued solution to (2.2) with the same initial law as  \(X\) has the same law as \(X\). In particular, this holds if (3.3) and (3.4) are satisfied.

Proof

For any \(t\ge0\) and \(i\in\{1,\ldots,d\}\), the hypothesis yields \({\mathbb {E}}[\exp(\varepsilon|X_{i,t}|)]<\infty\) for some \(\varepsilon>0\). As a consequence, the moment-generating function of \(X_{i,t}\) exists and is analytic in \((-\varepsilon,\varepsilon)\), hence equal to its power series expansion, and thus determined by the moments of \(X_{i,t}\). By Curtiss [11, Theorem 1], the moment-generating function determines the law of \(X_{i,t}\), which thus satisfies the determinacy property. Now, according to Petersen [40, Theorem 3], determinacy of the (one-dimensional) marginals of a measure on \({\mathbb {R}}^{m}\) implies determinacy of the measure itself. It follows that determinacy holds for the law of each collection \((X_{t_{1}},\ldots,X_{t_{m}})\), \(0\le t_{1}<\cdots<t_{m}\). By Corollary 3.2, the corresponding moments are the same for any \(E\)-valued solution to (2.2) with the same initial law as \(X\). This proves the lemma. □

If \(X_{0}=x\) is deterministic, then (3.3) holds and Lemma 4.1 directly yields our first result.

Theorem 4.2

If the linear growth condition (3.4) is satisfied, then uniqueness in law for \(E\)-valued solutions to (2.2) holds.

Theorem 4.2 assumes the linear growth condition (3.4) to ensure existence of exponential moments. While valid for all affine diffusions, as well as when \(E\) is compact, this condition excludes some interesting examples, in particular geometric Brownian motion. Uniqueness for the geometric Brownian motion holds of course, and can be established via the Yamada–Watanabe pathwise uniqueness theorem for one-dimensional diffusions. Our second result records this fact.

Theorem 4.3

If the dimension is \(d=1\), then uniqueness in law for \(E\)-valued solutions to (2.2) holds.

Proof

Since \({\mathcal {G}}\) is polynomial, the drift \(b(x)\) in (2.2) is an affine function on \(E\), and the dispersion restricted to \(E\) is of the form \(\sigma(x) = \sqrt{ \alpha+ ax + Ax^{2}}\) for some real parameters \(\alpha, a, A\). Hence \(b(x)\) is Lipschitz-continuous, and \(\sigma(x)\) satisfies

$$ \big(\sigma(x)-\sigma(y)\big)^{2} \le\rho_{n}\left( |x-y|\right ),\quad \text{for all $x,y\in E$ with $|x|, |y|\le n$,} $$

where \(\rho_{n}(z)= (|a|+2n|A|) z\), for any \(n\ge1\). A localization argument in conjunction with Rogers and Williams [42, Theorem V.40.1] shows that pathwise uniqueness holds for any \(E\)-valued solution to (2.2). This in turn implies uniqueness in law; see Rogers and Williams [42, Theorem V.17.1]. □

Our third result, in combination with Theorems 4.2 and 4.3, yields uniqueness in a wide range of cases that are encountered in applications. The setup is the following. We assume that any \(E\)-valued solution to (2.2) can be partitioned as \(X=(Y,Z)\), where \(Y\) is an autonomous \(m\)-dimensional diffusion with closed state space \(E_{Y}\subseteq{\mathbb {R}}^{m}\), \(Z\) is \(n\)-dimensional, and \(m+n=d\). That is, \((Y,Z)\) solves the stochastic differential equation

$$\begin{aligned} {\mathrm{d}} Y_{t} &= b_{Y}(Y_{t}) {\,\mathrm{d}} t + \sigma_{Y}(Y_{t}) {\,\mathrm{d}} W_{t} , \end{aligned}$$
(4.1)
$$\begin{aligned} {\mathrm{d}} Z_{t} &= b_{Z}(Y_{t},Z_{t}) {\,\mathrm{d}} t + \sigma_{Z}(Y_{t},Z_{t}) {\,\mathrm{d}} W_{t}, \end{aligned}$$
(4.2)

for polynomials \(b_{Y}:{\mathbb {R}}^{m}\to{\mathbb {R}}^{m}\) and \(b_{Z}:{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\to{\mathbb {R}} ^{n}\) of degree one, continuous maps \(\sigma_{Y}:{\mathbb {R}}^{m}\to{\mathbb {R}}^{m\times d}\) and \(\sigma _{Z}:{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\to{\mathbb {R}}^{n\times d}\), and where \(Y\) takes values in \(E_{Y}\). The proof of the following theorem is given in Appendix D.

Theorem 4.4

Assume that uniqueness in law for \(E_{Y}\)-valued solutions to (4.1) holds, and that \(\sigma_{Z}\) is locally Lipschitz in \(z\), locally in \(y\), on \(E\). That is, for each compact subset \(K\subseteq E\), there exists a constant \(\kappa\) such that for all \((y,z)\) and \((y,z')\) in \(K\),

$$ \| \sigma_{Z}(y,z) - \sigma_{Z}(y,z') \| \le\kappa\|z-z'\|. $$
(4.3)

Then uniqueness in law for \(E\)-valued solutions to (2.2) holds.

5 Existence and boundary attainment

In this section, we discuss existence of \(E\)-valued solutions to (2.2) and give conditions under which the boundary of the state space is attained. The results are stated and proved using some basic concepts from algebra and algebraic geometry. Appendix J provides a review of the required notions.

Existence of a solution to (2.2) with values in \({\mathbb {R}}^{d}\) is well known to hold under linear growth conditions; see for instance Ikeda and Watanabe [31, Theorem IV.2.4]. The problem at hand thus boils down to finding conditions under which a solution to (2.2) takes values in \(E\). This is a stochastic invariance problem. In Appendix A, we discuss necessary and sufficient conditions for nonnegativity of certain Itô processes, which is the basic tool we use for proving stochastic invariance.

We henceforth assume that the state space \(E\) is a basic closed semialgebraic set. Specifically, let \({\mathcal {P}}\) and \({\mathcal {Q}}\) be finite collections of polynomials on \({\mathbb {R}}^{d}\), and define

$$E = \{x\in M : p(x)\ge0 \text{ for all } p\in{\mathcal {P}}\}, $$

where

$$ M = \{x\in{\mathbb {R}}^{d} : q(x)=0 \text{ for all } q\in{\mathcal {Q}}\}. $$
(5.1)

In particular, if \({\mathcal {Q}}=\emptyset\) then \(M={\mathbb {R}}^{d}\). The following result provides simple necessary conditions for the invariance of \(E\) with respect to (2.2).

Theorem 5.1

Suppose there exists an \(E\)-valued solution to (2.2) with \(X_{0}=x\), for any \(x\in E\). Then

  (i) \(a\nabla p=0\) and \({\mathcal {G}}p\ge0\) on \(E\cap\{p=0\}\), for each \(p\in{\mathcal {P}}\);

  (ii) \(a\nabla q=0\) and \({\mathcal {G}}q=0\) on \(E\), for each \(q\in {\mathcal {Q}}\).

Proof

Pick any \(p\in{\mathcal {P}}\), \(x\in E\cap\{p=0\}\), and let \(X\) be a solution to (2.2) with \(X_{0}=x\). Then \(p(X_{t})=\int_{0}^{t}{\mathcal {G}}p(X_{s}){\,\mathrm{d}} s+\int_{0}^{t}\nabla p(X_{s})^{\top}\sigma(X_{s}){\,\mathrm{d}} W_{s}\) and \(p(X)\ge0\), so (i) follows by Lemma A.1(ii). To prove (ii) for \(q\in{\mathcal {Q}}\), simply apply the same argument to \(q\) and \(-q\). □

The necessary condition \(a\nabla p=0\) states, roughly speaking, that at any boundary point of the state space, there can be no diffusive fluctuations orthogonally to the boundary. The necessary condition \({\mathcal {G}}p\ge0\) can be interpreted as “inward-pointing adjusted drift” at the boundary. The following example shows that this cannot be replaced by a simple “inward-pointing drift” condition.

Example 5.2

Consider the bivariate process \((U,V)\) with dynamics

$$\begin{aligned} \textstyle\begin{array}{llll} {\mathrm{d}} U_{t} &= {\,\mathrm{d}} W_{1t}, &&\qquad\qquad\qquad U_{0}\in{\mathbb {R}},\\ {\mathrm{d}} V_{t} &= \alpha{\,\mathrm{d}} t + 2\sqrt{V_{t}}{\,\mathrm{d}} W_{2t}, &&\qquad\qquad\qquad V_{0}\in{\mathbb {R}}_{+}, \end{array}\displaystyle \end{aligned}$$

where \((W_{1},W_{2})\) is a Brownian motion and \(\alpha> 0\). In other words, \(U\) is a Brownian motion and \(V\) is an independent squared Bessel process. The state space is \({\mathbb {R}}\times{\mathbb {R}}_{+}\). Now consider the process \((X,Y)=(U,V-U^{2})\). Its dynamics is

$$\begin{aligned} {\mathrm{d}} X_{t} &= {\,\mathrm{d}} W_{1t}, \\ {\mathrm{d}} Y_{t} &= (\alpha-1) {\,\mathrm{d}} t - 2X_{t} {\,\mathrm{d}} W_{1t} + 2\sqrt{X_{t}^{2}+Y_{t}}\, {\,\mathrm{d}} W_{2t}, \end{aligned}$$

and its state space is \(E=\{(x,y)\in{\mathbb {R}}^{2}:x^{2}+y\ge0\}\), the epigraph of the function \(-x^{2}\). The drift of \((X,Y)\) is \(b(x,y)=(0,\alpha-1)\), which points out of the state space at every boundary point, provided \(\alpha<1\). Nonetheless, with \(p(x,y)=x^{2}+y\), a calculation yields \({\mathcal {G}}p(x,y)=\alpha>0\).
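
A short symbolic check of this example (our sketch, using sympy):

```python
# Sketch: verify Example 5.2 symbolically.  With p(x,y) = x^2 + y, the drift
# points outward when alpha < 1, yet Gp = alpha > 0 ("inward-pointing adjusted drift").
import sympy as sp

x, y, alpha = sp.symbols("x y alpha", real=True)

# Dispersion of (X, Y) read off from the dynamics, and a = sigma sigma^T:
sigma = sp.Matrix([[1,    0                  ],
                   [-2*x, 2*sp.sqrt(x**2 + y)]])
a = sigma * sigma.T
bvec = sp.Matrix([0, alpha - 1])

p = x**2 + y
grad = sp.Matrix([sp.diff(p, x), sp.diff(p, y)])
Gp = sp.Rational(1, 2) * (a * sp.hessian(p, (x, y))).trace() + (bvec.T * grad)[0]

print(sp.simplify(Gp))          # alpha
print((bvec.T * grad)[0])       # alpha - 1 < 0 at the boundary when alpha < 1
```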

As a converse to Theorem 5.1, we now give sufficient conditions for the existence of an \(E\)-valued solution to (2.2). The proof of the following theorem is given in Appendix E.

Theorem 5.3

Suppose \(E\) satisfies the following geometric and algebraic properties:

  (G1) \(\nabla r(x)\), \(r\in{\mathcal {Q}}\), are linearly independent for all \(x\in M\);

  (G2) the ideal generated by \({\mathcal {Q}}\cup\{p\}\) coincides with the vanishing ideal of \(M\cap\{ p=0\}\), i.e., we have \(({\mathcal {Q}}\cup\{p\})={\mathcal {I}}(M\cap\{p=0\})\), for each \(p\in{\mathcal {P}}\);

and the maps \(a\) and \(b\) satisfy

  (A0) \(a \in{\mathbb {S}}^{d}_{+}\) on \(E\);

  (A1) \(a \nabla p=0\) on \(M\cap\{p=0\}\) and \({\mathcal {G}}p>0\) on \(E\cap\{ p=0\}\), for each \(p\in{\mathcal {P}}\);

  (A2) \(a \nabla q=0\) and \({\mathcal {G}}q=0\) on \(M\), for each \(q\in {\mathcal {Q}}\).

Then \({\mathcal {G}}\) is polynomial on \(E\), and there exists a continuous map \(\sigma:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d\times d}\) with \(\sigma\sigma^{\top}=a\) on \(E\) and such that the stochastic differential equation (2.2) admits an \(E\)-valued solution \(X\) for any initial law of \(X_{0}\). This solution can be chosen so that it spends zero time in the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\). That is,

$$ \int_{0}^{t} {\boldsymbol{1}_{\{p(X_{s})=0\}}}{\,\mathrm{d}} s = 0 \qquad\textit{ for all } t\ge0 \textit{ and all } p\in{\mathcal {P}}. $$
(5.2)

Conditions (A1) and (A2) should be contrasted with the necessary conditions of Theorem 5.1. The latter are somewhat weaker, since they only make statements about \(a\) and \(b\) on \(E\) rather than \(M\), and since the inequality in Theorem 5.1(i) is weak. Theorem 5.3 can be generalized to allow a weak inequality in (A1), at the cost of allowing absorption of the process at the boundary. We do not consider this generalization here.

Condition (G1) implies that \(M\) is an algebraic submanifold in \({\mathbb {R}}^{d}\) of dimension \(d-|{\mathcal {Q}}|\). The least obvious condition is arguably (G2). The crucial implication of (G2) is that any polynomial \(f\) that vanishes on \(M\cap\{p=0\}\) has a representation \(f=h p\) on \(M\) for some polynomial \(h\). In conjunction with (A1), this implies that \(a(x)\nabla p(x)\) decays like \(p(x)\) as \(x\in E\) approaches the boundary set \(E\cap \{p=0\}\), for \(p\in{\mathcal {P}}\). This allows one to prove that the local time of \(p(X)\) at level zero vanishes, which makes Lemma A.1 applicable; see Appendix E for the details.

Condition (G2) is also the least straightforward to verify. We therefore present two sufficient conditions that are easier to check in concrete examples. The first condition is useful when \(M={\mathbb {R}}^{d}\), in which case each ideal appearing on the left-hand side in (G2) is generated by a single polynomial. This covers many interesting examples, yet yields conditions that are easy to verify in practice. A proof of the following result can be found in Bochnak et al. [6, Theorem 4.5.1].

Lemma 5.4

Let \(p\in{\mathrm{Pol}}({\mathbb {R}}^{d})\) be an irreducible polynomial and \({\mathcal {V}}(p)\) its zero set. Then \((p)={\mathcal {I}}({\mathcal {V}}(p))\) if and only if \(p\) changes sign on \({\mathbb {R}}^{d}\), that is, \(p(x)p(y)<0\) for some \(x,y\in{\mathbb {R}}^{d}\).

The second condition applies when the ideals generated by the families \({\mathcal {Q}}\cup\{p\}\) with \(p\in{\mathcal {P}}\) are prime and of the expected dimension \(d-1-|{\mathcal {Q}}|\).

Lemma 5.5

For \(p\in{\mathcal {P}}\), assume that the ideal \(({\mathcal {Q}}\cup\{p\} )\) is prime with dimension \(d-1-|{\mathcal {Q}}|\), and that there exists some \(x\in M\cap \{p=0\}\) such that the vectors \(\nabla r(x)\), \(r\in{\mathcal {Q}}\cup\{p\}\), are linearly independent. Then \(({\mathcal {Q}}\cup\{p\})={\mathcal {I}}(M\cap\{p=0\})\).

Proof

This follows directly from Bochnak et al. [6, Proposition 3.3.16]. □

Remark 5.6

Stochastic invariance problems have been studied by a number of authors; see Da Prato and Frankowska [12], Filipović et al. [22], among many others. The approach in these papers is to impose an “inward-pointing Stratonovich drift” condition. This breaks down for polynomial diffusions. Indeed, consider the squared Bessel process

$$ {\mathrm{d}} X_{t} = \alpha{\,\mathrm{d}} t + 2\sqrt{X_{t}} {\,\mathrm{d}} W_{t}, $$

which is an \({\mathbb {R}}_{+}\)-valued affine process for \(\alpha\ge0\). The stochastic integral cannot always be written in Stratonovich form, since \(\sqrt{X}\) fails to be a semimartingale for \(0<\alpha<1\). If nonetheless one formally computes the Stratonovich drift, one obtains \(\alpha-1\), suggesting that \(\alpha\ge1\) is needed for stochastic invariance of \({\mathbb {R}}_{+}\). However, it is well known that \(\alpha\ge0\) is the correct condition. Our approach is rather in the spirit of Da Prato and Frankowska [13] who however focus on stochastic invariance of closed convex sets.

Apart from existence, Theorem 5.3 asserts that \(X\) spends zero time in the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\), which roughly speaking correspond to boundary segments of the state space. It does not, however, tell us whether these sets are actually hit. The purpose of the following theorem is to give necessary and sufficient conditions for this to occur. The proof is given in Appendix F. The vector \(h\) of polynomials appearing in the theorem exists if (G2) and (A1) are satisfied.

Theorem 5.7

Let \(X\) be an \(E\)-valued solution to (2.2) satisfying (5.2). Consider \(p\in{\mathcal {P}}\) and let \(h\) be a vector of polynomials such that \(a \nabla p = h p\) on \(M\).

  (i) Assume there exists a neighborhood \(U\) of \(E\cap\{p=0\}\) such that

    $$ 2 {\mathcal {G}}p - h^{\top}\nabla p\ge0 \qquad\textit{on } E\cap U. $$

    Then \(p(X_{t})>0\) for all \(t>0\).

  (ii) Assume (G2) holds and

    $$ 2 {\mathcal {G}}p - h^{\top}\nabla p=0 \qquad\textit{on } M\cap\{p=0\}. $$

    Then \(p(X_{t})>0\) for all \(t>0\).

  (iii) Let \(\overline{x}\in E\cap\{p=0\}\) and assume

    $$ {\mathcal {G}}p(\overline{x})\ge0 \qquad\textit{and}\qquad2 {\mathcal {G}}p({\overline{x}}) - h({\overline{x}})^{\top}\nabla p({\overline{x}})< 0. $$

    Then for any \(T>0\), there exists \(\varepsilon>0\) such that if \(\| X_{0}-\overline{x}\|<\varepsilon\) almost surely, then \(p(X_{t})=0\) for some \(t\le T\) with positive probability.

As a simple example, we may apply Theorem 5.7 to the scalar square-root diffusion \(dX_{t}=(b+\beta X_{t}) dt+\sigma\sqrt{X_{t}} dB_{t}\) with parameters \(b,\sigma> 0\) and \(\beta<0\), and where \(B\) is a one-dimensional Brownian motion. In this case \(E={\mathbb {R}}_{+}\), and \({\mathcal {P}}\) consists of the single polynomial \(p(x)=x\). We have \(a(x)p'(x) = \sigma ^{2} x = \sigma^{2} p(x)\), so that \(h(x)\equiv\sigma^{2}\), and thus

$$ 2 {\mathcal {G}}p(x) - h(x)p'(x) = 2(b+\beta x) - \sigma^{2}. $$

It is well known that \(X_{t}>0\) for all \(t>0\) if and only if the Feller condition \(2b\ge\sigma^{2}\) holds. Theorem 5.7(iii) gives the necessity of the Feller condition. Theorem 5.7(i) and (ii) together give the sufficiency of the Feller condition. Indeed, if \(2b>\sigma^{2}\), then Theorem 5.7(i) applies, while the condition in Theorem 5.7(ii) is not satisfied. Theorem 5.7(ii) in turn applies when \(2b=\sigma^{2}\), while Theorem 5.7(i) does not.
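
In this example the polynomial \(h\) can be found by polynomial division, since \(a p'\) vanishes wherever \(p\) does. The following sketch (ours) carries this out symbolically and recovers \(2{\mathcal {G}}p - hp' = 2(b+\beta x)-\sigma^{2}\), whose value at the boundary point \(x=0\) is exactly the Feller quantity \(2b-\sigma^{2}\).

```python
# Sketch: obtain h from a*p' = h*p by polynomial division, then form 2*Gp - h*p'
# for the square-root diffusion; at x = 0 this equals 2b - sigma^2, i.e., the
# Feller quantity appearing in the Feller condition 2b >= sigma^2.
import sympy as sp

x, b, sigma = sp.symbols("x b sigma", positive=True)
beta = sp.Symbol("beta", real=True)

p = x                                   # boundary polynomial of E = R_+
a = sigma**2 * x                        # diffusion function a(x)
drift = b + beta * x                    # drift b(x)

h, rem = sp.div(a * sp.diff(p, x), p, x)        # a*p' = h*p + rem
assert rem == 0
Gp = sp.Rational(1, 2) * a * sp.diff(p, x, 2) + drift * sp.diff(p, x)
expr = sp.expand(2 * Gp - h * sp.diff(p, x))
print(expr)                              # 2*b + 2*beta*x - sigma**2
print(expr.subs(x, 0))                   # 2*b - sigma**2
```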

6 Examples of semialgebraic state spaces

We now discuss examples of semialgebraic state spaces of interest, where our results are applicable.

6.1 Some quadric sets

Let \(Q\in{\mathbb {S}}^{d}\) be nonsingular, and consider the state space \(E=\{x\in{\mathbb {R}}^{d}:x^{\top}Qx\le1\}\). Here \({\mathcal {P}}\) consists of the single polynomial \(p(x)=1-x^{\top}Qx\), and \(M={\mathbb {R}}^{d}\). After a linear change of coordinates, we may assume \(Q\) is diagonal with \(Q_{ii}\in\{+1,-1\}\). We also suppose \(Q_{ii}=1\) for at least some \(i\), since otherwise \(E={\mathbb {R}}^{d}\). State spaces of this type include the closed unit ball, but also non-convex sets like \(\{x\in{\mathbb {R}}^{2}:x_{1}^{2}-x_{2}^{2}\le1\}\), whose boundary is a hyperbola. One can also consider complements of such sets; see Remark 6.3 below. One interesting aspect of the state spaces investigated here is that they do not admit non-deterministic affine diffusions; this follows directly from Proposition 6.1 below, which shows that \(a\) is either quadratic or identically zero. This is in contrast to the parabolic state spaces considered by Spreij and Veerman [44].

The following convex cone of polynomial maps plays a key role. Recall that a polynomial \(r\in{\mathrm{Pol}}({\mathbb {R}}^{d})\) is called homogeneous of degree \(k\) if \(r(sx)=s^{k}r(x)\) for all \(x\in{\mathbb {R}}^{d}\) and \(s>0\). We define

$$ {\mathcal {C}}^{Q}_{+} = \left\{c:{\mathbb {R}}^{d}\to{\mathbb {S}}^{d}_{+}: \textstyle\begin{array}{l} c_{ij}\in{\mathrm{Pol}}_{2}({\mathbb {R}}^{d}) \text{ is homogeneous of degree 2 for all }i,j\\ \text{and } c(x)Qx=0\text{ for all }x\in{\mathbb {R}}^{d} \end{array}\displaystyle \right\}. $$

Note that the condition \(c(x)Qx=0\) is equivalent to \(c(x)\nabla p(x)=0\), meaning that all eigenvectors of \(c(x)\) with nonzero eigenvalues are orthogonal to \(\nabla p(x)\). The proof of the following proposition is given in Appendix G.

Proposition 6.1

Conditions (G1) and (G2) hold for the above state space  \(E\). Moreover, the operator  \({\mathcal {G}}\) satisfies (A0)–(A2) if and only if

$$\begin{aligned} a(x) &= (1-x^{\top}Qx)\alpha+ c(x), \\ b(x) &= \beta+ Bx \end{aligned}$$
(6.1)

for some \(\alpha\in{\mathbb {S}}^{d}_{+}\), \(\beta\in{\mathbb {R}}^{d}\), \(B\in {\mathbb {R}}^{d\times d}\) and \(c\in{\mathcal {C}}^{Q}_{+}\) such that

$$ \beta^{\top}Qx + x^{\top}B^{\top}Qx + \frac{1}{2}\operatorname{Tr}\big(c(x)Q\big) < 0 \qquad\textit{for all } x\in\{p=0\}. $$
(6.2)

Remark 6.2

If \(c(x)\) satisfies the linear growth condition \(\|c(x)\| \le C(1+\|x\| )\) for all \(x\in E\), then \(a(x)\) satisfies (3.4) and uniqueness in law for \(E\)-valued solutions to (2.2) holds by Theorem 4.2. In particular, this holds if \(Q\) is positive definite, i.e., \(Q={\mathrm{Id}}\), so that \(E\) is the unit ball and hence compact.

Remark 6.3

The conditions of Proposition 6.1 can easily be modified to cover state spaces of the form \(E=\{x\in{\mathbb {R}}^{d}:x^{\top}Qx\ge1\} \). This amounts to replacing \(p\) by \(-p\) above, and includes for example the complement of the open unit ball. With this modification, Proposition 6.1 is still true as stated, except that \(-\alpha\) should lie in \({\mathbb {S}}^{d}_{+}\), and the inequality in (6.2) should be reversed.

A question that is not addressed by Proposition 6.1 is how to describe the set \({\mathcal {C}}^{Q}_{+}\) in more explicit terms. We now provide a class of maps \(c\in{\mathcal {C}}^{Q}_{+}\), which yields a large family of polynomial diffusions on \(E\) that we expect to be useful in applications.

Let \(S_{k}\), \(k=1,\ldots,d(d-1)/2\), be a basis for the linear space of skew-symmetric \(d\times d\) matrices. Using the skew-symmetry of the \(S_{k}\) together with the fact that \(Q^{2}={\mathrm{Id}}\), it is easy to check that any map \(c\) of the form

$$ c(x) = \sum_{k, \ell=1}^{d(d-1)/2} \gamma_{k\ell} QS_{k} xx^{\top}S_{\ell}^{\top}Q, $$
(6.3)

where \(\varGamma=(\gamma_{k\ell})\in{\mathbb {S}}_{+}^{d(d-1)/2}\), lies in \({\mathcal {C}}^{Q}_{+}\). For any \(c(x)\) of the form (6.3), condition (6.2) then becomes

$$ \beta^{\top}Qx + x^{\top}\bigg( B^{\top}Q + \frac{1}{2}\sum_{k,\ell} \gamma _{k\ell} S_{k}^{\top}Q S_{\ell}\bigg) x < 0 \qquad\text{for all } x\in\{p=0\}. $$
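
The following numerical sketch (ours) builds \(c(x)\) as in (6.3) for \(d=3\) with \(Q={\mathrm{Id}}\), i.e., the unit ball, and checks at random points that \(c(x)\) is positive semidefinite and satisfies \(c(x)Qx=0\).

```python
# Sketch: construct c(x) = sum_{k,l} gamma_{kl} Q S_k x x^T S_l^T Q from a basis
# of skew-symmetric matrices and check c(x) in C^Q_+ numerically (here Q = Id, d = 3).
import numpy as np

rng = np.random.default_rng(0)
d = 3
Q = np.eye(d)

# Basis of skew-symmetric d x d matrices: E_{ij} - E_{ji}, i < j.
S = []
for i in range(d):
    for j in range(i + 1, d):
        M = np.zeros((d, d))
        M[i, j], M[j, i] = 1.0, -1.0
        S.append(M)
K = len(S)                                   # = d(d-1)/2

A = rng.standard_normal((K, K))
Gamma = A @ A.T                              # a positive semidefinite weight matrix

def c(x):
    V = np.column_stack([Q @ Sk @ x for Sk in S])    # columns Q S_k x
    return V @ Gamma @ V.T

for _ in range(5):
    x = rng.standard_normal(d)
    cx = c(x)
    assert np.allclose(cx @ Q @ x, 0.0)              # c(x) Q x = 0
    assert np.min(np.linalg.eigvalsh(cx)) > -1e-10   # c(x) positive semidefinite
```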

6.2 The product space \([0,1]^{m}\times{\mathbb {R}}^{n}_{+}\)

Consider the state space \(E=[0,1]^{m}\times{\mathbb {R}}^{n}_{+}\). Here \(d=m+n\), and the generating family of polynomials can be taken to be

$$ {\mathcal {P}}=\{ x_{i}:i=1,\ldots,m+n; 1-x_{i}: i=1,\ldots,m\}. $$

To simplify notation, introduce index sets \(I=\{1,\ldots,m\}\) and \(J {=} \{m {+} 1,\ldots,m {+} n\}\), and write \(x_{I}\) (resp. \(x_{J}\)) for the subvector of \(x\in{\mathbb {R}}^{d}\) consisting of the components with indices in \(I\) (resp. \(J\)). Similarly, for a matrix \(A\in{\mathbb {R}}^{d\times d}\), we write \(A_{II}\), \(A_{IJ}\), etc. for the submatrices with indicated row and column indices. The proof of the following proposition is given in Appendix H.

Proposition 6.4

Conditions (G1) and (G2) hold for the above state space  \(E\). Moreover, the operator  \({\mathcal {G}}\) satisfies (A0)–(A2) if and only if

  (i) the matrix \(a\) is given by

    $$\begin{aligned} \textstyle\begin{array}{llll} a_{ii}(x) &= \gamma_{i} x_{i}(1-x_{i}) &&\quad (i\in I),\\ a_{ij}(x) &=0 &&\quad (i\in I,\ j\in I\cup J,\ i\ne j),\\ a_{jj}(x) &= \alpha_{jj}x_{j}^{2} + x_{j} \big(\phi_{j} + \psi_{(j)}^{\top}x_{I} + \pi_{(j)}^{\top}x_{J}\big) &&\quad (j\in J),\\ a_{ij}(x) &= \alpha_{ij}x_{i}x_{j} &&\quad (i,j\in J,\ i\ne j) \end{array}\displaystyle \end{aligned}$$

    for some \(\gamma\in{\mathbb {R}}^{m}_{+}\), some \(\psi_{(j)}\in{\mathbb {R}}^{m}\), some \(\pi _{(j)}\in {\mathbb {R}}^{n}_{+}\) with \(\pi_{(j),j}=0\), some \(\phi\in{\mathbb {R}}^{n}\) with \(\phi_{j}\ge(\psi _{(j)}^{-})^{\top}{\mathbf{1}}\), and some \(\alpha=(\alpha_{ij})_{i,j\in J}\in{\mathbb {S}}^{n}\) such that we have \(\alpha+\operatorname{Diag}(\varPi^{\top}x_{J})\operatorname {Diag}(x_{J})^{-1}\in{\mathbb {S}}^{n}_{+}\) for all \(x_{J}\in{\mathbb {R}}^{n}_{++}\), where \(\varPi\in{\mathbb {R}}^{n\times n}\) is the matrix with columns \(\pi_{(j)}\);

  (ii) the vector \(b\) is given by

    $$ b(x) = \left( \textstyle\begin{array}{lllll} \beta_{I} &+& B_{II} x_{I} \\ \beta_{J} &+& B_{JI}x_{I} &+& B_{JJ}x_{J} \end{array}\displaystyle \right) $$
    (6.4)

    for some \(\beta\in{\mathbb {R}}^{d}\) and \(B\in{\mathbb {R}}^{d\times d}\) such that \((B^{-}_{i,I\setminus\{i\}}){\mathbf{1}}<\beta_{i}< - B_{ii}-(B^{+}_{i,I\setminus\{ i\} }){\mathbf{1}}\) for all \(i\in I\), \(\beta_{j}> (B^{-}_{jI}){\mathbf{1}}\) for all \(j\in J\), and \(B_{JJ}\in{\mathbb {R}}^{n\times n}\) has nonnegative off-diagonal entries.

Remark 6.5

In the following two cases, we get uniqueness in law of \(E\)-valued solutions to (2.2); cf. Theorem 4.2. First, if \(\alpha=0\) and \(\pi_{(j)}=0\) for all \(j\), then the linear growth condition (3.4) is satisfied and uniqueness follows by Theorem 4.2. Second, if \(\psi_{(j)}=0\) and \(\pi_{(j)}=0\) for all \(j\) and \(\phi=0\), then the submatrix \(a_{JJ}(x)\) only depends on \(x_{J}\) and can be written \(a_{JJ}=\sigma_{JJ}\sigma_{JJ}^{\top}\), where \(\sigma _{JJ}(x_{J})=\operatorname{Diag} (x_{J})\alpha^{1/2}\) is Lipschitz-continuous. Since also \(X_{I}\) is an autonomous \(m\)-dimensional diffusion on \([0,1]^{m}\), uniqueness follows from Theorem 4.4 in conjunction with Theorem 4.2. Note that \(X_{I}\) and \(X_{J}\) are coupled only through the drift in this case.
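
As a concrete sanity check (our sketch, with arbitrarily chosen parameter values that satisfy the constraints of Proposition 6.4), one can build \(a\) and \(b\) for \(m=n=1\), i.e., \(E=[0,1]\times{\mathbb {R}}_{+}\), and verify numerically that \(a\) is positive semidefinite on sample points of \(E\) and that \({\mathcal {G}}p>0\) on each boundary segment \(\{p=0\}\), \(p\in{\mathcal {P}}\).

```python
# Sketch (ours, with illustrative parameter values) for E = [0,1] x R_+ (m = n = 1):
# build a and b as in Proposition 6.4 and check a >= 0 on a grid and Gp > 0 on each
# boundary segment {p = 0}, p in {x1, 1 - x1, x2}.
import numpy as np

gamma, alph, phi, psi = 1.0, 0.1, 0.2, -0.1        # phi >= max(-psi, 0), alph >= 0
beta1, B11 = 0.3, -0.7                              # 0 < beta1 < -B11
beta2, B21, B22 = 0.5, -0.2, -1.0                   # beta2 > max(-B21, 0)

def a(x1, x2):
    return np.array([[gamma * x1 * (1 - x1), 0.0],
                     [0.0, alph * x2**2 + x2 * (phi + psi * x1)]])

def b(x1, x2):
    return np.array([beta1 + B11 * x1,
                     beta2 + B21 * x1 + B22 * x2])

def G(p_grad, p_hess, x1, x2):
    """Gp = 1/2 Tr(a Hess p) + b . grad p for a polynomial p."""
    return 0.5 * np.trace(a(x1, x2) @ p_hess) + b(x1, x2) @ p_grad

grid = [(u, v) for u in np.linspace(0, 1, 11) for v in np.linspace(0, 5, 11)]
for x1, x2 in grid:
    assert np.min(np.linalg.eigvalsh(a(x1, x2))) > -1e-12          # (A0)
# boundary checks (A1): gradients and Hessians of x1, 1 - x1, x2 are constant
zero2 = np.zeros((2, 2))
for x2 in np.linspace(0, 5, 11):
    assert G(np.array([1.0, 0.0]), zero2, 0.0, x2) > 0             # p = x1
    assert G(np.array([-1.0, 0.0]), zero2, 1.0, x2) > 0            # p = 1 - x1
for x1 in np.linspace(0, 1, 11):
    assert G(np.array([0.0, 1.0]), zero2, x1, 0.0) > 0             # p = x2
```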

A natural next step is to consider the state space \([0,1]^{m}\times {\mathbb {R}} ^{n}_{+}\times{\mathbb {R}}^{\ell}\), \(d=m+n+\ell\). In this case, one readily continues the above argument to deduce that the diffusion matrix is of the form

$$ a(x) = \left( \textstyle\begin{array}{lll} a_{II}(x_{I}) & 0 & a_{IK}(x_{I}) \\ 0 & a_{JJ}(x_{I},x_{J}) & a_{JK}(x_{I},x_{J}) \\ a_{IK}(x_{I})^{\top}& a_{JK}(x_{I},x_{J})^{\top}& a_{KK}(x_{I},x_{J},x_{K}) \end{array}\displaystyle \right), $$

where \(K=\{m+n+1,\ldots,d\}\), the matrices \(a_{II}\) and \(a_{JJ}\) are given by Proposition 6.4(i), we have \(a_{IK}(x_{I})=\operatorname{Diag} (x_{I})({\mathrm{Id}}- \operatorname{Diag}(x_{I})){\mathrm {P}}\) for some \({\mathrm {P}}\in{\mathbb {R}}^{m\times l}\) and \(a_{JK}(x_{I},x_{J})= \operatorname{Diag}(x_{J}){\mathrm {H}}(x_{I},x_{J})\) for some matrix \({\mathrm {H}}\) of polynomials in \({\mathrm{Pol}}_{1}(E)\), and \(a_{KK}\) has component functions in \({\mathrm{Pol}}_{2}(E)\). Regarding the drift vector \(b {=} (b_{I},b_{J},b_{K})\), the last part \(b_{K}\) is unrestricted within the class of affine functions of \(x\), whereas \((b_{I},b_{J})\) must satisfy Proposition 6.4(ii). With this structure, we have (A0)–(A2) if and only if \(a\in{\mathbb {S}}^{d}_{+}\) on \(E\). This of course imposes additional restrictions on \({\mathrm {P}}\), \({\mathrm {H}}\) and \(a_{KK}\). Stating these restrictions explicitly is cumbersome, and we refrain from doing so here.

6.3 The unit simplex

Let \(d\ge2\) and consider the unit simplex \(E=\{x\in{\mathbb {R}}^{d}_{+}: x_{1}+\cdots +x_{d}=1\}\). Here \({\mathcal {P}}=\{x_{i}:i=1,\ldots,d\}\) consists of the coordinate functions and \({\mathcal {Q}}\) consists of the single polynomial \(1-{\mathbf{1}} ^{\top}x\). The proof of the following proposition is given in Appendix I.

Proposition 6.6

Conditions (G1) and (G2) hold for the above state space  \(E\). Moreover, the operator  \({\mathcal {G}}\) satisfies (A0)–(A2) if and only if

  (i) the matrix \(a\) is given by

    $$\begin{aligned} a_{ii}(x) &= \sum_{j\ne i}\alpha_{ij}x_{i}x_{j}, \\ a_{ij}(x) &= -\alpha_{ij}x_{i}x_{j} \qquad\qquad(i\ne j) \end{aligned}$$

    on \(E\) for some \(\alpha_{ij}\in{\mathbb {R}}_{+}\) such that \(\alpha _{ij}=\alpha _{ji}\) for all \(i,j\);

  (ii) the vector \(b\) is given by

    $$ b(x)=\beta+Bx, $$

    where \(\beta\in{\mathbb {R}}^{d}\) and \(B\in{\mathbb {R}}^{d\times d}\) satisfy \(B^{\top}{\mathbf{1}}+ (\beta^{\top}{\mathbf{1}}){\mathbf{1}}= 0\) and \(\beta_{i}+B_{ij} > 0\) for all \(i\) and all \(j\ne i\).

Remark 6.7

Since \(E\) is compact, Theorem 4.2 yields uniqueness in law for \(E\)-valued solutions to (2.2).

Remark 6.8

In the special case where \(\alpha_{ij}=\sigma^{2}\) for some \(\sigma>0\) and all \(i,j\), the diffusion matrix takes the form

$$\begin{aligned} a_{ii}(x) &= \sigma^{2}x_{i}(1-x_{i}), \\ a_{ij}(x) &= -\sigma^{2}x_{i}x_{j} \qquad(i\ne j). \end{aligned}$$

The resulting process is sometimes called a multivariate Jacobi process; see for instance Gouriéroux and Jasiak [27].
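
For illustration, here is a crude Euler-type simulation sketch (ours, not a scheme proposed in the paper; as noted in the introduction, simulation near the boundary is delicate). The drift \(b(x)=\kappa(\theta-x)\) satisfies the conditions of Proposition 6.6(ii), and the dispersion matrix used below squares to the Jacobi diffusion matrix above; this particular square root of \(a\) is our choice, not taken from the paper.

```python
# Crude Euler sketch (ours) of a multivariate Jacobi diffusion on the unit simplex.
# Drift b(x) = kappa*(theta - x); dispersion sigma_vol*(Diag(sqrt(x)) - x sqrt(x)^T)
# satisfies sigma sigma^T = sigma_vol^2 * (Diag(x) - x x^T) on the simplex.
import numpy as np

rng = np.random.default_rng(1)
d, kappa, sigma_vol = 3, 2.0, 0.5
theta = np.array([0.5, 0.3, 0.2])            # interior point of the simplex

def dispersion(x):
    s = np.sqrt(np.clip(x, 0.0, None))
    return sigma_vol * (np.diag(s) - np.outer(x, s))

T, n_steps = 1.0, 2000
dt = T / n_steps
x = np.array([1/3, 1/3, 1/3])
for _ in range(n_steps):
    dw = rng.standard_normal(d) * np.sqrt(dt)
    x = x + kappa * (theta - x) * dt + dispersion(x) @ dw
    x = np.clip(x, 0.0, None)                # crude fix of discretization overshoot
    x = x / x.sum()                          # project back onto the simplex
print(x)                                     # one simulated value of X_T
```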

Remark 6.9

Alternatively, one can establish Proposition 6.6 by considering polynomial diffusions \(Y\) on the “solid” simplex \(\{y\in {\mathbb {R}} ^{d-1}_{+}:y_{1}+\cdots+y_{d-1}\le1\}\), and then set \(X=(X_{1},\ldots ,X_{d})=(Y,1-Y_{1}-\cdots-Y_{d-1})\). In this case \({\mathcal {Q}}=\emptyset \), and it would be enough to invoke Lemma 5.4 rather than Lemma 5.5.

7 Polynomial diffusion models in finance

We now elaborate on various polynomial diffusion models in finance, following up on the discussion of (1.1) in the introduction. Let the state price density \(\zeta\) be a positive semimartingale on a filtered probability space \((\varOmega,{\mathcal {F}},{\mathcal {F}}_{t},{\mathbb {P}})\). This induces an arbitrage-free financial market model on any finite time horizon \(T^{\ast}\). Indeed, let \(S^{1},\dots,S^{m}\) denote the price processes of \(m\) fundamental assets. According to (1.1), we have \(\zeta_{t} S^{i}_{t} = {\mathbb {E}}[ \zeta_{T^{\ast}} S^{i}_{T^{\ast}} \,|\, {\mathcal {F}}_{t} ]\). Assuming that \(S^{1}\) is positive, we choose it as numeraire. We then define an equivalent measure \({\mathbb {Q}}^{1}\sim{\mathbb {P}}\) on \({\mathcal {F}}_{T^{\ast}}\) by

$$ \frac{{\,\mathrm{d}}{\mathbb {Q}}^{1}}{{\,\mathrm{d}}{\mathbb {P}}} = \frac{\zeta_{T^{\ast}} {S^{1}_{T^{\ast}}}}{\zeta_{0} S^{1}_{0}}. $$

Discounted price processes \(\frac{S^{i}}{ S^{1} }\) are \({\mathbb {Q}} ^{1}\)-martingales, because

$$ \frac{S^{i}_{t}}{ S^{1}_{t} }\frac{{\,\mathrm{d}}{\mathbb {Q}}^{1}}{{\,\mathrm{d}}{\mathbb {P}}}\bigg|_{{\mathcal {F}}_{t}} = \frac {S^{i}_{t}}{ S^{1}_{t}} \frac{\zeta_{t} S^{1}_{t}}{\zeta_{0} S^{1}_{0}}= \frac {\zeta _{t} S^{i}_{t}}{\zeta_{0} S^{1}_{0}}. $$

This implies that the market \(\{S^{1},\dots,S^{m}\}\) is arbitrage-free in the sense of no free lunch with vanishing risk; see Delbaen and Schachermayer [14].

Now let \(X\) be a polynomial diffusion on a state space \(E\subseteq {\mathbb {R}} ^{d}\). Fix \(n\in{\mathbb {N}}\), and let \(p\in{\mathrm{Pol}}_{n}(E)\) be a positive polynomial on \(E\) with coordinate representation \(\vec{p}\) with respect to some basis \(H(x)=(h_{1}(x),\ldots,h_{N}(x))^{\top}\) for \({\mathrm{Pol}}_{n}(E)\). The state price density is specified by \(\zeta_{t} = \mathrm{e}^{-\alpha t} p(X_{t})\), where \(\alpha \) is a real parameter. This setup yields an arbitrage-free model for the term structure of interest rates. The time \(t\) price \(P(t,T)\) of a zero coupon bond maturing at \(T\), corresponding to \(C_{T}=1\) in (1.1), can now be computed explicitly, using Theorem 3.1, as

$$ P(t,T) = \mathrm{e}^{-\alpha(T-t)} \frac{{\mathbb {E}}[p(X_{T})\,|\, {\mathcal {F}} _{t}]}{p(X_{t})} = \mathrm{e} ^{-\alpha(T-t)} \frac{H(X_{t})^{\top}\mathrm{e}^{(T-t)G} \vec{p}}{H(X_{t})^{\top}\vec{p}}, $$

where \(G\in{\mathbb {R}}^{N\times N}\) is the matrix representation of \({\mathcal {G}}\) on \({\mathrm{Pol}}_{n}(E)\). The short rate is obtained via the relation \(r_{t}=-\partial _{T}\log P(t,T)\,|\,_{T=t}\), and is given by

$$ r_{t} = \alpha- \frac{H(X_{t})^{\top}G \vec{p} }{H(X_{t})^{\top}\vec{p}}. $$

This expression clarifies the role of the parameter \(\alpha\) in adjusting the level of interest rates. Such models show great potential. The linear case with \(p\) of the form \(p(x)=\phi+\psi^{\top}x\) has been studied in Filipović et al. [21], including an extensive empirical assessment. The parameter \(\psi\) is chosen such that \(E\) lies in the positive cone \(\{ x\in{\mathbb {R}}^{d}: \psi^{\top}x\ge0\} \). A specific example is \(E={\mathbb {R}}^{d}_{+}\), as discussed in Sect. 6.2.
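
A small numerical sketch (ours, with illustrative parameters) of these two formulas: the factor is a one-dimensional diffusion on \(E=[0,1]\) with drift \(\kappa(\theta-x)\), the basis is \(H(x)=(1,x)^{\top}\) for \({\mathrm{Pol}}_{1}(E)\), and \(p(x)=\phi+\psi x\) is positive on \(E\). Note that the diffusion coefficient does not enter \(G\) on \({\mathrm{Pol}}_{1}(E)\), since second derivatives of affine functions vanish.

```python
# Sketch (ours): zero-coupon bond prices and the short rate in a linear-rational
# model with a one-dimensional factor on E = [0,1], basis H(x) = (1, x).
import numpy as np
from scipy.linalg import expm

kappa, theta = 1.5, 0.6                 # drift b(x) = kappa*(theta - x)
alpha_rate = 0.05                       # the parameter alpha in zeta_t = e^{-alpha t} p(X_t)
phi, psi = 1.0, 0.8                     # p(x) = phi + psi*x > 0 on [0,1]
p_vec = np.array([phi, psi])

# Matrix of G on Pol_1(E):  G1 = 0,  Gx = kappa*theta*1 - kappa*x.
G = np.array([[0.0, kappa * theta],
              [0.0, -kappa       ]])

def H(x):
    return np.array([1.0, x])

def bond_price(x, tau):
    """P(t, t + tau) given X_t = x."""
    return np.exp(-alpha_rate * tau) * (H(x) @ expm(tau * G) @ p_vec) / (H(x) @ p_vec)

def short_rate(x):
    return alpha_rate - (H(x) @ G @ p_vec) / (H(x) @ p_vec)

x = 0.3
print([bond_price(x, tau) for tau in (1.0, 5.0, 10.0)])
print(short_rate(x))
```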

One attractive feature of the polynomial framework is that it yields efficient pricing formulae for options on coupon bearing bonds. This includes swaptions, which are among the most important interest rate options. The generic payoff of such an option at expiry date \(T\) is of the form

$$ C_{T}=\big(c_{0}+ c_{1} P(T,T_{1})+\cdots+ c_{m} P(T,T_{m})\big)^{+} $$

for maturity dates \(T< T_{1}<\cdots<T_{m}\) and deterministic coefficients \(c_{0},\dots,c_{m}\). Formula (1.1) for the time \(t\) price of this option boils down to computing the \({\mathcal {F}}_{t}\)-conditional expectation of

$$ \zeta_{T} C_{T}=\left(H(X_{T})^{\top}\sum_{i=0}^{m} c_{i} \mathrm{e}^{-\alpha T_{i}} \mathrm{e} ^{(T_{i}-T) G} \vec{p}\right)^{+}, $$

which, with the convention \(T_{0}=T\), is the positive part of a polynomial in \(X_{T}\). Efficient methods involving the closed-form \({\mathcal {F}}_{t}\)-conditional moments of \(X_{T}\) are available; see Filipović et al. [20].
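
The coefficient vector of the polynomial inside the positive part can be assembled directly; a sketch (ours, reusing the toy interest rate model from the previous sketch, with an illustrative receiver-type coefficient choice):

```python
# Sketch (ours): coefficient vector w of the polynomial inside the positive part,
# zeta_T * C_T = ( H(X_T)^T w )^+, with w = sum_i c_i e^{-alpha T_i} e^{(T_i - T)G} p_vec.
import numpy as np
from scipy.linalg import expm

kappa, theta, alpha_rate = 1.5, 0.6, 0.05
p_vec = np.array([1.0, 0.8])
G = np.array([[0.0, kappa * theta], [0.0, -kappa]])

T = 1.0
T_i = np.array([T, 2.0, 3.0, 4.0, 5.0])          # T_0 = T by convention, then T_1 < ... < T_m
c_i = np.array([-1.0, 0.05, 0.05, 0.05, 1.05])   # e.g. a receiver-type payoff with a 5% fixed rate

w = sum(c * np.exp(-alpha_rate * Ti) * expm((Ti - T) * G) @ p_vec
        for c, Ti in zip(c_i, T_i))

def payoff(x_T):
    H = np.array([1.0, x_T])
    return max(H @ w, 0.0)

print(payoff(0.3))
```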

Polynomial diffusions can be employed in a similar way to build stochastic volatility models. We now interpret ℙ as risk-neutral measure, and specify the spot variance (squared volatility) of an underlying stock index by \(v_{t}=p(X_{t})\). The variance swap rate for period \([t,T]\) is then given in closed form by

$$ \mathrm{VS}(t,T) = \frac{1}{T-t}{\mathbb {E}}\left[ \int_{t}^{T} v_{s} {\,\mathrm{d}} s \, \bigg|\, {\mathcal {F}}_{t}\right] = \frac{1}{T-t} H(X_{t})^{\top}\left( \int_{t}^{T} \mathrm{e}^{(s-t)G} {\,\mathrm{d}} s\right) \vec{p}. $$

Such models have been successfully employed in Filipović et al. [23] and Ackerer et al. [2]. Both papers consider the quadratic case, which falls into the setup of Sect. 6.1, with a quadric state space \(E=\{x\in{\mathbb {R}}^{d}:x^{\top}Qx\le1\}\) and spot variance \(v_{t}=p(X_{t})\) for a polynomial \(p\) of the form \(p(x)=\phi+1-x^{\top}Q x\), where \(\phi\ge0\) denotes the minimal spot variance. While Filipović et al. [23] study unbounded state spaces, Ackerer et al. [2] focus on the compact case, where \(Q\) is positive definite. They derive analytic option pricing formulae in terms of Hermite polynomials for European call and put options on an asset with diffusive price process \({\mathrm{d}} S_{t} = S_{t} r {\,\mathrm{d}} t + S_{t} \sqrt{v_{t}} {\,\mathrm{d}} W^{\ast}_{t}\), where \(r\) denotes the constant short rate and \(W^{\ast}\) is a Brownian motion, which is possibly correlated with \(W\) in (2.2).
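
The matrix integral \(\int_{t}^{T}\mathrm{e}^{(s-t)G}{\,\mathrm{d}} s\) can be evaluated exactly with one block matrix exponential (Van Loan's trick); a sketch (ours, with the toy generator matrix from the earlier square-root example):

```python
# Sketch (ours): VS(t,T) = 1/(T-t) * H(X_t)^T ( int_0^{T-t} e^{sG} ds ) p_vec,
# with the integral obtained from one block matrix exponential:
# expm(tau * [[G, I], [0, 0]]) = [[e^{tau G}, int_0^tau e^{sG} ds], [0, I]].
import numpy as np
from scipy.linalg import expm

N = 3
G = np.array([[0.0, 0.06, 0.0   ],         # toy generator on Pol_2 for the square-root
              [0.0, -0.5, 0.1825],         # example above (b = 0.06, beta = -0.5,
              [0.0, 0.0,  -1.0  ]])        # sigma = 0.25); illustrative only
p_vec = np.array([0.0, 1.0, 0.0])          # spot variance v_t = p(X_t) with p(x) = x

def integrated_expm(G, tau):
    M = np.zeros((2 * N, 2 * N))
    M[:N, :N] = G
    M[:N, N:] = np.eye(N)
    return expm(tau * M)[:N, N:]           # = int_0^tau e^{sG} ds

def variance_swap_rate(x, tau):
    H = np.array([1.0, x, x**2])
    return (H @ integrated_expm(G, tau) @ p_vec) / tau

print(variance_swap_rate(0.04, 2.0))
```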

An application of the unit simplex in Sect. 6.3 is obtained as follows. Consider a stock index, such as the S&P 500, whose price process is given by a semimartingale \(Z\). As above, we interpret ℙ as risk-neutral measure and assume a constant short rate \(r\) such that \((\mathrm{e}^{-rt}Z_{t})\) is a martingale. Let \(d\) be the number of constituent stocks, and let \(X\) be a polynomial diffusion on \(E=\{x\in{\mathbb {R}}^{d}_{+}: x_{1}+\cdots+x_{d}=1\}\) which is independent of \(Z\). We fix a finite time horizon \(T^{\ast}\) and define the \(E\)-valued martingale, for \(t\le T^{\ast}\),

$$ Y_{t} = {\mathbb {E}}[X_{T^{\ast}}\,|\,{\mathcal {F}}_{t}]. $$

Since \(X\) is polynomial, \(Y_{t}\) is a first degree polynomial in \(X_{t}\) whose coefficients can be determined by an application of Theorem 3.1. Specifically, with \(\beta\) and \(B\) being the drift parameters of \(X\) as given in Proposition 6.6, one finds

$$ Y_{t} = \varPhi(T^{\ast}-t) + \varPsi(T^{\ast}-t) X_{t} \qquad \mbox{with } \varPhi (\tau)=\int_{0}^{\tau} \mathrm{e}^{s B}\beta{\,\mathrm{d}} s\mbox{ and }\varPsi(\tau)=\mathrm{e}^{\tau B }. $$

We now define the constituent stocks’ price processes \(S^{i} = Y^{i} Z\), \(i=1,\dots,d\), such that \(S^{1}+\cdots+S^{d}=Z\). Assume that the price of the European call option on the index with maturity \(T\) and strike \(K\) is given in closed form, \(C(T,K)\), for some analytic function \(C\). The price of the call option on stock \(i\) with maturity \(T\) and strike \(K\) is then given by

$$ C_{i}(T,K)= {\mathbb {E}}[ Y^{i}_{T} C(T,K/Y^{i}_{T})]. $$

This price can be efficiently computed in three steps. First, compute \(\xi C(T,K/\xi)\) for a finite set of grid points \(\xi\in[0,1]\). Second, apply some polynomial interpolation scheme, for example using Chebyshev polynomials, to obtain a polynomial approximation of degree \(n\), say \(q(T,K,\xi)\), of \(\xi C(T,K/\xi)\) in \(\xi\in[0,1]\). Third, approximate the option price \(C_{i}(T,K)\) by \(H(X_{0})^{\top}\mathrm{e}^{T G} \vec{p}_{i}(T,K)\), where \(\vec{p}_{i}(T,K)\) is the coordinate representation of the polynomial \(p(x) {=} q\left(T,K,\varPhi_{i}(T^{\ast} {-} T) {+} ( \varPsi(T^{\ast} {-} T) x)_{i}\right)\) in \(x\) with respect to some appropriately chosen basis of polynomials for \(\mathrm{Pol}_{n}(E)\). Extensions to basket and spread options on the stocks \(S^{1},\dots,S^{d}\) are straightforward. This is work in progress.
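
Here is a sketch (ours) of the first two steps, using a Black-Scholes price for the index call purely as a stand-in for the assumed closed-form \(C(T,K)\); in the third step, the fitted polynomial \(q(T,K,\cdot)\) would be composed with \(\varPhi_{i}(T^{\ast}-T)+(\varPsi(T^{\ast}-T)x)_{i}\) and fed into the moment formula of Theorem 3.1.

```python
# Sketch (ours): steps 1-2 of the approximation.  C(T, K) is taken to be a
# Black-Scholes price purely for illustration (the paper only assumes some
# closed-form C); q approximates xi -> xi * C(T, K/xi) on [0, 1].
import numpy as np
from numpy.polynomial import Chebyshev
from scipy.stats import norm

def bs_call(S0, K, T, r, vol):
    d1 = (np.log(S0 / K) + (r + 0.5 * vol**2) * T) / (vol * np.sqrt(T))
    d2 = d1 - vol * np.sqrt(T)
    return S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

S0, r, vol, T, K = 100.0, 0.02, 0.2, 1.0, 40.0

def target(xi):
    # xi * C(T, K/xi), extended by 0 at xi = 0 (the call is worthless as K/xi -> infinity)
    xi = np.asarray(xi, dtype=float)
    out = np.zeros_like(xi)
    pos = xi > 1e-8
    out[pos] = xi[pos] * bs_call(S0, K / xi[pos], T, r, vol)
    return out

grid = np.linspace(0.0, 1.0, 200)                                   # step 1: evaluate on a grid
q = Chebyshev.fit(grid, target(grid), deg=10, domain=[0.0, 1.0])    # step 2: polynomial fit
print(np.max(np.abs(q(grid) - target(grid))))                       # approximation error on the grid
```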

An application of polynomial diffusions on a compact state space to credit risk is given in Ackerer and Filipović [1].