Abstract
This paper provides the mathematical foundation for polynomial diffusions. They play an important role in a growing range of applications in finance, including financial market models for interest rates, credit risk, stochastic volatility, commodities and electricity. Uniqueness of polynomial diffusions is established via moment determinacy in combination with pathwise uniqueness. Existence boils down to a stochastic invariance problem that we solve for semialgebraic state spaces. Examples include the unit ball, the product of the unit cube and nonnegative orthant, and the unit simplex.
1 Introduction
This paper provides the mathematical foundation for polynomial diffusions on a large class of state spaces in \({\mathbb {R}}^{d}\). A polynomial diffusion is characterized by having a linear drift and quadratic diffusion function. In consequence, moments are given in closed form. Such processes represent an extension of the affine class. They play an important role in a growing range of applications in finance, including financial market models of interest rates, credit risk, stochastic volatility, and commodities and electricity.
An arbitrage-free financial market model is determined by a state price density, i.e., a positive semimartingale \(\zeta\) defined on a filtered probability space \((\varOmega,{\mathcal {F}},{\mathcal {F}}_{t},{\mathbb {P}})\). The model price \(\varPi (t,T)\) at time \(t\) of any time \(T\) cash-flow \(C_{T}\) is given by
$$ \varPi (t,T) = \frac{{\mathbb {E}}[ \zeta_{T} C_{T} \,|\, {\mathcal {F}}_{t} ]}{\zeta_{t}} . \qquad (1.1) $$
We may interpret ℙ as the historical measure, or more generally as an auxiliary measure possibly different from, but equivalent to, the historical measure. A polynomial diffusion model consists of a polynomial diffusion \(X\) as factor process, along with a positive polynomial \(p\) on the state space. The state price density is specified by \(\zeta_{t} = \mathrm{e}^{-\alpha t}p(X_{t})\), where \(\alpha\) is a real parameter chosen to control the lower bound on implied interest rates. We let the time \(T\) cash-flow of a security be given by \(C_{T}=q(X_{T})\) for some polynomial \(q\). The polynomial property of \(X\) along with the elementary fact that \(p q\) is a polynomial implies that \(\varPi(t,T)\) becomes a rational function in \(X_{t}\) with coefficients given in closed form in terms of a matrix exponential. Polynomial diffusion models thus yield closed form expressions for any security with cash-flows specified as polynomial functions of \(X\), which makes them universally applicable in finance. This includes financial market models for interest rates (with \(C_{T}=1\)), credit risk in a doubly stochastic framework (with \(C_{T}\) the conditional survival probability), stochastic volatility (with \(C_{T}\) the spot variance), and commodities and electricity (with \(C_{T}\) the spot price).
While polynomial diffusions have appeared in the literature since Wong [48], so far no existence and uniqueness theory has been available beyond the scalar case. This paper fills this gap and thus provides the mathematical foundation for polynomial diffusion models in finance.
Our main uniqueness result (Theorem 4.2) is based on the classical theory of the moment problem. Since the mixed moments of all finite-dimensional marginal distributions of a polynomial diffusion are uniquely determined by its generator (Theorem 3.1 and Corollary 3.2), uniqueness follows whenever these moments determine the underlying distribution. This is often true, for instance in the affine case or when the state space is compact, or more generally if exponential moments exist; Theorem 3.3 provides sufficient conditions. There are, however, situations where the moment problem approach fails. We therefore provide two additional results based on Yamada–Watanabe type arguments, which give uniqueness in the one-dimensional case (Theorem 4.3) as well as when the process dynamics exhibits a certain hierarchical structure (Theorem 4.4). These uniqueness results do not depend on the geometry of the state space.
In order to study existence, we assume that the state space is a basic closed semialgebraic set, i.e., the nonnegativity set of a finite family of polynomials. Existence reduces to a stochastic invariance problem that we solve under suitable geometric and algebraic conditions on the state space (Theorem 5.3). We also study boundary attainment. In applications, it is frequently of interest to know whether the trajectories of a given process may hit the boundary of the state space. In particular, simulating trajectories becomes a much more delicate task if the boundary is attained; see Lord et al. [35]. We present sufficient conditions for both attainment and non-attainment that are tight (Theorem 5.7).
A semialgebraic state space is a natural choice for at least three reasons. First, positive semidefiniteness of the quadratic diffusion matrix boils down to nonnegativity constraints on polynomials. Second, polynomial diffusion models in finance involve polynomials that are required to be positive on the state space. And third, semialgebraic sets turn out to be an ideal setting for employing tools from real algebraic geometry to verify the hypotheses of our existence and boundary attainment results.
We give a detailed analysis of some specific semialgebraic state spaces that do and will play an important role in financial applications, and that illustrate the scope of polynomial diffusions. Specifically, we consider certain quadric sets including the unit ball \(\{x\in{\mathbb {R}}^{d}: \| x\| \le1\}\), the product space \([0,1]^{m}\times{\mathbb {R}}^{n}_{+}\), and the unit simplex \(\{x\in{\mathbb {R}}^{d}_{+}: x_{1}+\cdots+x_{d}=1\}\). We also elaborate on polynomial diffusion models in finance, and show how to specify novel stochastic models for interest rates, stochastic volatility, and stock markets.
Polynomial processes have been studied in various degrees of generality by several authors, for instance Wong [48], Mazet [38], Zhou [49], Forman and Sørensen [24], among others. The first systematic accounts treating the time-homogeneous Markov jump-diffusion case are Cuchiero [9] and Cuchiero et al. [10]. The use of polynomial diffusions in financial modeling goes back at least to the early 2000s. Zhou [49] used one-dimensional polynomial (jump-)diffusions to build short rate models that were fitted to data using a generalized method-of-moments approach, relying crucially on the ability to compute moments efficiently. A short rate model based on the Jacobi process was presented by Delbaen and Shirakawa [15], and Larsen and Sørensen [33] used the same process for exchange rate modeling. The multidimensional Jacobi process was studied by Gouriéroux and Jasiak [27], who constructed a stock price model with smooth transitions of drift and volatility regimes. More recently, polynomial diffusions have featured in the context of financial applications in several papers; see Filipović et al. [23, 21] for models of the term structure of variance swap rates and interest rates, respectively, and Cuchiero et al. [10] for variance reduction for option pricing and hedging, among other applications. There are several reasons for moving beyond the affine class. In particular, nontrivial dynamics on compact state spaces become a possibility, which together with the polynomial property fits well with polynomial expansion techniques; see also Filipović et al. [20]. Also on non-compact state spaces, one can achieve richer dynamics than in the affine case. Examples of non-affine polynomial processes include multidimensional Jacobi or Fisher–Wright processes (Ethier [18], Gouriéroux and Jasiak [27]), Pearson diffusions (Forman and Sørensen [24]), and Dunkl processes (Dunkl [17], Gallardo and Yor [25]).
The rest of the paper is structured as follows. In Sect. 2, we define polynomial diffusions. Section 3 is concerned with power and exponential moments. In Sect. 4, we discuss uniqueness. In Sect. 5, we treat existence and boundary attainment. Section 6 contains examples of semialgebraic state spaces. Section 7 outlines various polynomial diffusion models in finance. For the sake of readability, most proofs are given in Appendices A–I. Some basic notions from algebraic geometry are reviewed in Appendix J.
We end this introduction with some notational conventions that will be used throughout this paper. For a function \(f:{\mathbb {R}}^{d}\to{\mathbb {R}}\), we write \(\{ f=0\}\) for the set \(\{x\in{\mathbb {R}}^{d}:f(x)=0\}\). A polynomial \(p\) on \({\mathbb {R}} ^{d}\) is a map \({\mathbb {R}}^{d}\to{\mathbb {R}}\) of the form \(\sum_{\alpha}c_{\alpha}x_{1}^{\alpha _{1}}\cdots x_{d}^{\alpha_{d}}\), where the sum runs over all multi-indices \(\alpha=(\alpha_{1},\ldots,\alpha_{d})\in{\mathbb {N}}^{d}_{0}\) and only finitely many of the coefficients \(c_{\alpha}\) are nonzero. Such a representation is unique. The degree of \(p\) is the number \(\deg p=\max\{\alpha _{1}+\cdots+\alpha_{d} : c_{\alpha}\ne0\}\). We let \({\mathrm {Pol}}({\mathbb {R}}^{d})\) denote the ring of all polynomials on \({\mathbb {R}}^{d}\), and \({\mathrm {Pol}}_{n}({\mathbb {R}}^{d})\) the subspace consisting of polynomials of degree at most \(n\). Let \(E\) be a subset of \({\mathbb {R}}^{d}\). A polynomial on \(E\) is the restriction \(p=q|_{E}\) to \(E\) of a polynomial \(q\in{\mathrm{Pol}}({\mathbb {R}}^{d})\). Its degree is \(\deg p=\min \{\deg q : p=q|_{E}, q\in{\mathrm{Pol}}({\mathbb {R}}^{d})\}\). We let \({\mathrm {Pol}}(E)\) denote the ring of polynomials on \(E\), and \({\mathrm{Pol}}_{n}(E)\) the subspace of polynomials on \(E\) of degree at most \(n\). Both \({\mathrm{Pol}}_{n}({\mathbb {R}}^{d})\) and \({\mathrm{Pol}}_{n}(E)\) are finite-dimensional real vector spaces, but if there are nontrivial polynomials that vanish on \(E\), their dimensions will be different. If \(E\) has a nonempty interior, then \({\mathrm{Pol}}_{n}({\mathbb {R}}^{d})\) and \({\mathrm{Pol}} _{n}(E)\) can be identified. The set of real symmetric \(d\times d\) matrices is denoted \({\mathbb {S}}^{d}\), and the subset of positive semidefinite matrices is denoted \({\mathbb {S}}^{d}_{+}\).
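For orientation (a small helper of our own, not part of the paper's notation), the dimension of \({\mathrm{Pol}}_{n}({\mathbb {R}}^{d})\) is the number of monomials of degree at most \(n\), namely \(\binom{n+d}{d}\); on a state space \(E\) with nontrivial vanishing polynomials, such as the unit simplex treated below, \(\dim{\mathrm{Pol}}_{n}(E)\) is strictly smaller.

```python
# Dimension of Pol_n(R^d): number of monomials x1^a1 ... xd^ad with a1 + ... + ad <= n.
from math import comb

def dim_pol(n: int, d: int) -> int:
    return comb(n + d, d)

print(dim_pol(2, 3))   # 10: the monomials 1, x1, x2, x3, x1^2, x1*x2, ..., x3^2
```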
2 Definition of polynomial diffusions
Throughout this paper, we fix maps \(a: {\mathbb {R}}^{d}\to{\mathbb {S}}^{d}\) and \(b:{\mathbb {R}}^{d}\to {\mathbb {R}}^{d}\) with
$$ a_{ij} \in{\mathrm{Pol}}_{2}({\mathbb {R}}^{d}) \quad\text{and}\quad b_{i} \in{\mathrm{Pol}}_{1}({\mathbb {R}}^{d}), \qquad i,j=1,\ldots,d, \qquad (2.1) $$
and a state space \(E\subseteq{\mathbb {R}}^{d}\). Our goal is to investigate the following issues:
-
(a)
For a suitable class of state spaces \(E\), find conditions on \(a\), \(b\), \(E\) that guarantee the existence of an \(E\)-valued solution to the stochastic differential equation
$$ {\mathrm{d}} X_{t} = b(X_{t}) {\,\mathrm{d}} t + \sigma(X_{t}) {\,\mathrm{d}} W_{t} \qquad (2.2) $$
for some \(d\)-dimensional Brownian motion \(W\) and some continuous function \(\sigma:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d\times d}\) with \(\sigma\sigma^{\top}=a\) on \(E\). We shall consider the class of basic closed semialgebraic sets \(E\), defined using polynomial equalities and inequalities.
-
(b)
Find conditions for uniqueness in law for \(E\)-valued solutions to (2.2). By this we mean that for any \(x\in E\) and any \(E\)-valued solutions \(X\) and \(X'\) to (2.2) with \(X_{0}=X_{0}'=x\), possibly with different driving Brownian motions, \(X\) and \(X'\) have the same law.
-
(c)
Find conditions for a solution to (2.2) to attain the boundary of \(E\).
-
(d)
Find large parametric classes of \(a\), \(b\), \(E\) for which (2.2) admits a solution.
Investigating these issues is motivated by the fact that diffusions as in (2.2) admit closed form conditional moments and have broad applications in finance, as we shall see below.
We consider the partial differential operator \({\mathcal {G}}\) given by
$$ {\mathcal {G}}f = \frac{1}{2}\operatorname{Tr}\big( a \nabla^{2} f \big) + b^{\top}\nabla f. \qquad (2.3) $$
In view of (2.1), \({\mathcal {G}}\) maps \({\mathrm {Pol}}_{n}({\mathbb {R}}^{d})\) to itself for each \(n\in{\mathbb {N}}\). As we work on a state space \(E\subseteq {\mathbb {R}}^{d}\), we now refine this property. We say that \({\mathcal {G}}\) is well defined on \({\mathrm{Pol}} (E)\) if \({\mathcal {G}}f=0\) on \(E\) for any \(f\in{\mathrm {Pol}}({\mathbb {R}}^{d})\) with \(f=0\) on \(E\). In this case, \({\mathcal {G}}\) is well defined as an operator on \({\mathrm{Pol}}(E)\). This always holds if \(E\) has a nonempty interior.
Definition 2.1
The operator \({\mathcal {G}}\) is called polynomial on \(E\) if it is well defined on \({\mathrm{Pol}}(E)\), and thus maps \({\mathrm{Pol}}_{n}(E)\) to itself for each \(n\in {\mathbb {N}}\). In this case, we call any \(E\)-valued solution to (2.2) a polynomial diffusion on \(E\).
It is a simple matter to verify that any second order partial differential operator that maps \({\mathrm{Pol}}_{n}(E)\) to itself for each \(n\in {\mathbb {N}}\) is necessarily of the form (2.1) and (2.3) on \(E\).
Lemma 2.2
Let \(\widetilde{\mathcal {G}}f = \frac{1}{2}\operatorname{Tr}( \widetilde{a} \nabla ^{2} f) + \widetilde{b}^{\top}\nabla f\) be a partial differential operator for some maps \(\widetilde{a}: {\mathbb {R}}^{d}\to{\mathbb {S}}^{d}\) and \(\widetilde{b}:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d}\). Assume \(\widetilde{\mathcal {G}}\) is well defined on \({\mathrm{Pol}}(E)\). Then the following are equivalent:
-
(i)
\(\widetilde{\mathcal {G}}\) maps \({\mathrm{Pol}}_{n}(E)\) to itself for each \(n\in{\mathbb {N}}\).
-
(ii)
\(\widetilde{\mathcal {G}}\) maps \({\mathrm{Pol}}_{n}(E)\) to itself for \(n\in \{ 1,2\}\).
-
(iii)
The components of \(\widetilde{a}\) and \(\widetilde{b}\) restricted to \(E\) lie in \({\mathrm{Pol}}_{2}(E)\) and \({\mathrm {Pol}}_{1}(E)\), respectively.
In this case, \(\widetilde{a}\) and \(\widetilde{b}\) restricted to \(E\) are uniquely determined by the action of \(\widetilde{\mathcal {G}}\) on \({\mathrm{Pol}}_{2}(E)\).
Proof
The implications \(\mbox{(i)}\Rightarrow\mbox{(ii)}\) and \(\mbox{(iii)}\Rightarrow\mbox{(i)}\) are immediate, and the implication \(\mbox{(ii)}\Rightarrow\mbox{(iii)}\) follows upon applying \(\widetilde {\mathcal {G}}\) to the monomials of degree one and two. In particular, this pins down \(\widetilde{a}\) and \(\widetilde{b}\) on \(E\), and thus also establishes the last part of the lemma. □
In the one-dimensional case \(d=1\), one can classify all polynomial diffusions on intervals \(E\). Indeed, one has \(a(x)=a+\alpha x+Ax^{2}\) and \(b(x)=b+\beta x\) for some scalars \(a,\alpha,A,b,\beta\), and \(E=\{x\in{\mathbb {R}}:a(x)\ge0\}\). See Forman and Sørensen [24] and Filipović et al. [23] for details.
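For concreteness, the following small sketch (ours, not taken from the paper) recovers the set \(\{x:a(x)\ge0\}\) from the coefficients of the quadratic diffusion function \(a(x)=a_{0}+a_{1}x+a_{2}x^{2}\); the square-root, Jacobi and geometric Brownian motion specifications used in the examples are standard and serve only as test cases.

```python
# A sketch (ours) that computes {x : a0 + a1*x + a2*x**2 >= 0} for a scalar
# polynomial diffusion, following the one-dimensional classification above.
import numpy as np

def state_space(a0, a1, a2, tol=1e-12):
    """Set on which a0 + a1*x + a2*x**2 is nonnegative."""
    if abs(a2) < tol and abs(a1) < tol:
        return "R" if a0 >= 0 else "empty"
    if abs(a2) < tol:                          # affine diffusion function
        root = -a0 / a1
        return f"[{root}, inf)" if a1 > 0 else f"(-inf, {root}]"
    disc = a1**2 - 4 * a0 * a2
    if disc <= 0:                              # at most one real root
        if a2 > 0:
            return "R"
        return "a single point" if disc == 0 else "empty"
    r1, r2 = sorted(np.roots([a2, a1, a0]).real)
    if a2 > 0:
        return f"(-inf, {r1}] u [{r2}, inf)"
    return f"[{r1}, {r2}]"

print(state_space(0.0, 1.0, 0.0))    # square-root diffusion a(x) = x:       [0.0, inf)
print(state_space(0.0, 1.0, -1.0))   # Jacobi diffusion a(x) = x(1 - x):     [0.0, 1.0]
print(state_space(0.0, 0.0, 1.0))    # geometric Brownian motion a(x) = x^2: R
```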
The multidimensional case is less trivial. For example, let \(d=2\), \(E={\mathbb {R}} \times\{0\}\), and consider the operator \({\mathcal {G}}f(x,y)=\frac {1}{2}\partial_{xx}f(x,y)+\partial_{y} f(x,y)\). This operator is not well defined on \({\mathrm{Pol}}(E)\), since the polynomial \(f(x,y)=y\) vanishes on \(E\), but \({\mathcal {G}}f(x,y)=1\). On the other hand, \({\mathcal {G}}\) is the generator of the diffusion \({\mathrm{d}} X_{t}=({\mathrm{d}} B_{t},{\,\mathrm{d}} t)\), where \(B\) is a one-dimensional Brownian motion. This process immediately leaves \(E\) for any starting point \(x\in E\). If, however, an \(E\)-valued solution to (2.2) exists for any starting point \(x\in E\), then \({\mathcal {G}}\) is well defined on \({\mathrm{Pol}}(E)\). This follows from the following basic positive maximum principle.
Lemma 2.3
Consider \(f\in C^{2}({\mathbb {R}}^{d})\) and suppose \({\overline{x}}\in E\) is a maximizer of \(f\) over \(E\). If (2.2) admits an \(E\)-valued solution with \(X_{0}=\overline{x}\), then \({\mathcal {G}}f({\overline{x}})\le0\).
Proof
Let \(X\) be an \(E\)-valued solution to (2.2) with \(X_{0}=\overline{x}\), and assume for contradiction that \({\mathcal {G}}f({\overline{x}})>0\). By the definition of a global maximizer, \(f(x)\le f({\overline{x}})\) for all \(x\in E\). Let \(\tau=\inf\{t\ge0: {\mathcal {G}}f(X_{t})\le0\}\), and note that \(\tau>0\). Then for \(t\in(0,\tau)\), we have \(f(X_{t})\le f({\overline{x}})\) and \({\mathcal {G}}f(X_{t})>0\), which implies
$$ f(X_{t\wedge\tau}) - f({\overline{x}}) - \int_{0}^{t\wedge\tau}{\mathcal {G}}f(X_{s}) {\,\mathrm{d}} s \le- \int_{0}^{t\wedge\tau}{\mathcal {G}}f(X_{s}) {\,\mathrm{d}} s < 0 $$
for all \(t>0\). Thus the left-hand side is a local martingale starting from zero, strictly negative for all \(t>0\). This contradiction proves that \({\mathcal {G}}f({\overline{x}})\le0\). □
Regarding uniqueness, it is crucial to restrict attention to \(E\)-valued solutions. To illustrate what can otherwise go wrong, consider the stochastic differential equation \({\mathrm{d}} X_{t} = -2\sqrt{X_{t}^{-}}{\,\mathrm{d}} t+ 2\sqrt {X_{t}^{+}}{\,\mathrm{d}} W_{t}\), which is well known to have a unique \({\mathbb {R}}_{+}\)-valued solution: the zero-dimensional squared Bessel process. However, this stochastic differential equation admits other solutions that do not remain in \({\mathbb {R}}_{+}\), for example \(X_{t} = Y_{t}{\boldsymbol{1}_{\{t\le \tau\}}} - (t-\tau)^{2}{\boldsymbol{1}_{\{t>\tau\}}}\), where \(Y\) is a zero-dimensional squared Bessel process with \(Y_{0}\ge0\) and \(\tau=\inf\{t:Y_{t}=0\}\). Here \(\tau\) is finite almost surely.
Note that in Definition 2.1, we require neither uniqueness of solutions to (2.2), nor that \({\mathcal {G}}\) be the generator of a Markov process on \(E\). There are two reasons for this. First, existence of \(E\)-valued solutions to (2.2) does not in itself imply that those solutions are Markovian. Second, in the context of Markov processes, the polynomial property holds if and only if the corresponding semigroup leaves \({\mathrm{Pol}}_{n}(E)\) invariant for each \(n\in {\mathbb {N}}\). However, this fact, properly phrased, does not require the Markov property. Only Itô calculus is needed. This observation is crucial for our approach to proving uniqueness. Finally, we remark that a polynomial diffusion that is also a Markov process is a “polynomial process” in the terminology of Cuchiero et al. [10], with vanishing killing rate and no jumps.
3 Power and exponential moments
Throughout this section, we assume that \({\mathcal {G}}\) is polynomial on \(E\) and let \(X\) be an \(E\)-valued solution to (2.2) realized on a filtered probability space \((\varOmega,{\mathcal {F}},{\mathcal {F}}_{t},{\mathbb {P}})\).
For any \(n\in{\mathbb {N}}\), we let \(N=N(n,E)\) denote the dimension of \(\mathrm{Pol}_{n}(E)\). We fix a basis of polynomials \(h_{1},\dots,h_{N}\) for \(\mathrm{Pol}_{n}(E)\) and write
$$ H(x) = \big(h_{1}(x),\ldots,h_{N}(x)\big)^{\top}. $$
Then for each \(p\in{\mathrm{Pol}}_{n}(E)\), there exists a unique vector \(\vec{p}\in {\mathbb {R}}^{{N}}\) such that
$$ p(x) = H(x)^{\top}\vec{p} \qquad\textit{for all } x\in E. $$
The restriction of \({\mathcal {G}}\) to \({\mathrm{Pol}}_{n}(E)\) has a unique matrix representation \(G\in{\mathbb {R}}^{{N}\times{N}}\), characterized by the property that \(G \vec{p}\) is the coordinate vector of \({\mathcal {G}}p\) whenever \(\vec{p}\) is the coordinate vector of \(p\). That is, we have
$$ {\mathcal {G}}p(x) = H(x)^{\top}G \vec{p} \qquad\textit{for all } x\in E. $$
We now show that \({\mathbb {E}}[p(X_{T}) \,|\, {\mathcal {F}}_{t}]\) is indeed well defined as a polynomial function of \(X_{t}\). Recall that we do not assume uniqueness of solutions to (2.2), and we do not require \(X\) to be Markov. The proof is given in Appendix B.
Theorem 3.1
If \({\mathbb {E}}[\|X_{0}\|^{2n}]<\infty\), then for any \(p\in{\mathrm {Pol}}_{n}(E)\) with coordinate representation \(\vec{p}\in{\mathbb {R}}^{{N}}\), we have
$$ {\mathbb {E}}\big[ p(X_{T}) \,\big|\, {\mathcal {F}}_{t}\big] = H(X_{t})^{\top}\mathrm{e}^{(T-t)G} \vec{p} \qquad\textit{for all } 0\le t\le T. $$
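The following minimal sketch (our own illustration; the parameter values are arbitrary) implements this formula for the scalar square-root diffusion \({\mathrm{d}} X_{t}=(b+\beta X_{t}){\,\mathrm{d}} t+\sigma\sqrt{X_{t}}{\,\mathrm{d}} W_{t}\) on \(E={\mathbb {R}}_{+}\), using the monomial basis \(H(x)=(1,x,\ldots,x^{n})^{\top}\), and checks the first conditional moment against its well-known closed form.

```python
# A minimal sketch (ours) of Theorem 3.1 for the scalar square-root diffusion on R_+.
import numpy as np
from scipy.linalg import expm

def generator_matrix(b, beta, sigma, n):
    """Matrix representation G of calG on Pol_n(R_+) in the monomial basis (1, x, ..., x^n):
    calG x^k = (b*k + sigma^2*k*(k-1)/2) x^(k-1) + beta*k*x^k."""
    G = np.zeros((n + 1, n + 1))
    for k in range(1, n + 1):
        G[k, k] = beta * k
        G[k - 1, k] = b * k + 0.5 * sigma**2 * k * (k - 1)
    return G

b, beta, sigma, n = 0.05, -0.5, 0.3, 5
G = generator_matrix(b, beta, sigma, n)

x0, T = 0.2, 2.0                          # deterministic X_0 = x0, horizon T
H = np.array([x0**k for k in range(n + 1)])
p_vec = np.zeros(n + 1); p_vec[1] = 1.0   # coordinates of p(x) = x

moment = H @ expm(T * G) @ p_vec          # E[X_T] via Theorem 3.1
closed_form = x0 * np.exp(beta * T) + b * (np.exp(beta * T) - 1) / beta
print(moment, closed_form)                # the two values agree
```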
The following result is a direct consequence of Theorem 3.1. Its statement and proof use standard multi-index notation: For a multi-index \({\mathbf {k}}=(k_{1},\ldots,k_{d})\in{\mathbb {N}}^{d}_{0}\), we write \(|{\mathbf {k}} |=k_{1}+\cdots+k_{d}\) and \(x^{\mathbf {k}}=x_{1}^{k_{1}}\cdots x_{d}^{k_{d}}\).
Corollary 3.2
For any time points \(0\le t_{1}<\cdots<t_{m}\) and for any multi-indices \({\mathbf {k}} (1), \ldots, {\mathbf {k}}(m)\) such that
$$ {\mathbb {E}}\big[ \|X_{0}\|^{2(|{\mathbf {k}}(1)|+\cdots+|{\mathbf {k}}(m)|)}\big] < \infty, $$
the expectation \({\mathbb {E}}[ X_{t_{1}}^{{\mathbf {k}}(1)} \cdots X_{t_{m}}^{{\mathbf {k}}(m)} ]\) is uniquely determined by \({\mathcal {G}}\) and the law of \(X_{0}\).
Proof
We prove the result for \(m=2\); the general case follows by iteration. Set \({\mathbf {j}}={\mathbf {k}}(1)\), \({\mathbf {k}}={\mathbf {k}}(2)\), and \(n=|{\mathbf {j}}|+|{\mathbf {k}}|\). Since \({\mathbb {E}} [\| X_{0}\|^{2|{\mathbf {k}}|}]<\infty\), Theorem 3.1 yields \(X_{t_{1}}^{\mathbf {j}}{\mathbb {E}}[X_{t_{2}}^{{\mathbf {k}}}\,|\,{\mathcal {F}}_{t_{1}}]=p(X_{t_{1}})\) for some polynomial \(p\in {\mathrm{Pol}}_{n}(E)\) whose coordinate representation \(\vec{p}\) only depends on \(G\). Since \({\mathbb {E}}[\|X_{0}\|^{2n}]<\infty\), another application of Theorem 3.1 yields
This proves the corollary. □
We next provide conditions under which \(X_{T}\) admits finite exponential moments. This result will be used in connection with proving uniqueness in Theorem 4.2 below, but is also of interest on its own for applications in finance (see footnote 1). Its proof is given in Appendix C.
Theorem 3.3
If
$$ {\mathbb {E}}\big[ \mathrm{e}^{\varepsilon\|X_{0}\|}\big] < \infty \qquad\textit{for some } \varepsilon>0, \qquad (3.3) $$
and the diffusion coefficient satisfies the linear growth condition
$$ \|a(x)\| \le C\big(1+\|x\|\big) \qquad\textit{for all } x\in E \qquad (3.4) $$
for some constant \(C\), then for each \(t\ge0\), there exists \(\varepsilon >0\) with \({\mathbb {E}}[ \mathrm{e}^{\varepsilon\|X_{t}\|}] < \infty\).
4 Uniqueness
Throughout this section, we assume that \({\mathcal {G}}\) is polynomial on \(E\). We present three results regarding uniqueness in law for \(E\)-valued solutions to (2.2). Recall that this notion of uniqueness pertains to deterministic initial conditions, as defined under (b) in Sect. 2.
The first result relies on the fact that the joint moments of all finite-dimensional marginal distributions of a polynomial diffusion are uniquely determined by \({\mathcal {G}}\); see Corollary 3.2. Thus uniqueness in law follows if the finite-dimensional marginal distributions are the only ones with these moments. This property is known as determinacy in the literature on the moment problem, a classical topic in mathematics; references include Stieltjes [45], Akhiezer [3], Berg et al. [5], Schmüdgen [43], Stoyanov [46], Kleiber and Stoyanov [32] and many others.
Lemma 4.1
Let \(X\) be an \(E\)-valued solution to (2.2). If for each \(t\ge 0\), there exists \(\varepsilon>0\) with \({\mathbb {E}}[\exp(\varepsilon\| X_{t}\| )]<\infty\), then any \(E\)-valued solution to (2.2) with the same initial law as \(X\) has the same law as \(X\). In particular, this holds if (3.3) and (3.4) are satisfied.
Proof
For any \(t\ge0\) and \(i\in\{1,\ldots,d\}\), the hypothesis yields \({\mathbb {E}}[\exp(\varepsilon|X_{i,t}|)]<\infty\) for some \(\varepsilon>0\). As a consequence, the moment-generating function of \(X_{i,t}\) exists and is analytic in \((-\varepsilon,\varepsilon)\), hence equal to its power series expansion, and thus determined by the moments of \(X_{i,t}\). By Curtiss [11, Theorem 1], the moment-generating function determines the law of \(X_{i,t}\), which thus satisfies the determinacy property. Now, according to Petersen [40, Theorem 3], determinacy of the (one-dimensional) marginals of a measure on \({\mathbb {R}}^{m}\) implies determinacy of the measure itself. It follows that determinacy holds for the law of each collection \((X_{t_{1}},\ldots,X_{t_{m}})\), \(0\le t_{1}<\cdots<t_{m}\). By Corollary 3.2, the corresponding moments are the same for any \(E\)-valued solution to (2.2) with the same initial law as \(X\). This proves the lemma. □
If \(X_{0}=x\) is deterministic, then (3.3) holds and Lemma 4.1 directly yields our first result.
Theorem 4.2
If the linear growth condition (3.4) is satisfied, then uniqueness in law for \(E\)-valued solutions to (2.2) holds.
Theorem 4.2 assumes the linear growth condition (3.4) to ensure existence of exponential moments. While this condition is satisfied by all affine diffusions, as well as whenever \(E\) is compact, it excludes some interesting examples, in particular geometric Brownian motion (see footnote 2). Uniqueness for geometric Brownian motion does of course hold, and can be established via the Yamada–Watanabe pathwise uniqueness theorem for one-dimensional diffusions. Our second result records this fact.
Theorem 4.3
If the dimension is \(d=1\), then uniqueness in law for \(E\)-valued solutions to (2.2) holds.
Proof
Since \({\mathcal {G}}\) is polynomial, the drift \(b(x)\) in (2.2) is an affine function on \(E\), and the dispersion restricted to \(E\) is of the form \(\sigma(x) = \sqrt{ \alpha+ ax + Ax^{2}}\) for some real parameters \(\alpha, a, A\). Hence \(b(x)\) is Lipschitz-continuous, and \(\sigma(x)\) satisfies
where \(\rho_{n}(z)= |a+2nA| z\), for any \(n\ge1\). A localization argument in conjunction with Rogers and Williams [42, Theorem V.40.1] shows that pathwise uniqueness holds for any \(E\)-valued solution to (2.2). This in turn implies uniqueness in law; see Rogers and Williams [42, Theorem V.17.1]. □
Our third result, in combination with Theorems 4.2 and 4.3, yields uniqueness in a wide range of cases that are encountered in applications. The setup is the following. We assume that any \(E\)-valued solution to (2.2) can be partitioned as \(X=(Y,Z)\), where \(Y\) is an autonomous \(m\)-dimensional diffusion with closed state space \(E_{Y}\subseteq{\mathbb {R}}^{m}\), \(Z\) is \(n\)-dimensional, and \(m+n=d\). That is, \((Y,Z)\) solves the stochastic differential equation
$$ \begin{aligned} {\mathrm{d}} Y_{t} &= b_{Y}(Y_{t}) {\,\mathrm{d}} t + \sigma_{Y}(Y_{t}) {\,\mathrm{d}} W_{t}, \\ {\mathrm{d}} Z_{t} &= b_{Z}(Y_{t},Z_{t}) {\,\mathrm{d}} t + \sigma_{Z}(Y_{t},Z_{t}) {\,\mathrm{d}} W_{t} \end{aligned} \qquad (4.1) $$
for polynomials \(b_{Y}:{\mathbb {R}}^{m}\to{\mathbb {R}}^{m}\) and \(b_{Z}:{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\to{\mathbb {R}} ^{n}\) of degree one, continuous maps \(\sigma_{Y}:{\mathbb {R}}^{m}\to{\mathbb {R}}^{m\times d}\) and \(\sigma _{Z}:{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\to{\mathbb {R}}^{n\times d}\), and where \(Y\) takes values in \(E_{Y}\). The proof of the following theorem is given in Appendix D.
Theorem 4.4
Assume that uniqueness in law for \(E_{Y}\)-valued solutions to (4.1) holds, and that \(\sigma_{Z}\) is locally Lipschitz in \(z\), locally in \(y\), on \(E\). That is, for each compact subset \(K\subseteq E\), there exists a constant \(\kappa\) such that for all \((y,z,y',z')\in K\times K\),
Then uniqueness in law for \(E\)-valued solutions to (2.2) holds.
5 Existence and boundary attainment
In this section, we discuss existence of \(E\)-valued solutions to (2.2) and give conditions under which the boundary of the state space is attained. The results are stated and proved using some basic concepts from algebra and algebraic geometry. Appendix J provides a review of the required notions.
Existence of a solution to (2.2) with values in \({\mathbb {R}}^{d}\) is well known to hold under linear growth conditions; see for instance Ikeda and Watanabe [31, Theorem IV.2.4]. The problem at hand thus boils down to finding conditions under which a solution to (2.2) takes values in \(E\). This is a stochastic invariance problem. In Appendix A, we discuss necessary and sufficient conditions for nonnegativity of certain Itô processes, which is the basic tool we use for proving stochastic invariance.
We henceforth assume that the state space \(E\) is a basic closed semialgebraic set. Specifically, let \({\mathcal {P}}\) and \({\mathcal {Q}}\) be finite collections of polynomials on \({\mathbb {R}}^{d}\), and define
$$ E = \{ x\in M : p(x)\ge0 \textit{ for all } p\in{\mathcal {P}}\}, $$
where
$$ M = \{ x\in{\mathbb {R}}^{d}: q(x)=0 \textit{ for all } q\in{\mathcal {Q}}\}. $$
In particular, if \({\mathcal {Q}}=\emptyset\) then \(M={\mathbb {R}}^{d}\). The following result provides simple necessary conditions for the invariance of \(E\) with respect to (2.2).
Theorem 5.1
Suppose there exists an \(E\)-valued solution to (2.2) with \(X_{0}=x\), for any \(x\in E\). Then
-
(i)
\(a\nabla p=0\) and \({\mathcal {G}}p\ge0\) on \(E\cap\{p=0\}\), for each \(p\in{\mathcal {P}}\);
-
(ii)
\(a\nabla q=0\) and \({\mathcal {G}}q=0\) on \(E\), for each \(q\in {\mathcal {Q}}\).
Proof
Pick any \(p\in{\mathcal {P}}\), \(x\in E\cap\{p=0\}\), and let \(X\) be a solution to (2.2) with \(X_{0}=x\). Then \(p(X_{t})=\int_{0}^{t}{\mathcal {G}}p(X_{s}){\,\mathrm{d}} s+\int_{0}^{t}\nabla p(X_{s})^{\top}\sigma(X_{s}){\,\mathrm{d}} W_{s}\) and \(p(X)\ge0\), so (i) follows by Lemma A.1(ii). To prove (ii) for \(q\in{\mathcal {Q}}\), simply apply the same argument to \(q\) and \(-q\). □
The necessary condition \(a\nabla p=0\) states, roughly speaking, that at any boundary point of the state space, there can be no diffusive fluctuations orthogonally to the boundary. The necessary condition \({\mathcal {G}}p\ge0\) can be interpreted as “inward-pointing adjusted drift” at the boundary. The following example shows that this cannot be replaced by a simple “inward-pointing drift” condition.
Example 5.2
Consider the bivariate process \((U,V)\) with dynamics
$$ {\mathrm{d}} U_{t} = {\mathrm{d}} W_{1,t}, \qquad {\mathrm{d}} V_{t} = \alpha {\,\mathrm{d}} t + 2\sqrt{V_{t}} {\,\mathrm{d}} W_{2,t}, $$
where \((W_{1},W_{2})\) is a Brownian motion and \(\alpha> 0\). In other words, \(U\) is a Brownian motion and \(V\) is an independent squared Bessel process. The state space is \({\mathbb {R}}\times{\mathbb {R}}_{+}\). Now consider the process \((X,Y)=(U,V-U^{2})\). Its dynamics is
$$ {\mathrm{d}} X_{t} = {\mathrm{d}} W_{1,t}, \qquad {\mathrm{d}} Y_{t} = (\alpha-1) {\,\mathrm{d}} t - 2 X_{t} {\,\mathrm{d}} W_{1,t} + 2\sqrt{X_{t}^{2}+Y_{t}} {\,\mathrm{d}} W_{2,t}, $$
and its state space is \(E=\{(x,y)\in{\mathbb {R}}^{2}:x^{2}+y\ge0\}\), the epigraph of the function \(-x^{2}\). The drift of \((X,Y)\) is \(b(x,y)=(0,\alpha-1)\), which points out of the state space at every boundary point, provided \(\alpha<1\). Nonetheless, with \(p(x,y)=x^{2}+y\), a calculation yields \({\mathcal {G}}p(x,y)=\alpha>0\).
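The calculation in Example 5.2 is easily reproduced symbolically. The following sketch (ours) takes the dynamics of \((X,Y)\) displayed above and confirms that \({\mathcal {G}}p=\alpha>0\), even though the drift component \(b^{\top}\nabla p=\alpha-1\) is negative, i.e., outward-pointing, for \(\alpha<1\).

```python
# A sympy check (ours) of Example 5.2: adjusted drift calG p versus plain drift.
import sympy as sp

x, y, alpha = sp.symbols('x y alpha', real=True)

# dispersion and drift of (X, Y) = (U, V - U^2), as obtained from Ito's formula above
sigma = sp.Matrix([[1, 0], [-2*x, 2*sp.sqrt(x**2 + y)]])
a = sigma * sigma.T                       # diffusion matrix a(x, y)
b = sp.Matrix([0, alpha - 1])             # drift b(x, y)

p = x**2 + y                              # boundary polynomial of E = {x^2 + y >= 0}
grad_p = sp.Matrix([sp.diff(p, x), sp.diff(p, y)])
hess_p = sp.hessian(p, (x, y))

Gp = sp.Rational(1, 2) * (a * hess_p).trace() + (b.T * grad_p)[0, 0]
print(sp.simplify(Gp))                    # alpha: the adjusted drift is inward-pointing
print(sp.simplify((b.T * grad_p)[0, 0]))  # alpha - 1: the plain drift points outward if alpha < 1
```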
As a converse to Theorem 5.1, we now give sufficient conditions for the existence of an \(E\)-valued solution to (2.2). The proof of the following theorem is given in Appendix E.
Theorem 5.3
Suppose \(E\) satisfies the following geometric and algebraic properties:
-
(G1)
\(\nabla r(x)\), \(r\in{\mathcal {Q}}\), are linearly independent for all \(x\in M\);
-
(G2)
the ideal generated by \({\mathcal {Q}}\cup\{p\}\) coincides with the vanishing ideal of \(M\cap\{ p=0\}\), i.e., we have \(({\mathcal {Q}}\cup\{p\})={\mathcal {I}}(M\cap\{p=0\})\), for each \(p\in{\mathcal {P}}\);
and the maps \(a\) and \(b\) satisfy
-
(A0)
\(a \in{\mathbb {S}}^{d}_{+}\) on \(E\);
-
(A1)
\(a \nabla p=0\) on \(M\cap\{p=0\}\) and \({\mathcal {G}}p>0\) on \(E\cap\{ p=0\}\), for each \(p\in{\mathcal {P}}\);
-
(A2)
\(a \nabla q=0\) and \({\mathcal {G}}q=0\) on \(M\), for each \(q\in {\mathcal {Q}}\).
Then \({\mathcal {G}}\) is polynomial on \(E\), and there exists a continuous map \(\sigma:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d\times d}\) with \(\sigma\sigma^{\top}=a\) on \(E\) and such that the stochastic differential equation (2.2) admits an \(E\)-valued solution \(X\) for any initial law of \(X_{0}\). This solution can be chosen so that it spends zero time in the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\). That is,
$$ \int_{0}^{\infty}{\boldsymbol{1}}_{\{p(X_{t})=0\}} {\,\mathrm{d}} t = 0 \qquad\textit{almost surely, for each } p\in{\mathcal {P}}. \qquad (5.2) $$
Conditions (A1) and (A2) should be contrasted with the necessary conditions of Theorem 5.1. The latter are somewhat weaker, since they only make statements about \(a\) and \(b\) on \(E\) rather than \(M\), and since the inequality in Theorem 5.1(i) is weak. Theorem 5.3 can be generalized to allow a weak inequality in (A1), at the cost of allowing absorption of the process at the boundary. We do not consider this generalization here.
Condition (G1) implies that \(M\) is an algebraic submanifold in \({\mathbb {R}}^{d}\) of dimension \(d-|{\mathcal {Q}}|\). The least obvious condition is arguably (G2). The crucial implication of (G2) is that any polynomial \(f\) that vanishes on \(M\cap\{p=0\}\) has a representation \(f=h p\) on \(M\) for some polynomial \(h\). In conjunction with (A1), this implies that \(a(x)\nabla p(x)\) decays like \(p(x)\) as \(x\in E\) approaches the boundary set \(E\cap \{p=0\}\), for \(p\in{\mathcal {P}}\). This allows one to prove that the local time of \(p(X)\) at level zero vanishes, which makes Lemma A.1 applicable; see Appendix E for the details.
Condition (G2) is also the least straightforward to verify. We therefore present two sufficient conditions that are easier to check in concrete examples. The first condition is useful when \(M={\mathbb {R}}^{d}\), in which case each ideal appearing on the left-hand side in (G2) is generated by a single polynomial. This covers many interesting examples, yet yields conditions that are easy to verify in practice. A proof of the following result can be found in Bochnak et al. [6, Theorem 4.5.1].
Lemma 5.4
Let \(p\in{\mathrm{Pol}}({\mathbb {R}}^{d})\) be an irreducible polynomial and \({\mathcal {V}}(p)\) its zero set. Then \((p)={\mathcal {I}}({\mathcal {V}}(p))\) if and only if \(p\) changes sign on \({\mathbb {R}}^{d}\), that is, \(p(x)p(y)<0\) for some \(x,y\in{\mathbb {R}}^{d}\).
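To illustrate how Lemma 5.4 is used to verify (G2), the following sketch (ours) takes the simplest case \(M={\mathbb {R}}^{2}\) and \(p=1-x_{1}^{2}-x_{2}^{2}\), which is irreducible and changes sign, so that \((p)={\mathcal {I}}(\{p=0\})\); polynomial division then decides membership in the ideal \((p)\).

```python
# An illustration (ours) of the ideal-membership property behind (G2) for the unit disk.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
p = 1 - x1**2 - x2**2                      # irreducible, changes sign on R^2

f = sp.expand((x1 + 3*x2**2) * p)          # vanishes on the circle {p = 0} by construction
print(sp.reduced(f, [p], x1, x2))          # quotient [x1 + 3*x2**2] (up to ordering), remainder 0

g = x1 - 1                                 # vanishes at (1, 0) only, so it is not in (p)
print(sp.reduced(g, [p], x1, x2))          # nonzero remainder: g is not a multiple of p
```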
The second condition applies when the ideals generated by the families \({\mathcal {Q}}\cup\{p\}\) with \(p\in{\mathcal {P}}\) are prime and of full dimension.
Lemma 5.5
For \(p\in{\mathcal {P}}\), assume that the ideal \(({\mathcal {Q}}\cup\{p\} )\) is prime with dimension \(d-1-|{\mathcal {Q}}|\), and that there exists some \(x\in M\cap \{p=0\}\) such that the vectors \(\nabla r(x)\), \(r\in{\mathcal {Q}}\cup\{p\}\), are linearly independent. Then \(({\mathcal {Q}}\cup\{p\})={\mathcal {I}}(M\cap\{p=0\})\).
Proof
This follows directly from Bochnak et al. [6, Proposition 3.3.16]. □
Remark 5.6
Stochastic invariance problems have been studied by a number of authors; see Da Prato and Frankowska [12], Filipović et al. [22], among many others. The approach in these papers is to impose an “inward-pointing Stratonovich drift” condition. This breaks down for polynomial diffusions. Indeed, consider the squared Bessel process
$$ {\mathrm{d}} X_{t} = \alpha {\,\mathrm{d}} t + 2\sqrt{X_{t}} {\,\mathrm{d}} W_{t}, $$
which is an \({\mathbb {R}}_{+}\)-valued affine process for \(\alpha\ge0\). The stochastic integral cannot always be written in Stratonovich form, since \(\sqrt{X}\) fails to be a semimartingale for \(0<\alpha<1\). If nonetheless one formally computes the Stratonovich drift, one obtains \(\alpha-1\), suggesting that \(\alpha\ge1\) is needed for stochastic invariance of \({\mathbb {R}}_{+}\). However, it is well known that \(\alpha\ge0\) is the correct condition. Our approach is rather in the spirit of Da Prato and Frankowska [13] who however focus on stochastic invariance of closed convex sets.
Apart from existence, Theorem 5.3 asserts that \(X\) spends zero time in the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\), which roughly speaking correspond to boundary segments of the state space. It does not, however, tell us whether these sets are actually hit. The purpose of the following theorem is to give necessary and sufficient conditions for this to occur. The proof is given in Appendix F. The vector \(h\) of polynomials appearing in the theorem exists if (G2) and (A1) are satisfied.
Theorem 5.7
Let \(X\) be an \(E\)-valued solution to (2.2) satisfying (5.2). Consider \(p\in{\mathcal {P}}\) and let \(h\) be a vector of polynomials such that \(a \nabla p = h p\) on \(M\).
-
(i)
Assume there exists a neighborhood \(U\) of \(E\cap\{p=0\}\) such that
$$ 2 {\mathcal {G}}p - h^{\top}\nabla p\ge0 \qquad\textit{on } E\cap U. $$
Then \(p(X_{t})>0\) for all \(t>0\).
-
(ii)
Assume (G2) holds and
$$ 2 {\mathcal {G}}p - h^{\top}\nabla p=0 \qquad\textit{on } M\cap\{p=0\}. $$
Then \(p(X_{t})>0\) for all \(t>0\).
-
(iii)
Let \(\overline{x}\in E\cap\{p=0\}\) and assume
$$ {\mathcal {G}}p(\overline{x})\ge0 \qquad\textit{and}\qquad2 {\mathcal {G}}p({\overline{x}}) - h({\overline{x}})^{\top}\nabla p({\overline{x}})< 0. $$
Then for any \(T>0\), there exists \(\varepsilon>0\) such that if \(\| X_{0}-\overline{x}\|<\varepsilon\) almost surely, then \(p(X_{t})=0\) for some \(t\le T\) with positive probability.
As a simple example, we may apply Theorem 5.7 to the scalar square-root diffusion \(dX_{t}=(b+\beta X_{t}) dt+\sigma\sqrt{X_{t}} dB_{t}\) with parameters \(b,\sigma> 0\) and \(\beta<0\), and where \(B\) is a one-dimensional Brownian motion. In this case \(E={\mathbb {R}}_{+}\), and \({\mathcal {P}}\) consists of the single polynomial \(p(x)=x\). We have \(a(x)p'(x) = \sigma^{2} x = \sigma^{2} p(x)\), so that \(h(x)\equiv\sigma^{2}\), and thus
$$ 2 {\mathcal {G}}p(x) - h(x) p'(x) = 2b - \sigma^{2} + 2\beta x. $$
It is well known that \(X_{t}>0\) for all \(t>0\) if and only if the Feller condition \(2b\ge\sigma^{2}\) holds. Theorem 5.7(iii) gives the necessity of the Feller condition. Theorem 5.7(i) and (ii) together give the sufficiency of the Feller condition. Indeed, if \(2b>\sigma^{2}\), then Theorem 5.7(i) applies, while the condition in Theorem 5.7(ii) is not satisfied. Theorem 5.7(ii) in turn applies when \(2b=\sigma^{2}\), while Theorem 5.7(i) does not.
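The quantities entering Theorem 5.7 for this example can also be obtained symbolically; the following sketch (ours) reproduces \(2{\mathcal {G}}p-h p'=2b-\sigma^{2}+2\beta x\), whose value \(2b-\sigma^{2}\) at the boundary point \(x=0\) is exactly the Feller quantity.

```python
# A symbolic check (ours) of the boundary quantities for the square-root diffusion on R_+.
import sympy as sp

x, b_, sigma = sp.symbols('x b sigma', positive=True)
beta = sp.Symbol('beta', negative=True)

a = sigma**2 * x                               # a(x)
drift = b_ + beta * x                          # b(x), with beta < 0
p = x                                          # boundary polynomial of E = R_+
Gp = sp.Rational(1, 2) * a * sp.diff(p, x, 2) + drift * sp.diff(p, x)
h = sp.simplify(a * sp.diff(p, x) / p)         # from a p' = h p:  h = sigma**2

quantity = sp.expand(2 * Gp - h * sp.diff(p, x))
print(quantity)                  # 2*b + 2*beta*x - sigma**2
print(quantity.subs(x, 0))       # 2*b - sigma**2: its sign separates cases (i)/(ii) from (iii)
```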
6 Examples of semialgebraic state spaces
We now discuss examples of semialgebraic state spaces of interest, where our results are applicable.
6.1 Some quadric sets
Let \(Q\in{\mathbb {S}}^{d}\) be nonsingular, and consider the state space \(E=\{x\in{\mathbb {R}}^{d}:x^{\top}Qx\le1\}\). Here \({\mathcal {P}}\) consists of the single polynomial \(p(x)=1-x^{\top}Qx\), and \(M={\mathbb {R}}^{d}\). After a linear change of coordinates, we may assume \(Q\) is diagonal with \(Q_{ii}\in\{+1,-1\}\). We also suppose \(Q_{ii}=1\) for at least one \(i\), since otherwise \(E={\mathbb {R}}^{d}\). State spaces of this type include the closed unit ball, but also non-convex sets like \(\{x\in{\mathbb {R}}^{2}:x_{1}^{2}-x_{2}^{2}\le1\}\), whose boundary is a hyperbola. One can also consider complements of such sets; see Remark 6.3 below. One interesting aspect of the state spaces investigated here is that they do not admit non-deterministic affine diffusions; this follows directly from Proposition 6.1 below, which shows that \(a\) is either quadratic or identically zero. This is in contrast to the parabolic state spaces considered by Spreij and Veerman [44].
The following convex cone of polynomial maps plays a key role. Recall that a polynomial \(r\in{\mathrm{Pol}}({\mathbb {R}}^{d})\) is called homogeneous of degree \(k\) if \(r(sx)=s^{k}r(x)\) for all \(x\in{\mathbb {R}}^{d}\) and \(s>0\). We define
Note that the condition \(c(x)Qx=0\) is equivalent to \(c(x)\nabla p(x)=0\), meaning that all eigenvectors of \(c(x)\) with nonzero eigenvalues are orthogonal to \(\nabla p(x)\). The proof of the following proposition is given in Appendix G.
Proposition 6.1
Conditions (G1) and (G2) hold for the above state space \(E\). Moreover, the operator \({\mathcal {G}}\) satisfies (A0)–(A2) if and only if
for some \(\alpha\in{\mathbb {S}}^{d}_{+}\), \(\beta\in{\mathbb {R}}^{d}\), \(B\in {\mathbb {R}}^{d\times d}\) and \(c\in{\mathcal {C}}^{Q}_{+}\) such that
Remark 6.2
If \(c(x)\) satisfies the linear growth condition \(\|c(x)\| \le C(1+\|x\| )\) for all \(x\in E\), then \(a(x)\) satisfies (3.4) and uniqueness in law for \(E\)-valued solutions to (2.2) holds by Theorem 4.2. In particular, this holds if \(Q\) is positive definite, i.e., \(Q={\mathrm{Id}}\), so that \(E\) is the unit ball and hence compact.
Remark 6.3
The conditions of Proposition 6.1 can easily be modified to cover state spaces of the form \(E=\{x\in{\mathbb {R}}^{d}:x^{\top}Qx\ge1\} \). This amounts to replacing \(p\) by \(-p\) above, and includes for example the complement of the open unit ball. With this modification, Proposition 6.1 is still true as stated, except that \(-\alpha\) should lie in \({\mathbb {S}}^{d}_{+}\), and the inequality in (6.2) should be reversed.
A question that is not addressed by Proposition 6.1 is how to describe the set \({\mathcal {C}}^{Q}_{+}\) in more explicit terms. We now provide a class of maps \(c\in{\mathcal {C}}^{Q}_{+}\), which yields a large family of polynomial diffusions on \(E\) that we expect to be useful in applications.
Let \(S_{k}\), \(k=1,\ldots,d(d-1)/2\), be a basis for the linear space of skew-symmetric \(d\times d\) matrices. Using the skew-symmetry of the \(S_{k}\) together with the fact that \(Q^{2}={\mathrm{Id}}\), it is easy to check that any map \(c\) of the form
where \(\varGamma=(\gamma_{k\ell})\in{\mathbb {S}}_{+}^{d(d-1)/2}\), lies in \({\mathcal {C}}^{Q}_{+}\). For any \(c(x)\) of the form (6.3), condition (6.2) then becomes
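As a numerical sanity check (our own sketch), the snippet below takes maps of the form \(c(x)=\sum_{k,\ell}\gamma_{k\ell}(S_{k}Qx)(S_{\ell}Qx)^{\top}\) with \(\varGamma=(\gamma_{k\ell})\) positive semidefinite (one concrete reading of the construction just described) and verifies at random points that \(c(x)\) is symmetric, positive semidefinite, and satisfies \(c(x)Qx=0\), here for the unit ball \(Q={\mathrm{Id}}\).

```python
# A numerical sanity check (ours) for one construction of maps in C^Q_+.
import numpy as np

rng = np.random.default_rng(0)
d = 3
Q = np.eye(d)                    # unit ball case

# basis of skew-symmetric d x d matrices
S = []
for i in range(d):
    for j in range(i + 1, d):
        Sk = np.zeros((d, d))
        Sk[i, j], Sk[j, i] = 1.0, -1.0
        S.append(Sk)
K = len(S)                       # = d(d-1)/2

A = rng.standard_normal((K, K))
Gamma = A @ A.T                  # a random positive semidefinite Gamma

def c(x):
    M = np.column_stack([Sk @ Q @ x for Sk in S])   # columns S_k Q x
    return M @ Gamma @ M.T       # sum_{k,l} Gamma[k,l] (S_k Q x)(S_l Q x)^T

x = rng.standard_normal(d)
cx = c(x)
print(np.allclose(cx, cx.T))                        # True: symmetric
print(np.min(np.linalg.eigvalsh(cx)) >= -1e-10)     # True: positive semidefinite
print(np.allclose(cx @ Q @ x, 0.0))                 # True: no diffusion normal to the boundary
```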
6.2 The product space \([0,1]^{m}\times{\mathbb {R}}^{n}_{+}\)
Consider the state space \(E=[0,1]^{m}\times{\mathbb {R}}^{n}_{+}\). Here \(d=m+n\), and the generating family of polynomials can be taken to be
$$ {\mathcal {P}}= \{x_{1},\ldots,x_{m},\ 1-x_{1},\ldots,1-x_{m},\ x_{m+1},\ldots,x_{m+n}\} \qquad\text{and}\qquad {\mathcal {Q}}=\emptyset. $$
To simplify notation, introduce index sets \(I=\{1,\ldots,m\}\) and \(J=\{m+1,\ldots,m+n\}\), and write \(x_{I}\) (resp. \(x_{J}\)) for the subvector of \(x\in{\mathbb {R}}^{d}\) consisting of the components with indices in \(I\) (resp. \(J\)). Similarly, for a matrix \(A\in{\mathbb {R}}^{d\times d}\), we write \(A_{II}\), \(A_{IJ}\), etc. for the submatrices with indicated row and column indices. The proof of the following proposition is given in Appendix H.
Proposition 6.4
Conditions (G1) and (G2) hold for the above state space \(E\). Moreover, the operator \({\mathcal {G}}\) satisfies (A0)–(A2) if and only if
-
(i)
the matrix \(a\) is given by
$$\begin{aligned} \textstyle\begin{array}{llll} a_{ii}(x) &= \gamma_{i} x_{i}(1-x_{i}) &&\quad (i\in I),\\ a_{ij}(x) &=0 &&\quad (i\in I,\ j\in I\cup J,\ i\ne j),\\ a_{jj}(x) &= \alpha_{jj}x_{j}^{2} + x_{j} \big(\phi_{j} + \psi_{(j)}^{\top}x_{I} + \pi_{(j)}^{\top}x_{J}\big) &&\quad (j\in J),\\ a_{ij}(x) &= \alpha_{ij}x_{i}x_{j} &&\quad (i,j\in J,\ i\ne j) \end{array}\displaystyle \end{aligned}$$for some \(\gamma\in{\mathbb {R}}^{m}_{+}\), some \(\psi_{(j)}\in{\mathbb {R}}^{m}\), some \(\pi _{(j)}\in {\mathbb {R}}^{n}_{+}\) with \(\pi_{(j),j}=0\), some \(\phi\in{\mathbb {R}}^{n}\) with \(\phi_{j}\ge(\psi _{(j)}^{-})^{\top}{\mathbf{1}}\), and some \(\alpha=(\alpha_{ij})_{i,j\in J}\in{\mathbb {S}}^{n}\) such that we have \(\alpha+\operatorname{Diag}(\varPi^{\top}x_{J})\operatorname {Diag}(x_{J})^{-1}\in{\mathbb {S}}^{n}_{+}\) for all \(x_{J}\in{\mathbb {R}}^{n}_{++}\), where \(\varPi\in{\mathbb {R}}^{n\times n}\) is the matrix with columns \(\pi_{(j)}\);
-
(ii)
the vector \(b\) is given by
$$ b(x) = \begin{pmatrix} \beta_{I} + B_{II} x_{I} \\ \beta_{J} + B_{JI}x_{I} + B_{JJ}x_{J} \end{pmatrix} \qquad (6.4) $$
for some \(\beta\in{\mathbb {R}}^{d}\) and \(B\in{\mathbb {R}}^{d\times d}\) such that \((B^{-}_{i,I\setminus\{i\}}){\mathbf{1}}<\beta_{i}< - B_{ii}-(B^{+}_{i,I\setminus\{i\}}){\mathbf{1}}\) for all \(i\in I\), \(\beta_{j}> (B^{-}_{jI}){\mathbf{1}}\) for all \(j\in J\), and \(B_{JJ}\in{\mathbb {R}}^{n\times n}\) has nonnegative off-diagonal entries.
Remark 6.5
In the following two cases, we get uniqueness in law of \(E\)-valued solutions to (2.2); cf. Theorem 4.2. First, if \(\alpha=0\) and \(\pi_{(j)}=0\) for all \(j\), then the linear growth condition (3.4) is satisfied and uniqueness follows by Theorem 4.2. Second, if \(\psi_{(j)}=0\) and \(\pi_{(j)}=0\) for all \(j\) and \(\phi=0\), then the submatrix \(a_{JJ}(x)\) only depends on \(x_{J}\) and can be written \(a_{JJ}=\sigma_{JJ}\sigma_{JJ}^{\top}\), where \(\sigma_{JJ}(x_{J})=\operatorname{Diag}(x_{J})\alpha^{1/2}\) is Lipschitz-continuous. Since also \(X_{I}\) is an autonomous \(m\)-dimensional diffusion on \([0,1]^{m}\), uniqueness follows from Theorem 4.4 in conjunction with Theorem 4.2. Note that \(X_{I}\) and \(X_{J}\) are coupled only through the drift in this case.
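As a quick numerical check (ours) of condition (A0) for one admissible parameter choice in Proposition 6.4, namely \(m=1\), \(n=2\), \(\psi_{(j)}=0\), \(\pi_{(j)}=0\), \(\phi=(1,1)\), \(\gamma=0.5\) and a positive semidefinite \(\alpha\), the following snippet samples points of \(E=[0,1]\times{\mathbb {R}}^{2}_{+}\) and verifies that \(a(x)\) is positive semidefinite there.

```python
# A numerical check (ours) that the diffusion matrix of Proposition 6.4(i) is PSD on E
# for one admissible parameter choice (m = 1, n = 2).
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5
phi = np.array([1.0, 1.0])
alpha = np.array([[0.2, 0.1],
                  [0.1, 0.3]])                  # alpha in S^2_+

def a(x):
    xI, xJ = x[0], x[1:]
    A = np.zeros((3, 3))
    A[0, 0] = gamma * xI * (1 - xI)             # a_II(x) = gamma * x1 * (1 - x1)
    A[1:, 1:] = np.diag(xJ) @ alpha @ np.diag(xJ) + np.diag(phi * xJ)   # a_JJ(x)
    return A                                    # a_IJ = 0 in this specification

for _ in range(1000):
    x = np.array([rng.uniform(0, 1), *rng.exponential(1.0, size=2)])
    assert np.min(np.linalg.eigvalsh(a(x))) >= -1e-10

print("a(x) is positive semidefinite at all sampled points of E")
```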
A natural next step is to consider the state space \([0,1]^{m}\times {\mathbb {R}} ^{n}_{+}\times{\mathbb {R}}^{\ell}\), \(d=m+n+\ell\). In this case, one readily continues the above argument to deduce that the diffusion matrix is of the form
where \(K=\{m+n+1,\ldots,d\}\), the matrices \(a_{II}\) and \(a_{JJ}\) are given by Proposition 6.4(i), we have \(a_{IK}(x_{I})=\operatorname{Diag} (x_{I})({\mathrm{Id}}- \operatorname{Diag}(x_{I})){\mathrm {P}}\) for some \({\mathrm {P}}\in{\mathbb {R}}^{m\times l}\) and \(a_{JK}(x_{I},x_{J})= \operatorname{Diag}(x_{J}){\mathrm {H}}(x_{I},x_{J})\) for some matrix \({\mathrm {H}}\) of polynomials in \({\mathrm{Pol}}_{1}(E)\), and \(a_{KK}\) has component functions in \({\mathrm{Pol}}_{2}(E)\). Regarding the drift vector \(b {=} (b_{I},b_{J},b_{K})\), the last part \(b_{K}\) is unrestricted within the class of affine functions of \(x\), whereas \((b_{I},b_{J})\) must satisfy Proposition 6.4(ii). With this structure, we have (A0)–(A2) if and only if \(a\in{\mathbb {S}}^{d}_{+}\) on \(E\). This of course imposes additional restrictions on \({\mathrm {P}}\), \({\mathrm {H}}\) and \(a_{KK}\). Stating these restrictions explicitly is cumbersome, and we refrain from doing so here.
6.3 The unit simplex
Let \(d\ge2\) and consider the unit simplex \(E=\{x\in{\mathbb {R}}^{d}_{+}: x_{1}+\cdots +x_{d}=1\}\). Here \({\mathcal {P}}=\{x_{i}:i=1,\ldots,d\}\) consists of the coordinate functions and \({\mathcal {Q}}\) consists of the single polynomial \(1-{\mathbf{1}} ^{\top}x\). The proof of the following proposition is given in Appendix I.
Proposition 6.6
Conditions (G1) and (G2) hold for the above state space \(E\). Moreover, the operator \({\mathcal {G}}\) satisfies (A0)–(A2) if and only if
-
(i)
the matrix \(a\) is given by
$$\begin{aligned} a_{ii}(x) &= \sum_{j\ne i}\alpha_{ij}x_{i}x_{j}, \\ a_{ij}(x) &= -\alpha_{ij}x_{i}x_{j} \qquad\qquad(i\ne j) \end{aligned}$$on \(E\) for some \(\alpha_{ij}\in{\mathbb {R}}_{+}\) such that \(\alpha _{ij}=\alpha _{ji}\) for all \(i,j\);
-
(ii)
the vector \(b\) is given by
$$ b(x)=\beta+Bx, $$where \(\beta\in{\mathbb {R}}^{d}\) and \(B\in{\mathbb {R}}^{d\times d}\) satisfy \(B^{\top}{\mathbf{1}}+ (\beta^{\top}{\mathbf{1}}){\mathbf{1}}= 0\) and \(\beta_{i}+B_{ji} > 0\) for all \(i\) and all \(j\ne i\).
Remark 6.7
Since \(E\) is compact, Theorem 4.2 yields uniqueness in law for \(E\)-valued solutions to (2.2).
Remark 6.8
In the special case where \(\alpha_{ij}=\sigma^{2}\) for some \(\sigma>0\) and all \(i,j\), the diffusion matrix takes the form
$$ a(x) = \sigma^{2}\big( \operatorname{Diag}(x) - x x^{\top}\big) \qquad\textit{on } E. $$
The resulting process is sometimes called a multivariate Jacobi process; see for instance Gouriéroux and Jasiak [27].
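The structural conditions (A1)–(A2) behind Proposition 6.6 are easy to verify symbolically in this special case; the sketch below (ours) does so for \(d=3\) with \(a(x)=\sigma^{2}(\operatorname{Diag}(x)-xx^{\top})\), checking that \(a\nabla q\) vanishes on \(M\) and that \(a\nabla p\) vanishes on \(M\cap\{p=0\}\) for \(p=x_{1}\).

```python
# A symbolic check (ours) of (A1)-(A2) for the multivariate Jacobi specification, d = 3.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
sigma = sp.Symbol('sigma', positive=True)
x = sp.Matrix([x1, x2, x3])

a = sigma**2 * (sp.diag(x1, x2, x3) - x * x.T)   # a(x) = sigma^2 (Diag(x) - x x^T)

on_M = {x3: 1 - x1 - x2}                         # restrict to M = {x1 + x2 + x3 = 1}
one = sp.ones(3, 1)

# (A2): a * grad(q) = 0 on M, where q = 1 - (x1 + x2 + x3) and grad q = -1
print((a * one).subs(on_M).applyfunc(sp.simplify).T)             # Matrix([[0, 0, 0]])

# (A1), first part: a * grad(p) = 0 on M ∩ {p = 0} for p = x1, grad p = e1
e1 = sp.Matrix([1, 0, 0])
print((a * e1).subs(on_M).subs(x1, 0).applyfunc(sp.simplify).T)  # Matrix([[0, 0, 0]])
```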
Remark 6.9
Alternatively, one can establish Proposition 6.6 by considering polynomial diffusions \(Y\) on the “solid” simplex \(\{y\in {\mathbb {R}} ^{d-1}_{+}:y_{1}+\cdots+y_{d-1}\le1\}\), and then set \(X=(X_{1},\ldots ,X_{d})=(Y,1-Y_{1}-\cdots-Y_{d-1})\). In this case \({\mathcal {Q}}=\emptyset \), and it would be enough to invoke Lemma 5.4 rather than Lemma 5.5.
7 Polynomial diffusion models in finance
We now elaborate on various polynomial diffusion models in finance, following up on the discussion of (1.1) in the introduction. Let the state price density \(\zeta\) be a positive semimartingale on a filtered probability space \((\varOmega,{\mathcal {F}},{\mathcal {F}}_{t},{\mathbb {P}})\). This induces an arbitrage-free financial market model on any finite time horizon \(T^{\ast}\). Indeed, let \(S^{1},\dots,S^{m}\) denote the price processes of \(m\) fundamental assets. According to (1.1), we have \(\zeta_{t} S^{i}_{t} = {\mathbb {E}}[ \zeta_{T^{\ast}} S^{i}_{T^{\ast}} \,|\, {\mathcal {F}}_{t} ]\). Assuming that \(S^{1}\) is positive, we choose it as numeraire. This defines an equivalent measure \({\mathbb {Q}}^{1}\sim{\mathbb {P}}\) on \({\mathcal {F}}_{T^{\ast}}\) by
$$ \frac{{\mathrm{d}} {\mathbb {Q}}^{1}}{{\mathrm{d}} {\mathbb {P}}} = \frac{\zeta_{T^{\ast}} S^{1}_{T^{\ast}}}{\zeta_{0} S^{1}_{0}} . $$
Discounted price processes \(\frac{S^{i}}{ S^{1} }\) are \({\mathbb {Q}}^{1}\)-martingales, because
$$ {\mathbb {E}}^{{\mathbb {Q}}^{1}}\bigg[ \frac{S^{i}_{T}}{S^{1}_{T}} \,\Big|\, {\mathcal {F}}_{t}\bigg] = \frac{{\mathbb {E}}[ \zeta_{T} S^{i}_{T} \,|\, {\mathcal {F}}_{t}]}{\zeta_{t} S^{1}_{t}} = \frac{S^{i}_{t}}{S^{1}_{t}}, \qquad t\le T\le T^{\ast}, $$
by Bayes’ rule and (1.1).
This implies that the market \(\{S^{1},\dots,S^{m}\}\) is arbitrage-free in the sense of no free lunch with vanishing risk; see Delbaen and Schachermayer [14].
Now let \(X\) be a polynomial diffusion on a state space \(E\subseteq {\mathbb {R}}^{d}\). Fix \(n\in{\mathbb {N}}\), and let \(p\in{\mathrm{Pol}}_{n}(E)\) be a positive polynomial on \(E\) with coordinate representation \(\vec{p}\) with respect to some basis \(H(x)=(h_{1}(x),\ldots,h_{N}(x))^{\top}\) for \({\mathrm{Pol}}_{n}(E)\). The state price density is specified by \(\zeta_{t} = \mathrm{e}^{-\alpha t} p(X_{t})\), where \(\alpha\) is a real parameter. This setup yields an arbitrage-free model for the term structure of interest rates. The time \(t\) price \(P(t,T)\) of a zero coupon bond maturing at \(T\), corresponding to \(C_{T}=1\) in (1.1), can now be computed explicitly, using Theorem 3.1, as
$$ P(t,T) = \frac{{\mathbb {E}}[\zeta_{T} \,|\, {\mathcal {F}}_{t}]}{\zeta_{t}} = \mathrm{e}^{-\alpha(T-t)} \frac{H(X_{t})^{\top}\mathrm{e}^{(T-t)G} \vec{p}}{p(X_{t})}, $$
where \(G\in{\mathbb {R}}^{N\times N}\) is the matrix representation of \({\mathcal {G}}\) on \({\mathrm{Pol}}_{n}(E)\). The short rate is obtained via the relation \(r_{t}=-\partial_{T}\log P(t,T)\,|\,_{T=t}\), and is given by
$$ r_{t} = \alpha- \frac{H(X_{t})^{\top}G \vec{p}}{p(X_{t})}. $$
This expression clarifies the role of the parameter \(\alpha\) adjusting the level of interest rates. Such models show great potential. The linear case with \(p\) of the form \(p(x)=\phi+\psi^{\top}x\) has been studied in Filipović et al. [21], including an extensive empirical assessment. The parameter \(\psi\) is chosen such that \(E\) lies in the positive cone \(\{ x\in{\mathbb {R}}^{d}: \psi^{\top}x\ge0\} \). A specific example is \(E={\mathbb {R}}^{d}_{+}\), as discussed in Sect. 6.2.
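The bond price and short rate formulas above are straightforward to evaluate numerically. The following toy one-factor sketch (ours; the parameter values are illustrative and this is not the specification estimated in Filipović et al. [21]) uses the square-root factor from Sect. 5, the linear polynomial \(p(x)=1+x\) and the basis \((1,x)\).

```python
# A toy one-factor linear-rational sketch (ours): X is the square-root diffusion on R_+,
# p(x) = 1 + x, and zeta_t = e^{-alpha t} p(X_t).
import numpy as np
from scipy.linalg import expm

b, beta, alpha = 0.1, -0.3, 0.04
G = np.array([[0.0, b],
              [0.0, beta]])        # generator matrix on Pol_1(R_+) in the basis (1, x)
p_vec = np.array([1.0, 1.0])       # coordinates of p(x) = 1 + x

def bond_price(x, T):
    H = np.array([1.0, x])
    return np.exp(-alpha * T) * H @ expm(T * G) @ p_vec / (1.0 + x)

def short_rate(x):
    H = np.array([1.0, x])
    return alpha - H @ G @ p_vec / (1.0 + x)

x0 = 0.5
print([round(bond_price(x0, T), 4) for T in (1, 5, 10)])   # decreasing discount curve
print(short_rate(x0))              # alpha - (b + beta*x0) / (1 + x0)
```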
One attractive feature of the polynomial framework is that it yields efficient pricing formulae for options on coupon-bearing bonds. This includes swaptions, which are among the most important interest rate options. The generic payoff of such an option at expiry date \(T\) is of the form
$$ \bigg( c_{0} + \sum_{i=1}^{m} c_{i} P(T,T_{i})\bigg)^{+} $$
for maturity dates \(T< T_{1}<\cdots<T_{m}\) and deterministic coefficients \(c_{0},\dots,c_{m}\). Formula (1.1) for the time \(t\) price of this option boils down to computing the \({\mathcal {F}}_{t}\)-conditional expectation of
$$ \zeta_{T}\bigg( c_{0} + \sum_{i=1}^{m} c_{i} P(T,T_{i})\bigg)^{+} = \mathrm{e}^{-\alpha T}\bigg( c_{0}\, p(X_{T}) + \sum_{i=1}^{m} c_{i}\, \mathrm{e}^{-\alpha(T_{i}-T)} H(X_{T})^{\top}\mathrm{e}^{(T_{i}-T)G}\vec{p}\bigg)^{+}, $$
which is the positive part of a polynomial in \(X_{T}\). Efficient methods involving the closed form \({\mathcal {F}}_{t}\)-conditional moments of \(X_{T}\) are available; see Filipović et al. [20].
Polynomial diffusions can be employed in a similar way to build stochastic volatility models. We now interpret ℙ as risk-neutral measure, and specify the spot variance (squared volatility) of an underlying stock index by \(v_{t}=p(X_{t})\). The variance swap rate for period \([t,T]\) is then given in closed form by
$$ VS(t,T) = \frac{1}{T-t} {\mathbb {E}}\bigg[ \int_{t}^{T} v_{s} {\,\mathrm{d}} s \,\Big|\, {\mathcal {F}}_{t}\bigg] = \frac{1}{T-t} H(X_{t})^{\top}\bigg( \int_{0}^{T-t}\mathrm{e}^{s G} {\,\mathrm{d}} s\bigg) \vec{p}. $$
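Evaluating the integrated matrix exponential in this formula requires no numerical quadrature. The sketch below (ours; the toy one-factor specification with spot variance \(v_{t}=0.04+X_{t}\) is an assumption made only for the illustration) obtains \(\int_{0}^{\tau}\mathrm{e}^{sG}{\,\mathrm{d}}s\) from a single matrix exponential of an augmented block matrix.

```python
# A sketch (ours) of the variance swap rate: the integrated matrix exponential is read
# off the upper-right block of expm of an augmented matrix (Van Loan's identity).
import numpy as np
from scipy.linalg import expm

b, beta = 0.1, -0.3
G = np.array([[0.0, b], [0.0, beta]])   # generator matrix on Pol_1(R_+), basis (1, x)
p_vec = np.array([0.04, 1.0])           # coordinates of p(x) = 0.04 + x, so v_t = 0.04 + X_t

def vs_rate(x, tau):
    """Annualized expected integrated variance over horizon tau, given X_t = x."""
    N = G.shape[0]
    M = np.zeros((2 * N, 2 * N))
    M[:N, :N] = G
    M[:N, N:] = np.eye(N)
    int_expG = expm(tau * M)[:N, N:]    # upper-right block = int_0^tau e^{sG} ds
    H = np.array([1.0, x])
    return H @ int_expG @ p_vec / tau

print(vs_rate(0.04, 1.0))               # variance swap rate for a one-year swap
```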
Such models have been successfully employed in Filipović et al. [23] and Ackerer et al. [2]. Both papers consider the quadratic case, which falls into the setup of Sect. 6.1, with a quadric state space \(E=\{x\in{\mathbb {R}}^{d}:x^{\top}Qx\le1\}\) and spot variance \(v_{t}=p(X_{t})\) for a polynomial \(p\) of the form \(p(x)=\phi+1-x^{\top}Q x\), where \(\phi\ge0\) denotes the minimal spot variance. While Filipović et al. [23] study unbounded state spaces, Ackerer et al. [2] focus on the compact case, where \(Q\) is positive definite. They derive analytic option pricing formulae in terms of Hermite polynomials for European call and put options on an asset with diffusive price process \({\mathrm{d}} S_{t} = S_{t} r {\,\mathrm{d}} t + S_{t} \sqrt{v_{t}} {\,\mathrm{d}} W^{\ast}_{t}\), where \(r\) denotes the constant short rate and \(W^{\ast}\) is a Brownian motion, which is possibly correlated with \(W\) in (2.2).
An application of the unit simplex in Sect. 6.3 is obtained as follows. Consider a stock index, such as the S&P 500, whose price process is given by a semimartingale \(Z\). As above, we interpret ℙ as risk-neutral measure and assume a constant short rate \(r\) such that \((\mathrm{e}^{-rt}Z_{t})\) is a martingale. Let \(d\) be the number of constituent stocks, and let \(X\) be a polynomial diffusion on \(E=\{x\in{\mathbb {R}}^{d}_{+}: x_{1}+\cdots+x_{d}=1\}\) which is independent of \(Z\). We fix a finite time horizon \(T^{\ast}\) and define the \(E\)-valued martingale, for \(t\le T^{\ast}\),
$$ Y_{t} = {\mathbb {E}}[ X_{T^{\ast}} \,|\, {\mathcal {F}}_{t}]. $$
Since \(X\) is polynomial, \(Y_{t}\) is a first degree polynomial in \(X_{t}\) whose coefficients can be determined by an application of Theorem 3.1. Specifically, with \(\beta\) and \(B\) being the drift parameters of \(X\) as given in Proposition 6.6, one finds
$$ Y_{t} = \varPhi(T^{\ast}-t) + \varPsi(T^{\ast}-t) X_{t}, \qquad\text{where } \varPsi(s)=\mathrm{e}^{s B} \text{ and } \varPhi(s)=\int_{0}^{s}\mathrm{e}^{u B}\beta {\,\mathrm{d}} u. $$
We now define the constituent stocks’ price processes \(S^{i} = Y^{i} Z\), \(i=1,\dots,d\), such that \(S^{1}+\cdots+S^{d}=Z\). Assume that the price of the European call option on the index with maturity \(T\) and strike \(K\) is given in closed form, \(C(T,K)\), for some analytic function \(C\). The price of the call option on stock \(i\) with maturity \(T\) and strike \(K\) is then given by
$$ C_{i}(T,K) = {\mathbb {E}}\big[ \mathrm{e}^{-r T}( S^{i}_{T}-K)^{+}\big] = {\mathbb {E}}\big[ Y^{i}_{T}\, C(T, K/Y^{i}_{T})\big], $$
using the independence of \(X\) and \(Z\).
This price can be efficiently computed in three steps. First, compute \(\xi C(T,K/\xi)\) for a finite set of grid points \(\xi\in[0,1]\). Second, apply some polynomial interpolation scheme, for example using Chebyshev polynomials, to obtain a polynomial approximation of degree \(n\), say \(q(T,K,\xi)\), of \(\xi C(T,K/\xi)\) in \(\xi\in[0,1]\). Third, approximate the option price \(C_{i}(T,K)\) by \(H(X_{0})^{\top}\mathrm{e}^{T G} \vec{p}_{i}(T,K)\), where \(\vec{p}_{i}(T,K)\) is the coordinate representation of the polynomial \(p(x) = q\big(T,K,\varPhi_{i}(T^{\ast}-T) + ( \varPsi(T^{\ast}-T) x)_{i}\big)\) in \(x\) with respect to some appropriately chosen basis of polynomials for \(\mathrm{Pol}_{n}(E)\). Extensions to basket and spread options on the stocks \(S^{1},\dots,S^{d}\) are straightforward. This is work in progress.
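The first two steps of this procedure are illustrated below (our sketch; for the purpose of the illustration, the index option price \(C(T,K)\) is taken to be the Black–Scholes formula, which is an assumption and not part of the model above). The third step then applies the moment formula of Theorem 3.1 exactly as in the earlier sketches.

```python
# A sketch (ours) of steps 1-2: fit a Chebyshev polynomial to xi -> xi * C(T, K/xi) on [0, 1],
# with the index call price C(T, K) assumed to be Black-Scholes for illustration only.
import numpy as np
from numpy.polynomial import chebyshev as cheb
from scipy.stats import norm

Z0, r, vol, T, K = 100.0, 0.02, 0.2, 1.0, 40.0

def C(T, K):                                   # assumed index call price (Black-Scholes)
    d1 = (np.log(Z0 / K) + (r + 0.5 * vol**2) * T) / (vol * np.sqrt(T))
    d2 = d1 - vol * np.sqrt(T)
    return Z0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

def f(xi):                                     # xi * C(T, K / xi), extended by 0 at xi = 0
    xi = np.asarray(xi, dtype=float)
    out = np.zeros_like(xi)
    pos = xi > 1e-12
    out[pos] = xi[pos] * C(T, K / xi[pos])
    return out

# Step 1: evaluate on a Chebyshev grid in [0, 1]; Step 2: fit a degree-n polynomial.
n = 12
nodes = 0.5 * (np.cos(np.pi * (np.arange(n + 1) + 0.5) / (n + 1)) + 1.0)
coeffs = cheb.chebfit(2 * nodes - 1, f(nodes), n)          # Chebyshev basis on [-1, 1]
q = lambda xi: cheb.chebval(2 * np.asarray(xi) - 1, coeffs)

xi_test = np.linspace(0.05, 1.0, 5)
print(np.max(np.abs(q(xi_test) - f(xi_test))))             # interpolation error (shrinks as n grows)
# Step 3 would replace xi by Phi_i(T*-T) + (Psi(T*-T) x)_i and apply the moment
# formula H(X_0)^T exp(T G) p_i from Theorem 3.1.
```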
An application of polynomial diffusions on a compact state space to credit risk is given in Ackerer and Filipović [1].
Notes
We thank Mykhaylo Shkolnikov for suggesting a way to improve an earlier version of this result.
For geometric Brownian motion, there is a more fundamental reason to expect that uniqueness cannot be proved via the moment problem: it is well known that the lognormal distribution is not determined by its moments; see Heyde [29]. It thus becomes natural to pose the following question: Can one find a process \(Y\) , essentially different from geometric Brownian motion, such that all joint moments of all finite-dimensional marginal distributions,
$$ {\mathbb {E}}[Y_{t_{1}}^{\alpha_{1}} \cdots Y_{t_{m}}^{\alpha_{m}}], \qquad m\in{\mathbb {N}}, (\alpha_{1},\ldots,\alpha_{m})\in{\mathbb {N}}^{m}, 0\le t_{1}< \cdots< t_{m}< \infty, $$
coincide with those of geometric Brownian motion? We have not been able to exhibit such a process. Note that any such \(Y\) must possess a continuous version. Indeed, the known formulas for the moments of the lognormal distribution imply that for each \(T\ge0\), there is a constant \(c=c(T)\) such that \({\mathbb {E}}[(Y_{t}-Y_{s})^{4}] \le c(t-s)^{2}\) for all \(s\le t\le T, |t-s|\le1\), whence Kolmogorov’s continuity lemma implies that \(Y\) has a continuous version; see Rogers and Williams [42, Theorem I.25.2].
Note that unlike many other results in that paper, Proposition 2 in Bakry and Émery [4] does not require \(\widehat{\mathcal {G}}\) to leave \(C^{\infty}_{c}(E_{0})\) invariant, and is thus applicable in our setting.
A matrix \(A\) is called strictly diagonally dominant if \(|A_{ii}|>\sum_{j\ne i}|A_{ij}|\) for all \(i\); see Horn and Johnson [30, Definition 6.1.9].
References
Ackerer, D., Filipović, D.: Linear credit risk models. Swiss Finance Institute Research Paper No. 16-34 (2016). Available online at http://ssrn.com/abstract=2782455
Ackerer, D., Filipović, D., Pulido, S.: The Jacobi stochastic volatility model. Swiss Finance Institute Research Paper No. 16-35 (2016). Available online at http://ssrn.com/abstract=2782486
Akhiezer, N.I.: The Classical Moment Problem and Some Related Questions in Analysis. Oliver & Boyd, Edinburgh (1965)
Bakry, D., Émery, M.: Diffusions hypercontractives. In: Yor, M., Azéma, J. (eds.) Séminaire de Probabilités XIX. Lecture Notes in Mathematics, vol. 1123, pp. 177–206. Springer, Berlin (1985)
Berg, C., Christensen, J.P.R., Jensen, C.U.: A remark on the multidimensional moment problem. Math. Ann. 243, 163–169 (1979)
Bochnak, J., Coste, M., Roy, M.-F.: Real Algebraic Geometry. Springer, Berlin (1998)
Carr, P., Fisher, T., Ruf, J.: On the hedging of options on exploding exchange rates. Finance Stoch. 18, 115–144 (2014)
Cherny, A.: On the uniqueness in law and the pathwise uniqueness for stochastic differential equations. Theory Probab. Appl. 46, 406–419 (2002)
Cuchiero, C.: Affine and polynomial processes. Ph.D. thesis, ETH Zurich (2011). Available online at http://e-collection.library.ethz.ch/eserv/eth:4629/eth-4629-02.pdf
Cuchiero, C., Keller-Ressel, M., Teichmann, J.: Polynomial processes and their applications to mathematical finance. Finance Stoch. 16, 711–740 (2012)
Curtiss, J.H.: A note on the theory of moment generating functions. Ann. Math. Stat. 13, 430–433 (1942)
Da Prato, G., Frankowska, H.: Invariance of stochastic control systems with deterministic arguments. J. Differ. Equ. 200, 18–52 (2004)
Da Prato, G., Frankowska, H.: Stochastic viability of convex sets. J. Math. Anal. Appl. 333, 151–163 (2007)
Delbaen, F., Schachermayer, W.: A general version of the fundamental theorem of asset pricing. Math. Ann. 300, 463–520 (1994)
Delbaen, F., Shirakawa, H.: An interest rate model with upper and lower bounds. Asia-Pac. Financ. Mark. 9, 191–209 (2002)
Dummit, D.S., Foote, R.M.: Abstract Algebra, 3rd edn. Wiley, Hoboken (2004)
Dunkl, C.F.: Hankel transforms associated to finite reflection groups. Contemp. Math. 138, 123–138 (1992)
Ethier, S.N.: A class of degenerate diffusion processes occurring in population genetics. Commun. Pure Appl. Math. 29, 483–493 (1976)
Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. Wiley, Hoboken (2005)
Filipović, D., Mayerhofer, E., Schneider, P.: Density approximations for multivariate affine jump-diffusion processes. J. Econom. 176, 93–111 (2013)
Filipović, D., Larsson, M., Trolle, A.: Linear-rational term structure models. J. Finance. Forthcoming. Available at SSRN http://ssrn.com/abstract=2397898
Filipović, D., Tappe, S., Teichmann, J.: Invariant manifolds with boundary for jump-diffusions. Electron. J. Probab. 19, 1–28 (2014)
Filipović, D., Gourier, E., Mancini, L.: Quadratic variance swap models. J. Financ. Econ. 119, 44–68 (2016)
Forman, J.L., Sørensen, M.: The Pearson diffusions: a class of statistically tractable diffusion processes. Scand. J. Stat. 35, 438–465 (2008)
Gallardo, L., Yor, M.: A chaotic representation property of the multidimensional Dunkl processes. Ann. Probab. 34, 1530–1549 (2006)
Göing-Jaeschke, A., Yor, M.: A survey and some generalizations of Bessel processes. Bernoulli 9, 313–349 (2003)
Gouriéroux, C., Jasiak, J.: Multivariate Jacobi process with application to smooth transitions. J. Econom. 131, 475–505 (2006)
Hajek, B.: Mean stochastic comparison of diffusions. Z. Wahrscheinlichkeitstheor. Verw. Geb. 68, 315–329 (1985)
Heyde, C.C.: On a property of the lognormal distribution. J. R. Stat. Soc., Ser. B, Stat. Methodol. 25, 392–393 (1963)
Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1985)
Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. North-Holland, Amsterdam (1981)
Kleiber, C., Stoyanov, J.: Multivariate distributions and the moment problem. J. Multivar. Anal. 113, 7–18 (2013)
Larsen, K.S., Sørensen, M.: Diffusion models for exchange rates in a target zone. Math. Finance 17, 285–306 (2007)
Larsson, M., Ruf, J.: Convergence of local supermartingales and Novikov–Kazamaki type conditions for processes with jumps (2014). arXiv:1411.6229
Lord, R., Koekkoek, R., van Dijk, D.: A comparison of biased simulation schemes for stochastic volatility models. Quant. Finance 10, 177–194 (2010)
Maisonneuve, B.: Une mise au point sur les martingales locales continues définies sur un intervalle stochastique. In: Dellacherie, C., et al. (eds.) Séminaire de Probabilités XI. Lecture Notes in Mathematics, vol. 581, pp. 435–445. Springer, Berlin (1977)
Mayerhofer, E., Pfaffel, O., Stelzer, R.: On strong solutions for positive definite jump diffusions. Stoch. Process. Appl. 121, 2072–2086 (2011)
Mazet, O.: Classification des semi-groupes de diffusion sur ℝ associés à une famille de polynômes orthogonaux. In: Azéma, J., et al. (eds.) Séminaire de Probabilités XXXI. Lecture Notes in Mathematics, vol. 1655, pp. 40–53. Springer, Berlin (1997)
Penrose, R.: A generalized inverse for matrices. Math. Proc. Camb. Philos. Soc. 51, 406–413 (1955)
Petersen, L.C.: On the relation between the multidimensional moment problem and the one-dimensional moment problem. Math. Scand. 51, 361–366 (1982)
Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, 3rd edn. Springer, Berlin (1999)
Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes and Martingales. Cambridge University Press, Cambridge (1994)
Schmüdgen, K.: The \(K\)-moment problem for compact semi-algebraic sets. Math. Ann. 289, 203–206 (1991)
Spreij, P., Veerman, E.: Affine diffusions with non-canonical state space. Stoch. Anal. Appl. 30, 605–641 (2012)
Stieltjes, T.J.: Recherches sur les fractions continues. Ann. Fac. Sci. Toulouse 8(4), 1–122 (1894)
Stoyanov, J.: Krein condition in probabilistic moment problems. Bernoulli 6, 939–949 (2000)
Willard, S.: General Topology. Courier Corporation, North Chelmsford (2004)
Wong, E.: The construction of a class of stationary Markoff processes. In: Bellman, R. (ed.) Stochastic Processes in Mathematical Physics and Engineering, pp. 264–276. Am. Math. Soc., Providence (1964)
Zhou, H.: Itô conditional moment generator and the estimation of short-rate processes. J. Financ. Econom. 1, 250–271 (2003)
Acknowledgements
The authors wish to thank Damien Ackerer, Peter Glynn, Kostas Kardaras, Guillermo Mantilla-Soler, Sergio Pulido, Mykhaylo Shkolnikov, Jordan Stoyanov and Josef Teichmann for useful comments and stimulating discussions. Thanks are also due to the referees, co-editor, and editor for their valuable remarks.
Additional information
The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement n. 307465-POLYTE.
Appendices
Appendix A: Nonnegative Itô processes
The following auxiliary result forms the basis of the proof of Theorem 5.3. It gives necessary and sufficient conditions for nonnegativity of certain Itô processes.
Lemma A.1
Let \(Z\) be a continuous semimartingale of the form
$$ Z_{t} = Z_{0} + \int_{0}^{t}\mu_{s}{\,\mathrm{d}} s + \int_{0}^{t}\nu_{s}{\,\mathrm{d}} B_{s}, $$
where \(Z_{0}\ge0\), \(\mu\) and \(\nu\) are continuous processes, and \(B\) is a Brownian motion. Let \(L^{0}\) be the local time of \(Z\) at level zero.
(i) If \(\mu>0\) on \(\{Z=0\}\) and \(L^{0}=0\), then \(Z\ge0\) and \(\int_{0}^{t} {\boldsymbol{1}_{\{Z_{s}=0\}}}{\,\mathrm{d}} s=0\).
(ii) If \(Z\ge0\), then on \(\{Z=0\}\), we have \(\mu\ge0\) and \(\nu=0\).
Proof
After stopping we may assume that \(Z_{t}\), \(\int_{0}^{t}\mu_{s}{\,\mathrm{d}} s\) and \(\int _{0}^{t}\nu_{s}{\,\mathrm{d}} B_{s}\) are uniformly bounded. This is done throughout the proof.
We first prove (i). By [41, Theorem VI.1.7] and using that \(\mu>0\) on \(\{Z=0\}\) and \(L^{0}=0\), we obtain \(0 = L^{0}_{t} =L^{0-}_{t} + 2\int_{0}^{t} {\boldsymbol {1}_{\{Z_{s}=0\}}}\mu _{s}{\,\mathrm{d}} s \ge0\). In particular, \(\int_{0}^{t}{\boldsymbol{1}_{\{Z_{s}=0\} }}{\,\mathrm{d}} s=0\), as claimed. Furthermore, Tanaka’s formula [41, Theorem VI.1.2] yields
Define \(\rho=\inf\left\{ t\ge0: Z_{t}<0\right\}\) and \(\tau=\inf \left\{ t\ge\rho: \mu_{t}=0 \right\} \wedge(\rho+1)\). Using that \(Z^{-}=0\) on \(\{\rho=\infty\}\) as well as dominated convergence, we obtain
Here \(Z_{\tau}\) is well defined on \(\{\rho<\infty\}\) since \(\tau <\infty\) on this set. On the other hand, by (A.1), the fact that \(\int_{0}^{t}{\boldsymbol{1}_{\{Z_{s}\le0\}}}\mu_{s}{\,\mathrm{d}} s=\int _{0}^{t}{\boldsymbol{1}_{\{Z_{s}=0\}}}\mu_{s}{\,\mathrm{d}} s=0\) on \(\{ \rho =\infty\}\) and monotone convergence, we get
Consequently,
The following hold on \(\{\rho<\infty\}\): \(\tau>\rho\); \(Z_{t}\ge0\) on \([0,\rho]\); \(\mu_{t}>0\) on \([\rho,\tau)\); and \(Z_{t}<0\) on some nonempty open subset of \((\rho,\tau)\). Therefore, the random variable inside the expectation on the right-hand side of (A.2) is strictly negative on \(\{\rho<\infty\}\). The left-hand side, however, is nonnegative; so we deduce \({\mathbb {P}}[\rho<\infty]=0\). Part (i) is proved.
The proof of Part (ii) involves the same ideas as used for instance in Spreij and Veerman [44, Proposition 3.1]. We first assume \(Z_{0}=0\) and prove \(\mu_{0}\ge0\) and \(\nu_{0}=0\). Assume for contradiction that \({\mathbb {P}} [\mu_{0}<0]>0\), and define \(\tau=\inf\{t\ge0:\mu_{t}\ge0\}\wedge1\). Then \(0\le{\mathbb {E}}[Z_{\tau}] = {\mathbb {E}}[\int_{0}^{\tau}\mu_{s}{\,\mathrm{d}} s]<0\), a contradiction, whence \(\mu_{0}\ge0\) as desired. Next, pick any \(\phi\in{\mathbb {R}}\) and consider an equivalent measure \({\mathrm{d}}{\mathbb {Q}}={\mathcal {E}}(-\phi B)_{1}{\,\mathrm{d}} {\mathbb {P}}\). Then \(B^{\mathbb {Q}}_{t} = B_{t} + \phi t\) is a ℚ-Brownian motion on \([0,1]\), and we have
Pick any \(\varepsilon>0\) and define \(\sigma=\inf\{t\ge0:|\nu_{t}|\le \varepsilon\}\wedge1\). The first part of the proof applied to the stopped process \(Z^{\sigma}\) under ℚ yields \((\mu_{0}-\phi \nu_{0}){\boldsymbol{1}_{\{\sigma>0\}}}\ge0\) for all \(\phi\in {\mathbb {R}}\). But this forces \(\sigma=0\) and hence \(|\nu_{0}|\le\varepsilon\). Since \(\varepsilon>0\) was arbitrary, we get \(\nu_{0}=0\) as desired.
Now consider any stopping time \(\rho\) such that \(Z_{\rho}=0\) on \(\{\rho <\infty\}\). Applying the result we have already proved to the process \((Z_{\rho+t}{\boldsymbol{1}_{\{\rho<\infty\}}})_{t\ge0}\) with filtration \(({\mathcal {F}} _{\rho+t}\cap\{\rho<\infty\})_{t\ge0}\) then yields \(\mu_{\rho}\ge0\) and \(\nu_{\rho}=0\) on \(\{\rho<\infty\}\). Finally, let \(\{\rho_{n}:n\in{\mathbb {N}}\}\) be a countable collection of such stopping times that are dense in \(\{t:Z_{t}=0\}\). Applying the above result to each \(\rho_{n}\) and using the continuity of \(\mu\) and \(\nu\), we obtain (ii). □
The following two examples show that the assumptions of Lemma A.1 are tight in the sense that the gap between (i) and (ii) cannot be closed.
Example A.2
The strict inequality appearing in Lemma A.1(i) cannot be relaxed to a weak inequality: just consider the deterministic process \(Z_{t}=(1-t)^{3}\). Here \(\mu_{t}=-3(1-t)^{2}\) vanishes on \(\{Z=0\}\), so the weak inequality \(\mu\ge0\) holds there, and \(L^{0}=0\) since \(Z\) has finite variation; nonetheless \(Z\) becomes negative after time 1.
Example A.3
The assumption of vanishing local time at zero in Lemma A.1(i) cannot be replaced by the zero volatility condition \(\nu =0\) on \(\{Z=0\}\), even if the strictly positive drift condition is retained. This is demonstrated by a construction that is closely related to the so-called Girsanov SDE; see Rogers and Williams [42, Sect. V.26]. Let \(Y\) be a one-dimensional Brownian motion, and define \(\rho(y)=|y|^{-2\alpha }\vee1\) for some \(0<\alpha<1/4\). The occupation density formula implies that
for all \(t\ge0\); so we may define a positive local martingale by
Let \(\tau\) be a strictly positive stopping time such that the stopped process \(R^{\tau}\) is a uniformly integrable martingale. Then define the equivalent probability measure \({\mathrm{d}}{\mathbb {Q}}=R_{\tau}{\,\mathrm{d}}{\mathbb {P}}\), under which the process \(B_{t}=Y_{t}-\int_{0}^{t\wedge\tau}\rho(Y_{s}){\,\mathrm{d}} s\) is a Brownian motion. We now change time via
and define \(Z_{u} = Y_{A_{u}}\). This process satisfies \(Z_{u} = B_{A_{u}} + u\wedge\sigma\), where \(\sigma=\varphi_{\tau}\). Define then \(\beta _{u}=\int _{0}^{u} \rho(Z_{v})^{1/2}{\,\mathrm{d}} B_{A_{v}}\), which is a Brownian motion because we have \(\langle\beta,\beta\rangle_{u}=\int_{0}^{u}\rho(Z_{v}){\,\mathrm{d}} A_{v}=u\). This finally gives
This process starts at zero, has zero volatility whenever \(Z_{t}=0\), and strictly positive drift prior to the stopping time \(\sigma\), which is strictly positive. Nonetheless, its sign changes infinitely often on any time interval \([0,t)\) since it is a time-changed Brownian motion viewed under an equivalent measure.
Appendix B: Proof of Theorem 3.1
We first establish a lemma.
Lemma B.1
For any \(k\in{\mathbb {N}}\) such that \({\mathbb {E}}[\|X_{0}\|^{2k}]<\infty\), there is a constant \(C\) such that
$$ {\mathbb {E}}\big[1+\|X_{t}\|^{2k} \,\big|\, {\mathcal {F}}_{0}\big] \le \big(1+\|X_{0}\|^{2k}\big)\,\mathrm{e}^{Ct}, \qquad t\ge0. $$
Proof
This is done as in the proof of Theorem 2.10 in Cuchiero et al. [10] via Gronwall’s inequality. Specifically, let \(f\in {\mathrm{Pol}}_{2k}(E)\) be given by \(f(x)=1+\|x\|^{2k}\), and note that the polynomial property implies that there exists a constant \(C\) such that \(|{\mathcal {G}}f(x)| \le Cf(x)\) for all \(x\in E\). For each \(m\), let \(\tau_{m}\) be the first exit time of \(X\) from the ball \(\{x\in E:\|x\|< m\}\). We can always choose a continuous version of \(t\mapsto{\mathbb {E}}[f(X_{t\wedge \tau_{m}})\,|\,{\mathcal {F}}_{0}]\), so let us fix such a version. Then by Itô’s formula and the martingale property of \(\int_{0}^{t\wedge\tau_{m}}\nabla f(X_{s})^{\top}\sigma(X_{s}){\,\mathrm{d}} W_{s}\),
Gronwall’s inequality now yields \({\mathbb {E}}[f(X_{t\wedge\tau_{m}})\, |\,{\mathcal {F}} _{0}]\le f(X_{0}) \mathrm{e}^{Ct}\). Sending \(m\) to infinity and applying Fatou’s lemma gives the result. □
We can now prove Theorem 3.1. For any \(p\in{\mathrm{Pol}}_{n}(E)\), Itô’s formula yields
The quadratic variation of the right-hand side satisfies
for some constant \(C\). This right-hand side has finite expectation by Lemma B.1, so the stochastic integral above is a martingale. Let \(\vec{p}\in{\mathbb {R}}^{{N}}\) be the coordinate representation of \(p\). Then (3.1) and (3.2) in conjunction with the linearity of the expectation and integration operators yield
Fubini’s theorem, justified by Lemma B.1, yields
where we define \(F(u) = {\mathbb {E}}[H(X_{u}) \,|\,{\mathcal {F}}_{t}]\). By choosing unit vectors for \(\vec{p}\), this gives a system of linear integral equations for \(F(u)\), whose unique solution is given by \(F(u)=\mathrm{e}^{(u-t)G^{\top}}H(X_{t})\). Hence
as claimed. This completes the proof of the theorem. □
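To make the moment formula concrete, here is a small numerical sketch for a hypothetical scalar Jacobi-type diffusion \({\mathrm{d}} X_{t}=\kappa(\theta-X_{t}){\,\mathrm{d}} t+\sigma\sqrt{X_{t}(1-X_{t})}{\,\mathrm{d}} W_{t}\) on \(E=[0,1]\) with basis \(H(x)=(1,x,x^{2})^{\top}\); the parameter values and code are purely illustrative and not taken from the paper:

    # Sketch: conditional moments via the matrix exponential, as in Appendix B.
    # Column k of G holds the coordinates of the generator applied to H_k:
    #   G 1   = 0
    #   G x   = kappa*theta - kappa*x
    #   G x^2 = (2*kappa*theta + sigma^2) x - (2*kappa + sigma^2) x^2
    import numpy as np
    from scipy.linalg import expm

    kappa, theta, sigma = 1.5, 0.4, 0.6      # illustrative parameters
    G = np.array([
        [0.0, kappa * theta, 0.0],
        [0.0, -kappa,        2 * kappa * theta + sigma**2],
        [0.0, 0.0,          -(2 * kappa + sigma**2)],
    ])

    def conditional_moments(x, tau):
        """F(tau) = E[H(X_{t+tau}) | X_t = x] = expm(tau * G^T) H(x)."""
        return expm(tau * G.T) @ np.array([1.0, x, x**2])

    x, tau = 0.25, 2.0
    m = conditional_moments(x, tau)
    # The first moment agrees with the mean-reversion formula theta + (x - theta)*exp(-kappa*tau):
    print(m[1], theta + (x - theta) * np.exp(-kappa * tau))

Any \(q\in{\mathrm{Pol}}_{2}(E)\) with coordinate vector \(\vec{q}\) then has the closed-form conditional moment \(\vec{q}^{\,\top}\mathrm{e}^{\tau G^{\top}}H(X_{t})\).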
Appendix C: Proof of Theorem 3.3
Theorem 3.3 is an immediate corollary of the following result.
Lemma C.1
Consider the \(d\)-dimensional Itô process \(X\) with representation
where \(\sigma\) satisfies a square-root growth condition
for some constant \(C\). If
then for each \(T\ge0\), there exists \(\varepsilon>0\) with
Proof
Fix \(T\ge0\). Variation of constants lets us rewrite \(X_{t} = A_{t} + \mathrm{e} ^{-\beta(T-t)}Y_{t} \) with
and
where we write \(\sigma^{Y}_{t} = \mathrm{e}^{\beta(T- t)}\sigma(A_{t} + \mathrm{e}^{-\beta (T-t)}Y_{t} )\). By (C.1), the dispersion process \(\sigma^{Y}\) satisfies
for some constant \(C_{Y}\).
Now let \(f(y)\) be a real-valued and positive smooth function on \({\mathbb {R}}^{d}\) satisfying \(f(y)=\sqrt{1+\|y\|}\) for \(\|y\|>1\). Some differential calculus gives, for \(y\neq0\),
Hence
and
for \(\|y\|>1\), while the first and second order derivatives of \(f(y)\) are uniformly bounded for \(\|y\|\le1\). Itô’s formula for \(Z_{t}=f(Y_{t})\) gives
with drift and dispersion processes
In view of (C.4) and the above expressions for \(\nabla f(y)\) and \(\frac{\partial^{2} f(y)}{\partial y_{i}\partial y_{j}}\), these are bounded,
for some constants \(m\) and \(\rho\). Hajek [28, Theorem 1.3] now implies that
for any nondecreasing convex function \(\varPhi\) on ℝ, where \(V\) is a Gaussian random variable with mean \(f(0)+m T\) and variance \(\rho^{2} T\). Hence, for any \(0<\varepsilon' <1/(2\rho^{2} T)\), we have \({\mathbb {E}}[\mathrm{e} ^{\varepsilon' V^{2}}] <\infty\). We now let \(\varPhi\) be a nondecreasing convex function on ℝ with \(\varPhi (z) = \mathrm{e}^{\varepsilon' z^{2}}\) for \(z\ge0\). Noting that \(Z_{T}\) is positive, we obtain \({\mathbb {E}}[ \mathrm{e}^{\varepsilon' Z_{T}^{2}}]<\infty\). As \(f^{2}(y)=1+\|y\|\) for \(\|y\|>1\), this implies \({\mathbb {E}}[ \mathrm{e}^{\varepsilon' \| Y_{T}\|}]<\infty\). Combining this with the fact that \(\|X_{T}\| \le\|A_{T}\| + \|Y_{T}\| \) and (C.2), we obtain using Hölder’s inequality the existence of some \(\varepsilon>0\) with (C.3). □
Appendix D: Proof of Theorem 4.4
We first provide a lemma.
Lemma D.1
Assume uniqueness in law holds for \(E_{Y}\)-valued solutions to (4.1). Let \(Y^{1}\), \(Y^{2}\) be two \(E_{Y}\)-valued solutions to (4.1) with driving Brownian motions \(W^{1}\), \(W^{2}\) and with \(Y^{1}_{0}=Y^{2}_{0}=y\) for some \(y\in E_{Y}\). Then \((Y^{1},W^{1})\) and \((Y^{2},W^{2})\) have the same law.
Proof
Consider the equation
where \(\widehat{b}_{Y}(y)=b_{Y}(y){\mathbf{1}}_{E_{Y}}(y)\) and \(\widehat{\sigma}_{Y}(y)=\sigma_{Y}(y){\mathbf{1}}_{E_{Y}}(y)\). Since \(E_{Y}\) is closed, any solution \(Y\) to this equation with \(Y_{0}\in E_{Y}\) must remain inside \(E_{Y}\). To see this, let \(\tau=\inf\{t:Y_{t}\notin E_{Y}\}\). Then there exists \(\varepsilon >0\), depending on \(\omega\), such that \(Y_{t}\notin E_{Y}\) for all \(\tau < t<\tau+\varepsilon\). However, since \(\widehat{b}_{Y}\) and \(\widehat{\sigma}_{Y}\) vanish outside \(E_{Y}\), \(Y_{t}\) is constant on \((\tau,\tau +\varepsilon )\). Since \(E_{Y}\) is closed this is only possible if \(\tau=\infty\).
The hypothesis of the lemma now implies that uniqueness in law for \({\mathbb {R}}^{d}\)-valued solutions holds for \({\mathrm{d}} Y_{t} = \widehat{b}_{Y}(Y_{t}) {\,\mathrm{d}} t + \widehat{\sigma}_{Y}(Y_{t}) {\,\mathrm{d}} W_{t}\). Since \((Y^{i},W^{i})\), \(i=1,2\), are two solutions with \(Y^{1}_{0}=Y^{2}_{0}=y\), Cherny [8, Theorem 3.1] shows that \((W^{1},Y^{1})\) and \((W^{2},Y^{2})\) have the same law. □
The proof of Theorem 4.4 follows along the lines of the proof of the Yamada–Watanabe theorem that pathwise uniqueness implies uniqueness in law; see Rogers and Williams [42, Theorem V.17.1]. Let \((W^{i},Y^{i},Z^{i})\), \(i=1,2\), be \(E\)-valued weak solutions to (4.1), (4.2) starting from \((y_{0},z_{0})\in E\subseteq{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\). We need to show that \((Y^{1},Z^{1})\) and \((Y^{2},Z^{2})\) have the same law. Since uniqueness in law holds for \(E_{Y}\)-valued solutions to (4.1), Lemma D.1 implies that \((W^{1},Y^{1})\) and \((W^{2},Y^{2})\) have the same law, which we denote by \(\pi({\mathrm{d}} w,{\,\mathrm{d}} y)\). Let \(Q^{i}({\mathrm{d}} z;w,y)\), \(i=1,2\), denote a regular conditional distribution of \(Z^{i}\) given \((W^{i},Y^{i})\). We equip the path space \(C({\mathbb {R}}_{+},{\mathbb {R}}^{d}\times{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\times{\mathbb {R}}^{n})\) with the probability measure
Let \((W,Y,Z,Z')\) denote the coordinate process on \(C({\mathbb {R}}_{+},{\mathbb {R}}^{d}\times{\mathbb {R}}^{m}\times{\mathbb {R}}^{n}\times{\mathbb {R}}^{n})\). Then the law under \(\overline{\mathbb {P}}\) of \((W,Y,Z)\) equals the law of \((W^{1},Y^{1},Z^{1})\), and the law under \(\overline{\mathbb {P}}\) of \((W,Y,Z')\) equals the law of \((W^{2},Y^{2},Z^{2})\). By well-known arguments, see for instance Rogers and Williams [42, Lemma V.10.1 and Theorems V.10.4 and V.17.1], it follows that
By localization, we may assume that \(b_{Z}\) and \(\sigma_{Z}\) are Lipschitz in \(z\), uniformly in \(y\). A standard argument based on the BDG inequalities and Jensen’s inequality (see Rogers and Williams [42, Corollary V.11.7]) together with Gronwall’s inequality yields \(\overline{\mathbb {P}}[Z'=Z]=1\). Hence
as was to be shown. □
Remark D.2
Theorem 4.4 carries over, and its proof literally goes through, to the case where \((Y,Z)\) is an arbitrary \(E\)-valued diffusion that solves (4.1), (4.2) and where uniqueness in law for \(E_{Y}\)-valued solutions to (4.1) holds, provided (4.3) is replaced by the assumption that both \(b_{Z}\) and \(\sigma_{Z}\) are locally Lipschitz in \(z\), locally in \(y\), on \(E\). That is, for each compact subset \(K\subseteq E\), there exists a constant \(\kappa\) such that for all \((y,z,y',z')\in K\times K\),
Appendix E: Proof of Theorem 5.3
The proof of Theorem 5.3 consists of two main parts. First, we construct coefficients \(\widehat{a}=\widehat{\sigma}\widehat{\sigma}^{\top}\) and \(\widehat{b}\) that coincide with \(a\) and \(b\) on \(E\), such that a local solution to (2.2), with \(b\) and \(\sigma\) replaced by \(\widehat{b}\) and \(\widehat{\sigma}\), can be obtained with values in a neighborhood of \(E\) in \(M\). This relies on (G1) and (A2), and occupies this section up to and including Lemma E.4. Second, we complete the proof by showing that this solution in fact stays inside \(E\) and spends zero time in the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\). This relies on (G2) and (A1).
Let \(\pi:{\mathbb {S}}^{d}\to{\mathbb {S}}^{d}_{+}\) be the Euclidean metric projection onto the positive semidefinite cone. It has the following well-known property.
Lemma E.1
For any symmetric matrix \(A\in{\mathbb {S}}^{d}\) with the spectral decomposition \(A=S\varLambda S^{\top}\), we have \(\pi(A)=S\varLambda^{+} S^{\top}\), where \(\varLambda^{+}\) is the element-wise positive part of \(\varLambda\).
Proof
This result follows from the fact that the map \(\lambda:{\mathbb {S}}^{d}\to{\mathbb {R}}^{d}\) taking a symmetric matrix to its ordered eigenvalues is 1-Lipschitz; see Horn and Johnson [30, Theorem 7.4.51]. Indeed, for any \(B\in{\mathbb {S}}^{d}_{+}\), we have
Here the first inequality uses that the projection of an ordered vector \(x\in{\mathbb {R}}^{d}\) onto the set of ordered vectors with nonnegative entries is simply \(x^{+}\). □
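A minimal numerical sketch of this projection (illustrative code, not from the paper):

    # Euclidean metric projection onto the positive semidefinite cone, as in
    # Lemma E.1: keep the eigenvectors, take the positive part of the eigenvalues.
    import numpy as np

    def psd_projection(A):
        """Return pi(A) = S diag(lambda^+) S^T for a symmetric matrix A."""
        lam, S = np.linalg.eigh(A)                    # A = S diag(lam) S^T
        return (S * np.maximum(lam, 0.0)) @ S.T

    A = np.array([[1.0, 2.0], [2.0, -3.0]])
    P = psd_projection(A)
    print(np.linalg.eigvalsh(P))                      # eigenvalues are >= 0
    # Frobenius-norm optimality against a few random positive semidefinite candidates:
    rng = np.random.default_rng(0)
    for _ in range(3):
        B = rng.standard_normal((2, 2)); B = B @ B.T
        assert np.linalg.norm(A - P) <= np.linalg.norm(A - B) + 1e-12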
We use the projection \(\pi\) to modify the given coefficients \(a\) and \(b\) outside \(E\) in order to obtain candidate coefficients for the stochastic differential equation (2.2). The diffusion coefficients are defined by
In order to construct the drift coefficient \(\widehat{b}\), we need the following lemma.
Lemma E.2
There exists a continuous map \(\widehat{b} :{\mathbb {R}}^{d}\to{\mathbb {R}}^{d}\) with \(\widehat{b}=b\) on \(E\) and such that the operator \(\widehat{\mathcal {G}}\) given by
satisfies \(\widehat{\mathcal {G}}f={\mathcal {G}}f\) on \(E\) and \(\widehat {\mathcal {G}}q = 0 \) on \(M\) for all \(q\in{\mathcal {Q}}\).
Proof
We first prove that there exists a continuous map \(c:{\mathbb {R}}^{d}\to {\mathbb {R}}^{d}\) such that
Indeed, let \(a=S\varLambda S^{\top}\) be the spectral decomposition of \(a\), so that the columns \(S_{i}\) of \(S\) constitute an orthonormal basis of eigenvectors of \(a\) and the diagonal elements \(\lambda_{i}\) of \(\varLambda\) are the corresponding eigenvalues. These quantities depend on \(x\) in a possibly discontinuous way. For each \(q\in{\mathcal {Q}}\),
Consider now any fixed \(x\in M\). For each \(i\) such that \(\lambda _{i}(x)^{-}\ne0\), \(S_{i}(x)\) lies in the tangent space of \(M\) at \(x\). Thus we may find a smooth path \(\gamma_{i}:(-1,1)\to M\) such that \(\gamma _{i}(0)=x\) and \(\gamma_{i}'(0)=S_{i}(x)\). For any \(q\in{\mathcal {Q}}\), we have \(q=0\) on \(M\) by definition, whence
or equivalently, \(S_{i}(x)^{\top}\nabla^{2} q(x) S_{i}(x) = -\nabla q(x)^{\top}\gamma_{i}'(0)\). In view of (E.2), this yields
Let \(q_{1},\ldots,q_{m}\) be an enumeration of the elements of \({\mathcal {Q}}\), and write the above equation in vector form as
The left-hand side thus lies in the range of \([\nabla q_{1}(x) \cdots \nabla q_{m}(x)]^{\top}\) for each \(x\in M\). Since linear independence is an open condition, (G1) implies that the latter matrix has full rank for all \(x\) in a whole neighborhood \(U\) of \(M\). It thus has a Moore–Penrose inverse which is a continuous function of \(x\); see Penrose [39, page 408]. The desired map \(c\) is now obtained on \(U\) by
where the Moore–Penrose inverse is understood. Finally, after shrinking \(U\) while maintaining \(M\subseteq U\), \(c\) is continuous on the closure \(\overline{U}\), and can then be extended to a continuous map on \({\mathbb {R}}^{d}\) by the Tietze extension theorem; see Willard [47, Theorem 15.8]. This proves (E.1).
The extended drift coefficient is now defined by \(\widehat{b} = b + c\), and the operator \(\widehat{\mathcal {G}}\) by
In view of (E.1), it satisfies \(\widehat{\mathcal {G}}f={\mathcal {G}}f\) on \(E\) and
on \(M\) for all \(q\in{\mathcal {Q}}\), as desired. □
We now define the set
Note that \(E\subseteq E_{0}\) since \(\widehat{b}=b\) on \(E\). Furthermore, the linear growth condition
is satisfied for some constant \(C\). This uses that the component functions of \(a\) and \(b\) lie in \({\mathrm{Pol}}_{2}({\mathbb {R}}^{d})\) and \({\mathrm{Pol}} _{1}({\mathbb {R}}^{d})\), respectively.
An \(E_{0}\)-valued local solution to (2.2), with \(b\) and \(\sigma\) replaced by \(\widehat{b}\) and \(\widehat{\sigma}\), can now be constructed by solving the martingale problem for the operator \(\widehat{\mathcal {G}}\) and state space \(E_{0}\). We first prove an auxiliary lemma.
Lemma E.3
Let \(f\in C^{\infty}({\mathbb {R}}^{d})\) and assume the support \(K\) of \(f\) satisfies \(K\cap M\subseteq E_{0}\). Let \(x_{0}\) be a maximizer of \(f\) over \(E_{0}\). Then \(\widehat{\mathcal {G}} f(x_{0})\le0\).
Proof
Let \(\gamma:(-1,1)\to M\) be any smooth curve in \(M\) with \(\gamma (0)=x_{0}\). Optimality of \(x_{0}\) and the chain rule yield
from which it follows that \(\nabla f(x_{0})\) is orthogonal to the tangent space of \(M\) at \(x_{0}\). Thus
for some coefficients \(c_{q}\). Next, differentiating once more yields
Similarly, for any \(q\in{\mathcal {Q}}\),
In view of (E.4), this implies
Observe that Lemma E.1 implies that \(\ker A\subseteq\ker\pi (A)\) for any symmetric matrix \(A\). Thus \(\widehat{a}(x_{0})\nabla q(x_{0})=0\) for all \(q\in{\mathcal {Q}}\) by (A2), which implies that \(\widehat{a}(x_{0})=\sum_{i} u_{i} u_{i}^{\top}\) for some vectors \(u_{i}\) in the tangent space of \(M\) at \(x_{0}\). Thus, choosing curves \(\gamma\) with \(\gamma'(0)=u_{i}\), (E.5) yields
Combining (E.4), (E.6) and Lemma E.2, we obtain
as desired. □
Let \(C_{0}(E_{0})\) denote the space of continuous functions on \(E_{0}\) vanishing at infinity. Lemma E.3 implies that \(\widehat {\mathcal {G}} \) is a well-defined linear operator on \(C_{0}(E_{0})\) with domain \(C^{\infty}_{c}(E_{0})\). It also implies that \(\widehat{\mathcal {G}}\) satisfies the positive maximum principle as a linear operator on \(C_{0}(E_{0})\). Hence the following local existence result can be proved.
Lemma E.4
Let \(\mu\) be a probability measure on \(E\). There exists an \({\mathbb {R}} ^{d}\)-valued càdlàg process \(X\) with initial distribution \(\mu\) that satisfies
for all \(t<\tau\), where \(\tau= \inf\{t \ge0: X_{t} \notin E_{0}\}>0\) and \(W\) is some \(d\)-dimensional Brownian motion.
Proof
The conditions of Ethier and Kurtz [19, Theorem 4.5.4] are satisfied, so there exists an \(E_{0}^{\Delta}\)-valued càdlàg process \(X\) such that \(N^{f}_{t} = f(X_{t}) - f(X_{0}) - \int_{0}^{t} \widehat{\mathcal {G}}f(X_{s}) {\,\mathrm{d}} s\) is a martingale for any \(f\in C^{\infty}_{c}(E_{0})\). Here \(E_{0}^{\Delta}\) denotes the one-point compactification of \(E_{0}\) with some \(\Delta \notin E_{0}\), and we set \(f(\Delta)=\widehat{\mathcal {G}}f(\Delta)=0\). Bakry and Émery [4, Proposition 2] then yields that \(f(X)\) and \(N^{f}\) are continuous (see footnote 3). In particular, \(X\) cannot jump to \(\Delta\) from any point in \(E_{0}\), whence \(\tau\) is a strictly positive predictable time.
A localized version of the argument in Ethier and Kurtz [19, Theorem 5.3.3] now shows that on an extended probability space, \(X\) satisfies (E.7) for all \(t<\tau\) and some Brownian motion \(W\). It remains to show that \(X\) is non-explosive in the sense that \(\sup_{t<\tau}\|X_{t}\|<\infty\) on \(\{\tau<\infty\}\). Indeed, non-explosion implies that either \(\tau=\infty\), or \({\mathbb {R}}^{d}\setminus E_{0}\neq\emptyset\) in which case we can take \(\Delta\in{\mathbb {R}}^{d}\setminus E_{0}\). In either case, \(X\) is \({\mathbb {R}}^{d}\)-valued. To prove that \(X\) is non-explosive, let \(Z_{t}=1+\|X_{t}\|^{2}\) for \(t<\tau\), and observe that the linear growth condition (E.3) in conjunction with Itô’s formula yields \(Z_{t} \le Z_{0} + C\int_{0}^{t} Z_{s}{\,\mathrm{d}} s + N_{t}\) for all \(t<\tau\), where \(C>0\) is a constant and \(N\) a local martingale on \([0,\tau)\). Let \(Y_{t}\) denote the right-hand side. Then
for all \(t<\tau\). The right-hand side is a nonnegative supermartingale on \([0,\tau)\), and we deduce \(\sup_{t<\tau}Z_{t}<\infty\) on \(\{\tau <\infty \}\), as required. □
Let \(X\) and \(\tau\) be the process and stopping time provided by Lemma E.4. We now show that \(\tau=\infty\) and that \(X_{t}\) remains in \(E\) for all \(t\ge0\) and spends zero time in each of the sets \(\{p=0\}\), \(p\in{\mathcal {P}}\). This will complete the proof of Theorem 5.3, since \(\widehat{a}\) and \(\widehat{b}\) coincide with \(a\) and \(b\) on \(E\).
We need to prove that \(p(X_{t})\ge0\) for all \(0\le t<\tau\) and all \(p\in{\mathcal {P}}\). Fix \(p\in{\mathcal {P}}\) and let \(L^{y}\) denote the local time of \(p(X)\) at level \(y\), where we choose a modification that is càdlàg in \(y\); see Revuz and Yor [41, Theorem VI.1.7]. Itô’s formula yields
We first claim that \(L^{0}_{t}=0\) for \(t<\tau\). The occupation density formula [41, Corollary VI.1.6] yields
By right-continuity of \(L^{y}_{t}\) in \(y\), it suffices to show that the right-hand side is finite. For this, in turn, it is enough to prove that \((\nabla p^{\top}\widehat{a} \nabla p)/p\) is locally bounded on \(M\). To this end, let \(a=S\varLambda S^{\top}\) be the spectral decomposition of \(a\), so that the columns \(S_{i}\) of \(S\) constitute an orthonormal basis of eigenvectors of \(a\) and the diagonal elements \(\lambda_{i}\) of \(\varLambda \) are the corresponding eigenvalues. Note that these quantities depend on \(x\) in general. Since \(a \nabla p=0\) on \(M\cap\{p=0\}\) by (A1), condition (G2) implies that there exists a vector \(h=(h_{1},\ldots ,h_{d})^{\top}\) of polynomials such that
Thus \(\lambda_{i} S_{i}^{\top}\nabla p = S_{i}^{\top}a \nabla p = S_{i}^{\top}h p\), and hence \(\lambda_{i}(S_{i}^{\top}\nabla p)^{2} = S_{i}^{\top}\nabla p S_{i}^{\top}h p\). In conjunction with Lemma E.1, this yields
Consequently,
Since \(\|S_{i}\|=1\) and \(\nabla p\) and \(h\) are locally bounded, we deduce that \((\nabla p^{\top}\widehat{a} \nabla p)/p\) is locally bounded, as required. Thus \(L^{0}=0\) as claimed.
Next, since \(\widehat{\mathcal {G}}p= {\mathcal {G}}p\) on \(E\), the hypothesis (A1) implies that \(\widehat{\mathcal {G}}p>0\) on a neighborhood \(U_{p}\) of \(E\cap\{ p=0\}\). Shrinking \(E_{0}\) if necessary, we may assume that \(E_{0}\subseteq E\cup\bigcup_{p\in{\mathcal {P}}} U_{p}\) and thus
Since \(L^{0}=0\) before \(\tau\), Lemma A.1 implies
Thus the stopping time \(\tau_{E}=\inf\{t\colon X_{t}\notin E\}\le\tau\) actually satisfies \(\tau_{E}=\tau\). This implies \(\tau=\infty\). Indeed, \(X\) has left limits on \(\{\tau<\infty\}\) by Lemma E.4, and \(E_{0}\) is a neighborhood in \(M\) of the closed set \(E\). Thus \(\tau _{E}<\tau\) on \(\{\tau<\infty\}\), whence this set is empty. Finally, Lemma A.1 also gives \(\int_{0}^{t}{\boldsymbol{1}_{\{p(X_{s})=0\} }}{\,\mathrm{d}} s=0\). The proof of Theorem 5.3 is complete. □
Appendix F: Proof of Theorem 5.7
The proof of Theorem 5.7 is divided into three parts.
Proof of Theorem 5.7(i)
The following argument is a version of what is sometimes called “McKean’s argument”; see Mayerhofer et al. [37, Sect. 4.1] for an overview and further references. Suppose first \(p(X_{0})>0\) almost surely. Itô’s formula and the identity \(a \nabla h=h p\) on \(M\) yield
for \(t<\tau=\inf\{s\ge0:p(X_{s})=0\}\). We now modify \(\log p(X)\) to turn it into a local submartingale. To this end, define
We claim that \(V_{t}<\infty\) for all \(t\ge0\). To see this, note that the set \(E\cap U^{c}\cap\{x:\|x\|\le n\}\) is compact and disjoint from \(\{p=0\}\cap E\) for each \(n\). Thus
is strictly positive. Defining \(\sigma_{n}=\inf\{t:\|X_{t}\|\ge n\}\), this yields
Since \(\sigma_{n}\to\infty\) due to the fact that \(X\) does not explode, we have \(V_{t}<\infty\) for all \(t\ge0\) as claimed. It follows that the process
is well defined and finite for all \(t\ge0\), with total variation process \(V\).
Now define stopping times \(\rho_{n}=\inf\{t\ge0: |A_{t}|+p(X_{t}) \ge n\}\) and note that \(\rho_{n}\to\infty\) since neither \(A\) nor \(X\) explodes. Consider the process \(Z = \log p(X) - A\), which satisfies
Then \(-Z^{\rho_{n}}\) is a supermartingale on the stochastic interval \([0,\tau)\), bounded from below (see footnote 4). Thus by the supermartingale convergence theorem, \(\lim_{t\uparrow\tau}Z_{t\wedge\rho_{n}}\) exists in ℝ, which implies \(\tau\ge\rho_{n}\). Since \(\rho_{n}\to \infty\), we deduce \(\tau=\infty\), as desired.
Finally, suppose \({\mathbb {P}}[p(X_{0})=0]>0\). The above proof shows that \(p(X)\) cannot return to zero once it becomes positive. But due to (5.2), we have \(p(X_{t})>0\) for arbitrarily small \(t>0\), and this completes the proof. □
Proof of Theorem 5.7(ii)
As in the proof of (i), it is enough to consider the case where \(p(X_{0})>0\). By (G2), we deduce \(2 {\mathcal {G}}p - h^{\top}\nabla p = \alpha p\) on \(M\) for some \(\alpha\in{\mathrm{Pol}}({\mathbb {R}}^{d})\). However, we have \(\deg {\mathcal {G}}p\le\deg p\) and \(\deg a\nabla p \le1+\deg p\), which yields \(\deg h\le1\). Consequently \(\deg\alpha p \le\deg p\), implying that \(\alpha\) is constant. Inserting this into (F.1) yields
for \(t<\tau=\inf\{t: p(X_{t})=0\}\). The process \(\log p(X_{t})-\alpha t/2\) is thus locally a martingale bounded from above, and hence nonexplosive by the same “McKean’s argument” as in the proof of part (i). This proves the result. □
Proof of Theorem 5.7(iii)
The proof relies on the following two lemmas.
Lemma F.1
Let \(b:{\mathbb {R}}^{d}\to{\mathbb {R}}^{d}\) and \(\sigma:{\mathbb {R}}^{d}\to {\mathbb {R}}^{d\times d}\) be continuous functions with \(\|b(x)\|^{2}+\|\sigma(x)\|^{2}\le\kappa(1+\|x\|^{2})\) for some \(\kappa>0\), and fix \(\rho>0\). Let \(Y\) be a \(d\)-dimensional Itô process satisfying \(Y_{t} = Y_{0} + \int_{0}^{t} b(Y_{s}){\,\mathrm{d}} s + \int_{0}^{t} \sigma(Y_{s}){\,\mathrm{d}} W_{s}\). Then there exist constants \(c_{1},c_{2}>0\) that only depend on \(\kappa\) and \(\rho\), but not on \(Y_{0}\), such that
Proof
By Markov’s inequality,
Let \(\tau_{n}\) be the first time \(\|Y_{t}\|\) reaches level \(n\). A standard argument using the BDG inequality and Jensen’s inequality yields
for \(t\le c_{2}\), where \(c_{2}\) is the constant in the BDG inequality. The growth condition yields
for \(t\le c_{2}\), and Gronwall’s lemma then gives \({\mathbb {E}}[ \sup _{s\le t\wedge \tau_{n}}\|Y_{s}-Y_{0}\|^{2}] \le c_{3}t \mathrm{e}^{4c_{2}\kappa t}\), where \(c_{3}=4c_{2}\kappa(1+{\mathbb {E}}[\|Y_{0}\|^{2}])\). Sending \(n\) to infinity and applying Fatou’s lemma concludes the proof, upon setting \(c_{1}=4c_{2}\kappa\mathrm{e}^{4c_{2}^{2}\kappa}\wedge c_{2}\). □
Lemma F.2
Let \(0<\alpha<2\), and let \(Z\) be a \(\mathrm{BESQ}(\alpha)\) process starting from \(z\ge0\). Let \({\mathbb {P}}_{z}\) denote its law, and let \(\tau_{0}=\inf\{t\ge0:Z_{t}=0\}\) be the first time \(Z\) hits zero. Then for any \(\varepsilon>0\),
$$ \lim_{z\to0}{\mathbb {P}}_{z}[\tau_{0}>\varepsilon]=0. $$
Proof
By Göing-Jaeschke and Yor [26, Eq. (15)], we have
$$ {\mathbb {P}}_{z}[\tau_{0}>\varepsilon] = \frac{(z/2)^{\widehat{\nu}}}{\varGamma(\widehat{\nu})} \int_{\varepsilon}^{\infty} t^{-1-\widehat{\nu}}\,\mathrm{e}^{-z/(2t)}{\,\mathrm{d}} t, $$
where \(\varGamma(\cdot)\) is the Gamma function and \(\widehat{\nu}=1-\alpha /2\in(0,1)\). Changing variables to \(s=z/(2t)\) yields \({\mathbb {P}}_{z}[\tau _{0}>\varepsilon]=\frac{1}{\varGamma(\widehat{\nu})}\int _{0}^{z/(2\varepsilon )}s^{\widehat{\nu}-1}\mathrm{e}^{-s}{\,\mathrm{d}} s\), which converges to zero as \(z\to0\) by dominated convergence. □
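Numerically, the right-hand side after the change of variables is a regularized lower incomplete gamma function, so the convergence is easy to check directly (illustrative values; gammainc below denotes SciPy's regularized lower incomplete gamma function):

    # Sketch of Lemma F.2: P_z[tau_0 > eps] = gammainc(nu_hat, z/(2*eps)),
    # which tends to 0 as z -> 0.
    from scipy.special import gammainc

    alpha, eps = 1.0, 0.5                 # a BESQ(alpha) example with 0 < alpha < 2
    nu_hat = 1.0 - alpha / 2.0
    for z in (1.0, 0.1, 0.01, 0.001):
        print(z, gammainc(nu_hat, z / (2.0 * eps)))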
We may now complete the proof of Theorem 5.7(iii). The hypotheses yield
Hence there exist some \(\delta>0\) such that \(2 {\mathcal {G}}p({\overline{x}}) < (1-2\delta) h({\overline{x}})^{\top}\nabla p({\overline{x}})\) and an open ball \(U\) in \({\mathbb {R}}^{d}\) of radius \(\rho>0\), centered at \({\overline{x}}\), such that
Note that the radius \(\rho\) does not depend on the starting point \(X_{0}\).
For all \(t<\tau(U)=\inf\{s\ge0:X_{s}\notin U\}\wedge T\), we have
for some one-dimensional Brownian motion, possibly defined on an enlargement of the original probability space. Here the equality \(a\nabla p =hp\) on \(E\) was used in the last step. Define an increasing process \(A_{t}=\int_{0}^{t}\frac{1}{4}h^{\top}\nabla p(X_{s}){\,\mathrm{d}} s\). Since \(h^{\top}\nabla p(X_{t})>0\) on \([0,\tau(U))\), the process \(A\) is strictly increasing there. It follows that the time-change \(\gamma_{u}=\inf\{ t\ge 0:A_{t}>u\}\) is continuous and strictly increasing on \([0,A_{\tau(U)})\). The time-changed process \(Y_{u}=p(X_{\gamma_{u}})\) thus satisfies
Consider now the \(\mathrm{BESQ}(2-2\delta)\) process \(Z\) defined as the unique strong solution to the equation
Since \(4 {\mathcal {G}}p(X_{t}) / h^{\top}\nabla p(X_{t}) \le2-2\delta\) for \(t<\tau(U)\), a standard comparison theorem implies that \(Y_{u}\le Z_{u}\) for \(u< A_{\tau(U)}\); see for instance Rogers and Williams [42, Theorem V.43.1]. It is well known that a BESQ\((\alpha)\) process hits zero if and only if \(\alpha<2\); see Revuz and Yor [41, page 442]. It thus remains to exhibit \(\varepsilon>0\) such that if \(\|X_{0}-\overline{x}\|<\varepsilon\) almost surely, there is a positive probability that \(Z_{u}\) hits zero before \(X_{\gamma_{u}}\) leaves \(U\), or equivalently, that \(Z_{u}=0\) for some \(u< A_{\tau(U)}\). To this end, set \(C=\inf_{x\in U} h(x)^{\top}\nabla p(x)/4\), so that \(A_{\tau(U)}\ge C\tau(U)\), and let \(\eta>0\) be a number to be determined later. We have
where we recall that \(\rho\) is the radius of the open ball \(U\), and where the last inequality follows from the triangle inequality provided \(\|X_{0}-{\overline{x}}\|\le\rho/2\). By Lemma F.1, we can choose \(\eta>0\) independently of \(X_{0}\) so that \({\mathbb {P}}[ \sup _{t\le\eta C^{-1}} \|X_{t} - X_{0}\| <\rho/2 ]>1/2\). Then by Lemma F.2, we have \({\mathbb {P}}[ \inf_{u\le\eta} Z_{u} > 0]<1/3\) whenever \(Z_{0}=p(X_{0})\) is sufficiently close to zero. This happens if \(X_{0}\) is sufficiently close to \({\overline{x}}\), say within a distance \(\rho'>0\). Thus, setting \(\varepsilon=\rho'\wedge(\rho/2)\), the condition \(\|X_{0}-{\overline{x}}\| <\rho'\wedge(\rho/2)\) implies that (F.2) is valid, with the right-hand side strictly positive. The theorem is proved. □
Appendix G: Proof of Proposition 6.1
Condition (G1) is vacuously true, so we prove (G2). If \(d=1\), then \(\{p=0\}=\{-1,1\}\), and it is clear that any univariate polynomial vanishing on this set has \(p(x)=1-x^{2}\) as a factor. Thus (G2) holds. If \(d\ge2\), then \(p(x)=1-x^{\top}Qx\) is irreducible and changes sign, so (G2) follows from Lemma 5.4.
Next, it is straightforward to verify that (6.1), (6.2) imply (A0)–(A2), so we focus on the converse direction and assume (A0)–(A2) hold. We first prove that \(a(x)\) has the stated form. Write \(a(x)=\alpha+ L(x) + A(x)\), where \(\alpha=a(0)\in{\mathbb {S}}^{d}_{+}\), \(L(x)\in{\mathbb {S}}^{d}\) is linear in \(x\), and \(A(x)\in{\mathbb {S}}^{d}\) is homogeneous of degree two in \(x\). Since \(a(x)Qx=a(x)\nabla p(x)/2=0\) on \(\{p=0\}\), we have for any \(x\in\{p=0\}\) and \(\epsilon\in\{-1,1\} \) that
This implies \(L(x)Qx=0\) for all \(x\in\{p=0\}\), and thus, by scaling, for all \(x\in{\mathbb {R}}^{d}\). We now argue that this implies \(L=0\). To this end, consider the linear map \(T: {\mathcal {X}}\to{\mathcal {Y}}\) where
and \(TK\in{\mathcal {Y}}\) is given by \((TK)(x) = K(x)Qx\). One readily checks that we have \(\dim{\mathcal {X}}=\dim{\mathcal {Y}}=d^{2}(d+1)/2\). Thus if we can show that \(T\) is surjective, the rank-nullity theorem \(\dim(\ker T) + \dim(\mathrm{range } T) = \dim{\mathcal {X}} \) implies that \(\ker T\) is trivial. But the identity \(L(x)Qx\equiv0\) precisely states that \(L\in\ker T\), yielding \(L=0\) as desired. To see that \(T\) is surjective, note that \({\mathcal {Y}}\) is spanned by elements of the form
with the \(k\)th component being nonzero. But all these elements can be realized as \((TK)(x)=K(x)Qx\) as follows: If \(i,j,k\) are all distinct, one may take
and all remaining entries of \(K(x)\) equal to zero. If \(i=k\), one takes \(K_{ii}(x)=x_{j}\) and the remaining entries zero, and similarly if \(j=k\). If \(i=j\ne k\), one sets
and the remaining entries zero. This covers all possible cases, and shows that \(T\) is surjective. Thus \(L=0\) as claimed.
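The kernel argument above can be verified symbolically in small dimensions; here is a sketch for \(d=2\) and \(Q=\mathrm{Id}\) (illustrative code, not from the paper):

    # Parametrize a symmetric-matrix-valued map L(x), linear in x, impose
    # L(x) Q x = 0 identically with Q = Id, and verify that all coefficients
    # are forced to vanish, i.e., ker T = {0}.
    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    x = sp.Matrix([x1, x2])
    c = sp.symbols('c0:6')
    L0 = sp.Matrix([[c[0], c[1]], [c[1], c[2]]])     # coefficient matrix of x1
    L1 = sp.Matrix([[c[3], c[4]], [c[4], c[5]]])     # coefficient matrix of x2
    L = L0 * x1 + L1 * x2

    eqs = []
    for entry in L * x:                              # entries of L(x) Q x
        eqs += sp.Poly(sp.expand(entry), x1, x2).coeffs()
    print(sp.solve(eqs, c))                          # every c_i is forced to zero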
At this point, we have shown that \(a(x)=\alpha+A(x)\) with \(A\) homogeneous of degree two. Next, since \(a \nabla p=0\) on \(\{p=0\}\), there exists a vector \(h\) of polynomials such that \(a \nabla p/2=h p\). By counting degrees, \(h\) is of the form \(h(x)=f+Fx\) for some \(f\in {\mathbb {R}} ^{d}\), \(F\in{\mathbb {R}}^{d\times d}\). For any \(s>0\) and \(x\in{\mathbb {R}}^{d}\) such that \(sx\in E\),
By sending \(s\) to zero, we deduce \(f=0\) and \(\alpha x=Fx\) for all \(x\) in some open set, hence \(F=\alpha\). Thus \(a(x)Qx=(1-x^{\top}Qx)\alpha Qx\) for all \(x\in E\). Defining \(c(x)=a(x) - (1-x^{\top}Qx)\alpha\), this shows that \(c(x)Qx=0\) for all \(x\in{\mathbb {R}}^{d}\), that \(c(0)=0\), and that \(c(x)\) has no linear part. In particular, \(c\) is homogeneous of degree two. To prove that \(c\in{\mathcal {C}}^{Q}_{+}\), it only remains to show that \(c(x)\) is positive semidefinite for all \(x\). For this we observe that for any \(u\in{\mathbb {R}}^{d}\) and any \(x\in\{p=0\}\),
In view of the homogeneity property, positive semidefiniteness follows for any \(x\). Thus \(c\in{\mathcal {C}}^{Q}_{+}\) and hence this \(a(x)\) has the stated form. Furthermore, the drift vector is always of the form \(b(x)=\beta +Bx\), and a brief calculation using the expressions for \(a(x)\) and \(b(x)\) shows that the condition \({\mathcal {G}}p> 0\) on \(\{p=0\}\) is equivalent to (6.2). □
Appendix H: Proof of Proposition 6.4
Condition (G1) is vacuously true, and it is not hard to check that (G2) holds.
Next, it is straightforward to verify that (i) and (ii) imply (A0)–(A2), so we focus on the converse direction and assume (A0)–(A2) hold.
We first deduce (i) from the condition \(a \nabla p=0\) on \(\{p=0\}\) for all \(p\in{\mathcal {P}}\) together with the positive semidefinite requirement of \(a(x)\). Taking \(p(x)=x_{i}\), \(i=1,\ldots,d\), we obtain \(a(x)\nabla p(x) = a(x) e_{i} = 0\) on \(\{x_{i}=0\}\). Hence the \(i\)th column of \(a(x)\) is a polynomial multiple of \(x_{i}\). Similarly, with \(p=1-x_{i}\), \(i\in I\), it follows that \(a(x)e_{i}\) is a polynomial multiple of \(1-x_{i}\) for \(i\in I\). Hence, by symmetry of \(a\), we get
for some constants \(\gamma_{ij}\) and polynomials \(h_{ij}\in{\mathrm {Pol}}_{1}(E)\) (using also that \(\deg a_{ij}\le2\)). For \(i\ne j\), this is possible only if \(a_{ij}(x)=0\), and for \(i=j\in I\) it implies that \(a_{ii}(x)=\gamma_{i}x_{i}(1-x_{i})\) as desired. In order to maintain positive semidefiniteness, we necessarily have \(\gamma_{i}\ge0\).
Now consider \(i,j\in J\). By the above, we have \(a_{ij}(x)=h_{ij}(x)x_{j}\) for some \(h_{ij}\in{\mathrm{Pol}}_{1}(E)\). Similarly as before, symmetry of \(a(x)\) yields
so that for \(i\ne j\), \(h_{ij}\) has \(x_{i}\) as a factor. It follows that \(a_{ij}(x)=\alpha_{ij}x_{i}x_{j}\) for some \(\alpha_{ij}\in{\mathbb {R}}\). If \(i=j\), we get \(a_{jj}(x)=\alpha_{jj}x_{j}^{2}+x_{j}(\phi_{j}+\psi_{(j)}^{\top}x_{I} + \pi _{(j)}^{\top}x_{J})\) for some \(\alpha_{jj}\in{\mathbb {R}}\), \(\phi_{j}\in {\mathbb {R}}\), \(\psi _{(j)}\in{\mathbb {R}}^{m}\), \(\pi_{(j)}\in{\mathbb {R}}^{n}\) with \(\pi _{(j),j}=0\). Positive semidefiniteness requires \(a_{jj}(x)\ge0\) for all \(x\in E\). This directly yields \(\pi_{(j)}\in{\mathbb {R}}^{n}_{+}\). Further, by setting \(x_{i}=0\) for \(i\in J\setminus\{j\}\) and making \(x_{j}>0\) sufficiently small, we see that \(\phi_{j}+\psi_{(j)}^{\top}x_{I}\ge0\) is required for all \(x_{I}\in [0,1]^{m}\), which forces \(\phi_{j}\ge(\psi_{(j)}^{-})^{\top}{\mathbf{1}}\). Finally, let \(\alpha\in{\mathbb {S}}^{n}\) be the matrix with elements \(\alpha_{ij}\) for \(i,j\in J\), let \(\varPsi\in{\mathbb {R}}^{m\times n}\) have columns \(\psi_{(j)}\), and \(\varPi \in{\mathbb {R}} ^{n\times n}\) columns \(\pi_{(j)}\). We then have
so by sending \(s\) to infinity we see that \(\alpha+ \operatorname {Diag}(\varPi^{\top}x_{J})\operatorname{Diag}(x_{J})^{-1}\) must lie in \({\mathbb {S}}^{n}_{+}\) for all \(x_{J}\in {\mathbb {R}}^{n}_{++}\). This proves (i).
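A quick numerical illustration of this limiting condition for hypothetical parameters \(\alpha\) and \(\varPi\) (not from the paper):

    # Check that alpha + Diag(Pi^T x_J) Diag(x_J)^{-1} is positive semidefinite
    # for sampled x_J with strictly positive entries. Purely illustrative.
    import numpy as np

    alpha = np.array([[0.2, -0.1], [-0.1, 0.3]])
    Pi = np.array([[0.0, 0.4], [0.5, 0.0]])          # zero diagonal, nonnegative entries
    rng = np.random.default_rng(2)
    for _ in range(5):
        xJ = rng.uniform(0.1, 5.0, size=2)
        M = alpha + np.diag(Pi.T @ xJ) / xJ          # Diag(Pi^T x_J) Diag(x_J)^{-1}
        print(np.linalg.eigvalsh(M).min() >= -1e-12)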
For (ii), note that \({\mathcal {G}}p(x) = b_{i}(x)\) for \(p(x)=x_{i}\), and \({\mathcal {G}} p(x)=-b_{i}(x)\) for \(p(x)=1-x_{i}\). In particular, if \(i\in I\), then \(b_{i}(x)\) cannot depend on \(x_{J}\). This establishes (6.4). Next, for \(i\in I\), we have \(\beta _{i}+B_{iI}x_{I}> 0\) for all \(x_{I}\in[0,1]^{m}\) with \(x_{i}=0\), and this yields \(\beta_{i} - (B^{-}_{i,I\setminus\{i\}}){\mathbf{1}}> 0\). Similarly, \(\beta _{i}+B_{iI}x_{I}<0\) for all \(x_{I}\in[0,1]^{m}\) with \(x_{i}=1\), so that \(\beta_{i} + (B^{+}_{i,I\setminus\{i\}}){\mathbf{1}}+ B_{ii}< 0\). For \(j\in J\), we may set \(x_{J}=0\) to see that \(\beta_{J}+B_{JI}x_{I}\in{\mathbb {R}}^{n}_{++}\) for all \(x_{I}\in [0,1]^{m}\). Hence \(\beta_{j}> (B^{-}_{jI}){\mathbf{1}}\) for all \(j\in J\). Moreover, fixing \(j\in J\), setting \(x_{j}=0\) and letting \(x_{i}\to\infty\) for \(i\ne j\) forces \(B_{ji}>0\). The proof of (ii) is complete. □
Appendix I: Proof of Proposition 6.6
Since \({\mathcal {Q}}\) consists of the single polynomial \(q(x)=1-{\mathbf{1}} ^{\top}x\), it is clear that (G1) holds. To prove (G2), it suffices by Lemma 5.5 to prove for each \(i\) that the ideal \((x_{i}, 1-{\mathbf {1}}^{\top}x)\) is prime and has dimension \(d-2\). But an affine change of coordinates shows that this is equivalent to the same statement for \((x_{1},x_{2})\), which is well known to be true.
Next, the only nontrivial aspect of verifying that (i) and (ii) imply (A0)–(A2) is to check that \(a(x)\) is positive semidefinite for each \(x\in E\). To do this, fix any \(x\in E\) and let \(\varLambda\) denote the diagonal matrix with \(a_{ii}(x)\), \(i=1,\ldots,d\), on the diagonal. Then for each \(s\in[0,1)\), the matrix \(A(s)=(1-s)(\varLambda+{\mathrm{Id}})+sa(x)\) is strictly diagonally dominant (see footnote 5) with positive diagonal elements. Hence by Horn and Johnson [30, Theorem 6.1.10], it is positive definite. But since \({\mathbb {S}}^{d}_{+}\) is closed and \(\lim_{s\to1}A(s)=a(x)\), we get \(a(x)\in{\mathbb {S}}^{d}_{+}\).
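The diagonal dominance step can be checked numerically; in the sketch below the sample matrix is illustrative and only weakly diagonally dominant (it has a zero eigenvalue), yet every interpolant \(A(s)\) with \(s<1\) is strictly diagonally dominant and hence positive definite:

    # Sketch of the argument above: A(s) = (1-s)*(Lambda + Id) + s*a is strictly
    # diagonally dominant for s in [0,1), so a = lim_{s->1} A(s) lies in S^d_+.
    import numpy as np

    a = np.array([[ 0.50, -0.30, -0.20],
                  [-0.30,  0.40, -0.10],
                  [-0.20, -0.10,  0.30]])            # illustrative symmetric matrix
    Lam = np.diag(np.diag(a))

    def strictly_dominant(A):
        off = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
        return bool(np.all(np.abs(np.diag(A)) > off))

    for s in (0.0, 0.5, 0.9, 0.99):
        A_s = (1 - s) * (Lam + np.eye(3)) + s * a
        print(s, strictly_dominant(A_s))             # True for every s < 1
    print(np.linalg.eigvalsh(a))                     # the limit is positive semidefinite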
We now focus on the converse direction and assume (A0)–(A2) hold. We first prove (i). As the ideal \((x_{i},1-{\mathbf{1}}^{\top}x)\) satisfies (G2) for each \(i\), the condition \(a(x)e_{i}=0\) on \(M\cap\{x_{i}=0\}\) implies that
for some polynomials \(h_{ji}\) and \(g_{ji}\) in \({\mathrm {Pol}}_{1}({\mathbb {R}}^{d})\). Suppose \(j\ne i\). By symmetry of \(a(x)\), we get
Thus \(h_{ij}=0\) on \(M\cap\{x_{i}=0\}\cap\{x_{j}\ne0\}\), and, by continuity, on \(M\cap\{x_{i}=0\}\). Another application of (G2) and counting degrees gives \(h_{ij}(x)=-\alpha_{ij}x_{i}+(1-{\mathbf{1}}^{\top}x)\gamma_{ij}\) for some constants \(\alpha_{ij}\) and \(\gamma_{ij}\). This proves \(a_{ij}(x)=-\alpha_{ij}x_{i}x_{j}\) on \(E\) for \(i\ne j\), as claimed. For \(i=j\), note that (I.1) can be written as
for some constants \(\alpha_{ij}\), \(\phi_{i}\) and vectors \(\psi _{(i)}\in{\mathbb {R}} ^{d}\) with \(\psi_{(i),i}=0\). We need to identify \(\phi_{i}\) and \(\psi _{(i)}\). To this end, note that the condition \(a(x){\mathbf{1}}=0\) on \(\{ 1-{\mathbf{1}} ^{\top}x=0\}\) yields \(a(x){\mathbf{1}}=(1-{\mathbf{1}}^{\top}x)f(x)\) for all \(x\in {\mathbb {R}}^{d}\), where \(f\) is some vector of polynomials \(f_{i}\in{\mathrm {Pol}}_{1}({\mathbb {R}}^{d})\). Writing the \(i\)th component of \(a(x){\mathbf{1}}\) in two ways then yields
for all \(x\in{\mathbb {R}}^{d}\) and some \(\eta\in{\mathbb {R}}^{d}\), \({\mathrm {H}} \in{\mathbb {R}}^{d\times d}\). Replacing \(x\) by \(sx\), dividing by \(s\) and sending \(s\) to zero gives \(x_{i}\phi_{i} = \lim_{s\to0} s^{-1}\eta_{i} + ({\mathrm {H}}x)_{i}\), which forces \(\eta _{i}=0\), \({\mathrm {H}}_{ij}=0\) for \(j\ne i\) and \({\mathrm {H}}_{ii}=\phi _{i}\). Substituting into (I.2) and rearranging yields
for all \(x\in{\mathbb {R}}^{d}\). The coefficient in front of \(x_{i}^{2}\) on the left-hand side is \(-\alpha_{ii}+\phi_{i}\) (recall that \(\psi_{(i),i}=0\)), which therefore is zero. That is, \(\phi_{i}=\alpha_{ii}\). With this in mind, (I.3) becomes \(x_{i} \sum_{j\ne i} (-\alpha _{ij}+\psi _{(i),j}+\alpha_{ii})x_{j} = 0\) for all \(x\in{\mathbb {R}}^{d}\), which implies \(\psi _{(i),j}=\alpha_{ij}-\alpha_{ii}\). At this point, we have proved
on \(E\), which yields the stated form of \(a_{ii}(x)\). It remains to show that \(\alpha_{ij}\ge0\) for all \(i\ne j\). To see this, suppose for contradiction that \(\alpha_{ik}<0\) for some \((i,k)\). Pick \(s\in(0,1)\) and set \(x_{k}=s\), \(x_{j}=(1-s)/(d-1)\) for \(j\ne k\). Then
For \(s\) sufficiently close to 1, the right-hand side becomes negative, which contradicts positive semidefiniteness of \(a\) on \(E\). This proves (i).
For (ii), first note that we always have \(b(x)=\beta+Bx\) for some \(\beta \in{\mathbb {R}}^{d}\) and \(B\in{\mathbb {R}}^{d\times d}\). The condition \({\mathcal {G}}q=0\) on \(M\) for \(q(x)=1-{\mathbf{1}}^{\top}x\) yields \(\beta^{\top}{\mathbf{1}}+ x^{\top}B^{\top}{\mathbf{1}}= 0\) on \(M\). Hence by Lemma 5.4, \(\beta^{\top}{\mathbf{1}}+ x^{\top}B^{\top}{\mathbf{1}} =\kappa(1-{\mathbf{1}}^{\top}x)\) for all \(x\in{\mathbb {R}}^{d}\) and some constant \(\kappa\). This yields \(\beta^{\top}{\mathbf{1}}=\kappa\) and then \(B^{\top}{\mathbf {1}}=-\kappa {\mathbf{1}} =-(\beta^{\top}{\mathbf{1}}){\mathbf{1}}\). Next, the condition \({\mathcal {G}}p_{i} \ge0\) on \(M\cap\{ p_{i}=0\}\) for \(p_{i}(x)=x_{i}\) can be written as
which in turn is equivalent to
The feasible region of this optimization problem is the convex hull of \(\{e_{j}:j\ne i\}\), and the linear objective function achieves its minimum at one of the extreme points. Thus we obtain \(\beta_{i}+B_{ji} \ge0\) for all \(j\ne i\) and all \(i\), as required. □
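The extreme-point step can be cross-checked with a small linear program (illustrative parameter values; assumes SciPy):

    # Minimizing a linear function over the convex hull of {e_j : j != i} is
    # attained at one of the vertices e_j, so the positivity constraint reduces
    # to finitely many inequalities, one per vertex.
    import numpy as np
    from scipy.optimize import linprog

    d, i = 4, 0
    rng = np.random.default_rng(1)
    beta, B = rng.standard_normal(d), rng.standard_normal((d, d))

    A_eq = np.vstack([np.ones(d), np.eye(d)[i]])     # sum(x) = 1 and x_i = 0
    res = linprog(c=B[i], A_eq=A_eq, b_eq=[1.0, 0.0], bounds=[(0, None)] * d)
    vertex_min = min(beta[i] + B[i] @ np.eye(d)[j] for j in range(d) if j != i)
    print(beta[i] + res.fun, vertex_min)             # the two values coincide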
Appendix J: Some notions from algebraic geometry
In this appendix, we briefly review some well-known concepts and results from algebra and algebraic geometry. The reader is referred to Dummit and Foote [16, Chaps. 7 and 15] and Bochnak et al. [6, Chap. 4] for more details.
An ideal \(I\) of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) is a subset of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) closed under addition and such that \(f\in I\) and \(g\in{\mathrm {Pol}}({\mathbb {R}}^{d})\) implies \(fg\in I\). Given a finite family \({\mathcal {R}}=\{r_{1},\ldots,r_{m}\}\) of polynomials, the ideal generated by ℛ, denoted by \(({\mathcal {R}})\) or \((r_{1},\ldots,r_{m})\), is the ideal consisting of all polynomials of the form \(f_{1} r_{1}+\cdots+f_{m}r_{m}\), with \(f_{i}\in{\mathrm {Pol}}({\mathbb {R}}^{d})\). Given any set of polynomials \(S\), its zero set is the set
$$ {\mathcal {V}}(S) = \{x\in{\mathbb {R}}^{d} : f(x)=0 \text{ for all } f\in S\}. $$
The zero set of the family ℛ coincides with the zero set of the ideal \(I=({\mathcal {R}})\), that is, \({\mathcal {V}}( {\mathcal {R}})={\mathcal {V}}(I)\). For example, the set \(M\) in (5.1) is the zero set of the ideal \(({\mathcal {Q}})\). Given a set \(V\subseteq{\mathbb {R}}^{d}\), the ideal generated by \(V\), denoted by \({\mathcal {I}}(V)\), is the set of all polynomials that vanish on \(V\). It follows from the definition that \(S\subseteq{\mathcal {I}}({\mathcal {V}}(S))\) for any set \(S\) of polynomials. A basic problem in algebraic geometry is to establish when an ideal \(I\) is equal to the ideal generated by the zero set of \(I\),
$$ I = {\mathcal {I}}\big({\mathcal {V}}(I)\big). \qquad\mathrm{(J.1)} $$
If the ideal \(I=({\mathcal {R}})\) satisfies (J.1), then that means that any polynomial \(f\) that vanishes on the zero set \({\mathcal {V}}(I)\) has a representation \(f=f_{1}r_{1}+\cdots+f_{m}r_{m}\) for some polynomials \(f_{1},\ldots,f_{m}\).
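Such a representation can also be certified computationally with a Gröbner basis; here is a sketch for the ideal \((x_{1}, 1-x_{1}-x_{2}-x_{3})\) of the kind appearing in Appendix I (illustrative code, not from the paper):

    # Reduce a polynomial that vanishes on V(x1, 1 - x1 - x2 - x3) modulo a
    # Groebner basis of the ideal; a zero remainder certifies membership, hence
    # a representation f = f1*x1 + f2*(1 - x1 - x2 - x3).
    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    G = sp.groebner([x1, 1 - x1 - x2 - x3], x1, x2, x3, order='lex')
    f = x1 * x2 + (x2 + x3) * (1 - x1 - x2 - x3)     # vanishes on the zero set
    quotients, remainder = G.reduce(f)
    print(remainder)                                  # 0, so f belongs to the ideal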
An ideal \(I\) of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) is said to be prime if it is not all of \({\mathrm{Pol}}({\mathbb {R}}^{d})\) and if the conditions \(f,g\in {\mathrm{Pol}}({\mathbb {R}}^{d})\) and \(fg\in I\) imply \(f\in I\) or \(g\in I\). The dimension of an ideal \(I\) of \({\mathrm{Pol}} ({\mathbb {R}}^{d})\) is the dimension of the quotient ring \({\mathrm {Pol}}({\mathbb {R}}^{d})/I\); for a definition of the latter, see Dummit and Foote [16, Sect. 16.1].
Keywords
- Polynomial diffusions
- Polynomial diffusion models in finance
- Stochastic invariance
- Boundary attainment
- Moment problem