Abstract
We derive analytic closed-form moment and Laplace transform formulae for the quasi-stationary distribution of the classical Shiryaev diffusion restricted to the interval [0, A] with absorption at a given \(A>0\).
1 Introduction
This work is an investigation into quasi-stationarity of the classical Shiryaev diffusion restricted to an interval. Specifically, the focus is on the solution \((R_{t}^{r})_{t\ge 0}\) of the stochastic differential equation
$$\begin{aligned} dR_{t}^{r}=dt+R_{t}^{r}\,dB_t\;\;\text {with}\;\;R_{0}^{r}\,{:}{=}\,r\ge 0\;\;\text {fixed}, \end{aligned}$$(1)
where \((B_t)_{t\ge 0}\) is standard Brownian motion in the sense that \({{\mathrm{\mathbb {E}}}}[dB_t]=0\), \({{\mathrm{\mathbb {E}}}}[(dB_t)^2]=dt\), and \(B_0=0\). The time-homogeneous Markov process \((R_{t}^{r})_{t\ge 0}\) is an important particular version of the so-called generalized Shiryaev process. The latter was first arrived at and studied by Prof. A.N. Shiryaev (hence the name) in his fundamental work (Shiryaev 1961, 1963) on quickest change-point detection. While interest in the Shiryaev process in the context of quickest change-point detection has never weakened (see, e.g., Pollak and Siegmund 1985; Shiryaev 2002; Feinberg and Shiryaev 2006; Burnaev et al. 2009; Polunchenko 2016, 2017a, b, c), the process has received a great deal of attention in other areas as well, notably in mathematical finance (see, e.g., Geman and Yor 1993; Donati-Martin et al. 2001; Linetsky 2004) and in mathematical physics (see, e.g., Monthus and Comtet 1994; Comtet and Monthus 1996). It has also been considered in the literature on general stochastic processes (see, e.g., Wong 1964; Yor 1992; Donati-Martin et al. 2001; Dufresne 2001; Schröder 2003; Peskir 2006; Polunchenko and Sokolov 2016; Polunchenko et al. 2018).
The particular version of the Shiryaev process \((R_{t}^{r})_{t\ge 0}\) governed by Eq. (1) is of special importance and interest because it is the only version with probabilistically nontrivial behavior in the limit as \(t\rightarrow +\infty \), exhibited in spite of the distinct martingale property \({{\mathrm{\mathbb {E}}}}[R_{t}^{r}-r-t]=0\) for all \(t\ge 0\) and \(r\ge 0\). Moreover, the process is convergent (as \(t\rightarrow +\infty \)) regardless of whether the state space is (I) the entire half-line \([0,+\infty )\) with no absorption on the interior; or (II) the interval [0, A] with absorption at a given level \(A>0\); or (III) the shortened half-line \([A,+\infty )\) also with absorption at \(A>0\) given. The case of a negative initial value r was touched upon in Peskir (2006). Cases (I), (II), and (III) have all been considered in the literature, which we now briefly review.
Case (I) is probably the easiest of the three cases. The asymptotic (as \(t\rightarrow +\infty \)) distribution of \((R_{t}^{r})_{t\ge 0}\) in this case is known as the stationary distribution. Formally, the latter is defined as
$$\begin{aligned} H(x)\,{:}{=}\,\lim _{t\rightarrow +\infty }\Pr (R_{t}^{r}\le x), \end{aligned}$$(2)
and it has already been found, e.g., in Shiryaev (1961), Shiryaev (1963), Pollak and Siegmund (1985), Feinberg and Shiryaev (2006), Burnaev et al. (2009), Polunchenko and Sokolov (2016), to be the momentless distribution
$$\begin{aligned} H(x)=e^{-\frac{2}{x}}\,\mathbf {1}_{\{x\ge 0\}}, \end{aligned}$$(3)
which is an extreme-value Fréchet-type distribution, and can also be recognized as a particular case of the inverse (reciprocal) gamma distribution. Exact closed-form formulae for the distribution of \(R_{t}^{r}\) for any given \(t\ge 0\) and \(r\ge 0\) can be found, e.g., in Linetsky (2004), Avram et al. (2013), Polunchenko and Sokolov (2016).
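Cases (II) and (III) are fundamentally different from and far less understood than case (I), due to absorption at one of the boundaries. The corresponding asymptotic (as \(t\rightarrow +\infty \)) distributions are quasi-stationary distributions, i.e., stationary but conditional on extended survival. Formally, consider the stopping time
$$\begin{aligned} {\mathcal {S}}_{A}^{r}\,{:}{=}\,\inf \{t\ge 0:R_{t}^{r}=A\}\;\;\text {such that}\;\;\inf \{\varnothing \}=+\infty , \end{aligned}$$
where \(R_{0}^{r}\,{:}{=}\,r\ge 0\) and \(A>0\) are fixed. The quasi-stationary distribution is defined as
$$\begin{aligned} Q_A(x)\,{:}{=}\,\lim _{t\rightarrow +\infty }\Pr (R_{t}^{r}\le x\mid {\mathcal {S}}_{A}^{r}>t)\;\;\text {with}\;\;q_A(x)\,{:}{=}\,\dfrac{d}{dx}Q_A(x), \end{aligned}$$(4)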
and it does depend on whether \(r\in [0,A]\), which is case (II), or \(r\in [A,+\infty )\), which is case (III), but the specific value of r inside the state space of choice is irrelevant.
Case (III) is arguably the least understood case. To the best of our knowledge, the first attempt to treat this case was made in (Collet et al. 2013, Section 7.8.2) where the authors proved that not only does the quasi-stationary distribution exist for any \(A>0\), but also that there is a whole parametric continuum of quasi-stationary distributions when A is not sufficiently large. Further progress on this case was recently made in Polunchenko et al. (2018) where \(Q_A(x)\) and \(q_A(x)\) were, for the first time, found analytically for any \(A>0\). It was also shown in Polunchenko et al. (2018) that the quasi-stationary distribution is unique whenever \(A\ge A^{*}\approx 1.265857361\) where \(A^{*}\) is the solution of a certain transcendental equation. While case (III) may be the least understood case, the focus of this work is entirely on case (II), which is discussed next along with the motivation.
Case (II) is of importance in quickest change-point detection, and in this context, it was investigated in, e.g., Pollak and Siegmund (1985), Burnaev et al. (2009), Polunchenko (2017c). See also, e.g., Pollak and Siegmund (1986), Linetsky (2004) and Collet et al. (2013, Section 7.8.2). For example, it is known from (Pollak and Siegmund 1985, 1986) that, expectedly, the limit of \(Q_A(x)\), defined in (4), as \(A\rightarrow +\infty \) is \(H(x)\), defined in (2) and given by (3); the convergence is from above, and is pointwise, at every \(x\in [0,+\infty )\), i.e., at all continuity points of \(H(x)\). Moreover, analytic closed-form formulae for \(Q_A(x)\) and \(q_A(x)\) were recently obtained in Polunchenko (2017c), apparently for the first time in the literature; see formulae (10) and (11) below. In addition, the distribution of \(R_{t}^{r}\) conditional on no extinction prior to time \(t>0\), for any given \(t>0\) and \(r\in [0,A)\), has been derived explicitly as well (see, e.g., Polunchenko 2016; Linetsky 2004); this conditional distribution becomes the quasi-stationary distribution in the limit, as \(t\rightarrow +\infty \). Due to its connection to quickest change-point detection, it is case (II) that is of interest to this work. Notwithstanding all the headway made lately on case (II), gaps do remain, and this work seeks to fill in some of these gaps.
More precisely, the contribution of this work in relation to case (II) is two-fold: (a) obtain exact closed-form formulae for the quasi-stationary distribution’s moments; and subsequently use the moment formulae to (b) derive an exact formula (in different forms) for the Laplace transform of the quasi-stationary distribution. The moment formulae are obtained as an extension of the effort made earlier in Polunchenko (2017c) where the moment sequence was shown to satisfy a certain recurrence whose closed-form solution, at the time, seemed out of reach. This work “runs that leg” and solves the recurrence explicitly. This is done in the first half of Sect. 3, which is the main section of the present paper. The second half of Sect. 3 is devoted to the computation of the Laplace transform in two different ways: first using the obtained moment formulae, and then also by solving a certain order-two ordinary differential equation that the Laplace transform of interest can be easily shown (see Polunchenko 2017c) to satisfy. Since nearly all of the formulae involve special functions, we conveniently preface Sect. 3 and the derivations therein with Sect. 2 which introduces the relevant special functions. Lastly, Sect. 4 wraps up the entire paper with a few concluding remarks.
2 Notation and nomenclature
For convenience we shall adopt the standard notation employed uniformly across mathematical literature. In particular, this applies to a host of special functions we shall deal with throughout the sequel. These functions, in their most common notation, are:
1. The Gamma function \(\varGamma (z)\), \(z\in {\mathbb {C}}\), frequently also referred to as the extension of the factorial to complex numbers, due to the property \(\varGamma (n)=(n-1)!\) exhibited for \(n\in {\mathbb {N}}\). See, e.g., (Bateman and Erdélyi 1953a, Chapter 1).
2. The Pochhammer symbol, or the rising factorial, often notated as \((z)_n\) and defined for \(z\in {\mathbb {C}}\) and \(n\in {\mathbb {N}}\cup \{0\}\) as
$$\begin{aligned} (z)_{n}\,{:}{=}\,\left\{ \begin{array}{l@{\qquad }l} 1, &{}\text {for } n=0;\\ z(z+1)\cdots (z+n-1), &{}\text {for } n\in {\mathbb {N}}, \end{array}\right. \end{aligned}$$
and it is of note that \((1)_n=n!\) for any \(n\in {\mathbb {N}}\cup \{0\}\). See, e.g., (Srivastava and Karlsson 1985, pp. 16–18). Also, observe that
$$\begin{aligned} (z)_{n} = \dfrac{\varGamma (z+n)}{\varGamma (z)} \; \text {for} \; n\in {\mathbb {N}}\cup \{0\} \; \text {and} \; z\in {\mathbb {C}}\setminus \{0,-1,-2,\ldots \}, \end{aligned}$$
and if z is a negative integer or zero, i.e., if \(z=-k\) and \(k\in {\mathbb {N}}\cup \{0\}\), then
$$\begin{aligned} (-k)_{n} = {\left\{ \begin{array}{ll} \dfrac{(-1)^{n}\,k!}{(k-n)!},&{}\text {for } n=0,1,\ldots ,k;\\ 0,&{}\text {for } n=k+1,k+2,\ldots ; \end{array}\right. } \end{aligned}$$(5)
cf. (Srivastava and Karlsson 1985, pp. 16–17).
3. The special case of the generalized hypergeometric function (see, e.g., Bateman and Erdélyi 1953a, Chapter 4) with two numeratorial and two denominatorial parameters. The function, denoted as \({}_{2}F_{2}[z]\), is defined via the power series
$$\begin{aligned} {}_{2}F_{2}\left[ \begin{matrix} a_1,a_2\\ b_1,b_2 \end{matrix};z\right] \,{:}{=}\,\sum _{n=0}^{\infty }\dfrac{(a_1)_{n}\,(a_2)_{n}}{(b_1)_{n}\,(b_2)_{n}}\,\dfrac{z^{n}}{n!}, \end{aligned}$$(6)
where \(b_1,b_2\not \in \{0,-1,-2,\ldots \}\) and \(\vert z\vert <+\infty \). See (Srivastava and Karlsson 1985, p. 20). It is of note that when only one of the numeratorial parameters \(a_i\), \(i=1,2\), is a negative integer or zero, then, in view of (5), the power series on the right of (6) terminates, thereby turning the function \({}_{2}F_{2}[z]\) into a polynomial in z of degree \(-a_i\).
4. The Whittaker M and W functions, traditionally denoted, respectively, as \(M_{a,b}(z)\) and \(W_{a,b}(z)\), where \(a,b,z\in {\mathbb {C}}\); the Whittaker M function is undefined when \(-2b\in {\mathbb {N}}\), but can be regularized. These functions were introduced by Whittaker (1904) as the fundamental solutions to the Whittaker differential equation (see, e.g., Slater 1960; Buchholz 1969).
5. The modified Bessel functions of the first and second kinds, conventionally denoted, respectively, as \(I_{a}(z)\) and \(K_{a}(z)\), where \(a,z\in {\mathbb {C}}\); the index a is referred to as the function’s order. See (Bateman and Erdélyi 1953b, Chapter 7). These functions form a set of fundamental solutions to the modified Bessel differential equation. The modified Bessel K function is also known as the MacDonald function.
6. The particular case of the generalized bivariate Kampé de Fériet function
(7)which is well-defined for \(b_1,b_2\not \in \{0,-1,-2,\ldots \}\) and \(\vert x\vert <+\infty \) and \(\vert y\vert <+\infty \). See Srivastava and Karlsson (1985, p. 27). The above function was introduced in Lavoie and Grondin (1994), and is slightly more general than the original Kampé de Fériet function proposed by Prof. J. Kampé de Fériet in Kampé de Fériet (1921).
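As an informal numerical companion to the definitions above, the following Python sketch evaluates the Pochhammer symbol and a partial sum of the standard power series \(\sum _{n\ge 0}\frac{(a_1)_n(a_2)_n}{(b_1)_n(b_2)_n}\frac{z^n}{n!}\) of \({}_{2}F_{2}[z]\). The function names `pochhammer` and `hyp2f2`, as well as the truncation parameters `max_terms` and `tol`, are our own illustrative choices and are not taken from any of the cited references. The sketch is meant only to make the termination property tangible: when a numeratorial parameter is a non-positive integer, the running term hits zero and the sum is a polynomial.

```python
import math


def pochhammer(z, n):
    """Rising factorial (z)_n = z (z+1) ... (z+n-1), with (z)_0 = 1."""
    result = 1
    for j in range(n):
        result *= z + j
    return result


def hyp2f2(a1, a2, b1, b2, z, max_terms=200, tol=1e-15):
    """Partial sum of the 2F2 power series (illustrative sketch only).

    The sum is exact (up to rounding) when a1 or a2 is a non-positive
    integer, because the series then terminates. Otherwise the loop
    stops once the running term is negligible or max_terms is reached.
    """
    total = 0.0
    term = 1.0  # the n = 0 term of the series
    for n in range(max_terms):
        total += term
        # Ratio of the (n+1)-th term to the n-th term of the series.
        term *= (a1 + n) * (a2 + n) / ((b1 + n) * (b2 + n)) * z / (n + 1)
        if term == 0 or abs(term) <= tol * abs(total):
            break
    return total


if __name__ == "__main__":
    print(pochhammer(1, 5))                  # (1)_5 = 5!
    print(pochhammer(-3, 2), pochhammer(-3, 4))
    print(hyp2f2(-2, 1, 1, 1, 2.0))          # terminating case: 1 - 2z + z^2/2 at z = 2
```

Because the parameters enter only through ordinary arithmetic, the same functions accept complex arguments, which may be handy when the second Whittaker index is purely imaginary.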
3 The formulae and discussion
As was mentioned in the introduction, the quasi-stationary distribution defined in (4) was recently expressed analytically in Polunchenko (2016) through the Whittaker W function. Specifically, it can be deduced from (Polunchenko 2016, Theorem 3.1) that if \(A>0\) is fixed and \(\lambda \equiv \lambda _A>0\) is the smallest (positive) solution of the equation
$$\begin{aligned} W_{1,\frac{1}{2}\xi (\lambda )}\left( \dfrac{2}{A}\right) =0, \end{aligned}$$(8)
where
$$\begin{aligned} \xi (\lambda )\,{:}{=}\,\sqrt{1-8\lambda }, \end{aligned}$$(9)
then the quasi-stationary probability density function (pdf) is given by
$$\begin{aligned} q_A(x)=\dfrac{e^{-\frac{1}{x}}\,\dfrac{1}{x}\,W_{1,\frac{1}{2}\xi (\lambda )}\left( \dfrac{2}{x}\right) }{e^{-\frac{1}{A}}\,W_{0,\frac{1}{2}\xi (\lambda )}\left( \dfrac{2}{A}\right) }\,\mathbf {1}_{\{x\in [0,A]\}}, \end{aligned}$$(10)
and the respective cumulative distribution function (cdf) is given by
$$\begin{aligned} Q_A(x)=\dfrac{e^{-\frac{1}{x}}\,W_{0,\frac{1}{2}\xi (\lambda )}\left( \dfrac{2}{x}\right) }{e^{-\frac{1}{A}}\,W_{0,\frac{1}{2}\xi (\lambda )}\left( \dfrac{2}{A}\right) }\,\mathbf {1}_{\{x\in [0,A]\}}+\mathbf {1}_{\{x>A\}}, \end{aligned}$$(11)
and \(q_A(x)\) and \(Q_A(x)\) are each a smooth function of x and A; observe that \(q_A(A)=0\), as implied by (8), (9), and (10). The smoothness of \(q_A(x)\) and \(Q_A(x)\) is due to analytic properties of the Whittaker W function on the right of formulae (10) and (11). These formulae stem from the solution of a certain Sturm–Liouville problem, and \(\lambda \) is the smallest positive eigenvalue of the corresponding Sturm–Liouville operator; in Polunchenko (2016), the Sturm–Liouville operator is negated, causing \(\lambda \) to be its largest negative eigenvalue.
Remark 1
The definition (9) of \(\xi (\lambda )\) can actually be changed to \(\xi (\lambda )\,{:}{=}\,-\sqrt{1-8\lambda }\) with no effect whatsoever on either Eq. (8), or formulae (10) and (11), i.e., all three are invariant with respect to the sign of \(\xi (\lambda )\). This was previously pointed out in Polunchenko (2017c), and the reason for this \(\xi (\lambda )\)-symmetry is that Eq. (8) and formulae (10) and (11) each have \(\xi (\lambda )\) present only as (double) the second index of the corresponding Whittaker W function or functions involved, and the Whittaker W function in general is known (see, e.g., Buchholz 1969, Identity (19), p. 19) to be an even function of its second index, i.e., \(W_{a,b}(z)=W_{a,-b}(z)\).
It is evident that Eq. (8) is a key ingredient of formulae (10) and (11), and consequently, of all of the characteristics of the quasi-stationary distribution as well. As a transcendental equation, it can only be solved numerically, although to within any desired accuracy, as was previously done, e.g., in Linetsky (2004), Polunchenko (2016), Polunchenko (2017c), Polunchenko (2017a), with the aid of Mathematica developed by Wolfram Research: Mathematica’s special functions capabilities have long proven to be superb. Yet, it is known (see Linetsky 2004; Polunchenko 2016) that for any fixed \(A>0\), the equation has countably many simple solutions \(0<\lambda _1<\lambda _2<\lambda _3<\cdots \), such that \(\lim _{k\rightarrow +\infty }\lambda _k=+\infty \). All of them depend on A, but since we are interested only in the smallest one, we shall use either the “short” notation \(\lambda \), or the more explicit \(\lambda _A\) to emphasize the dependence on A. Also, it can be concluded from (Polunchenko 2016, p. 136 and Lemma 3.3) that \(\lambda _A\) is a monotonically decreasing function of A such that \(\lim _{A\rightarrow +\infty }\lambda _A=0\), and more specifically \(\lambda _A=A^{-1}+O(A^{-3/2})\).
Remark 2
Since \(\lambda \equiv \lambda _{A}\) is monotonically decreasing in A, and such that \(\lim _{A\rightarrow +\infty }\lambda _{A}=0\), one can conclude from (9) that \(\xi (\lambda )\) is either (a) purely imaginary (i.e., \(\xi (\lambda )=\mathrm {i}\alpha \) where \(\mathrm {i}\,{:}{=}\,\sqrt{-1}\) and \(\alpha \in \mathbb {R}\)) if A is sufficiently small, or (b) purely real and between 0 inclusive and 1 exclusive (i.e., \(0\le \xi (\lambda )<1\)) otherwise. The borderline case is when \(\xi (\lambda )=0\), i.e., when \(\lambda _{A}=1/8\), and the corresponding critical value of A is the solution \(\tilde{A}>0\) of the equation \(W_{1,0}(2/\tilde{A})=0\). A basic numerical calculation gives \(\tilde{A}\approx 10.240465\). Hence, if \(A<\tilde{A}\approx 10.240465\), then \(\lambda _{A}>1/8\) so that \(\xi (\lambda )\) is purely imaginary; otherwise, if \(A\ge \tilde{A}\approx 10.240465\), then \(\lambda _{A}\in (0,1/8]\) so that \(\xi (\lambda )\) is purely real and such that \(\xi (\lambda )\in [0,1)\) with \(\lim _{A\rightarrow +\infty }\xi (\lambda _{A})=1\).
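The dichotomy described in Remark 2 is easy to illustrate numerically. The sketch below takes a value of \(\lambda \) as input and reports whether \(\xi (\lambda )=\sqrt{1-8\lambda }\) is purely real, purely imaginary, or zero. The names `xi` and `xi_regime` are hypothetical helpers of our own, and the values of \(\lambda \) used in the demo are arbitrary placeholders rather than actual solutions \(\lambda _A\) of Eq. (8).

```python
import cmath


def xi(lam):
    """Principal-branch value of xi(lambda) = sqrt(1 - 8*lambda) for real lambda > 0."""
    return cmath.sqrt(1 - 8 * lam)


def xi_regime(lam):
    """Classify xi(lambda) as 'borderline', 'real', or 'imaginary'."""
    value = xi(lam)
    if value == 0:
        return "borderline"  # lambda = 1/8 exactly
    # For real lambda the square root is either real or purely imaginary.
    return "real" if value.imag == 0 else "imaginary"


if __name__ == "__main__":
    for lam in (0.5, 0.125, 0.1):  # placeholder values, not roots of Eq. (8)
        print(lam, xi(lam), xi_regime(lam))
```

Since the sign of \(\xi (\lambda )\) is immaterial by Remark 1, taking the principal branch of the square root loses no generality.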
The asymptotics \(\lambda _A=A^{-1}+O(A^{-3/2})\) was first established (in a more general form) in Polunchenko (2017c) with the aid of Jensen’s inequality applied to ascertain that the variance of the quasi-stationary distribution (10)–(11) is strictly positive. This is an example of potential applications of the quasi-stationary distribution’s low-order moments. We now recover the distribution’s entire moment series.
3.1 The moment series
Let Z be a random variable sampled from the quasi-stationary distribution given by (10) and (11). Let \({\mathfrak {M}}_{n}\,{:}{=}\,{{\mathrm{\mathbb {E}}}}[Z^{n}]\) denote the n-th moment of Z for \(n\in {\mathbb {N}}\cup \{0\}\); it is to be understood that \({\mathfrak {M}}_{0}\equiv 1\) for any \(A>0\), and that all other \({\mathfrak {M}}_{n}\)’s actually do depend on A. For every fixed \(A>0\), the series \(\{{\mathfrak {M}}_{n}\}_{n\ge 0}\) can be inferred from (Polunchenko 2017c, Theorem 3.2, p. 136) to satisfy the recurrence
$$\begin{aligned} \left( \dfrac{n(n-1)}{2}+\lambda \right) {\mathfrak {M}}_{n}=\lambda \,A^{n}-n\,{\mathfrak {M}}_{n-1},\;\;n\in {\mathbb {N}}, \end{aligned}$$(12)
with \({\mathfrak {M}}_{0}\equiv 1\); recall that \(\lambda \equiv \lambda _{A}\) and A are interconnected via Eq. (8). While recurrence (12) may seem easy to iterate forward on a computer, a general closed-form expression for \({\mathfrak {M}}_{n}\) for any \(n\in {\mathbb {N}}\cup \{0\}\) would be more convenient, especially for analytic purposes. To that end, it was lamented in Polunchenko (2017c) that although the recurrence can be solved explicitly, the solution is too cumbersome. We now show that the solution can be expressed compactly through the hypergeometric function \({}_{2}F_{2}[z]\) defined in (6).
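For readers who wish to experiment with the forward iteration mentioned above, here is a minimal Python sketch. It assumes that the recurrence has the form \(\big (\tfrac{n(n-1)}{2}+\lambda \big ){\mathfrak {M}}_{n}=\lambda A^{n}-n\,{\mathfrak {M}}_{n-1}\) with \({\mathfrak {M}}_{0}=1\), which is what one obtains by multiplying the equation \(\tfrac{1}{2}(x^{2}q)''-q'=-\lambda q\) with \(q(A)=0\) by \(x^{n}\) and integrating over [0, A]; this form is our assumption and should be checked against the source. The function name `moments_by_recurrence` is hypothetical, and the value of `lam` in the demo is a placeholder: the output represents genuine moments only if `lam` equals the root \(\lambda _A\) of Eq. (8), which this sketch does not compute.

```python
def moments_by_recurrence(A, lam, n_max):
    """Iterate the assumed moment recurrence forward from M_0 = 1.

    A     : absorbing level (A > 0)
    lam   : value of lambda; must be the root lambda_A of Eq. (8)
            for the output to be actual quasi-stationary moments
    n_max : highest moment order to compute
    Returns the list [M_0, M_1, ..., M_{n_max}].
    """
    moments = [1.0]
    for n in range(1, n_max + 1):
        numerator = lam * A**n - n * moments[-1]
        denominator = n * (n - 1) / 2 + lam
        moments.append(numerator / denominator)
    return moments


if __name__ == "__main__":
    # Placeholder inputs chosen only to exercise the arithmetic.
    print(moments_by_recurrence(A=4.0, lam=0.5, n_max=3))
```

For \(n=1\) the assumed recurrence reduces to \({\mathfrak {M}}_{1}=A-1/\lambda \), so positivity of the first moment is equivalent to \(\lambda >1/A\).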
Lemma 1
For every \(A>0\) fixed, the solution \(\{{\mathfrak {M}}_{n}\}_{n\ge 0}\) to the recurrence (12) is given by
where \(\lambda \equiv \lambda _A\;(>0)\) is determined by (8) while \(\xi (\lambda )\) is defined in (9); recall also that \({}_{2}F_{2}[z]\) denotes the generalized hypergeometric function (6).
Proof
The idea is to first rewrite (12) equivalently as
and then substitute \({\mathfrak {M}}_{n}\) of the form
where m(n, A) is the new unknown. After some elementary algebra this gives
which can be recognized as a particular case of the contiguous function identity
that the function \({}_{2}F_{2}[z]\) defined in (6) is known to satisfy: it suffices to set
and observe directly from (6) that
for any appropriate \(a_2\), \(b_1\) and \(b_2\). \(\square \)
It is clear that the obtained formula (13) is symmetric with respect to \(\xi (\lambda )\), as it should be, by Remark 1. More importantly, since one of the numeratorial parameters of the function \({}_{2}F_{2}[z]\) on the right of (13) is from the set \(\{0,-1,-2,-3,\ldots \}\), the power series buried inside the generalized hypergeometric function terminates, so that \({\mathfrak {M}}_{n}\) ends up being a polynomial of degree n in A. However, the coefficients of the polynomial do depend on \(\lambda \equiv \lambda _A\), and since the latter is connected to A via the transcendental equation (8), the actual nature of dependence of \({\mathfrak {M}}_{n}\) on A is more complicated than polynomial. Specifically, from (5), (6), and the identity
as given, e.g., by (Srivastava and Karlsson 1985, Formula (10), p. 17), we readily obtain
whence
and subsequently, in view of (13), we finally find
where again \(\lambda \equiv \lambda _A\;(>0)\) is determined by (8) and \(\xi (\lambda )\) is defined in (9); this formula is also invariant with respect to the sign of \(\xi (\lambda )\).
Let us now briefly contrast the two obtained formulae (13) and (15). To this end, observe first that formula (15) is more explicit than formula (13): unlike the latter, the former is free of special functions, and can thus provide more insight into the relationship between \({\mathfrak {M}}_{n}\) and A. A better understanding of this relationship can, in turn, shed more light on the relationship between \(\lambda \equiv \lambda _A\) and A, an important question difficult to answer by direct analysis of the transcendental equation (8) connecting the two. For example, from (15) and the trivial observation that \({\mathfrak {M}}_{n}>0\) for all n we readily obtain
whence
so that \(\lambda _{A}=A^{-1}+O(A^{-3/2})\); cf. Polunchenko (2017c). For applications of this result in quickest change-point detection see (Polunchenko 2017a, b). Similarly, since the quasi-stationary distribution is supported on the interval [0, A], we may further deduce that \({\mathfrak {M}}_{n}\le A^{i}\, {\mathfrak {M}}_{n-i}\) for any \(i\in \{0,1,\ldots ,n\}\) and \(n\in {\mathbb {N}}\cup \{0\}\). For \(n=2\) and \(i=1\), after some elementary algebra, this leads to the lower-bound
which clearly improves the left half of the double inequality (16). By “playing around” with the moments more, one can tighten up the lower- and upper-bounds for \(\lambda _{A}\) even further, although every such improvement will come at the price of increased complexity of the bounds. That said, the bounds will remain fully amenable to numerical evaluation. See (Polunchenko 2017c) for very accurate high-order bounds.
On the other hand, formula (13) is more convenient than formula (15) to implement in software, especially in Wolfram Mathematica with its excellent special functions capabilities. To illustrate this point, we implemented formula (13) in a Mathematica script, and used the script to produce Figures 1 and 2 which show the behavior of \({\mathfrak {M}}_{n}\) as a function of A with n fixed and as a function of n with A fixed, respectively; note the different ordinate scales in the figures. Figures 1a–f make it clear that if n is fixed, then \({\mathfrak {M}}_{n}\) is an increasing function of A, concave for \(n=1\) and convex otherwise. Given the definition of \({\mathfrak {M}}_{n}\), the increasing nature of its dependence on A is in alignment with one’s intuition. The concavity of the \({\mathfrak {M}}_{n}\)-vs-A curve for \(n=1\) and its convexity for \(n\ge 2\) is due to the aforementioned asymptotics \(\lambda _A=A^{-1}+O(A^{-3/2})\), implying \(\lim _{A\rightarrow +\infty }\big (\lambda _{A}\,A\big )=1\) but \(\lim _{A\rightarrow +\infty }\big (\lambda _{A}^{1+\kappa }A\big )=0\) for any \(\kappa >0\); cf. (Polunchenko 2017a, c). The dependence of \({\mathfrak {M}}_{n}\) on n for a fixed A has its nuances too: as can be seen from Figures 2a–f, if A is sufficiently small (around 1 or less), then \({\mathfrak {M}}_n\) is a decreasing function of n, and otherwise \({\mathfrak {M}}_{n}\) is an increasing function of n. This is essentially because \(f(x)\,{:}{=}\,a^{x}\) with \(a>0\) is an increasing function of x for \(a>1\), and is a decreasing function for \(a\in (0,1)\). It is also noteworthy that the rate of growth (or, correspondingly, the rate of decay) of \({\mathfrak {M}}_{n}\) as a function of n with A fixed or as a function of A with n fixed (at 2 or higher) is rather steep: a visual inspection of Figures 1b–f and Figures 2a–f suggests that it is at least exponential, and that the rate increases with the (fixed) value of n or A.
However, as we shall see below, should one wish to compute the Laplace transform of the quasi-stationary distribution (10)–(11), either of the two formulae is instrumental, although one may find formula (15) to be of greater help than formula (13). The details as well as the actual computation of the Laplace transform are offered in the next subsection.
3.2 Laplace transform
We now use the moment formulae obtained above to recover the Laplace transform of the quasi-stationary distribution (4). Specifically, recall that, for each \(A>0\) fixed, the quasi-stationary pdf \(q_A(x)\) is given explicitly by (10), and since it is supported on the interval [0, A], its Laplace transform can be defined as the integral
$$\begin{aligned} {\mathscr {L}}_{Q}(s)\equiv {\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\,{:}{=}\,\int _{0}^{A}e^{-sx}\,q_A(x)\,dx,\;\;s\ge 0, \end{aligned}$$(17)
and it is connected to the quasi-stationary distribution’s moment sequence \(\{{\mathfrak {M}}_{n}\}_{n\ge 0}\), given either by (13) or by (15), via the standard identity
$$\begin{aligned} {\mathfrak {M}}_{n}=(-1)^{n}\left[ \dfrac{d^{n}}{ds^{n}}{\mathscr {L}}_{Q}(s)\right] _{s=0},\;\;n\in {\mathbb {N}}\cup \{0\}, \end{aligned}$$(18)
leading to the classical power series representation of the Laplace transform
$$\begin{aligned} {\mathscr {L}}_{Q}(s)=\sum _{n=0}^{\infty }\dfrac{(-s)^{n}}{n!}\,{\mathfrak {M}}_{n}, \end{aligned}$$(19)
which is nothing but the Taylor expansion of \({\mathscr {L}}_{Q}(s)\) around the origin. It is this expansion, rather than definition (17), that we intend to employ shortly to compute \({\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\), although with some restrictions on s and A. The reason to prefer (19) along with (13) and (15) over (17) and (10) is the presence of the Whittaker W function on the right of the quasi-stationary pdf formula (10): the Whittaker W function is a special function whose direct integration as in (17) is unlikely to be an option, for existing handbooks of special functions appear to offer no suitable integral identities. By contrast, the power series (19) and the explicit moment formulae (13) and (15) provide a more straightforward way to recover \({\mathscr {L}}_{Q}(s)\). However, one should keep in mind that the domain of convergence of the series need not be as large as the region of convergence of the integral (17) defining \({\mathscr {L}}_{Q}(s)\).
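The moment-to-transform route can be illustrated on a toy example. The Python sketch below sums the series \(\sum _{n\ge 0}\frac{(-s)^{n}}{n!}\,m_n\) for a given finite list of moments \(m_0,m_1,\ldots \) and compares the result with a known transform. As a stand-in for the quasi-stationary distribution we use the uniform distribution on [0, A], whose moments \(A^{n}/(n+1)\) and Laplace transform \((1-e^{-sA})/(sA)\) are elementary; this substitution is ours and is made only because it avoids solving Eq. (8). The function name `laplace_from_moments` is likewise a hypothetical helper.

```python
import math


def laplace_from_moments(moments, s):
    """Truncated power series sum_n (-s)^n / n! * moments[n]."""
    total = 0.0
    coeff = 1.0  # holds (-s)^n / n!, starting at n = 0
    for n, m in enumerate(moments):
        total += coeff * m
        coeff *= -s / (n + 1)
    return total


if __name__ == "__main__":
    A, s = 2.0, 1.5
    # Stand-in distribution: uniform on [0, A], with n-th moment A^n / (n + 1).
    uniform_moments = [A**n / (n + 1) for n in range(60)]
    series_value = laplace_from_moments(uniform_moments, s)
    exact_value = (1 - math.exp(-s * A)) / (s * A)
    print(series_value, exact_value)
```

For a distribution supported on [0, A] the moments are bounded by \(A^{n}\), so the terms of the series are dominated by \((sA)^{n}/n!\); in floating-point arithmetic, however, the alternating signs cause cancellation when \(sA\) is large, so the truncated sum is best trusted for moderate \(sA\).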
Lemma 2
For every \(A>0\) fixed and finite, the Laplace transform \({\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\) of the quasi-stationary distribution (10)–(11) is given by
where \(s\in [0,+\infty )\), and \(\lambda \equiv \lambda _A\;(>0)\) is determined by (8) while \(\xi (\lambda )\) is defined in (9); recall also that the Kampé de Fériet function on the right is the one defined in (7).
Proof
If we tentatively set
to ease our notation, then together (15), (19), and (7) can be seen to yield
and the desired result is now apparent. \(\square \)
The obtained Laplace transform formula (20) was arrived at through the transform’s power series expansion (19) and the quasi-stationary distribution’s n-th moment formula (15). However, since the n-th moment also has the alternative but equivalent representation (13), the latter, too, by virtue of the power series expansion (19), can be used to obtain a (different, but equivalent) expression for the Laplace transform.
Lemma 3
For every \(A>0\) fixed and finite, the Laplace transform \({\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\) of the quasi-stationary distribution (10)–(11) is given by
where \(s\in [0,+\infty )\), and \(\lambda \equiv \lambda _A\;(>0)\) is determined by (8) while \(\xi (\lambda )\) is defined in (9); recall also that the Kampé de Fériet function on the right is the one defined in (7).
Proof
The idea is to multiply Eq. (12) through by \((-s)^n/n!\) to obtain
which, in conjunction with (19), readily gives
so that if we could now show that
then the proof would be complete. To show (22), introduce
to, again, temporarily ease the notation, and observe from (13) and (14) that
which, in view of (23), can be recognized to be exactly (22). The proof is now complete. \(\square \)
We now return to the point made earlier about the domain of convergence of the series (19) potentially being narrower than the region of convergence of the integral (17) defining \({\mathscr {L}}_{Q}(s)\). This is, in fact, the case, for the obtained Laplace transform formulae (20) and (21) both break down in the limit, as either \(A\rightarrow +\infty \) or \(s\rightarrow +\infty \). The reason is that the Kampé de Fériet function involved in either formula is well-defined only when both of its two arguments are finite. That said, except for the two limiting cases (one as \(A\rightarrow +\infty \) and one as \(s\rightarrow +\infty \)), formulae (20) and (21) are valid.
At this point one may rightly remark that the Kampé de Fériet function in general is a somewhat “exotic” special function, although its importance appears to have been well-understood in the literature on mathematical physics. To that end, an interesting question is whether the function on the right of formula (20) permits an alternative expression involving either no special functions at all, or, in the worst case, only “less exotic” special functions. While it is very unlikely that our particular function can be reduced to a form completely free of special functions, it may be possible to express it in terms of fairly widespread modified Bessel functions of the first and second kinds, conventionally denoted as \(I_{a}(z)\) and \(K_{a}(z)\), respectively. This possibility is indicated by (Miller and Moskowitz 1998, Identity (4.2a), p. 184) which states that
valid so long as \(\mathfrak {R}(1+a\pm b)>0\); the condition \(\mathfrak {R}(1+a\pm b)>0\) is to assure that the near-origin behavior of the modified Bessel I function
$$\begin{aligned} I_{a}(z)\sim \dfrac{1}{\varGamma (a+1)}\left( \dfrac{z}{2}\right) ^{a}\;\;\text {as}\;\;z\rightarrow 0,\;\;\text {for}\;\;a\not \in \{-1,-2,\ldots \}, \end{aligned}$$(25)
as given, e.g., by (Abramowitz and Stegun 1964, Property 9.6.7, p. 375), and that of the modified Bessel K function
$$\begin{aligned} K_{a}(z)\sim \dfrac{1}{2}\,\varGamma (a)\left( \dfrac{z}{2}\right) ^{-a}\;\;\text {as}\;\;z\rightarrow 0,\;\;\text {for}\;\;\mathfrak {R}(a)>0, \end{aligned}$$(26)
as given, e.g., by (Abramowitz and Stegun 1964, Property 9.6.9, p. 375), are such that the two integrals on the right of (24), i.e., the integrals
are convergent, for any \(y\in [0,+\infty )\); cf. Miller and Moskowitz (1991). Incidentally, the foregoing two integrals are examples of incomplete Weber integrals, which arise in mathematical physics and in certain areas of probability theory (see, e.g. Miller and Moskowitz 1991, 1998).
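The near-origin behavior of the modified Bessel I function invoked above, namely \(I_{a}(z)\sim (z/2)^{a}/\varGamma (a+1)\) as \(z\rightarrow 0\) (Abramowitz and Stegun 1964, Property 9.6.7), can be checked numerically with a short Python sketch based on the standard ascending series \(I_{a}(z)=\sum _{k\ge 0}\frac{(z/2)^{2k+a}}{k!\,\varGamma (k+a+1)}\). The function names `bessel_i` and `leading_term` are our own, the number of retained terms is an arbitrary choice, and the sketch is restricted to real order \(a>-1\) and real \(z>0\); it therefore does not cover the purely imaginary orders that arise when A is small.

```python
import math


def bessel_i(a, z, terms=60):
    """Modified Bessel function I_a(z) via its ascending power series.

    Illustrative only: real order a > -1 and real z > 0 are assumed,
    and the series is simply truncated after `terms` summands.
    """
    return sum(
        (z / 2) ** (2 * k + a) / (math.factorial(k) * math.gamma(k + a + 1))
        for k in range(terms)
    )


def leading_term(a, z):
    """Small-argument approximation (z/2)^a / Gamma(a + 1)."""
    return (z / 2) ** a / math.gamma(a + 1)


if __name__ == "__main__":
    a = 0.5
    for z in (1.0, 1e-1, 1e-2, 1e-3):
        # The ratio should approach 1 as z decreases.
        print(z, bessel_i(a, z) / leading_term(a, z))
```

The relative error of the leading term is of order \((z/2)^{2}/(a+1)\), which is the first neglected summand of the series relative to the retained one.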
It is plain to see that the Kampé de Fériet function on the left of identity (24) with \(a=-2\) and \(b=\xi (\lambda )\) is of precisely the same form as the Kampé de Fériet function on the right of the Laplace transform formula (20). However, identity (24) with \(a=-2\) and \(b=\xi (\lambda )\), which is the case we are interested in, does not hold true. There are two reasons for this. First, the condition \(\mathfrak {R}(1+a\pm b)>0\) is false for \(a=-2\) and \(b=\xi (\lambda )\), because \(\xi (\lambda )\), as was explained in Remark 2, is either purely imaginary (so that \(\mathfrak {R}(b)=0\)) or purely real and between 0 inclusive and 1 exclusive (so that \(0\le b<1\)). The second reason is that, in our case, the parameter \(b=\xi (\lambda )\) happens to be connected (and in a very specific manner!) to the first argument of the Kampé de Fériet function; the connection is through Eq. (8). Yet, although not directly applicable in our case, identity (24) is still of value: observe that its right-hand side resembles the variation of parameters formula for a particular solution to a second-order nonhomogeneous ordinary differential equation. Moreover, this equation is not too difficult to “reverse engineer”. To this end, it can be deduced from Polunchenko (2017c) that, for every \(A>0\) fixed, the Laplace transform \({\mathscr {L}}_{Q}(s)\equiv {\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\) defined in (17) is the solution \(L(s)\equiv L(s,A)\) of the equation
where recall that \(\lambda \equiv \lambda _{A}\;(>0)\) and A are coupled together via Eq. (8). As we shall see shortly, the right-hand side of identity (24) with \(a=-2\) and \(b=\xi (\lambda )\) is precisely what the method of variation of parameters yields as a particular solution to the foregoing Eq. (27). However, this particular solution is not the solution, because it does not satisfy the appropriate boundary conditions, which are \(\lim _{s\rightarrow 0+}L(s)=1\), \(\lim _{s\rightarrow +\infty }L(s)=0\), and
where \({\mathfrak {M}}_{n}\) is the n-th moment of the quasi-stationary distribution; recall formulae (13) and (15) we established for \({\mathfrak {M}}_{n}\) in the preceding subsection. The first two of the boundary conditions come from the definition (17) of the Laplace transform, and the third condition is due to (18).
To solve Eq. (27) directly, observe that the change of variables \(s\mapsto u\equiv u(s)\,{:}{=}\,2\sqrt{2s}\) and the substitution \(L(s)\mapsto L(u)\,{:}{=}\,u\, \ell (u)\) together convert the equation into
which is a nonhomogeneous version of the modified Bessel equation. Hence, by definition, the two fundamental solutions, \(\ell ^{(1)}(u)\) and \(\ell ^{(2)}(u)\), to the homogeneous version of the equation are
which can be used to construct a particular solution, \(\ell ^{\mathrm {(p)}}(u)\), to the nonhomogeneous equation via variation of parameters. Specifically, since the Wronskian between \(I_{a}(z)\) and \(K_{a}(z)\) is
as given, e.g., by (Abramowitz and Stegun 1964, Formula 9.6.15, p. 375), the basic variation of parameters formula asserts, after some calculation, that the function
when defined, solves the nonhomogeneous equation (29). Parenthetically, it is worth noting that, just as the Laplace transform \({\mathscr {L}}_Q(s)\) should be, by definition (17) and Remark 1, the above function \(\ell ^{\mathrm {(p)}}(u)\) is, too, an even function of \(\xi (\lambda )\), because
$$\begin{aligned} K_{a}(z)=\dfrac{\pi }{2}\,\dfrac{I_{-a}(z)-I_{a}(z)}{\sin (\pi a)}, \end{aligned}$$(30)
as given, e.g., by (Abramowitz and Stegun 1964, Identity 9.6.2, p. 375).
The problem now is to understand whether the two indefinite integrals involved in the above function \(\ell ^{\mathrm {(p)}}(u)\) can be turned into convergent definite integrals, so that the result is a well-defined function that still satisfies the nonhomogeneous equation (29). To that end, it can be gleaned, e.g., from (Bateman and Erdélyi 1953a, p. 99), that
which, in conjunction with Remark 2, enables one to see that the integrals
are both convergent for any \(z>0\), but divergent for \(z=0\). As a result, one can conclude that the function
is a well-defined, valid particular solution to Eq. (29); note the similarity of \(\ell ^{\mathrm {(p)}}(u)\) to the right-hand side of identity (24).
We are now in a position to claim that the general solution to Eq. (27) is of the form
where \(C_1\) and \(C_2\) are arbitrary constants, each independent of s, but possibly dependent on A. The only question left to be considered is that of “pinning down” the two constants \(C_1\) and \(C_2\) so as to make the foregoing L(s) satisfy the necessary boundary conditions.
With regard to fitting the boundary conditions, let us first examine the behavior of L(s) given by (32) in the limit as \(s\rightarrow 0+\). To that end, from the small-argument asymptotics (25) of the modified Bessel I function, and the derivative formula
as given, e.g., by (Gradshteyn and Ryzhik 2014, Identity 8.486.4, p. 937), we obtain
where equality \((*)\) is due to L’Hôpital’s rule, applicable because the corresponding integral of the modified Bessel K function is divergent when the lower limit of integration is zero.
Likewise, from the small-argument asymptotics (26) of the modified Bessel K function, its symmetry with respect to the order, i.e., \(K_{a}(z)=K_{-a}(z)\), trivially implied by (30), and the derivative formula
as given, e.g., by (Gradshteyn and Ryzhik 2014, Identity 8.486.12, p. 938), we obtain
where we again used L’Hôpital’s rule, applicable because the corresponding integral of the modified Bessel I function is divergent when the lower limit of integration is zero.
Next, from the foregoing two limits (34) and (36), and (9) we obtain
whence, recalling again (25) and (26), one finds that L(s) given by (32) converges to unity as \(s\rightarrow 0+\) for any choice of \(C_1\) and \(C_2\). Put another way, the boundary condition \(\lim _{s\rightarrow 0+}L(s)=1\) places no restriction on the two constants.
Let us switch attention to the behavior of L(s) for large values of s. To that end, from (31) and (35) we obtain
and
so that the limit of L(s) given by (32) as \(s\rightarrow +\infty \) can now be seen to be infinite if \(C_1\ne 0\), or 0 if \(C_1=0\). Hence, with \(C_1\) set to 0, our function L(s) simplifies down to
where \(C_2\) is still to be found.
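The reason \(C_1\) must vanish can be illustrated numerically. The Python sketch below assumes, in line with the substitution \(L(s)=u\,\ell (u)\) with \(u=2\sqrt{2s}\) introduced earlier, that the two homogeneous solutions entering L(s) behave like \(u\,I_{a}(u)\) and \(u\,K_{a}(u)\); the order \(a=1/2\) is a hypothetical choice made only for illustration, and the helper functions are not part of the paper. The first quantity grows without bound as s increases, whereas the second decays to zero.

```python
import math


def bessel_i(order, z, terms=80):
    """Modified Bessel function I_order(z) from its ascending power series."""
    return sum(
        (z / 2.0) ** (2 * k + order)
        / (math.factorial(k) * math.gamma(k + order + 1.0))
        for k in range(terms)
    )


def bessel_k(order, z, upper=12.0, steps=2400):
    """Modified Bessel function K_order(z), z > 0, from the integral
    K_order(z) = int_0^inf exp(-z cosh t) cosh(order t) dt (trapezoidal rule)."""
    step = upper / steps
    total = 0.5 * math.exp(-z)  # the t = 0 endpoint carries half weight
    for j in range(1, steps + 1):
        t = j * step
        total += math.exp(-z * math.cosh(t)) * math.cosh(order * t)
    return total * step


def scaled_i(order, s):
    """u * I_order(u) with u = 2 sqrt(2 s): the growing homogeneous term."""
    u = 2.0 * math.sqrt(2.0 * s)
    return u * bessel_i(order, u)


def scaled_k(order, s):
    """u * K_order(u) with u = 2 sqrt(2 s): the decaying homogeneous term."""
    u = 2.0 * math.sqrt(2.0 * s)
    return u * bessel_k(order, u)


order = 0.5  # hypothetical order, for illustration only
for s in (0.5, 5.0, 25.0):
    print(f"s = {s:5.1f}   u*I = {scaled_i(order, s):.6e}   u*K = {scaled_k(order, s):.6e}")
```

The power series used for `bessel_i` is adequate only for moderate arguments (here \(u\le 15\) or so); a production computation would switch to an asymptotic expansion for larger u.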
To “pin down” \(C_2\) one may invoke (28) for any one value of \(n\in {\mathbb {N}}\). The easiest choice is \(n=1\), so that, in view of (15), we obtain
which is the condition we now impose on L(s) given by (37), so as to obtain an equation from which \(C_2\) can be recovered.
To find dL(s) / ds, first recall the symmetry \(K_{a}(z)=K_{-a}(z)\), and then use (33) and (35) together with integration by parts to establish the indefinite integral identities
and
so that
and
which is sufficient to compute the limit of dL(s) / ds as \(s\rightarrow 0+\). Specifically, from (38), after quite a bit of algebra involving repeated use of (25) and (26), we find that
which can be brought to a more explicit form by appealing to (Gradshteyn and Ryzhik 2014, Identity 6.643.2, p. 716), i.e., the definite integral
valid for \(\mathfrak {R}(\kappa +a+1/2)>0\); recall that \(M_{a,b}(z)\) here denotes the Whittaker M function. The foregoing definite integral immediately gives
and
so that
where we also used the factorial property of the Gamma function \(\varGamma (z+1)=z\,\varGamma (z)\). Now, from (Abramowitz and Stegun 1964, Identity 13.4.28, p. 507), i.e., the identity
we find at once that
whence
which is equivalent to
because of (Abramowitz and Stegun 1964, Identity 13.4.29, p. 507), i.e., the recurrence
whereby
Next, since the Wronskian between \(M_{a,b}(z)\) and \(W_{a,b}(z)\) is
as given, e.g., by (Slater 1960, Formula (2.4.27), p. 26), and because
as given, e.g., by (Slater 1960, Formula (2.4.21), p. 25), it follows that
where we also appealed to Eq. (8).
Putting all of the above together, we can finally conclude that
which is precisely the normalizing factor in the quasi-stationary distribution’s formulae (10) and (11).
We have now solved the differential equation (27) and obtained yet another representation of the Laplace transform \({\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\) of the quasi-stationary distribution (4).
Lemma 4
For every \(A>0\) fixed, the Laplace transform \({\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\) of the quasi-stationary distribution (10)–(11) is given by
where \(s\ge 0\), and \(\lambda \equiv \lambda _{A}\;(>0)\) is determined by (8) while \(\xi (\lambda )\) is defined in (9); recall also that \(W_{a,b}(z)\) denotes the Whittaker W function, and \(I_{a}(z)\) and \(K_{a}(z)\) denote the modified Bessel functions of the first and second kinds, respectively.
Yet again, from the symmetry of the Whittaker W with respect to the second index, i.e., \(W_{a,b}(z)=W_{a,-b}(z)\), one can see that, just like formulae (20) and (21) obtained earlier, the new formula (39) is also symmetric with respect to \(\xi (\lambda )\), as it should be, by definition (17) and Remark 1. However, unlike formulae (20) and (21), the new formula (39) is not only free of the Kampé de Fériet function, but more importantly, is also valid even in the limit, as \(A\rightarrow +\infty \) or as \(s\rightarrow +\infty \). While the (trivial) limit as \(s\rightarrow +\infty \) is of little interest, the (nontrivial) limit as \(A\rightarrow +\infty \) does merit some consideration, especially in the context of quickest change-point detection, as previously elucidated by Pollak and Siegmund (1985). To that end, since
which was observed previously in (Polunchenko 2017c, p. 139) as an implication of the limits \(\lim _{A\rightarrow +\infty }\lambda _{A}=0\) and \(\lim _{A\rightarrow +\infty }\xi (\lambda _{A})=1\), it can be shown directly from (39) with the aid of (25) that
for every \(s\ge 0\) fixed. However, in view of (Bateman and Erdélyi 1953b, Identity (24), p. 82), i.e., the identity
the function \({\mathscr {L}}_{H}(s)\,{:}{=}\,2\sqrt{2s}\,K_{1}(2\sqrt{2s})\) can be recognized to be the Laplace transform of the Shiryaev diffusion’s stationary distribution defined in (2) and given explicitly by (3). That is, for every \(s\ge 0\) fixed, the limit of \({\mathscr {L}}_{Q}\{q_A(x);x\rightarrow s\}(s,A)\) as \(A\rightarrow +\infty \) is precisely \({\mathscr {L}}_{H}(s)\), and, therefore, the stationary distribution (3) is the limit of the quasi-stationary distribution (10)–(11) as \(A\rightarrow +\infty \). This convergence of distributions (for a more general family of stochastically monotone processes) was previously established by Pollak and Siegmund (1985, 1986), although through an entirely different approach and with no explicit formulae.
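The identification of \({\mathscr {L}}_{H}(s)\) can also be checked by direct quadrature. The Python sketch below assumes that the stationary density in (3) is \(h(x)=2x^{-2}e^{-2/x}\), \(x\ge 0\) (the form known from Pollak and Siegmund 1985, but not restated in this section), and compares a numerically computed Laplace transform of h with \(2\sqrt{2s}\,K_{1}(2\sqrt{2s})\). The function names and the quadrature settings are illustrative choices, not part of the paper.

```python
import math


def bessel_k(order, z, upper=12.0, steps=2400):
    """Modified Bessel function K_order(z), z > 0, from the integral
    K_order(z) = int_0^inf exp(-z cosh t) cosh(order t) dt (trapezoidal rule)."""
    step = upper / steps
    total = 0.5 * math.exp(-z)  # the t = 0 endpoint carries half weight
    for j in range(1, steps + 1):
        t = j * step
        total += math.exp(-z * math.cosh(t)) * math.cosh(order * t)
    return total * step


def stationary_laplace_numeric(s, lower=-10.0, upper=45.0, steps=11000):
    """int_0^inf exp(-s x) h(x) dx for the ASSUMED density h(x) = 2 x^-2 exp(-2/x),
    computed with the substitution x = exp(t) and the trapezoidal rule."""
    step = (upper - lower) / steps
    total = 0.0
    for j in range(steps + 1):
        x = math.exp(lower + j * step)
        weight = 0.5 if j in (0, steps) else 1.0
        # h(x) dx = 2 x^-2 exp(-2/x) * x dt = (2/x) exp(-2/x) dt
        total += weight * math.exp(-s * x - 2.0 / x) * 2.0 / x
    return total * step


def stationary_laplace_closed(s):
    """Closed form 2 sqrt(2 s) K_1(2 sqrt(2 s)), for s > 0."""
    u = 2.0 * math.sqrt(2.0 * s)
    return u * bessel_k(1.0, u)


for s in (0.1, 0.7, 3.0):
    print(s, stationary_laplace_numeric(s), stationary_laplace_closed(s))
print("total mass:", stationary_laplace_numeric(0.0))
```

The last line evaluates the transform at \(s=0\), which should return the total mass of the density, i.e., unity; the closed form itself is evaluated only for \(s>0\) and is understood as a limit at \(s=0\).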
We conclude with an admission that, in our derivation of the Laplace transform formula (39), we actually had to “cut some corners”. Strictly speaking, by Remark 2, we should have considered separately three different cases: (1) \(A<\tilde{A}\approx 10.240465\) so that \(\lambda _{A}>1/8\) and \(\xi (\lambda )\) is purely imaginary; (2) \(A=\tilde{A}\approx 10.240465\) so that \(\lambda _{A}=1/8\) and \(\xi (\lambda )=0\); and (3) \(A>\tilde{A}\approx 10.240465\) so that \(\lambda _{A}<1/8\) and \(\xi (\lambda )\) is purely real and strictly between 0 and 1. However, for lack of space, we only attended to the third case. The three cases must be distinguished because the asymptotics of the modified Bessel I and K functions are highly order-dependent, and, in our specific situation, the order of either function is determined entirely by \(\xi (\lambda )\), which is either purely imaginary, or zero, or purely real and strictly between 0 and 1. For example, the limits (34) and (36) are clearly false when \(\xi (\lambda )=0\). Nevertheless, the end-result, viz. formula (39), is valid in all three cases.
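The case distinction can be made concrete with a short Python sketch. It assumes that \(\xi (\lambda )\) defined in (9) has the form \(\xi (\lambda )=\sqrt{1-8\lambda }\); this form is consistent with the three cases listed above and with the limit \(\xi (\lambda _{A})\rightarrow 1\) as \(\lambda _{A}\rightarrow 0\), but it is not restated in this section. The function names and the tolerance are illustrative.

```python
import cmath


def xi(lam):
    """ASSUMED form of xi(lambda) = sqrt(1 - 8 lambda); complex-valued in general."""
    return cmath.sqrt(1.0 - 8.0 * lam)


def regime(lam, tol=1e-12):
    """Classify lambda > 0 into the three cases distinguished in the text."""
    discriminant = 1.0 - 8.0 * lam
    if discriminant < -tol:
        return "imaginary"  # case (1): lambda > 1/8
    if discriminant > tol:
        return "real"  # case (3): lambda < 1/8, xi strictly between 0 and 1
    return "zero"  # case (2): lambda = 1/8


for lam in (0.2, 0.125, 0.05):
    print(lam, regime(lam), xi(lam))
```

Mapping a given threshold A to \(\lambda _{A}\) requires solving Eq. (8), which is outside the scope of this sketch.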
4 Concluding remarks
It is generally rare that quasi-stationary distributions or characteristics thereof lend themselves to explicit analytic evaluation. Furthermore, in the rare cases one can recover the distribution itself or its characteristics analytically, the result is usually of limited use, for the corresponding formulae, though explicit, are typically rather complex and involve special functions (or, worse yet, exotic special functions). This work, as a continuation of Polunchenko (2017c) and Polunchenko et al. (2018), presented an example of a situation where the distribution itself, its Laplace transform as well as the entire moment series are all obtainable analytically and in closed-form, despite the presence of special functions in all of the calculations. It is our hope that the special functions calculus heavily used in this work will aid further research on stochastic processes, an area where special functions (including those appearing in this paper) arise routinely.
References
Abramowitz M, Stegun I (eds) (1964) Handbook of mathematical functions with formulas, graphs, and mathematical tables. Applied mathematics series, vol 55, 10th edn. United States National Bureau of Standards, Gaithersburg
Avram F, Leonenko NN, Šuvak N (2013) On spectral analysis of heavy-tailed Kolmogorov–Pearson diffusions. Markov Process Relat Fields 19(2):249–298
Bateman H, Erdélyi A (1953a) Higher transcendental functions, vol 1. McGraw-Hill, New York
Bateman H, Erdélyi A (1953b) Higher transcendental functions, vol 2. McGraw-Hill, New York
Buchholz H (1969) The confluent hypergeometric function. Springer tracts in natural philosophy, vol 15. Springer, New York. Translated from German by H. Lichtblau and K. Wetzel
Burnaev EV, Feinberg EA, Shiryaev AN (2009) On asymptotic optimality of the second order in the minimax quickest detection problem of drift change for Brownian motion. Theory Probab Appl 53(3):519–536. https://doi.org/10.1137/S0040585X97983791
Collet P, Martinez S, San Martin J (2013) Quasi-stationary distributions: Markov chains, diffusions and dynamical systems. Probability and its applications. Springer, New York
Comtet A, Monthus C (1996) Diffusion in a one-dimensional random medium and hyperbolic Brownian motion. J Phys A Math Gen 29(7):1331–1345. https://doi.org/10.1088/0305-4470/29/7/006
Donati-Martin C, Ghomrasni R, Yor M (2001) On certain Markov processes attached to exponential functionals of Brownian motion: Application to Asian options. Rev Mat Iberoam 17(1):179–193. https://doi.org/10.4171/RMI/292
Dufresne D (2001) The integral of geometric Brownian motion. Adv Appl Probab 33(1):223–241. https://doi.org/10.1239/aap/999187905
Feinberg EA, Shiryaev AN (2006) Quickest detection of drift change for Brownian motion in generalized Bayesian and minimax settings. Stat Decis 24(4):445–470. https://doi.org/10.1524/stnd.2006.24.4.445
Kampé de Fériet J (1921) Les fonctions hypergéométriques d’ordre supérieur, à deux variables. In: Comptes rendus hebdomadaires des séances de l’Académie des sciences, vol 173, pp 401–404. Paris. In French
Geman H, Yor M (1993) Bessel processes, Asian options, and perpetuities. Math Financ 3(4):349–375. https://doi.org/10.1111/j.1467-9965.1993.tb00092.x
Gradshteyn IS, Ryzhik IM (2014) Table of integrals, series, and products, 8th edn. Academic Press, Amsterdam
Lavoie J, Grondin F (1994) The Kampé-de Fériet functions: A family of reduction formulas. J Math Anal Appl 186(2):393–401. https://doi.org/10.1006/jmaa.1994.1307
Linetsky V (2004) Spectral expansions for Asian (average price) options. Oper Res 52(6):856–867. https://doi.org/10.1287/opre.1040.0113
Miller AR, Moskowitz IS (1991) Incomplete Weber integrals of cylindrical functions. J Franklin Inst 328(4):445–457. https://doi.org/10.1016/0016-0032(91)90019-Y
Miller AR, Moskowitz IS (1998) On certain generalized incomplete Gamma functions. J Comput Appl Math 91(2):179–190. https://doi.org/10.1016/S0377-0427(98)00031-4
Monthus C, Comtet A (1994) On the flux distribution in a one dimensional disordered system. J Phys I Fr 4(5):635–653. https://doi.org/10.1051/jp1:1994167
Peskir G (2006) On the fundamental solution of the Kolmogorov–Shiryaev equation. In: Kabanov Y, Liptser R, Stoyanov J (eds) From stochastic calculus to mathematical finance: the Shiryaev Festschrift. Springer, Berlin, pp 535–546. https://doi.org/10.1007/978-3-540-30788-4_26
Pollak M, Siegmund D (1985) A diffusion process and its applications to detecting a change in the drift of Brownian motion. Biometrika 72(2):267–280. https://doi.org/10.1093/biomet/72.2.267
Pollak M, Siegmund D (1986) Convergence of quasi-stationary to stationary distributions for stochastically monotone Markov processes. J Appl Probab 23(1):215–220. https://doi.org/10.2307/3214131
Polunchenko AS (2016) Exact distribution of the Generalized Shiryaev–Roberts stopping time under the minimax Brownian motion setup. Sequ Anal 35(1):108–143. https://doi.org/10.1080/07474946.2016.1132066
Polunchenko AS (2017a) Asymptotic exponentiality of the first exit time of the Shiryaev–Roberts diffusion with constant positive drift. Sequ Anal 36(3):370–383. https://doi.org/10.1080/07474946.2017.1360089
Polunchenko AS (2017b) Asymptotic near-minimaxity of the randomized Shiryaev–Roberts–Pollak change-point detection procedure in continuous time. Theory Probab Appl 64(4):769–786. https://doi.org/10.4213/tvp5142
Polunchenko AS (2017c) On the quasi-stationary distribution of the Shiryaev–Roberts diffusion. Sequ Anal 36(1):126–149. https://doi.org/10.1080/07474946.2016.1275512
Polunchenko AS, Martínez S, San Martín J (2018) A note on the quasi-stationary distribution of the Shiryaev martingale on the positive half-line. Theory Probab Appl 63(3) (in press)
Polunchenko AS, Sokolov G (2016) An analytic expression for the distribution of the Generalized Shiryaev–Roberts diffusion: The Fourier spectral expansion approach. Methodol Comput Appl Probab 18(4):1153–1195. https://doi.org/10.1007/s11009-016-9478-7
Schröder M (2003) On the integral of geometric Brownian motion. Adv Appl Probab 35(1):159–183. https://doi.org/10.1239/aap/1046366104
Shiryaev AN (1961) The problem of the most rapid detection of a disturbance in a stationary process. Soviet Mathematics—Doklady, vol 2, pp 795–799 (Translated from Dokl. Akad. Nauk SSSR 138:1039–1042, 1961)
Shiryaev AN (1963) On optimum methods in quickest detection problems. Theory Probab Appl 8(1):22–46. https://doi.org/10.1137/1108002
Shiryaev AN (2002) Quickest detection problems in the technical analysis of the financial data. In: Geman H, Madan D, Pliska SR, Vorst T (eds) Mathematical Finance—Bachelier Congress 2000. Springer Finance. Springer, Berlin, pp 487–521. https://doi.org/10.1007/978-3-662-12429-1_22
Slater LJ (1960) Confluent Hypergeometric Functions. Cambridge University Press, Cambridge
Srivastava HM, Karlsson PW (1985) Multiple Gaussian Hypergeometric Series. Halsted Press, New York
Whittaker ET (1904) An expression of certain known functions as generalized hypergeometric functions. Bull Am Math Soc 10(3):125–134
Wong E (1964) The construction of a class of stationary Markoff processes. In: Bellman R (ed) Stochastic Processes in Mathematical Physics and Engineering. American Mathematical Society, Providence, pp 264–276
Yor M (1992) On some exponential functionals of Brownian motion. Adv Appl Probab 24(3):509–531. https://doi.org/10.2307/1427477
Acknowledgements
The authors would like to thank the two anonymous referees for the careful reading of the manuscript and pertinent comments; the referees’ constructive feedback helped substantially improve the quality of this work and shape its final form.
Additional information
The effort of A.S. Polunchenko was partially supported by the Simons Foundation via a Collaboration Grant in Mathematics under Award # 304574. The work of A. Pepelyshev was partially supported by the Russian Foundation for Basic Research under Projects ## 17-01-00267-a and 17-01-00161-a.
Cite this article
Polunchenko, A.S., Pepelyshev, A. Analytic moment and Laplace transform formulae for the quasi-stationary distribution of the Shiryaev diffusion on an interval. Stat Papers 59, 1351–1377 (2018). https://doi.org/10.1007/s00362-018-1019-8
DOI: https://doi.org/10.1007/s00362-018-1019-8
Keywords
- Laplace transform
- Markov diffusions
- Quasi-stationarity
- Shiryaev process
- Special functions
- Stochastic processes