1 Introduction

The family of generalized Poisson (GP) distributions [1] has been used for more than 40 years to model count data that may be overdispersed or underdispersed. Its interesting theoretical properties include a Poisson mixture interpretation [2] and a heavier tail than the negative binomial distribution [2, 3]. Various chance mechanisms have been found to generate the GP distribution [4]. Numerous applications are given in [5]. In bioinformatics, the GP distribution has been used for modeling RNA-Seq count data [6,7,8,9,10].

The negative binomial (NB) distribution has a long history of use in the analysis of biological count data [11,12,13]. In bioinformatics, it is the primary model for RNA-Seq count data and forms the basis of statistical tests of differential gene expression [14,15,16,17].

Low et al. [10] pointed out that the observed gene counts in RNA-Seq experiments are the consequence of stochastic variation acting on true gene counts. They proposed to view the true gene count as a count parameter k. Under the GP model, the expectation of the observed gene counts given k is equal to the product of k and a hyperparameter b. Thus, the modeling focus shifts to finding the posterior distribution of k, assuming a specific prior distribution. Since posterior distributions often have complicated forms that make further analysis difficult and computationally expensive, finding appropriate approximations to them is important to improve their applied value in statistics.

We first introduce the GP model. Let X be a random variable following a \(\text{ GP }(\lambda _1, \lambda _2)\) distribution. Its probability mass function (pmf) is given by

$$\begin{aligned} P(X=x \vert \lambda _1,\lambda _2) = \frac{ \lambda _1(\lambda _1+x\lambda _2)^{x-1} e^{-(\lambda _1+x\lambda _2)} }{x!}, \end{aligned}$$
(1)

where \(x=0,1,2,\ldots \), \(\lambda _1 > 0\) and \(\max \{-1, -\frac{\lambda _1}{4}\}< \lambda _2 < 1\). Positive values of \(\lambda _2\) correspond to overdispersion, negative values to underdispersion, and \(\lambda _2 = 0\) reduces Eq. (1) to the Poisson distribution with mean \(\lambda _1\). Consider the following parametrisation: \(\lambda _2 = 1- \sqrt{m}\), \(\lambda _1 = kb\sqrt{m}\), where \(k=0,1,2,\ldots \) is the count parameter, \(0< b < 1\) is a hyperparameter, and \(m > 0\) is the mean–variance ratio. Under this parametrisation, the mean and the variance of the GP model are given by \({\text {E}}(X \vert k) = \lambda _1 / (1-\lambda _2) = k b,\) and \(\text{ Var }(X\vert k) = {\text {E}}(X\vert k) / (1-\lambda _2)^2 = k b/m\), respectively.
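As a quick numerical sanity check of this parametrisation, the following sketch evaluates Eq. (1) on the log scale and confirms \({\text {E}}(X\vert k)=kb\) and \(\text{ Var }(X\vert k)=kb/m\). The values \(k=30\), \(b=0.5\), \(m=0.64\) are purely illustrative and not taken from the paper:

```python
import math

def gp_pmf(x, lam1, lam2):
    """Generalized Poisson pmf of Eq. (1), evaluated on the log scale."""
    t = lam1 + x * lam2
    if t <= 0:
        return 0.0
    return math.exp(math.log(lam1) + (x - 1) * math.log(t) - t - math.lgamma(x + 1))

# Parametrisation from the text: lam2 = 1 - sqrt(m), lam1 = k*b*sqrt(m).
# Illustrative values (not from the paper); m < 1 gives 0 < lam2 < 1.
k, b, m = 30, 0.5, 0.64
lam2 = 1.0 - math.sqrt(m)            # 0.2
lam1 = k * b * math.sqrt(m)          # 12.0

probs = [gp_pmf(x, lam1, lam2) for x in range(600)]   # tail beyond 600 is negligible
total = sum(probs)
mean = sum(x * p for x, p in enumerate(probs))
var = sum(x * x * p for x, p in enumerate(probs)) - mean ** 2
```

Here the sums recover \(kb = 15\) and \(kb/m = 23.4375\) up to truncation error.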

Now, consider a random variable Y that has an NB distribution with parameters r (the target number of failures) and p (the probability of success), so that Y counts the number of successes observed before the rth failure. Its pmf is given by

$$\begin{aligned} P(Y=y \vert r,p)=\frac{\Gamma {(y+r)}}{\Gamma {(y+1)}\Gamma {(r)}}p^y(1-p)^r, \end{aligned}$$
(2)

for \(y=0,1,2,\ldots \). The mean and the variance of Y are given by \(\text {E}(Y \vert k)={pr}/(1-p) = kb\), and \(\text {Var}(Y \vert k)=pr/(1-p)^2 = {kb}/{m}\), respectively, where \(k = 0, 1, 2, \ldots \), \(0<b<1\), \(0<m<1\), and m is the mean–variance ratio. Thus, \(r={kbm}/(1-m)=k\tau \), where \(\tau = bm/(1-m)\).
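A matching check for the NB parametrisation (same illustrative k, b, m as chosen above for the GP model) confirms that the two models share the first two conditional moments. Here \(p = 1-m\), which follows from \(\text {Var}(Y\vert k)/\text {E}(Y\vert k) = 1/(1-p) = 1/m\):

```python
import math

def nb_pmf(y, r, p):
    """Negative binomial pmf of Eq. (2), evaluated on the log scale."""
    return math.exp(math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
                    + y * math.log(p) + r * math.log(1.0 - p))

# Illustrative values (not from the paper); p = 1 - m so that Var/E = 1/m.
k, b, m = 30, 0.5, 0.64
tau = b * m / (1.0 - m)
r, p = k * tau, 1.0 - m

probs = [nb_pmf(y, r, p) for y in range(400)]   # tail beyond 400 is negligible
total = sum(probs)
mean = sum(y * q for y, q in enumerate(probs))
var = sum(y * y * q for y, q in enumerate(probs)) - mean ** 2
```

As with the GP check, the sums recover \(kb = 15\) and \(kb/m = 23.4375\).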

We are interested in the posterior distribution of k conditioned on observations from these two models, using an improper uniform prior \(P(k)=1, k=0,1,2,\ldots \). The posterior distribution of k under a GP model (Eq. (1)) is given by

$$\begin{aligned} \begin{aligned} P(k\vert X=x)&= \frac{k(bk\sqrt{m}+x(1-\sqrt{m}))^{x-1}e^{-bk\sqrt{m}}}{\sum _{j=x}^\infty j(bj\sqrt{m}+x(1-\sqrt{m}))^{x-1}e^{-bj\sqrt{m}}} \\&= \frac{k(k+g(x))^{x-1}e^{-bk\sqrt{m}}}{\sum _{j=x}^\infty j(j+g(x))^{x-1}e^{-b\sqrt{m}j}}, \end{aligned} \end{aligned}$$
(3)

for \(k \ge x\), where \(0<m<\min \bigl \{{(1-b)^{-2}} ,4 \bigr \} \) and \(g(x) = \{(1-\sqrt{m}) /(b\sqrt{m})\}x\). Then, the posterior distribution of k under an NB model (Eq. (2)) is given by

$$\begin{aligned} \begin{aligned} P(k \vert Y=y)&=\frac{\frac{\Gamma (y+k\tau )}{\Gamma (y+1)\Gamma (k\tau )}(1-m)^{y}m^{k\tau }}{\sum _{j= y}^{\infty }\frac{\Gamma (y+j\tau )}{\Gamma (y+1)\Gamma (j\tau )}(1-m)^{y}m^{j\tau }} \\&= \frac{\frac{\Gamma (y+k\tau )}{\Gamma (k\tau )}m^{k\tau }}{\sum _{j= y}^{\infty }\frac{\Gamma (y+j\tau )}{\Gamma (j\tau )}m^{j\tau }}, \end{aligned} \end{aligned}$$
(4)

for \(k \ge y\).

The posterior distributions (Eq. (3) and (4)) are proper even though an improper uniform prior distribution is used. The aim of this paper is to find their exact distributions, and where this is not possible, approximating distributions that are mathematically tractable. By doing so, their mean and variance can be determined directly from the theoretical properties of the approximating distribution.

2 Results

We first show that when the observed count under the GP or the NB model is 0 or 1, the posterior distribution of k is geometric or extended NB, respectively.

Theorem 1

The posterior distribution of k is (i) geometric with mean \(e^{-b\sqrt{m}}(1-e^{-b\sqrt{m}})^{-1}\) and variance \({e^{-b\sqrt{m}}(1-e^{-b\sqrt{m}})^{-2}}\) for \(k \ge 0\), when \(x=0\) for the GP model; and with mean \({m^\tau }{(1-m^\tau )^{-1}}\) and variance \({m^\tau }{(1-m^\tau )^{-2}}\) for \(k \ge 0\), when \(y=0\) for the NB model; (ii) extended NB with mean \({\bigl (1 + e^{-b\sqrt{m}}\bigr )}{\bigl (1-e^{-b\sqrt{m}}\bigr )^{-1}}\) and variance \({2e^{-b\sqrt{m}}}{\bigl (1-e^{-b\sqrt{m}}\bigr )^{-2}}\) for \(k \ge 1\), when \(x=1\) for the GP model; and with mean \({(1+m^\tau )}{(1-m^\tau )^{-1}}\) and variance \({2m^\tau }{(1-m^\tau )^{-2}}\) for \(k \ge 1\), when \(y=1\) for the NB model.

Proof

We only prove the result for the GP model with \(x=0,1\); the proof for the NB model with \(y=0,1\) is similar, with \(m^{\tau }\) in place of \(e^{-b\sqrt{m}}\).

  (i)

    When \(x=0\):

    $$\begin{aligned} \begin{aligned} P(k \vert X=0)&=\frac{e^{-bk\sqrt{m}}}{\sum _{j = 0}^{\infty } e^{-bj\sqrt{m}}} = \bigl (e^{-b\sqrt{m}}\bigr )^k \bigl (1-e^{-b\sqrt{m}}\bigr ), \end{aligned} \end{aligned}$$

    where \(k = 0,1,\ldots \). Hence, the posterior distribution of k is geometric with success probability \(p=1-e^{-b\sqrt{m}}\). The mean and the variance follow from standard results.

  (ii)

    When \(x=1\):

    $$\begin{aligned} P(k \vert X=1) =\frac{k e^{-bk\sqrt{m}}}{\sum _{j= 1}^{\infty }j e^{-b j\sqrt{m}}} = \left( {\begin{array}{c}k\\ 1\end{array}}\right) \bigl (1-e^{-b\sqrt{m}}\bigr )^2 \bigl (e^{-b\sqrt{m}}\bigr )^{k-1}, \end{aligned}$$

    where \(k=1,2,\ldots \). Therefore, the posterior distribution of k is extended NB with parameters \(p=e^{-b\sqrt{m}}\) and \(r=2\). The mean and the variance are

    $$\begin{aligned} \text {E}(k \vert X=1)&= \sum _{k=1}^{\infty }k P(k \vert X=1) = \frac{1+e^{-b\sqrt{m}}}{1-e^{-b\sqrt{m}}},\\ \text {Var}(k \vert X=1)&= \sum _{k=1}^{\infty }k^2 P(k\vert X=1) - \bigl [\text {E}(k \vert X=1) \bigr ]^2 =\frac{2e^{-b\sqrt{m}}}{\bigl (1-e^{-b\sqrt{m}}\bigr )^2}, \end{aligned}$$

respectively. \(\square \)

Corollary 1

The cumulative distribution function (cdf) of the posterior distribution of k is

  (i)

    \(F_{k \vert X=0}(k)= 1-e^{-b\sqrt{m}(k+1)}\) for \(k \ge 0\);

  (ii)

    \(F_{k \vert X=1}(k)=1-\bigl [k(1-e^{-b\sqrt{m}})+1\bigr ] e^{-b\sqrt{m}k}\) for \(k\ge 1\);

  (iii)

    \(F_{k \vert Y=0}(k)= 1-m^{\tau {(k+1)}}\) for \(k \ge 0\);

  (iv)

    \(F_{k \vert Y=1}(k)=1-\bigl [k(1-m^\tau )+1\bigr ]m^{\tau k}\) for \(k\ge 1\).
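These closed-form cdfs must agree with partial sums of the corresponding Theorem 1 pmfs, which gives an exact numerical consistency check. A minimal sketch for the GP cases (i) and (ii), with illustrative b and m; the NB cases follow by replacing \(e^{-b\sqrt{m}}\) with \(m^\tau \):

```python
import math

b, m = 0.5, 0.64                          # illustrative values
q = math.exp(-b * math.sqrt(m))           # common ratio e^{-b*sqrt(m)}

pmf0 = lambda k: (1.0 - q) * q ** k                    # geometric, x = 0
pmf1 = lambda k: k * (1.0 - q) ** 2 * q ** (k - 1)     # extended NB (r = 2), x = 1
cdf0 = lambda k: 1.0 - q ** (k + 1)                    # Corollary 1 (i)
cdf1 = lambda k: 1.0 - (k * (1.0 - q) + 1.0) * q ** k  # Corollary 1 (ii)

dev0 = dev1 = s0 = s1 = 0.0
for k in range(0, 40):
    s0 += pmf0(k)
    dev0 = max(dev0, abs(s0 - cdf0(k)))   # should vanish up to rounding
for k in range(1, 40):
    s1 += pmf1(k)
    dev1 = max(dev1, abs(s1 - cdf1(k)))
```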

We now show that the extended Hurwitz–Lerch zeta distribution [18] is an appropriate approximation for the posterior distribution of k under the GP model with \(x\ge 2\).

Theorem 2

The posterior distribution of k given \(X=x\), where X has a GP distribution with mean kb and variance kb/m, can be approximated by an extended Hurwitz–Lerch zeta distribution with mean \(x/b + 1/(b\sqrt{m})\) and variance \((x+1)/(b^2 m)\) for \(k \ge l\), where \(x \ge 2\) is the conditioning value and \(l \ge x\).

Proof

First, we note that the denominator in Eq. (3) can be written as

$$\begin{aligned} \sum _{j=x}^\infty (j+g(x))^x e^{-b\sqrt{m}j} - g(x)\sum _{j=x}^\infty (j+g(x))^{x-1}e^{-b\sqrt{m}j}. \end{aligned}$$

The Lerch transcendent \(\Phi (u,s,v)\) (see [19]) is given by

$$\begin{aligned} \Phi (u,s,v) = \sum _{k=0}^\infty \frac{u^k}{(v+k)^s}, \end{aligned}$$

where u is complex and \(\vert u\vert < 1\), \(v \ne 0, -1, -2, \ldots \), and \(s \ne 1, 2, \ldots \). Representing the denominator using the Lerch transcendent, we get

$$\begin{aligned} e^{-b\sqrt{m}x}\Phi (e^{-b\sqrt{m}},-x,wx) - (w-1)xe^{-b\sqrt{m}x}\Phi (e^{-b\sqrt{m}},-(x-1),wx), \end{aligned}$$
(5)

where \(w = 1+(1-\sqrt{m})/(b\sqrt{m})\).

The following identity (Eq.1.11(11) in [19]) relates the Lerch transcendent to the Bernoulli polynomials for \(s=-h\):

$$\begin{aligned} \Phi (u,-h,v) = \frac{h!}{u^v}\left( \log \frac{1}{u} \right) ^{-(h+1)} - \frac{1}{u^v}\sum _{r=0}^\infty \frac{B_{h+r+1}(v)(\log u)^r}{r!(h+r+1)}, \end{aligned}$$
(6)

where \(\vert {\log u}\vert < 2\pi \), \(v \ne 0,-1,-2,\ldots \), \(h \ne -1,-2,\ldots \), and \(B_n(v)\) is the nth Bernoulli polynomial with argument v. The Bernoulli polynomial is defined as

$$\begin{aligned} B_n(v) = \sum _{j=0}^n \left( {\begin{array}{c}n\\ j\end{array}}\right) B_{n-j}(0)v^j, \end{aligned}$$

where \(B_{i}(0)\) is the ith Bernoulli number.

If we substitute \(u=e^{-b\sqrt{m}}\), \(h=x\), \(v=wx\) into Eq. (6), and then multiply both sides by \(e^{-bw\sqrt{m}x}\), we obtain

$$\begin{aligned} e^{-bw\sqrt{m}x}\Phi (e^{-b\sqrt{m}},-x,wx) = \frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}} - \sum _{r=0}^\infty \frac{B_{x+r+1}(wx)(-b\sqrt{m})^r}{r!(x+r+1)}. \end{aligned}$$
(7)

We can approximate Eq. (7) as

$$\begin{aligned} e^{-bw\sqrt{m}x}\Phi (e^{-b\sqrt{m}},-x,wx) \approx \frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}}, \end{aligned}$$
(8)

provided that

$$\begin{aligned} \Biggl \vert \sum _{r=0}^\infty \frac{B_{x+r+1}(wx)(-b\sqrt{m})^r}{r!(x+r+1)}\Biggr \vert = o\left( \frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}}\right) , \end{aligned}$$

as \(x \rightarrow \infty \), o being the little o notation. To show this, we use an identity involving the Bernoulli polynomials and the sum of xth powers (Eq. 1.13(10) in [19]):

$$\begin{aligned} \frac{B_{x+1}(z)-B_{x+1}(0)}{x+1}= \sum _{t=0}^{z-1}t^x, \end{aligned}$$

for \(x=2,3,\ldots \) and \(z \in {\mathbb {Z}}^+\). For sufficiently large z, \(B_{x+1}(z)\) is positive and dominates \(B_{x+1}(0)\). Suppose wx is a positive integer that is sufficiently large, then

$$\begin{aligned} \frac{B_{x+r+1}(wx)}{x+r+1} \approx \sum _{t=0}^{wx-1} t^{x+r}, \end{aligned}$$

if \(B_{x+r+1}(0) > 0\) and for some b and m such that \(B_{x+r+1}(wx) \gg B_{x+r+1}(0)\); if \(B_{x+r+1}(0) \le 0\), we have

$$\begin{aligned} 0 < \frac{B_{x+r+1}(wx)}{x+r+1} \le \sum _{t=0}^{wx-1} t^{x+r}. \end{aligned}$$

Thus,

$$\begin{aligned} \begin{aligned}&\frac{(b\sqrt{m})^{x+1}}{\Gamma (x+1)} \Biggl \vert \sum _{r=0}^\infty \frac{B_{x+r+1}(wx)(-b\sqrt{m})^r}{r!(x+r+1)} \Biggr \vert \\<\,&\frac{1}{x!} (b\sqrt{m})^{x+1} \sum _{r=0}^{\infty }\left[ \frac{(b\sqrt{m})^r}{r!} \Bigl \vert \frac{B_{x+r+1}(wx)}{x+r+1} \Bigr \vert \right] \\ \lessapprox&\frac{1}{x!} (b\sqrt{m})^{x+1} \sum _{r=0}^{\infty }\left[ \frac{(b\sqrt{m})^r}{r!} \sum _{t=0}^{wx-1} t^{x+r} \right] \\ <\,&\frac{1}{x!} (b\sqrt{m})^{x+1} \sum _{r=0}^{\infty }\left[ \frac{(b\sqrt{m})^r}{r!} \int _{0}^{wx} t^{x+r}dt \right] \\ =\,&\frac{1}{x!} (b\sqrt{m})^{x+1} \sum _{r=0}^{\infty }\left[ \frac{(b\sqrt{m})^r}{r!} \frac{(wx)^{x+r+1}}{x+1}\right] \\ =\,&\frac{(bw\sqrt{m}x)^{x+1}}{(x+1)!} e^{bw\sqrt{m}x}. \end{aligned} \end{aligned}$$

Applying Stirling’s approximation for x!, the right-hand side simplifies to \(U = (c/\sqrt{2\pi x})(ce^{c+1})^x\), where \(c = bw\sqrt{m}\). For \(0 < ce^{c+1} \le 1\), U converges to 0 as \(x \rightarrow \infty \). Therefore, \(0 < c \le W(e^{-1}) \approx 0.2785\), where \(W(\cdot )\) is the Lambert W function. Thus, for \( \min \{ [1-W(e^{-1})]^{2}(1-b)^{-2}, 4\} \le m < \min \{ (1-b)^{-2}, 4\}\), Eq. (8) should give a reasonably good approximation.
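The quality of Eq. (8) can be probed directly by truncating the defining series of the Lerch transcendent, which converges geometrically for \(\vert u \vert < 1\). In the sketch below, \(b=0.5\) and \(m=2.25\) are illustrative choices (not from the paper) satisfying the condition just derived, since they give \(c = bw\sqrt{m} = 0.25 \le W(e^{-1})\):

```python
import math

def lerch_phi(u, s, v, K=2000):
    """Truncated Lerch transcendent Phi(u,s,v) = sum_{k>=0} u^k/(v+k)^s, |u| < 1."""
    return sum(u ** k / (v + k) ** s for k in range(K))

b, m = 0.5, 2.25                               # illustrative; c = b*w*sqrt(m) = 0.25
sqm = math.sqrt(m)
w = 1.0 + (1.0 - sqm) / (b * sqm)              # w = 1/3 here
u = math.exp(-b * sqm)

# Ratio of the two sides of Eq. (8); it should be close to 1.
ratios = []
for x in (5, 10, 15):
    lhs = math.exp(-b * w * sqm * x) * lerch_phi(u, -x, w * x)
    rhs = math.gamma(x + 1) / (b * sqm) ** (x + 1)
    ratios.append(lhs / rhs)
```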

Subsequently, we can approximate Eq. (5) using Eq. (8):

$$\begin{aligned}&e^{-b\sqrt{m}x}\Phi (e^{-b\sqrt{m}},-x,wx) - (w-1)xe^{-b\sqrt{m}x}\Phi (e^{-b\sqrt{m}},-(x-1),wx) \nonumber \\&\approx \, e^{-b\sqrt{m}x} e^{bw\sqrt{m}x} \frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}} -(w-1)x e^{-b\sqrt{m}x} e^{bw\sqrt{m}x} \frac{\Gamma (x)}{(b\sqrt{m})^{x}}\nonumber \\&\quad =\, e^{(1-\sqrt{m})x} \frac{\sqrt{m} \ \Gamma (x+1)}{(b\sqrt{m})^{x+1}}\nonumber \\&\quad \approx \, e^{(1-\sqrt{m})x} \sqrt{m} e^{-bw\sqrt{m}x} \Phi (e^{-b\sqrt{m}}, -x, wx)\nonumber \\&\quad =\, \sqrt{m} e^{-b\sqrt{m}x} \Phi (e^{-b\sqrt{m}}, -x, wx). \end{aligned}$$
(9)

Substituting Eq. (9) for the denominator in Eq. (3), we obtain

$$\begin{aligned} \begin{aligned} P(k\vert X=x)&\approx \frac{k(k+g(x))^{x-1} e^{-b\sqrt{m}k}}{\sqrt{m}\Phi (e^{-b\sqrt{m}}, -x, wx) e^{-b\sqrt{m}x}}\\&= \underbrace{\frac{(e^{-b\sqrt{m}})^{k-x}}{\Phi (e^{-b\sqrt{m}}, -x, wx)(k-x+wx)^{-x}}}_{\text {A}} \times \underbrace{\frac{k}{\sqrt{m}(k+g(x))}}_{\text {B}}. \end{aligned} \end{aligned}$$
(10)

To see that A is the pmf of an extended Hurwitz–Lerch zeta distribution with extended parameter space [20], we start with the pmf of the Hurwitz–Lerch zeta distribution:

$$\begin{aligned} \begin{aligned} q_{k}&=\frac{1}{\theta \Phi (\theta ,s+1,a+1)}\frac{\theta ^{k}}{(k+a)^{s+1}}, \end{aligned} \end{aligned}$$
(11)

for \(k=1,2,\ldots \), where \(a>-1\) and \(s \in {\mathbb {R}}\) if \(0<\theta <1\) or \(s>0\) if \(\theta = 1\). Shifting the support to \(k=x,x+1,\ldots \), Eq. (11) becomes

$$\begin{aligned} \begin{aligned} q_k&= \frac{\theta ^{k-x}}{\Phi (\theta , s+1,a+1)(k-x+1+a)^{s+1}}, \end{aligned} \end{aligned}$$
(12)

for \(k=x,x+1,\ldots \), where \(x\ge 2\). Taking \(\theta = e^{-b\sqrt{m}}\), \(s+1=-x\), \(a+1=wx\) and substituting them into Eq. (12) leads to A in Eq. (10).

For some combinations of b and m, there exists an \(l \ge x\) such that \(\sqrt{m}(k + g(x)) = k + o(k)\) for all \(k \ge l\). In this case, B of Eq. (10) is approximately 1, and thus

$$\begin{aligned} \begin{aligned} P(k\vert X=x)&\approx \frac{(e^{-b\sqrt{m}})^{k-x}}{\Phi (e^{-b\sqrt{m}}, -x, wx)(k-x+wx)^{-x}}, \end{aligned} \end{aligned}$$
(13)

which is the pmf of the extended Hurwitz–Lerch zeta distribution with parameters \(\theta = e^{-b\sqrt{m}}\), \(s+1=-x\), \(a+1=wx\).

To derive the mean and the variance of the extended Hurwitz–Lerch zeta distribution (Eq. (13)), we first note that the latter is a special case of the modified power series distribution [21], which has pmf

$$\begin{aligned} P(Z=z)=\frac{A(z)[g(\theta )]^z}{f(\theta )},\quad z \in {\mathbb {B}}, \end{aligned}$$

where \({\mathbb {B}} \subset {\mathbb {Z}}^+\), \(A(z)>0\), and \(f(\theta )\), \(g(\theta )\) are positive, finite and differentiable functions of \(\theta \). In this case, we have

$$\begin{aligned} g(\theta )=\theta ,\quad f(\theta )=\theta ^{x}\Phi (\theta ,s+1,a+1),\quad A(z)=\frac{1}{(z-x+1+a)^{s+1}}. \end{aligned}$$

The expectation of Z is

$$\begin{aligned} {\text {E}}(Z) =\frac{g(\theta )f^{'}(\theta )}{f(\theta )g^{'}(\theta )} =\frac{\theta }{\theta ^{x}\Phi (\theta ,s+1,a+1)} \frac{\partial }{\partial \theta }\theta ^{x}\Phi (\theta ,s+1,a+1), \end{aligned}$$
(14)

and the variance is

$$\begin{aligned} \text {Var}(Z) = \frac{g(\theta )}{g^{'}(\theta )}\frac{\partial }{\partial {\theta }} {\text {E}}(Z). \end{aligned}$$
(15)

Note that

$$\begin{aligned} \frac{\partial }{\partial \theta }\Phi (\theta ,s,a)=\frac{1}{\theta }\Phi (\theta ,s-1,a)-\frac{a}{\theta }\Phi (\theta ,s,a). \end{aligned}$$
(16)

Therefore,

$$\begin{aligned} \begin{aligned} \frac{\partial }{\partial \theta }\theta ^{x}\Phi (\theta ,s\!+\!1,a\!+\!1)&= \theta ^{x-1}\bigl [(x-a-1)\Phi (\theta ,s\!+\!1,a\!+\!1) + \Phi (\theta ,s,a+1) \bigr ]. \end{aligned}\nonumber \\ \end{aligned}$$
(17)

Let \(\theta = e^{-b\sqrt{m}}\), \(s+1=-x\) and \(a+1=wx\). Substituting Eq. (17) into Eq. (14) yields

$$\begin{aligned} {\text {E}}(k \vert X=x) \approx \frac{\Phi (e^{-b\sqrt{m}}, -x-1, wx)}{\Phi (e^{-b\sqrt{m}}, -x, wx)} - \frac{(1-\sqrt{m})x}{b\sqrt{m}}. \end{aligned}$$
(18)

Then, substituting Eq. (18) into Eq. (15) and using Eq. (16) yields

$$\begin{aligned} \text {Var}(k \vert X=x) \approx \frac{\Phi (e^{-b\sqrt{m}}, -x-2, wx)}{\Phi (e^{-b\sqrt{m}}, -x, wx)} - \left[ \frac{\Phi (e^{-b\sqrt{m}}, -x-1, wx)}{\Phi (e^{-b\sqrt{m}}, -x, wx)} \right] ^2. \end{aligned}$$
(19)

Eq. (18) can be further approximated by applying Eq. (8):

$$\begin{aligned} {\text {E}}(k \vert X=x) \approx \frac{\frac{\Gamma (x+2)}{(b\sqrt{m})^{x+2}}}{\frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}}} - \frac{1-\sqrt{m}}{b\sqrt{m}}x = \frac{x+1}{b\sqrt{m}} - \frac{(1-\sqrt{m})x}{b\sqrt{m}} = \frac{x}{b} + \frac{1}{b\sqrt{m}}. \end{aligned}$$

Similarly, Eq. (19) can be further approximated as

$$\begin{aligned} \text {Var}(k \vert X=x) \approx \frac{\frac{\Gamma (x+3)}{(b\sqrt{m})^{x+3}}}{\frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}}} - \left[ \frac{\frac{\Gamma (x+2)}{(b\sqrt{m})^{x+2}}}{\frac{\Gamma (x+1)}{(b\sqrt{m})^{x+1}}} \right] ^2 = \frac{(x+2)(x+1)}{(b\sqrt{m})^2} - \frac{(x+1)^2}{(b\sqrt{m})^2} = \frac{x+1}{b^2 m}. \end{aligned}$$

\(\square \)

For the NB model with \(y \ge 2\), we again find that the posterior distribution of k is approximately given by the extended Hurwitz–Lerch zeta distribution.

Theorem 3

Let Y have an NB distribution with pmf given by Eq. (2). The posterior distribution of k given \(Y=y\) approximately follows the extended Hurwitz–Lerch zeta distribution with mean \(-(y+1)/(\tau \log m) - (y-1)/(2\tau )\) and variance \((y+1)/(\tau \log m)^2\), for \(k \ge y\), where \(y \ge 2\).

Proof

For non-negative real \(\alpha \), \(\beta \) such that \(\alpha \ne \beta \), Laforgia & Natalini [22] give the following approximation for the quotient of gamma functions:

$$\begin{aligned} \frac{\Gamma (y+\alpha )}{\Gamma (y+\beta )} \approx \frac{1}{(y+c)^{\beta - \alpha }}, \end{aligned}$$
(20)

when \(y \rightarrow \infty \), with \(c=(\alpha +\beta -1)/2\). Applying Eq. (20) in Eq. (4) yields

$$\begin{aligned} \begin{aligned} P(k \vert Y=y)&\approx \frac{\bigl (k\tau +\frac{y-1}{2}\bigr )^y m^{k\tau }}{\sum _{j= y}^{\infty } \bigl (j\tau +\frac{y-1}{2}\bigr )^y m^{j\tau }} \\&= \frac{\bigl (k +\frac{y-1}{2\tau }\bigr )^y m^{k\tau }}{\sum _{j= y}^{\infty } \bigl (j+\frac{y-1}{2\tau }\bigr )^y m^{j\tau }}, \end{aligned} \end{aligned}$$
(21)

as \(k\tau , j\tau \rightarrow \infty \), for \(k \ge y\). Then, representing the denominator of Eq. (21) using the Lerch transcendent, we obtain

$$\begin{aligned} \sum _{j= y}^{\infty } \left( j+\frac{y-1}{2\tau }\right) ^y m^{j\tau } = m^{\tau y} \Phi \left( {m^\tau , -y, \left( \frac{1}{2\tau }+1\right) y - \frac{1}{2\tau }} \right) . \end{aligned}$$
(22)

Hence, Eq. (21) can be expressed as

$$\begin{aligned} P(k \vert Y=y) \approx \frac{\bigl (k +\frac{y-1}{2\tau }\bigr )^y m^{\tau (k-y)}}{\Phi \bigl ({m^\tau , -y, (\frac{1}{2\tau }+1)y - \frac{1}{2\tau }} \bigr )}, \end{aligned}$$
(23)

where \(k=y,y+1,\ldots \). Eq. (23) is just Eq. (12) with \(\theta = m^\tau \), \(s+1=-y\), \(a+1=\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\). Therefore, we conclude that the pmf of the extended Hurwitz–Lerch zeta distribution approximates the posterior distribution of k under NB model for \(y \ge 2\).
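The gamma-quotient approximation of Eq. (20), applied here with \(\alpha = y\), \(\beta = 0\) and large variable \(z = k\tau \) so that \(\Gamma (z+y)/\Gamma (z) \approx (z+(y-1)/2)^y\), is easy to check numerically; the values below are illustrative:

```python
import math

y = 5                                            # illustrative conditioning value

def quotient_exact(z, y):
    """Gamma quotient via log-gamma to avoid overflow."""
    return math.exp(math.lgamma(z + y) - math.lgamma(z))

def quotient_approx(z, y):
    return (z + (y - 1) / 2.0) ** y              # Eq. (20) with alpha = y, beta = 0

# Relative error shrinks as z = k*tau grows, as required in Eq. (21).
errs = [abs(quotient_exact(z, y) / quotient_approx(z, y) - 1.0)
        for z in (30.0, 100.0, 300.0)]
```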

By the same approach used in Theorem 2 to derive the mean and the variance of the posterior distribution of k, we obtain

$$\begin{aligned} {\text {E}}(k\vert Y=y) \approx \frac{\Phi \bigl (m^\tau ,-y-1,\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\bigr )}{\Phi \bigl (m^\tau ,-y,\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\bigr )} - \frac{y-1}{2\tau }, \end{aligned}$$
(24)
$$\begin{aligned} \begin{aligned} {\text {Var}}(k \vert Y=y) \approx&\frac{\Phi \bigl (m^\tau ,-y-2,\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\bigr )}{\Phi \bigl (m^\tau ,-y,\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\bigr )}\\ {}&-\left[ \frac{\Phi \bigl (m^\tau ,-y-1,\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\bigr )}{\Phi \bigl (m^\tau ,-y,\bigl (\frac{1}{2\tau } +1 \bigr )y - \frac{1}{2\tau }\bigr )} \right] ^2, \end{aligned} \end{aligned}$$
(25)

for \(k \ge y\), where \(y \ge 2\). By similar argument leading to Eq. (8), we obtain

$$\begin{aligned} m^{\tau {g(y)}} \Phi \bigl (m^\tau , -y, g(y)\bigr )\approx \frac{\Gamma (y+1)}{(-\tau \log m)^{y+1}}, \end{aligned}$$
(26)

where \(g(y)=(\frac{1}{2\tau } + 1)y - \frac{1}{2\tau }\). Using Eq. (26), Eq. (24) can be approximated as

$$\begin{aligned} {\text {E}}(k\vert Y=y) \approx \frac{\frac{\Gamma (y+2)}{(-\tau \log m)^{y+2}}}{\frac{\Gamma (y+1)}{(-\tau \log m)^{y+1}}} - \frac{y-1}{2\tau } = -\frac{y+1}{\tau \log m} - \frac{y-1}{2\tau }. \end{aligned}$$

Similarly, we can use Eq. (26) to approximate Eq. (25) as

$$\begin{aligned} \begin{aligned} {\text {Var}}(k\vert Y=y)&\approx \frac{\frac{\Gamma (y+3)}{(-\tau \log m)^{y+3}}}{\frac{\Gamma (y+1)}{(-\tau \log m)^{y+1}}} - \Bigl (\frac{y+1}{-\tau \log m} \Bigr )^2 \\&= \frac{(y+2)(y+1)}{ (\tau \log m )^2} - \frac{(y+1)^2}{(\tau \log m)^2 } \\&= \frac{y+1}{(\tau \log m)^2}. \end{aligned} \end{aligned}$$

\(\square \)

The results of Theorem 2 and Theorem 3 lead us to the following corollary.

Table 1 Kullback–Leibler divergence for Eq. (3) versus Eq. (13)
Table 2 Kullback–Leibler divergence for Eq. (4) versus Eq. (23)
Table 3 Posterior mean of k when X has a GP distribution. For the array in each cell, the first value is the exact posterior mean computed using Eq. (3) (first 10,000 terms); the second and the third give the deviation from the first value using Eq. (18) and Theorem 2, respectively
Table 4 Posterior standard deviation of k when X has a GP distribution. For the array in each cell, the first value is the exact posterior standard deviation computed using Eq. (3) (first 10,000 terms); the second and the third give the deviation from the first value using Eq. (19) and Theorem 2, respectively. NA indicates an instance of failure to compute the Lerch transcendent using the VGAM package for \(X=50\) and \(m=b=0.8\)
Table 5 Posterior mean of k when Y has an NB distribution. For the array in each cell, the first value is the exact posterior mean computed using Eq. (4) (first 10,000 terms); the second and the third give the deviation from the first value using Eq. (24) and Theorem 3, respectively
Table 6 Posterior standard deviation of k when Y has an NB distribution. For the array in each cell, the first value is the exact posterior standard deviation computed using Eq. (4) (first 10,000 terms); the second and the third give the deviation from the first value using Eq. (25) and Theorem 3, respectively. NA indicates an instance of failure to compute the Lerch transcendent using the VGAM package for \(Y=50\) and \(m=b=0.8\)

Corollary 2

The cdf of the posterior distribution of k is approximately

  (i)

    \( F_{k \vert X=x}(k) \approx 1 - e^{-b\sqrt{m}(k-x+1)}\Phi (e^{-b\sqrt{m}}, -x, k+wx-x+1)/\Phi (e^{-b\sqrt{m}},-x,wx), \) for \(k \ge x\), where \(x \ge 2\), for the GP model;

  (ii)

    \( F_{k \vert Y=y}(k) \approx 1 - m^{\tau (k-y+1)}\Phi (m^\tau , -y, k+ \frac{y-1}{2\tau } +1)/\Phi (m^\tau ,-y,\frac{y-1}{2\tau }+y ), \) for \(k \ge y\), where \(y \ge 2\), for the NB model.
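These expressions are the cdfs of the approximating extended Hurwitz–Lerch zeta pmfs, so as a consistency check, part (i) must agree with partial sums of Eq. (13) up to series truncation error. A sketch with illustrative x, b, m:

```python
import math

def lerch_phi(u, s, v, K=1000):
    """Truncated Lerch transcendent, valid for |u| < 1."""
    return sum(u ** k / (v + k) ** s for k in range(K))

x, b, m = 5, 0.8, 0.8                      # illustrative values
theta = math.exp(-b * math.sqrt(m))
w = 1.0 + (1.0 - math.sqrt(m)) / (b * math.sqrt(m))

norm = lerch_phi(theta, -x, w * x)
pmf = lambda k: theta ** (k - x) * (k - x + w * x) ** x / norm   # Eq. (13)
cdf = lambda k: (1.0 - theta ** (k - x + 1)                      # Corollary 2 (i)
                 * lerch_phi(theta, -x, k + w * x - x + 1) / norm)

dev = running = 0.0
for k in range(x, 31):
    running += pmf(k)
    dev = max(dev, abs(running - cdf(k)))
```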

3 Computational Validation

Table 1 shows how well the extended Hurwitz–Lerch zeta distribution approximates the posterior distribution of k under the GP model for different combinations of b and m. For a fixed b, the approximation is best for m in the neighborhood of 1. For a fixed m, the approximation improves as b becomes closer to 1. Finally, for fixed m and b, the approximation improves as x increases.

Table 2 shows that for a given y, the larger the values of m and b, the better the extended Hurwitz–Lerch zeta distribution approximates the posterior distribution of k under the NB model. For fixed m and b, the approximation deteriorates as y increases up to 50. However, the Kullback–Leibler divergence remains well below 0.02 when \(0.6 \le m < 1\) for the b values considered.

Tables 3 and 4 show the results of approximating the mean and the standard deviation of the posterior distribution of k given \(X=x\) has a GP distribution. Similar results for the posterior distribution of k given \(Y=y\) has an NB distribution are given in Tables 5 and 6. In general, the approximations have relative error that stays within \(10\%\) of the true value when \(m \ge 0.6\), for the b, X and Y values considered. Approximations that use the Lerch transcendent (e.g., Eq. (18), Eq. (19), Eq. (24) and Eq. (25)) have smaller relative error than the simpler expressions in Theorem 2 and Theorem 3. However, for large values of X and Y, both approximations generally have similar relative error for \(m \ge 0.6\). Since the Lerch transcendent cannot be evaluated for some combinations of b and m at large X, Y values, the simpler expressions in these two theorems suffice there.
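Entries like those in Tables 1 and 2 can be reproduced by computing the Kullback–Leibler divergence between the truncated exact posterior and its extended Hurwitz–Lerch zeta approximation over a common support. One GP cell is sketched below; the values \(x=10\), \(b=m=0.8\) are illustrative, not a specific table entry:

```python
import math

def log_normalize(logf):
    """Normalize a vector of log-weights; returns log-probabilities."""
    M = max(logf)
    s = sum(math.exp(v - M) for v in logf)
    return [v - M - math.log(s) for v in logf]

x, b, m = 10, 0.8, 0.8                       # one illustrative (b, m, x) cell
c = b * math.sqrt(m)
g = (1.0 - math.sqrt(m)) / c * x
w = 1.0 + (1.0 - math.sqrt(m)) / c
ks = range(x, x + 3000)

# Log of the unnormalized exact posterior, Eq. (3), and of the extended
# Hurwitz-Lerch zeta pmf, Eq. (13), over the same truncated support.
log_exact = [math.log(k) + (x - 1) * math.log(k + g) - c * k for k in ks]
log_approx = [-c * (k - x) + x * math.log(k - x + w * x) for k in ks]

lp = log_normalize(log_exact)
lq = log_normalize(log_approx)
kl = sum(math.exp(a) * (a - t) for a, t in zip(lp, lq))   # KL(exact || approx)
```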

4 Concluding Remarks

In this paper, we have clarified several theoretical properties of the posterior distribution of a count parameter k arising in the GP and the NB models. For conditioning values of 0 and 1 from these two models, the posterior distribution of k is found to be geometric and extended NB, respectively. For conditioning values of 2 or more, the posterior distribution of k under either a GP or an NB model is well approximated by the extended Hurwitz–Lerch zeta distribution. To our knowledge, this is the first demonstrated connection between the Hurwitz–Lerch zeta distribution and a Bayesian posterior distribution. The present results open up the possibility of using the posterior mean to correct observed gene counts in RNA-Seq data analysis.