A Double Recursion for Calculating Moments of the Truncated Normal Distribution and its Connection to Change Detection

Pollak, Moshe; Shauly-Aharonov, Michal

doi:10.1007/s11009-018-9622-7

A Double Recursion for Calculating Moments of the Truncated Normal Distribution and its Connection to Change Detection

Published: 02 March 2018

Volume 21, pages 889–906, (2019)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Methodology and Computing in Applied Probability Aims and scope Submit manuscript

A Double Recursion for Calculating Moments of the Truncated Normal Distribution and its Connection to Change Detection

Download PDF

129 Accesses
6 Citations
Explore all metrics

Abstract

The integral ${\int }_{0}^{\infty }x^{m} e^{-\frac {1}{2}(x-a)^{2}}dx$ appears in likelihood ratios used to detect a change in the parameters of a normal distribution. As part of the m^th moment of a truncated normal distribution, this integral is known to satisfy a recursion relation, which has been used to calculate the first four moments of a truncated normal. Use of higher order moments was rare. In more recent times, this integral has found important applications in methods of changepoint detection, with m going up to the thousands. The standard recursion formula entails numbers whose values grow quickly with m, rendering a low cap on computational feasibility. We present various aspects of dealing with the computational issues: asymptotics, recursion and approximation. We provide an example in a changepoint detection setting.

Article PDF

Practical Aspects of False Alarm Control for Change Point Detection: Beyond Average Run Length

Article Open access 17 May 2018

The distribution of the maximum likelihood estimates of the change point and their relation to random walks

Article 19 December 2023

A method for estimating the power of moments

Article Open access 06 March 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

References

Barr D R, Sherrill E T (1999) Mean and variance of truncated normal distributions. Amer Statist 53:357–361
Google Scholar
Burkardt J (2014) The truncated normal distribution. https://people.sc.fsu.edu/~jburkardt/presentations/truncated_normal.pdf
Cook JD (2010) Upper bounds on non-central chi-square tails and truncated normal moments. https://www.johndcook.com/non_central_chi_square.pdf
Dhrymes P J (2005) Moments of truncated (normal) distributions. http://www.columbia.edu/~pjd1/papers.html
Gordon L, Pollak M (1995) A robust surveillance scheme for stochastically ordered alternatives. Ann Statist 23:1350–1375
Article MathSciNet MATH Google Scholar
Horrace WC (2015) Moments of the truncated normal distribution. J Prod Anal 43:133–138
Article Google Scholar
Krieger AM, Pollak M, Yakir B (2003) Surveillance of a simple linear regression. J Amer Statist Assoc 98:1–15
Article MathSciNet MATH Google Scholar
Lee A (1914) Table of the Gaussian “tail” tunctions; when the “tail” is larger than the body. Biometrika 10:208–214
Google Scholar
Liquet B, Nazarathy Y (2015) A dynamic view to moment matching of truncated distributions. Statist Prob Letts 104(53):87–93
Article MathSciNet MATH Google Scholar
MATLAB (2014) MATLAB and statistics toolbox release 2014b. The Mathworks Inc., Natick
Google Scholar
Toolbox (2016) MATLAB multiprecision computing toolbox (ADVANPIX Release 2016). The MathWorks, Inc., Natick. http://www.advanpix
Google Scholar
O’Connor AN (2011) Probability distributions used in reliability engineering. Reliability information analysis center (RIAC)
Pearson K, Lee A (1908) On the generalized probable error in multiple normal correlation. Biometrika 6:59–68
Article Google Scholar
Pollak M (1987) Average run lengths of an optimal method of detecting a change in distribution. Ann Statist 15:749–779
Article MathSciNet MATH Google Scholar
Pollak M, Siegmund D (1991) Sequential detection of a change in a normal mean when the initial value is unknown. Ann Statist 19:394–416
Article MathSciNet MATH Google Scholar
Pollak M, Croarkin C, Hagwood C (1993) Surveillance schemes with application to mass calibration. NISTIR 5158 Technical Report, Statistical Division, The National Institute of Standards and Technology, Gaithersburg, MD 20899, USA
Quesenberry C (1991) SPC Q processes for start-up processes and short or long runs. J Qual Tech 23:213–224
Article Google Scholar
van Dobben de Bruyn CS (1968) Cumulative Sum Tests: Theory and Practice. Griffin, London
Google Scholar

Download references

Acknowledgements

This work was supported by grant 1450/13 from the Israel Science Foundation and by the Marcy Bogen Chair of Statistics at the Department of Statistics, The Hebrew University of Jerusalem. The authors would like to thank the referees for their salient comments, which greatly improved the paper.

Author information

Authors and Affiliations

Department of Statistics, The Hebrew University of Jerusalem, 91905, Jerusalem, Israel
Moshe Pollak & Michal Shauly-Aharonov

Authors

Moshe Pollak
View author publications
You can also search for this author in PubMed Google Scholar
Michal Shauly-Aharonov
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Moshe Pollak.

Appendices

Appendix A: Proof of Theorem 1

The case a = 0 is trivial. In the following we assume a≠ 0. Recall that

$${\int}_{0}^{\infty}x^{m} e^{-\frac{1}{2}x^{2}}dx= 2^{\frac{1}{2}(m-1)}{\Gamma}(\frac{m + 1}{2}) . $$

Hence by Stirling’s approximation

$$\begin{array}{@{}rcl@{}} \frac{{\int}_{0}^{\infty}x^{m + 1} e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty}x^{m} e^{-\frac{1}{2}x^{2}}dx} &=&\frac{2^{\frac{1}{2}m}{\Gamma}(\frac{m + 2}{2})} {2^{\frac{1}{2}(m-1)}{\Gamma}(\frac{m + 1}{2})} \\ &=&2^{\frac{1}{2}}\frac{e^{-\frac{m + 2}{2}}(\frac{m + 2}{2})^{\frac{m + 1}{2}}}{e^{-\frac{m + 1}{2}}(\frac{m + 1}{2})^{\frac{m}{2}}}(1+O(\frac{1}{m^{2}})) \\ &=&e^{-\frac{1}{2}}(\frac{m + 2}{m + 1})^{(m + 1)\frac{1}{2}}(m + 1)^{\frac{1}{2}}(1+O(\frac{1}{m^{2}})) \\ &=&\sqrt{m + 1}(1+O(\frac{1}{m})) \end{array} $$

and so

$$\begin{array}{@{}rcl@{}} e^{\frac{1}{2}a^{2}} \frac{{\int}_{0}^{\infty}x^{m} e^{-\frac{1}{2}(x-a)^{2}}dx}{{\int}_{0}^{\infty} x^{m} e^{-\frac{1}{2}x^{2}} dx} &=& \frac{{\int}_{0}^{\infty}e^{ax}x^{m} e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty} x^{m} e^{-\frac{1}{2}x^{2} }dx} \\ &=&\sum\limits_{j = 0}^{\infty}\frac{{\int}_{0}^{\infty}\frac{(ax)^{j}}{j!}x^{m} e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty} x^{m} e^{-\frac{1}{2}x^{2} dx}} \\ &=&\sum\limits_{j = 0}^{\infty}\frac{a^{j}}{j!}\frac{{\int}_{0}^{\infty}x^{m+j} e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty} x^{m} e^{-\frac{1}{2}x^{2}} dx} \\ &=&\sum\limits_{j = 0}^{\infty}\frac{a^{j}}{j!}\prod\limits_{i = 1}^{j}\frac{{\int}_{0}^{\infty}x^{m+i} e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty} x^{m+i-1} e^{-\frac{1}{2}x^{2} }dx} \\ &=&\sum\limits_{j = 0}^{\infty}\frac{a^{j}}{j!}\prod\limits_{i = 1}^{j}(\sqrt{m+i}(1+O(\frac{1}{m}))) \\ &=&\sum\limits_{j = 0}^{\infty}\frac{(|a|\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}\sqrt{\prod\limits_{i = 1}^{j}(1+\frac{i}{m})} . \end{array} $$

(1)

We break the sum in Eq. 1 into pieces and analyze them separately.

Let γ > e and let $M=a\sqrt {m}\gamma $. We first show that the sum from M to $\infty $ in Eq. 1 is negligible. Again, by Stirling’s approximation

$$\begin{array}{@{}rcl@{}} \prod\limits_{i = 1}^{j} (1+\frac{i}{m}) &=&\frac{(m+j)!}{m!m^{j}} =\frac{e^{-(m+j)}(m+j)^{m+j+\frac{1}{2}}(1+O(\frac{1}{m+j}))}{m^{j}e^{-m}m^{m+\frac{1}{2}}(1+O(\frac{1}{m}))} \\ &=&e^{-j}(\frac{m+j}{m})^{m+j+\frac{1}{2}}(1+O(\frac{1}{m})) \end{array} $$

and so

$$\begin{array}{@{}rcl@{}} \frac{(|a|\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}\sqrt{\prod\limits_{i = 1}^{j} (1+\frac{i}{m})} &=&\frac{(|a|\sqrt{m})^{j}e^{-\frac{1}{2}j}\sqrt{(\frac{m+j}{m})^{m+j+\frac{1}{2}}}(1+O(\frac{1}{m}))^{j}}{\sqrt{2\pi}e^{-j}j^{j+\frac{1}{2}}} \\ && \times (1+O(\frac{1}{m})). \end{array} $$

Hence

$$\log \{\frac{(|a|\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}\sqrt{\prod\limits_{i = 1}^{j} (1+\frac{i}{m})}\}$$

$$\begin{array}{@{}rcl@{}} &=&j\log(|a|\sqrt{m})+\frac{1}{2}(m+j+\frac{1}{2})\log(\frac{m+j}{m})-\frac{1}{2}\log(2\pi) +\frac{1}{2}j-(j+\frac{1}{2})\log(j) \\ &&\quad \quad +j\log(1+o(\frac{1}{m})) \\ &=&-(j+\frac{1}{2})\log(j)+\frac{1}{2}(m+j+\frac{1}{2})\log(m+j)-\frac{1}{2}(m+j+\frac{1}{2})\log(m)+j\log(|a|\sqrt{m}) \\ &&\quad \quad+\frac{1}{2}j +j\log(1+o(\frac{1}{m}))-\frac{1}{2}\log(2\pi). \ \end{array} $$

For large enough m and $j=M=|a|\sqrt {m}\gamma $, this expression equals

$$-|a|\sqrt{m}\gamma(\log(\gamma)-1+o(1)) +\frac{1}{4}a^{2}\gamma^{2}(1+o(1))\quad<\quad -\frac{1}{2}a\sqrt{m}\gamma(\log(\gamma)-1). $$

Let 0 < ε. For j ≥ M the derivative with respect to j of

$$-(j+\frac{1}{2})\log(j)+\frac{1}{2}(m+j+\frac{1}{2})\log(m+j)-\frac{1}{2}(m+j+\frac{1}{2})\log(m)+j\log(|a|\sqrt{m})+\frac{1}{2}j $$

$$+j\log(1+\frac{\varepsilon}{m}) $$

equals

$$-\log(\frac{j}{|a|\sqrt{m}})-\frac{2m+j}{4j(m+j)}+\frac{1}{2}\log(1+\frac{j}{m})+\log(1+\frac{\varepsilon}{m}) \quad \leq \quad -\frac{1}{2}\log(\gamma). $$

It follows that as $m\rightarrow {\infty }$

$$\begin{array}{@{}rcl@{}} \sum\limits_{j=M}^{\infty}\frac{(a\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}\sqrt{\prod\limits_{i = 1}^{j}(1+\frac{i}{m})} &\leq& e^{-\frac{1}{2}a\sqrt{m}\gamma(\log(\gamma)-1)} \sum\limits_{j = 0}^{\infty}e^{-\frac{1}{2}\log(\gamma)j} \\ &=&\frac{ e^{-\frac{1}{2}a\sqrt{m}\gamma(\log(\gamma)-1)}}{1- e^{-\frac{1}{2}\log(\gamma)}}\rightarrow 0. \end{array} $$

(2)

For j ≤ M

$$\begin{array}{@{}rcl@{}} \log(\prod\limits_{i = 1}^{j}(1+\frac{i}{m}))&=&\sum\limits_{i = 1}^{j}[\frac{i}{m}-\frac{1}{2}\frac{i^{2}}{m^{2}}+\frac{1}{3}\frac{i^{3}}{m^{3}} {\ldots} ] \\ &=&\frac{j(j + 1)}{2m}-\sum\limits_{i = 1}^{j}\frac{i^{2}}{m^{2}} (\frac{1}{2}-\frac{1}{3}\frac{i}{m}+\frac{1}{4}\frac{i^{2}}{m^{2}} {\ldots} ) \\ &=& j(\frac{1}{2}\frac{j}{m}) -O(\frac{|a|^{3}}{\sqrt{m}}) . \end{array} $$

It follows that

$$\begin{array}{@{}rcl@{}} \sum\limits_{j = 0}^{M}\frac{(a\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}\sqrt{\prod\limits_{i = 1}^{j}(1+\frac{i}{m})} &=&\sum\limits_{j = 0}^{M}\frac{(a\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}e^{\frac{1}{4}\frac{j^{2}}{m}-O(\frac{|a|^{3}}{\sqrt{m}})}. \end{array} $$

(3)

Note that if X ∼ Poisson(λ) then $P(X>y)\leq \frac {Ee^{tX}}{e^{ty}}=\frac {e^{\lambda (e^{t}-1)}}{e^{ty}}$ for all t > 0, which is minimal at $t=\log (\frac {y}{\lambda })$; hence $\log (P(X>y))<{e^{y-\lambda }}/{(\frac {y}{\lambda })^{y}}$. Recall that $\frac {1}{2}<\eta <1$ and denote $\lambda =a\sqrt {m}+\frac {1}{4}\gamma a^{2}$ , $y=a\sqrt {m}+(a\sqrt {m})^{\eta }$. Thus

$$\sum\limits_{j=a\sqrt{m}+(a\sqrt{m})^{\eta}}^{M=\gamma a\sqrt{m}}\frac{(a\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}e^{\frac{1}{4}\frac{j^{2}}{m}}$$

$$\begin{array}{@{}rcl@{}} &&< e^{a\sqrt{m}+\frac{1}{4} a^{2}}e^{\frac{1}{4} a^{2}(\gamma-1)}{P}(Poisson(\lambda)>y) \\ &&\leq e^{a\sqrt{m}+\frac{1}{4} a^{2}}e^{\frac{1}{4} a^{2}(\gamma-1)}e^{\log((a\sqrt{m})^{\eta}-\frac{1}{4}\gamma a^{2})-(a\sqrt{m}+(a\sqrt{m})^{\eta})\log(\frac{a\sqrt{m}+(a\sqrt{m})^{\eta}}{a\sqrt{m}+\frac{1}{4}\gamma a^{2}})} \\ &&\leq e^{a\sqrt{m}+\frac{1}{4} a^{2}}O(e^{-(a\sqrt{m})^{\eta}}). \end{array} $$

(4)

If X ∼ Poisson(λ) then $P(X<y)\leq \frac {Ee^{-tX}}{e^{-ty}}=\frac {e^{\lambda (e^{-t}-1)}}{e^{-ty}}$ for all t > 0, which is minimal at $t=-\log (\frac {y}{\lambda })$; hence $\log (P(X<y))<{e^{y-\lambda }}/{(\frac {y}{\lambda })^{y}}$. Hence for $j<a\sqrt {m}-(a\sqrt {m})^{\eta }$

$$\sum\limits_{j = 0}^{a\sqrt{m}-(a\sqrt{m})^{\eta}}\frac{(a\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}e^{\frac{1}{4}\frac{j^{2}}{m}}$$

$$\begin{array}{@{}rcl@{}} &&<e^{a\sqrt{m}+\frac{1}{4}a^{2}} \times{{P}(Poisson}(a\sqrt{m}(1+O(\frac{1}{m})))<a\sqrt{m}-(a\sqrt{m})^{\eta}) \\ &&=e^{a\sqrt{m}+\frac{1}{4}a^{2}}O(e^{-(a\sqrt{m})^{2\eta-1}}). \end{array} $$

(5)

Finally, for $ a\sqrt {m}-(a\sqrt {m})^{\eta }\leq j\leq a\sqrt {m}+(a\sqrt {m})^{\eta }$

$$a^{2}[1-2a^{\eta-1}m^{\frac{1}{2}(\eta-1)}+a^{2(\eta-1)}m^{(\eta-1)}]\leq \frac{j^{2}}{m}\leq a^{2}[1 + 2a^{\eta-1}m^{\frac{1}{2}(\eta-1)}+a^{2(\eta-1)}m^{(\eta-1)}]. $$

Hence $\frac {j^{2}}{m}=a^{2}[1+O((a\sqrt {m})^{\eta -1})]$ and

$$ \sum\limits_{j=a\sqrt{m}-(a\sqrt{m})^{\eta}}^{ a\sqrt{m}+(a\sqrt{m})^{\eta}}\frac{(a\sqrt{m}(1+O(\frac{1}{m})))^{j}}{j!}e^{\frac{1}{4}\frac{j^{2}}{m}}=e^{a\sqrt{m}+\frac{1}{4}a^{2}[1+O((a\sqrt{m})^{\eta-1})]}. $$

(6)

Combining Equations (1)–(6) accounts for Theorem 1.

Appendix B: Computing

This is a MATLAB (2014) program for calculating

$$\frac{{\int}_{c}^{\infty} x^{m} e^{-\frac{1}{2}(x-a)^{2}}dx}{ {\int}_{0}^{\infty} x^{m} e^{-\frac{1}{2}x^{2}}dx}e^{-a \sqrt{m}} $$

Input: m > 2,a,c. Output: psi =$ \psi _{a,c}=\frac {{\int }_{c}^{\infty } x^{m} e^{-\frac {1}{2}(x-a)^{2}}dx}{ {\int }_{0}^{\infty } x^{m} e^{-\frac {1}{2}x^{2}}dx}e^{-a \sqrt {m}}$.

Remark 1

The program above is intended to cover all cases of c, includingc = 0.When n becomes large, $\log ({\Gamma }(n/2))$is calculated more precisely than Γ(n/2),so that when c≠ 0the last summands in both psi(n)and xi(n)should be completely calculated first on the log scale and only then exponentiated.

Appendix C: A Sequential Changepoint Detection Context

The need for calculating ψ_0,a(m) appears in a number of changepoint problems (cf. Pollak et al. 1993 and Krieger et al. 2003). For example, consider independent normally distributed random variables observed sequentially whose mean may (or may not) increase at an unknown time ν and one monitors the sequence to detect such a change. Formally,

$$X_{1},\ldots,X_{\nu-1} \sim {Normal}(\mu,\sigma^{2}) $$

$$X_{\nu},{\ldots} \quad \sim {Normal}(\mu+\delta \sigma, \sigma^{2}) $$

where X₁,X₂,… are independent. Consider the case that μ and σ are unknown and one considers an increase of δ standard deviations to be of import. Define:

$$Y_{i}=(X_{i}-\bar{X}_{i-1})\sqrt{\frac{i-1}{i}} $$

$$Z_{i}=\frac{Y_{i}}{|Y_{2}|} . $$

The sequence {Z_i} is invariant with respect to the unknown parameters μ and σ. Therefore, the likelihood ratio

$${{\Lambda}_{k}^{n}}=\frac{f_{\nu=k}(Z_{2},\ldots,Z_{n})}{f_{\nu=\infty}(Z_{2},\ldots,Z_{n})} $$

of the first n − 1 invariant statistics Z₂,…,Z_n (for a change occurring at the ν = k^th observation vs. no change ever occurring) can be calculated. Once done, the Shiryaev–Roberts statistic

$$R_{n}=\sum\limits_{k = 1}^{n}{{\Lambda}_{k}^{n}} $$

can be used to declare that a change is in effect; an alarm would be raised if R_n crosses a pre-specified threshold A.

Since one cannot differentiate between the case that there is no change ever and the case that a change occurred at the very beginning, necessarily ${{\Lambda }_{1}^{n}}= 1$. Clearly,

$${{\Lambda}_{2}^{2}}(\delta)=\{{\Phi}(\frac{\delta}{\sqrt{2}})I(Z_{2}= 1)+[1-{\Phi}(\frac{\delta}{\sqrt{2}})]I(Z_{2}=-1)\}/\frac{1}{2} $$

$$= 1+[2{\Phi} (\frac{\delta}{\sqrt{2}})-1]\text{sign}(Z_{2}) . $$

Letting

$$U_{k}(n)=(k-1) \frac{\sum\limits_{i=k}^{n} \frac{Z_{i}}{\sqrt{i(i-1)}}}{\sqrt{\sum\limits_{i = 2}^{n} {Z_{i}^{2}}}}=(k-1) \frac{\sum\limits_{i=k}^{n} \frac{Y_{i}}{\sqrt{i(i-1)}}}{\sqrt{\sum\limits_{i = 2}^{n} {Y_{i}^{2}}}} $$

a lengthy calculation (similar to that in Pollak et al. 1993) yields for 2 ≤ k ≤ n

$${{\Lambda}_{k}^{n}}(\delta)= \{{{\Lambda}_{2}^{2}}I\{k = 2\}+I\{k \neq 2\}\}e^{-\frac{1}{2}(k-1)^{2} \delta^{2} [\frac{1}{k-1}-\frac{1}{n}]}e^{\frac{1}{2}\delta^{2} {U_{k}^{2}}(n)} \frac{g_{0,\delta U_{k}(n)}(n-2)}{g_{0,0}(n-2)} . $$

(Note that $\frac {g_{0,a}(m)}{g_{0,0}(m)}=\psi _{0,a}(m)e^{a\sqrt {m}}$. For the sake of precision, it is advisable to first calculate the log of the components of ${{\Lambda }_{k}^{n}}(\delta )$ and then exponentiate their sum.)

As an example where m can be very large, we consider the list of weights at birth of 196710 babies born at a large hospital in Israel between 13/10/2004 and 18/5/2017 whose weight at birth was between 2000 and 5000 grams. Figure 1 is a histogram of the data. A normal distribution seems to fit the data well.

During the last decades, worldwide, obesity and macrosomia have been on the rise. It would have been reasonable to monitor the weight of newborn infants for an increase in mean. Since a rise could be gradual, it would be reasonable to start by trying to detect a small change — for example’s sake, we choose δ = 0.1, an increase of one tenth of a standard deviation (ca. 46 grams). Figure 2 presents the sequence of surveillance statistics R_n for the first 5000 observations and Fig. 3 for the first 65000 observations.

The choice of the threshold A is made in light of the risk one is willing to take regarding a false alarm. Suppose one were willing to tolerate a false alarm on the average once in 20 years. With an average of roughly 15000 babies born each year, this would mean a false alarm on the average once in 300000 observations. Using a renewal-theoretic approximation for the ARL2FA (Pollak 1987), this means that the cutoff level A should be 300000/1.06 ∼ 283000. It is clear from Figs. 2 and 3 that such a change would not been detected within the first 65000 observations.

Figure 4 presents the sequence of surveillance statistics R_n for the entire sequence. Clearly, R_n exceeds A = 283000 a short while after the 70000^th observation. In fact,the 70440^th is the first to carry R_n over the threshold. This means that it would have been declared after the 70440^th newborn that the mean weight has increased. Figure 5 depicts the loglikelihood function $\log {\Lambda }^{70440}_{k}(\delta = 0.1)$, k = 1,…, 70440. This function attains its maximum at k = 65834, so 65834 can be regarded at the time of stopping (70400) as the maximum likelihood estimate of the changepoint. This could be interpreted as the increase being detected 4606 observations (ca. 4 months) after its occurrence.

In fact, the average weight of the first 65833 newborns is 3286 grams, whereas the average weight of newborns #65834 − #67000 is 3332 grams (an increase of approximately 0.1 standard deviations).

Continuing with the ensuing newborns, R_n drops again, and until the end of the sequence does not cross the A = 283000 level again. In fact, the average weight of the newborns from babies #67001 − 196710 is 3291 grams. So, it seems as if the increase was either temporary, or apparent. Was the alarm false? After observing 67000 newborns, it would have seemed that the increase was real. With hindsight, one may either believe that the increase was real, but a change (a decrease) took place thereafter and the mean weight reverted to its original level, or one may interpret the episode as having been a false alarm.

The calculation of R_n was done in the following way. R₁,…,R₅₀₀₀ were calculated using the recursions of Section ??; ensuing R_n^′s were calculated by applying Theorem 1. Rather than calculating $\frac {g_{0,\delta U_{k}(n)}(n-2)}{g_{0,0}(n-2)}$ by the recursions separately for each δU_k(n), calculation time drops immensely by creating a fine enough grid of $\frac {g_{0,x}(n)}{g_{0,0}(n)}$ and interpolating. Creation of such a grid (in our example for n = 1,…, 5000 and approximately 500 values of x, spaced so that between adjacent x’s the function $\frac {g_{0,x}(n)}{g_{0,0}(n)}$ is almost perfectly linear). This grid is calculated much faster by the recursions of Section ?? than by the method of Section ?? (after all, the recursion that generated $\frac {g_{0,x}(5000)}{g_{0,0}(5000)}$ created all of $\frac {g_{0,x}(n)}{g_{0,0}(n)}$, n = 1,…, 5000 along the way, whereas by the method of Section ?? each $\frac {g_{0,x}(n)}{g_{0,0}(n)}$ has to be calculated separately.)

Appendix D: Remarks

Remark 1

A lower bound can be obtained by Jensen’s inequality:

$$\begin{array}{@{}rcl@{}} \log(e^{\frac{1}{2}a^{2}}\frac{{\int}_{0}^{\infty}x^{m} e^{-\frac{1}{2}(x-a)^{2}}dx}{{\int}_{0}^{\infty}x^{m} e^{-\frac{1}{2}x^{2}}dx}) &=&\log({\int}_{0}^{\infty}e^{ax}\frac{x^{m}e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty}x^{m}e^{-\frac{1}{2}x^{2}}dx}) \\ &\geq& {\int}_{0}^{\infty}ax\frac{x^{m}e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty}x^{m}e^{-\frac{1}{2}x^{2}}dx} \\ &=&a\frac{{\int}_{0}^{\infty}x^{m + 1}e^{-\frac{1}{2}x^{2}}dx}{{\int}_{0}^{\infty}x^{m}e^{-\frac{1}{2}x^{2}}dx} \\ &=&a\frac{2^{\frac{1}{2}m}{\Gamma}(\frac{m + 2}{2})} {2^{\frac{1}{2}(m-1)}{\Gamma}(\frac{m + 1}{2})} =a\frac{\sqrt{2}{\Gamma}(\frac{m + 2}{2})} {{\Gamma}(\frac{m + 1}{2})} \\ &=&a\sqrt{m}(1+O(\frac{1}{m^{2}})) \quad . \end{array} $$

Remark 2

Cook (2010) presented an upper bound

$$ {\int}_{c}^{\infty}x^{m}e^{-\frac{1}{2}(x-a)^{2}}dx \leq \frac{\pi}{2\sqrt{2}}e^{-c^{2}-m+\frac{ma}{c}+\frac{m^{2}}{2c^{2}}}\frac{c^{m}}{c+\sqrt{c^{2}+\frac{4}{\pi}}} $$

(7)

for c > 0.On the log scale, Cook’s upper bound has an asymptotic($m \rightarrow \infty $) order of magnitudem², whereas Theorem 1 positsan order of magnitude $m\log {m}$.

Remark 3

A similar type of analysis can be done for${\int }_{-\infty }^{\infty }x^{m}e^{-\frac {1}{2}(x-a)^{2}}dx$(cf. Pollak et al. 1993). Note that

$$\begin{array}{@{}rcl@{}} {\int}_{-\infty}^{\infty}x^{m}e^{-\frac{1}{2}(x-a)^{2}}dx &=&{\int}_{-\infty}^{0}x^{m}e^{-\frac{1}{2}(x-a)^{2}}dx+{\int}_{0}^{\infty}x^{m}e^{-\frac{1}{2}(x-a)^{2}}dx \\ &=&(-1)^{m}{\int}_{0}^{\infty}x^{m}e^{-\frac{1}{2}(x+a)^{2}}dx+{\int}_{0}^{\infty}x^{m}e^{-\frac{1}{2}(x-a)^{2}}dx \end{array} $$

and (unless m = 0)that (depending on a) one of the two integrals is asymptotically negligible with respectto the other. Thus Theorem 1 can be applied to obtain an approximation to them^thmoment of a (non-truncated) normal distribution.

Remark 4

A similar type of analysis can be done when the truncation is from both ends. To see this, it sufficesto consider ${{\int }_{0}^{c}} x^{m}e^{-\frac {1}{2}(x-a)^{2}}dx$for c > 0.Clearly, $\frac {{{\int }_{0}^{c}} x^{m}e^{-\frac {1}{2}(x-a)^{2}}dx}{{{\int }_{0}^{c}} x^{m}dx}\rightarrow e^{-\frac {1}{2}(c-a)^{2}}$as$m \rightarrow \infty $. It followsthat

$${{\int}_{0}^{c}} x^{m}e^{-\frac{1}{2}(x-a)^{2}}dx=e^{-\frac{1}{2}(c-a)^{2}}\frac{c^{m + 1}}{m + 1}(1+o(1)) . $$

Without loss of generality,assume that c > |b|. A recursionfor ${{\int }_{b}^{c}} x^{m}e^{-\frac {1}{2}(x-a)^{2}}dx$that builds on thisis

$$h_{a,b,c}(m)=\frac{m + 1}{c^{2}}[h_{a,b,c}(m-2)+\frac{ac}{m}h_{a,b,c}(m-1)+(\frac{b}{c})^{m-1}e^{-\frac{1}{2}(b-a)^{2}}-e^{-\frac{1}{2}(c-a)^{2}}] $$

where$h_{a,b,c}(m)=\frac {m + 1}{c^{m + 1}}{{\int }_{b}^{c}} x^{m}e^{-\frac {1}{2}(x-a)^{2}}dx$. (Notethat for |b| > c > b,${{\int }_{b}^{c}} x^{m}e^{-\frac {1}{2}(x-a)^{2}}dx= (-1)^{m} {\int }_{-c}^{-b} x^{m}e^{-\frac {1}{2}(x+a)^{2}}dx$.)

Remark 5

A similar recursion can be constructed for densities proportional to$e^{{const} \times x^{2}}I(b<x<c)$whereconst > 0andb,c are finite.

Remark 6

Tables relevant to the truncated normal distribution have been around for over a century(cf. Pearson and Lee 1908and Lee 1914). Even today it is considered as part and parcelof applied distributions (cf. O’Connor 2011) and papers have been devoted to estimation ofits parameters (cf. Barr and Sherrill 1999; Horrace 2015; Liquet and Nazarathy 2015). Formost practical purposes, the first four moments of the distribution (mean, variance, skewnessand kurtosis) have been of applied interest. Higher order moments may appear in quadraturemethods (cf. Burkardt 2014); their order of magnitude would seem to be in the tens at most.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Pollak, M., Shauly-Aharonov, M. A Double Recursion for Calculating Moments of the Truncated Normal Distribution and its Connection to Change Detection. Methodol Comput Appl Probab 21, 889–906 (2019). https://doi.org/10.1007/s11009-018-9622-7

Download citation

Received: 07 August 2016
Revised: 15 February 2018
Accepted: 18 February 2018
Published: 02 March 2018
Issue Date: 15 September 2019
DOI: https://doi.org/10.1007/s11009-018-9622-7

Keywords

Mathematics Subject Classification (2010)

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A Double Recursion for Calculating Moments of the Truncated Normal Distribution and its Connection to Change Detection

Abstract

Article PDF

Similar content being viewed by others

Practical Aspects of False Alarm Control for Change Point Detection: Beyond Average Run Length

The distribution of the maximum likelihood estimates of the change point and their relation to random walks

A method for estimating the power of moments

References

Acknowledgements