1 Introduction and summary

Some years ago, Cacoullos (Personal Communication), considering discretization of well-known continuous distributions, introduced a (standard) discrete Cauchy random variable (r.v.) X with probability mass function (p.m.f.)

$$\begin{aligned} \Pr (X=k)=\frac{1/\pi _0}{1+k^2}, \ \ \ k\in {Z}, \end{aligned}$$
(1)

by the obvious substitution \(k\in {Z}\) for \(x\in {R}\) in the standard Cauchy density

$$\begin{aligned} f(x)=\frac{1/\pi }{1+x^2}, \ \ x\in {R}. \end{aligned}$$
(2)

Cacoullos immediately raised two natural questions:

(A) While it is expected to be very close to \(\pi \), what is the exact value of the normalizing constant \(\pi _0\) in (1)?

(B) While the characteristic function (ch.f.) of (2) is \(\phi (t)=e^{-|t|}\), what is the corresponding one, say \(\phi _1\), of (1)?

We provide explicit answers in Sect. 2. It is well known that the (continuous) Cauchy distribution appears naturally in statistics and probability. At this point, it should be noted that, although the standard Cauchy r.v. is customarily defined as the ratio of two independent standard normal r.v.’s, or as the tangent of a randomly chosen angle in \([0,2\pi )\), it has recently been shown [1, 6, 7] that the ratio representation still holds if \((X,Y)\) follows any bivariate spherically symmetric distribution.

In [2], Cacoullos showed that if \(X=(X_1,\ldots ,X_p)'\) (\(p\ge 3\)) is spherically symmetrically distributed around zero, then all polar angle tangent vectors follow a multivariate Cauchy; note that, e.g., Feller [4] defines the symmetric bivariate and trivariate Cauchy distributions directly through their densities—not as tangent vectors.

In contrast to (2) and its location-scale extension, for which several applications are known both in probability and statistics, for (1) we have been able to find only a few results, related to stochastic processes; see, e.g., [14,  p. 383]. However, the asymptotic distribution of the sample means for (1), Theorem 4, may serve as a starting point for applications, and the same appears to be true of the Cauchy-Cacoullos family defined by (4). These considerations are, however, beyond the scope of the present note.

In Sect. 3, we introduce a novel family of integer-valued distributions, the Cauchy-Cacoullos family, sharing similar properties; see Definition 1 and Remark 2. In particular, any distribution in this family has a simple characteristic function that can be written down explicitly, Theorem 2, and the same is valid for the discrete Student-type distributions of Remark 2. Basic inference properties for this family are included in Theorem 3, while some distributional properties are discussed in some detail in Sect. 4; see Theorems 4–6. We hope that the proposed simple formulae will enlarge the applicability of the discrete Cauchy distribution in the future.

2 The characteristic function

Since \(\phi _1(t)=\mathrm{E} e^{itX}=\mathrm{E}\cos (tX)+i\mathrm{E}\sin (tX)\) (i denotes the imaginary unit) and X is symmetrically distributed around the origin (hence, \(\mathrm{E}\sin (tX)=0\)), both questions, (A), (B), will be answered if we manage to calculate in a closed form the function \(g:{R}\rightarrow {R}\), defined by the Fourier series

$$\begin{aligned} g(t):=\sum _{n=0}^{\infty } \frac{\cos (n t)}{1+n^2}, \ \ \ \ t\in {R}. \end{aligned}$$
(3)

Therefore, the problem is to identify which function g is represented as a series of cosines with Fourier coefficients as in (3). Clearly, g is periodic with period \(2\pi \). Thus, it suffices to restrict our attention to t-values in the interval \(-\pi \le t\le \pi \). On the other hand, since a cosine Fourier series corresponds to an even function, we may further restrict the t-values into the interval \(0\le t\le \pi \).

The key lemma is:

Lemma 1

For \(-2\pi \le t\le 2\pi \),

$$\begin{aligned} g(t)=\frac{1}{2}+\frac{\pi \cosh (\pi -|t|)}{2\sinh (\pi )}. \end{aligned}$$

We omit the proof because we shall show a more general result in Sect. 3.

Corollary 1

The normalizing constant \(\pi _0\) is given by

$$\begin{aligned} \pi _0=2g(0)-1=\frac{\pi \cosh (\pi )}{\sinh (\pi )} =\pi \left( 1+\frac{2}{e^{2\pi }-1}\right) \simeq 3.15334809493716\ldots \ . \end{aligned}$$

The formula for the ch.f. is an immediate consequence of Lemma 1 and (3):

Theorem 1

The ch.f. of X is given by \( \phi _1(t)=\cosh (\pi -|t|)/\cosh (\pi )\), \(-2\pi \le t \le 2\pi \), and it is periodic with period \(2\pi \) (see Fig. 1).

Fig. 1 The characteristic function \(\phi _1(t)\) in the interval \(-2\pi \le t\le 4\pi \)
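Corollary 1 and Theorem 1 can be checked numerically. The following Python sketch compares the closed forms with truncated versions of the series; the function names and the truncation level `n_terms` are illustrative choices and not part of the original text, and the truncation leaves an error of order `1/n_terms`.

```python
import math

def discrete_cauchy_constant():
    """Closed form pi_0 = pi * cosh(pi) / sinh(pi) from Corollary 1."""
    return math.pi / math.tanh(math.pi)

def chf_closed_form(t):
    """Theorem 1: cosh(pi - |t|) / cosh(pi), valid for -2*pi <= t <= 2*pi."""
    return math.cosh(math.pi - abs(t)) / math.cosh(math.pi)

def chf_partial_sum(t, n_terms=100_000):
    """Truncated E cos(tX) = (1/pi_0) * (1 + 2 * sum_{n>=1} cos(n t) / (1 + n^2))."""
    s = sum(math.cos(n * t) / (1.0 + n * n) for n in range(1, n_terms + 1))
    return (1.0 + 2.0 * s) / discrete_cauchy_constant()

if __name__ == "__main__":
    print("pi_0 =", discrete_cauchy_constant())
    for t in (0.5, 2.0, math.pi, 5.0):
        print(t, chf_closed_form(t), chf_partial_sum(t))
```

The two columns printed for each `t` should agree up to the truncation error.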

3 The Cauchy-Cacoullos family of discrete distributions

If we multiply a continuous Cauchy r.v. by a constant \(\lambda >0\) we stay in the same family of distributions—the Cauchy scale family. More precisely, if X is standard Cauchy, the density of \(\lambda X\) is given by

$$\begin{aligned} f(x)= \frac{1}{\pi }\ \frac{\lambda }{\lambda ^2+x^2}, \ \ \ x\in {R}, \ \ \lambda >0. \end{aligned}$$

However, this is no longer true for a discrete Cauchy X, since the support of \(\lambda X\) is not the set of integers. Motivated by this observation, we define a family of discrete integer-valued distributions as follows:

Definition 1

The discrete Cauchy-Cacoullos family (\(\mathcal{CC}\), for short) contains the p.m.f.’s

$$\begin{aligned} f_{\lambda }(k)=\frac{\tanh (\lambda \pi )}{\pi } \ \frac{\lambda }{\lambda ^2+k^2}, \ \ \ \ k\in {Z}, \ \ \lambda >0. \end{aligned}$$
(4)

For completeness of the presentation, it is convenient to include the limiting case \(\lambda =0\), which corresponds to a degenerate r.v. at zero.

Although this family has several interesting properties, similar to the Cauchy, it does not seem to have been studied elsewhere. Clearly, for \(\lambda =1\) we get (1). At first glance, it is not entirely obvious that the normalizing constant is as in (4). This is a by-product of the following result.

Lemma 2

For \(-\pi \le t\le \pi \) and \(\lambda >0\),

$$\begin{aligned} \cosh (\lambda t)=\frac{\lambda \sinh (\lambda \pi )}{\pi } \left\{ \frac{1}{\lambda ^2}+2\sum _{n=1}^\infty \frac{(-1)^n\cos (nt)}{\lambda ^2+n^2} \right\} . \end{aligned}$$

Proof

We express the even function \(h(t)=\cosh (\lambda t)\) in a cosine Fourier series to get \(h(t)\sim \sum _{n=0}^\infty \alpha _n \cos (nt)\). Simple calculations show that

$$\begin{aligned} \alpha _0=\frac{1}{2\pi }\int _{-\pi }^\pi \cosh (\lambda u)\mathrm{d}u= \frac{\sinh (\lambda \pi )}{\pi \lambda } \end{aligned}$$

and

$$\begin{aligned} \alpha _n=\frac{1}{\pi }\int _{-\pi }^\pi \cosh (\lambda u)\cos (n u)\mathrm{d}u = (-1)^n \frac{2\lambda \sinh (\lambda \pi )}{\pi (\lambda ^2+n^2)}, \ \ n=1,2,\ldots \ . \end{aligned}$$

Since h is differentiable in \([-\pi ,\pi ]\) with \(h(-\pi )=h(\pi )\), the lemma is proved (and the series converges uniformly to h). \(\square \)

If we set \(\lambda =1\) and \(t\rightarrow t-\pi \) in Lemma 2, we obtain Lemma 1 with g as in (3).

Corollary 2

We have

$$\begin{aligned} \sum _{k=-\infty }^\infty \frac{1}{\lambda ^2+k^2} =\frac{\pi }{\lambda \tanh (\lambda \pi )}, \end{aligned}$$

and hence, (4) defines a p.m.f. for any \(\lambda >0\).

Proof

Substitute \(t=\pi \) in Lemma 2. \(\square \)

As for the case \(\lambda =1\), we can obtain the ch.f. of \(X_\lambda \sim f_{\lambda }\) in a closed form.

Theorem 2

The ch.f. of \(X_\lambda \) with p.m.f. \(f_\lambda \in \mathcal{CC}\) is given by

$$\begin{aligned} \phi _\lambda (t)=\frac{\cosh \big (\lambda (t-\pi )\big )}{\cosh (\lambda \pi )}, \ \ \ 0\le t \le 2\pi , \end{aligned}$$

and it is periodic with period \(2\pi \). More precisely,

$$\begin{aligned} \phi _\lambda (t)=\frac{\cosh \left( \lambda \Big (t-2\pi \lfloor \frac{t}{2\pi }\rfloor -\pi \Big )\right) }{\cosh (\lambda \pi )}, \ \ -\infty<t<\infty , \end{aligned}$$

where \(\lfloor x\rfloor \) denotes the integer part of x, that is, the greatest integer not exceeding x.

Proof

As is well known, all integer-valued r.v.’s have periodic ch.f.’s, with period \(2\pi \). The particular r.v. is symmetrically distributed around zero, and thus, its ch.f. is real and even, so that \(\phi _\lambda (t)=\mathrm{E}\cos (t X_\lambda )\). To calculate this, we may restrict our attention to the interval \(0\le t\le 2\pi \). Then, since \(-\pi \le t-\pi \le \pi \) and \(\cos (nt)=(-1)^n \cos (n (t-\pi ))\),

$$\begin{aligned} \phi _{\lambda }(t)= & {} \frac{\lambda \tanh (\lambda \pi )}{\pi } \left\{ \frac{1}{\lambda ^2}+2 \sum _{n=1}^\infty \frac{(-1)^n \cos (n (t-\pi ))}{\lambda ^2+n^2}\right\} \\= & {} \frac{\lambda \tanh (\lambda \pi )}{\pi } \ \frac{\pi \cosh (\lambda (t-\pi ))}{\lambda \sinh (\lambda \pi )}, \end{aligned}$$

where the second equality follows from Lemma 2. \(\square \)
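As an illustration of Theorem 2, the sketch below evaluates the periodic closed form of \(\phi _\lambda \) and compares it with a truncated version of \(\mathrm{E}\cos (tX_\lambda )\). The names `cc_pmf`, `cc_chf` and `cc_chf_series` and the truncation level are assumptions made for this example only.

```python
import math

def cc_pmf(k, lam):
    """P.m.f. (4) of the Cauchy-Cacoullos family, for lam > 0."""
    return math.tanh(lam * math.pi) / math.pi * lam / (lam * lam + k * k)

def cc_chf(t, lam):
    """Theorem 2: closed-form ch.f., extended 2*pi-periodically to all real t."""
    # Reduce t to [0, 2*pi) using the floor, as in the second display of Theorem 2.
    r = t - 2.0 * math.pi * math.floor(t / (2.0 * math.pi))
    return math.cosh(lam * (r - math.pi)) / math.cosh(lam * math.pi)

def cc_chf_series(t, lam, n_terms=100_000):
    """Truncated E cos(t X) = f(0) + 2 * sum_{k>=1} f(k) cos(k t)."""
    s = sum(cc_pmf(k, lam) * math.cos(k * t) for k in range(1, n_terms + 1))
    return cc_pmf(0, lam) + 2.0 * s

if __name__ == "__main__":
    for lam in (0.3, 1.0, 2.5):
        for t in (-1.0, 2.0, 7.5):
            print(lam, t, cc_chf(t, lam), cc_chf_series(t, lam))
```

Because the tail of the p.m.f. decays like \(k^{-2}\), the truncated series is accurate only to order `1/n_terms`; the comparison is a sanity check, not a proof.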

Statistical inference for the parameter \(\lambda \) is facilitated by the fact that the p.m.f.’s and the ch.f.’s in \(\mathcal{CC}\) have tractable forms.

Theorem 3

Consider a random sample \(X_1,\ldots ,X_n\sim f_\lambda \in \mathcal{CC}\) with \(\lambda >0\) unknown.

(i) The minimal sufficient statistic is \(T=(Y_1,\ldots ,Y_n)\), with \(Y_1\le Y_2\le \cdots \le Y_n\) being the order statistics of \(|X_1|,\ldots ,|X_n|\).

(ii) The Fisher Information (of a single observation) is

$$\begin{aligned} I(\lambda )=\frac{1}{2\lambda ^2}+\frac{\pi }{\lambda } w(\lambda ) \ \ \text{ where } \ \ w(\lambda )=\frac{\lambda \pi }{\cosh (\lambda \pi )^2} -\frac{1}{\sinh (2\lambda \pi )}. \end{aligned}$$
(5)

(iii) The MLE \(\widehat{\lambda }_n\) of \(\lambda \) is unique; it is given as the unique solution in \([0,\infty )\) of the equation

$$\begin{aligned} \frac{\pi \lambda }{\sinh (2\pi \lambda )}+\frac{1}{n}\sum _{i=1}^n\frac{X_i^2}{\lambda ^2+X_i^2}=\frac{1}{2}. \end{aligned}$$
(6)

(iv) The MLE is consistent and asymptotically efficient,

$$\begin{aligned} \sqrt{n}\left( \widehat{\lambda }_n-\lambda \right) {\mathop {\rightarrow }\limits ^{d}} N(0,1/I(\lambda )), \end{aligned}$$

where \({\mathop {\rightarrow }\limits ^{d}}\) denotes weak convergence.
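The Fisher information formula (5) can be compared numerically with the second moment of the score \(\partial \log f_\lambda (k)/\partial \lambda =1/\lambda +2\pi /\sinh (2\pi \lambda )-2\lambda /(\lambda ^2+k^2)\). The sketch below is only a sanity check under a truncation of the series at `n_terms` (an assumed, illustrative parameter); the function names are hypothetical.

```python
import math

def fisher_info_closed(lam):
    """Formula (5): I(lam) = 1/(2 lam^2) + (pi/lam) * w(lam)."""
    x = lam * math.pi
    w = x / math.cosh(x) ** 2 - 1.0 / math.sinh(2.0 * x)
    return 1.0 / (2.0 * lam * lam) + math.pi / lam * w

def score(k, lam):
    """Derivative of log f_lam(k) with respect to lam."""
    return (1.0 / lam
            + 2.0 * math.pi / math.sinh(2.0 * math.pi * lam)
            - 2.0 * lam / (lam * lam + k * k))

def fisher_info_series(lam, n_terms=100_000):
    """Truncated E[S(X; lam)^2], using the symmetry f(k) = f(-k)."""
    c = lam * math.tanh(lam * math.pi) / math.pi  # normalizing factor of (4)
    total = c / (lam * lam) * score(0, lam) ** 2
    total += 2.0 * sum(c / (lam * lam + k * k) * score(k, lam) ** 2
                       for k in range(1, n_terms + 1))
    return total

if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0):
        print(lam, fisher_info_closed(lam), fisher_info_series(lam))
```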

Proof

Let \(\mathbf{x}=(x_1,\ldots ,x_n)\) and \(\mathbf{y}=(y_1,\ldots ,y_n)\) be two vectors in \({Z}^n\). Then, the likelihood ratio is given by

$$\begin{aligned} \frac{L(\mathbf{x};\lambda )}{L(\mathbf{y};\lambda )}=\prod _{i=1}^n \frac{\lambda ^2+y_i^2}{\lambda ^2+x_i^2}, \end{aligned}$$

and it has the same form as in the continuous Cauchy scale-family. Obviously, this ratio is independent of \(\lambda >0\) if and only if the ordered squared values of \(\mathbf{x}\) and \(\mathbf{y}\) are identical, and this verifies (i). Now, a straightforward computation yields the score function

$$\begin{aligned} S(k;\lambda ):=\frac{\partial }{\partial \lambda } \log f_\lambda (k)=\frac{1}{\lambda } +\frac{2\pi }{\sinh (2\pi \lambda )}-\frac{2\lambda }{\lambda ^2+k^2}. \end{aligned}$$

Let \(X_\lambda \sim f_\lambda \). Using Remarks 1, 2 below, it is seen that \(\mathrm{E}S(X_\lambda ;\lambda )=0\) and \(\mathrm{E}S(X_\lambda ;\lambda )^2=I(\lambda )\) with \(I(\lambda )\) as in (5). Note that \(I(\lambda )=-\mathrm{E}\frac{\partial ^2}{\partial \lambda ^2} \log f_\lambda (X_\lambda )\), since the regularity conditions are obviously fulfilled; both formulae require computation of the series \(\sum _n (\lambda ^2+n^2)^{-s}\), \(s=1,2,3\). Moreover, one can easily verify that the derivative of the log-likelihood is given by

$$\begin{aligned} \frac{\partial }{\partial \lambda }\log L(\mathbf{x};\lambda )=\frac{2n}{\lambda } \left( \frac{\pi \lambda }{\sinh (2\pi \lambda )}-\frac{1}{2}+\frac{1}{n}\sum _{i=1}^n \frac{x_i^2}{\lambda ^2+x_i^2}\right) . \end{aligned}$$
(7)

For fixed \(\mathbf{x}\in {Z}^n\), the positive function \(u(\lambda ):=\pi \lambda /\sinh (2\pi \lambda )+n^{-1} \sum _{i=1}^n x_i^2/(\lambda ^2+x_i^2)\) decreases to zero as \(\lambda \rightarrow \infty \) and has a limit \(u(0+)\ge 1/2\) (it equals 1/2 iff \(\mathbf{x}=\mathbf{0}\)). Since u is strictly decreasing and continuous, the likelihood is first increasing and then decreasing, reaching its global maximum at \(\lambda _0\), where \(u(\lambda _0)=1/2\). This shows that the MLE is the unique solution of (6); it equals 0 iff \(\mathbf{X}=\mathbf{0}\), and it is otherwise positive. Finally, in order to prove (iv), fix a true parameter value \(\lambda =\lambda _0>0\) and \(c\in (0,\lambda _0)\), and assume that \(\lambda \) varies in the interval \((\lambda _0-c,\lambda _0+c)\). Then, \(\frac{\partial ^3}{\partial \lambda ^3}\log f_{\lambda }(k)=A(\lambda )+B(\lambda ,k)\) where

$$\begin{aligned} A(\lambda )= 4\pi ^3\ \frac{3+\cosh (4\lambda \pi )}{\sinh (2\lambda \pi )^3}+\frac{2}{\lambda ^3}, \ \ B(\lambda ,k)=4\lambda \frac{3k^2-\lambda ^2}{(\lambda ^2+k^2)^3}. \end{aligned}$$

The function A is decreasing and positive, so that \(|A(\lambda )|<A(\lambda _0-c)\). Moreover,

$$\begin{aligned} |B(\lambda ,k)|<4\lambda \frac{3k^2+3\lambda ^2}{(\lambda ^2+k^2)^3} <\frac{12(\lambda _0+c)}{((\lambda _0-c)^2+k^2)^2}\le \frac{12(\lambda _0+c)}{(\lambda _0-c)^4}. \end{aligned}$$

It follows that we can find a finite constant \(M=M(\lambda _0,c)\) such that \(|\frac{\partial ^3}{\partial \lambda ^3}\log f_{\lambda }(k)|<M\) uniformly in \(k\in {Z}\), \(\lambda \in (\lambda _0-c,\lambda _0+c)\), and the result follows by applying Theorem 3.10 in [11]. \(\square \)
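Since the left-hand side of (6) is strictly decreasing in \(\lambda \), the likelihood equation can be solved by simple bisection. The following is a minimal sketch of one possible numerical procedure, not a method prescribed by the text; the function names, the bracketing strategy and the tolerance are assumptions.

```python
import math

def _x_over_sinh(x):
    """x / sinh(x) for x >= 0, guarded at both ends of the range."""
    if x < 1e-8:
        return 1.0          # limit as x -> 0
    if x > 700.0:
        return 0.0          # sinh would overflow; the ratio is negligible
    return x / math.sinh(x)

def likelihood_lhs(lam, xs):
    """Left-hand side u(lam) of equation (6), for lam > 0."""
    first = 0.5 * _x_over_sinh(2.0 * math.pi * lam)  # = pi*lam / sinh(2*pi*lam)
    return first + sum(x * x / (lam * lam + x * x) for x in xs) / len(xs)

def mle_lambda(xs, tol=1e-12):
    """Bisection solution of u(lam) = 1/2; returns 0.0 when all observations are zero."""
    if all(x == 0 for x in xs):
        return 0.0
    lo, hi = 0.0, 1.0       # u(0+) > 1/2, so lo = 0 is a valid bracket end (never evaluated)
    while likelihood_lhs(hi, xs) > 0.5:   # u decreases to 0, so this loop terminates
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if likelihood_lhs(mid, xs) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    sample = [0, 1, -1, 2, 0, -3, 5, 0, 1, -2]   # made-up data, for illustration only
    print(mle_lambda(sample))
```

Bisection is chosen here for robustness; a Newton iteration on (7) would converge faster but needs a safeguard near \(\lambda =0\).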

Unfortunately, the MLE does not admit a closed form and, hence, numerical procedures should be employed. On the other hand, we can construct closed-form consistent estimators, due to the fact that the ch.f. admits a simple form. For example, \(\phi _{\lambda }(\pi )=1/\cosh (\lambda \pi )=\beta \), say, equals the difference \(\Pr (X_\lambda \ \text{ even})-\Pr (X_\lambda \ \text{ odd})\). This can be consistently and unbiasedly estimated by \(\widehat{\beta }_n=n^{-1}\sum _{i=1}^n (-1)^{X_{i}}\), and a trivial application of the CLT leads to \(\sqrt{n}(\widehat{\beta }_n-\beta ){\mathop {\rightarrow }\limits ^{d}}N(0,1-\beta ^2)\), while the SLLN shows that \(\widehat{\beta }_n\) is eventually positive w.p. 1. Applying the delta-method (see [16]) with \(g(\beta )=\pi ^{-1}\left( \log (1 + \sqrt{1 - \beta ^2})-\log (\beta )\right) \), so that \(g(\beta )=\lambda \), we obtain

$$\begin{aligned} \sqrt{n}\left( g(\widehat{\beta }_n)-\lambda \right) {\mathop {\rightarrow }\limits ^{d}}N\Big (0,\cosh (\pi \lambda )^2/\pi ^2\Big ). \end{aligned}$$

However, compared to the MLE, the closed-form estimator \(g(\widehat{\beta }_n)\) is far less efficient. Thus, it is natural to seek closed-form highly efficient estimators, and this may be possible as in the continuous case. In the continuous case, it is shown that the asymptotic relative efficiency of the geometric mean of the absolute values of the observations is \(8/\pi ^2\simeq 81\%\), and in [10] a more efficient closed-form estimate is proposed. Also, highly efficient estimators that are based on the ch.f. may be obtained by adapting the methodology of [9] to the present discrete case. However, such results are beyond the scope of the present note. Note that the Fisher information in the continuous Cauchy scale family equals \(1/(2\lambda ^2)\) (compare to (5)), and the likelihood equation is as in (7), without the term \(\pi \lambda /\sinh (2\pi \lambda )\).
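The parity-based estimator \(g(\widehat{\beta }_n)\) is straightforward to compute, since \(g(\beta )=\pi ^{-1}\,\mathrm{arccosh}(1/\beta )\). The sketch below is illustrative: the function name is hypothetical, and returning `None` when \(\widehat{\beta }_n\le 0\) is a convention chosen for this example (the text only guarantees that \(\widehat{\beta }_n\) is eventually positive).

```python
import math

def parity_estimator(xs):
    """Closed-form estimate g(beta_hat) of lambda from integer observations xs."""
    # beta_hat is the average of (-1)^X_i, i.e. (#even - #odd) / n.
    beta_hat = sum(1 if x % 2 == 0 else -1 for x in xs) / len(xs)
    if beta_hat <= 0.0:
        return None  # g is undefined here; an assumed convention for this sketch
    # g(beta) = (log(1 + sqrt(1 - beta^2)) - log(beta)) / pi = acosh(1/beta) / pi
    return math.acosh(1.0 / beta_hat) / math.pi

if __name__ == "__main__":
    sample = [0, 1, -1, 2, 0, -3, 4, 0, 6, -2]   # made-up data, for illustration only
    print(parity_estimator(sample))
```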

Remark 1

The series in Corollary 2 is of some interest in itself, because of the computation of the sum \( \sum _{n=1}^{\infty } (\lambda ^2+n^2)^{-1} \) in a closed form. Then, e.g., taking limits as \(\lambda \searrow 0\), we arrive at the famous Euler sum, \(\sum _{n=1}^\infty n^{-2}=\pi ^2/6\). Moreover, differentiating term by term with respect to \(\lambda \), we can evaluate the series

$$\begin{aligned} \sum _{n=1}^{\infty } \frac{1}{(\lambda ^2+n^2)^2}. \end{aligned}$$

From this, taking limits as \(\lambda \searrow 0\), we arrive at the sum for \(\zeta (4)\), that is, \(\sum _{n=1}^\infty n^{-4}=\pi ^4/90\); clearly, this process can be continued to evaluate all \(\zeta (2s)\) values, as well as the series \(\sum _{n=1}^{\infty } (\lambda ^2+n^2)^{-s}\), \(s=1,2,\ldots \) .
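The closed form behind Remark 1, \(\sum _{n\ge 1}(\lambda ^2+n^2)^{-1}=\frac{1}{2}\big (\pi /(\lambda \tanh (\lambda \pi ))-\lambda ^{-2}\big )\), follows from Corollary 2 by removing the \(k=0\) term and halving. A short numerical illustration of it and of the limit \(\pi ^2/6\) is given below; the function names and the truncation level are assumptions made for this example.

```python
import math

def shifted_sum_closed(lam):
    """sum_{n>=1} 1/(lam^2 + n^2), obtained from Corollary 2."""
    return 0.5 * (math.pi / (lam * math.tanh(lam * math.pi)) - 1.0 / (lam * lam))

def shifted_sum_series(lam, n_terms=100_000):
    """Truncated version of the same series (error of order 1/n_terms)."""
    return sum(1.0 / (lam * lam + n * n) for n in range(1, n_terms + 1))

if __name__ == "__main__":
    print(shifted_sum_closed(1.0), shifted_sum_series(1.0))
    # As lam decreases to 0 the closed form approaches Euler's sum pi^2 / 6.
    print(shifted_sum_closed(1e-3), math.pi ** 2 / 6.0)
```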

Remark 2

Differentiating m times with respect to \(\lambda ^2\) the series in Lemma 2, it is possible to introduce and investigate discrete Student-type families with \(\nu =2m+1\) degrees of freedom, that is, p.m.f.’s of the form

$$\begin{aligned} f_{\nu ;\lambda }(k)= \frac{c_{\nu ;\lambda }}{(\lambda ^2+k^2)^{(\nu +1)/2}}, \ \ \ k\in {Z}, \ \ \ \nu =1,3,5,\ldots , \ \ \ \lambda >0, \end{aligned}$$
(8)

admitting closed-form ch.f.’s \(\phi _{\nu ;\lambda }(t)\) and explicit normalizing constants \(c_{\nu ;\lambda }\). However, the situation becomes quite complicated for even values of \(\nu \).

4 Some distributional properties of the \(\mathcal{CC}\) family

We observe that the ch.f. \(\phi _\lambda (t)\) is not differentiable at the points \(t=2k\pi \), \(k\in {Z}\) (cf. Fig. 1). It is known that a random variable \(Y_1\) satisfies a weak law of large numbers, that is,

$$\begin{aligned} \overline{Y}_n :=\frac{Y_1+\cdots +Y_n}{n}\rightarrow \text{ some } \text{ constant } c, \text{ in } \text{ probability }, \end{aligned}$$

if and only if its ch.f., \(\phi _{Y_1}\), is differentiable at \(t=0\); then, \(\phi _{Y_1}'(0)=i c\) where i is the imaginary unit (the problem was treated by A. Zygmund and E.J.G. Pitman, and it is closely connected to Khintchine’s weak law of large numbers; see Feller [4,  p. 528] and van der Vaart [16,  p. 15]). Hence, the distributions of the \(\mathcal{CC}\) family do not satisfy the weak law of large numbers, since their ch.f.’s are not differentiable at \(t=0\). Therefore, it is of some interest to study the asymptotic behavior of the sample means from a \(\mathcal{CC}\) random variable with p.m.f. as in (4). Recall the well-known continuous counterpart, which says that \(\overline{X}_n\) is the same Cauchy for all n (Cauchy r.v.’s are stable).

We have the following result.

Theorem 4

If \(X_1,X_2,\ldots \) are independent identically distributed random variables with p.m.f. as in (4), then

$$\begin{aligned} \overline{X}_n{\mathop {\rightarrow }\limits ^{d}} \lambda \tanh (\lambda \pi ) \ Z, \end{aligned}$$

where Z is standard (continuous) Cauchy with density (2).

Proof

Fix \(t\ge 0\).

Theorem 2 shows that the ch.f. of \(\overline{X}_n\) is given by

$$\begin{aligned} \phi _\lambda (t/n)^n= \left( \frac{\cosh \big (\lambda (\pi -t/n)\big )}{\cosh (\lambda \pi )}\right) ^n, \ n\ge \frac{t}{2\pi }. \end{aligned}$$

Using this, it is easy to verify (e.g., by taking logarithms) that \(\phi _\lambda (t/n)^n\rightarrow e^{- c t}\), \(t\ge 0\), where \(c=\lambda \tanh (\lambda \pi )\).

Finally, from the fact that \(\phi _\lambda \) is even, it follows that \(\phi _\lambda (t/n)^n\rightarrow e^{-c|t|}\) for all \(t\in {R}\), which is the ch.f. of cZ, and the result follows from the continuity theorem of characteristic functions. \(\square \)
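The convergence in Theorem 4 can be observed numerically by evaluating \(\phi _\lambda (t/n)^n\) for increasing n. The sketch below assumes \(t\ge 0\) and \(n\ge t/(2\pi )\), as in the proof; the function names are illustrative.

```python
import math

def chf_sample_mean(t, lam, n):
    """Ch.f. of the sample mean at t >= 0, valid for n >= t / (2*pi)."""
    base = math.cosh(lam * (math.pi - t / n)) / math.cosh(lam * math.pi)
    return base ** n

def chf_cauchy_limit(t, lam):
    """Limit exp(-c |t|) with c = lam * tanh(lam * pi)."""
    return math.exp(-lam * math.tanh(lam * math.pi) * abs(t))

if __name__ == "__main__":
    lam, t = 1.5, 2.0
    for n in (10, 1_000, 100_000):
        print(n, chf_sample_mean(t, lam, n), chf_cauchy_limit(t, lam))
```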

Unlike the usual Cauchy scale family, the \(\mathcal{CC}\) family is not convolution closed; however, it is “almost” closed. More precisely, the following result holds.

Theorem 5

For independent r.v.’s X, Y in \(\mathcal{CC}\) with \(X\sim f_{\lambda _1}\) and \(Y\sim f_{\lambda _2}\), the ch.f. of \(X+Y\) is given by

$$\begin{aligned} \phi _{X+Y}(t)= \frac{\alpha (\lambda _1+\lambda _2)}{2\alpha (\lambda _1) \alpha (\lambda _2)} \phi _{\lambda _1+\lambda _2}(t) +\frac{\alpha (|\lambda _2-\lambda _1|)}{2\alpha (\lambda _1) \alpha (\lambda _2)} \phi _{|\lambda _2-\lambda _1|}(t), \ \ t\in {R}, \end{aligned}$$

where \(\phi _0(t)\equiv 1\) is the ch.f. of the degenerate r.v. \(X_0\) with \(\Pr (X_0=0)=1\), and \(\alpha (\lambda ):=\cosh (\lambda \pi )\), \(\lambda \ge 0\). Consequently, \(X+Y\) is a mixture of two r.v.’s that are members of \(\mathcal{CC}\) family,

$$\begin{aligned} \Pr (X+Y=k)=\frac{\alpha (\lambda _1+\lambda _2)}{2\alpha (\lambda _1) \alpha (\lambda _2)} f_{\lambda _1+\lambda _2}(k) +\frac{\alpha (|\lambda _2-\lambda _1|)}{2\alpha (\lambda _1) \alpha (\lambda _2)} f_{|\lambda _2-\lambda _1|}(k), \ \ k\in {Z}. \end{aligned}$$

Proof

Set

$$\begin{aligned} p=\frac{\alpha (\lambda _1+\lambda _2)}{2\alpha (\lambda _1) \alpha (\lambda _2)}, \ \ \ \ q=\frac{\alpha (|\lambda _2-\lambda _1|)}{2\alpha (\lambda _1) \alpha (\lambda _2)}. \end{aligned}$$

Obviously, \(p>0\) and \(q>0\). Also, using the formula

$$\begin{aligned} \cosh (x)\cosh (y)=\frac{1}{2}\cosh (x+y)+\frac{1}{2}\cosh (y-x) \end{aligned}$$
(9)

it is easily seen that \(p+q=1\). Restricting our attention to the interval \(0\le t\le 2\pi \), we have

$$\begin{aligned} \phi _{X+Y}(t)=\phi _{\lambda _1}(t)\phi _{\lambda _2}(t) =\frac{\cosh (\lambda _1(t-\pi ))\cosh (\lambda _2(t-\pi ))}{\alpha (\lambda _1) \alpha (\lambda _2)} \end{aligned}$$

and a final application of (9) to the numerator, taking into account Theorem 2, completes the proof. \(\square \)

Remark 3

If X, Y are i.i.d. from \(f_\lambda \), then, since \(\alpha (0)=1\) and \(f_0(k)=I(k=0)\), we get

$$\begin{aligned} \Pr (X+Y=k)=\left\{ \begin{array}{ll} \displaystyle \frac{1}{2\cosh (\lambda \pi )^2}+\frac{\tanh (\lambda \pi )}{2\lambda \pi }, &{} k=0, \\ &{} \\ \displaystyle \frac{\tanh (\lambda \pi )}{\pi } \ \frac{2\lambda }{(2\lambda )^2+k^2}, &{} k\in {Z}^*. \end{array} \right. \end{aligned}$$

This formula quantifies the fact that the p.m.f. of \(X+Y\) lies outside \(\mathcal{CC}\), but it is close, in some sense, to \(f_{2\lambda }\); in fact, the ratio \(f_{X+Y}(k)/f_{2\lambda }(k)\) does not vary with \(k\in {Z}^*\).
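The mixture representation of Theorem 5 (and its special case in Remark 3) can be compared with a direct, truncated convolution of the two p.m.f.'s. The sketch below is a numerical illustration only; the function names and the truncation `cutoff` are assumptions, and the omitted terms are of order `cutoff**-3`.

```python
import math

def cc_pmf(k, lam):
    """P.m.f. (4); lam = 0 is the degenerate distribution at zero."""
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.tanh(lam * math.pi) / math.pi * lam / (lam * lam + k * k)

def sum_pmf_mixture(k, lam1, lam2):
    """Pr(X + Y = k) from the mixture formula of Theorem 5."""
    def alpha(lam):
        return math.cosh(lam * math.pi)
    denom = 2.0 * alpha(lam1) * alpha(lam2)
    p = alpha(lam1 + lam2) / denom
    q = alpha(abs(lam2 - lam1)) / denom
    return p * cc_pmf(k, lam1 + lam2) + q * cc_pmf(k, abs(lam2 - lam1))

def sum_pmf_convolution(k, lam1, lam2, cutoff=5000):
    """Pr(X + Y = k) by direct convolution, truncated to |j| <= cutoff."""
    return sum(cc_pmf(j, lam1) * cc_pmf(k - j, lam2)
               for j in range(-cutoff, cutoff + 1))

if __name__ == "__main__":
    for k in (0, 1, -3, 7):
        print(k, sum_pmf_mixture(k, 0.7, 1.8), sum_pmf_convolution(k, 0.7, 1.8))
```

With `lam1 == lam2` the second mixture component is the point mass at zero, which reproduces the two cases of Remark 3.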

A ch.f. \(\phi \) (or the corresponding r.v. X) is called infinitely divisible (i.d.) if for each n, we can find a ch.f. \(\phi _n\) such that \(\phi _n^n=\phi \); equivalently, if \(X_{1,n}+\cdots +X_{n,n}\) has the same distribution as X, where \(X_{1,n},\ldots ,X_{n,n}\) are i.i.d. with ch.f. \(\phi _n\). Properties of this kind are included in what is called “arithmetic of probability laws” [12, 13], and a vast bibliography exists, see, e.g., [3, 5, 8, 12, 13, 15], and references therein.

Since the notion of i.d. is related to limit theorems of sums of independent r.v.’s, it would be useful to know whether the \(\mathcal{CC}\) family is i.d. This is indeed the case, and it follows immediately from a result of Polya, because the ch.f. \(\phi _{\lambda }\) is even, log-convex in \([0,2\pi ]\) and \(2\pi \) periodic, see [8, 13]. In fact, \(\phi _\lambda ^\alpha \) is a ch.f. for all \(\lambda \ge 0\) and \(\alpha \ge 0\).

As is well known, the notion of self-decomposability, as well as that of stability, do not apply to discrete r.v.’s. Recall that X is stable if, for each n, we can find constants \(\alpha _n>0\) and \(\beta _n\in {R}\) such that X and \((X_1+\cdots +X_n)/\alpha _n-\beta _n\) have the same distribution, where \(X_1,\ldots ,X_n\) are i.i.d. copies of X. Obviously, the class of stable distributions is a proper subset of i.d. distributions. Due to a fundamental result of Lévy, stable distributions are very important because their class contains exactly all possible limits of (properly) normalized sums of i.i.d. r.v.’s. Every stable distribution has a ch.f. that can be expressed in a closed form, and the corresponding r.v. is absolutely continuous. The subclass of symmetric stable ch.f.’s, after a location-scale transformation, can be written as \(\mathcal{S}=\{\phi _\alpha (t)=e^{-|t|^\alpha }\), \(0<\alpha \le 2\}\). Only the densities that correspond to \(\alpha =1/2\) (Lévy), \(\alpha =1\) (Cauchy) and \(\alpha =2\) (Normal), have known explicit forms.

It is natural to ask whether the \(\mathcal{CC}\) family contains discrete stable distributions, in the sense of [15]. However, the definitions in [15] are designed for non-negative integer-valued r.v.’s, and are based on probability generating functions; it is not obvious how to extend these results to the \(\mathcal{CC}\) case. The following definition provides a different approach that seems to be natural for our case.

Definition 2

Let \(\Lambda \) be a set of indices, consider a parametric family \(\mathcal{F}=\{\phi _\lambda , \ \lambda \in \Lambda \}\) of discrete, integer-valued, ch.f.’s, and let \(\mathcal{F}'\) be the corresponding family of random variables. Then, \(\mathcal{F}\) is called discrete stable (DSF) if for each \(\phi _\lambda \in \mathcal{F}\), we can find a sequence of indices \(\{\lambda _n\}_{n=1}^{\infty }\subset \Lambda \) such that \(\phi _{\lambda _n}^n\rightarrow \phi _{\lambda }\). Equivalently, if every random variable in \(\mathcal{F}'\) is the weak limit of sums of i.i.d. r.v.’s from \(\mathcal{F}'\).

The usual Poisson family is DSF, as well as the Negative Binomial. In order for such a model to be useful in practice, the family \(\mathcal{F}\) should not contain “too many” ch.f.’s. Also, it is plausible to consider those DSF’s that satisfy some kind of discrete attraction, in the sense that (non-normalized) sums of several i.i.d. discrete r.v.’s converge weakly to one of the members of the DSF. It is clear that the Compound Poisson that is produced by a fixed discrete ch.f. \(\psi \), namely, \(\mathcal{F}=\{\phi _{\lambda }(t)=e^{\lambda (\psi (t)-1)}, \ \lambda \ge 0\}\), is such a useful DSF model. On the other hand, the complete Compound Poisson model (allowing any \(\psi \) in the exponent) seems to be too wide. Regarding the \(\mathcal{CC}\) family, we have the following result.

Theorem 6

The \(\mathcal{CC}\) family is not DSF. To be more specific, suppose \(\{\phi _{\lambda _n}\}_{n=1}^{\infty }\subset \mathcal{CC}\) where \(\lambda _n\ge 0\) is an arbitrary sequence, and \(\phi _{\lambda _n}\) is as in Theorem 2. Then, (i) and (ii) below are equivalent.

(i) There is a point \(t_0\in (0,2\pi )\) such that \(\lim _n \phi _{\lambda _n}(t_0)^n=\delta >0\).

(ii) It holds \(\lambda _n=\theta /\sqrt{n}+o(1/\sqrt{n})\), where \(\theta =(-2\log \delta )^{1/2}(t_0(2\pi -t_0))^{-1/2}\ge 0\).

If (i) or (ii) is satisfied, then \(\phi _{\lambda _n}(t)^n\rightarrow \psi (t):=\exp (-\theta ^2 t(2\pi -t)/2)\) uniformly in t, \(0\le t\le 2\pi \), and the limiting ch.f. \(\psi \) (extended to be \(2\pi \)-periodic) is an infinitely divisible ch.f.

Before proving Theorem 6, we provide some remarks. The limiting ch.f. \(\psi \) is a Compound Poisson one. Indeed, the exponent can be written as \(\lambda (\psi _1(t)-1)\), where \(\psi _1(t)=1-\theta ^2 t(\pi -t/2)/\lambda \) and, e.g., \(\lambda \ge \pi ^2\theta ^2/2\) (we shall see below that the minimum value of \(\lambda \) for which \(\psi _1\) is a ch.f. is \(\lambda _0=\pi ^2\theta ^2/3\)). Then, it follows that the even, \(2\pi \)-periodic function \(\psi _1\) is nonnegative, decreasing and convex in \([0,\pi ]\), and so, by Polya’s sufficiency criterion (see [8]) it is a ch.f. of an integer-valued r.v. Clearly, the parametric family produced by all possible limits from \(\mathcal{CC}\), namely, \(\mathcal{F}=\{\psi _{\lambda }(t)=e^{-\lambda t(2\pi -t)}, \ \lambda \ge 0, \ t\in [0,2\pi ]\}\), forms a DSF according to Definition 2. By applying the inversion formula for ch.f.’s of integer-valued r.v.’s, namely

$$\begin{aligned} \Pr (X=k)=\frac{1}{2\pi }\int _{-\pi }^{\pi } e^{-ikt} \phi _X(t)\mathrm{d}t, \ \ k\in {Z}, \end{aligned}$$

it is recognized that the p.m.f.’s in \(\mathcal{F}\) do not admit closed forms. Indeed, if \(Y_\lambda \sim \psi _{\lambda }\), then the preceding formula reduces to

$$\begin{aligned} \Pr (Y_\lambda =k)=\frac{1}{\pi }\int _{0}^{\pi } \cos (kt) e^{-\lambda t(2\pi -t)}\mathrm{d}t, \ \ k\in {Z}, \end{aligned}$$

and this integral cannot be computed in terms of elementary functions (unless \(\lambda =0\)). Moreover, if we make use of the preceding formula with \(\psi _1\) instead of \(\psi \), we can easily obtain the p.m.f. of the r.v. W with ch.f. \(\psi _1\). Setting for convenience \(c=\theta ^2/\lambda \) one finds \(\Pr (W=0)=1-c\pi ^2/3\) (so that \(c\le 3/\pi ^2\) and, hence, \(\lambda \ge \theta ^2\pi ^2/3\)) and \(\Pr (W=k)=c/k^2\), \(k\in {Z}^*\). According to Theorem 6, these remarks provide a detailed description of the class of the limiting distributions of sums of i.i.d. r.v.’s from \(\mathcal{CC}\).
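The inversion formula above is easy to evaluate numerically for an even ch.f. As an illustration, the sketch below applies composite Simpson quadrature to \(\psi _1(t)=1-c\,t(\pi -t/2)\) and compares the result with \(\Pr (W=0)=1-c\pi ^2/3\) and \(\Pr (W=k)=c/k^2\). The function names, the number of subintervals `m` and the value of `c` are assumptions of this example; the same routine can be applied to \(\psi _\lambda \), for which no closed form is available.

```python
import math

def invert_even_chf(chf, k, m=4000):
    """Pr(X = k) = (1/pi) * int_0^pi cos(k t) chf(t) dt for an even ch.f.,
    approximated by composite Simpson's rule with m (even) subintervals."""
    h = math.pi / m
    total = chf(0.0) + math.cos(k * math.pi) * chf(math.pi)
    for j in range(1, m):
        t = j * h
        weight = 4.0 if j % 2 == 1 else 2.0
        total += weight * math.cos(k * t) * chf(t)
    return total * h / 3.0 / math.pi

def psi1(t, c):
    """psi_1(t) = 1 - c * t * (pi - t/2) on [0, pi], with c = theta^2 / lambda."""
    return 1.0 - c * t * (math.pi - t / 2.0)

def psi_limit(t, lam):
    """psi_lambda(t) = exp(-lam * t * (2*pi - t)) on [0, 2*pi]."""
    return math.exp(-lam * t * (2.0 * math.pi - t))

if __name__ == "__main__":
    c = 0.2  # any 0 <= c <= 3 / pi^2 gives a valid p.m.f. for W
    print(invert_even_chf(lambda t: psi1(t, c), 0), 1.0 - c * math.pi ** 2 / 3.0)
    for k in (1, 2, 5):
        print(k, invert_even_chf(lambda t: psi1(t, c), k), c / k ** 2)
    print(invert_even_chf(lambda t: psi_limit(t, 0.3), 1))
```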

The following lemma will be used in the proof of Theorem 6.

Lemma 3

(i) Let \(\{\beta _n\}_{n=1}^\infty \subset (0,1]\), assume that \(\beta _n^n\rightarrow \beta \in (0,1]\) and set \(B=-\log \beta \). Then, \(\beta _n=1-B/n+o(1/n)\).

(ii) Fix \(x_0\in [0,1)\), and define the function \(f(y):=\cosh (x_0 y )/\cosh (y)\), \(y\ge 0\). Suppose that \(\{\alpha _n\}_{n=1}^\infty \subset [0,\infty )\) and that \(f(\alpha _n)^n\rightarrow \delta \in (0,1]\). Then, \(\alpha _n=\alpha /\sqrt{n}+o(1/\sqrt{n})\), where \(\alpha =\sqrt{(-2\log \delta )/(1-x_0^2)}\).

Proof

(i) Despite the fact that (i) is known, we provide a very quick proof here. The inequality \(y\le -\log (1-y)\le y/(1-y)\) (\(0\le y<1\)), applied to \(y=1-\beta _n\), yields \( \beta _n(-n \log \beta _n)\le n(1-\beta _n)\le -n\log \beta _n, \) and since the upper bound implies that \(\beta _n\rightarrow 1\), both bounds converge to B.

(ii) The sequence \(n\alpha _n^2\) is bounded. Indeed, assuming the contrary, it follows that for any \(M>0\) (arbitrarily large) we can find a subsequence \(n_k\) such that \(\alpha _{n_{k}}>M/\sqrt{n_{k}}\) for all k. Since it is easily checked that \(f'(y)<0\) for \(y>0\), the positive continuous function f is strictly decreasing, with \(f(0)=1\), \(f(\infty )=0\) (recall that \(0\le x_0<1\)). Therefore, \(f\big (\alpha _{n_k}\big )^{n_k}\le f\big (M/\sqrt{n_{k}}\big )^{n_k}\rightarrow \exp \big (-M^2(1-x_0^2)/2\big )\), as \(k\rightarrow \infty \). Thus, \(\liminf f(\alpha _n)^n\le \exp \big (-M^2(1-x_0^2)/2\big )\), and since \(M>0\) is arbitrary, \(\liminf f(\alpha _n)^n=0\). This contradicts the hypothesis \(f(\alpha _n)^n\rightarrow \delta >0\), and verifies that the sequence \(n\alpha _n^2\) is, indeed, bounded. Hence, \(\alpha _n\rightarrow 0\). By applying a Taylor expansion to the function f it can be checked that for \(y\ge 0\), sufficiently close to zero,

$$\begin{aligned} 1-\frac{1}{2}(1-x_0^2)y^2\le f(y) \le 1-\frac{1}{2}(1-x_0^2) y^2+\frac{1}{24}(1-x_0^2)(5-x_0^2)y^4, \ \ 0\le y<\epsilon . \end{aligned}$$

Substituting \(y=\alpha _n\) (which tends to zero), we obtain the inequality

$$\begin{aligned} A n(1-f(\alpha _n))\le n\alpha _n^2\le A n(1-f(\alpha _n))+B \alpha _n^2(n\alpha _n^2), \ \ n\ge n_0, \end{aligned}$$

with \(A=2/(1-x_0^2)\), \(B=(5-x_0^2)/12\). Since \(f(\alpha _n)^n\rightarrow \delta \in (0,1]\) (and \(0<f(\alpha _n)\le 1\)), it follows from part (i) that \(n(1-f(\alpha _n))\rightarrow -\log \delta \), and the preceding inequality shows that \(n\alpha _n^2\rightarrow (-\log \delta )A\), completing the proof. \(\square \)

Proof of Theorem 6

Assume first that (ii) holds, that is, \(\lambda _n=\theta /\sqrt{n}+o(1/\sqrt{n})\) for some \(\theta \ge 0\). It is straightforward to verify that \(\phi _{\lambda _n}(t)^n\) converges pointwise to \(\psi (t)\) as given, and from the fact that \(\psi \) is continuous at the origin, the convergence is uniform on compacts, and in particular, in \([0,2\pi ]\). Obviously, (i) is satisfied for (any choice of) \(t_0\in (0,2\pi )\) with \(\delta =\psi (t_0)=\exp (-\theta ^2 t_0(2\pi -t_0)/2)>0\).

Assume now that (i) holds, i.e., suppose that for a fixed \(t_0\in (0,2\pi )\), \(\phi _{\lambda _n}(t_0)^n\rightarrow \delta >0\). Due to symmetry (\(\phi _{\lambda _n}(t)=\phi _{\lambda _n}(2\pi -t)\)), we can further assume that \(0<t_0\le \pi \). Set \(\alpha _n=\pi \lambda _n\), \(x_0=1-t_0/\pi \in [0,1)\), and consider the function \(f(y)=\cosh (x_0y)/\cosh (y)\), \(y\ge 0\), as in Lemma 3. Then, \(\phi _{\lambda _n}(t_0)=f(\alpha _n)\), and by assumption, \(f(\alpha _n)^n\rightarrow \delta >0\) (certainly, \(\delta \le 1\)). Hence, from Lemma 3(ii) we conclude that \(n\alpha _n^2\rightarrow (-2\log \delta )/(1-x_0^2)\), that is, \(n\lambda _n^2\rightarrow (-2\log \delta )/(t_0(2\pi -t_0))\), which verifies (ii). \(\square \)
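The equivalence in Theorem 6 can be illustrated numerically: with \(\lambda _n=\theta /\sqrt{n}\), the powers \(\phi _{\lambda _n}(t)^n\) approach \(\psi (t)\), and \(\theta \) can be recovered from a single pair \((t_0,\delta )\). The sketch below is only a sanity check; the function names and the chosen values of \(\theta \), n and \(t_0\) are assumptions.

```python
import math

def cc_chf(t, lam):
    """Theorem 2 closed form on 0 <= t <= 2*pi."""
    return math.cosh(lam * (t - math.pi)) / math.cosh(lam * math.pi)

def psi(t, theta):
    """Limiting ch.f. of Theorem 6 on 0 <= t <= 2*pi."""
    return math.exp(-theta * theta * t * (2.0 * math.pi - t) / 2.0)

def theta_from_point(t0, delta):
    """theta of Theorem 6(ii), recovered from t0 in (0, 2*pi) and delta in (0, 1]."""
    return math.sqrt(-2.0 * math.log(delta) / (t0 * (2.0 * math.pi - t0)))

if __name__ == "__main__":
    theta, n = 0.8, 10 ** 6
    lam_n = theta / math.sqrt(n)
    for t in (1.0, math.pi, 5.0):
        print(t, cc_chf(t, lam_n) ** n, psi(t, theta))
    print(theta_from_point(1.0, psi(1.0, theta)))
```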

It is of some interest to observe that, according to Theorem 6, the limiting ch.f. exists if we can merely show the convergence \(\phi _{\lambda _n}(t_0)^n\rightarrow \delta >0\) for a single nontrivial point \(t_0\) (i.e., \(t_0\ne 2k\pi \)). Then, \(\psi (t)\) is uniquely determined from the pair \((t_0,\delta )\). Also, the limiting distribution is degenerate at zero if and only if \(\delta =1\) (which corresponds to \(\theta =0\) in Theorem 6(ii)).

Another related problem concerns the extended \(\mathcal{CC}\) class, defined as the family of ch.f.’s \(\mathcal{CC}^+:=\{\phi ^\alpha :\ \phi \in \mathcal{CC},\ \alpha >0 \}\). Since every \(\phi \in \mathcal{CC}\) is \(2\pi \)-periodic, decreases in \([0,\pi ]\) and is log-convex in \([0,2\pi ]\), the same is true for all ch.f.’s in \(\mathcal{CC}^+\). Hence, \(\mathcal{CC}^+\) is a family of i.d. ch.f.’s. This family is similar to the (continuous) Cauchy scale family. Cramér [3] showed that all stable centered distributions with exponent \(\alpha <2\) are not factor closed. This means that, e.g., the ch.f. of the standard Cauchy, \(e^{-|t|}\), can be written as \(\phi _1\phi _2\), with \(\phi _i\) (\(i=1,2\)) lying outside the class of Cauchy ch.f.’s. So, it is reasonable to expect that the same is true for \(\mathcal{CC}^{+}\). Indeed, it can be proved that this is the case, and, as a concrete example, we provide the following \(2\pi \)-periodic \(\phi _i\)’s:

$$\begin{aligned} \phi _1(t)= & {} \left( \frac{\cosh (t-\pi )}{\cosh (\pi )}\right) ^{1/2} \left( \frac{1+\pi ^4}{1+(t-\pi )^4}\right) ^{1/50}, \ \ 0\le t\le 2\pi , \\ \phi _2(t)= & {} \left( \frac{\cosh (t-\pi )}{\cosh (\pi )}\right) ^{1/2} \left( \frac{1+(t-\pi )^4}{1+\pi ^4}\right) ^{1/50}, \ \ 0\le t\le 2\pi . \end{aligned}$$

It can be checked that both functions are positive, decreasing in \([0,\pi ]\), and convex (\(\phi _1\) is log-convex) in \([0,2\pi ]\) and hence, their \(2\pi \)-periodic extensions (which are even functions) are ch.f.’s, see [13]. Obviously, these ch.f.’s lie outside \(\mathcal{CC}^+\), and, trivially, their product equals the standard discrete Cauchy ch.f. of Theorem 1.