Abstract
Let \(X_1,X_2,...\) be the digits in the base-q expansion of a random variable X defined on [0, 1) where \(q\ge 2\) is an integer. For \(n=1,2,...\), we study the probability distribution \(P_n\) of the (scaled) remainder \(T^n(X)=\sum _{k=n+1}^\infty X_k q^{n-k}\): If X has an absolutely continuous CDF then \(P_n\) converges in the total variation metric to the Lebesgue measure \(\mu \) on the unit interval. Under weak smoothness conditions we establish first a coupling between X and a non-negative integer valued random variable N so that \(T^N(X)\) follows \(\mu \) and is independent of \((X_1,...,X_N)\), and second exponentially fast convergence of \(P_n\) and its PDF \(f_n\). We discuss how many digits are needed and show examples of our results.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Let X be a random variable so that \(0\le X<1\), and for \(x\in \mathbb R\), let \(F(x)=\mathrm P(X\le x)\) be the cumulative distribution function (CDF) of X. For a given integer \(q\ge 2\), we consider the base-q transformation \(T:[0,1)\mapsto [0,1)\) given by
where \(\lfloor \cdot \rfloor \) is the floor function (so \(\lfloor xq\rfloor \) is the integer part of xq). For \(n=1,2,...\), let \(T^n=T\circ \cdots \circ T\) denote the composition of T with itself n times and define
where \(T^{0}(X)=X\). Then
is the base-q expansion of X with digits \(X_1,X_2,...\). Note that X is in a one-to-one correspondence to the first n digits \((X_1,...,X_n)\) together with \(T^{n}(X)=\sum _{k=n+1}^\infty X_kq^{n-k}\), which is the remainder multiplied by \(q^n\). Let \(\mu \) denote Lebesgue measure on [0, 1), \(P_n\) the probability distribution of \(T^n(X)\) and \(F_n\) its CDF, so X follows \(P_0\) and has CDF \(F_0=F\). The following facts are well-known (see Cornean et al. 2022 and the references therein):
-
(a)
\(P_0=P_1\) (i.e., invariance in distribution under T) is equivalent to stationarity of the process \(X_1,X_2,...\).
-
(b)
\(P_0=P_1\) and F is absolutely continuous if and only if \(P_0=\mu \).
-
(c)
\(P_0=\mu \) if and only if \(X_1,X_2,...\) are independent and uniformly distributed on \(\{0,1,...,q-1\}\).
Items (a)–(c) together with the fact that T is ergodic with respect to \(\mu \) are used in metric number theory (see Dajani and Kalle 2021; Schweiger 1995 and the references therein) to establish properties such as ‘for Lebesgue almost all numbers between 0 and 1, the relative frequency of any finite combination of digits of a given length n and which occurs among the first \(m > n\) digits converges to \(q^{-n}\) as \(m\rightarrow \infty \)’ (which is basically the definition of a normal number in base-q, cf. Borel (1909)). To the best of our knowledge, less (or perhaps no) attention has been paid to the asymptotic behaviour of the (scaled) remainder \(T^n(X)\) as \(n\rightarrow \infty \). This paper fills this gap.
Assuming F is absolutely continuous with a probability density function (PDF) f we establish the following. We start in Section 2 to consider a special case of f where \(T^n(X)\) follows exactly \(\mu \) when n is sufficiently large. Then in Section 3, under a weak assumption on f, we specify an interesting coupling construction involving a non-negative integer-valued random variable N so that \(T^N(X)\) follows exactly \(\mu \) and is independent of \((X_1,...,X_N)\). Moreover, in Section 4, we show that \(\lim _{n\rightarrow \infty }d_{\textrm{TV}}(P_n,\mu )=0\) where \(d_{\textrm{TV}}\) is the total variation metric (as given later in (12)). Because of these results, if in an experiment a realization of X is observed and the first n digits are kept, and if (so far) the only model assumption is absolute continuity of F, then the remainder rescaled by \(q^n\) is at least approximately uniformly distributed when n is large. Since we interpret the uniform distribution as the case of complete randomness, no essential information about the distribution is lost. On the other hand, if the distribution of the remainder is far from uniform, this may indicate that the distribution one is trying to find has finer structure that one is missing by looking only at the first n digits. We return to this issue in Section 5 when discussing sufficiency and ancillarity. Furthermore, in Section 4 we study the convergence rate of \(d_{\textrm{TV}}(P_n,\mu )\) and other related properties. In Section 5, we illustrate our results from Sections 3 and 4 when F follows the extended Newcomb-Benford law (Example 1).
The present paper is a shorter version of Herbst et al. (2023) to which we refer for further results, comments, and examples. In particular, Herbst et al. (2023) gives an extension of our convergence results in Section 4 to the case where X is a multivariate random variable with values in the k-dimensional unit cube \([0,1)^k\) and each of the k coordinates of X is transformed by T. We plan in a future paper to study the asymptotic behaviour of the remainder in other expansions, including a certain base-\(\beta \) expansion of a random variable, namely when q is replaced by \(\beta =(1+\sqrt{5})/2\) (the golden ratio) in all places above.
2 Preliminaries
Let again the situation be as in (1)–(3). The following lemma is true in general (i.e., without assuming F is absolutely continuous). As in Cornean et al. (2022), we define a base-q fraction in [0, 1) to be a number of the form \(\sum _{k=1}^n j_kq^{-k}\) with \((j_1,...,j_n)\in \mathbb \{0,1,...,q-1\}^n\) and \(n\in \mathbb N\).
Lemma 2.1
If F has no jump at any base-q fraction in [0, 1) then for every \(x\in [0,1]\),
Proof
Clearly, (4) holds for \(x=1\), so let \(0\le x<1\). For \(j_1,...,j_n\in \{0,1,...,q-1\}\) and \(j=\sum _{i=1}^n j_iq^{n-i}\), the event that \(X_1=j_1,..., X_n=j_n\), and \(T^n(X)\le x\) is the same as the event that \(q^{-n}j\le X<q^{-n}(j+1)\) and \(X\le q^{-n}(j+x)\). Hence, since \(0\le x<1\),
whereby (4) follows since F(x) has no jumps at the base-q fractions. \(\square \)
The property that F has no jump at any base-q fraction is of course satisfied when F is continuous.
For the remainder of this section and the following Sections 3, 4 and 5 we assume that X has a probability density function (PDF) f concentrated on (0, 1), meaning that F is absolutely continuous with \(F(x)=\int _{-\infty }^x f(t)\,\mathrm dt\) for all \(x\in \mathbb R\). Then, by (4), \(F_n\) is absolutely continuous with PDF
for \(0< x< 1\).
In the following special case of f, convergence of \(P_n\) is obtained within a finite number of steps.
Proposition 2.2
Let \(m\ge 1\) be an integer. Then \(P_m=\mu \) (and hence \(P_n=\mu \) for \(n=m,m+1,...\)) if and only if for all \(k\in \{0,1,...,q^m-1\}\) and Lebesgue almost every \(u\in [0,1)\),
In particular, if f is constant Lebesgue almost everywhere on each of the intervals \([jq^{-m},(j+1)q^{-m})\), \(j=0,1,...,q^{m}-1\), then for \(n=m,m+1,...\), \(P_n=\mu \) and \((X_1,...,X_n)\) is independent of \(T^n(X)\).
Proof
If \(P_m=\mu \) then by invariance of \(\mu \) under T, \(P_n=\mu \) for \(n=m,m+1,...\). Let \(K=\sum _{i=1}^mX_iq^{m-i}\) and \(U=T^m(X)\), so \(X=(K+U)q^{-m}\). For Lebesgue almost every \(t\in [0,1)\),
since
Thereby the first assertion follows.
Suppose that \(c_j\) is a constant and \(f=c_j\) Lebesgue almost everywhere on \([jq^{-m},(j+1)q^{-m})\) for \(j=0,1,...,q^{m}-1\). Then
and so for Lebesgue almost all \(x\in [0,1)\), (5) gives that \(f_m(x)=1\). Therefore, \(P_m=\mu \), and hence \(P_n=\mu \) for \(n=m,m+1,...\). Consequently, the last assertion follows from (6), using that \(\sum _{i=1}^m X_iq^{m-i}\) and \((X_1,...,X_m)\) are in a one-to-one correspondence. \(\square \)
3 Couplings
Let f be a PDF on [0, 1). We introduce the following notation. Let \(I_\emptyset =I_{1;0}=[0,1)\) and \(c_\emptyset =c_{1;0}=\inf _{I_\emptyset }f\). For \(n=1,2,...\) and \(x_1,x_2,...\in \{0,1,...,q-1\}\), let \(k=1+\sum _{i=1}^n x_iq^{n-i}\) and
and
Recall that f is lower semi-continuous at a point x if for any sequence \(y_n\rightarrow x\), it holds that \(\liminf _n {f(y_n)} \ge f(x)\). Note that if \(x=\sum _{n=1}^\infty x_nq^{-n}\in [0,1)\) is not a base-q fraction, then lower semi-continuity at x is equivalent to
Commonly used PDFs are lower semi-continuous almost everywhere. For an example where this condition does not hold, see Remark 1 in Herbst et al. (2023).
Write \(U\sim \mu \) if U is a uniformly distributed random variable on [0, 1).
Theorem 3.1
Suppose f is lower semi-continuous at Lebesgue almost all points in [0, 1). Then there is a coupling between \(X \sim f\) and a non-negative integer-valued random variable N such that \(T^N(X)\sim \mu \) is independent of \((X_1,...,X_N)\).
Proof
For Lebesgue almost all \(x=\sum _{n=1}^\infty x_nq^{-n}\in [0,1)\) with \(x_n=\lfloor T^{n-1}(x)q\rfloor \), assuming x is not a base-q fraction (recalling that the set of base-q fractions is a Lebesgue nullset), (7) gives
Let N be a random variable such that for \(f(x)>0\), conditionally on \(X=x\),
By (8) and since \(c_{x_1,...,x_n}\ge 0\), this is a well-defined conditional distribution.
By Bayes theorem, conditioned on \(N=n\) with \(\mathrm P(N=n)>0\), X follows an absolutely continuous distribution with PDF
Therefore, since f(x|n) is constant on each of the intervals \(I_{k;n}\), conditioned on \(N=n\) we immediately see that \((X_1,...,X_n)\) (interpreted as nothing if \(n=0\)) and \(T^n(X)\) are independent and that \(T^n(X)\sim \mu \). The latter implies that \(T^N(X)\sim \mu \) is independent of N. Consequently, if we do not condition on N, we have that \((X_1,...,X_N)\) and \(T^N(X)\sim \mu \) are independent. \(\square \)
Corollary 3.2
For the coupling construction in the proof of Theorem 3.1, conditioned on \(X=x\) with \(f(x)>0\), we have
where \(x_k=\lfloor T^{k-1}(x)q\rfloor \) for \(1\le k\le n\). Moreover,
Remark 1
Corollary 3.2 is used in Section 5 to quantify how many digits are needed to make the remainder uniformly distributed with sufficiently high probability. By (10), in order to make N as small as possible, we prefer a version of f which is as large as possible, cf. Herbst et al. (2023).
Proof
The proof of Theorem 3.1 gives immediately (9). Thus, for \(n=0,1,...\),
So \(\mathrm P(N=0)=c_\emptyset \) in agreement with (10). For \(n=1,2,...\), we have
Thereby (10) follows. \(\square \)
Corollary 3.3
Let the situation be as in Theorem 3.1. The output of the following simulation algorithm is distributed as \(X\sim f\):
-
(a)
Draw N from (10).
-
(b)
Conditionally on N, generate a discrete random variable K with
$$\begin{aligned} \mathrm P(K={k-1}\,|\,N=n)\propto c_{k;n},\quad k=1,...,q^n,\ n=0,1,... \end{aligned}$$(11) -
(c)
Independently of (N, K) pick a random variable \(U\sim \mu \).
-
(d)
Output \((K+U)q^{-N}\).
Proof
Let \(a_n=\sum _{k=1}^{q^n}c_{k;n}\) be the normalizing constant in (11). Conditioned on \(N=n\) with \(\mathrm P(N=n)>0\), steps (b) and (c) give that \(U\sim \mu \) and K are independent, so the conditional distribution of \((K+U)q^{-N}\) is absolutely continuous with a conditional PDF given by
Moreover, we get from (10) that \(\mathrm P(N=0)=c_\emptyset \) and
Therefore, the (unconditional) distribution of \((K+U)q^{-N}\) is absolutely continuous with a PDF which at each point \(x=\sum _{n=1}^\infty x_nq^{n}\in [0,1)\) with \(x_n=\lfloor T^{n-1}(x)q\rfloor \) is given by
This PDF agrees with (8), so \((K+U)q^{-N}\sim f\). \(\square \)
Denote by \(\mathcal {B}\) the class of Borel subsets of [0, 1). The total variation distance between two probability measures \(\nu _1\) and \(\nu _2\) defined on \(\mathcal {B}\) and with PDFs \(g_1\) and \(g_2\), respectively, is given by
(see Tsybakov 2009, Lemma 2.1). Then Theorem 3.1 shows the following.
Corollary 3.4
Let the situation be as in Theorem 3.1. Then
Remark 2
In general the coupling inequality (13) is sharp, see Herbst et al. (2023). It follows from Corollary 3.2 and 3.4 that (7) implies \(\lim _{n\rightarrow \infty } d_{\textrm{TV}}(P_n,\mu )=0\). In Theorem 4.1 below we show that (7) is not needed for this convergence result.
Proof
Using Corollary 3.3, let \(X=(K+U)q^{-N}\). For \(n=0,1,...\), if \(Q_n\) denotes the probability distribution of \(T^n(U)\), then \(Q_n=\mu \), and so
where the first inequality is the standard coupling inequality for the coupled random variables \(T^n(X)\) and \(T^n(U)\), and the last inequality follows since \(N\le n\) implies \(T^n(X)=T^n(U)\). Thereby (13) is verified. \(\square \)
Remark 3
By the Kantorovich-Rubinstein theorem, the Wasserstein distance between two probability measures \(\nu _1\) and \(\nu _2\) on [0, 1] is given by
where \(\Gamma (\nu _1,\nu _2)\) consists of all couplings of \(\nu _1\) and \(\nu _2\). By (Gibbs and Su 2002, Thm 4),
so by Remark 2, Corollary 3.4 implies
The latter bound can be improved by using the coupling between \(T^n(X)\) and \(T^N(X)\sim \mu \) to obtain
See also Gibbs and Su (2002) for an overview of the relation between the total variation distance and other measures of distance between probability measures.
4 Asymptotic Results
We need some notation for the following theorem. For a real, measurable function g defined on (0, 1), denote its \(L_1\)- and supremum-norm by \(\Vert g\Vert _1=\int _0^1|g(t)|\,\mathrm dt\) and \(\Vert g\Vert _\infty =\sup _{x\in (0,1)}|g(x)|\), respectively, and denote the corresponding \(L_1\)-space by \(L_1(0,1)=\{g\,|\,\Vert g\Vert _1<\infty \}\) (here, \(\Vert g\Vert _\infty \) may be infinite when there are no further assumptions on g). Let \(\bar{L}_1(0,1)=\{g\,| \int _0^1g(t) dt =1,\Vert g\Vert _1 < \infty \}\) be the subset of functions with finite \(L_1\)-norm and integral over [0, 1] equal one, and \(\bar{L}_1'(0,1)\subset \bar{L}_1(0,1)\) its subset of differentiable functions g such that \(\Vert g'\Vert _\infty <\infty \). For \(g\in \bar{L}_1'(0,1)\), \(n\in \mathbb N\), \(j=0,1,...,q^n-1\), and \(0<x<1\), define \(g_{n,j}'(x)=g'(x)\) if \(q^{-n}j<x<q^{-n}(j+1)\) and \(g_{n,j}'(x)=0\) otherwise, and define
Henceforth, we also think of f as an element of \(\bar{L}_1(0,1)\).
Theorem 4.1
If \(f \in \bar{L}_1(0,1)\) then for every \(g \in \bar{L}_1'(0,1)\),
In particular,
and we have the following sharper convergence results. If \(f \in \bar{L}_1'(0,1)\) then \(P_n\) converges exponentially fast:
If \(\Vert f\Vert _\infty < \infty \) and f is continuous except for finitely many points, then
If f is twice differentiable with \(\Vert f''\Vert _\infty <\infty \) then we have the following improvement of (17):
where \(\xi _{n,j} \in (q^{-n}j,q^{-n}(j+1))\) is arbitrary.
Before proving this theorem we need the following lemma.
Lemma 4.2
Let \(f \in \bar{L}_1(0,1)\), \(g \in \bar{L}_1'(0,1)\). For every \(x\in (0,1)\),
and
If g is twice differentiable on (0, 1) with \(\Vert g''\Vert _\infty <\infty \) then for every \(x\in (0,1)\),
where each \(\xi _{n,j} \in (q^{-n}j,q^{-n}(j+1))\) is arbitrary.
Proof
Let \(x\in (0,1)\). From (14) we have
If g is differentiable on (0, 1) with \(\Vert g'\Vert _\infty <\infty \), we get by the mean value theorem,
which yields the bound
Thereby (20) follows. Moreover,
whereby (21) follows. If g is twice differentiable on (0, 1) with \(\Vert g''\Vert _\infty <\infty \), the mean value theorem gives
where \(\xi _{x,y} \in (q^{-n}j,q^{-n}(j+1))\) depends on x and y and \(\xi _{n,j} \in (q^{-n}j,q^{-n}(j+1))\) is arbitrary. The second equality was obtained by applying the mean value theorem to \(g'(\xi _{x,y})-g'(\xi _{n,j})\). Inserting this into (23) yields
which reduces to (22). \(\square \)
We are now ready for the proof of Theorem 4.1.
Proof
We have
where we get the equality from (12) and the second inequality from (20), (21), and since \(\int _0^1\left( x^2-x +1/2\right) \,\mathrm dx = 1/3\). Thereby (15) is verified. Taking \(n\rightarrow \infty \) in (15) and using that \(\bar{L}'_1(0,1)\) is dense in \(\bar{L}_1(0,1)\), we get (16). Equation (17) follows from (15) by setting \(g = f\).
For the proof of (18) we suppose f is continuous except at \(x_1,\ldots ,x_m \in (0,1)\) and set \(x_0=0\) and \(x_{m+1}=1\). Let \(\delta >0\) and
By (23),
Given \(\varepsilon >0\), we choose \(\delta \) so that \(\delta < \varepsilon / (6(m+2)\Vert f\Vert _\infty )\). Then, since the cardinality of \(I_n\) is at most \((m+2)(2q^n \delta + 1)\), the first sum in (27) is bounded by
for n sufficiently large. Moreover, for n large enough, the second sum in (27) is bounded by \(\varepsilon /2\) since f is uniformly continuous on \((0,1)\backslash \bigcup _{i=0}^{m+1} (x_i -\delta /2, x_i+\delta /2)\), which is a closed set. Thus, for large enough n, \(|f_n(x) - 1|< \varepsilon \) which gives (18) since \(\varepsilon >0\) is arbitrary.
To prove (19) we use (22) with g replaced by f. Then, for every \(A\in \mathcal B\),
We have
where the second identity follows from (12). This gives (19). \(\square \)
Remark 4
In continuation of Remark 2, by Theorem 4.1, \(d_{\textrm{TV}(P_n,\mu )} \rightarrow 0\) and under weak conditions the convergence is exponentially fast. Lemma 4.2 provides estimates of \(g_n(x)-1\).
5 So How Many Digits are Needed?
This section starts with some theoretical statistical considerations and continues then with Example 1 (Example 2 in Herbst et al. (2023) illustrates how the convergence rate in Theorem 4.1 depends on the smoothness of f). For another paper it would be interesting to study the practical applications of our results, including how to apply a missing data approach which treats N as a hidden variable, cf. Herbst et al. (2023).
Consider a parametric model for the probability distribution of X given by a parametric class of lower semi-continuous densities \(f_\theta \) where \(\theta \) is an unknown parameter. By Theorem 3.1 this specifies a parametric model for \((X_1,...,X_N)\) which is independent of \(T^N(X)\sim \mu \). In practice we cannot expect N to be observable, but let us imagine it is. Then, according to general statistical principles (see e.g. Barndorff-Nielsen 1978), statistical inference for \(\theta \) should be based on the sufficient statistic \((X_1,...,X_N)\), whilst \(T^N(X)\) is an ancillary statistic and hence contains no information about \(\theta \). Moreover, Theorem 4.1 ensures (without assuming that the densities are lower semi-continuous) that \(T^n(X)\) is approximately uniformly distributed. Hence, if n is ‘large enough’, nearly all information about \(\theta \) is contained in \((X_1,...,X_n)\).
The following Example 1 demonstrates how the results in Sections 3 and 4 can be used to quantify the number n of digits needed in order that \(N> n\) (conditioned or not on \(X=x\)) with a small probability or that \(d_{\textrm{TV}}(P_n,\mu )\) is small. See also Example 2 in Herbst et al. (2023).
Example 1
Any number \(y\not =0\) can uniquely be written as \(y=sq^k(y_0+y_f)\) where \(s=s(y)\in \{\pm 1\}\) is the sign of y, \(k=k(y)\in \mathbb Z\) determines the decimal point of y in base-q, \(y_0=y_0(y)\in \{1,...,q-1\}\) is the leading digit of y in base-q, and \(y_0+y_f\) is the so-called significand of y in base-q, where \(y_f=y_f(y)\in [0,1)\) is the fractional part of \(y_0+y_f\) in base-q. Correspondingly, consider any real-valued random variable \(Y\not =0\) (or just \(\mathrm P(Y=0)=0\)), so (almost surely) \(Y=Sq^K(X_0+X)\) where \(S=s(Y)\), \(K=k(Y)\), \(X_0=y_0(Y)\), and \(X=y_f(Y)\) are random variables. Let \(X_1,X_2,...\) be the digits of X in the base-q expansion, cf. (3). We call \(X_0,X_1,X_2,...\) the significant digits of Y in base-q. Now, suppose Y satisfies the extended Newcomb-Benford law, that is, the log-significand of Y in base-q, \(\log _q(X_0+X)\), is uniformly distributed on [0, 1) (Berger and Hill 2015, Theorem 4.2):
for \(0\le x\le 1\). This law applies to a wide variety of real datasets and a number of appealing properties are then satisfied by Y, see Hill (1998) and Berger and Hill (2015) and the references therein.
Combining (10) and (28) gives for \(n=0,1,...\) that
The tail probabilities \(\mathrm P(N>n)\) decrease quickly as n and q increase, see the left panel in Fig. 1 for plots of \(\mathrm P(N>n)\) against n for \(q=2,3,5,10\). The middle panel of Fig. 1 shows \(\mathrm P(N>1\,|\,X=x)\) as a function of x for \(q=10\). We see large fluctuations, with probabilities dropping to zero when approaching the right limit of the intervals \(I_{k;1}\), where \(\inf _{I_{k;1}} f \) is attained. To avoid these fluctuations, the right panel of Fig. 1 shows an upper bound on \(\mathrm P(N>n\,|\,X=x)\) as a function of x for \(q=10\) and \(n=0,1,2,3\). The upper bound is found by noting that on each \(I_{k;n}\), \(\mathrm P(N>n\,|\,X=x)\) is convex decreasing towards zero. Hence an upper bound is given by evaluating at the left end points and interpolating linearly. The plot shows that \(\mathrm P(N>n\,|\,X=x)\) is very close to zero for all x already for \(n=2\).
This is also in accordance with Theorem 4.1 stating that \(T^n(X)\) converges to a uniform distribution on [0, 1) and hence the first digit \(X_n\) of \(T^n(X)\) is approximately uniformly distributed on \(\{0,1,...,q-1\}\) when n is large. For \(n=1,2,...\) and \(x_n\in \{0,1,...,q-1\}\), we have
where \(\mathrm P(X_n=x_n)\) is a decreasing function of \(x_n\). The left part of Fig. 2 shows plots of \(\mathrm P(X_n=0)-\mathrm P(X_n=q-1)\) versus n for \(q=2,3,5,10\) indicating fast convergence to uniformity and that the convergence speed increases with q. The right part of Fig. 2 illustrates the stronger statement in (18) that the PDF \(f_n\) of \(T^n(X)\) converges uniformly to the uniform PDF.
Remark 5
In conclusion, Example 1 demonstrates that the answer to the title of our paper (‘How many digits are needed?’) of course depends much on q (the higher q is, the fewer digits are needed). Example 2 in Herbst et al. (2023) shows how the answer depends on how much f deviates from the uniform PDF on [0, 1) (the more skew f is, the more digits are needed).
Availability of Data and Material
Not applicable.
References
Barndorff-Nielsen OE (1978) Information and exponential families in statistical theory. Wiley, Chichester
Berger A, Hill TP (2015) An introduction to Benford’s Law. Princeton University Press, Princeton, NJ
Borel E (1909) Les probabilités dénombrables et leurs applications arithmétiques. Rendiconti del Circolo Matematico di Palermo 27:247–271
Cornean H, Herbst I, Møller J, Støttrup BB, Studsgaard KS (2022) Characterization of random variables with stationary digits. J Appl Prob 59:931–947
Dajani K, Kalle C (2021) A first course in Ergodic theory. Chapman and Hall/CRC
Gibbs AL, Su FE (2002) On choosing and bounding probability metrics. Int Stat Rev 70:419–435
Herbst I, Møller J, Svane AM (2023) How many digits are needed? Preprint at http://arxiv.org/abs/2307.06685
Hill TP (1998) The first digit phenomenon. American Scientist 86:358–363
Schweiger F (1995) Ergodic theory of fibered systems and metric number theory. Clarendon Press, Oxford
Tsybakov AB (2009) Introduction to nonparametric estimation. Springer Series in Statistics, Springer - Verlag, New York
Acknowledgements
The authors thank the anonymous reviewers for their thorough reading and their useful comments and suggestions.
Funding
Open access funding provided by Aalborg University Supported by The Danish Council for Independent Research – Natural Sciences, grant DFF – 10.46540/2032-00005B.
Author information
Authors and Affiliations
Contributions
All authors contributed to all parts of this paper and reviewed the final version.
Corresponding author
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Herbst, I.W., Møller, J. & Svane, A.M. How Many Digits are Needed?. Methodol Comput Appl Probab 26, 5 (2024). https://doi.org/10.1007/s11009-024-10073-2
Received:
Revised:
Accepted:
Published:
DOI: https://doi.org/10.1007/s11009-024-10073-2
Keywords
- Asymptotic distribution
- Coupling
- Exponential convergence rate
- Extended Newcomb-Benford law
- Remainder of a digit expansion
- Total variation distance