Abstract
The complexity function of an infinite word counts the number of its factors. For any positive function f, its exponential rate of growth \(E_0(f)\) is \(\lim \limits _{n\rightarrow \infty } \inf \frac{1}{n}\log f(n)\). We define a new quantity, the word entropy \(E_W(f)\), as the maximal exponential growth rate of a complexity function smaller than f. This is in general smaller than \(E_0(f)\), and more difficult to compute; we give an algorithm to estimate it. The quantity \(E_W(f)\) is used to compute the Hausdorff dimension of the set of real numbers whose expansions in a given base have complexity bounded by f.
1 Definitions
Let A be the finite alphabet \(\{0,1,\dots ,q-1\}\). If \(w\in A^{{\mathbb N}}\), we denote by L(w) the set of finite factors of w; for any non-negative integer n, we write \(L_n(w)=L(w)\cap A^n\). The classical complexity function is described for example in [2].
Definition 1
The complexity function of \(w\in A^{{\mathbb N}}\) is defined for any non-negative integer n by \(p_w(n)=|L_n(w)|\).
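As a small illustration of Definition 1, the sketch below counts the factors of a finite prefix of an infinite word. It is only an approximation of \(p_w\): a finite prefix can miss factors that occur later, so the count is a lower bound in general. The helper names `complexity` and `fibonacci_prefix` are ours, and the Fibonacci word (fixed point of \(0\mapsto 01\), \(1\mapsto 0\)) is used as a test case because its complexity is known to be \(n+1\).

```python
def complexity(word: str, n: int) -> int:
    """Number of distinct factors of length n occurring in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


def fibonacci_prefix(length: int) -> str:
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < length:
        # Apply the substitution once; each iterate is a prefix of the next.
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:length]


prefix = fibonacci_prefix(2000)
values = [complexity(prefix, n) for n in range(1, 11)]
print(values)  # factor counts for lengths 1..10
```

On a prefix this long, every short factor has already appeared, so the counts agree with the Sturmian value \(n+1\) for the lengths shown.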
Our work concerns the study of infinite words w whose complexity function is bounded by a given function f from \({\mathbb N}\) to \({\mathbb R}^{+}\). More precisely, if f is such a function, we put
$$\begin{aligned} W(f)=\{w\in A^{{\mathbb N}},\ p_w(n)\le f(n),\ \forall n\in {\mathbb N}\}. \end{aligned}$$
Definition 2
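If f is a function from \({\mathbb N}\) to \({\mathbb R}^{+}\), we call exponential rate of growth of f the quantity
$$\begin{aligned} E_0(f)=\liminf _{n\rightarrow \infty }\frac{1}{n}\log f(n) \end{aligned}$$
and word entropy of f the quantity
$$\begin{aligned} E_W(f)=\sup _{w\in W(f)}E_0(p_w). \end{aligned}$$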
Of course, if \(E_0=0\) then \(E_W\) is zero also. Thus the study of \(E_W\) is interesting only when f has exponential growth: we are in the little-explored field of word combinatorics in positive entropy, or exponential complexity. For an equivalent theory in zero entropy, see [3, 4].
2 First Properties of \(E_0\) and \(E_W\)
The basic study of these quantities is carried out in [5], where the following results are proved.
If f is itself a complexity function (i.e. \(f=p_w\) for some \(w \in A^{\mathbb N}\)), then \(E_W (f)=E_0 (f)\). But in general \(E_W\) may be much smaller than \(E_0\).
We define mild regularity conditions for f: f is said to satisfy \((\mathcal {C})\) if the sequence \(\left( f(n)\right) _{n\ge 1}\) is strictly increasing, there exists \(n_0\in {\mathbb N}\) such that , \(f(n+1) \le f(1) f(n) \), and the sequence \(\left( \frac{1}{n} \log f(n)\right) _{n\ge 1}\) converges.
But for each \(1<\theta \le q\) and \(n_0 \in {\mathbb N}\) such that \(\theta ^{n_0+1} > n_0+q-1\), we define the function f by \(f(1)=q\), \(f(n) = n+q-1\) for \(1 \le n \le n_0\) and \(f(n)=\theta ^n\) for \(n > n_0\). We have \(E_0(f) = \log \theta \) and it is proved that
$$\begin{aligned} E_W(f)\le \frac{1}{n_0}\log (n_0+q-1), \end{aligned}$$
which can be made arbitrarily small, independently of \(\theta \), while f satisfies \((\mathcal {C})\).
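The gap between \(E_0\) and \(E_W\) in this example can be made concrete numerically. The sketch below relies on one standard fact that is not spelled out above: a complexity function is submultiplicative, \(p_w(n+n')\le p_w(n)p_w(n')\), so the growth rate of \(p_w\) is at most \(\frac{1}{n_0}\log p_w(n_0)\), hence at most \(\frac{1}{n_0}\log (n_0+q-1)\) for any \(w\in W(f)\). The function names and the sample values of q, \(\theta \) and \(n_0\) are illustrative choices.

```python
import math


def f(n: int, q: int, theta: float, n0: int) -> float:
    """The piecewise function of the example: linear up to n0, then theta^n."""
    return n + q - 1 if n <= n0 else theta ** n


def entropy_bound(q: int, n0: int) -> float:
    """Upper bound log(n0 + q - 1) / n0 on the growth rate of any p_w <= f."""
    return math.log(n0 + q - 1) / n0


q, theta = 2, 2.0
for n0 in (1, 10, 100, 1000):
    # Hypothesis of the example: theta^(n0+1) > n0 + q - 1.
    assert theta ** (n0 + 1) > n0 + q - 1
    print(n0, entropy_bound(q, n0), math.log(theta))
```

The bound decreases to 0 as \(n_0\) grows, while \(E_0(f)=\log \theta \) does not depend on \(n_0\).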
We define stronger regularity conditions for f.
Definition 3
We say that a function f from \({\mathbb N}\) to \({\mathbb R}^{+}\) satisfies the conditions \((\mathcal C^*)\) if (i) for any \(n \in {\mathbb N}\) we have \(f(n+1) > f(n) \ge n+1\); (ii) for any \((n, n' ) \in {\mathbb N}^2\) we have \(f(n+n') \le f(n) f(n') \).
But even with \((\mathcal C^*)\) we may have \(E_W(f) < E_0(f)\). Indeed, let f be the function defined by \(f(n)=\lceil 3^{n/2} \rceil \) for any \(n \in {\mathbb N}\). Then it is easy to check that f satisfies conditions \((\mathcal C^*)\) and that \(E_0(f)=\lim \limits _{n\rightarrow \infty }\frac{1}{n} \log f(n) =\log (\sqrt{3})\). On the other hand, we have \(f(1)=2\), \(f(2)=3\); thus the language has no 00 or no 11, and this implies that \(E_W(f) \le \log (\frac{1+\sqrt{5}}{2})<E_0(f)\).
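The numbers in this example are easy to check by machine. The sketch below, with helper names of our own choosing, computes \(f(n)=\lceil 3^{n/2} \rceil \) exactly with integer arithmetic, tests conditions \((\mathcal C^*)\) on a finite range only, and counts binary words avoiding 11. The count covers the "no 11" case; the "no 00" case is symmetric, and a language missing 01 or 10 belongs to eventually constant words, which have zero entropy.

```python
import math
from itertools import product


def f(n: int) -> int:
    """Exact value of ceil(3^(n/2)) = ceil(sqrt(3^n)), using integers only."""
    return math.isqrt(3 ** n - 1) + 1


def satisfies_c_star(limit: int) -> bool:
    """Check conditions (i) and (ii) of Definition 3 for 1 <= n, n' < limit."""
    increasing = all(f(n + 1) > f(n) >= n + 1 for n in range(1, limit))
    submult = all(f(a + b) <= f(a) * f(b)
                  for a in range(1, limit) for b in range(1, limit))
    return increasing and submult


def count_no_11(n: int) -> int:
    """Number of binary words of length n with no factor 11."""
    return sum(1 for w in product("01", repeat=n) if "11" not in "".join(w))


golden = math.log((1 + math.sqrt(5)) / 2)
print(f(1), f(2), satisfies_c_star(30))
for n in (4, 8, 12, 16):
    print(n, math.log(count_no_11(n)) / n, golden, math.log(3) / 2)
```

The counts follow the Fibonacci recurrence, so their growth rate tends to \(\log (\frac{1+\sqrt{5}}{2})\), which is strictly below \(\log (\sqrt{3})\).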
At least, under these conditions, we have the important
Theorem 4
If f is a function from \({\mathbb N}\) to \({\mathbb R}^{+}\) satisfying the conditions \((\mathcal C^*)\), then \(E_W(f)> \frac{1}{2} E_0(f)\).
It is also shown in [5] that the constant \(\frac{1}{2}\) is optimal.
Finally, it will be useful to know that
Theorem 5
For any function f from \({\mathbb N}\) to \({\mathbb R}^{+}\), there exists \(w \in W(f)\) such that for any \( n \in {\mathbb N}\) we have \(p_w(n) \ge \exp (E_W(f) n)\).
3 Algorithm
In general \(E_W(f)\) is much more difficult to compute than \(E_0(f)\); now we will give an algorithm which allows us to estimate with arbitrary precision \(E_W(f)\) from finitely many values of f, if we know already \(E_0(f)\) and have some information on the speed with which this limit is approximated.
We assume that f satisfies conditions \(\mathcal C^*\). We do not lose much generality with this assumption: if f satisfies only the weaker conditions \(\mathcal C\), we can replace it by the function \(\tilde{f}\) given recursively by
which satisfies conditions \(\mathcal C^*\), such that \(\tilde{f}(n) \le f(n), \forall n \in {\mathbb N}\) and \(W(\tilde{f})=W(f)\).
Theorem 6
There is an algorithm which gives, starting from f and \({\varepsilon }\), a quantity h such that \((1-{\varepsilon })h\le E_W(f) \le h\). The quantity h depends explicitly on \({\varepsilon }\), \(E_0(f)\), N, f(1), ..., f(N), for an integer N which depends explicitly on \({\varepsilon }\), \(E_0(f)\), and an integer \(n_0\), larger than an explicit function of \({\varepsilon }\) and \(E_0(f)\), and such that
$$\begin{aligned} \frac{\log f(n)}{n} < \left( 1+\frac{E_0(f){\varepsilon }}{210(4+2E_0(f))}\right) E_0(f), \forall n \ge n_0. \end{aligned}$$
We shall now give the algorithm. The function f is given, and henceforth we omit it from the notation \(E_0(f)\) and \(E_W(f)\). Also given is \({\varepsilon }\in (0,1)\).
Description of the algorithm
- Let
$$\begin{aligned} \delta :=\frac{E_0 {\varepsilon }}{105(4+2E_0)}<\frac{{\varepsilon }}{210}. \end{aligned}$$
- Let
$$\begin{aligned} K:=\lceil \delta ^{-1} \rceil +1. \end{aligned}$$
- Choose a positive integer
$$\begin{aligned} n_0\ge K\vee \frac{4K^2}{420^3E_0} \end{aligned}$$
such that
$$\begin{aligned} \frac{\log f(n)}{n} < (1+\frac{\delta }{2})E_0, \forall n \ge n_0; \end{aligned}$$
in view of conditions \(\mathcal C^*\), this last condition is equivalent to \(\frac{\log f(n)}{n}< (1+\frac{\delta }{2})E_0, n_0 \le n <2 n_0\).
- Choose intervals so large that all the lengths of words we manipulate stay in one of them. Namely, for each \(t \ge 0\), let
$$\begin{aligned} n_{t+1}:= \exp (K((1+\delta )^2E_0n_t+E_0)). \end{aligned}$$
We take
$$\begin{aligned} N:=n_K. \end{aligned}$$
- Choose a set \(Y \subset A^N\): for each possible Y, we define \(L_n(Y)=\cup _{{\gamma }\in Y}L_n({\gamma })\), \(q_n(Y):=|L_n(Y)|\), for \(1 \le n \le N\). We look at those Y for which \(q_n(Y) \le f(n), \forall n \le N\), and choose one among them such that
$$\begin{aligned} \min _{1 \le n \le N}\frac{\log q_n(Y)}{n} \end{aligned}$$
is maximum.
- By Lemma 7 below, on one of the large intervals we have defined, namely \([n_r,n_{r+1}]\), \(\frac{\log q_n(Y)}{n}\) will be almost constant. Let
$$\begin{aligned} h:=\frac{\log q_{n_{r}}(Y)}{n_{r}}. \end{aligned}$$
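To get a feeling for the sizes involved in the first steps, the constants can be evaluated for sample inputs. The sketch below is only an illustration: the function names are ours, the inputs \(E_0=\log 2\) and \({\varepsilon }=0.1\) are arbitrary, and \(n_0\) is taken equal to its lower bound without checking the condition on f. Since \(n_{t+1}\) is an exponential, only its logarithm is computed.

```python
import math


def parameters(E0: float, eps: float):
    """delta, K and the lower bound on n_0 from the first three steps."""
    delta = E0 * eps / (105 * (4 + 2 * E0))
    K = math.ceil(1 / delta) + 1
    n0_min = max(K, 4 * K ** 2 / (420 ** 3 * E0))
    return delta, K, n0_min


def log_next_n(n_t: float, E0: float, delta: float, K: int) -> float:
    """log n_{t+1} = K((1 + delta)^2 E0 n_t + E0); n_{t+1} itself is too large."""
    return K * ((1 + delta) ** 2 * E0 * n_t + E0)


E0, eps = math.log(2), 0.1   # illustrative inputs, not taken from the text
delta, K, n0_min = parameters(E0, eps)
print("delta =", delta, "K =", K, "lower bound on n_0 =", n0_min)
print("log n_1 >=", log_next_n(n0_min, E0, delta, K))
```

Already \(\log n_1\) is in the millions for these inputs, and \(N=n_K\) is obtained after K such exponentiations, so the enumeration of the sets Y is to be read as an explicit finite procedure rather than as something to run literally.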
Here is the lemma we needed; henceforth, Y is fixed and we omit to mention it in the \(q_n(Y)\):
Lemma 7
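There exists \(r<K\), such that
$$\begin{aligned} \frac{\log q_{n_{r}}}{n_{r}} < (1+\delta )\frac{\log q_{n_{r+1}}}{n_{r+1}}. \end{aligned}$$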
Proof
Otherwise \(\frac{\log q_{n_0}}{n_0}\ge (1+\delta )^K \frac{\log q_{n_K}}{{n_K}}\): as \(K>\frac{1}{\delta }\), \((1+\delta )^K\) would be close to e for \(\delta \) small enough, and is larger than \(\frac{9}{4}\) as \(\delta <\frac{1}{2}\); thus, as \(\frac{\log q_{n_K}}{n_K} \ge E_W\) by the proof of Proposition 8 below, we have \(\frac{\log q_{n_0}}{n_0}\ge \frac{9}{4}E_W\), but \(q_{n_0}\le f(n_0)\) hence \(\frac{\log q_{n_0}}{n_0} < (1+\frac{\delta }{2})E_0\), and this contradicts \(E_0\le 2E_W\), which is true by Theorem 4.
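The last step of this proof is short on detail: combining \(\frac{9}{4}E_W\le \frac{\log q_{n_0}}{n_0} < (1+\frac{\delta }{2})E_0\) with \(E_0\le 2E_W\) would give \(\frac{9}{4} < 2+\delta \), that is \(\delta >\frac{1}{4}\), which is impossible since \(\delta <\frac{{\varepsilon }}{210}\). The numerical claim \((1+\delta )^K>\frac{9}{4}\) can be sanity-checked as follows; the function name and the sample values of \(\delta \) are our own.

```python
import math


def growth_factor(delta: float) -> float:
    """(1 + delta)^K with K = ceil(1/delta) + 1, as in the algorithm."""
    K = math.ceil(1 / delta) + 1
    return (1 + delta) ** K


for delta in (0.4, 0.1, 0.01, 0.001):
    print(delta, growth_factor(delta))
```

For small \(\delta \) the printed values approach e, in line with the remark in the proof.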
We prove now that indeed h is a good approximation of the word entropy.
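Proposition 8
\(h\ge E_W\).
Proof
We prove that
$$\begin{aligned} \min _{1 \le n \le N}\frac{\log q_n}{n}\ge E_W, \end{aligned}$$
which gives the proposition, since \(h=\frac{\log q_{n_r}}{n_r}\) is one of the values over which the minimum is taken.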
We know by Theorem 5 that there is \(\hat{w} \in W(f)\) with \(p_{\hat{w}}(n) \ge \exp (E_W n)\), for all \(n \ge 1\). For such a word \(\hat{w}\), let \(X:=L_N(\hat{w})\subset A^N\). We have, for each n with \(1\le n \le N\), \(L_n(X)=L_n(\hat{w})\) and \(f(n)\ge |L_n(\hat{w})|=p_{\hat{w}}(n)\ge \exp (E_W n)\). Thus X is one of the possible Y, and the result follows from the maximality of \(\min _{1 \le n \le N}\frac{\log q_n}{n}\).
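The choice of Y can be imitated by brute force on a toy scale. The sketch below is not the algorithm itself: the true N is far too large to enumerate, whereas here \(N=3\) over the binary alphabet, with \(f(n)=\lceil 3^{n/2} \rceil \) as a sample bound. The names `factors`, `q_profile` and `best_Y` are ours.

```python
import math
from itertools import combinations, product


def f(n: int) -> int:
    """Sample bound ceil(3^(n/2)), computed exactly."""
    return math.isqrt(3 ** n - 1) + 1


def factors(word: str, n: int) -> set:
    """Set of factors of length n of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def q_profile(Y, N: int) -> list:
    """[q_1(Y), ..., q_N(Y)], where q_n(Y) counts factors of length n of words of Y."""
    return [len(set().union(*(factors(g, n) for g in Y))) for n in range(1, N + 1)]


def best_Y(N: int, alphabet: str = "01"):
    """Exhaustive search for an admissible Y maximising min_n log(q_n(Y)) / n."""
    words = ["".join(t) for t in product(alphabet, repeat=N)]
    best, best_val = None, -1.0
    for size in range(1, len(words) + 1):
        for Y in combinations(words, size):
            q = q_profile(Y, N)
            if all(q[n - 1] <= f(n) for n in range(1, N + 1)):
                val = min(math.log(q[n - 1]) / n for n in range(1, N + 1))
                if val > best_val:
                    best, best_val = Y, val
    return best, best_val


Y, val = best_Y(3)
print(Y, q_profile(Y, 3), val)
```

With these toy inputs the constraint \(q_2(Y)\le 3\) forces Y to avoid 00 or 11, so at most five words of length 3 survive and the optimum is \(\frac{\log 5}{3}\).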
What remains to prove is the following proposition (which, understandably, does not use the maximality of \(\min _{1 \le n \le N}\frac{\log q_n}{n}\)).
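Proposition 9
\((1-{\varepsilon })h\le E_W\).
Proof
Our strategy is to build a word w such that, for all \(n\ge 1\),
$$\begin{aligned} \exp ((1-{\varepsilon })hn)\le p_w(n)\le f(n), \end{aligned}$$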
which gives the conclusion by definition of \(E_W\). To build the word w, we shall define an integer m, and build successive subsets of \(L_m(Y)\); for such a subset Z, we order it (lexicographically for example) and define w(Z) to be the Champernowne word on Z: namely, if \(Z=\{{\beta }_1,{\beta }_2,...,{\beta }_t\}\), we build the infinite word
$$\begin{aligned} w(Z)={\beta }_1{\beta }_2\ldots {\beta }_t\,{\beta }_1{\beta }_1\,{\beta }_1{\beta }_2\ldots {\beta }_t{\beta }_t\,{\beta }_1{\beta }_1{\beta }_1\ldots \end{aligned}$$
made by concatenation of all words in Z followed by the concatenations of all pairs of words of Z followed by the concatenations of all triples of words of Z, etc.
The word w(Z) will satisfy \(\exp ((1-{\varepsilon })hn)\le p_{w(Z)}(n)\) for all n as soon as
$$\begin{aligned} |Z|\ge \exp ((1-{\varepsilon })hm), \end{aligned}$$
since, for every positive integer k, we will have at least \(|Z|^k\) factors of length km in w(Z).
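The counting argument can be observed on a finite prefix of a Champernowne word. In the sketch below, the set Z is a hypothetical toy example with \(m=3\), and `champernowne_prefix` (our name) stops after the tuples of a given maximal size, so it produces only a prefix of w(Z).

```python
from itertools import product


def champernowne_prefix(Z, max_tuple: int) -> str:
    """All words of Z, then all pairs, ..., up to max_tuple-tuples, concatenated.

    Z is sorted first, and tuples are taken in lexicographic order.
    """
    Z = sorted(Z)
    return "".join("".join(t)
                   for k in range(1, max_tuple + 1)
                   for t in product(Z, repeat=k))


def complexity(word: str, n: int) -> int:
    """Number of distinct factors of length n of a finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


Z = ["001", "010", "100"]   # hypothetical toy set of words of length m = 3
m = 3
w = champernowne_prefix(Z, 4)
for k in (1, 2, 3, 4):
    print(k, complexity(w, k * m), len(Z) ** k)
```

Because all words of Z have the same length m, distinct k-tuples give distinct concatenations, which is why at least \(|Z|^k\) factors of length km appear.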
The successive (decreasing) subsets Z of \(L_m(Y)\) we build will all have cardinality at least \(\exp ((1- {\varepsilon })h m)\), and the words w(Z) will satisfy \(p_{w(Z)}(n)\le f(n)\) for n in an interval which will increase at each new set Z we build, and ultimately contains all the integers.
We give only the main ideas of the remaining proof. In the first stage we define two lengths of words, \(\hat{n}\) and \(m>\frac{\hat{n}}{2{\varepsilon }}\), which will be both in the interval \([n_r,n_{r+1}]\), and a set \(Z_1\) of words of length m of the form \(\gamma \theta \), for words \({\gamma }\) of length \(\hat{n}\), such that the word \(\gamma \theta \gamma \) is in \(L_{m+\hat{n}}(Y)\). This is done by looking precisely at twin occurrences of words.
Let \(\tilde{\varepsilon }= \frac{{\varepsilon }}{15}=\frac{7(4+2E_0)\delta }{E_0}>14 \delta \); then we can get such a set \(Z_1\) with \(|Z_1|\ge \exp ((1-\tilde{\varepsilon })h (m+\hat{n}))\).
In the second stage, we define a new set \(Z_2\subset Z_1\) in which all the words have the same prefix \({\gamma }_1\) of length \(6\tilde{\varepsilon }hm\), and all the words have the same suffix \({\gamma }_2\) of length \(6\tilde{\varepsilon }hm\), with \(|Z_2|\ge |Z_1|\exp (-12\tilde{\varepsilon }h m-2\delta h\hat{n})\), and \(2\delta h\hat{n} \le (1-\tilde{\varepsilon })h\hat{n}\), thus
$$\begin{aligned} |Z_2|\ge \exp ((1-13\tilde{\varepsilon })hm). \end{aligned}$$
As a consequence of the definition of \(Z_2\), all words of \(Z_2\) have the same prefix of length \(\hat{n}\), which is a prefix \({\gamma }_0\) of \({\gamma }_1\); as \(Z_2\) is included in \(Z_1\), any word of \(Z_2\) is of the form \(\gamma _0\theta \), and the word \(\gamma _0\theta \gamma _0\) is in \(L_{m+\hat{n}}(Y)\).
At this stage we can prove
Claim
\(p_{w(Z_2)}(n)\le f(n)\) for all \(1\le n\le \hat{n}+1\).
Let us shrink again our set of words.
Lemma 10
For a given subset Z of \(Z_2\), there exists \(Z'\subset Z\), \(|Z'|\ge (1-\exp (-(j-1)\frac{E_0}{2}))^j|Z|\), such that the total number of factors of length \(\hat{n}+j\) of all words \({\gamma }_0\theta {\gamma }_0\) such that \({\gamma }_0\theta \) is in \(Z'\) is at most \(f(\hat{n}+j)-j\).
We start from \(Z_2\) and apply successively Lemma 10 from \(j=2\) to \(j=6\tilde{\varepsilon }m\), getting \(6\tilde{\varepsilon }m-1\) successive sets \(Z'\); at the end, we get a set \(Z_3\) such that the total number of factors of length \(\hat{n}+j\) of words \({\gamma }_0\theta {\gamma }_0\) for \({\gamma }_0\theta \) in \(Z_3\) is at most \(f(\hat{n}+j)-j\) for \(j=2,\ldots , 6\tilde{\varepsilon }m\), and \(\frac{|Z_3|}{|Z_2|}\) is at least
$$\begin{aligned} \prod _{j=2}^{6\tilde{\varepsilon }m}\left( 1-\exp \left( -(j-1)\frac{E_0}{2}\right) \right) ^j, \end{aligned}$$
which implies after computations that
We can now bound the number of short factors by using the factors we have just deleted and properties of \({\gamma }_0\), \({\gamma }_1\) and \({\gamma }_2\).
Claim
\(p_{w(Z_3)}(n)\le f(n)\) for all \(1\le n\le 6\tilde{\varepsilon }m\).
We shrink our set again.
Let \(m \ge n > 6 \tilde{\varepsilon }m\); on average a factor of length n of a word in \(Z_3\) occurs in at most \(\frac{m|Z_3|}{f(n)}\) elements of \(Z_3\). We consider the \(\frac{f(n)}{mn^2}\) factors of length n which occur the least often. In total, these factors occur in at most \(\frac{m|Z_3|}{f(n)}\frac{f(n)}{mn^2}=\frac{|Z_3|}{n^2}\) elements of \(Z_3\). We remove these words from \(Z_3\), for all \(m \ge n > 6 \tilde{\varepsilon }m\), obtaining a set \(Z_4\) with \(|Z_4| \ge \exp ((1-15\tilde{\varepsilon })h m)\).
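To see why this pruning costs little, one can sum the losses over all the lengths n involved. The following is only a sketch, assuming \(6\tilde{\varepsilon }m\ge 1\) and using the elementary bound \(\frac{1}{n^2}<\frac{1}{n-1}-\frac{1}{n}\) for \(n\ge 2\):
$$\begin{aligned} |Z_3\setminus Z_4|\le \sum _{6\tilde{\varepsilon }m<n\le m}\frac{|Z_3|}{n^2} < |Z_3|\sum _{n>\lfloor 6\tilde{\varepsilon }m\rfloor }\left( \frac{1}{n-1}-\frac{1}{n}\right) =\frac{|Z_3|}{\lfloor 6\tilde{\varepsilon }m\rfloor }. \end{aligned}$$
Thus at most a proportion \(\frac{1}{\lfloor 6\tilde{\varepsilon }m\rfloor }\) of the words of \(Z_3\) is discarded at this stage, which is small when m is large.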
We can now control medium length factors, using again the missing factors we have just created, and \({\gamma }_1\) and \({\gamma }_2\), but not \({\gamma }_0\).
Claim
\(p_{w (Z_4)}(n)\le f(n)\) for all \(1\le n\le m\).
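Finally we put \(Z_5=Z_4\) if \(|Z_4|\le \exp ((1-4\tilde{\varepsilon })hm)\), otherwise we take for \(Z_5\) any subset of \(Z_4\) with \(\lceil \exp ((1-4\tilde{\varepsilon })hm)\rceil \) elements. In both cases we have
$$\begin{aligned} \exp ((1-{\varepsilon })hm)\le |Z_5|\le \lceil \exp ((1-4\tilde{\varepsilon })hm)\rceil . \end{aligned}$$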
For the long factors, we use mainly the fact that there are many missing factors of length m, but we need also some help from \({\gamma }_1\) and \({\gamma }_2\).
Claim
\(p_{w(Z_5)}(n)\le f(n)\) for all n.
In view of the considerations at the beginning of the proof of Proposition 9, the last claim above completes the proof of that proposition, and thus of Theorem 6.
4 Application
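We define
$$\begin{aligned} C(f)=\left\{ x=\sum _{n\ge 0}\frac{w_n}{q^{n+1}}\in [0,1],\ w_0w_1\cdots w_n\cdots \in W(f)\right\} . \end{aligned}$$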
We are interested in the Hausdorff dimension of this set, see [1] for definitions; indeed, the main motivation for studying the word entropy is Theorem 4.8 of [5]:
Theorem 11
The Hausdorff dimension of C(f) is equal to \(E_W(f)/\log q\).
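As a hedged illustration of Theorem 11, take \(q=2\) and let f(n) be the number of binary words of length n with no factor 11. We assume here, as is standard for this irreducible shift of finite type, that f is the complexity function of some infinite word, so that \(E_W(f)=E_0(f)=\log (\frac{1+\sqrt{5}}{2})\) by the first property of Section 2. The sketch below, with names of our own, estimates \(E_0(f)\) from one large value of n and divides by \(\log q\).

```python
import math


def f(n: int) -> int:
    """Number of binary words of length n avoiding 11 (Fibonacci recurrence)."""
    a, b = 1, 2          # counts for lengths 0 and 1
    for _ in range(n):
        a, b = b, a + b
    return a


q = 2
n = 2000
rate = math.log(f(n)) / n           # finite-n estimate of E_0(f)
dimension = rate / math.log(q)      # Theorem 11, under the assumption above
exact = math.log((1 + math.sqrt(5)) / 2) / math.log(2)
print(dimension, exact)
```

The two printed values agree to about three decimal places; the remaining difference comes from evaluating \(\frac{1}{n}\log f(n)\) at a finite n.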
References
1. Falconer, K.: Fractal Geometry: Mathematical Foundations and Applications. Wiley, Chichester (1990)
2. Ferenczi, S.: Complexity of sequences and dynamical systems. Discrete Math. 206(1–3), 145–154 (1999). http://dx.doi.org/10.1016/S0012-365X(98)00400-2, (Tiruchirappalli 1996)
3. Mauduit, C., Moreira, C.G.: Complexity of infinite sequences with zero entropy. Acta Arith. 142(4), 331–346 (2010). http://dx.doi.org/10.4064/aa142-4-3
4. Mauduit, C., Moreira, C.G.: Generalized Hausdorff dimensions of sets of real numbers with zero entropy expansion. Ergodic Theor. Dynam. Syst. 32(3), 1073–1089 (2012). http://dx.doi.org/10.1017/S0143385711000137
5. Mauduit, C., Moreira, C.G.: Complexity and fractal dimensions for infinite sequences with positive entropy (2017)
© 2017 Springer International Publishing AG
Ferenczi, S., Mauduit, C., Moreira, C.G. (2017). The Word Entropy and How to Compute It. In: Brlek, S., Dolce, F., Reutenauer, C., Vandomme, É. (eds) Combinatorics on Words. WORDS 2017. Lecture Notes in Computer Science(), vol 10432. Springer, Cham. https://doi.org/10.1007/978-3-319-66396-8_15
DOI: https://doi.org/10.1007/978-3-319-66396-8_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66395-1
Online ISBN: 978-3-319-66396-8
eBook Packages: Computer Science, Computer Science (R0)