1 Definitions

Let A be the finite alphabet \(\{0,1,\dots ,q-1\}\). For \(w\in A^{{\mathbb N}}\), we denote by L(w) the set of finite factors of w and, for any non-negative integer n, we write \(L_n(w)=L(w)\cap A^n\). The classical complexity function is described for example in [2].

Definition 1

The complexity function of \(w\in A^{{\mathbb N}}\) is defined for any non-negative integer n by \(p_w(n)=|L_n(w)|\).
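
The definition is easy to experiment with. Here is a minimal Python sketch (ours, not from [2]) which computes \(p_w(n)\) from a finite prefix of w; for a genuinely infinite word only a prefix is available, so the value is a lower bound for \(p_w(n)\), exact as soon as the prefix exhibits every factor of length n.

    # Minimal sketch: estimate p_w(n) from a finite prefix of w.
    # For an infinite word this is a lower bound, exact once the
    # prefix exhibits every factor of length n.
    def complexity(prefix, n):
        return len({prefix[i:i + n] for i in range(len(prefix) - n + 1)})

    # Example: the Thue-Morse word, whose complexity starts 2, 4, 6.
    tm = '0'
    for _ in range(10):  # apply the morphism 0 -> 01, 1 -> 10 ten times
        tm = tm.replace('0', '0a').replace('1', '10').replace('a', '1')
    print([complexity(tm, n) for n in (1, 2, 3)])  # [2, 4, 6]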

Our work concerns the study of infinite words w whose complexity function is bounded by a given function f from \({\mathbb N}\) to \({\mathbb R}^{+}\). More precisely, if f is such a function, we put

$$\begin{aligned} W(f)=\{w\in A^{{\mathbb N}}, p_w(n)\le f(n), \forall n \in {\mathbb N}\}. \end{aligned}$$

Definition 2

If f is a function from \({\mathbb N}\) to \({\mathbb R}^{+}\), we call exponential rate of growth of f the quantity

$$\begin{aligned} E_0(f)=\liminf _{n\rightarrow \infty } \frac{1}{n}\log f(n) \end{aligned}$$

and word entropy of f the quantity

$$\begin{aligned} E_W (f) =\sup _{w \in W(f)} E_0(p_w). \end{aligned}$$

Of course, if \(E_0(f)=0\) then \(E_W(f)=0\) as well. Thus the study of \(E_W\) is interesting only when f has exponential growth: we are then in the little-explored field of combinatorics on words in positive entropy, or exponential complexity. For an analogous theory in zero entropy, see [3, 4].

2 First Properties of \(E_0\) and \(E_W\)

The basic study of these quantities is carried out in [5], where the following results are proved.

If f is itself a complexity function (i.e. \(f=p_w\) for some \(w \in A^{\mathbb N}\)), then \(E_W (f)=E_0 (f)\). But in general \(E_W\) may be much smaller than \(E_0\).

We define mild regularity conditions for f: f is said to satisfy \((\mathcal {C})\) if the sequence \(\left( f(n)\right) _{n\ge 1}\) is strictly increasing, there exists \(n_0\in {\mathbb N}\) such that \(f(n+1) \le f(1) f(n)\) for every \(n\ge n_0\), and the sequence \(\left( \frac{1}{n} \log f(n)\right) _{n\ge 1}\) converges.

But for each \(1<\theta \le q\) and each \(n_0 \in {\mathbb N}\) such that \(\theta ^{n_0+1} > n_0+q-1\), we can define a function f by \(f(1)=q\), \(f(n) = n+q-1\) for \(1 \le n \le n_0\) and \(f(n)=\theta ^n\) for \(n > n_0\). We have \(E_0(f) = \log \theta \), and it is proved in [5] that

$$\begin{aligned} E_W(f) \le \frac{1}{n_0} \log (n_0+q-1), \end{aligned}$$

which can be made arbitrarily small, independently of \(\theta \), while f satisfies \((\mathcal {C})\).
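
As a quick numerical illustration of this phenomenon (the particular values of q, \(\theta \) and \(n_0\) below are our own choices, subject to \(\theta ^{n_0+1} > n_0+q-1\)):

    # E_0(f) = log(theta) stays large while the bound on E_W(f)
    # becomes arbitrarily small as n_0 grows. Sample values only.
    from math import log

    q, theta, n0 = 2, 1.9, 50
    assert theta ** (n0 + 1) > n0 + q - 1     # hypothesis on n_0
    print(log(theta))                         # E_0(f), about 0.642
    print(log(n0 + q - 1) / n0)               # bound on E_W(f), about 0.079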

We define stronger regularity conditions for f.

Definition 3

We say that a function f from \({\mathbb N}\) to \({\mathbb R}^{+}\) satisfies the conditions \((\mathcal C^*)\) if (i) for any \(n \in {\mathbb N}\) we have \(f(n+1) > f(n) \ge n+1\); (ii) for any \((n, n' ) \in {\mathbb N}^2\) we have \(f(n+n') \le f(n) f(n') \).

But even with \((\mathcal C^*)\) we may have \(E_W(f) < E_0(f)\). Indeed, let f be the function defined by \(f(n)=\lceil 3^{n/2} \rceil \) for any \(n \in {\mathbb N}\). Then it is easy to check that f satisfies conditions \((\mathcal C^*)\) and that \(E_0(f)=\lim \limits _{n\rightarrow \infty }\frac{1}{n} \log f(n) =\log (\sqrt{3})\). On the other hand, we have \(f(1)=2\) and \(f(2)=3\); thus the language of any \(w\in W(f)\) misses a word of length 2, in the worst case 00 or 11, and this implies that \(E_W(f) \le \log (\frac{1+\sqrt{5}}{2})<E_0(f)\).
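
Both assertions about this example can be checked numerically. The following sketch (ours) verifies conditions \((\mathcal C^*)\) on an initial range and compares the two constants; exact integer arithmetic is used to avoid floating-point issues with \(\lceil 3^{n/2} \rceil \).

    # Finite-range check of (C*) for f(n) = ceil(3^(n/2)), and the
    # comparison log((1+sqrt(5))/2) < log(sqrt(3)).
    from math import isqrt, log, sqrt

    def f(n):                     # ceil(3^(n/2)), computed exactly
        m = 3 ** n
        s = isqrt(m)
        return s if s * s == m else s + 1

    for n in range(1, 40):
        assert f(n + 1) > f(n) >= n + 1       # condition (i)
        for k in range(1, n):
            assert f(n) <= f(k) * f(n - k)    # condition (ii)
    print(log((1 + sqrt(5)) / 2), log(sqrt(3)))   # 0.4812... < 0.5493...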

Nevertheless, under these conditions, we have the following important result.

Theorem 4

If f is a function from \({\mathbb N}\) to \({\mathbb R}^{+}\) satisfying the conditions \((\mathcal C^*)\), then \(E_W(f)> \frac{1}{2} E_0(f)\).

It is also shown in [5] that the constant \(\frac{1}{2}\) is optimal.

Finally, it will be useful to know that

Theorem 5

For any function f from \({\mathbb N}\) to \({\mathbb R}^{+}\), there exists \(w \in W(f)\) such that for any \( n \in {\mathbb N}\) we have \(p_w(n) \ge \exp (E_W(f) n)\).

3 Algorithm

In general \(E_W(f)\) is much more difficult to compute than \(E_0(f)\); we now give an algorithm which allows us to estimate \(E_W(f)\) with arbitrary precision from finitely many values of f, provided we already know \(E_0(f)\) and have some information on the speed at which the limit defining it is approached.

We assume that f satisfies conditions \(\mathcal C^*\). We do not lose much generality with this assumption: if f satisfies only the weaker conditions \(\mathcal C\), we can replace it by the function \(\tilde{f}\) given recursively by

$$\begin{aligned} \tilde{f}(n):=\min \{f(n), \min _{1 \le k <n}\tilde{f}(k) \tilde{f}(n-k) \}, \end{aligned}$$

which satisfies conditions \(\mathcal C^*\), is such that \(\tilde{f}(n) \le f(n)\) for all \(n \in {\mathbb N}\), and satisfies \(W(\tilde{f})=W(f)\).
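
This regularization is a direct dynamic program. A short sketch (ours; we take \(\tilde{f}(1)=f(1)\) as the base case, which the recursion leaves implicit):

    # Regularization f -> f~: f~(n) is the minimum of f(n) and of all
    # products f~(k) f~(n-k); base case f~(1) = f(1) (our assumption).
    def regularize(f, N):
        ft = {1: f(1)}
        for n in range(2, N + 1):
            ft[n] = min(f(n), min(ft[k] * ft[n - k] for k in range(1, n)))
        return ft

    print(regularize(lambda n: 3 ** n, 5))   # here f~ = f, as f is multiplicative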

Theorem 6

There is an algorithm which produces, starting from f and \({\varepsilon }\), a quantity h such that \((1-{\varepsilon })h\le E_W(f) \le h\). Here h depends explicitly on \({\varepsilon }\), \(E_0(f)\), N, f(1), ..., f(N), for an integer N which depends explicitly on \({\varepsilon }\), \(E_0(f)\) and an integer \(n_0\); this \(n_0\) is larger than an explicit function of \({\varepsilon }\) and \(E_0(f)\) and is such that

$$\begin{aligned} \frac{\log f(n)}{n}< (1+\frac{E_0(f) {\varepsilon }}{210(4+2E_0(f))})E_0(f), \quad \text{ for }\quad n_0 \le n <2 n_0. \end{aligned}$$

We shall now give the algorithm. The function f is given, and henceforth we omit it from the notation, writing \(E_0\) and \(E_W\) for \(E_0(f)\) and \(E_W(f)\). Also given is \({\varepsilon }\in (0,1)\).

Description of the algorithm

  • Let

    $$\begin{aligned} \delta :=\frac{E_0 {\varepsilon }}{105(4+2E_0)}<\frac{{\varepsilon }}{210}. \end{aligned}$$
  • Let

    $$\begin{aligned} K:=\lceil \delta ^{-1} \rceil +1. \end{aligned}$$
  • Choose a positive integer

    $$\begin{aligned} n_0\ge K\vee \frac{4K^2}{420^3E_0} \end{aligned}$$

    such that

    $$\begin{aligned} \frac{\log f(n)}{n} < (1+\frac{\delta }{2})E_0, \forall n \ge n_0; \end{aligned}$$

    in view of conditions \(\mathcal C^*\), this last condition is equivalent to \(\frac{\log f(n)}{n}< (1+\frac{\delta }{2})E_0\) for \(n_0 \le n <2 n_0\).

  • Choose intervals so large that all the lengths of words we manipulate stay in one of them. Namely, for each \(t \ge 0\), let

    $$\begin{aligned} n_{t+1}:= \exp (K((1+\delta )^2E_0n_t+E_0)). \end{aligned}$$

    We take

    $$\begin{aligned} N:=n_K. \end{aligned}$$
  • Choose a set \(Y \subset A^N\): for each possible Y, we define \(L_n(Y)=\bigcup _{{\gamma }\in Y}L_n({\gamma })\) and \(q_n(Y):=|L_n(Y)|\), for \(1 \le n \le N\). We look at those Y for which \(q_n(Y) \le f(n), \forall n \le N\), and choose among them one for which

    $$\begin{aligned} \min _{1 \le n \le N}\frac{\log q_n(Y)}{n} \end{aligned}$$

    is maximal (a brute-force sketch of this selection step, on toy parameters, follows the list).

  • By Lemma 7 below, on one of the large intervals we have defined, namely \([n_r,n_{r+1}]\), \(\frac{\log q_n(Y)}{n}\) will be almost constant. Let

    $$\begin{aligned} h:=\frac{\log q_{n_{r}}(Y)}{n_{r}}. \end{aligned}$$
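
The last two steps are the combinatorial heart of the algorithm. The true N is far too large for exhaustive search, but on toy parameters the selection of Y can be carried out literally. Here is a brute-force Python sketch; the values of q, N and the bound f are ours and purely illustrative, and the final reading of h on a plateau interval \([n_r,n_{r+1}]\) is omitted.

    # Brute-force sketch of the choice of Y on toy parameters: enumerate
    # the subsets Y of A^N, keep those with q_n(Y) <= f(n) for n <= N,
    # and maximize min_n (log q_n(Y))/n. Illustrative values only.
    from itertools import combinations, product
    from math import log

    q, N = 2, 4
    words = [''.join(t) for t in product('01', repeat=N)]
    f = lambda n: 2 ** (0.8 * n) + n          # toy bound, our choice

    def q_n(Y, n):                            # q_n(Y) = |L_n(Y)|
        return len({y[i:i + n] for y in Y for i in range(N - n + 1)})

    best, score = None, -1.0
    for size in range(1, len(words) + 1):
        for Y in combinations(words, size):
            qs = [q_n(Y, n) for n in range(1, N + 1)]
            if all(qs[n - 1] <= f(n) for n in range(1, N + 1)):
                s = min(log(qs[n - 1]) / n for n in range(1, N + 1))
                if s > score:
                    best, score = Y, s
    print(score, best)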

Here is the lemma we need; henceforth Y is fixed, and we omit it from the notation, writing \(q_n\) for \(q_n(Y)\):

Lemma 7

There exists \(r<K\), such that

$$\begin{aligned} \frac{\log q_{n_r}}{n_{r}} < (1+\delta ) \frac{\log q_{n_{r+1}}}{{n_{r+1}}}. \end{aligned}$$

Proof

Otherwise we would have \(\frac{\log q_{n_0}}{n_0}\ge (1+\delta )^K \frac{\log q_{n_K}}{{n_K}}\). As \(K>\frac{1}{\delta }\), \((1+\delta )^K\) is close to e for \(\delta \) small enough, and is larger than \(\frac{9}{4}\) since \(\delta <\frac{1}{2}\); thus, as \(\frac{\log q_{n_K}}{n_K} \ge E_W\) by the proof of Proposition 8 below, we get \(\frac{\log q_{n_0}}{n_0}\ge \frac{9}{4}E_W\). But \(q_{n_0}\le f(n_0)\), hence \(\frac{\log q_{n_0}}{n_0} < (1+\frac{\delta }{2})E_0\), and this contradicts \(E_0\le 2E_W\), which holds by Theorem 4.

We prove now that indeed h is a good approximation of the word entropy.

Proposition 8

$$\begin{aligned} h \ge E_W. \end{aligned}$$

Proof

We prove that

$$\begin{aligned} \min _{1 \le n \le N}\frac{\log q_n}{n} \ge E_W. \end{aligned}$$

We know by Theorem 5 that there is \(\hat{w} \in W(f)\) with \(p_{\hat{w}}(n) \ge \exp (E_W n)\) for all \(n \ge 1\). For such a word \(\hat{w}\), let \(X:=L_N(\hat{w})\subset A^N\). Then, for each n with \(1\le n \le N\), \(L_n(X)=L_n(\hat{w})\) and \(f(n)\ge |L_n(\hat{w})|=p_{\hat{w}}(n)\ge \exp (E_W n)\). Thus X is one of the possible Y, and the result follows from the maximality of \(\min _{1 \le n \le N}\frac{\log q_n}{n}\).

What remains to prove is the following proposition (which, of course, does not use the maximality of \(\min _{1 \le n \le N}\frac{\log q_n}{n}\)).

Proposition 9

$$\begin{aligned} (1-{\varepsilon })h\le E_W. \end{aligned}$$

Proof

Our strategy is to build a word w such that, for all \(n\ge 1\),

$$\begin{aligned} \exp ((1-{\varepsilon })hn)\le p_w(n)\le f(n), \end{aligned}$$

which gives the conclusion by definition of \(E_W\). To build the word w, we shall define an integer m, and build successive subsets of \(L_m(Y)\); for such a subset Z, we order it (lexicographically for example) and define w(Z) to be the Champernowne word on Z: namely, if \(Z=\{{\beta }_1,{\beta }_2,...,{\beta }_t\}\), we build the infinite word

$$\begin{aligned} w(Z):={\beta }_1{\beta }_2\cdots {\beta }_t\,{\beta }_1{\beta }_1\,{\beta }_1{\beta }_2\,{\beta }_1{\beta }_3\cdots {\beta }_t{\beta }_t\,{\beta }_1{\beta }_1{\beta }_1\cdots {\beta }_t{\beta }_t{\beta }_t\cdots \end{aligned}$$

made by concatenation of all words in Z followed by the concatenations of all pairs of words of Z followed by the concatenations of all triples of words of Z, etc.
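
To make the ordering concrete, here is a generator form of this construction (our sketch; Z is any ordered list of blocks):

    # The Champernowne word on Z: all blocks, then all pairs of blocks,
    # then all triples, ... each group in lexicographic order.
    from itertools import count, islice, product

    def champernowne(Z):
        for k in count(1):                 # k-tuples of blocks
            for blocks in product(Z, repeat=k):
                for b in blocks:
                    yield from b

    print(''.join(islice(champernowne(['01', '10']), 20)))

Note that for every positive integer k, each of the \(|Z|^k\) concatenations of k blocks occurs in w(Z); this is the counting fact used below.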

The word w(Z) will satisfy \(\exp ((1-{\varepsilon })hn)\le p_{w(Z)}(n)\) for all n as soon as

$$\begin{aligned} |Z| \ge \exp ((1- {\varepsilon })h m), \end{aligned}$$

since, for every positive integer k, we will have at least \(|Z|^k\) factors of length km in w(Z).

The successive (decreasing) subsets Z of \(L_m(Y)\) we build will all have cardinality at least \(\exp ((1- {\varepsilon })h m)\), and the words w(Z) will satisfy \(p_{w(Z)}(n)\le f(n)\) for n in an interval which grows at each new set Z we build and ultimately contains all the integers.

We give only the main ideas of the remaining proof. In the first stage we define two lengths of words, \(\hat{n}\) and \(m>\frac{\hat{n}}{2{\varepsilon }}\), which will both lie in the interval \([n_r,n_{r+1}]\), and a set \(Z_1\) of words of length m of the form \(\gamma \theta \), for words \({\gamma }\) of length \(\hat{n}\), such that the word \(\gamma \theta \gamma \) is in \(L_{m+\hat{n}}(Y)\). This is done by looking precisely at twin occurrences of words.

Let \(\tilde{\varepsilon }= \frac{{\varepsilon }}{15}=\frac{7(4+2E_0)\delta }{E_0}>14 \delta \); then we can get such a set \(Z_1\) with \(|Z_1|\ge \exp ((1-\tilde{\varepsilon })h (m+\hat{n}))\).

In the second stage, we define a new set \(Z_2\subset Z_1\) in which all the words have the same prefix \({\gamma }_1\) and the same suffix \({\gamma }_2\), both of length \(6\tilde{\varepsilon }hm\), with \(|Z_2|\ge |Z_1|\exp (-12\tilde{\varepsilon }h m-2\delta h\hat{n})\) and \(2\delta h\hat{n} \le (1-\tilde{\varepsilon })\hat{n}\), thus

$$\begin{aligned} |Z_2|\ge \exp ((1-13\tilde{\varepsilon })h m). \end{aligned}$$

As a consequence of the definition of \(Z_2\), all words of \(Z_2\) have the same prefix of length \(\hat{n}\), which is a prefix \({\gamma }_0\) of \({\gamma }_1\); as \(Z_2\) is included in \(Z_1\), any word of \(Z_2\) is of the form \(\gamma _0\theta \), and the word \(\gamma _0\theta \gamma _0\) is in \(L_{m+\hat{n}}(Y)\).

At this stage we can prove

Claim

\(p_{w(Z_2)}(n)\le f(n)\) for all \(1\le n\le \hat{n}+1\).

Let us now shrink our set of words again.

Lemma 10

For a given subset Z of \(Z_2\) and a given integer \(j\ge 2\), there exists \(Z'\subset Z\), with \(|Z'|\ge (1-\exp (-(j-1)\frac{E_0}{2}))^j|Z|\), such that the total number of factors of length \(\hat{n}+j\) of all words \({\gamma }_0\theta {\gamma }_0\) such that \({\gamma }_0\theta \) is in \(Z'\) is at most \(f(\hat{n}+j)-j\).

We start from \(Z_2\) and apply successively Lemma 10 from \(j=2\) to \(j=6\tilde{\varepsilon }m\), getting \(6\tilde{\varepsilon }m-1\) successive sets \(Z'\); at the end, we get a set \(Z_3\) such that the total number of factors of length \(\hat{n}+j\) of words \({\gamma }_0\theta {\gamma }_0\) for \({\gamma }_0\theta \) in \(Z_3\) is at most \(f(\hat{n}+j)-j\) for \(j=2,\ldots , 6\tilde{\varepsilon }m\), and \(\frac{|Z_3|}{|Z_2|}\) is at least

$$\begin{aligned} \varPi _{2 \le j \le 6 \tilde{\varepsilon }m-\hat{n}}(1-\exp (-(j-1)\frac{E_0}{2}))^j \ge \varPi _{j \ge 2}(1-\exp (-(j-1)\frac{E_0}{2}))^j, \end{aligned}$$

which implies after computations that

$$\begin{aligned} |Z_3| \ge \exp ((1-14\tilde{\varepsilon })h m). \end{aligned}$$
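
To get a feeling for the size of the infinite product, one can evaluate it numerically; the value of \(E_0\) below is an arbitrary illustrative choice of ours.

    # Evaluate prod_{j>=2} (1 - exp(-(j-1) E_0/2))^j for a sample E_0.
    # The factors tend to 1 geometrically fast, so the product converges
    # to a small but positive constant.
    from math import exp, log

    E0 = log(2)                        # illustrative value only
    p = 1.0
    for j in range(2, 200):            # factors beyond j ~ 200 are ~ 1
        p *= (1 - exp(-(j - 1) * E0 / 2)) ** j
    print(p)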

We can now bound the number of short factors by using the factors we have just deleted and properties of \({\gamma }_0\), \({\gamma }_1\) and \({\gamma }_2\).

Claim

\(p_{w(Z_3)}(n)\le f(n)\) for all \(1\le n\le 6\tilde{\varepsilon }m\).

We shrink our set again.

Let \(m \ge n > 6 \tilde{\varepsilon }m\); on average, a factor of length n of a word in \(Z_3\) occurs in at most \(\frac{m|Z_3|}{f(n)}\) elements of \(Z_3\). We consider the \(\frac{f(n)}{mn^2}\) factors of length n which occur the least often. In total, these factors occur in at most \(\frac{m|Z_3|}{f(n)}\frac{f(n)}{mn^2}=\frac{|Z_3|}{n^2}\) elements of \(Z_3\). We remove these elements from \(Z_3\), for all \(m \ge n > 6 \tilde{\varepsilon }m\); as \(\sum _{n>6\tilde{\varepsilon }m}\frac{1}{n^2}\) is small, we obtain a set \(Z_4\) with \(|Z_4| \ge \exp ((1-15\tilde{\varepsilon })h m)\).

We can now control medium length factors, using again the missing factors we have just created, and \({\gamma }_1\) and \({\gamma }_2\), but not \({\gamma }_0\).

Claim

\(p_{w (Z_4)}(n)\le f(n)\) for all \(1\le n\le m\).

Finally we put \(Z_5=Z_4\) if \(|Z_4|\le \exp ((1-4\tilde{\varepsilon })hm)\), otherwise we take for \(Z_5\) any subset of \(Z_4\) with \(\lceil \exp ((1-4\tilde{\varepsilon })hm)\rceil \) elements. In both cases we have

$$\begin{aligned} |Z_5| \ge \exp ((1- {\varepsilon })h m). \end{aligned}$$

For the long factors, we use mainly the fact that there are many missing factors of length m, but we need also some help from \({\gamma }_1\) and \({\gamma }_2\).

Claim

\(p_{w(Z_5)}(n)\le f(n)\) for all n.

In view of the considerations at the beginning of the proof of Proposition 9, this last Claim completes the proof of that proposition, and thus of Theorem 6.

4 Application

We define

$$\begin{aligned} C(f)=\{x = \sum \limits _{n \ge 0} \frac{w_n}{q^{n+ 1}} \in [0,1], w(x) = {w_0}{w_1}\cdots {w_n}\cdots \in W(f)\}. \end{aligned}$$
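
Concretely, x is the real number whose base-q digits are the letters of w(x); a two-line sketch (ours) of the partial sums:

    # The real number attached to a (prefix of a) word w over {0,...,q-1}:
    # the partial sum of sum_n w_n / q^(n+1).
    def real_of_word(prefix, q):
        return sum(int(c) / q ** (n + 1) for n, c in enumerate(prefix))

    print(real_of_word('0110', 2))     # 0.375, i.e. 0.0110 in base 2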

We are interested in the Hausdorff dimension of this set, see [1] for definitions; indeed, the main motivation for studying the word entropy is Theorem 4.8 of [5]:

Theorem 11

The Hausdorff dimension of C(f) is equal to \(E_W(f)/\log q\).
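
As a quick sanity check (our example, not from [5]): if \(f(n)=q^n\) for every n, then \(W(f)=A^{{\mathbb N}}\) and \(E_W(f)=\log q\), while C(f) is the whole interval [0,1] up to the countable set of reals with two base-q expansions; its Hausdorff dimension is indeed \(1=E_W(f)/\log q\).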