4.1 Introduction

We recall that, given a finite set Δ, a sequence \(f:\mathbb {N}\to \varDelta \) is said to be k-automatic if the nth term of this sequence is generated by a finite-state machine taking the base-k expansion of n as input, starting with the least significant digit. Automatic sequences appear in many different contexts (see, e.g., [14, 189]), and there are many examples of their application to algebra, such as their involvement in the characterization of algebraic power series over finite fields [14, Chapter 12]. The purpose of this chapter, however, is to look at applications of algebra, in particular the theory of polynomial identities, in giving a characterization of automatic sequences due to Berstel and Reutenauer [77, Chapter 3] and to use this characterization to answer some open questions about automatic and regular sequences.

Another way of defining the k-automatic property comes from looking at the k-kernel of a sequence. The k-kernel of a sequence f(n) is defined to be the collection of sequences of the form \(f(k^i n + j)\), where i ≥ 0 and \(0\le j < k^i\). A sequence is k-automatic if and only if its k-kernel is finite. Using this definition of k-automatic sequences, Allouche and Shallit [14, 16] generalized the notion of k-automatic sequences, defining k-regular sequences.
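For readers who wish to experiment, the k-kernel is easy to approximate computationally. The following sketch is ours, not from the literature; the function names and the prefix-length cutoff are arbitrary choices, and comparing finite prefixes is only a heuristic check, not a proof of finiteness. It collects the kernel sequences of the Thue-Morse sequence, whose 2-kernel is well known to consist of the sequence itself and its complement.

```python
def thue_morse(n):
    # t(n) = parity of the number of ones in the binary expansion of n
    return bin(n).count("1") % 2

def kernel_prefixes(f, k, depth, length=64):
    """Collect length-`length` prefixes of the sequences f(k^i n + j),
    0 <= j < k^i, for i up to `depth` (a finite approximation of the k-kernel)."""
    seen = set()
    for i in range(depth + 1):
        for j in range(k ** i):
            seen.add(tuple(f(k ** i * n + j) for n in range(length)))
    return seen

# The 2-kernel of the Thue-Morse sequence has exactly two elements: t and 1 - t.
print(len(kernel_prefixes(thue_morse, 2, depth=5)))  # -> 2
```

Only two distinct sequences appear because t(2^i n + j) ≡ t(n) + t(j) (mod 2) whenever 0 ≤ j < 2^i.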

Let K be a field. Given a sequence f(n) taking values in K, we create a vector subspace of \(K^{\mathbb {N}}\), V (f(n);k), which is defined to be the subspace spanned by all sequences \(f(k^i n + j)\), where i ≥ 0 and \(0\le j < k^i\).

Definition 4.1.1

A sequence is k-regular if V (f(n);k) is a finite-dimensional K-vector space.

Since the k-kernel of a sequence f(n) spans V (f(n);k) as a K-vector space, we see that a k-automatic sequence is necessarily k-regular. We remark that one can adopt a slightly more general viewpoint by replacing the field K by a commutative ring and asking that submodules generated by elements of the kernel be finitely generated. Everything we do in this chapter holds in this more general setting, but for ease of exposition we restrict to the case where our sequences are K-valued with K a field.

In the first half of this chapter, we take a ring theoretic look at automatic sequences and regular sequences and prove a result due to Berstel and Reutenauer [77, Chapter 3], which is not as well known as it probably should be. Part of the reason for this is that their result is written in the context of noncommutative rational series, so we give a straightforward translation of these results into our setting. In fact, we shall give a quantitative version of their result, which we hope will be of future use. We then give some applications of this result.

Let \(\mathcal {A}\) be a finite alphabet. We let \(\mathcal {A}^*\) denote the free monoid on \(\mathcal {A}\); that is, \(\mathcal {A}^*\) is the collection of all words on \(\mathcal {A}\). We let ε denote the empty word on \(\mathcal {A}\).

We are interested in maps \(f:\mathcal {A}^*\rightarrow K\), where \(\mathcal {A}\) is some finite alphabet and K is a field. In the case that \(\mathcal {A}=\{1,2,\ldots ,k\}\) for some positive integer k, we can regard a word in \(\mathcal {A}^*\) as an integer as follows. We associate 0 to the empty word ε, and for nontrivial words \(a_m\cdots a_0\in \mathcal {A}^*\) with \(1\le a_i\le k\), we associate a positive integer using the correspondence

$$\displaystyle \begin{aligned} a_m\cdots a_0 \mapsto a_mk^m+a_{m-1}k^{m-1}+a_{m-2}k^{m-2} + \cdots + a_0.\end{aligned} $$
(4.1)

Observe that this gives a bijection between \(\mathcal {A}^*\) and the natural numbers, and hence, in this case, we may think of f as being a sequence indexed by the nonnegative integers taking values in K. We note that when dealing with k-automatic and k-regular sequences, it is more common to use the alphabet {0, 1, …, k − 1} and instead restrict to the regular sublanguage of the monoid {0, …, k − 1}* consisting of words whose first letter is not zero. The point of view we adopt is completely equivalent, but we find the framework used in this chapter easier to deal with.
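The correspondence (4.1) is the bijective (sometimes called k-adic) base-k numeration. A small sketch of the encoding and its inverse, with helper names of our own choosing:

```python
def word_to_int(word, k):
    # word is a tuple of digits in {1, ..., k}, most significant digit first
    n = 0
    for a in word:
        n = n * k + a
    return n

def int_to_word(n, k):
    # inverse of word_to_int: bijective base-k numeration
    digits = []
    while n > 0:
        a = (n - 1) % k + 1      # least significant digit, in {1, ..., k}
        digits.append(a)
        n = (n - a) // k
    return tuple(reversed(digits))

# round-trip check for k = 2: every n in 0..100 has a unique word
assert all(word_to_int(int_to_word(n, 2), 2) == n for n in range(101))
print(int_to_word(6, 2))  # -> (2, 2), since 2*2 + 2 = 6
```

The empty tuple corresponds to 0, matching the convention for ε above.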

Using this correspondence and the k-kernel definition of a k-automatic sequence, we see that we can think of a k-automatic sequence taking values in a field K as being a map f : {1, 2, …, k}*→ K with the property that the collection of maps w ↦ f(wu), obtained by taking a word u ∈ {1, 2, …, k}*, is finite. Similarly, if the vector space of set maps from {1, …, k}* to K generated by the maps of this form is finite-dimensional, then f is k-regular. We shall give an overview of the work of Berstel and Reutenauer [77, Chapter 3], describing k-automatic and k-regular sequences in terms of two additional properties, which for the purposes of this chapter we shall call the shuffle property and the power property. These properties are defined in Section 4.2 and Section 4.3, respectively. In Section 4.4 we present some results from the theory of rings satisfying a polynomial identity which will be necessary in giving the aforementioned characterization of k-regular sequences. In Section 4.5 we show that a sequence possesses these two properties if and only if it is regular. If, in addition to having these two properties, the sequence takes only finitely many values, then it is automatic. In Section 4.6 and Section 4.7, we use our characterization to answer some questions of Allouche and Shallit about whether certain sequences are k-regular and k-automatic. In Section 4.8 we develop the basic properties of sequences with the shuffle property and show that they have nice closure properties. In Section 4.9, we conclude with some open problems and some useful remarks.

4.2 The Shuffle Property

In this section we define the shuffle property and show that regular sequences necessarily possess this property. Let \(\mathcal {A}\) be a finite alphabet and let K be a field. Given a map \(f:\mathcal {A}^{*}\rightarrow K\), we say that f has the d-shuffle property if for any words \(w,w_1,\ldots ,w_d, w'\in \mathcal {A}^*\) we have

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_d} \mathrm{sgn}(\sigma) f(ww_{\sigma(1)}w_{\sigma(2)}\cdots w_{\sigma(d)} w') \ = \ 0.\end{aligned} $$
(4.2)

We define the dth shuffle function

$$\displaystyle \begin{aligned} \mathrm{Shuf}_d(f;w,w_1,\ldots ,w_d,w') \ := \ \sum_{\sigma\in S_d} \mathrm{sgn}(\sigma) f(ww_{\sigma(1)}w_{\sigma(2)}\cdots w_{\sigma(d)} w'). \end{aligned} $$
(4.3)

Example 4.2.1

Let \(\mathcal {A}=\{0,1\}\), let K be the field of rational numbers, and define f(w) to be the number of ones in the word \(w\in \mathcal {A}^*\). Then f has the 2-shuffle property.
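This can be checked by brute force. The sketch below implements the shuffle sum (4.3) directly (the names are ours) and verifies the 2-shuffle property for the ones-counting function on all short binary words; since the number of ones in a word is invariant under permuting its factors, the two signed terms cancel.

```python
from itertools import permutations, product

def shuffle_sum(f, w, ws, wp):
    """Shuf_d(f; w, w_1, ..., w_d, w') from (4.3); ws is the list (w_1, ..., w_d)."""
    total = 0
    for perm in permutations(range(len(ws))):
        # sign of the permutation via its inversion count
        inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                  if perm[i] > perm[j])
        total += (-1) ** inv * f(w + "".join(ws[i] for i in perm) + wp)
    return total

ones = lambda w: w.count("1")

# check the 2-shuffle property on all binary words of length at most 2
words = ["".join(t) for n in range(3) for t in product("01", repeat=n)]
assert all(shuffle_sum(ones, w, [w1, w2], wp) == 0
           for w in words for w1 in words for w2 in words for wp in words)
print("ones-count satisfies the 2-shuffle property on all tested words")
```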

Example 4.2.2

Let \(\mathcal {A}=\{0,1\}\) and define f(w) to be the nonnegative integer whose binary expansion is equal to w. Then f has the 4-shuffle property, but does not have the 3-shuffle property.

We postpone the proof of the fact that f has the 4-shuffle property till after the proof of Proposition 4.2.8. To see that f does not have the 3-shuffle property, take w1 = 0, w2 = 1, w3 = 10. Then f(w1 w2 w3) = [0110]2 = 6. Similarly, f(w1 w3 w2) = 5, f(w2 w3 w1) = 12, f(w2 w1 w3) = 10, f(w3 w1 w2) = 9, f(w3 w2 w1) = 10. Thus

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_3} \mathrm{sgn}(\sigma)f(w_{\sigma(1)}w_{\sigma(2)} w_{\sigma(3)}) \ = \ 6 - 5 + 12 - 10 + 9 - 10 = 2 \ \neq \ 0.\end{aligned} $$
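The six signed terms can be tallied mechanically (here w and w′ are taken to be empty, as in the example above):

```python
from itertools import permutations

def sign(perm):
    # sign of a permutation via its inversion count
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return (-1) ** inv

w = {1: "0", 2: "1", 3: "10"}
total = sum(sign(p) * int(w[p[0]] + w[p[1]] + w[p[2]], 2)
            for p in permutations((1, 2, 3)))
print(total)  # -> 2, so the 3-shuffle sum does not vanish
```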

The motivation for this shuffle property definition comes from the following theorem of Amitsur and Levitzki [20].

Theorem 4.2.3 (Amitsur-Levitzki [20])

Let A1, …, A2m be m × m matrices with entries in a commutative ring C. Then

$$\displaystyle \begin{aligned} \mathsf{S}_{2m}(\mathbf{A}_1,\ldots ,\mathbf{A}_{2m}) \ := \ \sum_{\sigma \in S_{2m}} \mathrm{sgn}(\sigma)\mathbf{A}_{\sigma(1)}\cdots \mathbf{A}_{\sigma(2m)} \ = \ 0.\end{aligned} $$
(4.4)
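Before turning to the proof, here is a quick numerical sanity check of the theorem for m = 2 (so 2m = 4) over the integers, written in plain Python with helper names of our own:

```python
from itertools import permutations
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def standard_identity(mats):
    """S_d(A_1, ..., A_d) = sum over sigma in S_d of sgn(sigma) A_sigma(1)...A_sigma(d)."""
    n = len(mats[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(mats))):
        inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                  if perm[i] > perm[j])
        prod = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        for idx in perm:
            prod = matmul(prod, mats[idx])
        for i in range(n):
            for j in range(n):
                total[i][j] += (-1) ** inv * prod[i][j]
    return total

random.seed(0)
mats = [[[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
        for _ in range(4)]
print(standard_identity(mats))  # -> [[0, 0], [0, 0]], as the theorem predicts
```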

Before we prove this result, we need a basic result about matrices.

Lemma 4.2.4

Let C be a commutative \(\mathbb {Q}\)-algebra and let \(Y\in M_n(C)\). Suppose that Y has the property that the trace of \(Y^i\) is zero for i = 1, 2, …, n. Then \(Y^n = 0\).

Proof

Let \(y_{i,j}\) denote the (i, j)-entry of Y. Then we may assume without loss of generality that C is generated by the \(y_{i,j}\). Now let \(R=\mathbb {Q}[x_{i,j}\colon 1\le i,j\le n]\) and let \(Z\in M_n(R)\) be the matrix whose (i, j)-entry is \(x_{i,j}\). We note that if we impose the constraints on the \(x_{i,j}\) that force the trace of \(Z^i\) to be zero for i = 1, 2, …, n, then we obtain a homomorphic image R′ of R, and by hypothesis the map \(x_{i,j}\mapsto y_{i,j}\) induces a surjective \(\mathbb {Q}\)-algebra homomorphism from R′ to C. Thus it is sufficient to show that \(Z^n=0\) in \(M_n(R')\). We note that there is a \(\mathbb {Q}\)-algebra S that contains R, is a finite R-module, and contains all the eigenvalues of the matrix Z. The relations imposed by setting the trace of \(Z^i\) to zero for i = 1, 2, …, n give a homomorphic image S′ of S that is a finite R′-module. Now Z is triangularizable in \(M_n(S')\), and so we may assume that Z is upper triangular. If \(\lambda_1,\ldots ,\lambda_n\in S'\) are the diagonal entries of Z, then we have \(\sum _{i=1}^n \lambda _i^j =0\) for j = 1, 2, …, n. We now claim that \(\lambda_1,\lambda_2,\ldots ,\lambda_n\) are nilpotent elements of S′. Once we establish this claim, we see that some power of Z is strictly upper triangular, and so Z is nilpotent. Then the Cayley-Hamilton theorem gives that \(Z^n = 0\) in \(M_n(S')\) and hence in \(M_n(R')\), and so we get \(Y^n = 0\).

To establish the claim, we note that it is sufficient to show that if C is a finitely generated commutative \(\mathbb {Q}\)-algebra and \(\lambda_1,\ldots ,\lambda_n\in C\) satisfy \(\sum _{i=1}^n \lambda _i^j =0\) for j = 1, 2, …, n, then \(\lambda_1,\ldots ,\lambda_n\) are nilpotent. Let N denote the nil radical of C (the set of all nilpotent elements of C). Then we may replace C by C/N, and we see that it is sufficient to show that if C is a finitely generated reduced (i.e., having no nonzero nilpotent elements) commutative \(\mathbb {Q}\)-algebra and \(\lambda_1,\ldots ,\lambda_n\in C\) satisfy \(\sum _{i=1}^n \lambda _i^j =0\) for j = 1, 2, …, n, then \(\lambda_1,\ldots ,\lambda_n\) are all zero. Since a reduced commutative ring embeds in a direct product of integral domains, we can use projection maps to reduce to the case of an integral domain. So we prove the more general fact that if C is a commutative \(\mathbb {Q}\)-algebra that is an integral domain, \(\lambda_1,\ldots ,\lambda_n\in C\) are distinct and nonzero, and \(\beta_1,\ldots ,\beta_n\in C\) are nonzero, then \(\sum \beta _i \lambda _i^j \neq 0\) for some j = 1, …, n. We note that the original claim follows easily from this more general statement by grouping equal values of the \(\lambda_i\), with the multiplicities appearing as the coefficients \(\beta_i\) (and it is here that characteristic zero is needed, so that these multiplicities are nonzero). To see this general claim, we note that since we are in an integral domain, we get the result for n = 1. Now suppose that the result holds for n < d and consider the case where n = d. Then if \(\sum _{i=1}^d \beta _i \lambda _i^j = 0\) for j = 1, 2, …, d, we see that

$$\displaystyle \begin{aligned}\sum_{i=1}^d \beta_i \lambda_i^j \lambda_d - \sum_{i=1}^d \beta_i\lambda_i^{j+1}=0\end{aligned}$$

for j = 1, 2, …, d − 1. Simplifying, we see that

$$\displaystyle \begin{aligned}\sum_{i=1}^{d-1} \beta_i(\lambda_d-\lambda_i) \lambda_i^j=0,\end{aligned}$$

for j = 1, …, d − 1. But this contradicts the induction hypothesis, and we get the claim. This finishes the proof. □

Proof (Proof of Theorem 4.2.3)

We only prove the case when C is an integral domain, which is sufficient for the considerations of this chapter. Here we give an argument due to Rosset [512] (we note that Rosset does the more general case of a commutative ring). We note that one can immediately reduce to the case when C is a field of characteristic zero, since an integral domain C of positive characteristic is a homomorphic image of an integral domain C′ of characteristic zero, and one can show that if the result holds for the field of fractions of C′ then it holds for C. Now let A1, …, A2n be 2n matrices in \(M_n(C)\) and let V be a 2n-dimensional C-vector space with basis e1, …, e2n. For each d ≥ 0, we recall that there is a dth exterior power \(\wedge^d V\), formed by taking the vector space \(V_d:=\otimes _{i=1}^d V\) and then forming the quotient \(V_d/W_d\), where \(W_d\) is the subspace of \(V_d\) spanned by elements of the form \(\mathbf {e}_{i_1}\otimes \cdots \otimes \mathbf {e}_{i_d} - \mathrm {sgn}(\sigma )\mathbf {e}_{i_{\sigma(1)}}\otimes \cdots \otimes \mathbf {e}_{i_{\sigma(d)}}\), where \(\sigma\in S_d\) and \(1\le i_1,\ldots ,i_d\le 2n\). We let \(\mathbf {e}_{i_1}\wedge \cdots \wedge \mathbf {e}_{i_d}\) denote the image of \(\mathbf {e}_{i_1}\otimes \cdots \otimes \mathbf {e}_{i_d}\) in this quotient. Then one can check that \(\wedge^d V\) is a \({2n\choose d}\)-dimensional space with a basis consisting of elements of the form \(\mathbf {e}_{i_1}\wedge \cdots \wedge \mathbf {e}_{i_d}\) with \(i_1 < i_2 < \cdots < i_d\).

Now let \(E=\bigoplus _{d=0}^{2n} \wedge ^d(V)\); this is called the exterior algebra on V. Notice that E is a ring with multiplication formed by taking the natural bilinear “wedge” map \(\wedge^i(V)\times \wedge^j(V)\rightarrow \wedge^{i+j}(V)\) given by

$$\displaystyle \begin{aligned}\left((\mathbf{e}_{p_1}\wedge \cdots \wedge \mathbf{e}_{p_i}),(\mathbf{e}_{q_1}\wedge \cdots \wedge \mathbf{e}_{q_j})\right) \mapsto \mathbf{ e}_{p_1}\wedge \cdots \wedge \mathbf{e}_{p_i} \wedge \mathbf{e}_{q_1}\wedge \cdots \wedge \mathbf{e}_{q_j}.\end{aligned}$$

Then E is generated as a C-algebra by e1, …, e2n. Let \(E_e\) denote the subalgebra of E consisting of the direct sum of the \(\wedge^i V\) with i even. Then \(E_e\) is generated as an algebra by elements of the form \(\mathbf{e}_i\wedge \mathbf{e}_j\), and it is straightforward to check, using the relations given above, that these elements commute with one another, and so \(E_e\) is a commutative ring of characteristic zero. Now consider \(B := M_n(C)\otimes_C E\cong M_n(E)\). Let X = A1 ⊗e1 + ⋯ + A2n ⊗e2n ∈ B. Then

$$\displaystyle \begin{aligned}\mathbf{Y}:=\mathbf{X}^2=\sum_{i<j} (\mathbf{A}_i\mathbf{A}_j-\mathbf{A}_j\mathbf{A}_i)\mathbf{e}_i\wedge \mathbf{e}_j \in M_n(E_e).\end{aligned}$$

Notice that Y has trace zero, as it is an \(E_e\)-linear combination of commutators. More generally,

$$\displaystyle \begin{aligned}\mathbf{Y}^i = \sum_{1\le j_1,\ldots,j_{2i}\le 2n} \mathbf{A}_{j_1}\cdots \mathbf{A}_{j_{2i}} \mathbf{e}_{j_1}\wedge \cdots \wedge \mathbf{ e}_{j_{2i}},\end{aligned}$$

which is equal to

$$\displaystyle \begin{aligned}\sum_{1\le j_1<j_2<\cdots < j_{2i} \le 2n} \mathsf{S}_{2i}(\mathbf{A}_{j_1},\ldots ,\mathbf{A}_{j_{2i}}) \mathbf{ e}_{j_1}\wedge \cdots \wedge \mathbf{e}_{j_{2i}}.\end{aligned}$$

It is straightforward to check that \(\mathsf {S}_{2i}(\mathbf {A}_{j_1},\ldots ,\mathbf {A}_{j_{2i}})\) always has trace zero for i ≥ 1, and so we see that the trace of \(\mathbf{Y}^i\) is zero for i = 1, 2, …, n. Thus Y is an n × n matrix over the commutative \(\mathbb {Q}\)-algebra \(E_e\) with the property that the trace of each of its powers is equal to zero. By Lemma 4.2.4, we have \(\mathbf{Y}^n = 0\). Thus \(\mathbf{Y}^n = \mathbf{X}^{2n} = 0\). As before, we have \(\mathbf{X}^{2n} = \mathsf{S}_{2n}(\mathbf{A}_1,\ldots ,\mathbf{A}_{2n})\,\mathbf{e}_1\wedge \cdots \wedge \mathbf{e}_{2n}\), and so we get the desired result. □

We now show that the d-shuffle property implies all larger shuffle properties hold.

Proposition 4.2.5

If f has the d-shuffle property, then f has the e-shuffle property for all e ≥ d.

Proof

By induction, it is sufficient to prove this when e = d + 1. Notice that

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle ~&\displaystyle \mathrm{Shuf}_{d+1} (f;w,w_1,\ldots ,w_{d+1},w') \\ &\displaystyle =&\displaystyle \sum_{i=1}^{d+1} (-1)^{i-1} \mathrm{Shuf}_d(f;ww_i,w_1,\ldots ,\widehat{w_i}, \ldots ,w_{d+1},w'),\end{array} \end{aligned} $$

where \(\widehat {w_i}\) means that w i is omitted from the list. Hence if f has the d-shuffle property, it must also have the (d + 1)-shuffle property. □

We now introduce some notation. Given a finite alphabet \(\mathcal {A}\), a field K, a word \(w\in \mathcal {A}^*\), and a function \(f:\mathcal {A}^*\rightarrow K\), we define two functions \(f_w,f^{w}:\mathcal {A}^*\rightarrow K\) by

$$\displaystyle \begin{aligned} f_w(u) \ := \ f(wu)\qquad \mathrm{and} \qquad f^{w}(u) \ := \ f(uw). \end{aligned} $$
(4.5)

We now give some definitions.

Definition 4.2.6

Given a finite alphabet \(\mathcal {A}\), a field K, and a function \(f:\mathcal {A}^*\rightarrow K\), we say that f is left (resp. right) \(\mathcal {A}\)-regular if the K-vector space spanned by the functions \(\{f_w~|~w\in \mathcal {A}^*\}\) (resp. \(\{f^w~|~w\in \mathcal {A}^*\}\)) is finite-dimensional. If in addition to being left (resp. right) \(\mathcal {A}\)-regular, the range of f is a finite subset of K, we say that f is left (resp. right) \(\mathcal {A}\)-automatic.

Later, we will give Kleene’s theorem, which states that being left \(\mathcal {A}\)-regular is equivalent to being right \(\mathcal {A}\)-regular. Thus we will omit the words left and right and just use the term \(\mathcal {A}\)-regular. In the case that \(\mathcal {A}=\{1,2,\ldots , k\}\), we shall say that a right \(\mathcal {A}\)-regular function is k-regular and shall say that a right \(\mathcal {A}\)-automatic function is k-automatic.

Notice that this definition of k-regular coincides with the definition of k-regular given by Allouche and Shallit [14] and the definition of k-automatic coincides with the conventional definitions of the k-automatic property.

For convenience, we use the following notation. Given a finite alphabet \(\mathcal {A}\), a field K, and a function \(f:\mathcal {A}^*\rightarrow K\), we let L(f) denote the K-vector space spanned by the functions \(\{f_w~|~w\in \mathcal {A}^*\}\), and we let R(f) denote the vector space spanned by \(\{f^w~|~w\in \mathcal {A}^*\}\).

Proposition 4.2.7

Let \(\mathcal {A}\) be a finite alphabet and let f be a left (resp. right) \(\mathcal {A}\) -regular function taking values in a field K. Let m denote the dimension of L(f) (resp. the dimension of R(f)). Then there exist m × m matrices \(\{\mathbf {A}_a~|~a\in {\mathcal A}\}\) with entries in K and vectors \(\mathbf{v},\mathbf{w}\in K^{m\times 1}\) such that

$$\displaystyle \begin{aligned}f(x_1\cdots x_i) = \mathbf{w}^{\mathrm{T}}\mathbf{A}_{x_1}\cdots \mathbf{A}_{x_i} \mathbf{v}\end{aligned}$$

for all words \(x_1\cdots x_i \in {\mathcal A}^*\).

Proof

Choose \(\varepsilon =w_1,\ldots ,w_m\in {\mathcal A}^*\) such that \(f_{w_1},\ldots ,f_{w_m}\) span the vector space L(f). Given \(x\in \mathcal {A}\), for each i pick \(c_{i,j}\in K\), 1 ≤ j ≤ m, such that

$$\displaystyle \begin{aligned}f_{w_ix} \ = \ \sum_{j=1}^m c_{i,j} f_{w_j}.\end{aligned}$$

Define the m × m matrix \(\mathbf{A}_x\) whose (i, j)-entry is given by \(c_{i,j}\). Take v to be the m × 1 column vector whose ith coordinate is \(f(w_i)\), and take w = e1 (this works since \(w_1=\varepsilon\)). Notice that if \(x_1,\ldots ,x_i\in \mathcal {A}\), then

$$\displaystyle \begin{aligned} \mathbf{e}_1^{\mathrm{T}}\mathbf{A}_{x_1}\cdots \mathbf{A}_{x_i}\mathbf{v} \ = \ f(x_1\cdots x_i).\end{aligned}$$

In the case that f is right \(\mathcal {A}\)-regular, an analogous construction gives the same result. □
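To make the construction concrete, consider the function f(w) = [w]2 of Example 4.2.2 on the alphabet {0, 1}. Appending a digit a on the right sends the value v to 2v + a, which yields the following explicit linear representation with \(\mathbf{w}^{\mathrm{T}}=(0,1)\) and \(\mathbf{v}=(1,0)^{\mathrm{T}}\). To be clear, the matrices below are hand-picked for this example rather than produced by the basis procedure in the proof:

```python
def mat_vec_chain(wT, mats, v):
    # compute w^T A_{x_1} ... A_{x_i} v with 2x2 matrices and length-2 vectors
    row = wT
    for A in mats:
        row = [sum(row[k] * A[k][j] for k in range(2)) for j in range(2)]
    return sum(row[j] * v[j] for j in range(2))

# linear representation of f(w) = [w]_2 on the alphabet {0, 1}:
# reading a digit a sends the running value v to 2v + a
A = {0: [[2, 0], [0, 1]], 1: [[2, 0], [1, 1]]}
wT, v = [0, 1], [1, 0]

word = [1, 0, 1, 1]   # the word 1011, with [1011]_2 = 11
print(mat_vec_chain(wT, [A[a] for a in word], v))  # -> 11
```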

Proposition 4.2.8

Let \(\mathcal {A}\) be a finite alphabet, let K be a field, and let \(f:\mathcal {A}^*\rightarrow K\) be a left (resp. right) \(\mathcal {A}\) -regular function. Then f has the d-shuffle property for d = 2 dim(L(f)) (resp. d = 2 dim(R(f))).

Proof

Let m denote the dimension of L(f). Since f is left \(\mathcal {A}\)-regular, there exist m × m matrices \(\{\mathbf {A}_x~|~x\in \mathcal {A}\}\) with entries in K and some vector \(\mathbf{v}\in K^{m\times 1}\) such that

$$\displaystyle \begin{aligned}f(x_1\cdots x_i)= \mathbf{e}_1^{\mathrm{T}}\mathbf{A}_{x_1}\cdots \mathbf{A}_{x_i}\mathbf{v}\end{aligned}$$

for all words \(x_1\cdots x_i\in \mathcal {A}^*\). Let d = 2m. We claim that f has the d-shuffle property. To see this, let \(w_1,\ldots ,w_d,w,w'\) be words in \(\mathcal {A}^*\). Let \(\mathcal {S}\) denote the monoid generated by \(\{\mathbf {A}_x~|~x\in \mathcal {A}\}\). Then there exist matrices \(\mathbf{U}_1,\ldots ,\mathbf{U}_d,\mathbf{U},\mathbf{U}'\) in \(\mathcal {S}\) which correspond to \(w_1,\ldots ,w_d,w,w'\), respectively, given by the correspondence \(\varepsilon\mapsto \mathbf{I}_m\) and

$$\displaystyle \begin{aligned}x_1\cdots x_i\in \mathcal{A}^* \mapsto \mathbf{A}_{x_1}\cdots \mathbf{A}_{x_i}.\end{aligned}$$

Then

$$\displaystyle \begin{aligned}f(ww_{\sigma(1)}\cdots w_{\sigma(d)}w') = \mathbf{e}_1^{\mathrm{T}} \mathbf{U}\mathbf{U}_{\sigma(1)} \cdots \mathbf{U}_{\sigma(d)}\mathbf{U}'\mathbf{v}.\end{aligned}$$

Hence

$$\displaystyle \begin{aligned} \begin{array}{rcl} \sum_{\sigma\in S_d}\mathrm{sgn}(\sigma)f(ww_{\sigma(1)}\cdots w_{\sigma(d)}w') &\displaystyle =&\displaystyle \mathbf{e}_1^{\mathrm{T}}\mathbf{U} \Bigg(\sum_{\sigma \in S_d} \mathrm{sgn}(\sigma)\mathbf{U}_{\sigma(1)} \cdots \mathbf{U}_{\sigma(d)}\Bigg)\mathbf{U}'\mathbf{v} \\ &\displaystyle =&\displaystyle 0, \end{array} \end{aligned} $$

where the last step follows from the Amitsur-Levitzki theorem. □

We now prove that the function given in Example 4.2.2 has the 4-shuffle property. We note that if \(f:\{0,1\}^*\rightarrow {\mathbb Q}\) is the function which maps a word w to [w]2, where [w]2 is the natural number whose binary expansion is equal to w, then the \({\mathbb Q}\)-vector space R(f) is two-dimensional, spanned by f and the constant function g which sends every word to 1. To see this, notice that \(f^w(u) = [uw]_2 = 2^{\mathrm{length}(w)}[u]_2 + [w]_2\) and so

$$\displaystyle \begin{aligned}f^w \ = \ 2^{\mathrm{length}(w)}f+ [w]_2g.\end{aligned}$$

Clearly f and g are both in R(f) and are linearly independent. Thus dim(R(f)) = 2 and so f has the 4-shuffle property.
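The identity \(f^w = 2^{\mathrm{length}(w)}f + [w]_2 g\) can be confirmed by brute force on short words (a sanity check, not a proof):

```python
from itertools import product

def val(w):  # [w]_2, with [epsilon]_2 = 0
    return int(w, 2) if w else 0

f = val
g = lambda u: 1

words = ["".join(t) for n in range(4) for t in product("01", repeat=n)]
# f^w(u) = f(uw) decomposes as 2^length(w) * f(u) + [w]_2 * g(u)
assert all(val(u + w) == 2 ** len(w) * f(u) + val(w) * g(u)
           for w in words for u in words)
print("f^w = 2^length(w) f + [w]_2 g on all tested words")
```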

4.3 The Power Property

In this section we define the power property, which is the second ingredient of the characterization of automatic and regular sequences of Berstel and Reutenauer. Given a finite alphabet \(\mathcal {A}\) and a field K, we say that \(f:\mathcal {A}^*\rightarrow K\) has the d-power property if for any word w0, there exists a polynomial Φ(t) ∈ K[t] of degree at most d with constant coefficient 1 such that for any words w, w′,

$$\displaystyle \begin{aligned}\varPhi(t)\sum_{i=0}^{\infty} f(ww_0^{i}w')\, t^i\end{aligned}$$

is a polynomial in K[t] of degree at most d. We shall say that f has the power property if f has the d-power property for some d ≥ 1.

Lemma 4.3.1

Let K be a field and let X be a d × d matrix with entries in K. Then

$$\displaystyle \begin{aligned}\sum_{i=0}^{\infty} \mathbf{X}^i t^i = \mathbf{Y}(t)\det(1-t\mathbf{X})^{-1},\end{aligned}$$

for some matrix Y(t) whose entries are polynomials in K[t] of degree at most d − 1.

Proof

Let Y(t) denote the classical adjoint of 1 − t X. Then Y(t) is a matrix with entries given by polynomials in t of degree at most d − 1 and

$$\displaystyle \begin{aligned}\mathbf{Y}(t)(1-t\mathbf{X})=\det(1-t\mathbf{X})\mathbf{I}_d.\end{aligned}$$

Notice that

$$\displaystyle \begin{aligned}\mathbf{Y}(t)(1-t\mathbf{X})\sum_{i=0}^{\infty} \mathbf{X}^i t^i \ = \ \mathbf{Y}(t)\end{aligned}$$

and so

$$\displaystyle \begin{aligned}\sum_{i=0}^{\infty} \mathbf{X}^i t^i \ = \ \mathbf{Y}(t)\det(1-t\mathbf{X})^{-1}.\end{aligned}$$
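The lemma can be verified coefficientwise for a concrete 2 × 2 matrix: for X = [[a, b], [c, d]] we have det(1 − tX) = 1 − (a + d)t + (ad − bc)t², the adjugate of 1 − tX is [[1 − dt, bt], [ct, 1 − at]], and det(1 − tX)·∑ X^i t^i should reproduce the adjugate and vanish in every higher degree (a numerical check with our own helper names; the vanishing in degrees ≥ 2 is the Cayley-Hamilton theorem at work):

```python
X = [[1, 2], [3, 4]]
N = 12

# powers X^0, ..., X^N of X
powers = [[[1, 0], [0, 1]]]
for _ in range(N):
    P = powers[-1]
    powers.append([[sum(P[i][k] * X[k][j] for k in range(2)) for j in range(2)]
                   for i in range(2)])

(a, b), (c, d) = X
det = [1, -(a + d), a * d - b * c]        # det(1 - tX), coefficient list
adj = {(0, 0): [1, -d], (0, 1): [0, b],   # adj(1 - tX) = [[1-dt, bt], [ct, 1-at]]
       (1, 0): [0, c], (1, 1): [1, -a]}

def coeff(m, i, j):
    # coefficient of t^m in det(1 - tX) * sum_i X^i t^i, entry (i, j)
    return sum(det[r] * powers[m - r][i][j] for r in range(3) if 0 <= m - r <= N)

for i in range(2):
    for j in range(2):
        for m in range(N - 1):
            assert coeff(m, i, j) == (adj[(i, j)][m] if m < 2 else 0)
print("det(1 - tX) * sum_i X^i t^i = adj(1 - tX), coefficientwise")
```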

Proposition 4.3.2

Let f be an \(\mathcal {A}\) -regular function taking values in a field K. Then f has the m-power property, where m is the dimension of L(f).

Proof

It suffices to show that for any words w0, w, and w′,

$$\displaystyle \begin{aligned}\sum_{i=0}^{\infty}f(ww_0^{i}w') t^i\end{aligned}$$

is a rational function in t whose numerator and denominator have degree at most m (with the denominator independent of w and w′, depending only upon w0, and having constant coefficient 1). By Proposition 4.2.7, there exist some m ≥ 1 and m × m matrices \(\mathbf{U},\mathbf{U}_0,\mathbf{U}'\) such that \(f(ww_0^{i}w')=\mathbf {e}_1^{\mathrm {T}}\mathbf {U}\mathbf {U}_0^{i}\mathbf {U}'\mathbf {v}\). We then have

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle ~&\displaystyle \sum_{i=0}^{\infty} f(ww_0^{i}w') t^{i} \\ &\displaystyle =&\displaystyle \sum_{i=0}^{\infty} \mathbf{e}_1^{\mathrm{T}}\mathbf{U} \mathbf{U}_0^{i}\mathbf{U}'\mathbf{v}\, t^{i} \\ &\displaystyle =&\displaystyle \mathbf{e}_1^{\mathrm{T}}\mathbf{U}\Bigg( \sum_{i=0}^{\infty} \mathbf{U}_0^{i}t^{i}\Bigg) \mathbf{U}'\mathbf{v}\\ &\displaystyle =&\displaystyle \mathbf{e}_1^{\mathrm{T}} \mathbf{U}\mathbf{Y}(t) \mathbf{U}'\mathbf{v} \cdot \det(1-\mathbf{U}_0t)^{-1}, \end{array} \end{aligned} $$

where Y(t) is the classical adjoint of 1 −U0 t. This expression is easily seen to be a rational function of the form

$$\displaystyle \begin{aligned}P(t) \varPhi(t)^{-1},\end{aligned}$$

where the numerator P and the denominator Φ have degree at most m, Φ(0) = 1, and Φ depends only upon U0 and hence only upon w0. □

4.4 Shirshov’s Height Theorem

We now introduce an important result in combinatorial ring theory: Shirshov’s theorem. We have seen that \(\mathcal {A}\)-regular functions have both the shuffle and power properties. In fact, it is the case that the shuffle and power properties characterize regular sequences. To deduce this we need a famous combinatorial result due to Shirshov. We first take a detour and survey the beautiful field of polynomial identity algebras.

Definition 4.4.1

Let K be a field and let B be a K-algebra. We say that B is a polynomial identity ring if there exists a nonzero, noncommutative polynomial p(x1, …, x d ) with coefficients in K, such that p(b1, …, b d ) = 0 for all b1, …, b d  ∈ B. In this case the polynomial p is called a polynomial identity for B. The least total degree of a nonzero polynomial identity for B is called the PI degree of B.

Example 4.4.2

Any commutative algebra B is a polynomial identity ring since it satisfies the identity x1 x2 − x2 x1 = 0.

Example 4.4.3 (Wagner)

The ring of 2 × 2 matrices over a field K satisfies the identity \([x_1, [x_2, x_3]^2] = 0\), where [a, b] := ab − ba.

Proof

Notice that if X is a 2 × 2 matrix then by the Cayley-Hamilton theorem

$$\displaystyle \begin{aligned}\mathbf{X}^2-\mathrm{Tr}(\mathbf{X})\mathbf{X}+\det(\mathbf{X})\mathrm{I}_2 =0.\end{aligned}$$

Hence if X has trace 0, then its square is a scalar matrix. In particular the square of a commutator must commute with every 2 × 2 matrix. □
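Wagner's identity is easy to test numerically on random integer matrices (a sanity check; the matrix helpers are our own):

```python
import random

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def comm(A, B):               # [A, B] = AB - BA
    return sub(mul(A, B), mul(B, A))

random.seed(1)
X = [[[random.randint(-9, 9) for _ in range(2)] for _ in range(2)]
     for _ in range(3)]
C = comm(X[1], X[2])
wagner = comm(X[0], mul(C, C))    # [X1, [X2, X3]^2]
print(wagner)  # -> [[0, 0], [0, 0]]
```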

An important fact is that if a ring B satisfies a nontrivial polynomial identity, then it in fact satisfies a homogeneous multilinear identity (i.e., each monomial occurring in the identity has degree precisely one in each variable).

Proposition 4.4.4

Let K be a field and let B be a K-algebra satisfying a polynomial identity of degree at most d. Then B satisfies a multilinear homogeneous identity of degree at most d.

Proof

Let m(f) be the maximum degree of a variable appearing in f. Among all nonzero polynomial identities for B, we pick one with the property that m(f) is minimal. Let us call this minimum m. Among all such identities with m(f) = m, we pick one with the property that the number of variables of degree m is minimal. Let f(x1, …, x d ) be such a minimal polynomial identity for the ring B. By permuting the variables, if necessary, we may assume that m is the degree of x1 in f. Consider g(x1, y1, x2, …, x d ) := f(x1 + y1, x2, …, x d ) − f(x1, x2, …, x d ) − f(y1, x2, …, x d ) ∈ K{x1, y1, x2, …, x d }. We note that this is an identity for B. It is straightforward to see that this transformation sends any monomial of degree m in x1 to a sum of monomials of total degree m in x1 and y1 together, with no terms of degree m in x1 alone or in y1 alone. That means that either m(g) < m, or m(g) = m but the number of variables of degree m in g is strictly less than that of f. By minimality of f, we must have g = 0. But this occurs only if m = 1, so every variable occurs in every monomial of f with degree at most 1. Now pick a monomial of smallest degree, say r ≤ d, occurring in f with nonzero coefficient. By relabeling indices, we may assume that this monomial is x1⋯x r . Then consider f(x1, …, x r , 0, …, 0). This is nonzero and, by minimality of r, must be homogeneous and multilinear.

Notice this process yields an algorithm to convert a polynomial identity into a homogeneous multilinear identity. One begins with an identity, and if it is of degree strictly greater than one in some variable, say x1, then we add a new variable y1, and we look at f(x1 + y1, …) − f(x1, …) − f(y1, …). This creates an additional variable, but the total degree does not increase. If we keep repeating this process, the argument above shows that it must terminate and so we obtain an identity in some set of variables in which each variable occurs with degree at most 1. Then we pick a monomial of minimal length and set all variables not occurring in this monomial equal to zero to get a homogeneous multilinear identity. Notice that the total degree never increases at any step, so we see the total degree of the homogeneous multilinear identity we ultimately produce is at most that of the original identity. □
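The multilinearization algorithm just described can be sketched in code, representing a noncommutative polynomial as a dictionary from monomials (tuples of variable names) to coefficients. This is an illustrative implementation of the procedure in the proof, with a data representation of our own choosing; fresh variables are tagged ('y', i):

```python
from itertools import product

def substitute(poly, v, repl):
    """Expand each occurrence of variable v over the choices in repl."""
    out = {}
    for mono, c in poly.items():
        positions = [i for i, x in enumerate(mono) if x == v]
        for choice in product(repl, repeat=len(positions)):
            new = list(mono)
            for pos, var in zip(positions, choice):
                new[pos] = var
            key = tuple(new)
            out[key] = out.get(key, 0) + c
            if out[key] == 0:
                del out[key]
    return out

def poly_sub(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) - c
        if out[m] == 0:
            del out[m]
    return out

def multilinearize(poly):
    """Turn a nonzero identity into a homogeneous multilinear one."""
    fresh = 0
    while True:
        # find a variable occurring with degree >= 2 in some monomial
        bad = next((x for mono in poly for x in mono if mono.count(x) >= 2), None)
        if bad is None:
            break
        fresh += 1
        y = ("y", fresh)
        g = substitute(poly, bad, [bad, y])          # f(x + y, ...)
        g = poly_sub(g, poly)                        # ... - f(x, ...)
        g = poly_sub(g, substitute(poly, bad, [y]))  # ... - f(y, ...)
        poly = g
    r = min(len(m) for m in poly)                    # minimal monomial length
    support = set(next(m for m in poly if len(m) == r))
    # setting every variable outside the minimal monomial to zero
    return {m: c for m, c in poly.items() if set(m) <= support}

# the identity x*x multilinearizes to xy + yx (with y tagged ('y', 1))
print(multilinearize({("x", "x"): 1}))
```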

Remark 4.4.5

We note that in the case that the polynomial p(x1, …, x d ) is multilinear and homogeneous, to check that a K-algebra B satisfies this identity, it is sufficient to check that p(b1, …, b d ) = 0 for (b1, …, b d ) ∈ Y^d, where Y is a K-spanning set for B.

Proof

To see this, we remark that if a1, …, a d  ∈ B, then we can write each a i  as a K-linear combination of elements of Y. Then using multilinearity, we can write p(a1, …, a d ) as a K-linear combination of elements of the form p(b1, …, b d ) with (b1, …, b d ) ∈ Y^d. Thus if the latter all vanish, then p is necessarily an identity for B. □

We will make use of this fact for algebras B of the form B := K{x1, …, x k }/I, where I is a two-sided ideal of the free algebra on noncommuting variables x1, …, x k . In this case, we can take the images of the words in x1, …, x k to be our spanning set, and so to check that a homogeneous multilinear identity holds for an algebra B, it suffices to check that it vanishes when evaluated at elements from the semigroup generated by a finite set of generators. Arguably, the most important types of identities studied in the theory of polynomial identities are the standard identities, which we already encountered, albeit in a different form, when addressing the shuffle property. The nth standard identity is defined as follows:

$$\displaystyle \begin{aligned} \mathsf{S}_n(x_1,\ldots ,x_n) \ = \ \sum_{\sigma\in S_n} \mathrm{sgn}(\sigma)x_{\sigma(1)}\cdots x_{\sigma(n)}. \end{aligned} $$
(4.6)

The Amitsur-Levitzki theorem shows that the ring of n × n matrices over a commutative ring satisfies the 2nth standard identity. In fact, every finitely generated algebra satisfying a polynomial identity must satisfy one of these standard identities.

Theorem 4.4.6 (Braun)

Let K be a field and let B be a finitely generated K-algebra satisfying a polynomial identity. Then B satisfies the identity \(\mathsf{S}_n\) for some n ≥ 1.

Proof

See Braun [105] or Amitsur and Small [21, Corollary 1.2.8A]. □

We recall that if we are given a free monoid \(\mathcal {A}^*\) with \(\mathcal {A}=\{x_1,\ldots ,x_k\}\), we can put a pure lexicographic order ≼ on \(\mathcal {A}^*\) by declaring that

$$\displaystyle \begin{aligned}x_1 \ \prec \ x_2 \ \prec \ \cdots \ \prec \ x_k.\end{aligned}$$

In this order we declare that if w is a proper initial factor of w′, then w ≺ w′. So, for example, x1 ≺ x1 x3 and x2 x3 x4 ≺ x3 x1. We note that this lexicographic order extends to right-infinite words on \(\mathcal {A}\). If we alter the order somewhat, using the pure lexicographic order to compare words of the same length and declaring that the longer of two words of different lengths is bigger, then this new order is called the degree lexicographic order.
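Both orders are easy to realize in code. In this sketch (an encoding of our own), the letters x1, x2, x3, x4 are written as the digit characters '1'–'4', so Python's built-in string comparison, which compares letter by letter and treats a proper prefix as smaller, agrees with the pure lexicographic order, while the degree lexicographic order compares lengths first:

```python
# pure lexicographic order: Python string comparison compares letter by
# letter and makes a proper initial factor smaller than its extensions
assert "1" < "13"      # x1 precedes x1 x3
assert "234" < "31"    # x2 x3 x4 precedes x3 x1

# degree lexicographic order: compare by length first, then lexicographically
deglex_key = lambda w: (len(w), w)
assert deglex_key("31") < deglex_key("234")   # the shorter word is smaller

print(sorted(["31", "234", "1", "13"], key=deglex_key))  # -> ['1', '13', '31', '234']
```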

Using intricate combinatorial techniques, Shirshov proved the following beautiful result.

Theorem 4.4.7 (Shirshov)

Let \(\mathcal {A}=\{x_1,\ldots ,x_k\}\) be a finite alphabet, and let m be a positive integer. If w is a right-infinite word over the alphabet \(\mathcal {A}\) , then either there is some nontrivial word \(w_0\in \mathcal {A}^*\) such that \(w_0^d\) is a factor of w for every d ≥ 1, or w contains a finite factor of the form \(w_1w_2\cdots w_m\), where \(w_1\succ w_2\succ \cdots \succ w_m\) are nontrivial words in \(\mathcal {A}^*\) with no \(w_i\) equal to a prefix of \(w_j\) for i ≠ j, andis the pure lexicographic order induced by \(x_1\succ x_2\succ \cdots \succ x_k\).

We postpone the proof of Shirshov’s theorem until we have developed a few basic combinatorial tools.

To prove Theorem 4.4.7, we require two basic combinatorial tools. The first is König’s infinity lemma; the second is a theorem of Furstenberg.

Theorem 4.4.8 (König’s infinity lemma)

Let \(\mathcal {A}\) be a finite alphabet, and let \(\mathcal {S}\) be an infinite subset of \(\mathcal {A}^*\) that is closed under the process of taking factors. Then there exists \(w\in \mathcal {A}^{\mathbb {N}}\) such that every finite factor of w is in \(\mathcal {S}\).

Proof

We define w = a1 a2⋯ as follows. Since \(\mathcal {S}\) is infinite, there is some \(a_1\in \mathcal {A}\) such that a1 is the first letter of infinitely many elements of \(\mathcal {S}\); now suppose that for every i ≥ 1 we have defined a word a1 a2a i with the property that it is a prefix of infinitely many elements of \(\mathcal {S}\). Then since \(\mathcal {A}\) is finite, there is some \(a_{i+1}\in \mathcal {A}\) such that a1 a2a i ai+1 is a prefix of infinitely many elements of \(\mathcal {S}\). Continuing in this way, we obtain an infinite word w with the desired property. □

The next result is Furstenberg’s theorem, which is part of a more general result in ergodic theory. We give an algebraic proof in the case in which we are interested. We recall that a right-infinite word is uniformly recurrent if each factor u has the property that there is some natural number N = N(u) such that whenever u occurs as a factor, its next occurrence is at most N positions later in our right-infinite word. As far as we know, this rather simple proof (which requires the axiom of choice) does not appear in the literature.

Theorem 4.4.9 (Furstenberg’s theorem)

Let \(\mathcal {A}\) be a finite alphabet and let \(w\in \mathcal {A}^{\mathbb {N}}\) . Then there is a uniformly recurrent word \(u\in \mathcal {A}^{\mathbb {N}}\) such that every finite factor of u is a factor of w.

Proof

Let K be a field. Write \(\mathcal {A}=\{x_1,\ldots ,x_k\}\), and consider the algebra B = K{x1, …, x k }/I, where I is the ideal generated by all monomials that do not occur as a factor of w. Then B is infinite-dimensional as a K-vector space since w is infinite and the images of the factors of w are linearly independent in B. Now let \(\mathcal {S}\) denote the collection of ideals J of K{x1, …, x k } that are generated by monomials, contain I, and have the property that B/J is infinite-dimensional. Then \(\mathcal {S}\) is nonempty since I is in \(\mathcal {S}\). We note that \(\mathcal {S}\) is closed under unions of chains: if L is the union of a chain in \(\mathcal {S}\), then L is certainly generated by monomials and contains I; moreover, if K{x1, …, x k }/L were finite-dimensional, then there would be some N such that L contains all monomials of length ≥ N. In particular, L would be finitely generated, as it would be generated by all monomials of length exactly N along with the finite set of monomials of length < N that are in L. Since L is the union of a chain in \(\mathcal {S}\), and L is finitely generated, some element of our chain would then be equal to L; but every element of the chain has infinite-dimensional quotient, a contradiction. Hence K{x1, …, x k }/L is infinite-dimensional and \(L\in \mathcal {S}\). Thus we can pick a maximal element L of \(\mathcal {S}\) by Zorn’s lemma. Now since L is generated by monomials, K{x1, …, x k }/L is spanned by the images of all monomials over x1, …, x k that are not in the ideal L. Since K{x1, …, x k }/L is infinite-dimensional and the collection of words that are not in L is closed under taking factors, we see by König’s infinity lemma that there is a right-infinite word u with the property that all factors of u are not in L. In particular, they are not in I, and so all factors of u are factors of w.

We claim that u is uniformly recurrent. If not, there is some finite factor u0 such that there are arbitrarily long factors of u that avoid u0. Then let L′ be the ideal generated by u0 and L. Then since there are arbitrarily long words in u that avoid u0, we see that these words have nonzero image in the ring K{x1, …, x k }/L′, and so K{x1, …, x k }/L′ is infinite-dimensional. But this contradicts maximality of L in \(\mathcal {S}\). The result follows. □

Proof (Proof of Theorem 4.4.7)

We follow the proof of Pirillo [484]. By Furstenberg’s theorem, there is some uniformly recurrent right-infinite word v such that every factor of v is a factor of w. Now if v is eventually periodic, then \(v=v_0 w_0^{\omega }\) for some v0, w0 with w0 nontrivial, and we get the claim. If v is not eventually periodic, then v must have at least m distinct factors of length m − 1 (see [14, Theorem 10.2.6]). Let w1, …, w m be these distinct factors and suppose that w1 ≻ w2 ≻⋯ ≻ w m . Since v is uniformly recurrent, there is a factor v0 of v that can be written in the form v0 = v1⋯v m where w i is a prefix of v i . Then \(v_{\sigma (1)}\cdots v_{\sigma (m)} \prec v_1\cdots v_m\) whenever σ is not the identity. Since the w i are distinct words of the same length, we have v1 ≻ v2 ≻⋯ ≻ v m with no v i equal to a prefix of v j for i ≠ j, and so v0 = v1⋯v m is a factor of v, and hence of w, of the required form. □

Shirshov’s theorem has as an immediate application the following result about algebras satisfying a polynomial identity.

Corollary 4.4.10 (Shirshov)

Let K be a field and let B be a finitely generated K-algebra with generators x1, …, x k that satisfies a polynomial identity. Then either B is finite-dimensional as a K-vector space or there is some word w ∈{x1, …, x k }* such that the images of 1, w, w2, … in the algebra B are linearly independent over K.

Proof

We put the degree lexicographic order, ≺, on the monomials in {x1, …, x k } induced by x1 ≻ x2 ≻⋯ ≻ x k . We let ≺ p be the corresponding pure lexicographic order. Let I be the two-sided ideal of K{x1, …, x k } generated by the collection of monomials w ∈{x1, …, x k }* with the property that the image of w in B can be expressed as a K-linear combination of the images of monomials that are strictly smaller than w in the degree lexicographic order. Since B satisfies a polynomial identity, we know that it satisfies a homogeneous multilinear identity by Proposition 4.4.4. We may write this identity as

$$\displaystyle \begin{aligned}f(x_1,\ldots ,x_m)=x_1\cdots x_m + \sum_{\sigma\in S_m\setminus \mathrm{id}} c_{\sigma}x_{\sigma(1)}\cdots x_{\sigma(m)}.\end{aligned}$$

It then follows that if w1 ≻ p w2 ≻ p w3 ≻ p ⋯ ≻ p w m with no w i equal to a prefix of w j for i ≠ j then the identity f shows that w1 w2w m can be written as a K-linear combination of strictly smaller words in the degree lexicographic order.

Now if K{x1, …, x k }/I is finite-dimensional, then B is finite-dimensional since by construction the images in B of words over {x1, …, x k } that are not in I span B as a K-vector space. So if B is infinite-dimensional over K, then K{x1, …, x k }/I must be infinite-dimensional. Then there are arbitrarily long words on {x1, …, x k } that are not in the ideal I, and so by König’s infinity lemma, we see that there is a right-infinite word w whose factors all lie outside of the ideal I. But by the remarks above, we see that w cannot contain any factor of the form w1 w2⋯w m with w1 ≻ p w2 ≻ p w3 ≻ p ⋯ ≻ p w m with no w i equal to a prefix of w j for i ≠ j. Thus, by Shirshov’s theorem, there is a nontrivial word w0 such that for every d ≥ 1, \(w_0^d\) is a factor of w. This means that \(w_0, w_0^2,\ldots \) are not in I. By definition of I, their images in B are then linearly independent over K: otherwise, the largest power of w0 appearing in a nontrivial dependence relation would be a K-linear combination of the images of strictly smaller monomials and would thus lie in I. □

In fact, there is an even stronger version of Shirshov’s theorem for rings satisfying a polynomial identity, which we give now.

Theorem 4.4.11 (Strong version of Shirshov’s theorem)

Let C be a commutative ring, and let B be a finitely generated C-algebra with generators x1, …, x k and which satisfies a polynomial identity of degree d. Then every element in B is a C-linear combination of elements of the form

$$\displaystyle \begin{aligned}w_1^{i_1}\cdots w_m^{i_m},\end{aligned}$$

where \(m\le d^5k^d\) and w1, …, w m are words of length less than d.

Proof

See Theorem 1.2.2 of Amitsur and Small [21]. □

We have the following unexpected application of Shirshov’s theorem to functions with a shuffle property.

Corollary 4.4.12

Let K be a field and let \(f:\mathcal {A}\rightarrow K\) have the d-shuffle property. Then the vector space L(f) (resp. R(f)) is spanned by elements of the form \(f_{w_1^{j_1}\cdots w_m^{j_m}}\) (resp. \(f^{w_1^{j_1}\cdots w_m^{j_m}}\) ) with \(m\le d^5|\mathcal {A}|{ }^d\) and w1, …, w m words of length at most d.

Proof

Write \(\mathcal {A}=\{x_1,x_2,\ldots ,x_k\}\). Let B = K{x1, …, x k } be the free K-algebra on k variables, i.e., the ring of “noncommutative polynomials” in k variables with coefficients in K.

We define an ideal I ⊆ K{x1, …, x k } as follows. Given words w1, …, w m in x1, …, x k , we declare that

$$\displaystyle \begin{aligned}c_1w_1+\cdots + c_mw_m \ \in \ I\end{aligned}$$

if

$$\displaystyle \begin{aligned}c_1f_{uw_1}+\cdots + c_mf_{uw_m}\qquad \mbox{is identically } 0 ~\mbox{for all } u\in {\mathcal A}^*.\end{aligned}$$

We note that all such relations form an ideal. Notice that since f has the d-shuffle property, for any d words w1, …, w d , we have

$$\displaystyle \begin{aligned}\mathsf{S}_d(w_1,\ldots ,w_d) \ = \ \sum_{\sigma\in S_d} \mathrm{sgn}(\sigma)w_{\sigma(1)}\cdots w_{\sigma(d)} \ \in I.\end{aligned}$$

Since the function

$$\displaystyle \begin{aligned}\mathsf{S}_d(x_1,\ldots ,x_d)\end{aligned}$$

is multilinear, we see that S d (b1, …, b d ) ∈ I for all b1, …, b d in B. Consequently, B/I satisfies a polynomial identity. It follows from Theorem 4.4.11 that every element of B/I is a K-linear combination of the images of elements of the form \(w_1^{j_1}\cdots w_m^{j_m}\) with \(m\le d^5|\mathcal {A}|{ }^d\), j1, …, j m  ≥ 0 and w1, …, w m words of length at most d. It follows that for any word w ∈{x1, …, x k }* there is some b ∈ I such that w − b can be written as a linear combination of words of the form \(w_1^{j_1}\cdots w_m^{j_m}\) with \(m\le d^5|\mathcal {A}|{ }^d\) and w1, …, w m of length at most d. By definition of I we then have

$$\displaystyle \begin{aligned}f_{w} \ \in \ \mathrm{Span}_{K} \{f_{w_1^{j_1}\cdots w_m^{j_m}}~|~ m\le d^5|\mathcal{A}|{}^d, \mathrm{length}(w_i)\le d ~\mathrm{for}~i\le m\}.\end{aligned}$$

We thus obtain the desired result. A similar argument works for the vector space R(f). □

4.5 Characterization of \(\mathcal {A}\)-Regular Sequences

In this section, we are finally able to give the general version of Kleene’s theorem along with the structure result of Berstel and Reutenauer.

Lemma 4.5.1

Let K be a field, let d and e be positive integers, let \(f:\mathcal {A}^*\rightarrow K\) be a function with the e-power property, let m be at most \(d^5|\mathcal {A}|{ }^d\), and let w1, …w m be words of length at most d. Then the K-vector space spanned by \(\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~i_1,\ldots ,i_m\ge 0\}\) (resp. \(\{f^{w_1^{i_1}\cdots w_m^{i_m}}~|~i_1,\ldots ,i_m\ge 0\}\) ) is spanned by

$$\displaystyle \begin{aligned}\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~0\le i_1,\ldots ,i_m\le e\}\end{aligned}$$

(resp. \(\{f^{w_1^{i_1}\cdots w_m^{i_m}}~|~0\le i_1,\ldots ,i_m\le e\}\) ).

Proof

By the e-power property, there exist polynomials Φ1, …, Φ m of degree at most e with constant coefficient 1 such that for each j ∈{1, …, m}, all exponents i1, …, i m , and every word u

$$\displaystyle \begin{aligned}\varPhi_j(t) \sum_{s=0}^{\infty} f(w_1^{i_1}\cdots w_j^{s} \cdots w_m^{i_m}u)t^{s}\end{aligned}$$

is a polynomial in K[t] of degree at most e. Let \(\mathcal {S}=\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~0\le i_1,\ldots ,i_m\le e\}\). Now suppose that it is not the case that \(\mathcal {S}\) spans \(\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~i_1,\ldots ,i_m\ge 0\}\). Then there exist j1, …, j m with j i  > e for some i such that \(f_{w_1^{j_1}\cdots w_m^{j_m}} \) is not in the span of \(\mathcal {S}\). It is no loss of generality to assume that (j1, …, j m ) has the property that if \((j_1^{\prime },\ldots ,j_m^{\prime })\) satisfies:

  • \(j_k^{\prime }\le j_k\) for 1 ≤ k ≤ m; and

  • \((j_1^{\prime },\ldots ,j_m^{\prime })\neq (j_1,\ldots ,j_m)\),

then

$$\displaystyle \begin{aligned}f_{w_1^{j_1^{\prime}}\cdots w_m^{j_m^{\prime}}} \ \in \ \mathrm{Span}_K\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~0\le i_1,\ldots ,i_m\le e\}.\end{aligned}$$

By assumption j i  > e and by the e-power property, we have

$$\displaystyle \begin{aligned}\sum_{k=0}^{e} c_k f_{w_1^{j_1}\cdots w_i^{j_i-k}\cdots w_m^{j_m}} \ = \ 0,\end{aligned}$$

where c k is the coefficient of \(t^k\) in \(\varPhi _i\). Hence

$$\displaystyle \begin{aligned}f_{w_1^{j_1}\cdots w_m^{j_m}}\ \in \ \mathrm{Span}_K\{ f_{w_1^{j_1}\cdots w_i^{j_i-k} \cdots w_m^{j_m} } ~| ~1\le k\le e\},\end{aligned}$$

since Φ i has constant coefficient 1. By minimality, we deduce that \(f_{w_1^{j_1}\cdots w_m^{j_m}}\) is in the span of \(\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~0\le i_1,\ldots ,i_m\le e\}\), a contradiction. The result follows. □

Theorem 4.5.2 (Berstel-Reutenauer)

Let K be a field and let \(f:\mathcal {A}^*\rightarrow K\) . Then the following are equivalent:

  1. f is right k-regular;

  2. f is left k-regular;

  3. f has the d-shuffle and the d-power property for some d;

  4. f has the d-shuffle and the d-power property for all sufficiently large d.

Proof

From Propositions 4.2.5, 4.2.8, and 4.3.2, we have that if f is either right or left k-regular, then f has the d-shuffle and the d-power property for all sufficiently large d. Hence (1) and (2) both imply (4). Clearly (4) implies (3). Thus, it is sufficient to show that if f has the d-shuffle and the d-power property for some d, then f is both left and right regular. Suppose that f has the d-shuffle and the d-power property. We claim that the K-vector space L(f) is finite-dimensional. To see this, notice that by Corollary 4.4.12, the vector space L(f) is spanned by elements of the form \(f_{w_1^{i_1}\cdots w_m^{i_m}}\) with \(m\le d^5|\mathcal {A}|{ }^d\) and w1, …, w m having length at most d. Since f has the d-power property, by Lemma 4.5.1 for any words w1, …, w m of length at most d with \(m\le d^5|\mathcal {A}|{ }^d\), the K-vector space spanned by \(\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~i_1,\ldots ,i_m\ge 0\}\) is spanned by

$$\displaystyle \begin{aligned}\{f_{w_1^{i_1}\cdots w_m^{i_m}}~|~0\le i_1,\ldots ,i_m\le d\}.\end{aligned}$$

It now follows that L(f) is in the K-span of the set

$$\displaystyle \begin{aligned}\bigcup_{m\le d^5|\mathcal{A}|{}^d} \bigcup_{\stackrel{\mathrm{length}(w_j)\le d,}{1\le j\le m} }\{ f_{w_1^{i_1}\cdots w_m^{i_m}} ~|~ i_1,\ldots, i_m\le d \}.\end{aligned}$$

Thus L(f) is finite-dimensional and so f is left \(\mathcal {A}\)-regular. A similar argument shows that f is right \(\mathcal {A}\)-regular and hence (3) implies both (1) and (2). □

As a result of the equivalence between left and right regularity, we drop the words left and right and talk only of \(\mathcal {A}\)-regular functions from now on.

Corollary 4.5.3

Let \(f:\mathcal {A}^*\rightarrow K\) . Then the following are equivalent:

  1. f is k-automatic;

  2. f has the d-shuffle and the d-power property for some d and has a finite range;

  3. f has the d-shuffle and the d-power property for all sufficiently large d and has a finite range.

We make the following remark that follows from the proof of Theorem 4.5.2. This gives a bound on the dimension of the vector space in terms of the quantities d and e for which one has the d-shuffle and e-power property, which is undoubtedly far from optimal, but has the advantage of being explicit in terms of d and e.

Remark 4.5.4

If \(|\mathcal {A}|\ge 2\) and \(f:\mathcal {A}^*\to K\) has the d-shuffle property and the e-power property, then L(f) and R(f) are both at most N-dimensional where

$$\displaystyle \begin{aligned}N= (e+1)^{d^5 |\mathcal{A}|{}^d} |\mathcal{A}|{}^{2d^6|\mathcal{A}|{}^d}.\end{aligned}$$

Proof

The proof of Theorem 4.5.2 shows that L(f) is in the K-span of the set

$$\displaystyle \begin{aligned}\bigcup_{m\le d^5|\mathcal{A}|{}^d} \bigcup_{\stackrel{\mathrm{length}(w_j)\le d,}{1\le j\le m} }\{ f_{w_1^{i_1}\cdots w_m^{i_m}} ~|~ i_1,\ldots, i_m\le e \}.\end{aligned}$$

Since there are at most \(|\mathcal {A}|{ }^{d+1}\) words of length at most d, we see that the number of m-tuples of words of length at most d with \(m\le d^5|\mathcal {A}|{ }^d\) is at most

$$\displaystyle \begin{aligned}\sum_{m=1}^{d^5|\mathcal{A}|{}^d} |\mathcal{A}|{}^{(d+1)m}\le |\mathcal{A}|{}^{2d^6|\mathcal{A}|{}^d}.\end{aligned}$$

Now for each such m-tuple, we pick up a space of dimension at most \((e+1)^m\) in our spanning set, and so the dimension of L(f) is at most \((e+1)^{d^5 |\mathcal {A}|{ }^d} |\mathcal {A}|{ }^{2d^6|\mathcal {A}|{ }^d}\). A similar argument works for R(f). □

4.6 Sandwich Functions

In this section we introduce a special type of function that is produced from a K-valued function on a free monoid, where K is a field; these functions will be called, for reasons that will soon become apparent, sandwich functions. We will then characterize all automatic sandwich functions.

Let \(\mathcal {A} = \{ x_1,\ldots ,x_k\}\) be a finite alphabet. We put a pure lexicographic order ≼ on \(\mathcal {A}^*\) by declaring that

$$\displaystyle \begin{aligned}x_1 \ \prec \ x_2 \ \prec \ \cdots \ \prec \ x_k.\end{aligned}$$

We note that this lexicographic order extends to right infinite words over the alphabet \(\mathcal {A}\).

Let w1 and w2 be two (possibly right-infinite) words on \(\mathcal {A}\) with w1 ≼ w2. In dealing with right-infinite words, it will be convenient to use wω to denote the right-infinite word wwww⋯. We define \(f:\mathcal {A}^* \rightarrow \{0,1\}\) by f(w) = 1 if w1 ≼ w ≼ w2 and f(w) = 0 otherwise. We call f a sandwich function since the words which get mapped to 1 are sandwiched between w1 and w2.
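For a finite lower word and an eventually periodic upper word, the sandwich condition is easy to test mechanically. The sketch below is ours (words are tuples of integer letter indices, the upper word is given as preperiod·period^ω, and the helper names are our own; it does not cover a right-infinite lower bound):

```python
def leq_inf(u, pre, per):
    """Is the finite word u ⪯ the right-infinite word pre·per^ω?
    Words are tuples of integers; per must be nonempty."""
    w = list(pre)
    while len(w) <= len(u):
        w.extend(per)
    for a, b in zip(u, w):
        if a != b:
            return a < b
    return True  # u is a proper initial factor of the infinite word, so u ≺ it

def sandwich(u, w1, pre2, per2):
    """Sandwich function with finite lower word w1 and eventually periodic
    upper word pre2·per2^ω: returns 1 iff w1 ⪯ u ⪯ pre2·per2^ω."""
    return 1 if tuple(u) >= tuple(w1) and leq_inf(u, pre2, per2) else 0

# With k = 3: taking w1 = x1 and w2 = x1 x3^ω picks out exactly the words
# that begin with x1 (compare Example 4.6.2 below).
assert sandwich((1, 2), (1,), (1,), (3,)) == 1
assert sandwich((1,), (1,), (1,), (3,)) == 1
assert sandwich((2,), (1,), (1,), (3,)) == 0
assert sandwich((), (1,), (1,), (3,)) == 0
```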

Example 4.6.1

Let \(\mathcal {A}=\{x_1,\ldots ,x_k\}\). Then the constant function 1 and the constant function 0 are both sandwich functions.

Proof

To get the constant function 1, take w1 = ε and take \(w_2=x_k^{\omega }\). To get the constant function 0, take \(w_1=w_2=x_k^{\omega }\). □

Example 4.6.2

Let \(\mathcal {A}=\{x_1,\ldots ,x_k\}\). Then the function f which sends words beginning with x1 to 1 and all other words to 0 is a sandwich function.

Proof

Take w1 = x1 and take \(w_2=x_1x_k^{\omega }\). Then the sandwich function given by these words is f. □

We now characterize \(\mathcal {A}\)-automatic sandwich functions. To do this we first prove some basic results. As notation for the following lemma, we introduce the function χ which inputs a statement and outputs 1 if the statement is true and 0 if the statement is false.

Lemma 4.6.3

Let w be a (possibly right-infinite) word on a finite alphabet \(\mathcal {A}\) which is either finite or eventually periodic. Then the functions f(u) := χ(u ≼ w) and g(u) := χ(u ≽ w) are both \(\mathcal {A}\) -automatic.

Proof

If w is finite, notice that if v is a word whose length is greater than the length of w, then

$$\displaystyle \begin{aligned}f_v(u)=f(vu)=\chi(vu\preceq w)=\chi(v\prec w).\end{aligned}$$

Hence the vector space L(f) is contained in the space spanned by the constant function 1 and the functions {f v | length(v) ≤length(w)}. Thus L(f) is finite-dimensional. A similar argument shows that L(g) is finite-dimensional. If w is eventually periodic, then we can write \(w=w_1w_2^{\omega }\). Notice that if v is not an initial factor of w, then

$$\displaystyle \begin{aligned}f_v(u)=f(vu)=\chi(vu\preceq w)=\chi(v\preceq w),\end{aligned}$$

which is a constant function. If v is an initial factor of w of length at least length(w1) + 2length(w2), then \(f_v=f_{v'}\), where v′ is an initial factor of v obtained by removing the last length(w2) letters from v. Hence L(f) is again contained in a finite-dimensional vector space and hence must be finite-dimensional. A similar argument works for L(g). □

We need a lemma about monotonic subsequences.

Lemma 4.6.4

Let w be a right-infinite word on a finite alphabet \(\mathcal {A}\) which is not eventually periodic. Then for any d ≥ 1, there exist finite words u1, …, u d such that:

  • length(u1) < length(u2) < ⋯ < length(u d );

  • the sequence u1, …, u d is monotonic with respect to ≺;

  • for 2 ≤ i ≤ d, there exists a word \(u_i^{\prime }\neq u_i\) of the same length as u i which does not have ui−1 as an initial factor and has the property that \(u_{i-1},u_i^{\prime },u_i\) is monotonic;

  • u1 u2u d is a factor of w.

Proof

For i ≥ 0 define w i to be the right-infinite word obtained by removing the first i letters from w. Since w is not eventually periodic, the words w0, w1, … are all distinct. It follows that they are totally ordered by ≺. It follows that there is a monotonic (with respect to ≺) subsequence \(w_{i_1},w_{i_2},\ldots \) of w0, w1, …. Without loss of generality \(w_{i_1} \prec w_{i_2} \prec \cdots .\) Let \(v_j=w_{i_j}\) for j ≥ 1. Pick k1 = 1. Since v1, v2, and v3 are distinct, there exists some number m1 such that the first m1 letters of v1 differ from the first m1 letters of v2 and the first m1 letters of v2 differ from the first m1 letters of v3. Choose j1 ≥ m1 such that if we remove the first j1 letters of v1 we obtain a word \(v_{k_2}\) with k2 ≥ 3. Define u1 to be the first j1 letters of v1. Now \(v_{k_2}\prec v_{k_2+1}\) and hence there is some m2 such that the first m2 letters of \(v_{k_2}\) differ from the first m2 letters of \(v_{k_2+1}\) and the first m2 letters of \(v_{k_2+1}\) differ from the first m2 letters of \(v_{k_2+2}\). As before, we choose \(j_2>\max (j_1, m_2)\) such that if we remove the first j2 letters of \(v_{k_2}\) we obtain some word \(v_{k_3}\) with k3 ≥ k2 + 2. We define u2 to be the first j2 letters of \(v_{k_2}\) and \(u_2^{\prime }\) to be the first j2 letters of v2. Notice that \(u_1\prec u_2^{\prime }\prec u_2\) and \(u_2^{\prime }\neq u_2\), and \(u_2^{\prime }\) does not have u1 as an initial factor. Continuing in this manner, we see that we can write

$$\displaystyle \begin{aligned}w \ = \ uu_1u_2u_3\cdots \end{aligned}$$

with u some initial factor and words u1, u2, …, u d , \(u_2^{\prime },\ldots ,u_d^{\prime }\) satisfying the conditions in the statement of the lemma. □

Theorem 4.6.5

Let w 1 and w 2 be (possibly right-infinite) words on a finite alphabet \(\mathcal {A}\) . Then the sandwich function f corresponding to w 1 and w 2 is \(\mathcal {A}\) -automatic if and only if w 1 and w 2 are both either finite or eventually periodic.

Proof

Suppose that w1 is neither finite nor ultimately periodic. Write \(w_1=w_1^{\prime }w_1^{\prime \prime }\) where \(w_1^{\prime }\) is a finite word which is not an initial factor of w2 and \(w_1^{\prime \prime }\) is a right-infinite word. Then \(w_1^{\prime \prime }\) is not eventually periodic and hence by Lemma 4.6.4, for any d we can find words u, u1, u2, …, u d such that \(w_1=w_1^{\prime }uu_1\cdots u_du'\) for some right-infinite word u′; u1, …, u d is monotonic with respect to ≺;

$$\displaystyle \begin{aligned}\mathrm{length} (u_1) \ < \ \cdots \ < \ \mathrm{length} (u_d);\end{aligned}$$

and for i ≥ 2 there exist words \(u_i^{\prime }\) with \(u_{i-1},u_i^{\prime },u_i\) monotonic, \(u_i^{\prime }\neq u_i\), \(\mathrm {length}(u_i^{\prime })=\mathrm {length}(u_i)\), and ui−1 not an initial factor of \(u_i^{\prime }\). We have two cases. Case I: u1 ≺ u2 ≺⋯ ≺ u d . In this case take \(v_d = u_d^{\prime }\) and for 1 ≤ i ≤ d − 1 take v i  = u i . Consider

$$\displaystyle \begin{aligned}\mathrm{Shuf}_d(f;w_1^{\prime}u,v_1,\ldots ,v_{d-1},v_d,\epsilon).\end{aligned}$$

Notice that since \(w_1^{\prime }u\) is lexicographically less than the initial factor of w2 of the same length and since v1 ≺ v2 ≺⋯ ≺ v d , we have that

$$\displaystyle \begin{aligned}w_1 \ \prec \ w_1^{\prime}uv_{\sigma(1)}\cdots v_{\sigma(d)} \ \prec \ w_2\end{aligned}$$

unless σ is the identity. When σ is the identity, we have \(w_1 \ \not \preceq \ w_1^{\prime }uv_1\cdots v_d\) and hence

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_d}\mathrm{sgn}(\sigma) f(w_1^{\prime}uv_{\sigma(1)}\cdots v_{\sigma(d)} )\ \equiv \ 1 (\bmod ~2).\end{aligned}$$

Thus f cannot have the d-shuffle property. Since d is arbitrary we conclude that f is not \(\mathcal {A}\)-automatic. Case II: u1 ≻ u2 ≻⋯ ≻ u d . Again take \(v_d = u_d^{\prime }\) and for 1 ≤ i ≤ d − 1 take v i  = u i . In this case we have

$$\displaystyle \begin{aligned}f(w_1^{\prime}uv_{\sigma(1)}\cdots v_{\sigma(d)}) \ = \ 1 \qquad \mbox{if and only if } \sigma \neq \mathrm{id}.\end{aligned}$$

Thus the result again holds in this case. In either case, the d-shuffle property fails to hold for any d, and so our function cannot be \(\mathcal {A}\)-regular. A similar argument shows that f is not \(\mathcal {A}\)-regular if w2 is right-infinite and not eventually periodic.

Next consider what happens if w1 and w2 are each either finite or eventually periodic. Then by Lemma 4.6.3

$$\displaystyle \begin{aligned}f(w) = \chi(w\succeq w_1) \chi(w\preceq w_2)\end{aligned}$$

and hence f is the coordinate-wise product of two \(\mathcal {A}\)-automatic sequences. It follows that f is automatic. □

4.7 Applications

4.7.1 The Logarithm and Automaticity

We now give an application of our description of automatic sandwich functions to answer a question of Allouche and Shallit. We note that this question had been answered via a different method by Yossi [425] in 2008.

Proposition 4.7.1

Let \(\alpha \in \mathbb {R}\) . Then the sequence given by f(0) = 1 and

$$\displaystyle \begin{aligned}f(n) \ = \ \lfloor \log_k n + \alpha \rfloor \qquad \mathrm{for}~n\ge 1\end{aligned}$$

is k-regular if and only if \(k^{\alpha }\) is rational.

Proof

We may assume that α ∈ [0, 1). It is easy to verify that the function given by g(0) = 0 and \(g(n) = \lfloor \log _k n\rfloor \) for n ≥ 1 is k-regular. Let us consider the function f(n) − g(n). This function takes values in {0, 1} and is 1 at an integer n ≥ 1 if and only if there is some integer m such that

$$\displaystyle \begin{aligned}\log_k n +\alpha \ge m > \log_k n.\end{aligned}$$

Equivalently, we must have

$$\displaystyle \begin{aligned}k^{\alpha} n \ \ge \ k^m \ > \ n.\end{aligned}$$

Write

$$\displaystyle \begin{aligned}k^{-\alpha} \ = \ \sum_{i=1}^{\infty} a_i/k^i\end{aligned}$$

with a i  ∈{1, 2, 3, …, k}. Let w1 = ε and let w2 be the right-infinite word a1 a2 a3⋯. Let w be the word in \(\{1, 2, \ldots , k\}^*\) which corresponds to n using the correspondence described in equation (4.1). If we regard h := f − g as a function on \(\{1, 2, \ldots , k\}^*\), then h(w) = 1 if and only if w ≼ w2, where ≺ is the lexicographic order induced by taking 1 ≺ 2 ≺⋯ ≺ k; equivalently, h(w) = 1 if w1 ≼ w ≼ w2 and is 0 otherwise. Hence h is a sandwich function. Notice that w2 is a right-infinite word which is eventually periodic if and only if \(k^{\alpha }\) is rational. Thus h is k-regular if and only if \(k^{\alpha }\) is rational. Since f = g + h and g is k-regular, we see that f is k-regular if and only if h is k-regular. The result now follows. □

Allouche and Shallit [14] ask whether the sequence \(\lfloor \log _2 n + \frac {1}{2}\rfloor \) is 2-regular; since \(2^{1/2}=\sqrt {2}\) is irrational, we deduce that it is not.
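To make the sequence concrete, here is a small computation of ours of \(\lfloor \log _2 n + \frac {1}{2}\rfloor \) using exact integer arithmetic, which avoids floating-point logarithms near the breakpoints \(2^{m+1/2}\) (the function name is our own):

```python
def f_half(n):
    """Floor of log2(n) + 1/2 for n >= 1, by exact integer arithmetic:
    with m = floor(log2 n), the value is m+1 iff n >= 2^(m+1/2),
    i.e. iff 2*n*n >= 4^(m+1)."""
    m = n.bit_length() - 1          # floor(log2 n)
    return m + (1 if 2 * n * n >= 4 ** (m + 1) else 0)

print([f_half(n) for n in range(1, 12)])
# → [0, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3]
```

The value jumps from m to m + 1 as n crosses \(2^{m+1/2}\); for instance the jump from 7 to 8 happens between n = 181 and n = 182, since \(2^{7.5}\approx 181.02\).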

4.7.2 The 2-Adic Behavior of the Logarithm

Allouche and Shallit [14, Section 16.7, Q. 4] ask whether the sequence

$$\displaystyle \begin{aligned}f(n) \ = \ \min_{i\ge n+1} i - v_2(i)\end{aligned}$$

is 2-regular, where v2(i) is the 2-adic valuation; that is, v2(i) satisfies \(2^{v_2(i)}|i\) but \(2^{v_2(i)+1}\not |i\). We show that it is not. Here we use the fact that right regularity and left regularity are equivalent properties to answer a question which is difficult to handle using the traditional definition of regularity in terms of right regularity, but which is easily handled if one uses the notion of left regularity.
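Before giving the proof, the function f can be computed by brute force; the sketch below is ours, and the finite search bound (the next power of two past n) is an observation of ours rather than something taken from the text:

```python
def v2(i):
    """2-adic valuation of i >= 1, via the lowest set bit."""
    return (i & -i).bit_length() - 1

def f(n):
    """f(n) = min over i >= n+1 of (i - v2(i)).  It suffices to search up
    to the next power of two P >= n+1: for i beyond P, the deficit
    i - v2(i) only exceeds the value P - log2(P) attained at P."""
    P = 1 << n.bit_length() if n >= 1 else 1
    return min(i - v2(i) for i in range(n + 1, P + 1))

print([f(n) for n in range(10)])
# → [1, 1, 2, 2, 5, 5, 5, 5, 9, 9]
```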

Proposition 4.7.2

Let

$$\displaystyle \begin{aligned}f(n) \ = \ \min_{i\ge n+1} i - v_2(i).\end{aligned}$$

Then f(n) is not 2-regular.

Proof

Let \(\mathcal {A}=\{0,1\}\). Let \(w_i=1^i0\in \mathcal {A}^*\). Consider the subspace of L(f) generated by \(\{f_{w_i} ~|~i\ge 0\}\). Suppose that f is \(\mathcal {A}\)-regular. Then since L(f) is finite-dimensional, there exists some m ≥ 3 such that this subspace is spanned by \(\{f_{w_i}~|~i\le m\}.\) Let \(d=2^{2^m}\). By assumption, there exist integers c0, …, c m such that

$$\displaystyle \begin{aligned}f_{w_d} \ = \ \sum_{i=0}^m c_i f_{w_i}.\end{aligned}$$

In particular, we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} f(w_dw_j) &\displaystyle = &\displaystyle f_{w_d}(w_j) \\ &\displaystyle =&\displaystyle \sum_{i=0}^m c_i f_{w_i}(w_j) \\ &\displaystyle =&\displaystyle \sum_{i=0}^m c_if(w_iw_j){} \end{array} \end{aligned} $$
(4.7)

for all j ≥ 0. Observe that for [w i w j ]2 = 2i+j+2 − 2 − 2j. Hence

$$\displaystyle \begin{aligned} f(w_iw_j) \ = \ \left\{ \begin{array}{ll} 2^{i+j+2}-2^j-j & \mbox{if }2^j\ge i+2; \\ 2^{i+j+2}-i-j-2 & \mbox{if }2^j\le i+2. \end{array} \right. \end{aligned}$$

In particular, if m ≤ j ≤ 2m, we have \(f(w_dw_j) = 2^{d+j+2}-d-j-2\) and \(f(w_iw_j) = 2^{i+j+2}-2^j-j\) for 0 ≤ i ≤ m. Using equation (4.7), we see that for m ≤ j ≤ 2m we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} 2^{d+j+2} - d-j-2 &\displaystyle =&\displaystyle f(w_dw_j) \\ &\displaystyle =&\displaystyle \sum_{i=0}^m c_if(w_iw_j) \\ &\displaystyle =&\displaystyle \sum_{i=0}^m c_i(2^{i+j+2}-2^j-j). \end{array} \end{aligned} $$

Simplifying, we deduce

$$\displaystyle \begin{aligned}2^j\Bigg( 2^{d+2} - \sum_{i=0}^m c_i (2^{i+2}-1)\Bigg) + j\Bigg(-1+\sum_{i=0}^m c_i \Bigg) \ = \ d+2,\end{aligned}$$

for m ≤ j ≤ 2m. Let

$$\displaystyle \begin{aligned}A \ = \ 2^{d+2} - \sum_{i=0}^m c_i (2^{i+2}-1)\end{aligned}$$

and let

$$\displaystyle \begin{aligned}B \ = \ -1+\sum_{i=0}^m c_i.\end{aligned}$$

Then we have

$$\displaystyle \begin{aligned}2^j A + j B \ = \ d+2,\end{aligned}$$

for m ≤ j ≤ 2m. Consider the function \(G(x) = 2^xA + xB - (d + 2)\). We have G(m) = G(m + 1) = ⋯ = G(2m) = 0 and hence by Rolle’s theorem \(G'(x)=2^x A \log 2 + B\) must have at least m ≥ 2 real zeros. If A ≠ 0, then G′ is strictly monotonic and so has at most one real zero; hence A = 0, and then G′ is the constant B, forcing B = 0. The fact that G(m) = 0 along with A = B = 0 now implies that d + 2 = 0, which is a contradiction. It follows that f(n) is not a 2-regular function. □

4.7.3 Nim Sums and Nim Products

In this subsection we consider the 2-regularity of sequences constructed using nim sums and nim products. Given nonnegative integers n and m, we define the nim sum , n ⊕ m, of n and m as follows. We write \(n = [a_d\cdots a_0]_2\) with a i  ∈{0, 1} and \(m = [b_d\cdots b_0]_2\) with b i  ∈{0, 1}, where we may pad the binary expansion of either n or m with zeros at the beginning to ensure that they have the same length. We then define

$$\displaystyle \begin{aligned}n\oplus m:=[(a_d+b_d \bmod 2)\cdots (a_0+b_0\bmod 2)]_2.\end{aligned}$$

For example, if m = 12 and n = 21, then

$$\displaystyle \begin{aligned}m\oplus n = [01100]_2\oplus [10101]_2 \ = \ [11001]_2 \ = \ 25.\end{aligned}$$
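In code, the nim sum is simply bitwise XOR of the binary expansions; this one-liner (ours) reproduces the worked example:

```python
def nim_sum(n, m):
    """Nim addition: add the base-2 expansions without carrying,
    which is exactly bitwise XOR."""
    return n ^ m

assert nim_sum(12, 21) == 25  # the worked example above
assert nim_sum(7, 7) == 0     # every element is its own inverse under ⊕
```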

The nim product , ⊗, is defined as follows:

$$\displaystyle \begin{aligned} 2^{2^a}\otimes 2^{2^b} = \left\{ \begin{array}{ll} 2^{2^a}\cdot 2^{2^b} & \mathrm{if}~ a\neq b;\\ 3\cdot 2^{2^a-1} & \mathrm{if} ~a=b. \end{array} \right. \end{aligned} $$
(4.8)

The product is then defined for all pairs of natural numbers using associativity and distributivity. For example,

$$\displaystyle \begin{aligned} \begin{array}{rcl} 8\otimes 3 &\displaystyle =&\displaystyle (2^{2^1}\otimes 2^{2^0})\otimes (2^{2^0}\oplus 1) \\ &\displaystyle =&\displaystyle 2^{2^1}\otimes (2^{2^0}\otimes 2^{2^0})\oplus (2^{2^1}\otimes 2^{2^0}) \\ &\displaystyle =&\displaystyle 2^{2^1}\otimes 3 \oplus 8 \\ &\displaystyle =&\displaystyle 2^{2^1}\otimes (2^{2^0}\oplus 1) \oplus 8 \\ &\displaystyle =&\displaystyle 8\oplus 4\oplus 8 \\ &\displaystyle =&\displaystyle 4. \end{array} \end{aligned} $$

The nim sum and nim products have the following properties, which can be found in Conway [162, Chap. 6]:

  • \(2^{2^a}\otimes x = 2^{2^a}x\) for \(0\le x<2^{2^a}\);

  • the set of nonnegative integers less than \(2^{2^a}\) is a field under ⊕ and ⊗.
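These two facts, together with the defining relation (4.8), determine ⊗ completely, and they translate into a short recursive procedure. The implementation below is our own sketch (the splitting relative to the largest Fermat 2-power \(F=2^{2^k}\) whose square exceeds the operands is a standard device, not something given in the text, and `nim_mul` is our name):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def nim_mul(a, b):
    """Nim multiplication, computed recursively via the Fermat 2-powers
    F = 2^(2^k), using F⊗F = F ⊕ F/2 and F⊗x = F·x for x < F."""
    if a < 2 or b < 2:
        return a * b
    F = 2
    while F * F <= max(a, b):
        F *= F                      # F runs through 2, 4, 16, 256, ...
    aH, aL = divmod(a, F)           # a = aH·F ⊕ aL with aH, aL < F
    bH, bL = divmod(b, F)
    t = nim_mul(aH, bH)
    # Expand (aH·F ⊕ aL) ⊗ (bH·F ⊕ bL) using F⊗F = F ⊕ F/2:
    return ((t ^ nim_mul(aH, bL) ^ nim_mul(aL, bH)) * F
            ^ nim_mul(t, F // 2) ^ nim_mul(aL, bL))

print(nim_mul(8, 3))  # → 4, matching the worked computation above
```

As a further check, \(2^i\otimes 2^i\) equals \(3\cdot 2^{i-1}\) exactly when i is a power of 2, in line with Lemma 4.7.3 below.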

Allouche and Shallit [14, Section 16.7, Q. 5,6] ask the following questions:

  • Is the nim sum of two 2-regular sequences a 2-regular sequence?

  • Is the sequence {n ⊗ n} a 2-regular sequence?

We show the answer to these questions is “no.” In fact, Allouche and Shallit ask questions involving 2-dimensional arrays of numbers, but negative answers to the above questions imply negative answers to their questions about arrays. This time, we use the power property and show that it fails to hold.

Lemma 4.7.3

Let i be a nonnegative integer. Then \(2^i\otimes 2^i = 3\cdot 2^{i-1}\) if and only if i is a power of 2.

Proof

If i is a power of 2, then \(2^i\otimes 2^i = 3\cdot 2^{i-1}\) by the definition of the nim product. Next suppose that i is not a power of 2. Then there is some a such that \(i=2^a+j\) with \(0<j<2^a\). Then

$$\displaystyle \begin{aligned} \begin{array}{rcl} 2^i\otimes 2^i &\displaystyle =&\displaystyle (2^{2^a}\otimes 2^j)\otimes (2^{2^a}\otimes 2^j) \\ &\displaystyle =&\displaystyle (2^{2^a}\otimes 2^{2^a})\otimes (2^j\otimes 2^j) \\ &\displaystyle =&\displaystyle (2^{2^a}\oplus 2^{2^a-1})\otimes (2^j\otimes 2^j) \\ &\displaystyle =&\displaystyle \Big(2^{2^a}\otimes (2^j\otimes 2^j) \Big) \oplus \Big( 2^{2^a-1}\otimes (2^j\otimes 2^j)\Big) \\ &\displaystyle =&\displaystyle \Big(2^{2^a}\cdot (2^j\otimes 2^j)\Big)\oplus \Big( 2^{2^a-1}\otimes (2^j\otimes 2^j)\Big), \end{array} \end{aligned} $$

where we are using the two facts from Conway mentioned above to obtain these equalities.

Observe that if \(2^i\otimes 2^i = 3\cdot 2^{i-1}\), then the binary expansion of \(2^i\otimes 2^i\) cannot have any 1’s appearing in the \(2^a+j-1\) least significant digits. In particular it must have 0’s appearing in the \(2^a\) least significant digits. Since \(2^{2^a}\otimes (2^j\otimes 2^j)\) has this property, \(2^{2^a-1}\otimes (2^j\otimes 2^j)\) must also have this property if \(2^i\otimes 2^i = 3\cdot 2^{i-1}\), as the nim sum of these two numbers is \(2^i\otimes 2^i\). But since \(\{0,1,\ldots ,2^{2^a}-1\}\) is a field under ⊗ and ⊕, we see that \(2^{2^a-1}\otimes (2^j\otimes 2^j)\) is a nonzero number less than \(2^{2^a}\). We conclude that \(2^i\otimes 2^i \neq 3\cdot 2^{i-1}\). □

Proposition 4.7.4

The sequence {n ⊗ n} is not 2-regular.

Proof

We show that the sequence does not have the power property. It is sufficient to show that the power series

$$\displaystyle \begin{aligned}F(x) \ = \ \sum_{i=0}^{\infty} (2^i\otimes 2^i)x^i\end{aligned}$$

is not rational. Suppose that F(x) is rational. Then

$$\displaystyle \begin{aligned}G(x) \ = \ F(x) - \frac{3}{2} (1-2x)^{-1} \ = \ \sum_{i=0}^{\infty} (2^i\otimes 2^i-3\cdot 2^{i-1})x^i\end{aligned}$$

must also be rational. Notice that by Lemma 4.7.3 the coefficient of \(x^n\) in G(x) is zero if and only if n is a power of 2. It follows from the Skolem-Mahler-Lech theorem (see [282]) that G(x) is not rational. We conclude that F is not rational and so {n ⊗ n} is not 2-regular. □

Proposition 4.7.5

There exist 2-regular sequences f(n) and g(n) such that f(n) ⊕ g(n) is not 2-regular.

Proof

Let f(n) be defined by

$$\displaystyle \begin{aligned} f(n) \ = \ \left\{ \begin{array}{ll} \frac{4^m-1}{3} & \mathrm{if} ~n=2^m \\ 0 & \mathrm{otherwise.} \end{array} \right. \end{aligned}$$

Then it is easy to check that f(n) is 2-shuffled and has the power property. Hence f is 2-regular. Let

$$\displaystyle \begin{aligned} g(n) \ = \ \left\{ \begin{array}{ll} m & \mathrm{if}~ n=2^m \\ 0 & \mathrm{otherwise.} \end{array} \right. \end{aligned}$$

Then g(n) is 2-regular. Observe that \(f(2^m)\oplus g(2^m) = f(2^m)-g(2^m)\) if and only if the binary expansion of m has 0’s in all the even positions (beginning the count from the least significant digit). Suppose that f(n) ⊕ g(n) is 2-regular. Then the power property gives that the sequence

$$\displaystyle \begin{aligned}f(2^m)\oplus g(2^m)\end{aligned}$$

must satisfy a linear recurrence. Since \(\frac {4^m-1}{3}-m\) also satisfies a linear recurrence, so does the difference of the two sequences, and its zero set is infinite. By the Skolem-Mahler-Lech theorem (see [282]), the set of m such that \(f(2^m)\oplus g(2^m)=\frac {4^m-1}{3}-m\) must then contain an infinite arithmetic progression. But

$$\displaystyle \begin{aligned}\#\{ m\le 4^N~|~f(2^m)\oplus g(2^m)=f(2^m)-g(2^m)\} \ \le \ 2^N.\end{aligned}$$

In other words, the density of the set of m such that \(f(2^m)\oplus g(2^m)=\frac {4^m-1}{3}-m\) is 0. But a set containing an infinite arithmetic progression has positive density, which is a contradiction. It follows that f ⊕ g is not 2-regular. □
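The counting at the heart of the proof can be checked numerically. In the sketch below bit positions are 0-indexed (the text counts from 1 at the least significant digit), so \(f(2^m)=\frac{4^m-1}{3}\) has its 1-bits in the even positions, and the equality \(f(2^m)\oplus g(2^m)=f(2^m)-g(2^m)\) holds exactly when m has no 1-bits in the odd positions:

```python
def f(m):
    # f(2^m) = (4^m - 1)/3 = binary 0101...01, with 1's in even positions
    return (4 ** m - 1) // 3

# m for which the nim sum (XOR) agrees with ordinary subtraction
good = [m for m in range(1, 64) if f(m) ^ m == f(m) - m]
odd_mask = 0b101010  # odd bit positions 1, 3, 5 suffice for m < 64
assert good == [m for m in range(1, 64) if m & odd_mask == 0]
```

Only about \(2^N\) of the \(4^N\) values \(m\le 4^N\) pass this test, which is the density-zero count used in the proof.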

We note that if f(n) and g(n) are k-automatic, then both f(n) ⊗ g(n) and f(n) ⊕ g(n) are k-automatic; this follows easily from the fact that k-automatic sequences assume only finitely many different values.

4.8 Shuffled Sequences

In this section we develop the basic properties of sequences possessing the shuffle property. We will find it more useful to work with abelian groups rather than fields in obtaining some of our closure properties, so we shall adopt this point of view in this section.

Definition 4.8.1

Let \(\mathcal {A}\) be a finite alphabet and let A be an abelian group. We say that \(f:\mathcal {A}^*\rightarrow A\) is an \(\mathcal {A}\)-shuffled sequence if f has the d-shuffle property for some positive integer d. In the case that \(\mathcal {A}=\{1,2,\ldots ,k\}\), we say that f is k-shuffled. We point out that being k-shuffled is not the same as having the k-shuffle property: when we use the word shuffled, the k refers to the underlying alphabet, whereas when we use the word shuffle, the k refers to the shuffle identity satisfied by the function.

In the case that no confusion will arise, we will drop the alphabet and refer to a sequence as being a shuffled sequence. We note that the shuffle property, although originally defined in the case where our abelian group is a field (with group law given by addition), makes sense in this more general setting.

We have the following containments:

$$\displaystyle \begin{aligned} \begin{array}{lll} \mathcal{A}\text{-automatic sequences} & \subseteq & \mathcal{A}\text{-regular sequences} \\ ~ & \subseteq & \mathcal{A}\text{-shuffled sequences}. \end{array} \end{aligned}$$

It is well known [14, Theorem 16.3.1] that a k-regular sequence \(f:\mathbb {N}\to \mathbb {Z}\) has polynomially bounded growth in the sense that there is some d > 0 such that |f(n)|≤ nd for all sufficiently large n. The following example shows that no such growth restriction on shuffled sequences holds.

Example 4.8.2

A k-shuffled sequence can have arbitrarily rapid growth.

Proof

Let a1, a2, … be a sequence of integers and let \(f:\{1,2\}^*\rightarrow \mathbb {C}\) be defined by

$$\displaystyle \begin{aligned}f(w) \ = \ \left\{ \begin{array}{ll} 0 & \mbox{if }2~\mbox{appears in}~w \\ a_d & \mathrm{if}~w=1^d. \end{array} \right. \end{aligned}$$

Then f(w) has the 2-shuffle property and hence is a shuffled sequence. Since the \(a_d\) are arbitrary, we see that shuffled sequences can have arbitrarily rapid growth. □
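The 2-shuffle property here amounts to the signed identity over \(S_2\), i.e., \(f(ww_1w_2w')=f(ww_2w_1w')\) for all words; since \(ww_1w_2w'\) and \(ww_2w_1w'\) contain the same letters, f cannot tell them apart. A brute-force check over short words (the choice \(a_d=2^{2^d}\) is just one fast-growing example):

```python
from itertools import product

def f(w, a=lambda d: 2 ** (2 ** d)):  # a_d: an arbitrary rapid-growth choice
    # 0 if the letter 2 appears in w; a_d if w = 1^d
    return 0 if "2" in w else a(len(w))

# swapping the two middle factors never changes the value of f
words = ["".join(p) for n in range(3) for p in product("12", repeat=n)]
assert all(f(u + w1 + w2 + v) == f(u + w2 + w1 + v)
           for u in words for w1 in words for w2 in words for v in words)
```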

We now give some closure properties for shuffled sequences.

Proposition 4.8.3

Let A and B be two abelian groups, and let \(\mathcal {A}\) be a finite alphabet. If \(f:\mathcal {A}^*\rightarrow A\) and \(g:\mathcal {A}^*\rightarrow B\) are shuffled sequences, then so are \(f\oplus g:\mathcal {A}^*\rightarrow A\oplus B\) and \(f\otimes g:\mathcal {A}^*\rightarrow A\otimes _{\mathbb Z} B\) . (Here (f ⊕ g)(w) = f(w) ⊕ g(w) and (f ⊗ g)(w) = f(w) ⊗ g(w).)

Proof

Create the ideals \(I_1\) and \(I_2\) in \(\mathbb {Z}\{x_1,\ldots ,x_k\}\) as follows. Let \(I_1\) be the set of elements of the form \(c_1w_1+\cdots +c_dw_d\) such that

$$\displaystyle \begin{aligned} c_1f(ww_1w')+\cdots +c_df(ww_dw') \ = \ 0\end{aligned}$$

for all \(w,w'\in \mathcal {A}^*\). Similarly, define \(I_2\) to be the set of elements of the form \(c_1w_1+\cdots +c_dw_d\) with

$$\displaystyle \begin{aligned} c_1g(ww_1w')+\cdots +c_dg(ww_dw')\ = \ 0\end{aligned}$$

for all \(w,w'\in \mathcal {A}^*\). Then \(I_1\) and \(I_2\) are ideals and \(R_1:=\mathbb {Z}\{x_1,\ldots ,x_k\}/I_1\) and \(R_2:=\mathbb {Z}\{x_1,\ldots ,x_k\}/I_2\) both satisfy the identity \(S_d\) and hence satisfy polynomial identities. It follows from Regev’s theorem [498] that \(R_1\otimes _{\mathbb Z} R_2\) also satisfies a polynomial identity. Since it is finitely generated as a \(\mathbb {Z}\)-algebra, it satisfies the standard identity \(S_m\) for some m ≥ 0. Now suppose that the image of \(\sum c_{i,j} w_i\otimes w_j\) is zero in \(R_1\otimes _{\mathbb Z} R_2\). Then by construction, we have

$$\displaystyle \begin{aligned}\sum c_{i,j} f(ww_iw')\otimes g(ww_jw') \ = \ 0\end{aligned}$$

for all words w, w′. Let \(R\subseteq R_1\otimes _{\mathbb Z} R_2\) be the subalgebra generated by the images of \(x_1\otimes x_1,\ldots ,x_k\otimes x_k\). Then R must also satisfy the identity \(S_m\), since it is a subring of \(R_1\otimes R_2\). Consequently, the sequence f ⊗ g must have the m-shuffle property. □

Remark 4.8.4

Let \(\mathcal {A}\) be a finite alphabet and let A and B be abelian groups. If \(f:\mathcal {A}^*\rightarrow A\) is a shuffled sequence and ϕ : A → B is a homomorphism of abelian groups, then \(\phi \circ f:\mathcal {A}^*\rightarrow B\) is a shuffled sequence.

Corollary 4.8.5

Let \(\mathcal {A}\) be a finite alphabet and let A be an abelian group. If \(f,g:\mathcal {A}^*\rightarrow A\) are shuffled, then f + g is shuffled. If, in addition, A is a ring, then fg is shuffled.

Proof

For the first part, use Proposition 4.8.3 and Remark 4.8.4, taking the homomorphism ϕ : A ⊕ A → A given by ϕ(a, a′) = a + a′. For the product, again use Proposition 4.8.3 and Remark 4.8.4, this time taking the homomorphism ϕ : A ⊗ A → A given by ϕ(a ⊗ a′) = aa′. □

Sometimes it is easier to verify that a sequence is shuffled by showing that some identity other than the standard identity of the shuffle property holds. We make this precise in the next theorem.

Theorem 4.8.6

Let n be a nonnegative integer. Suppose that there exist integers \(\{c_{\sigma }\,|\,\sigma \in S_n\}\), at least one of which is equal to 1, such that

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_n} c_{\sigma} f(ww_{\sigma(1)}\cdots w_{\sigma(n)}w') \ = \ 0\end{aligned}$$

for all words \(w,w_1,\ldots ,w_n,w'\in \mathcal {A}^*\) . Then f is shuffled.

Proof

Write \(\mathcal {A}=\{x_1,\ldots ,x_k\}\). We define an ideal I in the algebra \(\mathbb {Z}\{x_1,\ldots ,x_k\}\) as follows. We declare that \(\sum a_w w \in I\) if

$$\displaystyle \begin{aligned}\sum a_w f(w'ww'') \ = \ 0\end{aligned}$$

for all \(w',w''\in \mathcal {A}^*\). Notice that \(R=\mathbb {Z}\{x_1,\ldots ,x_k\}/I\) satisfies a polynomial identity, as it satisfies the multilinear, homogeneous polynomial identity \(\sum _{\sigma \in S_n} c_{\sigma } t_{\sigma (1)}\cdots t_{\sigma (n)}\). Since R is finitely generated, it satisfies a standard identity \(S_m\) for some m (cf. Theorem 4.4.6 and see also Braun [105]). Consequently,

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_m} \mathrm{sgn}(\sigma)w_{\sigma(1)}\cdots w_{\sigma(m)} \ \in \ I\end{aligned}$$

for all \((w_1,\ldots ,w_m) \in (\mathcal {A}^*)^m\). Hence

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_m} \mathrm{sgn}(\sigma)f(ww_{\sigma(1)}\cdots w_{\sigma(m)}w') \ = \ 0\end{aligned}$$

for all \(w,w_1,\ldots ,w_m,w'\in \mathcal {A}^*\). □

Let \(\mathcal {A}\) be a finite alphabet and let R be a ring. The set of maps \(f:\mathcal {A}^*\rightarrow R\) has a multiplication defined as follows: Given \(x_1,\ldots ,x_d\in \mathcal {A}\), we define

$$\displaystyle \begin{aligned} f\star g(x_1\cdots x_d) \ := \ \sum_{i=0}^d f(x_1\cdots x_i)g(x_{i+1}\cdots x_d), \end{aligned} $$
(4.9)

where \(f(x_1\cdots x_i)\) is taken to mean f(ε) when i = 0 and we take \(g(x_{i+1}\cdots x_d)\) to be g(ε) when i = d. This product, along with the ordinary sum, turns the set of maps \(f:\mathcal {A}^*\rightarrow R\) into an associative ring [77]. We show that the shuffled sequences form a subalgebra.
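Modeling words as Python strings, the product (4.9) is a sum over the d + 1 splittings of a word (a sketch under that modeling assumption):

```python
def star(f, g, w):
    # (f ⋆ g)(x1...xd) = sum over splittings w = uv of f(u) * g(v),
    # where w[:0] and w[len(w):] are the empty word ε
    return sum(f(w[:i]) * g(w[i:]) for i in range(len(w) + 1))

# e.g. for the constant function 1, (1 ⋆ 1)(w) counts the |w| + 1 splittings
assert star(lambda w: 1, lambda w: 1, "1212") == 5
```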

Proposition 4.8.7

If \(f:\mathcal {A}^*\rightarrow R\) and \(g:\mathcal {A}^*\rightarrow R\) are shuffled, then f ⋆ g is shuffled.

Proof

We may pick d such that f and g both have the d-shuffle property. We claim that if \(w_1,\ldots ,w_d,u_1,\ldots ,u_d,w,w'\in \mathcal {A}^*\), then

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_d}\sum_{\tau\in S_d} \mathrm{sgn}(\sigma)\mathrm{sgn}(\tau) (f\star g)(ww_{\sigma(1)}\cdots w_{\sigma(d)}u_{\tau(1)}\cdots u_{\tau(d)}w') \ = \ 0.\end{aligned}$$

To see this, let

$$\displaystyle \begin{aligned}v(\sigma,\tau)=ww_{\sigma(1)}\cdots w_{\sigma(d )}u_{\tau(1)}\cdots u_{\tau(d)}w'.\end{aligned}$$

Then

$$\displaystyle \begin{aligned} \sum_{\sigma,\tau\in S_d} \mathrm{sgn}(\sigma)\mathrm{sgn}(\tau) (f\star g)(v(\sigma,\tau)) = \sum_{\sigma,\tau\in S_d}\sum_{\stackrel{v_1v_2=}{v(\sigma,\tau)}} \mathrm{sgn}(\sigma)\mathrm{sgn}(\tau) f(v_1)g(v_2).\end{aligned}$$

If \(v_1v_2=v(\sigma ,\tau )\), then either \(v_1=v_1(\sigma ,\tau )\) contains \(ww_{\sigma (1)}\cdots w_{\sigma (d)}\) as an initial factor, or \(v_2=v_2(\sigma ,\tau )\) contains \(u_{\tau (1)}\cdots u_{\tau (d)}w'\) as a terminal factor (or both). Notice that if \(v_1=ww_{\sigma (1)}\cdots w_{\sigma (d)}v_1^{\prime }\), then \(v_1^{\prime }\) and \(v_2\) depend only on τ. Hence

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle ~&\displaystyle \sum_{\sigma\in S_d}\sum_{\tau\in S_d} \mathrm{sgn}(\sigma)\mathrm{sgn}(\tau)f(v_1)g(v_2) \\ &\displaystyle =&\displaystyle \sum_{\tau\in S_d}\sum_{\sigma\in S_d} \mathrm{sgn}(\sigma)\mathrm{sgn}(\tau) f(ww_{\sigma(1)}\cdots w_{\sigma(d)}v_1^{\prime})g(v_2) \\ &\displaystyle =&\displaystyle \sum_{\tau\in S_d} \mathrm{sgn}(\tau)g(v_2)\Bigg(\sum_{\sigma\in S_d} \mathrm{sgn}(\sigma)f(ww_{\sigma(1)}\cdots w_{\sigma(d)}v_1^{\prime}) \Bigg) \\ &\displaystyle =&\displaystyle 0, \end{array} \end{aligned} $$

since f has the d-shuffle property. Similarly, if \(v_2\) contains the terminal factor \(u_{\tau (1)}\cdots u_{\tau (d)}w'\), then the fact that g has the d-shuffle property guarantees that

$$\displaystyle \begin{aligned}\sum_{\sigma\in S_d}\sum_{\tau\in S_d} \mathrm{sgn}(\sigma)\mathrm{sgn}(\tau) f(v_1)g(v_2) \ = \ 0.\end{aligned}$$

Observe that \(S_d\times S_d\) embeds in \(S_{2d}\) by taking an ordered pair (σ, τ) and letting σ act on {1, 2, …, d} and letting τ act on {d + 1, …, 2d}. Given \(\mu \in S_{2d}\), we define \(c_{\mu }=\mathrm {sgn}(\sigma )\mathrm {sgn}(\tau )\) if μ corresponds to (σ, τ) under this inclusion and we take \(c_{\mu }=0\) otherwise. Then from Theorem 4.8.6, we see that f ⋆ g is shuffled. □

We let \(\mathcal {S}_k\) denote the collection of k-shuffled sequences taking values in \(\mathbb {C}\). Then we have just shown that \((\mathcal {S}_k,+,\star )\) is a \(\mathbb {C}\)-algebra. The following proposition shows that in some sense this algebra is very large.

Proposition 4.8.8

The algebra \((\mathcal {S}_k,+,\star )\) contains a copy of the free \(\mathbb {C}\) -algebra on infinitely many generators.

Proof

Since the free algebra on two generators contains a copy of the free algebra on infinitely many generators, it is sufficient to show that \(\mathcal {S}_k\) contains a copy of the free algebra on two generators. Let \(f_0:\{1,2,\ldots ,k\}^*\rightarrow \mathbb {C}\) be defined to be 1 on words of the form \(12^i\) and to be 0 on all other words. Let \(f_1:\{1,2,\ldots ,k\}^*\rightarrow \mathbb {C}\) be defined to be i on the word \(12^i\) and to be 0 on words not of the form \(12^i\). Then it is easy to check that \(f_0\) and \(f_1\) are k-shuffled sequences. We claim that \(f_0\) and \(f_1\) generate a free algebra. Suppose that

$$\displaystyle \begin{aligned}G:=\sum_{\ell=1}^d \sum_{(i_1,\ldots ,i_\ell)\in \{0,1\}^\ell} \alpha_{i_1,\ldots ,i_\ell} f_{i_1}\star f_{i_2}\star \cdots \star f_{i_\ell} \ = \ 0\end{aligned}$$

for some d with \(\alpha _{i_1,\ldots ,i_d}\neq 0\) for some \((i_1,\ldots ,i_d)\in \{0,1\}^d\). Notice that if w is a word of the form \(12^{a_1}12^{a_2} \cdots 12^{a_d}\), then \(f_{i_1}\star \cdots \star f_{i_\ell}(w)=0\) for ℓ < d. Hence

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle ~&\displaystyle G(12^{a_1}12^{a_2}\cdots 12^{a_d}) \\ &\displaystyle = &\displaystyle \sum_{(i_1,\ldots ,i_d)\in \{0,1\}^d} \alpha_{i_1,\ldots ,i_d} f_{i_1}\star f_{i_2}\star \cdots \star f_{i_d} (12^{a_1}12^{a_2}\cdots 12^{a_d})\\ &\displaystyle =&\displaystyle \sum_{(i_1,\ldots ,i_d)\in \{0,1\}^d} \alpha_{i_1,\ldots ,i_d} a_1^{i_1}\cdots a_d^{i_d}. \end{array} \end{aligned} $$

By assumption G is identically 0, and hence the polynomial

$$\displaystyle \begin{aligned}H(x_1,\ldots ,x_d) \ := \ \sum_{(i_1,\ldots ,i_d)\in \{0,1\}^d} \alpha_{i_1,\ldots ,i_d}x_1^{i_1} \cdots x_d^{i_d}\end{aligned}$$

vanishes on all points \((x_1,\ldots ,x_d)\in \mathbb {N}^d\). But this implies that H is the zero polynomial, which contradicts the fact that \(\alpha _{i_1,\ldots ,i_d}\neq 0\) for some \((i_1,\ldots ,i_d)\in \{0,1\}^d\). We conclude that the algebra generated by \(f_0\) and \(f_1\) is free. This completes the proof. □
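The evaluation \(f_{i_1}\star f_{i_2}(12^{a_1}12^{a_2})=a_1^{i_1}a_2^{i_2}\) used in the proof can be replayed for d = 2 in a self-contained sketch (the helper names are ours, and words over {1, 2} are modeled as strings):

```python
def star(f, g, w):
    # the product (4.9): sum over splittings w = uv of f(u) * g(v)
    return sum(f(w[:i]) * g(w[i:]) for i in range(len(w) + 1))

def f0(w):
    # 1 on words of the form 1 2^i (including i = 0), 0 otherwise
    return 1 if w.startswith("1") and set(w[1:]) <= {"2"} else 0

def f1(w):
    # i on the word 1 2^i, 0 otherwise
    return (len(w) - 1) * f0(w)

w = "1" + "2" * 3 + "1" + "2" * 5          # the word 1 2^a1 1 2^a2, a1=3, a2=5
assert star(f0, f1, w) == 3 ** 0 * 5 ** 1  # a1^0 * a2^1 = 5
assert star(f1, f0, w) == 3 ** 1 * 5 ** 0  # a1^1 * a2^0 = 3
```

In each product only one splitting of w contributes: the prefix must itself be of the form \(12^{a_1}\) and the suffix of the form \(12^{a_2}\), which is exactly why the monomials \(a_1^{i_1}a_2^{i_2}\) come out.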

4.9 Open Problems and Concluding Remarks

One of the fundamental theorems from the theory of automatic sequences is Cobham’s theorem. Two integers p and q are multiplicatively independent if \(p^a\neq q^b\) for (a, b) ≠ (0, 0). Cobham’s theorem [14, Chapter 11] states that if a sequence is p-automatic and q-automatic and p and q are multiplicatively independent, then the sequence is eventually periodic. Given a sequence f(n) taking values in an abelian group, by the correspondence described in item 4.1, it makes sense to talk about the sequence being a k-shuffled sequence.

Question 4.9.1

Suppose a \(\mathbb {Z}\)-valued sequence f(n) is both p-shuffled and q-shuffled for two multiplicatively independent integers p and q. What can be said about f(n)? For instance, does f(n) satisfy a linear recurrence?

Another question comes from looking at closure properties. In Section 4.8 we showed that shuffled sequences are closed under ordinary products and under the ⋆  product. We now ask if shuffled sequences are closed under Cauchy products.

Question 4.9.2

Given integer-valued sequences f(n) and g(n) that are k-shuffled, is the Cauchy product of f(n) and g(n) also k-shuffled?

In Section 4.8 we showed that the set \(\mathcal {S}_k\) of k-shuffled sequences taking values in \(\mathbb {C}\) forms a \(\mathbb {C}\)-algebra under the ⋆  product.

Question 4.9.3

Can one find nice generating sets for the \(\mathbb {C}\)-algebra \((\mathcal {S}_k,+,\star )\)?

Question 4.9.4

Can one extend the notion of shuffled sequences to nonconstant length substitutions? See, for example, Shallit [542] and Allouche, Scheicher, and Tichy [13].

Two final questions we pose come from the nim sum and nim product. In Section 4.7 we studied sequences defined using the nim sum and the nim product. In each example, we showed that the power property failed to hold. We did not show, however, that the shuffle property fails to hold.

Question 4.9.5

Let ⊕ and ⊗ denote, respectively, the nim sum and the nim product. Is the sequence {m ⊗ m} a 2-shuffled sequence? If f(n) and g(n) are 2-shuffled sequences, is f(n) ⊕ g(n) also a 2-shuffled sequence?

Question 4.9.6

Does {2n ⊕ 3n} satisfy a linear recurrence? We note that this is related to Mahler’s study of Z-numbers [406].

We note that throughout we have been looking at sequences taking values in a field or occasionally in an abelian group. In fact, we can work more generally by using a commutative ring C and looking at sequences taking values in a C-module. Allouche and Shallit [16] define the more general notion of a (C, k)-regular sequence. We note that since Shirshov’s height theorem holds over a general base ring C, all the results in this chapter which relate to k-regular sequences have analogues in this more general context of (C, k)-regular sequences.