1 Introduction

Given a semigroup \({\textbf{S}}\), a word \({\textbf{t}}\) in n variables induces an n-ary term function \({{\textbf {t}}}^{{\textbf {S}}} :S^n \rightarrow S\) by evaluation. The set of all finitary term functions on a semigroup  \({\textbf{S}}\) is called the clone of \({\textbf{S}}\), and is denoted \(\text {Clo}({\textbf{S}})\). A k-ary relation \(R \subseteq S\) is compatible with \({\textbf{S}}\) if it forms a subsemigroup of \({\textbf{S}}^k\). By a classical result of universal algebra, any finitary operation that preserves all finitary compatible relations on a finite semigroup \({\textbf{S}}\) is a term function on \({\textbf{S}}\). In other words, the set of all finitary compatible relations on a finite semigroup determines its clone. If a finite subset of the finitary relations suffices to determine the clone, then \({\textbf{S}}\) is said to be finitely related.

The study of finitely related semigroups began with Davey, Jackson, Pitkethly and Szabó [4], who showed, among other things, that all finite commutative semigroups/monoids and all finite nilpotent semigroups are finitely related. Mayr [14] built on their work, providing the first example of a non-finitely related finite semigroup. Mayr also showed that Clifford semigroups are finitely related and that finite relatedness is preserved by the addition of a zero element. Dolinka [5] showed that all finite bands are finitely related, answering a question posed by Mayr [14]. More recently, Steindl [19] answered a further question by Mayr, giving an example of a non-finitely related nilpotent monoid, thus establishing that finite relatedness is not preserved by the addition of an identity element.

Interestingly, the study of finitely related semigroups has developed in much the same way as the study of the finite basis property for semigroups. We continue this trend by beginning the exploration of the finite relatedness of nilpotent monoids built from words. These monoids have previously provided a plethora of interesting examples of finitely and non-finitely based semigroups [6, 8, 9, 11, 18].

We will build off the work of Steindl [19] by using nilpotent monoids constructed from words to produce large classes of non-finitely related semigroups. Examples drawn from these classes are used to show some interesting behaviour of the finite relatedness of semigroups. In particular, we show that:

  • There are no inherently non-finitely related nilpotent monoids (Corollary 3.3);

  • There exist two non-finitely related semigroups whose direct product is finitely related (Corollary 4.20);

  • There exists a non-finitely related semigroup \({\textbf{S}}\) such that \({\textbf{S}}^1\) is finitely related (Corollary 5.4);

  • There exists an infinite ascending chain of finitely generated semigroup varieties alternating between finitely and non-finitely related (Corollary 4.8);

  • There exists a non-finitely based semigroup which is finitely related (Theorem 6.7).

2 Preliminaries

2.1 Finitely related algebras

An algebra \({\textbf{A}} = \langle A; F \rangle \) is a set A with a set of finitary operations F on A. We refer the reader to Burris and Sankappanavar [3] for a good introduction to the study of algebras. The set of all finitary term functions on an algebra \({\textbf{A}}\) is called the clone of term functions of \({\textbf{A}}\). We refer to the clone of term functions of an algebra \({\textbf{A}}\) as simply the clone of \({\textbf{A}}\) and denote it  \(\text {Clo}({\textbf{A}})\).

Given a set A, an operation \(f :A^n \rightarrow A\) preserves a relation \(R \subseteq A^k\) if \(f(r_1,\dots ,r_n) \in R\) for all \(r_1,\dots ,r_n \in R\), where f is applied coordinatewise. An algebra \({\textbf{A}}\) is said to be finitely related if there exists a finite set of finitary relations such that the operations preserving those relations are precisely the term functions of \({\textbf{A}}\). We will say that \({\textbf{A}}\) is non-finitely related if no such set of finitary relations exist. Finitely related algebras have also been called algebras possessing finite degree by some authors [4, 20].

Given a function \(f:A^n \rightarrow A\), we may identify two coordinates, say i and j (with  \(i < j\)), to produce \(f_{ij}:A^n \rightarrow A\) which maps

$$\begin{aligned} (x_1,\dots ,x_n) \mapsto f(x_1,\dots ,x_{i-1},x_j,x_{i+1},\dots ,x_n). \end{aligned}$$

We call the function \(f_{ij}\) an identification minor.

Given a term \({\textbf{t}}(x_1,\dots ,x_n)\) and \(i,j \in {\underline{n}} = \{1,\dots ,n\}\) with \(i < j\), we denote the term obtained from \({\textbf{t}}\) by replacing each instance of \(x_i\) with \(x_j\) by \({\textbf{t}}^{(ij)}\).

Let V be a variety and \({\mathscr {F}} = \{{\textbf{t}}_{ij}(x_1,\dots ,x_n) \mid 1 \le i < j \le n\}\) be an indexed family of n-ary terms. We say that \({\mathscr {F}}\) is a scheme for V if \({\mathscr {F}}\) satisfies the following conditions:

Dependency: for all \({\textbf{A}} \in V\), \({\textbf{t}}_{ij}^{\textbf{A}}\) does not depend on its i-th coordinate;

Consistency: for all \(i,j,k,l \in {\mathbb {N}}\) with \(i < j\) and \(k < l\), V satisfies

$$\begin{aligned} V \models {\textbf{t}}_{ij}^{(kl)} \approx {\textbf{t}}_{kl}^{(ij)}. \end{aligned}$$

A scheme \({\mathscr {F}} = \{{\textbf{t}}_{ij}(x_1,\dots ,x_n) \mid 1 \le i < j \le n\}\) comes from a term for V if there exists a term \({\textbf{s}}\) such that \(V \models {\textbf{s}}^{(ij)} \approx {\textbf{t}}_{ij}\) for all \(1 \le i < j \le n\).

Theorem 2.1

( [4, Theorem 2.9]) Let \({\textbf{A}}\) be a finite algebra. Then \({\textbf{A}}\) is finitely related if and only if there exists \(k \ge |A |\) such that the following equivalent conditions hold:

  1. (1)

    for all \(n > k\), every n-ary scheme for \(V({\textbf{A}})\) comes from a term;

  2. (2)

    for all \(n > k\), if \(f_{ij}\) is a term function for every \(1 \le i < j \le n\) then f is a term function for \({\textbf{A}}\).

Davey et al. [4] and Dolinka [5] used schemes to great effect, while Mayr [14] and Steindl [19] both opted to use operations with minor identified terms. For the sake of convenience, we have chosen to use schemes in this paper as we will repeatedly use the equational theory of monoids built from words.

We briefly note that any n-ary scheme for an algebra \({\textbf{A}}\) with \(n > |A |\) induces a unique operation \(f :A^n \rightarrow A\) such that \(f_{ij} = {\textbf{t}}_{ij}^{\textbf{A}}\) (see [4, Lemma 2.6] for details) by application of the pigeonhole principle.

The term degree of an algebra is the least \(k \ge |A |\) such that Theorem 2.1 holds. If we remove the requirement that \(k \ge |A |\), then the degree of an algebra is the least k such that condition (2) of Theorem 2.1 holds.

Theorem 2.2

( [4, Theorem 2.11]) If two algebras generate the same variety then they are either both finitely or both non-finitely related.

Given a set A and an operation \(f:A^n \rightarrow A\), we say f depends on its j-th coordinate if there exist \(a,b \in A^n\) such that \(a_i = b_i\) for all \(i \in {\underline{n}}{\setminus }\{j\}\) and \(a_j \ne b_j\) with \(f(a) = f(b)\). A scheme depends on all of its variables if the unique operation it induces depends on all of its coordinates. Throughout this paper we will assume that any scheme depends on all of its variables since otherwise a term can simply be found by identification (provided the scheme has sufficient arity) [4, Remark 2.21].

2.2 Nilpotent monoids and the structure of their schemes

A semigroup \({\textbf{S}}\) is said to be d-nilpotent if it satisfies

$$\begin{aligned} {\textbf{S}} \models x_1\cdots x_d \approx y_1 \cdots y_d. \end{aligned}$$

A nilpotent monoid is a nilpotent semigroup with an adjoined identity.

Davey et al. [4, Theorem 3.4] proved that all finite nilpotent semigroups are finitely related. Mayr [14, Theorem 5.1] showed that every finite 3-nilpotent monoid is finitely related, which, by semigroup lore, implies that almost all finite monoids are finitely related [10]. Steindl [19, Theorem 2.5] gave the first example of a non-finitely related nilpotent monoid, hence showing that finite relatedness is not preserved by adjoining an identity.

For a scheme over a nilpotent monoid (and any semigroup containing a 2-element semilattice), we may easily recover the fact that all variables, except the identified variable, appear in the terms in the scheme.

Given a word \({\textbf{w}}\), the content of \({\textbf{w}}\), denoted \(c({\textbf{w}})\), is the set of all variables appearing in \({\textbf{w}}\). Throughout this paper, we will set \(X_n = \{x_1,\dots ,x_n\}\).

Lemma 2.3

( [19, Lemma 3.2], [5, Lemma 3.2]) Let \({\textbf{S}}\) be a semigroup such that  \(V({\textbf{S}})\) contains the variety of semilattices. Then for \(n > |S |+ 1\), and any n-ary scheme \({\mathscr {F}} = \{{\textbf{t}}_{ij} |1 \le i < j \le n\}\), we have \(c({\textbf{t}}_{ij}) = X_n {\setminus } \{x_i\}\).

Let \({\textbf{S}}\) be a monoid and \({\textbf{t}}(x_1,\dots ,x_n)\) be a term of \({\textbf{S}}\). Let \(Y \subseteq X_n\). We denote the term obtained from \({\textbf{t}}\) by deleting \(X_n\setminus Y\) by \({\textbf{t}}[Y]\). If \(f:S^n \rightarrow S\) is an operation, then f[Y] is the operation obtained by restricting f to

$$\begin{aligned} \{(a_1,\dots ,a_n) \in S^n \mid a_i = 1 \text { if } x_i \in \{x_1,\dots ,x_n\}\setminus Y\}. \end{aligned}$$

Given a scheme \({\mathscr {F}} = \{{\textbf{t}}_{ij} |1 \le i < j \le n\}\) for a monoid \({\textbf{S}}\) that depends on all of its variables (with \(n > |S |+ 1\)), the consistency condition and Lemma 2.3 ensures that

$$\begin{aligned} {\textbf{S}} \models {\textbf{t}}_{jk}[x_i] \approx {\textbf{t}}_{jk}^{(lm)}[x_i] \approx x_i^{e_i} \approx x_i^{f_i} \approx {\textbf{t}}_{lm}^{(jk)}[x_i] \approx {\textbf{t}}_{lm}[x_i] \end{aligned}$$

for any \(j,k,l,m \in {\underline{n}}\setminus \{i\}\) and some \(e_i,f_i \in {\mathbb {N}}\).

Let \({\textbf{S}}\) be a finite semigroup. The smallest positive integers m and r such that \({\textbf{S}} \models x^m \approx x^{m+r}\) are known as the index and period of \({\textbf{S}}\) respectively. It follows that for a monoid \({\textbf{S}}\), the occurrences of variables in the equations of \({\mathscr {F}}\) must obey the index and period of \({\textbf{S}}\) across different terms in that scheme.

Let \({\underline{n}} = \{1,\dots ,n\}\) and \(\text {occ}(x,{\textbf{w}})\) be the number of occurrences of a variable x in a word (or term) \({\textbf{w}}\).

To each variable \(x_i\) in a scheme over \({\textbf{S}}\), we may assign

$$\begin{aligned} e_i = \text {min}\{\text {occ}(x_i,{\textbf{t}}_{jk}) \mid j,k \in {\underline{n}}\setminus \{i\}\}, \end{aligned}$$

where \(e_i\) is known as the variable exponent of \(x_i\) in \({\mathscr {F}}\).

A word \({\textbf{w}}\) is an isoterm for a semigroup \({\textbf{S}}\) if \({\textbf{S}} \models {\textbf{u}} \approx {\textbf{w}} \implies {\textbf{u}} = {\textbf{w}}\).

The proof of the following lemma closely follows [19, Lemma 3.4].

Lemma 2.4

Let \({\textbf{S}}\) be a monoid such that \(V({\textbf{S}})\) contains the variety of semilattices and \({\textbf{S}}\) satisfies \(x^{p} \approx x^{p+q}\) for minimal \(p,q \in {\mathbb {N}}\). Let \({\mathscr {F}} = \{{\textbf{t}}_{ij} |1 \le i < j \le n\}\) be a scheme for \({\textbf{S}}\) with \(n > |S |+ 1\) and variable exponents \(e_1,\dots ,e_n\). Then

  1. (i)

    for all \(i \in {\underline{n}}\) and all \(j,k \in {\underline{n}}{\setminus }\{i\}\) if \(e_i < p\) then \(\text {occ}(x_i,{\textbf{t}}_{jk}) = e_i\) and \(\text {occ}(x_i,{\textbf{t}}_{jk}) \equiv e_i \pmod {q}\) otherwise;

  2. (ii)

    for all \(i,j \in {\underline{n}}\) if \(e_i + e_j < p\) then \(\text {occ}(x_j,{\textbf{t}}_{ij}) = e_i + e_j\) and \(\text {occ}(x_j,{\textbf{t}}_{ij}) \equiv e_i + e_j \pmod {q}\) otherwise.

Proof

Suppose for the variable \(x_i\) in \({\mathscr {F}}\) that \(e_i < p\). Then there exist \(l,m \in {\underline{n}}\setminus \{i\}\) such that \({\textbf{t}}_{lm}[x_i] = x_i^{e_i}\). By the consistency condition, for any \(j,k \in {\underline{n}}\setminus \{i\}\), if \({\textbf{t}}_{jk}[x_i] = x_i^f\) then \({\textbf{S}} \models x_i^{e_i} \approx x_i^f\). But \(e_i\) is less than the index of \({\textbf{S}}\), so \(x_i^{e_i}\) is an isoterm for \({\textbf{S}}\) and hence \(f = e_i\).

Now suppose that \(e_i \ge p\). Then by the same argument, if \({\textbf{t}}_{jk}[x_i] = x_i^f\) then \({\textbf{S}} \models x_i^{e_i} \approx x_i^f\). It follows that \(f = \text {occ}(x_i,t_{kl}) \equiv e_i \pmod {q}\). This concludes the proof of (i).

To prove (ii) suppose \(e_i + e_j < p\). Then by (i), and the fact that \(n > |S |+ 1 \ge 3\), there exist \(k,l \in {\underline{n}}{\setminus }\{i,j\}\) such that \(\text {occ}(x_i,{\textbf{t}}_{kl}) = e_i\) and \(\text {occ}(x_j,{\textbf{t}}_{kl}) = e_j\). It follows that \(\text {occ}(x_j,{\textbf{t}}_{kl}^{(ij)}) = e_i + e_j\). Let \(f = \text {occ}(x_j,{\textbf{t}}_{ij})\). Then, by the consistency condition, we have

$$\begin{aligned} x_j^f = {\textbf{t}}_{ij}[x_j] \approx {\textbf{t}}_{ij}^{(kl)}[x_j] \approx {\textbf{t}}_{kl}^{(ij)}[x_j] = x_j^{e_i+e_j}. \end{aligned}$$

But \(e_i + e_j < p\) so \(x_j^{e_i+e_j}\) is an isoterm for \({\textbf{S}}\), hence \(f = e_i + e_j\) as required.

If \(e_i + e_j > p\), then \(f = \text {occ}(x_j,{\textbf{t}}_{ij}) \equiv e_i + e_j \pmod q\). This concludes the proof of part (ii) and the proof of the lemma. \(\square \)

For a nilpotent monoid, the value of q in Lemma 2.4 is 1, and p is the smallest integer such that \({\textbf{S}}\setminus \{1\} \models x^p \approx 0\).

Given a word \({\textbf{w}}\), a variable \(x \in c({\textbf{w}}) \subseteq A\) is said to be primitive in \({\textbf{w}}\) with respect to a nilpotent monoid \({\textbf{S}}\) if under any assignment \(\theta :A \rightarrow {\textbf{S}}\) either \(\theta (x) = 1\) or \(\theta ({\textbf{w}}) = 0\). Given a nilpotent monoid \({\textbf{S}}\), we will denote the primitive letters of a word \({\textbf{w}}\) with respect to \({\textbf{S}}\) by \(\text {Prim}({\textbf{w}})\).

Definition 2.5

Let \({\textbf{S}}\) be a monoid satisfying

$$\begin{aligned} {\mathscr {A}}_{\alpha ,\beta }:= \{x^\alpha \approx x^{\alpha +\beta }, t_1xt_2x\dots t_\alpha x \approx x^\alpha t_1t_2\dots t_\alpha \} \end{aligned}$$

with \(\alpha ,\beta \in {\mathbb {N}}\) minimal with respect to \({\textbf{S}} \models {\mathscr {A}}_{\alpha ,\beta }\). Then, for any term \({\textbf{t}}\) of \({\textbf{S}}\), a variable that appears at least \(\alpha \) times in \({\textbf{t}}\) is said to be strongly primitive in \({\textbf{t}}\).

We denote \({\mathscr {A}}_{\alpha ,1}\) simply by \({\mathscr {A}}_{\alpha }\).

Given a monoid \({\textbf{S}}\) satisfying \({\mathscr {A}}_{\alpha ,\beta }\), we denote the strongly primitive letters of a word \({\textbf{u}}\) by \(\text {Prim}_{st}({\textbf{u}})\). If \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\) is a scheme for \({\textbf{S}}\) then define

$$\begin{aligned} \text {Prim}_{st}({\mathscr {F}}) = \{x_k \mid \exists i,j \in {\underline{n}}\setminus \{k\} \text { such that } x_k \in \text {Prim}_{st}({\textbf{t}}_{ij})\}. \end{aligned}$$

Lemma 2.4 ensures that if \(x_k \in \text {Prim}_{st}({\mathscr {F}})\) then \(x_k\) is strongly primitive in all terms in \({\mathscr {F}}\) except the terms which do not depend on \(x_k\).

Lemma 2.6

Let \({\textbf{S}}\) be a monoid satisfying \({\mathscr {A}}_{\alpha ,\beta }\). Let \({\mathscr {F}} = \{{\textbf{t}}_{ij} |1 \le i < j \le n\}\) be a scheme for \({\textbf{S}}\) with \(n > |S |+ 1\) and variable exponents \(e_1,\dots ,e_n\). Let \(f:S^n \rightarrow S\) be the operation determined by \({\mathscr {F}}\). Then

$$\begin{aligned} f = f[X_n\setminus Prim _{st}({\mathscr {F}})] \cdot \prod _{x_k \in Prim _{st}({\mathscr {F}})}x_k^{e_k}. \end{aligned}$$

Proof

First suppose \(V({\textbf{S}})\) does not contain the variety of semilattices. Then \({\textbf{S}}\) has no idempotents other than the identity so is a group. It follows from the second law in \({\mathscr {A}}_{\alpha ,\beta }\) that \({\textbf{S}}\) is an abelian group whence the lemma’s statement holds trivially by the fact \({\mathscr {F}}\) comes from a term by [4, Theorem 3.6]. We may then assume that \(V({\textbf{S}})\) contains the variety of semilattices.

If \(\text {Prim}_{st}({\mathscr {F}}) = \varnothing \) then there is nothing to prove. Now assume there is a single strongly primitive variable in the scheme \({\mathscr {F}}\). Without loss of generality suppose \(\text {Prim}_{st}({\mathscr {F}}) = \{x_n\}\). Then for any \((a_1,\dots ,a_n) \in S^n\) there exist distinct \(i,j \in \underline{n-1}\) such that \(a_i = a_j\). Then

$$\begin{aligned} f(a_1,\dots ,a_{n-1},1)\cdot a_n^{e_n}&= {\textbf{t}}_{ij}(a_1,\dots ,a_{n-1},1)\cdot a_n^{e_n} \\&= {\textbf{t}}_{ij}(a_1,\dots ,a_n) \\&=f(a_1,\dots ,a_n), \end{aligned}$$

by Lemma 2.4 and the fact that \({\textbf{S}} \models {\mathscr {A}}_{\alpha ,\beta }\) with \(e_n \ge \alpha \).

Now we consider the remaining case where there exist distinct variables \(x_i,x_j \in \text {Prim}_{st}({\mathscr {F}})\). We claim that \({\mathscr {F}}\) comes from the term

$$\begin{aligned} {\textbf{w}} = {\textbf{t}}_{ij}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})]\cdot {\textbf{s}}, \end{aligned}$$

where \({\textbf{s}} = \prod _{x_m \in \text {Prim}_{st}({\mathscr {F}})}x_m^{e_m}\). Take any distinct \(k,l \in {\underline{n}}\). First suppose that \(e_k + e_l < \alpha \) implying that neither \(x_k\) nor \(x_l\) is strongly primitive. Then

$$\begin{aligned} ({\textbf{t}}_{ij}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})])^{(kl)}&\approx {\textbf{t}}_{ij}^{(kl)}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})] \\&\approx {\textbf{t}}_{kl}^{(ij)}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})] \\&\approx {\textbf{t}}_{kl}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})], \end{aligned}$$

since \(x_i,x_j \in \text {Prim}_{st}({\mathscr {F}})\). It follows that

$$\begin{aligned} {\textbf{w}}^{(kl)} \approx ({\textbf{t}}_{ij}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})])^{(kl)}\cdot {\textbf{s}}^{(kl)} \approx {\textbf{t}}_{kl}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})]\cdot {\textbf{s}}^{(kl)}, \end{aligned}$$

and thus \({\textbf{w}}^{(kl)} \approx {\textbf{t}}_{kl}\).

Now suppose \(e_k + e_l \ge \alpha \) and let \(Y = \text {Prim}_{st}({\mathscr {F}})\cup \{x_k,x_l\}\). Then, by the consistency of \({\mathscr {F}}\) and the fact that \(x_i,x_j,x_k,x_l \in Y\), we have

$$\begin{aligned} ({\textbf{t}}_{kl}[X_n\setminus Y])^{(ij)} \approx {\textbf{t}}_{kl}[X_n\setminus Y] \approx {\textbf{t}}_{ij}[X_n\setminus Y] \approx ({\textbf{t}}_{ij}[X_n\setminus Y])^{(kl)}. \end{aligned}$$
(1)

Let \(p = \text {occ}(x_l,({\textbf{t}}_{ij}[X_n{\setminus }\text {Prim}_{st}({\mathscr {F}})])^{(kl)})\) and notice

$$\begin{aligned} p = {\left\{ \begin{array}{ll} e_k + e_l &{}\text {if } x_k\not \in \text {Prim}_{st}({\mathscr {F}}), x_l \not \in \text {Prim}_{st}({\mathscr {F}}), \\ e_k &{}\text {if } x_k\not \in \text {Prim}_{st}({\mathscr {F}}), x_l \in \text {Prim}_{st}({\mathscr {F}}), \\ e_l &{}\text {if } x_k\in \text {Prim}_{st}({\mathscr {F}}), x_l \not \in \text {Prim}_{st}({\mathscr {F}}), \\ 0 &{}\text {if } x_k\in \text {Prim}_{st}({\mathscr {F}}), x_l \in \text {Prim}_{st}({\mathscr {F}}). \end{array}\right. } \end{aligned}$$

Then

$$\begin{aligned} {\textbf{w}}^{(kl)} \approx ({\textbf{t}}_{ij}[X_n\setminus \text {Prim}_{st}({\mathscr {F}})])^{(kl)}\cdot {\textbf{s}}^{(kl)} \approx ({\textbf{t}}_{ij}[X_n\setminus Y])^{(kl)}\cdot x_l^p \cdot {\textbf{s}}^{(kl)}, \end{aligned}$$

where \(x_l^0\) is the empty word.

By (1) we get

$$\begin{aligned} ({\textbf{t}}_{ij}[X_n\setminus Y])^{(kl)}\cdot x_l^p \cdot {\textbf{s}}^{(kl)} \approx {\textbf{t}}_{kl}[X_n\setminus Y]\cdot x_l^p \cdot {\textbf{s}}^{(kl)}. \end{aligned}$$

But \(e_k + e_l \ge \alpha \) and thus \({\textbf{t}}_{kl} \approx {\textbf{t}}_{kl}[X_n{\setminus } Y]\cdot x_l^p \cdot {\textbf{s}}^{(kl)}\). Therefore \({\textbf{w}}^{(kl)} \approx {\textbf{t}}_{kl}\) as required.

Finally, since f is a term function and \({\textbf{S}} \models {\mathscr {A}}_{\alpha ,\beta }\) we obtain

$$\begin{aligned} f = f[X_n\setminus \text {Prim}_{st}({\mathscr {F}})]\cdot \prod _{x_k \in \text {Prim}_{st}({\mathscr {F}})}x_k^{e_k}. \end{aligned}$$

\(\square \)

3 Nilpotent monoids built from words

Let A be a set and let W be a set of words in the free monoid \(A^*\). Let M(W) be the Rees quotient \(A^*/I(W)\) where I(W) is the ideal consisting of all words in \(A^*\) which are not subwords of words in W. It is easy to see that M(W) is a nilpotent monoid whose universe consists of all subwords of words in W (along with the empty word and 0) and whose binary operation \(\cdot \) is defined by

$$\begin{aligned} u \cdot v = {\left\{ \begin{array}{ll} uv &{} \text { if}\, uv\, \text { is a subword of a word in} W, \\ 0 &{} \text { otherwise.} \end{array}\right. } \end{aligned}$$

This construction, usually attributed to Dilworth [15], was employed by Perkins to show the existence of non-finitely based semigroups [16]. Since then, constructing nilpotent monoids from words has proven to be instrumental in the exploration of the finite basis problem for semigroups. In addition, this construction has been used to show that the finite basis property is not preserved by direct products [9] nor by adjoining an identity element [11, 16].

Let \(\kappa \in {\mathbb {N}}\). A word \({\textbf{w}}\) is \(\kappa \)-limited if \(\text {occ}(x,{\textbf{w}}) \le \kappa \) for every letter \(x \in c({\textbf{w}})\). Let A be a non-empty set. We denote the set of \(\kappa \)-limited words in \(A^*\) by \(A_\kappa \).

The next result concerning \(M(A_\kappa )\) is reminiscent of Steindl’s [19, Theorem 2.4] for the free nilpotent monoid, though was obtained by the author independently and prior to the posting of [19].

Proposition 3.1

For any words \({\textbf{u}}\) and \({\textbf{v}}\), \(M(A_\kappa ) \models {\textbf{u}} \approx {\textbf{v}}\) if and only if \(Prim _{st}({\textbf{u}}) = Prim _{st}({\textbf{v}})\) and \({\textbf{u}}[c({\textbf{u}}){\setminus } Prim _{st}({\textbf{u}})] = {\textbf{v}}[c({\textbf{v}}){\setminus } Prim _{st}({\textbf{v}})]\).

Proof

Follows immediately from the property that any term for \(M(A_\kappa )\) without strongly primitive variables is an isoterm. \(\square \)

Theorem 3.2

Let A be a finite set. Then \(M(A_\kappa )\) is finitely related with term degree at most \(\text {max}(|M(A_\kappa ) |+ 1, 4)\).

Proof

We may assume that \(|A |\ge 2\) since if \(|A |= 1\) then \(M(A_\kappa )\) is commutative and finitely related by [4, Theorem 3.6] with term degree at most \(\text {max}(|M(A_\kappa ) |,4)\).

Assume \(n > |M(A_\kappa ) |+ 1\) and let \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\) be a scheme for \(M(A_\kappa )\) with variable exponents \(e_1,\dots ,e_n\) that determines an operation \(f:M(A_\kappa )^n \rightarrow M(A_\kappa )\).

Let \(Y = X_n \setminus \text {Prim}_{st}({\mathscr {F}})\). Define a set of new variables

$$\begin{aligned} {\overline{Y}} = \{{}_px_i \mid x_i \in Y, 1 \le p \le e_i\}. \end{aligned}$$

Since Y contains only non-strongly primitive variables, if \(x_i,x_j \in Y\), and \(f[x_i,x_j]\) is induced by a term \({\textbf{s}}\) then \({\textbf{s}}\) is an isoterm for \(M(A_\kappa )\) and hence is the unique term inducing \(f[x_i,x_j]\). With this in mind, define \(\prec \) on \({\overline{Y}}\) by \({}_px_i \prec {}_qx_j\) if the p-th occurrence of \(x_i\) occurs before the q-th occurrence of \(x_j\) in the term inducing \(f[x_i,x_j]\). Since each term \(f[x_i,x_j]\) is an isoterm, \(\prec \) is anti-symmetric and connected. Transitivity is established by the fact that \(n > |M(A_\kappa ) |+ 1 \ge 7\) and so \(f[x_i,x_j,x_l]\) is a term function (and, in fact, an isoterm) for any \(x_i,x_j,x_l \in Y\). Therefore \(\prec \) is a strict linear order on \({\overline{Y}}\).

By Lemma 2.6, it is sufficient to show that f[Y] comes from a term. Let

$$\begin{aligned} x_{i_1} \prec x_{i_2} \prec \cdots \prec x_{i_\alpha } \end{aligned}$$

be the linear order \(\prec \) on \({\overline{Y}}\) with the prefixes removed so that \(i_1,i_2,\dots ,i_\alpha \) may have repeats. We claim that f[Y] is induced by the term

$$\begin{aligned} {\textbf{w}} = x_{i_1}x_{i_2}\cdots x_{i_\alpha }. \end{aligned}$$

To do this, we will show that \(M(A_\kappa )\) satisfies \({\textbf{w}}^{(ij)} \approx {\textbf{t}}_{ij}[Y]\).

Note, only \(x_j\) can be strongly primitive in \({\textbf{w}}^{(ij)}\) or \({\textbf{t}}_{ij}[Y]\). If \(x_j\) is strongly primitive in the former (resp. latter), then \(x_j\) is strongly primitive in the latter (resp. former) by Lemma 2.4 and the construction of \({\textbf{w}}\). Similarly, if \(x_j\) is not strongly primitive in \({\textbf{w}}^{(ij)}\) then it is not strongly primitive in \({\textbf{t}}_{ij}\) and vice-versa. Therefore  \(\text {Prim}_{st}({\textbf{w}}^{(ij)}) = \text {Prim}_{st}({\textbf{t}}_{ij})\).

Note, for all  \(x_k,x_l \in Y\setminus \{x_i,x_j\}\) we get

$$\begin{aligned} {\textbf{t}}_{ij}[x_k,x_l] = f[x_k,x_l] = {\textbf{w}}_{ij}[x_k,x_l]. \end{aligned}$$
(2)

Hence if \(x_j\) is strongly primitive in \({\textbf{w}}^{(ij)}\), and thus \({\textbf{t}}_{ij}\), then \({\textbf{w}}^{(ij)} \approx {\textbf{t}}_{ij}\) by Proposition 3.1.

Suppose \(x_j\) is not strongly primitive in \({\textbf{w}}^{(ij)}\). Equation (2) implies that \({}_px_k\) occurs before \({}_qx_l\) in \({\textbf{w}}^{(ij)}\) if and only if \({}_px_k\) occurs before \({}_qx_l\) in \({\textbf{t}}_{ij}\) for any \(x_k,x_l \in Y{\setminus }\{x_i,x_j\}\). It follows that to show \({\textbf{w}}^{(ij)} \approx {\textbf{t}}_{ij}\), we need to show that \({}_px_j\) occurs before (resp. after) \({}_qx_k\) in \({\textbf{w}}^{(ij)}\) if and only if \({}_px_j\) occurs before (resp. after) \({}_qx_k\) in  \({\textbf{t}}_{ij}\).

Assume \({}_px_j\) occurs before \({}_qx_k\) in \({\textbf{w}}^{(ij)}\). Then there exist integers r and s such that \(r + s = p\) and \({}_rx_i\) and \({}_sx_j\) occur before \({}_qx_k\) in \({\textbf{w}}\). By the construction of \({\textbf{w}}\), the operation \(f[x_i,x_j,x_k]\) is induced by a unique term in which \({}_rx_i\) and \({}_sx_j\) occur before \({}_qx_k\). Moreover, since \(x_j\) is not primitive in \({\textbf{w}}^{(ij)}\) (and \({\textbf{t}}_{ij}\)), \(f_{ij}[x_i,x_j,x_k]\) is induced by a unique term where \({}_{r+s}x_j = {}_px_j\) occurs before \({}_qx_k\). But \(f_{ij}[x_i,x_j,x_k]\) is induced by \({\textbf{t}}_{ij}[x_j,x_k]\) and thus \({}_px_j\) occurs before \({}_qx_k\) in \({\textbf{t}}_{ij}\) as required. The converse is similar as is the case when \({}_px_j\) occurs after \({}_qx_k\) in \({\textbf{w}}^{(ij)}\) or \({\textbf{t}}_{ij}\).

Therefore \(M(A_\kappa ) \models {\textbf{w}}^{(ij)} \approx {\textbf{t}}_{ij}[Y]\) by Proposition 3.1 and the theorem is proved by Lemma 2.6. \(\square \)

Mayr [14] proposed a more general way of establishing large classes of non-finitely related algebras: prove there exists an inherently non-finitely related semigroup. An algebra \({\textbf{A}}\) is said to be inherently non-finitely related if for any algebra \({\textbf{B}}\) of the same type, \({\textbf{A}} \in V({\textbf{B}})\) implies that \({\textbf{B}}\) is non-finitely related. It is unknown whether there exist inherently non-finitely related algebras, however Theorem 3.2 allows us to conclude that the class of nilpotent monoids will not produce an example of one.

Corollary 3.3

There are no inherently non-finitely related nilpotent monoids.

Proof

Let \({\textbf{S}}\) be a non-finitely related d-nilpotent monoid. We claim that \({\textbf{S}}\) embeds into a finitely related nilpotent monoid.

Since \({\textbf{S}}\) is d-nilpotent, \({\textbf{S}} \models {\mathscr {A}}_d\). Let A be a set with \(|A |= 2\) and consider the semigroup \({\textbf{T}}:= {\textbf{S}} \times M(A_{d-1})\). Then \(V({\textbf{T}}) = V(M(A_{d-1}))\) [9, Corollary 3.7] so is finitely related by Theorem 3.2 and Theorem 2.2. This concludes the proof of the claim as \({\textbf{S}}\) embeds into \({\textbf{T}}\). Therefore there are no inherently non-finitely related nilpotent monoids. \(\square \)

4 Chain, crown & maelstrom words

This section is primarily dedicated to the construction of schemes using chain, crown, and maelstrom words. Chain words, to be defined precisely below, were initially introduced by Lee for constructing non-finitely based monoid varieties [12, 13], and were further studied by Jackson and Lee to analyse monoids with uncountably many subvarieties [8]. Ren, Jackson, Zhao, and Lei also utilized chain words in the identification of semiring limit varieties [17]. Although not explicitly stated, chain words were used by Steindl to give an example of a non-finitely related nilpotent monoid [19]. For certain varieties, chain words are essentially fixed in an overlapping pattern until a variable is deleted, in which case the pattern is irrevocably damaged. In the case of schemes, we will show that, under certain circumstances, identifying variables to create a strongly primitive variable has the same effect. This will allow us to construct patterns that don’t correspond to any term for some variety, but where identifications can be derived from a term.

We will extend the concept of variable patterns that break down upon removal to two novel types of words known as Crown and Maelstrom words. This extension enables us to demonstrate the presence of an abundance of non-finitely related nilpotent monoids, which we can utilize to prove, among other things, the existence of two non-finitely related algebras whose direct product is finitely related.

4.1 Schemes built from chain words

The base case of the following definition, where \(p = q = 1\), is the most commonly invoked form, and are sometimes referred to as Lee words [17].

Definition 4.1

Given \(n,p,q \in {\mathbb {N}}\), the chain word in n variables with variable exponent \(p+q\) is given by

$$\begin{aligned} {\mathcal {C}}_{n,p,q} = x_1^px_2^px_1^qx_3^px_2^qx_4^px_3^q\cdots x_n^px_{n-1}^qx_n^q. \end{aligned}$$

Let \(i,j,n \in {\mathbb {N}}\). To simplify the exposition in this section, the following notation is used:

$$\begin{aligned} {\mathcal {C}}_{p,q}(i;j)&:= {\mathcal {C}}_{j-i+1,p,q}(x_i,x_{i+1},\dots ,x_j){} & {} \text { if } i< j \le n, \\ {\mathcal {C}}_{p,q}(i;n;j)&:= {\mathcal {C}}_{n-i+j+1,p,q}(x_i,\dots ,x_{n-1},x_n,x_1,\dots ,x_j){} & {} \text { if } j < i \le n.\\ \end{aligned}$$

We adopt the convention that chain words are over the usual variable set \(X_n\), and denote the chain word over \(Y_n = \{y_1,\dots ,y_n\}\) by \(\overline{{\mathcal {C}}_{n,p,q}}\).

Given any \(i,j \in {\underline{n}}\) with \(i < j\), let

$$\begin{aligned} \theta _{ij} =(x_j,x_{i+1},x_{i+2},\dots ,x_{n-1},x_n,x_1,x_2,\dots ,x_{i-1}). \end{aligned}$$

Lemma 4.2

Let \({\textbf{S}}\) be a monoid. Fix \(p,q \in {\mathbb {N}}\) and suppose

$$\begin{aligned} {\textbf{S}} \models {\mathcal {C}}_{n,p,q}\overline{{\mathcal {C}}_{m,n,q}} \approx \overline{{\mathcal {C}}_{m,n,q}}{\mathcal {C}}_{n,p,q} \end{aligned}$$

for all \(m,n \in {\mathbb {N}}\). Then for any \(n \in {\mathbb {N}}\) with \({\textbf{t}} = {\mathcal {C}}_{n,p,q}\), \({\textbf{S}}\) satisfies

$$\begin{aligned} {\textbf{t}}(\theta _{ij})[X_n\setminus \{x_j,x_k,x_l\}] \approx {\textbf{t}}(\theta _{kl})[X_n\setminus \{x_i,x_j,x_l\}], \end{aligned}$$
(3)

for all \(i,j,k,l \in {\underline{n}}\) with \(i < j\) and \(k < l\).

Proof

Let \({\textbf{u}},{\textbf{v}}\) be the words on the left and right hand side of (3) respectively. It is easy to see that \(c({\textbf{u}}) = c({\textbf{v}})\). Notice that assigning 1 to any variable in a chain breaks the word into smaller chain words. It follows from the fact that \({\textbf{S}} \models {\mathcal {C}}_{n,p,q}\overline{{\mathcal {C}}_{m,n,q}} \approx \overline{{\mathcal {C}}_{m,n,q}}{\mathcal {C}}_{n,p,q}\) that it is enough to show that \({\textbf{u}}\) and \({\textbf{v}}\) are products of the same chain words.

We give the case for when \(i< k< j < l\), but it is easy to verify, using the same calculation, that all cases produce the same chain words in \({\textbf{u}}\) and \({\textbf{v}}\).

Given that \(i< k< j < l\), direct calculation yields

$$\begin{aligned} {\textbf{u}}&= {\mathcal {C}}_{p,q}(i+1;k-1){\mathcal {C}}_{p,q}(k+1;j-1){\mathcal {C}}_{p,q}(j+1;l-1){\mathcal {C}}_{p,q}(l+1;n;i-1)\\&\approx {\mathcal {C}}_{p,q}(k+1;j-1){\mathcal {C}}_{p,q}(j+1;l-1){\mathcal {C}}_{p,q}(l+1;n;i-1){\mathcal {C}}_{p,q}(i+1;k-1) = {\textbf{v}}. \end{aligned}$$

\(\square \)

Our main goal is to show that monoids satisfying certain conditions are necessarily non-finitely related. Lemma 2.6 and Lemma 4.2 hint towards the scheme we are trying to construct. The following lemma will essentially guarantee that these schemes do not come from terms.

Fig. 1
figure 1

A directed n-cycle is an impossible pattern for chain words

The scheme we produce will be derived from the directed n-cycle in Fig. 1. Each vertex represents a variable, and each edge a pattern of occurrence of variables in a desired term. In particular, we require that there is an edge \(x \rightarrow y\) if and only if \(x^py^px^qy^q\) occurs in the term we are trying to construct. The following lemma shows that this is impossible for the graph in Fig. 1 if \({\mathcal {C}}_{2,p,q} = x_1^px_2^px_1^qx_2^q\) is an isoterm.

Lemma 4.3

Let \(n \ge 4\) and \(G = (V,E)\) be the directed graph in Fig. 1. Fix \(p,q \in {\mathbb {N}}\). Then there is no n-ary word \({\textbf{t}}\) such that if \((x,y) \in E(G)\) then \({\textbf{t}}[x,y] = x^py^px^qy^q\).

Proof

For any \(x_i\) we have \((x_{i\ominus 1},x_i) \in E\) where \(\ominus \) is subtraction modulo n. Suppose \({\textbf{t}}\) starts with some variable \(x_j\). Then \({\textbf{t}}[x_{j\ominus 1},x_j] = x_{j\ominus 1}^px_j^px_{j\ominus 1}^qx_j^q\) contradicting the fact that \({\textbf{t}}\) begins with \(x_j\). \(\square \)

Theorem 4.4

Let \({\textbf{S}}\) be a finite monoid. Suppose there exist \(p,q \in {\mathbb {N}}\) such that \({\textbf{S}}\) satisfies the following:

  1. (i)

    for any \(m,n\in {\mathbb {N}}\), \({\textbf{S}} \models {\mathcal {C}}_{n,p,q}\overline{{\mathcal {C}}_{m,p,q}} \approx \overline{{\mathcal {C}}_{m,p,q}}{\mathcal {C}}_{n,p,q}\);

  2. (ii)

    \({\textbf{S}} \models {\mathscr {A}}_{\alpha ,\beta }\) where \(\alpha ,\beta \in {\mathbb {N}}\) and \(\alpha \le 2p+2q\);

  3. (iii)

    \({\mathcal {C}}_{2,p,q}\) is an isoterm for \({\textbf{S}}\).

Then \({\textbf{S}}\) is non-finitely related.

Proof

Let \(n > 4\). For any \(1 \le i < j \le n\), let

$$\begin{aligned} {\textbf{t}}_{ij} = {\mathcal {C}}_{n,p,q}(\theta _{ij}). \end{aligned}$$

Let \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\). We first show that \({\mathscr {F}}\) is a scheme for \(V({\textbf{S}})\). Clearly  \({\mathscr {F}}\) satisfies the dependency condition since \(c({\textbf{t}}_{ij}) = X_n\setminus \{x_i\}\). To show consistency, take any \(i,j,k,l \in {\underline{n}}\) with \(i < j\) and \(k < l\). Then

$$\begin{aligned} {\textbf{t}}_{ij}^{(kl)}[X_n \setminus \{x_j,x_l\}] \approx {\textbf{t}}_{kl}^{(ij)}[X_n \setminus \{x_j,x_l\}] \end{aligned}$$
(4)

by Lemma 4.2 and (i). Now, (ii) ensures that \(x_j,x_l\) are the strongly primitive letters in \({\textbf{t}}_{ij}^{(kl)}\) and \({\textbf{t}}_{kl}^{(ij)}\) with \(\text {occ}(x_j,{\textbf{t}}_{ij}^{(kl)}) = 2p+2q = \text {occ}(x_j,{\textbf{t}}_{kl}^{(ij)})\) and \(\text {occ}(x_l,{\textbf{t}}_{ij}^{(kl)}) = 2p+2q = \text {occ}(x_l,{\textbf{t}}_{kl}^{(ij)})\) whence \({\textbf{S}} \models {\textbf{t}}_{ij}^{(kl)} \approx {\textbf{t}}_{kl}^{(ij)}\) by (4). Therefore  \({\mathscr {F}}\) is a scheme for \(V({\textbf{S}})\).

Let G be the graph in Fig. 1. By the construction of the terms in \({\mathscr {F}}\) and (iii), if \({\mathscr {F}}\) came from a term \({\textbf{t}}\) then we would have \({\textbf{t}}[x,y] = x^py^px^qy^q\) whenever \((x,y) \in E(G)\). But by Lemma 4.3 no such term can exist. Therefore \({\mathscr {F}}\) does not come from a term and hence \({\textbf{S}}\) is non-finitely related. \(\square \)

An image of a word \(\textbf{w}\in A^*\) is a word of the form \(\theta (\textbf{w})\), where \(\theta \) is an endomorphism of \(A^*\).

It is well known that two words \({\textbf{u}}\) and \({\textbf{v}}\) in the free monoid \(A^*\) commute if and only if there exists a word \({\textbf{w}}\) such that \({\textbf{u}}\) and \({\textbf{v}}\) are powers of \({\textbf{w}}\) (see [18, Lemma  2.2] for instance).

Corollary 4.5

Let \({\textbf{w}}\) be a word in \(\{a,b\}^+\). Suppose the following conditions hold:

  1. (1)

    there exist subwords \({\textbf{u}},{\textbf{v}}\) of \({\textbf{w}}\) such that \({\textbf{u}}\) and \({\textbf{v}}\) do not commute in \(\{a,b\}^*\) and \({\textbf{u}}^p{\textbf{v}}^p{\textbf{u}}^q{\textbf{v}}^q\) is a subword of \({\textbf{w}}\);

  2. (2)

    the image of two chain words are not adjacent in \(\textbf{w}\) unless the images are powers of the same word;

  3. (3)

    \(M({\textbf{w}})\) satisfies \({\mathscr {A}}_\alpha \) for \(\alpha \le 2p+2q\).

Then \(M({\textbf{w}})\) is non-finitely related.

Proof

The monoid \(M({\textbf{w}})\) satisfies conditions (ii) and (iii) in Theorem 4.4 by (1) and (3). To show (i) of Theorem 4.4 let \(m,n \in {\textbf{N}}\) and consider \({\mathcal {C}}_{n,p,q}\) and \(\overline{{\mathcal {C}}_{m,p,q}}\) in variables \(x_1,\dots ,x_n\) and \(y_1,\dots ,y_m\) respectively. We may interpret \({\mathcal {C}}_{n,p,q}\) and  \(\overline{{\mathcal {C}}_{m,p,q}}\) as \((m+n)\)-ary words in the obvious way. Then for any \(x \in M({\textbf{w}})^{m+n}\), condition  (2) implies \(({\mathcal {C}}_{n,q,p}\overline{{\mathcal {C}}_{m,p,q}})(x) = 0\) unless \({\mathcal {C}}_{n,q,p}(x) = 1\) or \(\overline{{\mathcal {C}}_{m,p,q}}(x) = 1\). Thus \(M({\textbf{w}})\) satisfies (i) of Theorem 4.4. \(\square \)

Although Corollary 4.5 offers an abundance of examples of non-finitely related nilpotent monoids, it is not comprehensive in terms of words which meet the requirements of Theorem 4.4. In our example of two non-finitely related semigroups whose direct product is finitely related, we will use the fact that \(M(a^pb^pa^qb^q)\) is non-finitely related (as a consequence of Corollary 4.5). The smallest monoid of this form, when \(p = q = 1\), is in fact the smallest non-finitely related nilpotent monoid built from a set of words.

Example 4.6

The monoid M(abab) is the smallest example of a non-finitely related nilpotent monoid built from a set of words.

Proof

The 9 element monoid M(abab) is non-finitely related by Corollary 4.5. The only words \({\textbf{w}}\) such that \(|M({\textbf{w}}) |< 10\) are the words which have length 3 or less or the words aaaa and abab (up to letter change) [9, Theorem 4.3]. If \({\textbf{w}}\) has length 3 or less then \(M({\textbf{w}})\) is d-nilpotent for \(d \le 4\). If \(d = 3\) or \(d = 4\) then \(M({\textbf{w}})\) is finitely related by [14, Theorem 5.1] and [19, Theorem 2.3] respectively. If \(d = 2\) or if \({\textbf{w}} = aaaa\) then \(M({\textbf{w}})\) is a commutative monoid so is finitely related by [4, Theorem 3.6]. Therefore M(abab) is the smallest non-finitely related nilpotent monoid built from words. \(\square \)

While our main focus in this paper is nilpotent monoids built from words, Theorem 4.4 also applies to semigroups that are not nilpotent monoids. An ideal extension of a semigroup \({\textbf{S}}\) by a semigroup with zero element \({\textbf{T}}\) is a semigroup \({\textbf{U}}\) such that \({\textbf{S}}\) is an ideal of \({\textbf{U}}\) and the Rees quotient \({\textbf{U}}/{\textbf{S}}\) is isomorphic to \({\textbf{T}}\). If \({\textbf{T}}\) is a nilpotent semigroup then \({\textbf{U}}\) is called an nilpotent-extension of \({\textbf{S}}\) by \({\textbf{T}}\).

Given a set of words W, we obtain S(W) from M(W) by removing the empty word, that is the identity, from M(W).

Example 4.7

Let \(p,q \in {\mathbb {N}}\). Let \({\textbf{T}} = S(a^pb^pa^qb^q)\) and \({\textbf{S}}\) be a commutative monoid with index \(m \in {\mathbb {N}}\) such that \(m \le 2p+2q - (\text {max}(p,q)+1)\). If \({\textbf{N}}\) is a nilpotent-extension of \({\textbf{S}}\) by \({\textbf{T}}\) then \({\textbf{N}}^1\) is non-finitely related.

Proof

Since \({\textbf{w}} = a^pb^pa^qb^q\) satisfies condition (2) of Corollary 4.5, it follows that chain words commute in \({\textbf{T}}^1\) as in the proof of Corollary 4.5. Together with the fact that \({\textbf{S}}\) is commutative and \(ns = sn\) for any \(n \in N^1\) and \(s \in S\), it follows that \({\textbf{N}}^1\) satisfies condition (i) of Theorem 4.4. Condition (ii) of Theorem 4.4 holds by the fact that \({\textbf{T}}^1 \models {\mathscr {A}}_{\text {max}(p,q)+1,r}\), where r is the period of \({\textbf{S}}\), and \({\textbf{S}}\) has an index of no more than \(2p+2q - (\text {max}(p,q)-1)\). Finally, condition (iii) holds by the fact that \(x^py^px^qy^q\) is an isoterm for \({\textbf{T}}^1\) and so is an isoterm for \({\textbf{N}}^1\). \(\square \)

Note, in the above example we may remove the condition that \({\textbf{S}}\) has index \(m \le 2p+2q\) and still use the terms \({\textbf{t}}_{ij}\) in Theorem 4.4. In this case \({\textbf{N}}^1\) does not satisfy the requirements of Theorem 4.4, but the reader can verify that the family of terms \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\) is indeed a scheme for \(V({\textbf{N}}^1)\) and that it does not come from a term.

The following corollary mimics a result in the finite basis world [9].

Corollary 4.8

Let \(A = \{a,b\}\) and \(A_\kappa \) be the words over \(A^*\) with at most \(\kappa \)-occurrences of any letter. Set \(B_k = A_\kappa \cup \{a^\kappa b^\kappa ab\}\). Then

$$\begin{aligned} V(M(A_2)) \subset V(M(B_2)) \subset V(M(A_3)) \subset V(M(B_3)) \subset \cdots \end{aligned}$$

is an ascending chain of varieties alternating between finitely related and non-finitely related.

Proof

Let \(i \ge 2\). The inclusions \(V(M(A_i)) \subset V(M(B_i)) \subset V(M(A_{i+1}))\) are obvious. Each \(V(M(A_i))\) is finitely related by Theorem 3.2. Note, the schemes built from the chain word \({\mathcal {C}}_{n,i,1}\) in Theorem 4.4 are schemes for \(V(M(A_i))\) since each \({\mathcal {C}}_{n,i,1}\) is \((i+1)\)-limited. It follows that to show \(V(M(B_i)) = V(M(A_i \cup \{a^ib^iab\}))\) is non-finitely related, we only need to show that the terms built from \({\mathcal {C}}_{n,i,1}\) form a scheme for \(V(M(a^ib^iab))\) and do not come from a term. This is easily established, however, as \({\textbf{w}} = a^ib^iab\) satisfies the requirements of Corollary 4.5. Therefore each \(V(M(B_i))\) is non-finitely related. \(\square \)

Let W be a set of words. For each \({\textbf{w}}_i \in W\), let \(X_{{\textbf{w}}_i}\) be an alphabet such that \(|X_{{\textbf{w}}_i} |= |c({\textbf{w}}_i) |\) and \(X_{{\textbf{w}}_i} \cap X_{{\textbf{w}}_j} = \varnothing \) for all \({\textbf{w}}_j \in W\setminus \{{\textbf{w}}_i\}\). Let \({\textbf{u}}_i\) be the word obtained from \({\textbf{w}}_i\) by replacing (one-to-one) each \(x \in c({\textbf{w}}_i)\) with a \(y \in X_{{\textbf{w}}_i}\). Let  \({\textbf{u}} = \prod _{\mathbf {w_i} \in W} {\textbf{u}}_i\) so that \(V(M({\textbf{u}})) = V(M(W))\) (see the proof of [9, Theorem 4.4]). We may apply this trick to each \(A_i\) in Corollary 4.8, producing the word \({\textbf{A}}_i\), and give the following sequence of subwords alternating between finitely and non-finitely related:

$$\begin{aligned} {\textbf{A}}_2< a^2b^2ab{\textbf{A}}_2< a^2b^2ab{\textbf{A}}_2{\textbf{A}}_3< a^3b^3aba^2b^2ab{\textbf{A}}_2{\textbf{A}}_3 < \cdots \end{aligned}$$

4.2 Schemes built from maelstrom words

Like chain words, maelstrom words utilise a sort of commutativity to produce schemes and a pattern of isoterms to prevent those schemes coming from terms. Importantly, while these schemes will require that \(x^py^px^qy^q\) is an isoterm (like the schemes built with chain words), they will not require chain words to commute. This provides the flexibility to study monoids where \(x^{p+q}y^{p+q}\) is an isoterm. To our knowledge, maelstrom words have not gained any attention in the finite basis world, unlike their chain word cousins.

For ease of use of these maelstrom words, we will only define them on an even number of variables.

Definition 4.9

Given \(n,p,q \in {\mathbb {N}}\), with n even, the maelstrom word on n variables with variable exponents \(p+q\) is given by

$$\begin{aligned} {\mathcal {M}}_{n,p,q} = x_1^p\cdot \Bigg (\prod _{k=1}^{\frac{n}{2}-1} x_{2k+1}^px_{2k}^p\Bigg )\cdot x_n^p \cdot \Bigg (\prod _{k=0}^{\frac{n}{2}-1}x_{n-(2k+1)}^qx_{n-2k}^q\Bigg ). \end{aligned}$$

One may also construct a maelstrom word in the following way: Let \(X_{\text {even}} = \{x_2,x_4,\cdots \,x_n\}\) and \(X_{\text {odd}} = \{x_1,x_3,\dots \,x_{n-1}\}\). Let \([X_{\text {even}}n] = x_2x_4\cdots x_n\) and \([nX_{\text {even}}] = x_nx_{n-2}\cdots x_2\). Define \([X_{\text {odd}}(n-1)]\) and \([(n-1)X_{\text {odd}}]\) analogously. Then \({\mathcal {M}}_{n,p,q}\), with powers pq removed, can be considered as an interlacing of \([X_{\text {even}}n][nX_{\text {even}}]\) and \([X_{\text {odd}}(n-1)][(n-1)X_{\text {odd}}]\):

figure a

Definition 4.10

Let \({\mathcal {M}}_{n,p,q}\) and \({\textbf{w}}\) be maelstrom words. Then \(\odot \) is defined as follows,

$$\begin{aligned} {\mathcal {M}}_{n,p,q} \odot {\textbf{w}}:= x_1^px_3^p\cdots x_{n-2}^px_n^p \cdot {\textbf{w}} \cdot x_{n-1}^qx_n^q \cdots x_2^q. \end{aligned}$$

Here, \(\odot \) inserts \({\textbf{w}}\) into the word \({\mathcal {M}}_{n,p,q}\) after the p-th occurrence of any variable \(x_i\) but before its \((p+1)\)-th occurrence. This ensures that for any variable \(y \in c({\textbf{w}})\) we get

$$\begin{aligned} ({\mathcal {M}}_{n,p,q} \odot {\textbf{w}})[x_i,y] = x_i^py^{\text {occ}(y,{\textbf{w}})}x_i^q, \end{aligned}$$

avoiding the isoterms \(x^py^px^qy^q\) and \(x^{p+q}x^{p+q}\).

To build the desired schemes, we first need to be able to “slice" maelstrom words into several parts. Let \(i,j,n \in {\mathbb {N}}\) with \(n \ge \text {max}(i,j)\). If \(i \le j\) define \({\mathcal {M}}_{p,q}(i;j)\) by

$$\begin{aligned} {\mathcal {M}}_{p,q}(i;j):= {\mathcal {M}}_{n,p,q}[x_i,\dots ,x_j]. \end{aligned}$$

If \(j < i\) let \(X' = \{x_1,\dots ,x_j,x_i,\dots ,x_n\}\) and define \({\mathcal {M}}_{p,q}(i;n;j)\) by

$$\begin{aligned} {\mathcal {M}}_{p,q}(i;n;j):= {\left\{ \begin{array}{ll} {\mathcal {M}}_{n,p,q}(x_i,\dots ,x_n,x_1,\dots ,x_{i-1})[X'] &{}\text { if } i\text { is odd,}\\ {\mathcal {M}}_{n+2,p,q}(y,x_i,\dots ,x_n,x_1,\dots ,x_{i-1},z)[X'] &{}\text { if } i \text { is even}. \end{array}\right. } \end{aligned}$$

Our aim is to produce maelstrom words for subgraphs (with certain edge conditions) obtained by vertex deletion of the graph in Fig. 2. To fully capture this idea, we require some care for boundary cases. To this end, let 1 denote the empty word in the free monoid \(A^*\). For \(i \in {\underline{n}}\) we define the following:

$$\begin{aligned} {\mathcal {M}}_{p,q}(i+1;i)&:= 1, \\ {\mathcal {M}}_{p,q}(i;n;0)&:= {\mathcal {M}}_{p,q}(i;n), \\ {\mathcal {M}}_{p,q}(n+1;n;i)&:= {\mathcal {M}}_{p,q}(1;i),\\ {\mathcal {M}}_{p,q}(n+1;n;0)&:= 1. \end{aligned}$$

The reader will notice the similarity between the following lemma and Lemma  4.2.

Lemma 4.11

Let \({\textbf{S}}\) be a monoid. Fix \(p,q \in {\mathbb {N}}\) and suppose for any even \(m,n \in {\mathbb {N}}\),

$$\begin{aligned} {\textbf{S}} \models {\mathcal {M}}_{n,p,q}\odot \overline{{\mathcal {M}}_{m,p,q}} \approx \overline{{\mathcal {M}}_{m,p,q}}\odot {\mathcal {M}}_{n,p,q}, \end{aligned}$$

where \({\mathcal {M}}_{n,p,q}\) and \(\overline{{\mathcal {M}}_{m,p,q}}\) are maelstrom words over disjoint variable sets. Fix an even \(n \in {\mathbb {N}}\) and let

$$\begin{aligned} {\textbf{t}}_{ij} = {\mathcal {M}}_{p,q}(j+1;n;i-1)\odot {\mathcal {M}}_{p,q}(i+1,j-1), \end{aligned}$$

for \(i < j \le n\). Then \({\textbf{S}}\) satisfies

$$\begin{aligned} {\textbf{t}}_{ij}[X_n\setminus \{x_j,x_k,x_l\}] \approx {\textbf{t}}_{kl}[X_n\setminus \{x_i,x_j,x_l\}], \end{aligned}$$
(5)

for all \(i,j,k,l \in {\underline{n}}\) with \(i < j\) and \(k < l\).

Proof

Let \({\textbf{u}}, {\textbf{v}}\) be the words on the left and right hand side of (5) respectively. It is easily seen that the content of both \({\textbf{u}}\) and \({\textbf{v}}\) coincide and equals \(X_n \setminus \{x_i,x_j,x_k,x_l\}\). We will prove (5) in the case for when \(i< j< k < l\), but all cases are similar. A simple calculation shows

$$\begin{aligned} \begin{aligned} {\textbf{u}} = {\mathcal {M}}_{p,q}(j+1;k-1)&\odot {\mathcal {M}}_{p,q}(k+1;l-1) \\&\odot {\mathcal {M}}_{p,q}(l+1;n;i-1) \odot {\mathcal {M}}_{p,q}(i+1,j-1). \end{aligned} \end{aligned}$$

Given that any two words of the form \({\mathcal {M}}_{p,q}(r;s)\) or \({\mathcal {M}}_{p,q}(r;n;s)\) are constructed by first rearranging and then deleting letters from \({\mathcal {M}}_{n,p,q}\) (or \({\mathcal {M}}_{n+2,p,q}\)), the commutativity of any two such words under \(\odot \) follows from \({\textbf{S}} \models {\mathcal {M}}_{n,p,q}\odot \overline{{\mathcal {M}}}_{m,p,q} \approx \overline{{\mathcal {M}}}_{m,p,q}\odot {\mathcal {M}}_{n,p,q}\) provided they do not share any variables. Since \({\textbf{u}}\) is made up of such words, and the variable sets for those words are pairwise disjoint, we obtain

Fig. 2
figure 3

An impossible pattern for maelstrom words

$$\begin{aligned} \begin{aligned} {\textbf{u}} \approx {\mathcal {M}}_{p,q}(l+1;n;i-1)&\odot {\mathcal {M}}_{p,q}(i+1,j-1) \\ {}&\odot {\mathcal {M}}_{p,q}(j+1;k-1) \odot {\mathcal {M}}_{p,q}(k+1;l-1) = {\textbf{v}}. \\ \end{aligned} \end{aligned}$$

\(\square \)

The commutativity of maelstrom words with respect to \(\odot \) will have the same effect for the schemes we build as the commutativity of chain words had on the schemes in the previous section.

Lemma 4.12

Let \(G = (V,E)\) be the directed graph in Fig. 2 and let \(n \ge 4\) be even. Fix \(p,q \in {\mathbb {N}}\). Then there is no n-ary word \({\textbf{t}}\) with the following properties:

  1. (i)

    if \((x,y) \in E(G)\) then \({\textbf{t}}[x,y] = x^py^px^qy^q\);

  2. (ii)

    if \((x,y) \not \in E(G)\) then \({\textbf{t}}[x,y] \in \{x^py^{p+q}x^q, y^px^{p+q}y^q\}\).

Proof

For the sake of contradiction, suppose there exists an n-ary word \({\textbf{t}}\) satisfying the conditions (i) and (ii). Suppose \({\textbf{t}}\) begins with \(x_i\). Then i must be odd since otherwise \({\textbf{t}}[x_i,x_{i+1}] = x_{i+1}^px_{i}^px_{i+1}^qx_i^q\) contradicting the fact that \({\textbf{t}}\) begins with \(x_i\). Now, consider \({\textbf{t}}[x_{i\ominus 2},x_{i\ominus 1},x_i,x_{i\oplus 1},x_{i\oplus 2}]\) where \(\oplus \) and \(\ominus \) are addition and subtraction modulo n respectively. Since there are, pairwise, no edges between \(x_{i\ominus 2}, x_{i}\) and \(x_{i\oplus 2}\) in G and \({\textbf{t}}\) begins with \(x_i\) either

$$\begin{aligned} {\textbf{t}}[x_{i\ominus 2},x_i,x_{i\oplus 2}] = x_i^px_{i\ominus 2}^px_{i\oplus 2}^{p+q}x_{i\ominus 2}^qx_i^q, \end{aligned}$$
(6)

or

$$\begin{aligned} {\textbf{t}}[x_{i\ominus 2},x_i,x_{i\oplus 2}] = x_i^px_{i\oplus 2}^px_{i\ominus 2}^{p+q}x_{i\oplus 2}^qx_i^q. \end{aligned}$$
(7)

If equation (6) is true then since \({\textbf{t}}[x_{i\oplus 1}, x_{i\oplus 2}] = x_{i\oplus 2}^px_{i\oplus 1}^px_{i\oplus 2}^qx_{i\oplus 1}^q\) and \({\textbf{t}}[x_i,x_{i \oplus 1}] = x_i^px_{i\oplus 1}^px_i^qx_{i\oplus 1}^q\) we must have

$$\begin{aligned} {\textbf{t}}[x_{i\ominus 2},x_i,x_{i\oplus 1},x_{i\oplus 2}] = x_i^px_{i\ominus 2}^px_{i\oplus 2}^{p}x_{i\oplus 1}^px_{i\oplus 2}^{q}x_{i\ominus 2}^qx_i^qx_{i\oplus 1}^q, \end{aligned}$$

which contradicts the fact that \((x_{i\ominus 2},x_{i\oplus 1}) \not \in E(G)\). Similarly, if equation (7) is true then \({\textbf{t}}[x_{i\ominus 2},x_{i\ominus 1},x_i,x_{i\oplus 2}] = x_i^px_{i\oplus 2}^px_{i\ominus 2}^px_{i\ominus 1}^px_{i\ominus 2}^qx_{i\oplus 2}^qx_i^qx_{i\ominus 1}^q\) which contradicts the fact that \((x_{i\oplus 2},x_{i\ominus 1}) \not \in E(G)\). As every case has led to a contradiction, it follows that no such \({\textbf{t}}\) can exist. \(\square \)

Given a semigroup \({\textbf{S}}\), we will say that a set of words \(\{{\textbf{u}}_1,\dots ,{\textbf{u}}_n\}\) forms an island for \({\textbf{S}}\) if for all \(i,j \in {\underline{n}}\) we have \({\textbf{S}} \models {\textbf{u}}_i \approx {\textbf{u}}_j\) and for any word \({\textbf{v}}\), \({\textbf{S}}\) satisfies \({\textbf{u}}_i \approx {\textbf{v}}\) if and only if \({\textbf{v}} = {\textbf{u}}_k\) for some \(k \in {\underline{n}}\).

Theorem 4.13

Let \({\textbf{S}}\) be a finite monoid. Suppose there exist \(p,q \in {\mathbb {N}}\) such that \({\textbf{S}}\) satisfies the following:

  1. (i)

    for any even \(m,n \in {\mathbb {N}}\), \({\textbf{S}} \models {\mathcal {M}}_{n,p,q}\odot \overline{{\mathcal {M}}_{m,p,q}} \approx \overline{{\mathcal {M}}_{m,p,q}}\odot {\mathcal {M}}_{n,p,q}\);

  2. (ii)

    \({\textbf{S}} \models {\mathscr {A}}_{\alpha ,\beta }\) where \(\alpha ,\beta \in {\mathbb {N}}\) and \(\alpha \le 2p+2q\);

  3. (iii)

    \(x^py^px^qy^q\) is an isoterm for \({\textbf{S}}\);

  4. (iv)

    \(x^py^{p+q}x^q \approx y^px^{p+q}y^q\) is an island identity for \({\textbf{S}}\).

Then \({\textbf{S}}\) is non-finitely related.

Proof

Let n be even and \(n > 4\). For any \(1 \le i < j \le n\), let

$$\begin{aligned} {\textbf{t}}_{ij} = x_j^{2p+2q}\cdot {\mathcal {M}}_{p,q}(j+1;n;i-1)\odot {\mathcal {M}}_{p,q}(i+1,j-1). \end{aligned}$$

Let \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\). We will show that \({\mathscr {F}}\) is an scheme for \(V({\textbf{S}})\). Dependency follows immediately from the fact that \(x_i \not \in c({\textbf{t}}_{ij})\). To show consistency, take any \(i,j,k,l \in {\underline{n}}\) with \(i < j\) and \(k < l\). Condition (ii) ensures that \(x_j\) and \(x_l\) are strongly primitive in \({\textbf{t}}_{ij}^{(kl)}\) and \({\textbf{t}}_{kl}^{(ij)}\) with \(\text {occ}(x_j,{\textbf{t}}_{ij}^{(kl)}) = 2p+2q = \text {occ}(x_j,{\textbf{t}}_{kl}^{(ij)})\) and \(\text {occ}(x_l,{\textbf{t}}_{ij}^{(kl)}) = 2p+2q = \text {occ}(x_l,{\textbf{t}}_{kl}^{(ij)})\). Combining this fact and (i) allows us to obtain \({\textbf{S}} \models {\textbf{t}}_{ij}^{(kl)}\approx {\textbf{t}}_{kl}^{(ij)}\) from Lemma 4.11. Therefore \({\mathscr {F}}\) obeys the consistency condition hence \({\mathscr {F}}\) is an scheme for \(V({\textbf{S}})\).

Let G denote the graph in Fig. 2. By the construction of the terms \({\textbf{t}}_{ij}\) and  (iii), if \({\mathscr {F}}\) did come from a term \({\textbf{t}}\), then \({\textbf{t}}[x,y] = x^py^qx^qy^q\) whenever \((x,y) \in E(G)\). Furthermore, if  \((x,y) \not \in E(G)\), then the terms \({\textbf{t}}_{ij}\) would also require that \({\textbf{t}}[x,y] \in \{x^py^{p+q}x^q, y^px^{p+q}y^q\}\) by (iv). But by Lemma 4.12 no such term can exist. It follows that \({\mathscr {F}}\) does not come from a term and therefore \({\textbf{S}}\) is non-finitely related. \(\square \)

Example 4.14

M(ababaabb) is non-finitely related.

Proof

Conditions (ii)–(iv) of Theorem 4.13 with \(p=q=1\) hold trivially for M(ababaabb). Condition (i) can be obtained from the fact that any evaluation of the term \(xy^2x\) is 0 unless x or y is assigned 1. \(\square \)

4.3 Schemes built from crown words

Comparing the schemes built from chains and maelstrom words, the reader will notice that the basic recipe is the same.

  • We first show that, given some sort of commutativity between crown/maelstrom words, an appropriate substitution and then deletions of variables forms an identity (see Lemma 4.2 and Lemma 4.11).

  • Under certain conditions, a collection of such identities forms a scheme (see Theorem 4.4 and Theorem 4.13).

  • The operations determined by the schemes must obey the cyclic edge relations seen in Figs. 1 and 2. This necessarily requires that these operations are not term functions (see Lemma 4.3 and Lemma 4.12).

We implement the same procedure in this section to show that, for certain monoids, schemes built from crown words do not come from terms.

Definition 4.15

Given \(n,p,q \in {\mathbb {N}}\), the crown word on n variables is given by

$$\begin{aligned} \mathbf {{\mathcal {R}}}_{n,p,q}:= {\left\{ \begin{array}{ll} x_1^p \cdot \Bigg (\displaystyle \prod _{k=1}^{\frac{n}{2}-1}x_{2k+1}^px_{2k}^{p+q}x_{2k-1}^q\Bigg ) \cdot x_n^{p+q}x_{n-1}^q &{}\text {if } n \text { is even}, \\ x_1^p \cdot \Bigg (\displaystyle \prod _{k=1}^{\frac{n-1}{2}}x_{2k+1}^px_{2k}^{p+q}x_{2k-1}^q\Bigg ) \cdot x_n^q &{}\text {if } n \text { is odd}. \end{array}\right. } \end{aligned}$$

On the odd-indexed variables, \(X_{\text {odd}}\), it is easy to see that \({\mathcal {R}}_{n,p,q}[X_{\text {odd}}] = {\mathcal {C}}_{n,p,q}\). To create \({\mathcal {R}}_{n,p,q}\) from \({\mathcal {C}}_{n,p,q}\), we insert the even-indexed variables within \({\mathcal {C}}_{n,p,q}\) as follows:

  • if \(x_j \in X_{\text {even}}\setminus \{x_n\}\) then insert \(x_j^{p+q}\) immediately after the p-th occurrence of \(x_{j+1}\) and immediately before the \((p+1)\)-th occurrence of \(x_{j-1}\);

  • if n is even, insert \(x_n^{p+q}\) immediately after the \((p+q)\)-th occurrence of \(x_{n-2}\) and immediately before the \((p+1)\)-th occurrence of \(x_{n-1}\).

Let \(i,j,n \in {\mathbb {N}}\) with \(n \ge \text {max}(i,j)\). If \(i \le j\) then define \({\mathcal {R}}_{p}(i;j)\) by

$$\begin{aligned} {\mathcal {R}}_{p,q}(i;j):= {\mathcal {R}}_{n,p,q}[x_i,\dots ,x_j]. \end{aligned}$$

If \(j < i\) let \(X' = \{x_1,\dots ,x_j,x_i,\dots ,x_n\}\) and define \({\mathcal {R}}_{p,q}(i;n;j)\) by

$$\begin{aligned} {\mathcal {R}}_{p,q}(i;n;j):= {\left\{ \begin{array}{ll} {\mathcal {R}}_{n,p,q}(x_i,\dots ,x_n,x_1,\dots ,x_{i-1})[X'] &{}\text {if } i\text { is odd,} \\ {\mathcal {R}}_{n+1,p,q}(y,x_i,\dots ,x_n,x_1,\dots ,x_{i-1})[X'] &{}\text {if } i\text { is even.} \end{array}\right. } \end{aligned}$$

As with maelstrom words, we are required to define some boundary cases. For \(i \in {\underline{n}}\) we define the following:

$$\begin{aligned} {\mathcal {R}}_{p,q}(i+1;i)&:= 1, \\ {\mathcal {R}}_{p,q}(i;n;0)&:= {\mathcal {R}}_{p,q}(i;n), \\ {\mathcal {R}}_{p,q}(n+1;n;i)&:= {\mathcal {R}}_{p,q}(1;i),\\ {\mathcal {R}}_{p,q}(n+1;n;0)&:= 1. \end{aligned}$$

Lemma 4.16

Let \({\textbf{S}}\) be a monoid. Fix \(p,q \in {\mathbb {N}}\) and suppose for any \(m,n \in {\mathbb {N}}\)

$$\begin{aligned} {\textbf{S}} \models {\mathcal {R}}_{n,p,q}\overline{{\mathcal {R}}_{m,p,q}} \approx \overline{{\mathcal {R}}_{m,p,q}}{\mathcal {R}}_{n,p,q}, \end{aligned}$$

where \({\mathcal {R}}_{n,p,q}\) and \(\overline{{\mathcal {R}}_{m,p,q}}\) are crown words over disjoint variable sets. Fix \(n \in {\mathbb {N}}\) and let

$$\begin{aligned} {\textbf{t}}_{ij} = {\mathcal {R}}_{p,q}(j+1;n;i-1){\mathcal {R}}_{p,q}(i+1,j-1) \end{aligned}$$

for any \(i < j \le n\). Then \({\textbf{S}}\) satisfies

$$\begin{aligned} {\textbf{t}}_{ij}[X_n\setminus \{x_k,x_l\}] \approx {\textbf{t}}_{kl}[X_n\setminus \{x_i,x_j\}], \end{aligned}$$
(8)

for all \(i,j,k,l \in {\underline{n}}\) with \(i < j\) and \(k < l\).

Proof

Let \({\textbf{u}},{\textbf{v}}\) be the words on the left and right hand side of (8) respectively. Clearly \(c({\textbf{u}}) = c({\textbf{v}})\). We will only prove the lemma for the case when \(i< j< k < l\), but all cases can be obtained in a similar way. A simple calculation shows

$$\begin{aligned} {\textbf{u}} = {\mathcal {R}}_{p,q}(j+1;k-1){\mathcal {R}}_{p,q}(k+1;l-1){\mathcal {R}}_{p,q}(l+1;n;i-1){\mathcal {R}}_{p,q}(i+1,j-1). \end{aligned}$$

Each word of the form \({\mathcal {R}}_{p,q}\) that makes up \({\textbf{u}}\) is obtained by first rearranging and then deleting letters from a crown word. Since crown words commute on \({\textbf{S}}\), it follows that words of the form \({\mathcal {R}}_{p,q}\) also commute. We then get

$$\begin{aligned} {\textbf{u}} \approx {\mathcal {R}}_{p,q}(l+1;n;i-1){\mathcal {R}}_{p,q}(i+1,j-1){\mathcal {R}}_{p,q}(j+1;k-1){\mathcal {R}}_{p,q}(k+1;l-1) = {\textbf{v}}, \end{aligned}$$

as required. \(\square \)

In the previous sections, we saw that it is convenient to use a graph and an edge relation to show that certain operations cannot be term functions. For crown words, we re-use Fig. 2, now requiring that if \((x,y) \in E(G)\) then \(t[x,y] = x^py^{p+q}x^q\).

Lemma 4.17

Let \(G = (V,E)\) be the directed graph in Fig. 2 and let \(n > 4\) be even. Fix \(p,q \in {\mathbb {N}}\). Then there is no n-ary word \({\textbf{t}}\) with the following properties:

  1. (i)

    if \((x,y) \in E(G)\) then \({\textbf{t}}[x,y] = x^py^{p+q}x^q\);

  2. (ii)

    if \((x,y) \not \in E(G)\) then \({\textbf{t}}[x,y] \in \{x^py^px^qy^q,y^px^py^qx^q,x^{p+q}y^{p+q},y^{p+q}x^{p+q}\}\).

Proof

For the sake of contradiction suppose there exists an n-ary word \({\textbf{t}}\) such that  \({\textbf{t}}\) satisfies (i) and (ii). Suppose \({\textbf{t}}\) begins with \(x_i\). Then using Fig. 2, i must be odd since if i were even then \({\textbf{t}}[x_i,x_{i\oplus 1}] = x_{i\oplus 1}^px_{i}^{p+q}x_{i\oplus 1}^q\) contradicting the fact that \({\textbf{t}}\) begins with \(x_i\). Now, by (ii) one of the following is true:

  1. (1)

    \({\textbf{t}}[x_{i \ominus 1},x_{i \oplus 1}] = x_{i \ominus 1}^p x_{i \oplus 1}^px_{i \ominus 1}^q x_{i \oplus 1}^q\);

  2. (2)

    \({\textbf{t}}[x_{i \ominus 1},x_{i \oplus 1}] = x_{i \oplus 1}^p x_{i \ominus 1}^px_{i \oplus 1}^q x_{i \ominus 1}^q\);

  3. (3)

    \({\textbf{t}}[x_{i \ominus 1},x_{i \oplus 1}] = x_{i \ominus 1}^{p+q}x_{i \oplus 1}^{p+q}\);

  4. (4)

    \({\textbf{t}}[x_{i \ominus 1},x_{i \oplus 1}] =x_{i \oplus 1}^{p+q}x_{i \ominus 1}^{p+q}\).

Suppose (1) is true. Then since \((x_{i \ominus 2},x_{i \ominus 1}) \in E(G)\) and \(x_{i \ominus 2}\) and \(x_{i \oplus 1}\) are not adjacent in G we have

$$\begin{aligned} {\textbf{t}}[x_{i \ominus 2},x_{i \ominus 1},x_{i \oplus 1}] = x_{i \ominus 2}^p x_{i \ominus 1}^p x_{i \oplus 1}^px_{i \ominus 1}^q x_{i \ominus 2}^q x_{i \oplus 1}^q. \end{aligned}$$

But as \(x_i\) appears first and \({\textbf{t}}[x_i,x_{i \oplus 1}] = x_i^px_{i \oplus 1}^{p+q}x_i^q\) we have \({\textbf{t}}[x_i,x_{i \ominus 2}] = x_i^px_{i \ominus 2}^{p+q}x_i^q\) contradicting the fact that \((x_i,x_{i \ominus 2}) \not \in E(G)\). A similar contradiction can be found if we assume (2) is true.

Now suppose that (3) is true. Note, since \({\textbf{t}}\) begins with \(x_i\) we have \({\textbf{t}}[x_{i \ominus 2},x_i] \in \{x_i^px_{i \ominus 2}^px_i^qx_{i \ominus 2}^q, x_i^{p+q}x_{i \ominus 2}^{p+q}\}\) by (ii). If \({\textbf{t}}[x_{i \ominus 2},x_i] = x_i^px_{i \ominus 2}^px_i^qx_{i \ominus 2}^q\) then

$$\begin{aligned} {\textbf{t}}[x_{i \ominus 2},x_{i \ominus 1},x_i,x_{i \oplus 1}] = x_i^p x_{i \ominus 2}^p x_{i \ominus 1}^{p+q}x_{i \oplus 1}^{p+q} x_i^q x_{i \ominus 2}^q. \end{aligned}$$

But then \({\textbf{t}}[x_{i \ominus 2},x_{i \oplus 1}] = x_{i \ominus 2}^px_{i \oplus 1}^{p+q} x_{i \ominus 2}^q\) contradicts the fact that \((x_{i \ominus 2},x_{i \oplus 1}) \not \in E(G)\). If \({\textbf{t}}[x_{i \ominus 2},x_i] = x_i^{p+q}x_{i \ominus 2}^{p+q}\) then since \((x_{i},x_{i \ominus 1}),(x_{i},x_{i \oplus 1}) \in E(G)\) we have

$$\begin{aligned} {\textbf{t}}[x_{i \ominus 2},x_{i \ominus 1},x_i,x_{i \oplus 1}] = x_i^px_{i \ominus 1}^{p+q}x_{i \oplus 1}^{p+q}x_i^qx_{i \ominus 2}^{p+q}. \end{aligned}$$

But then \({\textbf{t}}[x_{i \ominus 1},x_{i \ominus 2}] = x_{i \ominus 1}^{p+q}x_{i \ominus 2}^{p+q}\) contradicting the fact that \((x_{i \ominus 2},x_{i \ominus 1}) \in E(G)\). A similar contradiction can be found for (4).

Since cases (1) through (4) are exhaustive by (ii), and each results in a contradiction, no word \({\textbf{t}}\) satisfying the requirements of the lemma’s statement can exist. \(\square \)

The proof of the following theorem is similar to Theorem 4.4 and 4.13.

Theorem 4.18

Let \({\textbf{S}}\) be a finite monoid. Suppose there exist \(p,q \in {\mathbb {N}}\) such that \({\textbf{S}}\) satisfies the following:

  1. (i)

    for any \(m,n \in {\mathbb {N}}\), \({\textbf{S}} \models {\mathcal {R}}_{n,p,q}\overline{{\mathcal {R}}_{m,p,q}} \approx \overline{{\mathcal {R}}_{m,p,q}}{\mathcal {R}}_{n,p,q}\);

  2. (ii)

    \({\textbf{S}} \models {\mathscr {A}}_{\alpha ,\beta }\), where \(\alpha ,\beta \in {\mathbb {N}}\) and \(\alpha \le 2p+2q\);

  3. (iii)

    \(x^py^{p+q}x^q\) is an isoterm for \({\textbf{S}}\);

  4. (iv)

    \(\{x^py^px^qy^q, y^px^py^qx^q, x^{p+q}y^{p+q}, y^{p+q}x^{p+q}\}\) is an island for \({\textbf{S}}\).

Then \({\textbf{S}}\) is non-finitely related.

Proof

Let n be even and \(n > 4\). For any \(1 \le i < j \le n\), let

$$\begin{aligned} {\textbf{t}}_{ij} = x_j^{2p+2q}\cdot {\mathcal {R}}_{p,q}(j+1;n;i-1){\mathcal {R}}_{p,q}(i+1,j-1). \end{aligned}$$

Let \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\). We will show that \({\mathscr {F}}\) is a scheme for \(V({\textbf{S}})\). Dependency follows immediately from the fact that \(x_i \not \in c({\textbf{t}}_{ij})\). To show consistency, take any \(i,j,k,l \in {\underline{n}}\) with \(i < j\) and \(k < l\). Condition (ii) ensures that \(x_j\) and \(x_l\) are strongly primitive in \({\textbf{t}}_{ij}^{(kl)}\) and \({\textbf{t}}_{kl}^{(ij)}\) with \(\text {occ}(x_j,{\textbf{t}}_{ij}^{(kl)}) = 2p+2q = \text {occ}(x_j,{\textbf{t}}_{kl}^{(ij)})\) and \(\text {occ}(x_l,{\textbf{t}}_{ij}^{(kl)}) = 2p+2q = \text {occ}(x_l,{\textbf{t}}_{kl}^{(ij)})\). Combining this fact and (i) allows us to obtain \({\textbf{S}} \models {\textbf{t}}_{ij}^{(kl)}\approx {\textbf{t}}_{kl}^{(ij)}\) from Lemma 4.16. Therefore \({\mathscr {F}}\) obeys the consistency condition hence \({\mathscr {F}}\) is a scheme for \(V({\textbf{S}})\).

Let G denote the graph in Fig. 2. By the construction of the terms \({\textbf{t}}_{ij}\) and  (iii), if \({\mathscr {F}}\) did come from a term \({\textbf{t}}\), then \({\textbf{t}}[x,y] = x^py^{p+q}x^q\) whenever \((x,y) \in E(G)\). Furthermore, if  \((x,y) \not \in E(G)\), then by (iv) the terms \({\textbf{t}}_{ij}\) would also require that \({\textbf{t}}[x,y] \in \{x^py^px^qy^q,y^px^py^qx^q,x^{p+q}y^{p+q},y^{p+q}x^{p+q}\}\). But by Lemma 4.17 no such term can exist. It follows that \({\mathscr {F}}\) does not come from a term and therefore \({\textbf{S}}\) is non-finitely related. \(\square \)

Example 4.19

M(abba) is non-finitely related.

Proof

Conditions (iii), and (iv) of Theorem 4.18 clearly hold for M(abba) with \(p = q = 1\). As \(M(abba) \models {\mathscr {A}}_{3,1}\) and \(3 < 2p+2q = 4\), the monoid M(abba) satisfies condition (ii) of Theorem 4.4. Condition (i) can be obtained by noticing that any evaluation of the term \(x^2y^2\) is 0 unless x or y is assigned 1. \(\square \)

We can now establish the existence of two non-finitely related semigroups whose direct product is finitely related.

Corollary 4.20

There exist two non-finitely related semigroups whose direct product is finitely related.

Proof

Let \({\textbf{S}} = M(abba)\) and \({\textbf{T}} = M(abab,aabb)\). Then \({\textbf{S}}\) and \({\textbf{T}}\) are non-finitely related by Example 4.19 and Example 4.14 respectively. But

$$\begin{aligned} V({\textbf{S}} \times {\textbf{T}}) = V(M(\{a,b\}_2), \end{aligned}$$

which is finitely related by Theorem 3.2. Therefore \({\textbf{S}} \times {\textbf{T}}\) is finitely related by Theorem 2.2. \(\square \)

To the author’s knowledge, Corollary 4.20 appears to be the first example of two non-finitely related algebras whose direct product is finitely related.

Steindl has shown that finite relatedness is not preserved under homomorphic images, subsemigroups or Rees quotients [19]. While we have given an example of two non-finitely related semigroups whose direct product is finitely related, it still remains unknown whether finite relatedness is preserved by direct products in the semigroup context.

Open Problem 4.21

Do there exist finitely related semigroups \({\textbf{S}}\) and \({\textbf{T}}\) such that \({\textbf{S}} \times {\textbf{T}}\) is not finitely related?

5 A short note on the addition of an identity to non-finitely related semigroups

The existence of a non-finitely related nilpotent monoid satisfies the question of whether finite relatedness is preserved by the addition of an identity element (see Steindl [19]). We may also ask whether we can obtain a finitely related monoid from a non-finitely related semigroup. In the world of the finite basis problem, a non-finitely based semigroup \({\textbf{S}}\) is conformable if \({\textbf{S}}^1\) is finitely based. Lee was the first to show that finite conformable semigroups exist [11]. Lee did this by showing that, under certain circumstances, the direct product of a suitable monoid with a suitable nilpotent semigroup would suffice. Lee’s paper on conformable semigroups translates, almost trivially, from the finite basis world to the finite relatedness world.

The only if-part of the next lemma follows from [4, Lemma 3.11].

Lemma 5.1

Let \({\textbf{S}}\) and \({\textbf{N}}\) be a semigroup and nilpotent semigroup respectively. Then \({\textbf{S}}\) is finitely related if and only if \({\textbf{S}} \times {\textbf{N}}\) is finitely related.

Proof

The only if-part follows from [4, Lemma 3.11]. We prove the if-statement.

Suppose \({\textbf{S}} \times {\textbf{N}}\) is finitely related with term degree k. To show that \({\textbf{S}}\) is finitely related, it suffices to show that \({\textbf{S}}^0\) is finitely related by [14, Theorem 4.1]. As \({\textbf{S}} \times {\textbf{N}}\) is finitely related, \(({\textbf{S}} \times {\textbf{N}})^0\) is finitely related by [14, Theorem 4.1]. It follows from Theorem 2.2 that \({\textbf{S}}^0 \times {\textbf{N}}\) is finitely related since \(V(({\textbf{S}} \times {\textbf{N}})^0) = V({\textbf{S}}^0 \times {\textbf{N}})\).

Let \(d \in {\mathbb {N}}\) be the term degree of \({\textbf{S}}^0 \times {\textbf{N}}\) and \({\mathscr {F}} = \{{\textbf{t}}_{ij}\mid 1 \le i < j \le n\}\) be an n-ary scheme for \({\textbf{S}}^0\) that depends on all of its variables with \(n > \text {max}(d,|S^0 |+ 1)\). Then by Lemma 2.3 we have \(|c({\textbf{t}}_{ij}) |= n - 1 \ge d \ge |S^0 \times N |> |N |\). But \({\textbf{N}}\) is  \(|N |\)-nilpotent by [4, Lemma 3.3], hence \({\mathscr {F}}\) is a scheme for \({\textbf{N}}\) and thus a scheme for \({\textbf{S}}^0 \times {\textbf{N}}\). It follows that \({\mathscr {F}}\) comes from a term  \({\textbf{t}}\) for \({\textbf{S}}^0 \times {\textbf{N}}\) since \({\textbf{S}}^0 \times {\textbf{N}}\) has term degree \(d < n\). Therefore \({\mathscr {F}}\) comes from the term \({\textbf{t}}\) for \({\textbf{S}}^0\) and so \({\textbf{S}}^0\) has term degree at most n. \(\square \)

We will now prove two theorems corresponding to Theorems 4 and 5 in Lee’s paper on conformable semigroups [11]. Following Lemma 5.1, this essentially amounts to replacing the words “finitely based" and “non-finitely based" in Lee’s paper with “finitely related" and “non-finitely related" respectively.

Theorem 5.2

Suppose that \({\textbf{S}}\) and \({\textbf{N}}\) are semigroups such that

  1. (i)

    \({\textbf{S}}^1\) is non-finitely related;

  2. (ii)

    \({\textbf{N}}\) is nilpotent;

  3. (iii)

    \({\textbf{S}}^1 \times {\textbf{N}}^1\) is finitely related.

Then \({\textbf{S}}^1 \times {\textbf{N}}\) is non-finitely related but \(({\textbf{S}}^1 \times {\textbf{N}})^1\) is finitely related.

Proof

The fact that \({\textbf{S}}^1 \times {\textbf{N}}\) is non-finitely related follows from (i), (ii) and Lemma 5.1. Following Lee’s argument, \(V(({\textbf{S}}^1 \times {\textbf{N}})^1) = V({\textbf{S}}^1 \times {\textbf{N}}^1)\) and hence \(({\textbf{S}}^1 \times {\textbf{N}})^1\) is finitely related by Theorem 2.2. \(\square \)

Theorem 5.3

Suppose that \({\textbf{S}}\) and \({\textbf{N}}\) are semigroups such that

  1. (i)

    \({\textbf{S}}^1\) is non-finitely related;

  2. (ii)

    \({\textbf{N}}\) is nilpotent;

  3. (iii)

    \({\textbf{N}}^1\) is finitely related;

  4. (iv)

    \(V({\textbf{S}}^1) \subseteq V({\textbf{N}}^1)\).

Then \({\textbf{S}}^1 \times {\textbf{N}}\) is non-finitely related but \(({\textbf{S}}^1 \times {\textbf{N}})^1\) is finitely related.

Proof

From (i), (ii), and Lemma 5.1 we have \({\textbf{S}}^1 \times {\textbf{N}}\) is non-finitely related. From (iv) and Theorem 5.2 we get \(V(({\textbf{S}}^1 \times {\textbf{N}})^1) = V({\textbf{S}}^1 \times {\textbf{N}}^1) = V({\textbf{N}}^1)\) which is finitely related by (iii). Therefore \(({\textbf{S}}^1 \times {\textbf{N}})^1\) is finitely related by Theorem 2.2. \(\square \)

Corollary 5.4

There exists a non-finitely related semigroup \({\textbf{T}}\) such that \({\textbf{T}}^1\) is finitely related.

Proof

Let \({\textbf{S}} = S(abab)\) and \({\textbf{N}} = S(abab,abba,aabb)\). Then \({\textbf{S}}\) and \({\textbf{N}}\) satisfy the conditions of Theorem 5.3 by Corollary 4.5 and Theorem 3.2. Therefore \({\textbf{T}} = {\textbf{S}}^1 \times {\textbf{N}}\) is non-finitely related, but \({\textbf{T}}^1\) is finitely related. \(\square \)

6 A finitely related non-finitely based semigroup

This paper thus far has presented results that are suggestive of an unexpected transfer of a positive and negative properties from the finite basis world to the finite relatedness of nilpotent monoids built from words. Interestingly, this (perceived) connection extends beyond the specific semigroups we have studied in this paper:

  • All groups are finitely based and finitely related [1].

  • Commutative semigroups, nilpotent semigroups and bands are both finitely based and finitely related [4, 5].

  • Clifford semigroups are finitely based and finitely related [14].

  • The 6-element Brandt monoid \({\textbf{B}}_2^1\) is non-finitely based [16] and non-finitely related [14].

  • The specific nilpotent monoid Steindl showed was non-finitely related is non-finitely based by applying a result of Jackson and Sapir [6, Lemma 4.3].

To the author’s knowledge, each semigroup that has been shown to be finitely related has also been finitely based. On the other hand the few examples of non-finitely related semigroups have been non-finitely based. There is no obvious link between the two concepts, and we do not mean to imply that such a link exists, but the coincidence up to now is interesting nonetheless. However, this coincidence does not extend to algebras beyond semigroups. For instance, one may consider a pointed group: an algebra \(\langle G, \cdot , g \rangle \) where \(\langle G, \cdot \rangle \) is a group and g is a fixed element of G considered as a nullary operation. Certainly every finite pointed group has few subpowers, so is finitely related [1], but there exists a finite non-finitely based pointed group [2].

We finish this paper by providing a semigroup example: the non-finitely based monoid M(asabtb) [7, Lemma 5.5] is finitely related.

Given a word \({\textbf{w}}\), the linear variables of \({\textbf{w}}\) are the variables \(x \in c({\textbf{w}})\) such that \(\text {occ}(x,{\textbf{w}}) = 1\). The proof of the following lemma can be found in the proof of [19, Theorem 2.3].

Lemma 6.1

( [19]) Let \({\textbf{S}}\) be a non-commutative nilpotent monoid. Let  \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\) be a scheme for \({\textbf{S}}\) with  \(n >|S |+ 1\) inducing an operation \(f:S^n \rightarrow S\). If \(Y \subset X_n\) are the linear variables of \({\mathscr {F}}\) then f[Y] is induced by the isoterm

$$\begin{aligned} x_{i_1}x_{i_2}\cdots x_{i_m}, \end{aligned}$$

where \(Y = \{x_{i_1},\dots ,x_{i_m}\}\) and \(x_{i_p} \ne x_{i_q}\) for any \(p \ne q\).

Given a scheme \({\mathscr {F}}\), we will denote the set of linear letters of \({\mathscr {F}}\) by \(\text {Lin}({\mathscr {F}})\).

Recall that given a word \({\textbf{u}}\) and a letter \(x \in c({\textbf{u}})\), the p-th occurrence of x in \({\textbf{u}}\) is denoted by \({}_px\). Let \(\text {OccSet}({\textbf{u}}) = \{{}_px \mid x \in c({\textbf{u}}), 1 \le p \le \text {occ}(x,{\textbf{u}})\}\). Given a semigroup \({\textbf{S}}\) and an identity \({\textbf{u}} \approx {\textbf{v}}\), a set \(Y \subset \text {OccSet}({\textbf{u}})\) is unstable in \({\textbf{u}}\approx {\textbf{v}}\) if for any two elements \({}_px,{}_qy \in Y\), the order in which \({}_px\) and \({}_qy\) occur in \({\textbf{v}}\) differs from the order they occur in \({\textbf{u}}\). We will say Y is unstable in \({\textbf{u}}\) if there exists a word \({\textbf{v}}\) such that \({\textbf{S}} \models {\textbf{u}} \approx {\textbf{v}}\) and Y is unstable in \({\textbf{u}} \approx {\textbf{v}}\). A set \(Y \subset \text {OccSet}({\textbf{u}})\) is stable if it is not unstable.

Proposition 6.2

Let \({\textbf{S}} = M(asabtb)\) and \({\textbf{u}}\) be a word over an alphabet A with \({\textbf{u}}[Lin({\textbf{u}})] = t_1t_2\cdots t_m\). Suppose that \({\textbf{u}}[x,t_{i-1},t_{i}] = t_{i-1}xt_ix\) and \({\textbf{u}}[y,t_{i-1},t_{i}] = t_{i-1}yt_iy\). Then \(\{{}_1x,{}_1y\}\) is stable in \({\textbf{u}}\) if and only if there exists a variable \(z \in c({\textbf{u}})\) such that \({\textbf{u}}[z,t_{i-1},t_{i}] = zt_{i-1}zt_i\) and xzy or yzx is a subword of \({\textbf{u}}[x,y,z,t_i,t_{i-1}]\).

Similarly, if \({\textbf{u}}[x,t_{i-1},{\textbf{t}}_{i}] = xt_{i-1}xt_i\) and \({\textbf{u}}[y,t_{i-1},t_{i}] = yt_{i-1}yt_i\) then \(\{{}_2x,{}_2y\}\) is stable in \({\textbf{u}}\) if and only if there exists \(z \in c({\textbf{u}})\) such that \({\textbf{u}}[z,t_{i-1},t_{i}] = t_{i-1}zt_iz\) and xzy or yzx is a subword of \({\textbf{u}}[x,y,z,t_i,t_{i-1}]\).

Proof

The backwards implication follows immediately from the fact that \(zt_{i-1}xzt_ix\) and \(zt_{i-1}zxt_ix\) are isoterms for \({\textbf{S}}\).

To establish the forwards implication, we show the contrapositive is true. Without loss of generality, suppose \({}_1x\) occurs before \({}_1y\) in \({\textbf{u}}\). By the assumption of the contrapositive, if an instance of \(z \in C({\textbf{u}})\) occurs between \({}_1x\) and \({}_1y\) then z is primitive in \({\textbf{u}}\) or \({}_1z\) occurs between \({}_1x\) and \({}_1y\) and \({}_2z\) occurs after \(t_i\).

It is enough to show that if \({}_1z\) occurs immediately after \({}_1x\) then \(\{{}_1x,{}_1z\}\) is unstable in \({\textbf{u}}\). Clearly \(\{{}_1x,{}_1z\}\) is unstable if z is primitive. Suppose z is not primitive in \({\textbf{u}}\). Let \({\textbf{v}}\) be the word obtained from \({\textbf{u}}\) by swapping \({}_1x{}_1z\) with \({}_1z{}_1x\). Consider any substitution \(\Theta : A \rightarrow S\). Since neither x nor z are primitive or linear, x and z are 2-occurring. It follows that if neither \(\Theta (x) = 1\) nor \(\Theta (z) = 1\) then \(\Theta ({\textbf{u}}) = 0 = \Theta ({\textbf{v}})\). Clearly if \(\Theta (x) = 1\) or \(\Theta (z) = 1\) then \(\Theta ({\textbf{u}}) = \Theta ({\textbf{v}})\). Therefore \({\textbf{S}} \models {\textbf{u}} \approx {\textbf{v}}\) and hence \(\{{}_1x,{}_1z\}\) is unstable in \({\textbf{u}}\).

The case for the stability of \(\{{}_2x,{}_2y\}\) holds by a symmetric argument. \(\square \)

Following Proposition 6.2, we will construct a normal form for any word \({\textbf{v}}\) such that \(M(asabtb) \models {\textbf{u}} \approx {\textbf{v}}\). To this end, assume \({\textbf{u}}[\text {Lin}({\textbf{u}})] = t_1\cdots t_m\) and fix i with \(2 \le i \le m\). Let

$$\begin{aligned} {}_1X = \{x \in A \mid {\textbf{u}}[t_{i-1},t_i,x] = t_{i-1}xt_ix\}, \end{aligned}$$

and

$$\begin{aligned} {}_2X = \{x \in A \mid {\textbf{u}}[t_{i-1},t_i,x] = xt_{i-1}xt_i\}. \end{aligned}$$

For each \(x \in {}_1X\) let

$$\begin{aligned} {\mathcal {A}}_x = \{y \in {}_1X \mid \{{}_1x,{}_1y\} \text { is unstable in } {\textbf{u}}\} \cup \{x\}, \end{aligned}$$

and for each \(x \in {}_2X\)

$$\begin{aligned} {\mathcal {B}}_x = \{y \in {}_2X \mid \{{}_2x,{}_2y\} \text { is unstable in } {\textbf{u}}\} \cup \{x\}. \end{aligned}$$

Using these sets, we obtain the following proposition.

Proposition 6.3

The sets \(\{{\mathcal {A}}_x \mid x \in {}_1X\}\) and \(\{{\mathcal {B}}_x \mid x \in {}_2X\}\) partition \({}_1X\) and \({}_2X\) respectively.

Proof

We prove the case for \({}_1X\) only as the case for \({}_2X\) follows from a symmetric argument.

Note, it suffices to show that for all \(y \in {\mathcal {A}}_x\) we have \({\mathcal {A}}_y = {\mathcal {A}}_x\). Suppose \(y,z \in {\mathcal {A}}_x\). By Proposition 6.2, for all \(x' \in {}_2X\) neither \(xx'y\) nor \(yx'x\) are subwords of \({\textbf{u}}[t_{i-1},t_i,x',x,y]\). Similarly, for all \(x' \in {}_2X\) neither \(xx'z\) nor \(zx'x\) are subwords of \({\textbf{u}}[t_{i-1},t_i,x',x,z]\). It follows that for all \(x' \in {}_2X\) neither \(yx'z\) nor \(zx'y\) are subwords of \({\textbf{u}}[t_{i-1},t_i,x',y,z]\) and hence \(\{{}_1y,{}_1z\}\) is unstable in \({\textbf{u}}\). Therefore \({\mathcal {A}}_x \subseteq {\mathcal {A}}_y\). The reverse inclusion is similar. The proposition is proved. \(\square \)

Construction 6.4

By Proposition 6.2 either the first occurrence of every variable in \({\mathcal {A}}_x\) occurs before the second occurrence of every variable in \({\mathcal {B}}_x\) or vice-versa. This, with Proposition 6.3, gives a natural linear order on the blocks \(\{{\mathcal {A}}_x \mid x \in {}_1X\}\) and \(\{{\mathcal {B}}_x \mid x \in {}_2X\}\) of \({}_1X \cup {}_2X\). Without loss of generality, suppose this linear order is given by

$$\begin{aligned} {\mathcal {A}}_{x_1}< {\mathcal {B}}_{x_2}< {\mathcal {A}}_{x_3}< {\mathcal {B}}_{x_4}< \cdots< {\mathcal {A}}_{x_{n-1}} < {\mathcal {B}}_{x_n}. \end{aligned}$$

For each \({\mathcal {A}}_x\) (resp. \({\mathcal {B}}_x\)), choose any word \({\textbf{v}}_{x}\) such that \(c({\textbf{v}}_{x}) = {\mathcal {A}}_x\) (resp. \(c({\textbf{v}}_{x}) = {B}_x\)) and \(\text {occ}(y,{\textbf{v}}_{x}) = 1\) for all \(y \in {\mathcal {A}}_x\) (resp. \(y \in {\mathcal {B}}_x\)). Set \({\textbf{w}}_{i} = {\textbf{v}}_{x_1}{\textbf{v}}_{x_2}\cdots {\textbf{v}}_{x_n}\).

Finally, let \({}_1X_0 = \{x \in A \mid {\textbf{u}}[t_1,x] = xt_1x\}\) and \({}_2X_m = \{x \in A \mid {\textbf{u}}[t_m,x] = xt_mx\}\). Produce the word \({\textbf{w}}_0\) by choosing any word with \(c({\textbf{w}}_0) = {}_1X_0\) and \(\text {occ}(x,{\textbf{w}}_{0}) = 1\). Produce \({\textbf{w}}_{m+1}\) analogously.

From Proposition 6.2 and Construction 6.4 we gain the following corollary.

Corollary 6.5

Let \({\textbf{S}} = M(asabtb)\) and \({\textbf{u}}\) be a word with \({\textbf{u}}[Lin({\textbf{u}})] = t_1t_2\cdots t_m\). Produce the words \({\textbf{w}}_0,{\textbf{w}}_1,\cdots ,{\textbf{w}}_{m+1}\) as in Construction 6.4. Let

$$\begin{aligned} {\textbf{v}} = {\textbf{w}}_0t_1{\textbf{w}}_1t_2\cdots {\textbf{w}}_m t_m {\textbf{w}}_{m+1} \cdot \prod _{x \in \text {Prim}({\textbf{u}})} x^2. \end{aligned}$$

Then \({\textbf{S}} \models {\textbf{u}} \approx {\textbf{v}}\).

Lemma 6.6

Let \({\textbf{S}} = M(asabtb)\) and \({\textbf{u}}, {\textbf{v}}\) be words over A with \(c({\textbf{u}}) = c({\textbf{v}})\). Then \({\textbf{S}} \models {\textbf{u}} \approx {\textbf{v}}\) if and only if \({\textbf{S}} \models {\textbf{u}}[X] \approx {\textbf{v}}[X]\) for every \(X \subseteq A\) with \(|X |\le 5\).

Proof

The only if-direction is trivial. To show the if-statement suppose \({\textbf{S}} \models {\textbf{u}}[X] \approx {\textbf{v}}[X]\) for every \(X \subseteq A\) with \(|X |\le 5\). Then any letter \(t_i\) that is linear in \({\textbf{u}}\) must be linear in \({\textbf{v}}\). Moreover, \({\textbf{u}}[\text {Lin}({\textbf{u}})] = {\textbf{v}}[\text {Lin}({\textbf{v}})]\) can be ascertained from the fact that \({\textbf{u}}[t_i,t_j] \approx {\textbf{v}}[t_i,t_j]\) for any linear letters \(t_i\) and \(t_j\).

Notice that a letter x is primitive in a word for \({\textbf{S}}\) if and only if it occurs more than twice or there are no linear letters between two occurrences of x. We can then establish \(\text {Prim}({\textbf{u}}) = \text {Prim}({\textbf{v}})\) by considering \({\textbf{u}}[t_i,x]\) and \({\textbf{v}}[t_i,x]\) for every letter x and every linear letter \(t_i\).

Now, since \({\textbf{S}} \models {\textbf{u}}[X] \approx {\textbf{v}}[X]\) for any set of letters X with \(|X |= 5\), \(\{{}_px,{}_py\}\) is stable in \({\textbf{u}}\) if and only if \(\{{}_px,{}_py\}\) is stable in \({\textbf{v}}\) by Proposition 6.2 for \(p \in \{1,2\}\).

Consider Construction 6.4 and the normal form in Corollary 6.5. Not only do the sets \(_{1}X\) and \(_{2}X\) have to coincide for \({\textbf{u}}\) and \({\textbf{v}}\), by the above facts, but since the stability of \(\{{}_px,{}_py\}\) is consistent across \({\textbf{u}}\) and \({\textbf{v}}\), the sets \({\mathcal {A}}_x\) and \({\mathcal {B}}_x\) have to coincide for \({\textbf{u}}\) and \({\textbf{v}}\). Moreover, the natural order on \({}_1X \cup {}_2X\) in \({\textbf{u}}\) must be the same order on \({}_1X \cup {}_2X\) in \({\textbf{v}}\). Therefore we have some word \({\textbf{w}}_0t_1{\textbf{w}}_1t_2\cdots {\textbf{w}}_m t_m {\textbf{w}}_{m+1}\) such that

$$\begin{aligned} {\textbf{u}} \approx {\textbf{w}}_0t_1{\textbf{w}}_1t_2\cdots {\textbf{w}}_m t_m {\textbf{w}}_{m+1}\cdot \prod _{x \in \text {Prim}({\textbf{u}})} x^2 \approx {\textbf{v}} \end{aligned}$$

by Corollary 6.5. Thus the lemma is proved. \(\square \)

Given an n-ary scheme \({\mathscr {F}}\) for a semigroup \({\textbf{S}}\), let \(\text {Prim}({\mathscr {F}}) = \{x_i \in X_n \mid \forall k,l \in {\underline{n}}{\setminus }\{i\} \ x_i \in \text {Prim}({\textbf{t}}_{kl})\}\).

Theorem 6.7

The nilpotent monoid M(asabtb) is finitely related.

Proof

Let \({\textbf{S}} = M(asabtb)\) and \({\mathscr {F}} = \{{\textbf{t}}_{ij} \mid 1 \le i < j \le n\}\) be a scheme for \({\textbf{S}}\) with  \(n >|S |+ 1\) inducing the operation \(f:S^n \rightarrow S\). Without loss of generality, assume \(\{1,\dots ,m\}\) indexes \(\text {Lin}({\mathscr {F}})\) and \(f[x_1,\dots ,x_m]\) is induced by  \(x_1x_2\cdots x_m\) (see Lemma 6.1).

For any subset \(Y \subset X_n\) with \(|Y |\le 21\) the operation f[Y] is induced by a term since \(n > |S |+ 1 \ge 22\). We may then construct \({\textbf{v}}\) as in Corollary 6.5 using the terms f[Y] for \(|Y |\le 5\). Perform this construction and let

$$\begin{aligned} {\textbf{v}} = {\textbf{w}}_0x_1{\textbf{w}}_1x_2\cdots {\textbf{w}}_m{\textbf{x}}_m{\textbf{w}}_{m+1} \cdot \prod _{x \in \text {Prim}({\mathscr {F}})}x^2. \end{aligned}$$

We claim that \({\mathscr {F}}\) comes from the term \({\textbf{v}}\). Take any subset \(Y \subset X_n\) with \(|Y |= 5\) and let \({\textbf{s}}\) be the term inducing f[Y]. It is easy to verify \({\textbf{S}} \models {\textbf{s}} \approx {\textbf{v}}[Y]\) by considering the following facts:

  • \({\textbf{s}}[x_{m_1},x_{m_2}] = {\textbf{v}}[x_{m_1},x_{m_2}]\) for all linear variables \(x_{m_1},x_{m_2}\);

  • by the construction of \({\textbf{v}}\), \({\textbf{s}}[x_{m_1},x_{m_2},x_i,x_j,x_k] \approx {\textbf{v}}[x_{m_1},x_{m_2},x_i,x_j,x_k]\) where \(x_{m_1},x_{m_2}\) are linear and \(x_i,x_j,x_k\) are 2-occurring;

  • \({\textbf{S}} \models x^2y^2 \approx y^2x^2 \approx (xy)^2 \approx (yx)^2\).

By deletion, we must also have that if \(Y \subset X_n\) with \(|Y |< 5\), then \({\textbf{S}} \models {\textbf{s}} \approx {\textbf{v}}[Y]\). It follows that for any \(i,j \in {\underline{n}}\) and any subset \(Y \subset X_n\) with \(|Y |\le 5\), we obtain \({\textbf{S}} \models {\textbf{t}}_{ij}[Y] \approx {\textbf{v}}^{(ij)}[Y]\). Therefore \({\textbf{S}} \models {\textbf{t}}_{ij} \approx {\textbf{v}}^{(ij)}\) by Lemma 6.6 and the claim is proved. \(\square \)