1 Introduction

Word equations (equations over free monoids), or string equations, are of fundamental importance in mathematics and theoretical computer science, for example in program verification. The area is actively studied from both theoretical and practical viewpoints (see the recent papers [2, 3, 6, 12] and the references therein).

The notion of compactness is foundational in numerous areas of mathematics. In semigroup theory, the compactness property takes the following form: a semigroup is said to possess the compactness property if every system of equations with finitely many variables has an equivalent finite subsystem of equations. That is to say, any solution to the finite subsystem satisfies all the equations of the original system.

Famously, the free monoid \(\varSigma ^*\) possesses the so-called compactness property, independently proved in [1] and [4]. The latter work also shows that free groups possess the compactness property. In [5] it was shown that all commutative semigroups possess the compactness property. Nevertheless, not all semigroups possess the compactness property. Examples include the monoid of finite languages [9], the so-called bicyclic monoid, and the Baumslag–Solitar group [5]. An interesting non-example is given in [20]: Shevlyakov gives a semigroup over which each consistent system of equations (i.e., each system that has a solution) has an equivalent finite subsystem, yet the semigroup does not possess the compactness property. Namely, there is an inconsistent system of equations such that each of its finite subsystems is consistent.

A system of equations is called independent if it is not equivalent to any of its proper subsystems. Now if a semigroup possesses the compactness property, then any independent system of equations is finite. The sizes of independent systems of equations in semigroups have been considered previously, e.g., in the paper [7]. See also [16], and references therein, concerning the free semigroup.

In this note we consider equations over the k-binomial monoids. Two words u, \(v \in \varSigma ^*\) are called k-binomially equivalent, in symbols \(u \equiv _{k}v\), if each word e of length at most k occurs as a subword, or scattered factor, equally many times in both u and v. The notion was introduced in [18], and has attracted a lot of interest in contemporary research in combinatorics on words [10, 11, 17]. The k-binomial equivalence actually defines a congruence on the free monoid \(\varSigma ^*\) [18]. Hence \(\varSigma ^*/{\equiv _{k}}\) defines a monoid. Our aim is to study basic equations over this monoid. Our main motivation is to discover algebraic properties of these k-binomial monoids. In particular, we consider when two words commute and when they are conjugate in the k-binomial monoid. These are well-understood equations over \(\varSigma ^*\) [14]: for two words \(u,v\in \varSigma ^*\) we have \(uv=vu\) if and only if there exists \(r\in \varSigma ^*\) such that \(u,v\in r^*\). Thus the set \({{\,\mathrm{Sol}\,}}(xy=yx)\) of solutions to the equation \(xy = yx\) in \(\varSigma ^*\) equals \(\{\alpha : x\mapsto r^i,y\mapsto r^j:r\in \varSigma ^*,\ i,j\geqslant 0\}\). Similarly, for words \(x,y,z\in \varSigma ^*\) we have \(xz=zy\) if and only if there exist \(p,q\in \varSigma ^*\) such that \(x=pq\), \(y=qp\), and \(z\in (pq)^*p\) (or \(x=y=\varepsilon \) and z is arbitrary). In the free monoid, we thus have \({{\,\mathrm{Sol}\,}}(xz=zy) = \{(x,y,z)\mapsto (pq,qp,(pq)^rp):p,q\in \varSigma ^*,r\in \mathbb {N}\}\). The results we obtain in this paper differ quite a bit from these characterisations.

The paper is organised as follows. In Sect. 2 we introduce some basic properties of the k-binomial equivalence. In Sect. 3 we study the equation \(xy \equiv _{k}yx\), i.e., when do two words commute in the k-binomial monoid. We characterise the solutions in case \(k=2\), and give partial results for \(k\geqslant 3\). In Sect. 4 we study the equation \(xz \equiv _{k}zy\), i.e., when are two words conjugate in the k-binomial monoid. In Sect. 5 we show that the k-binomial monoids possess the compactness property. In fact, we observe that this already follows from results in the literature. We give another proof of this fact and, furthermore, give a bound on the number of equations in an independent system of equations when k and the number of variables are fixed. We conclude the paper with some open problems in Sect. 6.

This paper is based on results appearing in the author’s PhD thesis [21].

2 Preliminaries and Notation

We recall some notation and basic terminology from the literature of combinatorics on words. We refer the reader to [13, 14] for more on the subject.

For a finite alphabet \(\varSigma \) we let \(\varSigma ^*\) denote the set of finite words over \(\varSigma \). We use \(\varepsilon \) to denote the empty word. We let \(\varSigma ^+\) denote the set of non-empty words. For a word \(w \in \varSigma ^*\), |w| denotes the length of w. By convention we set \(|\varepsilon | = 0\). For a letter \(a \in \varSigma \), we use \(|w|_a\) to denote the number of occurrences of a in w. The Parikh vector of w is defined by \(\varPsi (w):= (|w|_a)_{a\in \varSigma }\). A word \(u = u_1 \cdots u_t\), \(u_i \in \varSigma \), is called a subword of \(w = a_1\cdots a_n\) if \(u = a_{i_1}\cdots a_{i_t}\) for some indices \(1 \leqslant i_1< \ldots < i_t \leqslant n\). We let \({\textstyle {\left( {\begin{array}{c}w\\ u\end{array}}\right) }}\) denote the number of occurrences of u in w as a subword. Basic properties of binomial coefficients \({\textstyle {\left( {\begin{array}{c}u\\ v\end{array}}\right) }}\) are presented in [13, Chapter 6]. We repeat the main properties here. Define, for \(a,b\in \varSigma \), \(\delta _{a,b} = 1\) if \(a=b\), otherwise \(\delta _{a,b} = 0\). For all \(p,q\in \mathbb {N}\), \(u,v\in \varSigma ^*\), and \(a,b\in \varSigma \) we have

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}a^p\\ a^q\end{array}}\right) }} = \textstyle \left( {\begin{array}{c}p\\ q\end{array}}\right) ;\quad {\textstyle {\left( {\begin{array}{c}u\\ \varepsilon \end{array}}\right) }} = 1;\quad |u|<|v| \text { implies }{\textstyle {\left( {\begin{array}{c}u\\ v\end{array}}\right) }} = 0;\quad {\textstyle {\left( {\begin{array}{c}ua\\ vb\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}u\\ vb\end{array}}\right) }} + \delta _{a,b}{\textstyle {\left( {\begin{array}{c}u\\ v\end{array}}\right) }}. \end{aligned}$$

The last three relations completely determine the binomial coefficient \({\textstyle {\left( {\begin{array}{c}u\\ v\end{array}}\right) }}\) for all \(u,v\in \varSigma ^*\).
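These relations translate directly into a dynamic-programming computation of the binomial coefficient of two words. A minimal sketch in Python (the function name `subword_binomial` is ours, not from the literature):

```python
def subword_binomial(u: str, v: str) -> int:
    """Number of occurrences of v as a subword (scattered factor) of u.

    Implements the recurrence binom(ua, vb) = binom(u, vb) + [a = b] binom(u, v)
    together with binom(u, eps) = 1 and binom(u, v) = 0 when |u| < |v|.
    """
    # dp[j] holds binom(current prefix of u, v[:j]); initially the prefix is eps.
    dp = [1] + [0] * len(v)
    for a in u:
        # Traverse j downwards so dp[j - 1] still refers to the previous prefix of u.
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]
```

For instance, `subword_binomial("a" * 4, "a" * 2)` returns 6, matching the first relation with \(\left( {\begin{array}{c}4\\ 2\end{array}}\right) = 6\).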

A mapping \(\varphi :\varDelta ^*\rightarrow \varSigma ^*\) between free monoids is called a morphism if \(\varphi (uv) = \varphi (u)\varphi (v)\) for all \(u,v\in \varDelta ^*\). Let \(\varXi \) be a finite non-empty set of variables and S a semigroup. An element \((u,v) \in (\varXi \cup \varSigma )^+ \times (\varXi \cup \varSigma )^+\) is called an equation over S with variables \(\varXi \). A solution to an equation \((u,v)\) over S with variables \(\varXi \) is a morphism \(\alpha : \varXi \rightarrow S\) such that \(\alpha (u) = \alpha (v)\). Here we extend \(\alpha \) to \(\varXi \cup S\) so that \(\alpha \) acts as the identity morphism on S. An equation \(e=(u,v)\) is often denoted by \(e: u=v\). The set of solutions to the equation e is denoted by \({{\,\mathrm{Sol}\,}}(e)\).

A set \(E\subseteq \varXi ^+\times \varXi ^+\) is called a system of equations. The solutions to E are defined as

$$\begin{aligned} {{\,\mathrm{Sol}\,}}(E) = \bigcap _{e\in E}{{\,\mathrm{Sol}\,}}(e). \end{aligned}$$

We say that two systems \(E_1\) and \(E_2\) of equations are equivalent if \({{\,\mathrm{Sol}\,}}(E_1) = {{\,\mathrm{Sol}\,}}(E_2)\). Further, we say that a system of equations E is independent if E is not equivalent to any of its proper subsystems \(E'\subsetneq E\).

Let us turn to the main notion of the paper. Two words \(u,v \in \varSigma ^*\) are k-binomially equivalent if \({\textstyle {\left( {\begin{array}{c}u\\ e\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}v\\ e\end{array}}\right) }}\) for all \(e \in \varSigma ^*\) with \(|e| \leqslant k\). As noted in the introduction, the k-binomial monoid is defined as the quotient monoid \(\varSigma ^*/{\equiv _{k}}\). We recall a basic result on k-binomial equivalence from [18].

Proposition 2.1

Let \(u,v,e\in \varSigma ^*\) and \(a\in \varSigma \).

  • We have \({\textstyle {\left( {\begin{array}{c}uv\\ e\end{array}}\right) }} = \sum _{e_1e_2 = e}{\textstyle {\left( {\begin{array}{c}u\\ e_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}v\\ e_2\end{array}}\right) }}.\)

  • Let \(\ell \geqslant 0\). We have \({\textstyle {\left( {\begin{array}{c}u\\ a^{\ell }\end{array}}\right) }} = \left( {\begin{array}{c}|u|_a\\ \ell \end{array}}\right) \) and \(\sum _{|v|=\ell }{\textstyle {\left( {\begin{array}{c}u\\ v\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}|u|\\ \ell \end{array}}\right) }}\).

The second point can be refined further:

Lemma 2.2

Let \(u,v\in \varSigma ^*\). Then \(\sum _{v'\equiv _{1} v}{\textstyle {\left( {\begin{array}{c}u\\ v'\end{array}}\right) }} = \prod _{a\in \varSigma }\left( {\begin{array}{c}|u|_a\\ |v|_a\end{array}}\right) \), where the summation runs through all words \(v'\) for which \(\varPsi (v') = \varPsi (v)\).

Proof

We count the number of choices of subwords \(v'\) of u having \(|v'|_a = |v|_a\) for each \(a\in \varSigma \). For each \(a\in \varSigma \), we may choose the occurrences of a in \(\left( {\begin{array}{c}|u|_a\\ |v|_a\end{array}}\right) \) ways. Since the choices of distinct letters are independent, the total number of choices equals \(\prod _{a\in \varSigma }\left( {\begin{array}{c}|u|_a\\ |v|_a\end{array}}\right) \). Each of these choices corresponds to an occurrence of a subword \(v'\equiv _1 v\) of u.    \(\square \)
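Lemma 2.2 is easy to test numerically by summing over the distinct anagrams of v. A sketch (helper names are ours; brute force, suitable for small words only):

```python
from itertools import permutations
from math import comb, prod

def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u (DP over the recurrence).
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

def anagram_class_sum(u, v):
    # Left-hand side of Lemma 2.2: sum of binom(u, v') over distinct v' with Psi(v') = Psi(v).
    return sum(subword_binomial(u, "".join(p)) for p in set(permutations(v)))

def parikh_product(u, v, alphabet):
    # Right-hand side of Lemma 2.2: product of ordinary binomial coefficients per letter.
    return prod(comb(u.count(a), v.count(a)) for a in alphabet)
```

For example, with \(u = abaabb\) and \(v = ab\), both sides evaluate to \(\left( {\begin{array}{c}3\\ 1\end{array}}\right) \left( {\begin{array}{c}3\\ 1\end{array}}\right) = 9\).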

Let \(\lhd \) be a total order on \(\varSigma \).

Corollary 2.3

Given two words \(x,y\in \varSigma ^*\), we have that \(x\equiv _{2} y\) if and only if \(x\equiv _{1} y\) and \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }}\) for all pairs of letters \(a,b\in \varSigma \) with \(a\lhd b\).

Proof

Clearly \(x \equiv _{2} y\) implies the weaker condition. Now \(x\equiv _{1} y\) implies that \({\textstyle {\left( {\begin{array}{c}x\\ aa\end{array}}\right) }} = \left( {\begin{array}{c}|x|_a\\ 2\end{array}}\right) = \left( {\begin{array}{c}|y|_a\\ 2\end{array}}\right) = {\textstyle {\left( {\begin{array}{c}y\\ aa\end{array}}\right) }}\) for each \(a\in \varSigma \), and Lemma 2.2 implies that, for \(a\lhd b\),

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}x\\ ba\end{array}}\right) }} = |x|_a|x|_b - {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} = |y|_a|y|_b - {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}y\\ ba\end{array}}\right) }}. \end{aligned}$$

   \(\square \)
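The defining condition of \(\equiv _{k}\) can also be checked mechanically, by comparing coefficients over all words of length at most k. A brute-force sketch in Python (exponential in k, intended only for small examples; function names are ours):

```python
from itertools import product

def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u (DP over the recurrence).
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

def k_binomially_equivalent(u, v, k, alphabet):
    """Check u =_k v by comparing binom(., e) for every e with 1 <= |e| <= k."""
    return all(
        subword_binomial(u, e) == subword_binomial(v, e)
        for length in range(1, k + 1)
        for e in map("".join, product(alphabet, repeat=length))
    )
```

For instance, ab and ba are 1-binomially but not 2-binomially equivalent.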

3 On Commutativity in the k-Binomial Monoids

We first study when two words commute in the k-binomial monoid. Let us begin with a straightforward characterisation of commutativity in the 2-binomial monoids.

Proposition 3.1

For all \(x,y\in \varSigma ^*\), \(xy\equiv _{2} yx\) if and only if \(\varPsi (x)\) and \(\varPsi (y)\) are collinear.

Proof

Notice first that \(xy\equiv _{2} yx\) is equivalent to

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}x\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ b\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }}&= {\textstyle {\left( {\begin{array}{c}xy\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}yx\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}y\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}x\\ b\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} \end{aligned}$$

for all \(a,b \in \varSigma \). This, in turn, is equivalent to

$$\begin{aligned} |x|_a|y|_b = |y|_a|x|_b,\quad a,b\in \varSigma . \end{aligned}$$
(1)

Assume now that \(xy\equiv _{2} yx\), i.e., (1) holds. Summing both sides over \(b\in \varSigma \) yields \(|x|_a |y| = |y|_a |x|\) for all \(a\in \varSigma \), which is equivalent to \(|y|\varPsi (x) = |x|\varPsi (y)\), and so the vectors are collinear.

Assume for the converse that \(\varPsi (x)\) and \(\varPsi (y)\) are collinear. If x or y is the empty word, there is nothing to prove. Otherwise \(\varPsi (x) = \alpha \varPsi (y)\) for some positive \(\alpha \in \mathbb {Q}\), and the property \(|x|_a = \alpha |y|_a\) for all \(a\in \varSigma \) implies \(|x|_a|y|_b = \alpha |y|_a \frac{1}{\alpha }|x|_b = |y|_a|x|_b\) for all \(a,b\in \varSigma \), so (1) holds. Therefore \(xy\equiv _2 yx\).    \(\square \)
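Proposition 3.1 reduces 2-binomial commutation to a collinearity test on Parikh vectors. A sketch comparing both sides of the characterisation (the right side by brute force; function names are ours):

```python
from itertools import product

def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u.
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

def parikh(w, alphabet):
    return tuple(w.count(a) for a in alphabet)

def collinear(p, q):
    # Integer vectors are collinear iff all 2x2 minors vanish.
    n = len(p)
    return all(p[i] * q[j] == p[j] * q[i] for i in range(n) for j in range(i + 1, n))

def commute_2_binomially(x, y, alphabet):
    # Direct check of xy =_2 yx over all words of length at most 2.
    words = ["".join(e) for l in (1, 2) for e in product(alphabet, repeat=l)]
    return all(subword_binomial(x + y, e) == subword_binomial(y + x, e) for e in words)
```

On a handful of test pairs the two predicates agree, as the proposition asserts.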

It is immediate that if \(x \equiv _{k}r^{m}\) and \(y \equiv _{k}r^{n}\) for some \(r \in \varSigma ^*\), \(m,n \in \mathbb {N}\), then x and y commute in the k-binomial monoid. It is straightforward to see that the above proposition can be restated as follows: in the case \(k=2\), the elements x and y commute in \(\varSigma ^*/{\equiv _{k}}\) if and only if there exist a word \(r\in \varSigma ^*\) and non-negative integers m and n such that \(x\equiv _{k-1} r^m\) and \(y\equiv _{k-1} r^n\) (we call such an r a common \((k-1)\)-binomial root). It is natural to ask whether such a characterisation holds for larger k. Unfortunately, only one direction generalises.

Let us first give a counterexample of 3-binomially commuting words with no common 2-binomial root.

Example 3.2

Let \(x = aaabbb\) and \(y=aaabbabb\). As their Parikh vectors are collinear, we have \(xy \equiv _{2} yx\) by Proposition 3.1. One can further check that \({\textstyle {\left( {\begin{array}{c}xy\\ e\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}yx\\ e\end{array}}\right) }}\) for all \(e \in \varSigma ^3\):

$$\begin{aligned} aaa; 35,\ aab; 81,\ aba; 48,\ abb; 82,\ baa; 18,\ bab; 46,\ bba; 19,\ bbb; 35. \end{aligned}$$

Now \(\gcd (|x|,|y|) = 2\). Since \(x\equiv _{1} r^m\) forces \(|x| = m|r|\), a possible common 2-binomial root r must have length dividing 2. Clearly it cannot be a single letter, so it has length 2 and contains both a and b. Hence \(r = ab\) or \(r = ba\). Now \({\textstyle {\left( {\begin{array}{c}x\\ ba\end{array}}\right) }} = 0\), while \({\textstyle {\left( {\begin{array}{c}r^3\\ ba\end{array}}\right) }} > 0\). Therefore, x and y do not have a common 2-binomial root.
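The counts in Example 3.2 can be reproduced mechanically. A sketch (reusing the subword-counting helper from above; the variable names are ours):

```python
from itertools import product

def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u.
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

# The words of Example 3.2.
x, y = "aaabbb", "aaabbabb"

# Coefficients of xy over all length-3 words; the same values arise for yx.
triples = {e: subword_binomial(x + y, e)
           for e in map("".join, product("ab", repeat=3))}
```

The resulting dictionary matches the table above, and repeating the computation on y + x yields identical values.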

The other implication of Proposition 3.1 does generalise to arbitrary \(k \geqslant 2\):

Proposition 3.3

Let \(k\geqslant 2\) be an integer, \(r\in \varSigma ^*\), and \(m,n\geqslant 0\). For any \(x\equiv _{k-1} r^m\) and \(y\equiv _{k-1} r^n\) we have \(xy \equiv _{k}yx\).

Proof

For all \(a\in \varSigma \), we clearly have \(|xy|_a = |yx|_a\). Further, for each word \(e\in \varSigma ^{\leqslant k}\) of length at least two,

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}xy\\ e\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}x\\ e\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}y\\ e\end{array}}\right) }}&= \sum _{\begin{array}{c} \scriptstyle e_1e_2=e\\ e_1,e_2\in \varSigma ^+ \end{array}}{\textstyle {\left( {\begin{array}{c}x\\ e_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ e_2\end{array}}\right) }} = \sum _{\begin{array}{c} \scriptstyle e_1e_2=e\\ e_1,e_2\in \varSigma ^+ \end{array}}{\textstyle {\left( {\begin{array}{c}r^m\\ e_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}r^n\\ e_2\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}r^{m+n}\\ e\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}r^m\\ e\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}r^n\\ e\end{array}}\right) }}\\&= \sum _{\begin{array}{c} \scriptstyle e_1e_2=e\\ e_1,e_2\in \varSigma ^+ \end{array}}{\textstyle {\left( {\begin{array}{c}r^n\\ e_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}r^m\\ e_2\end{array}}\right) }} = \sum _{\begin{array}{c} \scriptstyle e_1e_2=e\\ e_1,e_2\in \varSigma ^+ \end{array}}{\textstyle {\left( {\begin{array}{c}y\\ e_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}x\\ e_2\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}yx\\ e\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}y\\ e\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}x\\ e\end{array}}\right) }}, \end{aligned}$$

where the second and fifth equalities above follow from \(x\equiv _{k-1} r^m\) and \(y\equiv _{k-1} r^n\) and the observation that \(e_1,e_2\in \varSigma ^{\leqslant k-1}\) in the summations.   \(\Box \)

Example 3.4

Let \(x = aba\) and \(y = baaaab\). Now \(y\equiv _{2} x^2\), by simply counting the occurrences of subwords of length at most two:

$$\begin{aligned} a;4,\ b; 2,\ aa;6,\ ab;4,\ ba;4,\ bb;1. \end{aligned}$$

By the above proposition we have \(xy\equiv _{3}yx\).

Let us end this section on a positive note: we characterise k-binomial commutation among words of equal length.

Theorem 3.5

Let \(x,y\in \varSigma ^*\) with \(|x| = |y|\). Then \(xy\equiv _{k}yx\) if and only if \(x\equiv _{k-1} y\).

Proof

Note that \(x \equiv _{k-1} y\) implies \(xy \equiv _{k}yx\) by Proposition 3.3. We shall prove the converse by induction on k. Note that the case of \(k=2\) follows from applying Proposition 3.1 with \(|x|=|y|\). Assume that the claim holds for some \(k\geqslant 2\) and suppose \(xy\equiv _{k+1}yx\). It follows that \(xy \equiv _{k}yx\) so that \(x\equiv _{k-1} y\) by induction. Let then \(a,b\in \varSigma \) and \(e\in \varSigma ^{k-1}\). We have

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}xy\\ aeb\end{array}}\right) }}&= {\textstyle {\left( {\begin{array}{c}x\\ aeb\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}y\\ aeb\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}x\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ eb\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}x\\ ae\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ b\end{array}}\right) }} + \displaystyle {\sum _{\begin{array}{c} \scriptstyle e_1e_2=e\\ e_1,e_2\in \varSigma ^+ \end{array}}}{\textstyle {\left( {\begin{array}{c}x\\ ae_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ e_2b\end{array}}\right) }} \text { and} \\ {\textstyle {\left( {\begin{array}{c}yx\\ aeb\end{array}}\right) }}&= {\textstyle {\left( {\begin{array}{c}y\\ aeb\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}x\\ aeb\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}y\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}x\\ eb\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}y\\ ae\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}x\\ b\end{array}}\right) }} + \displaystyle {\sum _{\begin{array}{c} \scriptstyle e_1e_2=e\\ e_1,e_2\in \varSigma ^+ \end{array}}}{\textstyle {\left( {\begin{array}{c}y\\ ae_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}x\\ e_2b\end{array}}\right) }}. \end{aligned}$$

Putting \({\textstyle {\left( {\begin{array}{c}xy\\ aeb\end{array}}\right) }}={\textstyle {\left( {\begin{array}{c}yx\\ aeb\end{array}}\right) }}\) and noting that \({\textstyle {\left( {\begin{array}{c}y\\ ae_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}x\\ e_2b\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}x\\ ae_1\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ e_2b\end{array}}\right) }}\) for all terms in the summation (as \(x\equiv _{k-1} y\)), we obtain, after rearranging,

$$\begin{aligned} |x|_a\left( {\textstyle {\left( {\begin{array}{c}y\\ eb\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}x\\ eb\end{array}}\right) }}\right) = |x|_b\left( {\textstyle {\left( {\begin{array}{c}y\\ ae\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}x\\ ae\end{array}}\right) }}\right) . \end{aligned}$$

Note that the above equation holds for all \(a,b\in \varSigma \) and \(e\in \varSigma ^{k-1}\). Assume without loss of generality that \(|x|_a \ne 0\). Letting \(e=e_1\cdots e_{k-1}\) and repeatedly applying the above (to possibly different letters ab and words \(e\in \varSigma ^{k-1}\)), we obtain

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}y\\ eb\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}x\\ eb\end{array}}\right) }}&= \left( {\textstyle {\left( {\begin{array}{c}y\\ ae_1\cdots e_{k-1}\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}x\\ ae_1\cdots e_{k-1}\end{array}}\right) }}\right) \frac{|x|_b}{|x|_a}\\&= \left( {\textstyle {\left( {\begin{array}{c}y\\ aae_1\cdots e_{k-2}\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}x\\ aae_1\cdots e_{k-2}\end{array}}\right) }}\right) \frac{|x|_{e_{k-1}}}{|x|_a}\frac{|x|_b}{|x|_a}\\&=\cdots \\&=\left( {\textstyle {\left( {\begin{array}{c}y\\ a^k\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}x\\ a^k\end{array}}\right) }}\right) \frac{|x|_{e_1}\cdots |x|_{e_{k-1}}|x|_b}{|x|_a^{k}}=0, \end{aligned}$$

since \({\textstyle {\left( {\begin{array}{c}y\\ a^k\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}x\\ a^k\end{array}}\right) }}\) follows from \(x\equiv _{1} y\). It thus follows that \({\textstyle {\left( {\begin{array}{c}y\\ eb\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}x\\ eb\end{array}}\right) }}\) for all \(b\in \varSigma \) and \(e\in \varSigma ^{k-1}\), and consequently \(x \equiv _{k}y\). This concludes the proof.   \(\Box \)

Corollary 3.6

Let \(k\geqslant 2\) and \(x,y\in \varSigma ^*\). If \(xy\equiv _{k}yx\), then there exist \(m,n\in \mathbb {N}\) such that \(x^m\equiv _{k-1} y^n\).

Proof

Since \(xy\equiv _{k}yx\) it follows that \(x^m y^n \equiv _{k}y^nx^m\) for all \(m,n\in \mathbb {N}\). If x or y is empty the claim is trivial, so assume both are non-empty. We may choose \(m = \mathop {\mathrm {lcm}}(|x|,|y|)/|x|\) and \(n=\mathop {\mathrm {lcm}}(|x|,|y|)/|y|\), whence \(|x^{m}| = |y^{n}|\). By Theorem 3.5 we have that \(x^m \equiv _{k-1} y^n\) as was claimed.   \(\Box \)
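Corollary 3.6 can be illustrated on the words of Example 3.2: there \(xy\equiv _{3}yx\), and with \(m = \mathop {\mathrm {lcm}}(6,8)/6 = 4\) and \(n = 3\) the corollary promises \(x^4\equiv _{2}y^3\). A sketch (reusing the brute-force equivalence check from Sect. 2; variable names are ours):

```python
from itertools import product
from math import lcm

def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u.
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

def k_binomially_equivalent(u, v, k, alphabet):
    return all(subword_binomial(u, "".join(e)) == subword_binomial(v, "".join(e))
               for length in range(1, k + 1)
               for e in product(alphabet, repeat=length))

# Words from Example 3.2: xy =_3 yx although x and y have no common 2-binomial root.
x, y = "aaabbb", "aaabbabb"
m = lcm(len(x), len(y)) // len(x)   # exponent for x
n = lcm(len(x), len(y)) // len(y)   # exponent for y
```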

4 Conjugacy in the 2-Binomial Monoids

Here we consider conjugacy in the 2-binomial monoids. Two words \(x,y \in \varSigma ^*\) are k-binomially conjugate if there exists \(z \in \varSigma ^*\) such that \(xz \equiv _{k}zy\). Notice that for such a z to exist, we must have \(x \equiv _{1} y\). Furthermore, for \(k \geqslant 2\), z cannot contain any letters not occurring in x and y. Indeed, if \(|x|_b = |y|_b = 0\), \(|z|_b\geqslant 1\), and \(|x|_a \geqslant 1\), then \({\textstyle {\left( {\begin{array}{c}xz\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}x\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}z\\ b\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}z\\ ab\end{array}}\right) }} > {\textstyle {\left( {\begin{array}{c}z\\ ab\end{array}}\right) }}={\textstyle {\left( {\begin{array}{c}zy\\ ab\end{array}}\right) }}\).

Let us consider first the case when \(\varSigma = \{a,b\}\).

Proposition 4.1

Let \(x,y\in \{a,b\}^*\). Then there exists \(z\in \{a,b\}^*\) such that \(xz \equiv _{2} zy\) if and only if \(x\equiv _{1} y\) and \(\gcd (|x|_a,|x|_b)\) divides \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }}\).

Proof

Assume first there exists z such that \(xz \equiv _{2} zy\). It immediately follows that \(x\equiv _{1} y\). We also have

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}x\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}z\\ b\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}z\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}xz\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}zy\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}z\\ a\end{array}}\right) }}{\textstyle {\left( {\begin{array}{c}y\\ b\end{array}}\right) }} + {\textstyle {\left( {\begin{array}{c}z\\ ab\end{array}}\right) }}, \end{aligned}$$
(2)

which implies that \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = |z|_a|y|_b - |z|_b|x|_a = |z|_a|x|_b - |z|_b|x|_a\). It now follows that \(\gcd (|x|_a,|x|_b)\) divides \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }}\).

Let \(d=\gcd (|x|_a,|x|_b)\) and assume that \(x\equiv _{1}y\) and \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = kd\) for some \(k\in \mathbb {Z}\). By Bézout's identity there exist \(i,j\in \mathbb {Z}\) such that \(kd = i|x|_b - j|x|_a\). Here we may assume that \(i,j\geqslant 0\), since otherwise we may replace (i, j) with \((h|x|_a + i, h|x|_b+j)\) for some suitably large h. We claim that \(z = a^{i}b^{j}\) satisfies \({\textstyle {\left( {\begin{array}{c}xz\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}zy\\ ab\end{array}}\right) }}\). Indeed, \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} - {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = i|x|_b - j|x|_a\), which is equivalent to

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }} + |z|_b|x|_a + {\textstyle {\left( {\begin{array}{c}z\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} + |z|_a|x|_b + {\textstyle {\left( {\begin{array}{c}z\\ ab\end{array}}\right) }}. \end{aligned}$$

The latter is equivalent to \({\textstyle {\left( {\begin{array}{c}xz\\ ab\end{array}}\right) }} = {\textstyle {\left( {\begin{array}{c}zy\\ ab\end{array}}\right) }}\) as seen above. By Lemma 2.2, we further have \({\textstyle {\left( {\begin{array}{c}xz\\ ba\end{array}}\right) }}={\textstyle {\left( {\begin{array}{c}zy\\ ba\end{array}}\right) }}\) and, since \(y\equiv _{1} x\), we have \(xz \equiv _{2} zy\) as claimed.   \(\Box \)

Example 4.2

Let \(x = aabaaaabbbab\) and \(y = bbaababaaaba\). As \(y\equiv _{1} x\) and \(\gcd (|x|_a,|x|_b) = 1\), there exists \(z\in \varSigma ^*\) such that \(xz\equiv _{2} zy\). Now \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = 16\) and \(3|x|_b -2|x|_a = 1\); therefore, the proof above gives us, for example, \(z = a^{48}b^{32}\). Note that \(z' = a^6 b^2\) satisfies \(xz'\equiv _{2} z'y\) as well.

On the other hand, for \(x=aabb\) and \(y=abab\), we have \(x \equiv _{1} y\) and \(\gcd (|x|_a,|x|_b) = 2\), which does not divide \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = 1\). Thus x and y are not 2-binomial conjugate, in other words, \(xz\not \equiv _{2} zy\) for all \(z\in \varSigma ^*\).
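The proof of Proposition 4.1 is effective: a conjugator \(z = a^ib^j\) can be built from an extended-gcd computation. A sketch over the two-letter alphabet (helper names are ours):

```python
def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u.
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

def extended_gcd(p, q):
    # Returns (g, s, t) with s*p + t*q = g = gcd(p, q).
    if q == 0:
        return p, 1, 0
    g, s, t = extended_gcd(q, p % q)
    return g, t, s - (p // q) * t

def conjugator_2(x, y):
    """A word z with xz =_2 zy over {a, b}, or None if none exists (Prop. 4.1)."""
    na, nb = x.count("a"), x.count("b")
    if (na, nb) != (y.count("a"), y.count("b")):
        return None              # x =_1 y is necessary
    if na == nb == 0:
        return ""
    diff = subword_binomial(x, "ab") - subword_binomial(y, "ab")
    g, s, t = extended_gcd(na, nb)
    if diff % g:
        return None              # gcd(|x|_a, |x|_b) must divide the difference
    i, j = (diff // g) * t, -(diff // g) * s    # then i*nb - j*na = diff
    while i < 0 or j < 0:        # shift along (na, nb) to non-negative exponents
        i, j = i + na, j + nb
    return "a" * i + "b" * j
```

On Example 4.2 this procedure reproduces \(z = a^{48}b^{32}\), and it returns None on the non-conjugate pair aabb, abab.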

We now discuss the generalisation of the above characterisation for larger alphabets. Notice that if \(xz \equiv _{2} zy\), then (2) holds for all \(a,b \in \varSigma \). Taking into account Corollary 2.3, we have

Lemma 4.3

For \(x,y,z \in \varSigma ^*\), we have \(xz\equiv _{2} zy\) if and only if \(x\equiv _{1} y\) and \({\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = |z|_a|x|_b - |z|_b|x|_a\) for all pairs of letters \(a,b\in \varSigma \) with \(a\lhd b\).

Hence, deciding whether x and y are 2-binomially conjugate reduces to solving a system of linear equations with integer coefficients. Let us formalise this observation. Let \(x,y\in \varSigma ^*\) and assume that \(x\equiv _{1} y\). Assume further that each letter of \(\varSigma \) occurs in x; otherwise we consider a sub-alphabet instead. Fix an ordering \(\lhd \) on \(\varSigma \) and define the vector \(\mathbf {D}_{x,y}\), indexed by pairs of letters \(a,b\in \varSigma \) with \(a\lhd b\), by \(\mathbf {D}_{x,y}[(a,b)] = {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }}\). Let then \(M_x\) be the \(\left( {\begin{array}{c}|\varSigma |\\ 2\end{array}}\right) \times |\varSigma |\)-matrix (rows indexed by pairs \(a,b\in \varSigma \) with \(a \lhd b\), columns by letters \(c \in \varSigma \)) defined by \(M_x[(a,b),a] = |x|_b\), \(M_x[(a,b),b] = -|x|_a\), and \(M_x[(a,b),c] = 0\) for \(c\ne a,b\). Let \(\mathbf {X}\) be a vector of \(|\varSigma |\) unknowns indexed by the letters \(a\in \varSigma \). We consider solutions to the equation

$$\begin{aligned} M_x\mathbf {X} = \mathbf {D}_{x,y}. \end{aligned}$$
(3)

Let us give a brief example of the entities defined above.

Example 4.4

Let \(\varSigma = \{0,1,2\}\) with \(0\lhd 1\lhd 2\), and let \(x,y\in \varSigma ^*\) be such that \(x\equiv _{1} y\) and \(|x|_{a}\geqslant 1\) for each \(a\in \varSigma \). Then Eq. (3) reads

$$\begin{aligned} \begin{pmatrix} |x|_1 &{} -|x|_0 &{} 0\\ |x|_2 &{} 0 &{} -|x|_0\\ 0 &{} |x|_2 &{} -|x|_1 \end{pmatrix} \begin{pmatrix} \mathbf {X}[0]\\ \mathbf {X}[1]\\ \mathbf {X}[2] \end{pmatrix} = \begin{pmatrix} {\textstyle {\left( {\begin{array}{c}x\\ 01\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ 01\end{array}}\right) }}\\ {\textstyle {\left( {\begin{array}{c}x\\ 02\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ 02\end{array}}\right) }}\\ {\textstyle {\left( {\begin{array}{c}x\\ 12\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ 12\end{array}}\right) }} \end{pmatrix}, \end{aligned}$$

with rows indexed by the pairs (0, 1), (0, 2), and (1, 2).

In general, observe that for any word \(z\in \varSigma ^*\) we have

$$\begin{aligned} M_x\varPsi (z)^{\top } = \Bigl (\sum _{c\in \varSigma }M_x[(a,b),c]\,|z|_c\Bigr )_{(a,b),\,a\lhd b} = \bigl (|x|_b|z|_a-|x|_a|z|_b\bigr )_{(a,b),\,a\lhd b}. \end{aligned}$$
(4)

Now, for x and y as defined above, if there exists \(z\in \varSigma ^*\) such that \(xz\equiv _{2}zy\), then \(\mathbf {X} = \varPsi (z)^{\top }\) is a solution to Eq. (3). Indeed, recall that

$$\begin{aligned} |x|_b|z|_a-|x|_a|z|_b = {\textstyle {\left( {\begin{array}{c}x\\ ab\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}y\\ ab\end{array}}\right) }} = \mathbf {D}_{x,y}[(a,b)]. \end{aligned}$$

On the other hand, if \(\mathbf {X}\) is a solution to Eq. (3) having non-negative entries, then the word \(z = \prod _{a\in \varSigma }a^{\mathbf {X}[a]}\) is a solution to \(xz\equiv _{2} zy\).
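The objects \(M_x\), \(\mathbf {D}_{x,y}\), and Eq. (3) are straightforward to materialise. A sketch using plain lists for vectors (function names are ours; tested on the words of Example 4.2):

```python
from itertools import combinations

def subword_binomial(u, v):
    # binom(u, v): occurrences of v as a subword of u.
    dp = [1] + [0] * len(v)
    for a in u:
        for j in range(len(v), 0, -1):
            if v[j - 1] == a:
                dp[j] += dp[j - 1]
    return dp[-1]

def parikh(w, alphabet):
    return [w.count(a) for a in alphabet]

def build_system(x, y, alphabet):
    """Matrix M_x and vector D_{x,y} of Eq. (3); rows indexed by pairs a < b."""
    pairs = list(combinations(alphabet, 2))
    M = [[x.count(b) if c == a else (-x.count(a) if c == b else 0)
          for c in alphabet]
         for a, b in pairs]
    D = [subword_binomial(x, a + b) - subword_binomial(y, a + b) for a, b in pairs]
    return M, D

def matvec(M, v):
    return [sum(row[i] * v[i] for i in range(len(v))) for row in M]
```

For any x, `matvec(M, parikh(x, alphabet))` is the zero vector, which is exactly the observation obtained from Eq. (4) and used in the proof of Theorem 4.5 below.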

We are now in a position to characterise 2-binomial conjugacy over arbitrary alphabets.

Theorem 4.5

Let \(x,y\in \varSigma ^*\) and assume that each letter of \(\varSigma \) occurs in x. Then there exists \(z\in \varSigma ^*\) such that \(xz\equiv _{2} zy\) if and only if \(x\equiv _{1} y\) and Eq. (3) has a solution \(\mathbf {X}\) with integer entries.

Proof

If there exists z such that \(xz\equiv _{2} zy\), then \(\varPsi (z)^{\top }\) is an integer solution to the equation, as was asserted previously.

Conversely, assume that \(\mathbf {X}\) is an integer solution to Eq. (3). Notice that some entries of \(\mathbf {X}\) could be negative. However, plugging \(z=x\) into Eq. (4), we have \(M_{x}\varPsi (x)^{\top }=\mathbf {0}\). Thus, for each \(n\geqslant 0\), \(\mathbf {X} + n\varPsi (x)^{\top }\) is also an integer solution to the equation. Moreover, taking n large enough, each entry is a non-negative integer, since all entries of \(\varPsi (x)\) are assumed to be positive. Now the word \(z = \prod _{a\in \varSigma }a^{\mathbf {X}[a]+n|x|_a}\) satisfies \(xz\equiv _{2} zy\) (and is well-defined).   \(\Box \)

Remark 4.6

One can compute a (translated) integer basis for the set of solutions to Eq. (3) in polynomial time (see, e.g., [19, Cor. 5.3c]).

5 Bounds on Sizes of Independent Systems of Equations

In this section we show that the k-binomial monoids possess the compactness property. We further give an upper bound on the size of an independent system of equations. The main results of this section are the following.

Theorem 5.1

The k-binomial monoids possess the compactness property.

Theorem 5.2

The number of equations in an independent system of equations (without constants) over the semigroup \(\varSigma ^+/{\equiv _{k}}\) with variables \(\varXi \) is at most \(|\varXi ^{\leqslant k}|\).

As a consequence of the latter theorem, for k fixed, the size of an independent system of equations has a polynomial upper bound with respect to the number of unknowns. On the other hand, the upper bound is exponential when the number \(|\varXi |\) of unknowns is fixed and k is allowed to vary. We remark that these bounds do not depend on the size of the alphabet \(\varSigma \) when the equations have no constants, that is, when the system of equations is a subset of \(\varXi ^+\times \varXi ^+\).

Let us quickly explain why Theorem 5.2 implies Theorem 5.1.

When considering the compactness property, we remark that there is no loss of generality in assuming that solutions are non-erasing morphisms (that is, each variable is assigned a non-empty word). Indeed, it is not hard to see that a semigroup S (possibly without a unit element) possesses the compactness property if and only if the monoid obtained from S by adjoining a unit element possesses it (see, e.g., [14, Problem 13.5.2]).

Note also that there is no loss of generality in assuming that the equations have no constants, as we are dealing with finitely generated monoids: any system E of equations (with or without constants) over a semigroup S with finite generating set G may be modified into a system without constants by identifying each generator \(g\in G\) with a new variable \(X_g\). The set of solutions of the original system is obtained from the solutions to the modified system by choosing the solutions where \(X_g \mapsto g\) for each generator. Further, if the number of equations in an independent system of equations without constants using n variables is at most f(n) for each n, then the number of equations in an independent system of equations with constants is at most \(f(n+\#G)\).

Let us begin with a short proof of the first main result.

Proof of Theorem 5.1. It is known that if a semigroup S can be embedded in the ring of integer matrices, then S possesses the compactness property [14, Chapter 13]. In [18] such an embedding is explicitly constructed.    \(\Box \)

The rest of this section is devoted to proving Theorem 5.2.

Our approach for upper bounding the size of an independent system of equations over \(\varSigma ^*/{\equiv _{k}}\) is identical to the approach taken for showing a similar result for the so-called k-abelian monoids [8]. We interpret the solutions to a system as a subset of a finite-dimensional vector space. Basic results from linear algebra are then utilised to show that actually only finitely many equations are required to define all solutions.

Let us fix some notation. Let \(k\geqslant 1\) be fixed. Consider a word \(u\in \varXi ^+\) and define the \(\frac{|\varXi |^{k+1}-|\varXi |}{|\varXi |-1}\)-dimensional vector \(\mathbf {u}\) as

$$\begin{aligned} \mathbf {u} = \left( {\textstyle {\left( {\begin{array}{c}u\\ Y\end{array}}\right) }}\right) _{\scriptstyle Y\in \varXi ^{\leqslant k}\setminus \{\varepsilon \}}. \end{aligned}$$

For any non-erasing morphism \(h:\varXi \rightarrow \varSigma ^*/\equiv _{k}\) we define, for each word \(w\in \varSigma ^{\leqslant k}\), the \(\frac{|\varXi |^{k+1}-|\varXi |}{|\varXi |-1}\)-dimensional vector \(\mathbf {h}_w\) (components indexed by non-empty words in \(\varXi ^{\leqslant k}\)) as

$$\begin{aligned} \mathbf {h}_w[Y] = \sum _{\begin{array}{c} \scriptstyle w=w_1\cdots w_{\ell }\\ w_j\in \varSigma ^+ \end{array}} {\textstyle {\left( {\begin{array}{c}h(Y_1)\\ w_1\end{array}}\right) }}\cdots {\textstyle {\left( {\begin{array}{c}h(Y_{\ell })\\ w_{\ell }\end{array}}\right) }}, \end{aligned}$$

for each \(Y=Y_1\cdots Y_{\ell } \in \varXi ^{\leqslant k}\) with \(Y_i\in \varXi \) for all \(i= 1,\ldots ,\ell \). Note that \(\mathbf {h}_w[Y]=0\) for all Y for which \(|Y|>|w|\), as w does not have a factorisation into |Y| non-empty words.
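As a side remark, the entries of \(\mathbf {u}\) are word binomial coefficients, which can be computed by a standard dynamic program. The following sketch (function names are ours, words represented as Python strings) computes \(\mathbf {u}\) with components listed length by length.

```python
from itertools import product

def binom(u, y):
    # (u choose y): the number of occurrences of y as a scattered subword of u
    counts = [1] + [0] * len(y)          # counts[j] = occurrences of y[:j] seen so far
    for c in u:
        for j in range(len(y), 0, -1):   # right to left, so each letter of u is used once
            if y[j - 1] == c:
                counts[j] += counts[j - 1]
    return counts[-1]

def word_vector(u, variables, k):
    # the vector ((u choose Y))_Y indexed by non-empty Y in Xi^{<=k},
    # listed in length-lexicographic order of Y
    return [binom(u, "".join(Y))
            for ell in range(1, k + 1)
            for Y in product(variables, repeat=ell)]

# Example: u = xyx over Xi = {x, y} with k = 2; index order x, y, xx, xy, yx, yy:
# word_vector("xyx", "xy", 2) == [2, 1, 1, 1, 1, 0]
```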

The following lemma is crucial in the endeavours that follow. Here \((\mathbf {x},\mathbf {y})\) denotes the inner product of vectors \(\mathbf {x}\), \(\mathbf {y}\).

Lemma 5.3

Let \(h:\varXi \rightarrow \varSigma ^*/\equiv _{k}\) be a non-erasing morphism, \(u\in \varXi ^+\), and \(w\in \varSigma ^{\leqslant k}\). Then \({\textstyle {\left( {\begin{array}{c}h(u)\\ w\end{array}}\right) }} = (\mathbf {h}_w, \mathbf {u})\).

Proof

To avoid cluttering the text, we set \(\widehat{X}:= h(X)\) for each \(X\in \varXi \). Let \(u = X_1\cdots X_n\), where \(X_i\in \varXi \) for each \(i = 1,\ldots , n\). For any subset S of \(\{1,\ldots ,n\}\), by the sequence \(S_1,\ldots ,S_{|S|}\) we mean the sequence of elements of S arranged in increasing order. Now, for each \(w\in \varSigma ^{\leqslant k}\), we observe that

$$\begin{aligned} {\textstyle {\left( {\begin{array}{c}h(u)\\ w\end{array}}\right) }} = \sum _{\begin{array}{c} \scriptstyle S\subseteq [1,n]\\ |S|\leqslant |w| \end{array}} \sum _{\begin{array}{c} \scriptstyle w=w_1\cdots w_{|S|}\\ w_j\in \varSigma ^+ \end{array}} {\textstyle {\left( {\begin{array}{c}\widehat{X}_{S_1}\\ w_1\end{array}}\right) }}\cdots {\textstyle {\left( {\begin{array}{c}\widehat{X}_{S_{|S|}}\\ w_{|S|}\end{array}}\right) }}. \end{aligned}$$

Indeed, each occurrence of w as a subword of h(u) determines a subset \(S\subseteq [1,n]\) of size at most |w| together with a factorisation \(w = w_1\cdots w_{|S|}\), where \(w_i\in \varSigma ^+\) and the letters of \(w_i\) occur within the factor \(\widehat{X}_{S_i}\) of h(u). For each subset S of [1, n] having \(|S| > |w|\), there exists no such factorisation, and thus the corresponding sum contributes nothing to the total sum. Now for two subsets \(S,S'\subseteq [1,n]\) having \(X_{S_1}\cdots X_{S_{|S|}} = X_{S'_1}\cdots X_{S'_{|S'|}} = Y\), the corresponding sums contribute the same value. The number of distinct such sets equals \({\textstyle {\left( {\begin{array}{c}u\\ Y\end{array}}\right) }}\). We may thus rewrite the above equation as

$$\begin{aligned} \sum _{Y\in \varXi ^{\leqslant k}\setminus \{\varepsilon \}}{\textstyle {\left( {\begin{array}{c}u\\ Y\end{array}}\right) }} \sum _{\begin{array}{c} \scriptstyle w=w_1\cdots w_{|Y|}\\ w_j\in \varSigma ^+ \end{array}} {\textstyle {\left( {\begin{array}{c}\widehat{Y}_1\\ w_1\end{array}}\right) }}\cdots {\textstyle {\left( {\begin{array}{c}\widehat{Y}_{|Y|}\\ w_{|Y|}\end{array}}\right) }} = \sum _{Y\in \varXi ^{\leqslant k}\setminus \{\varepsilon \}}\mathbf {h}_w[Y]\cdot \mathbf {u}[Y] = (\mathbf {h}_w, \mathbf {u}), \end{aligned}$$

as claimed.    \(\Box \)
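Lemma 5.3 can be sanity-checked on small instances. In the sketch below (all function names are ours) words are Python strings, the morphism h is a dict on the variables, and both sides of the identity are computed directly from the definitions.

```python
from itertools import combinations, product

def binom(u, y):
    # (u choose y): occurrences of y as a scattered subword of u
    counts = [1] + [0] * len(y)
    for c in u:
        for j in range(len(y), 0, -1):
            if y[j - 1] == c:
                counts[j] += counts[j - 1]
    return counts[-1]

def h_w_component(h, w, Y):
    # h_w[Y]: sum over factorisations w = w_1 ... w_{|Y|} into non-empty
    # factors of the products binom(h(Y_1), w_1) * ... * binom(h(Y_l), w_l)
    total = 0
    for cuts in combinations(range(1, len(w)), len(Y) - 1):
        bounds = (0,) + cuts + (len(w),)
        parts = [w[i:j] for i, j in zip(bounds, bounds[1:])]
        value = 1
        for X, part in zip(Y, parts):
            value *= binom(h[X], part)
        total += value
    return total

def inner_product_side(h, u, w, k, variables):
    # (h_w, u) = sum over non-empty Y in Xi^{<=k} of h_w[Y] * (u choose Y)
    return sum(h_w_component(h, w, "".join(Y)) * binom(u, "".join(Y))
               for ell in range(1, k + 1)
               for Y in product(variables, repeat=ell))

# Concrete check with h: x -> ab, y -> a, u = xyx (so h(u) = abaab), w = ab, k = 2:
h = {"x": "ab", "y": "a"}
lhs = binom("abaab", "ab")                         # (h(u) choose w) = 4
rhs = inner_product_side(h, "xyx", "ab", 2, "xy")  # (h_w, u) = 4
assert lhs == rhs == 4
```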

For a vector \(\mathbf {x}\), we let \(\mathbf {x}^{\bot }\) denote the orthogonal complement of \(\mathbf {x}\).

Lemma 5.4

Let \(e:u=v\) be an equation and let \(h:\varXi \rightarrow \varSigma ^*/\equiv _{k}\) be a non-erasing morphism. Then h is a solution to e over \(\varSigma ^*/{\equiv _{k}}\) if and only if \(\mathbf {h}_w \in \mathbf {e}^\bot \) for all \(w\in \varSigma ^{\leqslant k}\), where \(\mathbf {e} = \mathbf {u}-\mathbf {v}\).

Proof

We have \(h(u)\equiv _{k}h(v)\) if and only if \({\textstyle {\left( {\begin{array}{c}h(u)\\ w\end{array}}\right) }}-{\textstyle {\left( {\begin{array}{c}h(v)\\ w\end{array}}\right) }} = 0\) for all non-empty \(w\in \varSigma ^{\leqslant k}\) if and only if \((\mathbf {h}_w , \mathbf {u}-\mathbf {v}) = (\mathbf {h}_w,\mathbf {e}) = 0\) for each \(w\in \varSigma ^{\leqslant k}\), by the lemma above.   \(\Box \)

We may now bound the number of equations in an independent system.

Proof of Theorem 5.2. Let \(E = \{e_i:u_i=v_i\}_{i\in I}\) be an independent system of equations over \(\varXi \). Assume first that \({{\,\mathrm{Sol}\,}}(E)\) is non-empty; the case of \({{\,\mathrm{Sol}\,}}(E)\) being empty is analogous to the k-abelian case. Now h is a solution to E if and only if \(\mathbf {h}_w \in \bigcap _{e\in E}\mathbf {e}^\bot = U\) for all \(w\in \varSigma ^{\leqslant k}\). Since U is an intersection of subspaces of a finite-dimensional vector space, there exist equations \(e_1,\ldots ,e_f\in E\) such that \(U = \bigcap _{i=1}^f\mathbf {e}_i^\bot \), where \(f\leqslant |\varXi ^{\leqslant k}|-1\). We claim that \(E' = \{e_1,\ldots ,e_f\}\) is an equivalent subsystem of E.

Let \(e\in E\) and let h be a solution to \(E'\). Then \(\mathbf {h}_w \in \mathbf {e}_i^\bot \) for all \(i=1,\ldots ,f\), so that \(\mathbf {h}_w\in U\) for all \(w\in \varSigma ^{\leqslant k}\). In particular, \(\mathbf {h}_w \in \mathbf {e}^\bot \), which is equivalent to h being a solution to e by the above lemma, as claimed.    \(\Box \)

6 Conclusions and Future Work

We have considered basic equations over the k-binomial monoids. For commutativity, we obtain a characterisation only in the case \(k = 2\). The problem is open for \(k > 2\), though we obtain some partial results. We plan to attack the following problem in the future:

Problem 6.1

Characterise when \(xy \equiv _{k}yx\) for \(k > 2\).

The mixture of positive and negative results obtained relating to this problem seems to suggest that the problem is quite intricate.
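Small instances of the problem can be explored by brute force. The sketch below (function names ours) tests k-binomial equivalence directly from the definition, and shows for instance that the words ab and ba commute 2-binomially although they do not commute in the free monoid, while for \(k = 3\) the commutation fails.

```python
from itertools import product

def binom(u, y):
    # (u choose y): occurrences of y as a scattered subword of u
    counts = [1] + [0] * len(y)
    for c in u:
        for j in range(len(y), 0, -1):
            if y[j - 1] == c:
                counts[j] += counts[j - 1]
    return counts[-1]

def k_binomially_equivalent(u, v, k, alphabet):
    # u =_k v iff every non-empty word of length <= k occurs equally
    # many times as a scattered subword of u and of v
    return all(binom(u, "".join(e)) == binom(v, "".join(e))
               for ell in range(1, k + 1)
               for e in product(alphabet, repeat=ell))

# x = ab and y = ba satisfy xy != yx in the free monoid (abba != baab) ...
assert k_binomially_equivalent("ab" + "ba", "ba" + "ab", 2, "ab")   # ... yet xy =_2 yx
assert not k_binomially_equivalent("ab" + "ba", "ba" + "ab", 3, "ab")  # but xy !=_3 yx
```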

As seen in Theorem 4.5, characterising k-binomial conjugacy of two words is already quite involved even for \(k=2\). It is not immediate how to translate the result into a word combinatorial statement. Furthermore, we suspect that the methods used in the case \(k = 2\) do not extend to cases with \(k > 2\) without substantial new insights. The following problem is thus left open.

Problem 6.2

Characterise when, for words \(x,y,z\in \varSigma ^*\) and \(k > 2\), we have \(xz\equiv _{k}zy\).

Finally, for independent systems of equations, it would be interesting to answer the following question.

Question 6.3

What is the maximal number of equations in an independent system of equations in the k-binomial monoid?

The analogous problem over the free semigroup is notoriously open. A constant upper bound is known in the case of equations with no constants when the alphabet of unknowns has size 3 (see [15]).