Commutative Regular Languages with Product-Form Minimal Automata

Hoffmann, Stefan

doi:10.1007/978-3-030-93489-7_5

Stefan Hoffmann ORCID: orcid.org/0000-0002-7866-075X

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13037)

Included in the following conference series:

International Conference on Descriptional Complexity of Formal Systems

Abstract

We introduce a subclass of the commutative regular languages that is characterized by the property that the state set of the minimal deterministic automaton can be written as a certain Cartesian product. This class behaves much better with respect to the state complexity of the shuffle, for which we find the bound 2nm if the input languages have state complexities n and m, and the upward and downward closure and interior operations, for which we find the bound n. In general, only the bounds $(2nm)^{|\varSigma |}$ and $n^{|\varSigma |}$ are known for these operations in the commutative case. We prove different characterizations of this class and present results to construct languages from this class. Lastly, in a slightly more general setting of partial commutativity, we introduce other, related, language classes and investigate the relations between them.

1 Introduction

The state complexity, as used here, of a regular language L is the minimal number of states needed in a complete deterministic automaton recognizing L. The state complexity of an operation on regular languages is the greatest state complexity of the result of this operation as a function of the (maximal) state complexities of its arguments.

The investigation of the state complexity of regularity-preserving operations on regular languages, see [7] for a survey, was initiated by Maslov in [20] and systematically started by Yu, Zhuang and Salomaa in [27].

A language is called commutative, if for each word in the language, every permutation of this word is also in the language. The class of commutative automata, which recognize commutative regular languages, was introduced in [2].

The shuffle and iterated shuffle have been introduced and studied to understand the semantics of parallel programs. This was undertaken, as it appears to be, independently by Campbell and Habermann [3], by Mazurkiewicz [22] and by Shaw [25]. They introduced flow expressions, which allow for sequential operators (catenation and iterated catenation) as well as for parallel operators (shuffle and iterated shuffle) to specify sequential and parallel execution traces.

The shuffle operation as a binary operation, but not the iterated shuffle, is regularity-preserving on all regular languages. The state complexity of the shuffle operation in the general case was investigated in [1] for complete deterministic automata and in [4] for incomplete deterministic automata. The bound $2^{nm-1} + 2^{(m-1)(n-1)}(2^{m-1}-1)(2^{n-1}-1)$ was obtained in the former case, which is not known to be tight, and the tight bound $2^{nm}-1$ in the latter case.

A word is a (scattered) subsequence of another word, if it can be obtained from the latter word by deleting letters. This gives a partial order, and the upward and downward closure and interior operations refer to this partial order. The upward closures are also known as shuffle ideals. The state complexity of these operations was investigated in [11, 12, 13, 19, 23].

The state complexity of the projection operation was investigated in [17, 18, 26]. In [26], the tight upper bound $3 \cdot 2^{n-2} - 1$ was shown, and in [18] the refined, and tight, bound $2^{n-1} + 2^{n-m} - 1$ was shown, where m is related to the number of unobservable transitions for the projection operator. Both results were established for incomplete deterministic automata.

In [14,15,16,17] the state complexity of these operations was investigated for commutative regular languages. The results are summarized in Table 1.

Table 1. Overview of results for commutative regular languages. The state complexities of the input languages are n and m. Also, $f(n,m) = 2^{nm-1} + 2^{(m-1)(n-1)}(2^{m-1}-1)(2^{n-1}-1)$ is the general bound for shuffle from [1] in case of complete automata.

Table 2. State complexity results on the subclass of commutative languages with product-form minimal automaton for input languages with state complexities n and m.

In [8] the minimal commutative automaton was introduced, which can be associated with every commutative regular language. This automaton played a crucial role in [14, 15] to derive the bounds mentioned in Table 1. Here, we will investigate the subclass of those languages for which the minimal commutative automaton is in fact the smallest automaton recognizing a given commutative language. For this language class, we will derive the state complexity bounds summarized in Table 2. Additionally, we will prove other characterizations and properties of the subclass considered and relate it with other subclasses, in a more general setting, in the final section.

2 Preliminaries

In this section and Sect. 3, we assume that $k \geqslant 0$ denotes our alphabet size and $\varSigma = \{a_1, \ldots , a_k\}$ is our alphabet. We will also write a, b, c for $a_1,a_2,a_3$ in case of $|\varSigma | \leqslant 3$. The set $\varSigma ^{*}$ denotes the set of all finite sequences over $\varSigma $, i.e., of all words. The finite sequence of length zero, or the empty word, is denoted by $\varepsilon $. For a given word w, we denote by |w| its length, and for $a \in \varSigma $ by $|w|_a$ the number of occurrences of the symbol a in w. For $a \in \varSigma $, we set $a^* = \{a\}^*$. A language is a subset of $\varSigma ^*$. For $u \in \varSigma ^*$, the left quotient is $u^{-1}L = \{ v \in \varSigma ^* \mid uv \in L\}$ and the right quotient is $Lu^{-1} = \{ v \in \varSigma ^* \mid vu \in L \}$.

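The shuffle operation, denoted by $\shuffle$, is defined by

$$ u \shuffle v = \{ x_1 y_1 x_2 y_2 \cdots x_n y_n \mid u = x_1 x_2 \cdots x_n,\ v = y_1 y_2 \cdots y_n,\ x_i, y_i \in \varSigma ^{*},\ 1 \leqslant i \leqslant n,\ n \geqslant 1 \} $$

for $u,v \in \varSigma ^{*}$ and $L_1 \shuffle L_2 = \bigcup _{x \in L_1, y \in L_2} (x \shuffle y)$ for $L_1, L_2 \subseteq \varSigma ^{*}$. If $L_1, \ldots , L_n \subseteq \varSigma ^*$, we set ${\shuffle }_{i=1}^{n} L_i = L_1 \shuffle \ldots \shuffle L_n$.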
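For small instances, the shuffle of two words can be enumerated by a direct recursion: the first letter of an interleaving is either the first letter of u or the first letter of v. The following Python sketch is only an illustration under the assumption that words are represented as strings and finite languages as sets of strings; the function names are chosen here and do not come from the paper.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shuffle_words(u: str, v: str) -> frozenset:
    """All interleavings of u and v that keep the letter order of each word."""
    if not u:
        return frozenset({v})
    if not v:
        return frozenset({u})
    # The first letter of the result is taken either from u or from v.
    from_u = {u[0] + w for w in shuffle_words(u[1:], v)}
    from_v = {v[0] + w for w in shuffle_words(u, v[1:])}
    return frozenset(from_u | from_v)


def shuffle_languages(lang1, lang2) -> set:
    """Shuffle of two finite languages: union of all word-wise shuffles."""
    result = set()
    for x in lang1:
        for y in lang2:
            result |= shuffle_words(x, y)
    return result
```

For instance, `shuffle_words("ab", "c")` contains exactly the three words abc, acb and cab.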

Let $\varGamma \subseteq \varSigma $. The projection homomorphism $\pi _{\varGamma } : \varSigma ^* \rightarrow \varGamma ^*$ is given by $\pi _{\varGamma }(x) = x$ for $x \in \varGamma $ and $\pi _{\varGamma }(x) = \varepsilon $ for $x \notin \varGamma $ and extended to $\varSigma ^*$ by $\pi _{\varGamma }(\varepsilon ) = \varepsilon $ and $\pi _{\varGamma }(wx) = \pi _{\varGamma }(w)\pi _{\varGamma }(x)$ for $w \in \varSigma ^*$ and $x \in \varSigma $. As a shorthand, we set, with respect to a given naming $\varSigma = \{a_1, \ldots , a_k\}$, $\pi _j = \pi _{\{a_j\}}$. Then $\pi _j(w) = a_j^{|w|_{a_j}}$.

A language $L \subseteq \varSigma ^*$ is commutative, if, for $u,v \in \varSigma ^*$ such that $|v|_x = |u|_x$ for every $x \in \varSigma $, we have $u \in L$ if and only if $v \in L$, i.e., L is closed under permutation of letters in words from L.
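To make the projection and the commutativity condition concrete, the following Python sketch computes letter counts, projections and a brute-force commutativity test. It assumes words are strings and that the language passed to the test is finite; all names are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from itertools import permutations


def parikh_vector(word, alphabet):
    """Tuple (|w|_{a_1}, ..., |w|_{a_k}) for the given ordering of the alphabet."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)


def projection(word, gamma):
    """Erase every letter of the word that is not contained in gamma."""
    return "".join(x for x in word if x in gamma)


def is_commutative(finite_language):
    """Check that every permutation of every word is again in the language."""
    language = set(finite_language)
    return all("".join(p) in language
               for w in language
               for p in permutations(w))
```

Two words have the same Parikh vector exactly if one is a permutation of the other, which is the condition used in the definition above.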

A quintuple $\mathcal A = (\varSigma , Q, \delta , q_0, F)$ is a finite deterministic and complete automaton (DFA), where $\varSigma $ is the input alphabet, Q the finite set of states, $q_0 \in Q$ the start state, $F \subseteq Q$ the set of final states and $\delta : Q \times \varSigma \rightarrow Q$ is the totally defined state transition function. Here, we do not consider incomplete automata. The transition function $\delta : Q \times \varSigma \rightarrow Q$ extends to a transition function on words $\delta ^{*} : Q \times \varSigma ^{*} \rightarrow Q$ by setting $\delta ^{*}(q, \varepsilon ) := q$ and $\delta ^{*}(q, wa) := \delta (\delta ^{*}(q, w), a)$ for $q \in Q$, $a \in \varSigma $ and $w \in \varSigma ^{*}$. In the remainder, we drop the distinction between both functions and also denote this extension by $\delta $. The language recognized by an automaton $\mathcal A = (\varSigma , Q, \delta , q_0, F)$ is $ L(\mathcal A) = \{ w \in \varSigma ^{*} \mid \delta (q_0, w) \in F \}. $ A language $L \subseteq \varSigma ^{*}$ is called regular if $L = L(\mathcal A)$ for some finite automaton $\mathcal A$.

The Nerode right-congruence with respect to $L \subseteq \varSigma ^*$ is defined, for $u,v \in \varSigma ^*$, by $u \equiv _L v$ if and only if $ \forall x \in \varSigma ^* : ux \in L \Leftrightarrow vx \in L. $ The equivalence class of $w \in \varSigma ^{*}$ is denoted by $[w]_{\equiv _L} = \{ x \in \varSigma ^{*} \mid x \equiv _L w \}$. A language is regular if and only if the above right-congruence has finite index, and it can be used to define the minimal deterministic automaton $\mathcal A_L = (\varSigma , Q_L, \delta _L, [\varepsilon ]_{\equiv _L}, F_L)$ with $Q_L = \{ [u]_{\equiv _L} \mid u \in \varSigma ^{*} \}$, $\delta _L([w]_{\equiv _L}, a) = [wa]_{\equiv _L}$ and $F_L = \{ [u]_{\equiv _L} \mid u \in L \}$. Let $L \subseteq \varSigma ^*$ be regular with minimal automaton $\mathcal A_L = (\varSigma , Q_L, \delta _L, [\varepsilon ]_{\equiv _L}, F_L)$. The number $|Q_L|$ is called the state complexity of L and denoted by ${\text {sc}}(L)$. The state complexity of a regularity-preserving operation on a class of regular languages is the greatest state complexity of the result of this operation as a function of the (maximal) state complexities for argument languages from the class.

Given two automata $\mathcal A = (\varSigma , S, \delta , s_0, F)$ and $\mathcal B = (\varSigma , T, \mu , t_0, E)$, an automaton homomorphism $h : S \rightarrow T$ is a map between the state sets such that for each $a \in \varSigma $ and state $s \in S$ we have $ h(\delta (s, a)) = \mu (h(s),a), $ $h(s_0) = t_0$ and $h^{-1}(E) = F$. If $h : S \rightarrow T$ is surjective, then $L(\mathcal B) = L(\mathcal A)$. A bijective homomorphism between automata $\mathcal A$ and $\mathcal B$ is called an isomorphism, and the two automata are said to be isomorphic.

The minimal commutative automaton was introduced in [8] to investigate the learnability of commutative languages. In [14, 15] this construction was used to define the index and period vector and in the derivation of the state complexity bounds mentioned in Table 1.

Definition 1

(minimal commutative aut.). Let $L \subseteq \varSigma ^*$ be regular. The minimal commutative automaton for L is $\mathcal C_L = (\varSigma , S_1 \times \ldots \times S_k, \delta , s_0, F)$ with

$$ S_j = \{ [a_j^m]_{\equiv _L} : m \geqslant 0 \}, \quad F = \{ ([\pi _1(w)]_{\equiv _L}, \ldots , [\pi _k(w)]_{\equiv _L}) : w \in L \} $$

and $\delta ((s_1, \ldots , s_j, \ldots , s_k), a_j) = (s_1, \ldots , \delta _{j}(s_j, a_j), \ldots , s_k)$ with one-letter transitions $\delta _{j}([a_j^m]_{\equiv _L}, a_j) = [a_j^{m+1}]_{\equiv _L}$ for $j = 1,\ldots , k$ and $s_0 = ([\varepsilon ]_{\equiv _L}, \ldots , [\varepsilon ]_{\equiv _L})$.
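The construction of Definition 1 can be sketched in Python as follows. The sketch assumes that the input is the minimal complete DFA of a commutative language, given as a dictionary `delta` mapping (state, letter) pairs to states, so that a state reached by $a_j^m$ stands for the class $[a_j^m]_{\equiv _L}$. Final states are decided on the representative word $a_1^{m_1} \cdots a_k^{m_k}$ with minimal exponents, which is justified only under the commutativity assumption. The function names are chosen here for illustration.

```python
from itertools import product


def unary_orbit(delta, q0, a):
    """States delta(q0, a^m) for m = 0, 1, 2, ... in order of first visit."""
    orbit, q = [], q0
    while q not in orbit:
        orbit.append(q)
        q = delta[(q, a)]
    return orbit


def minimal_commutative_automaton(alphabet, delta, q0, finals):
    """Return (states, transitions, start, finals) of the product automaton."""
    orbits = [unary_orbit(delta, q0, a) for a in alphabet]
    states = list(product(*orbits))
    start = tuple(q0 for _ in alphabet)
    delta_c = {}
    for s in states:
        for j, a in enumerate(alphabet):
            t = list(s)
            t[j] = delta[(s[j], a)]  # only the j-th coordinate moves on a_j
            delta_c[(s, a)] = tuple(t)
    finals_c = set()
    for s in states:
        # Read a_1^{m_1} ... a_k^{m_k}, where m_j is minimal with
        # delta(q0, a_j^{m_j}) = s[j], in the given DFA.
        q = q0
        for j, a in enumerate(alphabet):
            for _ in range(orbits[j].index(s[j])):
                q = delta[(q, a)]
        if q in finals:
            finals_c.add(s)
    return states, delta_c, start, finals_c
```

Applied to a three-state DFA for the words over {a, b} that contain no a or at least one b, this sketch yields a product automaton with four states.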

In [8], the next result was shown.

Theorem 2

(Gómez and Alvarez [8]). Let $L \subseteq \varSigma ^*$ be a commutative regular language. Then, $L = L(\mathcal C_L)$.

In general the minimal commutative automaton is not equal to the minimal deterministic and complete automaton for a regular commutative language L, see Example 1.

Example 1

For $L = \{ w \in \varSigma ^* \mid |w|_a = 0 \text{ or } |w|_b > 0 \}$ with $\varSigma = \{a,b\}$ the minimal deterministic and complete automaton and the minimal commutative automaton are not the same, see Fig. 1. This language is from [8]. In fact, the difference can get quite large, as shown by $L_p = \{ w \in \varSigma ^* \mid \sum _{j=1}^k j\cdot |w|_{a_j} \equiv 0 \pmod {p} \}$ for a prime $p > k$. Here, ${\text {sc}}(L_p) = p$, but $\mathcal C_{L_p}$ has $p^k$ states.
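The numbers claimed for $L_p$ can be checked for small parameters. The Python sketch below models the automaton for $L_p$ on the residues modulo p, where reading $a_j$ adds j and only the residue 0 is accepting. It checks that all residues are pairwise distinguishable by powers of $a_1$ (all residues are reachable from 0 by $a_1$), and it counts the states of the minimal commutative automaton as the product of the orbit sizes of the single letters. The function name and this encoding are assumptions made for illustration.

```python
from math import prod


def lp_automaton_sizes(p, k):
    """Return (sc(L_p), number of states of the minimal commutative automaton)."""
    def step(r, j):
        return (r + j) % p  # reading a_j adds j to the current residue

    def accepts_after(r, m):
        for _ in range(m):  # read a_1^m starting from residue r
            r = step(r, 1)
        return r == 0

    # Pairwise distinguishability of the p residues by words a_1^m.
    minimal = all(
        any(accepts_after(r, m) != accepts_after(s, m) for m in range(p))
        for r in range(p) for s in range(r + 1, p)
    )
    sc = p if minimal else None

    # |S_j| is the number of residues visited from 0 by repeatedly adding j.
    sizes = []
    for j in range(1, k + 1):
        seen, r = set(), 0
        while r not in seen:
            seen.add(r)
            r = step(r, j)
        sizes.append(len(seen))
    return sc, prod(sizes)
```

For p = 5 and k = 2 this gives 5 states for the minimal automaton against 25 states for the minimal commutative automaton.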

The next definition from [14, 15] generalizes the notion of a cyclic and non-cyclic part for unary automata [24], and the notion of periodic language [6, 14, 15].

Definition 3

(index and period vector). The index vector $(i_1, \ldots , i_k)$ and period vector $(p_1, \ldots , p_k)$ for a commutative regular language $L \subseteq \varSigma ^*$ with minimal commutative automaton $\mathcal C_L = (\varSigma , S_1 \times \ldots \times S_k, \delta , s_0, F)$ are the unique minimal numbers such that $\delta (s_0, a_j^{i_j}) = \delta (s_0, a_j^{i_j + p_j})$ for all $j \in \{1,\ldots ,k\}$.

Note that, in Definition 3, we have, for all $j \in \{1,\ldots ,k\}$, $|S_j| = i_j + p_j$. Also note that for unary languages, i.e., if $|\varSigma | = 1$, $\mathcal C_L$ equals $\mathcal A_L$ and $i_1 + p_1$ equals the number of states of the minimal automaton.
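For a single letter, index and period can be read off by following the transitions until a state repeats. The Python sketch below assumes the one-letter transition function of the relevant component is given as a callable `step`; it is an illustrative helper, not part of the original paper.

```python
def index_and_period(step, start):
    """Minimal (i, p) with p >= 1 such that step^i(start) == step^(i+p)(start)."""
    first_visit = {}
    q, m = start, 0
    while q not in first_visit:
        first_visit[q] = m
        q = step(q)
        m += 1
    index = first_visit[q]   # length of the non-cyclic part
    period = m - index       # length of the cycle
    return index, period
```

The number of visited states is index plus period, in accordance with $|S_j| = i_j + p_j$.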

Example 2

Let $L = (aa)^* \shuffle (bb)^* \cup (aaaa)^* \shuffle b^*$. Then $(i_1, i_2) = (0,0)$, $(p_1, p_2) = (4,2)$, $\pi _1(L) = (a a)^{*}$ and $\pi _2(L) = b^{*}$.

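Let $u, v \in \varSigma ^*$. Then, u is a subsequence (see Note 1) of v, denoted by $u \preccurlyeq v$, if and only if $v \in u \shuffle \varSigma ^*$. The order given thereby is called the subsequence order. Let $L \subseteq \varSigma ^*$. Then, we define (1) the upward closure $\mathop {\uparrow \!} L = L \shuffle \varSigma ^* = \{ u \in \varSigma ^* \mid \exists v \in L : v \preccurlyeq u \}$; (2) the downward closure $\mathop {\downarrow \!} L = \{ u \in \varSigma ^* \mid \exists v \in L : u \preccurlyeq v \}$; (3) the upward interior, denoted by $\circlearrowright L$, as the largest upward-closed set in L, i.e., the largest subset $U \subseteq L$ such that $\mathop {\uparrow \!} U = U$ and (4) the downward interior, denoted by $\circlearrowleft L$, as the largest downward-closed set in L, i.e., the largest subset $U \subseteq L$ such that $\mathop {\downarrow \!} U = U$. We have $\circlearrowright L = \varSigma ^* \setminus \mathop {\downarrow \!} (\varSigma ^* \setminus L)$ and $\circlearrowleft L = \varSigma ^* \setminus \mathop {\uparrow \!} (\varSigma ^* \setminus L)$.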
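The subsequence order and the two closure operations can be explored on small examples with the following Python sketch. It assumes finite languages given as sets of strings. Since the upward closure of a non-empty language is infinite, the sketch only enumerates it up to a length bound `max_len`, which is an assumption of this illustration, as are the function names.

```python
from itertools import product


def is_subsequence(u, v):
    """True if u can be obtained from v by deleting letters."""
    remaining = iter(v)
    # Each membership test consumes the iterator up to the matched letter.
    return all(x in remaining for x in u)


def words_up_to(alphabet, max_len):
    """All words over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def upward_closure(language, alphabet, max_len):
    """Words of length <= max_len that have some word of the language as subsequence."""
    return {w for w in words_up_to(alphabet, max_len)
            if any(is_subsequence(u, w) for u in language)}


def downward_closure(language, alphabet):
    """All subsequences of words of a finite language."""
    max_len = max((len(v) for v in language), default=0)
    return {w for w in words_up_to(alphabet, max_len)
            if any(is_subsequence(w, v) for v in language)}
```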

The following two results, which will be needed later, are from [14, 15].

Theorem 4

Let $U,V \subseteq \varSigma ^*$ be commutative regular languages with index and period vectors $(i_1, \ldots , i_k), (j_1, \ldots , j_k)$ and $(p_1, \ldots , p_k), (q_1, \ldots , q_k)$. Then, the index vector of $U \shuffle V$ is at most

$$ (i_1 + j_1 + \mathrm {lcm}(p_1, q_1) - 1, \ldots , i_k + j_k + \mathrm {lcm}(p_k,q_k) - 1) $$

and the period vector is at most $ (\mathrm {lcm}(p_1, q_1), \ldots , \mathrm {lcm}(p_k, q_k)). $ So, ${\text {sc}}(U \shuffle V) \leqslant \prod _{l=1}^k (i_l + j_l + 2 \cdot \mathrm {lcm}(p_l, q_l) - 1)$.
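The bounds of Theorem 4 are easy to evaluate numerically. The following Python sketch computes the bounds on the index vector and the period vector and multiplies, per letter, the index bound plus the period bound, which bounds the number of states of the minimal commutative automaton of the shuffle. The function name and the sample vectors in the usage note are assumptions made for illustration; `math.lcm` requires Python 3.9 or later.

```python
from math import lcm, prod


def shuffle_bounds(index_u, period_u, index_v, period_v):
    """Bounds on index vector, period vector and state count for the shuffle."""
    periods = [lcm(p, q) for p, q in zip(period_u, period_v)]
    indices = [i + j + l - 1
               for i, j, l in zip(index_u, index_v, periods)]
    # Each component of the product automaton has at most index + period states.
    states = prod(i + p for i, p in zip(indices, periods))
    return indices, periods, states
```

For index vectors (1, 0) and (0, 2) with period vectors (2, 3) and (3, 3), the bounds are (6, 4) for the index vector, (6, 3) for the period vector, and $12 \cdot 7 = 84$ for the number of states.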

Theorem 5

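Let $\varSigma = \{a_1, \ldots , a_k\}$. Suppose $L \subseteq \varSigma ^*$ is commutative and regular with index vector $(i_1, \ldots , i_k)$ and period vector $(p_1, \ldots , p_k)$. Then, the state complexities of the upward closure, the downward closure, the upward interior and the downward interior of L are each at most $\prod _{j=1}^k (i_j + p_j)$.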

3 Product-Form Minimal Automata

As shown in Example 1, the minimal automaton, in general, does not equal the minimal commutative automaton. Here, we introduce the class of commutative regular languages for which both are isomorphic. The corresponding commutative languages are called languages with a minimal automaton of product-form, as the minimal commutative automaton is built with the Cartesian product.

Definition 6

(languages with product-form minimal automaton). A commutative and regular language $L \subseteq \varSigma ^*$ is said to have a minimal automaton of product-form, if $\mathcal C_L$ is isomorphic to $\mathcal A_L$.

If $|\varSigma | = 1$, we see easily that $\mathcal C_L$ is the minimal deterministic and complete automaton.

Proposition 7

If $|\varSigma | = 1$, then each commutative and regular $L \subseteq \varSigma ^*$ has a minimal automaton of product-form. More generally, if $L \subseteq \{a\}^*$ is regular, then $L \shuffle (\varSigma \setminus \{a\})^*$ has a minimal automaton of product-form.

Apart from the unary languages, we give another example of a language with minimal automaton of product-form next.

Example 3

Let over $\varSigma = \{a,b\}$. See Fig. 2 for the minimal commutative automaton. Here, the minimal commutative automaton equals the minimal automaton.

However, the next proposition gives a strong necessary criterion for a commutative language to have a minimal automaton of product-form.

Proposition 8

If $L \subseteq \varSigma ^*$ is commutative and regular with a minimal automaton of product-form, then $|\{ x \in \varSigma \mid \pi _{\{x\}}(L) \text{ is } \text{ finite } \}| \leqslant 1$. So, $\pi _{\varGamma }(L)$ is infinite for $|\varGamma | \geqslant 2$, in particular no finite language over an at least binary alphabet is in this class.

For example, $L = \{\varepsilon \}$ over $\varSigma $ does not have a minimal automaton of product-form if $|\varSigma | > 1$. Recall that the minimal automaton, as defined here, is always complete. Note that the converse of Proposition 8 is not true, as shown by $aa^*$ over $\varSigma = \{a,b\}$.

In the following statement, we give alternative characterizations for commutative languages with minimal automata of product-form.

Theorem 9

Let $L \subseteq \varSigma ^*$ be a commutative regular language with index vector $(i_1, \ldots , i_k)$ and period vector $(p_1, \ldots , p_k)$. The following are equivalent:

1. the minimal automaton has product-form;
2. ${\text {sc}}(L) = \prod _{j=1}^k (i_j + p_j)$;
3. $u \equiv _L v$ implies $\forall a \in \varSigma : a^{|u|_a} \equiv _L a^{|v|_a}$;
4. $u \equiv _L v$ if and only if $\forall a \in \varSigma : a^{|u|_a} \equiv _L a^{|v|_a}$.
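The second condition of Theorem 9 suggests a simple computational test. The Python sketch below assumes a complete DFA for a commutative language, given as a dictionary `delta` from (state, letter) pairs to states. It computes the Nerode classes of the reachable states by Moore's partition refinement, counts for every letter $a_j$ the classes visited by the powers $a_j^m$ (this number is $i_j + p_j$), and compares the product with the state complexity. Moore's refinement is a standard minimization technique swapped in here for illustration; the paper itself does not prescribe an algorithm, and the function names are hypothetical.

```python
from math import prod


def nerode_classes(alphabet, delta, q0, finals):
    """Map every reachable state to an identifier of its Nerode class."""
    reachable, stack = {q0}, [q0]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)
    cls = {q: int(q in finals) for q in reachable}
    while True:
        ids, refined = {}, {}
        for q in reachable:
            # Signature: own class and the classes of all successors.
            sig = (cls[q],) + tuple(cls[delta[(q, a)]] for a in alphabet)
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(cls.values())):
            return refined  # no class was split, the partition is stable
        cls = refined


def has_product_form(alphabet, delta, q0, finals):
    """Check sc(L) == prod_j (i_j + p_j) for the language of the given DFA."""
    cls = nerode_classes(alphabet, delta, q0, finals)
    state_complexity = len(set(cls.values()))
    sizes = []
    for a in alphabet:
        seen, q = set(), q0
        while cls[q] not in seen:  # follow a, a^2, ... until a class repeats
            seen.add(cls[q])
            q = delta[(q, a)]
        sizes.append(len(seen))
    return state_complexity == prod(sizes)
```

On a three-state DFA for the words over {a, b} with no a or at least one b, the test fails (3 against 4), whereas it succeeds for the four-state automaton of the words with an even number of a's and at least one b.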

Next, we give a way to construct commutative regular languages with minimal automata of product-form.

Lemma 10

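Let $\varSigma = \{a_1, \ldots , a_k\}$ and, for $j \in \{1,\ldots ,k\}$, $L_j \subseteq \{a_j\}^*$ be regular and infinite with index $i_j$ and period $p_j$. Then, ${\text {sc}}(L_1 \shuffle \ldots \shuffle L_k) = \prod _{j=1}^k {\text {sc}}(L_j)$ and $L_1 \shuffle \ldots \shuffle L_k$ has index vector $(i_1, \ldots , i_k)$ and period vector $(p_1, \ldots , p_k)$. With Theorem 9, $L_1 \shuffle \ldots \shuffle L_k$ has a product-form minimal automaton.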
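The languages of Lemma 10 are recognized by a product of unary automata in which each letter moves only its own coordinate. The Python sketch below assumes that each unary language $L_j$ is described by a triple (index, period, accepting positions), where the positions range over $0, \ldots , i_j + p_j - 1$; this encoding and the function names are assumptions made for illustration.

```python
from math import prod


def unary_step(m, index, period):
    """Successor of position m in a unary automaton with the given index and period."""
    return m + 1 if m + 1 < index + period else index


def shuffle_product_accepts(word, alphabet, components):
    """Run the product automaton; components[j] = (index, period, accepting)."""
    state = [0] * len(alphabet)
    for x in word:
        j = alphabet.index(x)
        index, period, _ = components[j]
        state[j] = unary_step(state[j], index, period)  # only coordinate j moves
    return all(state[j] in components[j][2] for j in range(len(alphabet)))


def number_of_states(components):
    """Size of the product state set, i.e. the product of index + period."""
    return prod(index + period for index, period, _ in components)
```

With the components (0, 2, {0}) for $(aa)^*$ and (1, 1, {1}) for $bb^*$, the product has four states and accepts exactly the words with an even number of a's and at least one b.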

In the next theorem and the following remark, we investigate closure properties of the class in question.

Theorem 11

The class of commutative regular languages with minimal automata of product-form is closed under left and right quotients and complementation. It is not closed under union, intersection and projection.

Remark 1

We have , showing, using Proposition 7 and 8, that this class is not closed under intersection and by DeMorgan’s laws, as we have closure under complementation, we also cannot have closure under union. Also, has a minimal automaton of product-form, but is the language from Example 1. So, this class is also not closed under projection.

Theorem 12

Let $U, V \subseteq \varSigma ^*$ be commutative regular languages with product-form minimal automata with ${\text {sc}}(U) = n$ and ${\text {sc}}(V) = m$.

1. We have ${\text {sc}}(U \shuffle V) \leqslant 2nm$, and ${\text {sc}}(U \shuffle V) \leqslant nm$ if $|\varSigma | = 1$. Furthermore, for any $\varSigma $, there exist U, V as above such that $nm \leqslant {\text {sc}}(U \shuffle V)$.
2. In the worst case, n states are sufficient and necessary for a DFA to recognize $\uparrow \! U$. Similarly for the downward closure and interior operations.
3. In the worst case, n states are sufficient and necessary for a DFA to recognize the projection of U.
4. In the worst case, nm states are sufficient and necessary for a DFA to recognize $U \cap V$ or $U \cup V$.

Remark 2

I do not know if the bound 2nm stated in Theorem 12 for the shuffle operation is tight, but the next example shows that if we have a binary alphabet, we can find commutative languages with state complexities n and m and product-form minimal automata whose shuffle needs an automaton with strictly more than nm states. A similar construction works for more than two letters. Let $p, q > 11$ be two coprime numbers. Set and . Then, using that shuffle distributes over union and a number-theoretical result from [27, Lemma 5.1], we find

where $a^{q-1 + p - 1}(a^p)^* (a^q)^* = F \cup a^{pq - 1}a^*$ for some finite set $F \subseteq \{\varepsilon , a, \ldots , a^{pq - 3} \}$ and $W = E \cup b^{pq-1}b^*$ for some $E \subseteq \{\varepsilon , b, \ldots , b^{pq-3} \}$. Note that by [27, Lemma 5.1] we have . All languages involved have a product-form minimal automaton. The minimal automaton for U has $(2 + p) \cdot (1+p)$ states, the minimal automaton for V has $(1 + q)\cdot (q+2)$ states and that for $U \shuffle V$ has $2pq\cdot (pq+3)$ states. As $(p-11)(q-11) > 0$ we can deduce $(1+p)(2+p)(1+q)(2+q)< 2(pq)^2 < 2pq(pq+3)$.
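The final chain of inequalities can be checked numerically for a range of parameters. The Python sketch below only verifies the arithmetic stated above, with n = (1+p)(2+p) and m = (1+q)(2+q); it does not construct the languages, and the function name and the tested range are arbitrary choices for illustration.

```python
from math import gcd


def gap_holds(p, q):
    """Check (1+p)(2+p)(1+q)(2+q) < 2(pq)^2 < 2pq(pq+3)."""
    nm = (1 + p) * (2 + p) * (1 + q) * (2 + q)
    return nm < 2 * (p * q) ** 2 < 2 * p * q * (p * q + 3)


def check_range(limit):
    """Verify the chain for all coprime p, q with 11 < p, q < limit."""
    return all(gap_holds(p, q)
               for p in range(12, limit)
               for q in range(12, limit)
               if gcd(p, q) == 1)
```

For p = 12 and q = 13, the product nm equals 38220, which is smaller than $2(pq)^2 = 48672$.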

4 Partial Commutativity and Other Subclasses

A partial commutation on $\varSigma $ is a symmetric and irreflexive relation $I \subseteq \varSigma \times \varSigma $, often called the independence relation. Of interest is the congruence $\sim _I$ generated on $\varSigma ^*$ by the relation $ \{ (ab, ba) \mid (a,b) \in I \}. $ A language $L \subseteq \varSigma ^*$ is closed under I-commutation if $u \in L$ and $u \sim _I v$ implies $v \in L$. If $I = \{ (a,b) \in \varSigma \times \varSigma \mid a \ne b \}$, then the languages closed under I-commutation are precisely the commutative languages.

Languages closed under some partial commutation relation have been extensively studied, see [10], also for further references, and in particular with relation to (Mazurkiewicz) trace theory [5, 10, 21], a formalism to describe the execution histories of concurrent programs.

Here, we will focus on the case that $(\varSigma \times \varSigma ) \setminus I$ is transitive, i.e., for all letters $x, y, z \in \varSigma $, $(x,y) \notin I$ and $(y,z) \notin I$ imply $(x,z) \notin I$. In this case, as I is symmetric and irreflexive, $(\varSigma \times \varSigma ) \setminus I$ is an equivalence relation and we will write $\varSigma _1, \ldots , \varSigma _k$ for the different equivalence classes.

The reason to focus on this particular generalization is, as we will see later, that the definition of the minimal commutative automaton transfers to this more general setting without much difficulty.

To ease the notation, if we have a partial commutation relation as above with a corresponding partition $\varSigma = \varSigma _1 \cup \ldots \cup \varSigma _k$ of the alphabet, we also write $\mathcal L_{\varSigma _1, \ldots , \varSigma _k}$ for the class of languages closed under this partial commutation. Then, as is easily seen, we have $L \in \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$ if and only if, for $x \in \varSigma _i$, $y \in \varSigma _j$ ($i \ne j$) and each $u, v \in \varSigma ^*$ we have $ uxyv \in L \Leftrightarrow uyxv \in L. $ For example, L is commutative if and only if $L \in \mathcal L_{\{a_1\}, \ldots , \{a_k\}}$ for $\varSigma = \{a_1, \ldots , a_k\}$.
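The congruence generated by a partition of the alphabet can be explored by repeatedly swapping adjacent letters from different blocks. The Python sketch below assumes the partition is given as a list of blocks (strings or sets of letters) and that languages are finite sets of strings; the function names are hypothetical and serve only as an illustration.

```python
def independence_from_partition(blocks):
    """Pairs of letters from different blocks; these letters may be swapped."""
    return {(x, y)
            for i, block_i in enumerate(blocks)
            for j, block_j in enumerate(blocks) if i != j
            for x in block_i for y in block_j}


def trace_class(word, independent):
    """All words reachable from `word` by swapping adjacent independent letters."""
    seen, stack = {word}, [word]
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in independent:
                v = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen


def is_closed(finite_language, independent):
    """Check closure of a finite language under the partial commutation."""
    language = set(finite_language)
    return all(trace_class(w, independent) <= language for w in language)
```

For the partition {a, b}, {c}, the word abc is congruent to acb and cab only, since a and b lie in the same block and hence do not commute.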

4.1 The Canonical Automaton

Here, we generalize our notion of minimal commutative automaton, Definition 1, to have uniform recognition devices for languages in $\mathcal L_{\varSigma _1,\ldots ,\varSigma _k}$.

Definition 13

Let $\varSigma = \varSigma _1 \cup \ldots \cup \varSigma _k$ be a partition and $L \subseteq \varSigma ^*$. Set $\mathcal C_{L, \varSigma _1, \ldots , \varSigma _k} = (\varSigma , S_1 \times \ldots \times S_k, \delta , s_0, F)$ with, for $i \in \{1,\ldots , k\}$, $ S_i = \{ [u]_{\equiv _L} \mid u \in \varSigma _i^* \}$, $F = \{ ([\pi _{\varSigma _1}(u)]_{\equiv _L}, \ldots , [\pi _{\varSigma _k}(u)]_{\equiv _L}) \mid u \in L \}$, $s_0 = ( [\varepsilon ]_{\equiv _L}, \ldots , [\varepsilon ]_{\equiv _L})$ and, for $x \in \varSigma _i$,

$$ \delta (([u_1]_{\equiv _L}, \ldots , [u_i]_{\equiv _L}, \ldots , [u_k]_{\equiv _L}), x) = ([u_1]_{\equiv _L}, \ldots , [u_ix]_{\equiv _L}, \ldots , [u_k]_{\equiv _L}) $$

with words $u_j \in \varSigma _j^*$, $j \in \{1,\ldots , k\}$. This is called the canonical automaton for the given L with respect to $\varSigma = \varSigma _1 \cup \ldots \cup \varSigma _k$.

Next, we show that the canonical automata recognize precisely the languages in $\mathcal L_{\varSigma _1, \ldots , \varSigma _k}$. Note that we have dropped the assumption of regularity of L.

Theorem 14

Let $L \subseteq \varSigma ^*$ and $\varSigma = \varSigma _1 \cup \ldots \cup \varSigma _k$ be a partition. Then,

1. $L \subseteq L(\mathcal C_{L,\varSigma _1,\ldots ,\varSigma _k})$ and $L(\mathcal C_{L,\varSigma _1,\ldots ,\varSigma _k}) \in \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$.
2. $L = L(\mathcal C_{L,\varSigma _1,\ldots ,\varSigma _k}) \Leftrightarrow L \in \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$.
3. Let $L \in \mathcal L_{\varSigma _1,\ldots ,\varSigma _k}$. Then L is regular if and only if $\mathcal C_{L,\varSigma _1,\ldots ,\varSigma _k}$ is finite.

We will also derive a canonical automaton for certain projected languages from $\mathcal C_{L,\varSigma _1,\ldots ,\varSigma _k}$, which is used in defining a subclass in the next subsection. Essentially, the next definition and proposition mean that if we only use one “coordinate” of $\mathcal C_{L,\varSigma _1, \ldots , \varSigma _k}$, then this recognizes a projection of L.

Definition 15

Let $i \in \{1,\ldots , k\}$ and $L \in \mathcal L_{\varSigma _1,\ldots ,\varSigma _k}$. The canonical projection automaton (for $\varSigma _i$) is $\mathcal C_{L,\varSigma _i} = (\varSigma _i, S_i, \delta _i, [\varepsilon ]_{\equiv _L}, F_i)$ with $S_i = \{ [u]_{\equiv _L} \mid u \in \varSigma _i^* \}$, $\delta _i([u]_{\equiv _L}, x) = [ux]_{\equiv _L} \text{ for } x \in \varSigma _i$ and $F_i = \{ [\pi _{\varSigma _i}(u)]_{\equiv _L} \mid u \in L \}$.

Proposition 16

Let $L \in \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$. Then, for $i \in \{1,\ldots ,k\}$, $\pi _{\varSigma _i}(L) = L(\mathcal C_{L, \varSigma _i})$.

4.2 Subclasses in $\mathcal L_{\varSigma _1, \ldots , \varSigma _k}$

Here, we investigate several subclasses of $\mathcal L_{\varSigma _1,\ldots , \varSigma _k}$. Recall that, for $L \subseteq \varSigma ^*$, the minimal automaton of L is denoted by $\mathcal A_L$.

Definition 17

Let $\varSigma = \varSigma _1 \cup \ldots \cup \varSigma _k$ be a partition. Then, define the following classes of languages.

First, we show that these are in fact subclasses of $\mathcal L_{\varSigma _1, \ldots , \varSigma _k}$.

Proposition 18

Let $\varSigma = \varSigma _1 \cup \ldots \cup \varSigma _k$ be a partition. For each $i \in \{1,2,3,4\}$ we have $\mathcal L_i \subseteq \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$.

Remark 3

Regarding $\mathcal L_1$, note that there exist languages $L = L(\mathcal C_{L,\varSigma _1, \ldots , \varSigma _k})$ such that the minimal automaton has a single final state, but $\mathcal C_{L,\varSigma _1, \ldots , \varSigma _k}$ has more than one final state. For example, $L = \{ w \in \{a,b\}^* \mid |w|_a> 0 \text{ or } |w|_b > 0 \}$. However, if $\mathcal C_{L,\varSigma _1, \ldots , \varSigma _k}$ has a single final state, then the minimal automaton also has only a single final state.

Example 4

Let $\varSigma = \varSigma _1 \cup \varSigma _2$ with $\varSigma _1 = \{a\}$ and $\varSigma _2 = \{b\}$. Set . Then $L \in (\mathcal L_3 \cap \mathcal L_4) \setminus \mathcal L_2$.

Example 5

Set . Then $L \in \mathcal L_3 \setminus \mathcal L_4$.

The languages in $\mathcal L_1$ arise in connection with the canonical automaton.

Proposition 19

Let $L \in \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$ and $ \mathcal C_{L, \varSigma _1, \ldots , \varSigma _k} = (\varSigma , S_1 \times \ldots \times S_k, \delta , s_0, F). $ Then, for all $s \in S_1 \times \ldots \times S_k$, $ \{ w \in \varSigma ^* \mid \delta (s_0, w) = s \} \in \mathcal L_1. $

Next, we give alternative characterizations for $\mathcal L_2, \mathcal L_3$ and $\mathcal L_4$.

Theorem 20

Let $L \in \mathcal L_{\varSigma _1, \ldots , \varSigma _k}$. Then,

1. $L \in \mathcal L_2$ if and only if, for each $w \in \varSigma ^*$, the following is true:
$$ w \in L \Leftrightarrow \forall i \in \{1,\ldots ,k\} : \pi _{\varSigma _i}(w) \in \pi _{\varSigma _i}(L); $$
2. $L \in \mathcal L_3$ if and only if, for all $i \in \{1,\ldots ,k\}$ and $u \in \varSigma _i^*$, we have
$$ [u]_{\equiv _L} \cap \varSigma _i^* = [u]_{\equiv _{\pi _{\varSigma _i}(L)}} \cap \varSigma _i^*; $$
3. $L \in \mathcal L_4$ if and only if, for each $u,v \in \varSigma ^*$,
$$ u \equiv _L v \Leftrightarrow \forall i \in \{1,\ldots ,k\} : \pi _{\varSigma _i}(u) \equiv _L \pi _{\varSigma _i}(v). $$

Example 6

Let $L_1$ be the language from Example 3. Set . Both of their letters commute for the partition $\{a_1,a_2\} = \{a_1\} \cup \{a_2\}$. Then, $L_1 \in \mathcal L_4 \setminus \mathcal L_3$ and $L_2 \in \mathcal L_1 \setminus \mathcal L_4$.

Finally, in Theorem 21, we establish inclusion relations, which are all proper, between $\mathcal L_1, \mathcal L_2$ and $\mathcal L_3$, also see Fig. 3.

Theorem 21

We have $\mathcal L_1 \subsetneq \mathcal L_2 \subsetneq \mathcal L_3$.

Remark 4

Theorem 21 and Example 6 show that $\mathcal L_4$ is incomparable to each of the other language classes with respect to inclusion.

5 Conclusion

The language class of commutative regular languages with minimal automata of product-form behaves well with respect to the descriptional complexity measure of state complexity for certain operations, see Table 2, and Lemma 10 allows us to construct infinitely many commutative regular languages with product-form minimal automaton. The investigation started could be carried out for other operations and measures of descriptional complexity as well. Likewise, it might be interesting to investigate whether the learning algorithms given in [8, 9] for commutative languages and for more general partial commutativity conditions could be improved for the language class introduced.

Lastly, whether the bound 2nm for shuffle is tight is an open problem. Remark 2 shows that the bound nm is not sufficient. However, giving an infinite family of commutative regular languages with minimal automata of product-form attaining the bound 2nm for shuffle remains open.

Notes

1. Also called a scattered subword in the literature [11, 19].

References

1. Brzozowski, J., Jirásková, G., Liu, B., Rajasekaran, A., Szykuła, M.: On the state complexity of the shuffle of regular languages. In: Câmpeanu, C., Manea, F., Shallit, J. (eds.) DCFS 2016. LNCS, vol. 9777, pp. 73–86. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41114-9_6
2. Brzozowski, J.A., Simon, I.: Characterizations of locally testable events. Discret. Math. 4(3), 243–271 (1973)
3. Campbell, R.H., Habermann, A.N.: The specification of process synchronization by path expressions. In: Gelenbe, E., Kaiser, C. (eds.) OS 1974. LNCS, vol. 16, pp. 89–102. Springer, Heidelberg (1974). https://doi.org/10.1007/BFb0029355
4. Câmpeanu, C., Salomaa, K., Yu, S.: Tight lower bound for the state complexity of shuffle of regular languages. J. Autom. Lang. Comb. 7(3), 303–310 (2002)
5. Diekert, V., Rozenberg, G. (eds.): The Book of Traces. World Scientific, River Edge (1995)
6. Ehrenfeucht, A., Haussler, D., Rozenberg, G.: On regularity of context-free languages. Theor. Comput. Sci. 27, 311–332 (1983)
7. Gao, Y., Moreira, N., Reis, R., Yu, S.: A survey on operational state complexity. J. Autom. Lang. Comb. 21(4), 251–310 (2017)
8. Cano Gómez, A., Álvarez, G.I.: Learning commutative regular languages. In: Clark, A., Coste, F., Miclet, L. (eds.) ICGI 2008. LNCS (LNAI), vol. 5278, pp. 71–83. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88009-7_6
9. Cano Gómez, A.: Inferring regular trace languages from positive and negative samples. In: Sempere, J.M., García, P. (eds.) ICGI 2010. LNCS (LNAI), vol. 6339, pp. 11–23. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15488-1_3
10. Gómez, A.C., Guaiana, G., Pin, J.: Regular languages and partial commutations. Inf. Comput. 230, 76–96 (2013)
11. Gruber, H., Holzer, M., Kutrib, M.: The size of Higman-Haines sets. Theor. Comput. Sci. 387(2), 167–176 (2007)
12. Gruber, H., Holzer, M., Kutrib, M.: More on the size of Higman-Haines sets: effective constructions. Fundam. Informaticae 91(1), 105–121 (2009)
13. Héam, P.: On shuffle ideals. RAIRO Theor. Inform. Appl. 36(4), 359–384 (2002)
14. Hoffmann, S.: Commutative regular languages – properties and state complexity. Inf. Comput. (submitted)
15. Hoffmann, S.: Commutative regular languages – properties and state complexity. In: Ćirić, M., Droste, M., Pin, J.É. (eds.) CAI 2019. LNCS, vol. 11545, pp. 151–163. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21363-3_13
16. Hoffmann, S.: State complexity investigations on commutative languages – the upward and downward closure, commutative aperiodic and commutative group languages. In: Han, Y.-S., Ko, S.-K. (eds.) DCFS 2021. LNCS, vol. 13037, pp. 64–75. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-93489-7_6
17. Hoffmann, S.: State complexity of projection on languages recognized by permutation automata and commuting letters. In: Moreira, N., Reis, R. (eds.) DLT 2021. LNCS, vol. 12811, pp. 192–203. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-81508-0_16
18. Jirásková, G., Masopust, T.: On a structural property in the state complexity of projected regular languages. Theor. Comput. Sci. 449, 93–105 (2012)
19. Karandikar, P., Niewerth, M., Schnoebelen, P.: On the state complexity of closures and interiors of regular languages with subwords and superwords. Theor. Comput. Sci. 610, 91–107 (2016)
20. Maslov, A.N.: Estimates of the number of states of finite automata. Dokl. Akad. Nauk SSSR 194(6), 1266–1268 (1970)
21. Mazurkiewicz, A.: Concurrent program schemes and their interpretations. Technical report, DAIMI Report Series 6(78) (1977)
22. Mazurkiewicz, A.: Parallel recursive program schemes. In: Bečvář, J. (ed.) MFCS 1975. LNCS, vol. 32, pp. 75–87. Springer, Heidelberg (1975). https://doi.org/10.1007/3-540-07389-2_183
23. Okhotin, A.: On the state complexity of scattered substrings and superstrings. Fundam. Informaticae 99(3), 325–338 (2010)
24. Pighizzini, G., Shallit, J.: Unary language operations, state complexity and Jacobsthal’s function. Int. J. Found. Comput. Sci. 13(1), 145–159 (2002)
25. Shaw, A.C.: Software descriptions with flow expressions. IEEE Trans. Softw. Eng. 4, 242–254 (1978)
26. Wong, K.: On the complexity of projections of discrete-event systems. In: Proceedings of WODES 1998, Cagliari, Italy, pp. 201–206 (1998)
27. Yu, S., Zhuang, Q., Salomaa, K.: The state complexities of some basic operations on regular languages. Theor. Comput. Sci. 125(2), 315–328 (1994)

Acknowledgement

I thank the anonymous referees of [14] (the extended version of [15]), whose feedback also helped in the present work. I also sincerely thank the referees of the present submission, who helped me a lot in identifying unclear or ungrammatical formulations and a missing definition.

Author information

Authors and Affiliations

Informatikwissenschaften, FB IV, Universität Trier, Universitätsring 15, 54296, Trier, Germany
Stefan Hoffmann

Corresponding author

Correspondence to Stefan Hoffmann .

Editor information

Editors and Affiliations

Yonsei University, Seoul, Korea (Republic of)
Yo-Sub Han
Kangwon National University, Chuncheon, Korea (Republic of)
Sang-Ki Ko

Cite this paper

Hoffmann, S. (2021). Commutative Regular Languages with Product-Form Minimal Automata. In: Han, YS., Ko, SK. (eds) Descriptional Complexity of Formal Systems. DCFS 2021. Lecture Notes in Computer Science, vol 13037. Springer, Cham. https://doi.org/10.1007/978-3-030-93489-7_5

DOI: https://doi.org/10.1007/978-3-030-93489-7_5
Published: 01 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93488-0
Online ISBN: 978-3-030-93489-7
eBook Packages: Computer Science, Computer Science (R0)
