1 Introduction

The state complexity of a regular operation is the number of states that is sufficient and necessary in the worst case for a deterministic finite automaton (DFA) to accept the language resulting from the operation, considered as a function of the sizes of DFAs for operands. The tight upper bounds on the complexity of union, intersection, concatenation, star, and reversal were presented by Maslov [12], Yu, Zhuang and Salomaa [17], and Leiss [11], and to describe witness languages, a binary alphabet was used.

If operands have some special properties, then the complexity of an operation may be significantly smaller. For example, the state complexity of star on prefix-free languages is n, while it is \(\frac{3}{4}2^n\) in the regular case. On the other hand, the complexity of intersection, union, concatenation, and star on star-free languages is the same as in the regular case [5]. This led to the investigation of operational complexity in several subclasses of regular languages. Operations on unary languages were studied by Yu et al. [17] and Pighizzini and Shallit [15], and on finite languages by Câmpeanu et al. [6]. The classes of prefix- and suffix-free languages were examined by Han et al. [7, 8], and the classes of bifix-, factor-, and subword-free, star-free, ideal, and closed languages by Brzozowski et al. [2,3,4,5], Except for reversal on star-free languages, the state complexity of basic regular operations in each of the above mentioned classes was determined. Bordihn, Holzer, and Kutrib [1] considered also some other subclasses, including combinational, finitely generated left ideal, symmetric definite, star, comet, two-sided comet, ordered, and power-separating languages, and investigated the complexity of the conversion of nondeterministic finite automata (NFAs) to deterministic ones. The closure properties of these classes, as well as nondeterministic operational complexity in them, were studied by Olejár et al. [9, 13].

Here we continue this research and study the state complexity of basic regular operations in several classes from [1]. For each considered operation and each considered class, we get a tight upper bound on its complexity. To get the upper bounds, we examine minimal DFAs for languages in considered classes. We show that every minimal DFA for a finitely generated left ideal must have two states that can be distinguished only by the empty string. This gives upper bounds for union, intersection, and reversal on finitely generated left ideals. We also show that the set of all non-final states in a minimal DFA for a power-separating language cannot be reachable in the reversed automaton. This gives an upper bound \(2^n-1\) for reversal on power-separating languages, and shows that the same lower bound for star-free languages [5, Theorem 7] is tight since every star-free language is power-separating. To get upper bounds for concatenation, we carefully inspect the reachable and distinguishable states in the subset automaton of an NFA for the concatenation of given languages. Finally, the star of a star language is the same language, and if a given language is symmetric definite, then its star differs from it only in the empty string. The corresponding upper bounds n and \(n+1\) for the star operation follow. To get lower bounds, we sometimes use witnesses known from the literature, and just prove that they belong to a considered class. However, most often, we describe appropriate witnesses so that the corresponding product automata for union and intersection, or subset automata of NFAs for concatenation, star, or reversal have a desired number of reachable and pairwise distinguishable states.

2 Preliminaries

We assume that the reader is familiar with the basic notions in formal languages and automata theory, and for details, we refer to [16].

Let \(\varSigma \) be a finite non-empty alphabet of symbols. Then \(\varSigma ^*\) denotes the set of all strings over \(\varSigma \) including the empty string \(\varepsilon \). For a finite set S, the notation |S| denotes the size of S and \(2^S\) denotes the power set of S. For two non-negative integers ij, the set \(\{i,i+1,\ldots ,j\}\) is denoted by [ij].

A nondeterministic finite automaton with multiple initial states (MNFA) is a quintuple \(A = (Q, \varSigma , \cdot , I, F)\) where Q is a finite non-empty set of states, \(\varSigma \) is a finite non-empty input alphabet, \(\cdot :Q \times \varSigma \rightarrow 2^Q\) is the transition function, \(I \subseteq Q\) is the set of initial states, and \(F \subseteq Q\) is the set of final states. The transition function can be extended to the domain \(2^Q \times \varSigma ^*\) in the natural way. For states pq and a symbol a, we sometimes write (paq) whenever \(q \in p\cdot a\). The language accepted by A is the set of strings \(L(A)= \{w \in \varSigma ^* \mid I \cdot w \cap F \ne \emptyset \}\).

The reverse of an MNFA \(A= (Q, \varSigma , \cdot , I, F)\) is the MNFA \(A^R = (Q, \varSigma , \cdot ^R, F, I)\) where \(q \cdot ^R a= \{p \mid q \in p \cdot a \}\). We say that a subset S of Q is reachable in A if there exists a string w such that \(S=I\cdot w\). A subset S is co-reachable in A if it is reachable in \(A^R\). We usually omit \(\cdot \) and write just juxtaposition qw instead of \(q\cdot w\). If \(S\subseteq Q\), then \(Sw = \{qw \mid q\in S\}\) and \(wS = \{q \mid qw\in S\}\).

An MNFA is a nondeterministic finite automaton (NFA) if \(|I|=1\). An NFA is a deterministic finite automaton (DFA) if \(|q a|=1\) for each state q and each symbol a. We usually write \(p a =q\) instead of \(p a =\{q\}\). A non-final state q of a DFA is called dead if \(qa=q\) for each symbol a. A DFA is minimal if all its states are reachable and pairwise distinguishable. The state complexity of a regular language L, \({{\,\textrm{sc}\,}}(L)\), is the number of states in a minimal DFA recognizing L. The state complexity of a k-ary regular operation f is the function from \(\mathbb {N}^k\) to \(\mathbb {N}\) defined as \((n_1, n_2, \ldots , n_k) \mapsto \max \{{{\,\textrm{sc}\,}}(f(L_1,L_2,\ldots ,L_k)) \mid {{\,\textrm{sc}\,}}(L_i)\le n_i \text { for each i}.\}\)

Every MNFA \(A = (Q, \varSigma , \cdot , I, F)\) can be converted to an equivalent deterministic automaton \(\mathcal {D}(A) = (2^Q, \varSigma , \cdot , I, \{S \in 2^Q \mid S \cap F \ne \emptyset \})\). The DFA \(\mathcal {D}(A)\) is called the subset automaton of A.

Finally, we recall the definitions of language classes considered in this paper. A language \(L\subseteq \varSigma \) is:

  • combinational (class abbreviation CB): if \(L=\varSigma ^*H\) for \(H\subseteq \varSigma \);

  • singleton (SGL): if it consists of one string;

  • finitely generated left ideal (FGLID): if \(L=\varSigma ^*H\) for some finite language H;

  • left ideal (LID): if \(L=\varSigma ^*L\);

  • symmetric definite (SYDEF): if \(L=G\varSigma ^*H\) for some regular languages GH;

  • star (STAR): if \(L=L^*\);

  • comet (COM): if \(L{=}G^*H\) for some regular languages GH with \(G{\notin }\{\emptyset ,\{\varepsilon \}\}\);

  • two-sided comet (2COM): if \(L{=}EG^*H\) for some regular languages EGH with \(G{\notin }\{\emptyset ,\{\varepsilon \}\}\);

  • ordered (ORD): if it is accepted by a (possibly non-minimal) DFA with ordered states such that \(p \preceq q\) implies \(p\cdot \sigma \preceq q\cdot \sigma \) for each symbol \(\sigma \);

  • star-free (STFR): if L is constructible from finite languages, concatenation, union, and complementation (equivalently, if L has an aperiodic DFA);

  • power-separating (PSEP): if for every w in \(\varSigma ^*\) there exists an integer k such that \(\bigcup _{i\ge k} \{w^i\} \subseteq L\) or \(\bigcup _{i\ge k} \{w^i\} \subseteq L^c\).

We have \(\text {CB}\subsetneq \text {FGLID}\subsetneq \text {LID}\subsetneq \text {SYDEF}\), \(\text {STAR}\setminus \big \{\{\varepsilon \}\big \}\subsetneq \text {COM}\subsetneq \text {2COM}\), and \(\text {SGL}\subsetneq \text {ORD}\subsetneq \text {STFR}\subsetneq \text {PSEP}\) [1, 13].

3 Combinational and Singleton Languages

We start with two simple subregular classes: the class of combinational languages and the class of singleton languages. The following two theorems present our results on the state complexity of basic operations in these two classes.

Theorem 1

Let K and L be combinational languages over \(\varSigma \). Then we have \({{\,\textrm{sc}\,}}(K),{{\,\textrm{sc}\,}}(L)\le 2\) and \({{\,\textrm{sc}\,}}(K\cap L),{{\,\textrm{sc}\,}}(K\cup L)\le 2\), \({{\,\textrm{sc}\,}}(KL)\le 3\), \({{\,\textrm{sc}\,}}(L^*)\le 2\), and \({{\,\textrm{sc}\,}}(L^R)\le 3\). All these upper bounds are tight if \(|\varSigma |\ge 2\) (star, reversal) or if \(|\varSigma |\ge 1\) (intersection, union, concatenation).

Proof

Let \(K=\varSigma ^*G\) and \(L=\varSigma ^*H\) with \(G,H\subseteq \varSigma \). If H is empty, then L is empty. Otherwise, L is accepted by a two-state DFA \((\{1,2\}, \varSigma , \cdot , 1, \{2\})\) with \(1a=2a=1\) if \(a\notin H\) and \(1a=2a=2\) if \(a\in H\). We only prove the result for star. Each string in \(\varSigma ^*\) either ends with a symbol in H, or ends with a symbol in \(\varSigma \setminus H\), or is empty. Since \(L=L^+\) we have \(L^* = \{\varepsilon \}\cup L = (\varSigma ^*(\varSigma \setminus H))^c\), so \(L^*\) is the complement of a combinational language. The upper bound 2 is met by \((a+b)^*a\).    \(\square \)

Theorem 2

Let K and L be singleton languages accepted by DFAs with m and n states, respectively. Then \({{\,\textrm{sc}\,}}(K\cap L)\le \min \{m,n\}\), \({{\,\textrm{sc}\,}}(K\cup L)\le m+n-3\), \({{\,\textrm{sc}\,}}(KL)\le m+n-2\), \({{\,\textrm{sc}\,}}(L^*)\le n-1\), and \({{\,\textrm{sc}\,}}(L^R)\le n\). All these bounds are tight, with witnesses described over an alphabet of size at least 1 (intersection, concatenation, reversal) or 2 (union, star).

Proof

We only prove the result for star. The minimal DFA for a singleton language \(\{v\}\) has \(|v|+2\) states, including the dead state. Next, the language \(\{v\}^*\) is accepted by a DFA with \(|v|+1\) states, possibly including the dead state. This gives the upper bound \(n-1\), which is met by the language \(\{a^{n-2}\}\) over the binary alphabet \(\{a,b\}\).    \(\square \)

4 Intersection and Union

In this section, we examine the intersection and union operations in subregular classes. Recall that the state complexity of these two operations is mn with binary witnesses [12, 17]. Our aim is to show that the complexity of these operations is \(mn-2\) in the class of finitely generated left ideals and mn in the remaining classes. Recall that if a DFA for a language L has a unique final state which is the initial one, then \(L=L^*\), so L is a star language. To get the upper bound in class FGLID, we use the following property.

Lemma 3

In the minimal DFA for a finitely generated left ideal different from \(\emptyset \) and \(\varSigma ^*\), there exist two states that are distinguishable only by the empty string.

Proof

Let A be a minimal DFA for a finitely generated left ideal L with the initial state 1. Assume, to get a contradiction, that any two distinct states of A can be distinguished by a non-empty string. Since \(L\notin \{\emptyset ,\varSigma ^*\}\), there is a symbol a such that \(1a\ne 1\). Now, the states 1 and 1a are distinguishable by a non-empty string \(w_1\). We must have \(1w_1\notin F\) and \(1aw_1\in F\), because \(1w_1\in F\) and \(1aw_1\notin F\) would mean that \(w_1\in L\) and \(aw_1\notin L\), so L would not be a left ideal. The resulting states are again distinguishable by a non-empty string \(w_2\), and again, we must have \(1w_1w_2\notin F\) and \(1aw_1w_2\in F\). Inductively, the states \(1w_1w_2\cdots w_{i-1}\) and \(1aw_1w_2\cdots w_{i-1}\) are distinguishable by a non-empty string \(w_i\), and we must have \(1w_1w_2\cdots w_{i-1}w_i\notin F\) and \(1aw_1w_2\cdots w_{i-1}w_i\in F\). Since the number of states is finite, there exist jk with \(j<k\) such that \(1aw_1w_2\cdots w_k = 1aw_1w_2\cdots w_j\). This means that \(aw_1w_2\cdots w_j(w_{j+1}w_{j+2}\cdots w_k)^* \subseteq L\), and on the other hand we have \(w_1w_2\cdots w_j(w_{j+1}w_{j+2}\cdots w_k)^* \subseteq L^c\). It follows that the minimal generator \(L\setminus \varSigma L\) of the left ideal L is infinite, a contradiction.    \(\square \)

Theorem 4

Let K and L be finitely generated left ideals over \(\varSigma \) accepted by an m-state and n-state DFA, respectively. Then \({{\,\textrm{sc}\,}}(K\cap L),{{\,\textrm{sc}\,}}(K\cup L)\le mn-2\). This upper bound is tight for union if \(|\varSigma |\ge 3\), and for intersection if \(|\varSigma |\ge 10\).

Proof

Let A and B be minimal DFAs for K and L, respectively. By Lemma 3, there exist states \(p,p'\) of A and \(q,q'\) of B that are distinguished only by the empty string. It follows that \(pa = p'a\) and \(qa = q'a\) for each input symbol a. Then, in the product automaton \(A\times B\), the pairs \((p,q), (p,q'), (p',q), (p',q')\) are sent to the same state by each input symbol, so, they may be distinguished only by \(\varepsilon \). This gives the upper bound \(mn-2\) for union and intersection.

We can prove that this upper bound for union and intersection is met by languages accepted by NFAs from Figs. 1 and 3, respectively; notice that equivalent DFAs shown in Figs. 2 and 4 have m and n states as well.    \(\square \)

Fig. 1.
figure 1

Finitely generated left ideal witnesses for union (\(mn-2\)).

Fig. 2.
figure 2

The reachable parts of the subset automata \(\mathcal {D}(M)\) and \(\mathcal {D}(N)\) of NFAs M and N from Fig. 1; we have \(p_i=[1,i]\) and \(q_j=[1,j]\).

Fig. 3.
figure 3

Finitely generated left ideal witnesses for intersection (\(mn-2\)).

Fig. 4.
figure 4

The reachable parts of the subset automata \(\mathcal {D}(M)\) and \(\mathcal {D}(N)\) of NFAs M and N from Fig. 3; we have \(p_i=\{1,i\}\) and \(q_j=\{1,j\}\).

Theorem 5

The state complexity of intersection and union on the classes of symmetric definite, ordered, power-separating, and star languages is mn.

Proof

The upper bound mn is met by the intersection of binary left ideal languages \(K=\{w\in \{a,b\}^* \mid |w|_a\ge m-1\}\) and \(L=\{w\in \{a,b\}^* \mid |w|_b\ge n-1\}\) [3, Theorem 7], and by the union of quaternary left ideals from [3, Theorem 8]. Every left ideal is symmetric definite. Next, the languages K and L are ordered (so also power-separating), and these classes are closed under complementation. This gives the complexity mn with binary witnesses on ORD and PSEP. Finally, to get star witnesses, we consider the minimal DFAs, the first of which counts the number of a’s modulo m, and the second the number of b’s modulo n.    \(\square \)

Remark 6

Since \(\text {STAR}\subseteq \text {COM}\subseteq \text {2COM}\), the previous theorem gives complexity mn for intersection and union on comet and two-sided comet languages.    \(\square \)

5 Concatenation

Now we examine the concatenation operation, the state complexity of which is \(m2^n-2^{n-1}\) with binary witnesses [12]. We first consider the concatenation operation on finitely generated left ideal, symmetric definite, and star languages, and show that the resulting complexities are always smaller than the regular one. On the other hand, in the remaining classes, the complexity is the same as in the regular case. To get an upper bound for SYDEF, we use the following observation, cf. [14, Theorem 9.2].

Lemma 7

Every minimal DFA for a symmetric definite language contains a state \(\overline{q}\) such that each accepting computation goes through \(\overline{q}\) and a loop on every input symbol can be added in \(\overline{q}\) without changing the language.    \(\square \)

Remark 8

As shown in [3, Theorem 9], the upper bound on the complexity of concatenation on left ideals is \(m+n-1\). This upper bound is shown to be met by unary languages \(a^*a^{m-1}\) and \(a^*a^{n-1}\) in the proof of [3, Theorem 9]. These unary languages are finitely generated left ideals. This gives the same complexity of concatenation on FGLID.    \(\square \)

Theorem 9

Let \(K,L\subseteq \varSigma ^*\) be accepted by an m-state and n-state DFA, respectively. If K and L are symmetric definite, then \({{\,\textrm{sc}\,}}(KL)\le m2^{n-1} - 2^{n-2} + 1\). If K and L are star languages, then \({{\,\textrm{sc}\,}}(KL)\le m2^{n} - 2^{n-1} - m + 1\). Both upper bounds are tight if \(|\varSigma |\ge 3\).

Proof

Let \(A=(Q_A,\varSigma ,\cdot _A,s_A,F_A)\) and \(B=(Q_B,\varSigma ,\cdot _B,s_B,F_B)\) be the minimal DFAs for K and L, respectively.

(a) By Lemma 7, there is a state \(\overline{q}\) of B such that every accepting computation of B goes through the state \(\overline{q}\) and, moreover, we can add a loop on every input symbol in \(\overline{q}\) without changing the language; denote the resulting NFA by \(B'\). Construct the MNFA N for KL from automata A and \(B'\) by adding the transition \((q,a,s_B)\) whenever \(qa\in F_A\), the initial set is \(\{s_A\}\) if \(s_A\notin F_A\) and \(\{s_A,s_B\}\) otherwise, and the set of final states is \(F_B\). Each reachable subset of the subset automaton of N is of the form \(\{p\}\cup S\) with \(p\in Q_A\) and \(S\subseteq Q_B\); let us represent it by the pair (pS). Recall that \(p\in F_A\) implies \(s_B\in S\) for each reachable state (pS).

If \(\overline{q}\) is the initial state of B, then L is a left ideal, and the complexity of KL is at most \(m+n-1\); cf. [3, Theorem 9]. If \(\overline{q}\) is final, then in the corresponding subset automaton, each state \((p, S\cup \{\overline{q}\})\) is equivalent to the state \((p, \{\overline{q}\})\). This gives the upper bound \(m2^{n-1}-2^{n-2}+1\). Finally, let \(\overline{q}\) be neither initial nor final. Let \(Q_1\) be the set of states in \(Q_B\) reachable without going through \(\overline{q}\) and let \(Q_2 = Q_B\setminus (Q_1\cup \{\overline{q}\})\). Let \(S_1\subseteq Q_1\) and \(S_2\subseteq Q_2\). Every accepting computation of B goes through \(\overline{q}\), and \(\overline{q}\) has a loop on every input symbol in \(B'\). Let us show that each state \((p, S_1\cup \{\overline{q}\}\cup S_2)\) is equivalent to \((p,\{\overline{q}\}\cup S_2)\). It is enough to shown that each string accepted in \(B'\) from a state in \(S_1\) is also accepted from a state in \(\{\overline{q}\}\cup S_2\). Let a string w be accepted from \(S_1\). Then the computation on w must go through \(\overline{q}\), so \(w=uv\) where u leads a state in \(S_1\) to the state \(\overline{q}\) and v is accepted from \(\overline{q}\). Since \(\overline{q}\) has a loop on each symbol in \(B'\), the string uv is also accepted from \(\overline{q}\). Hence w is accepted from \((p,\{\overline{q}\}\cup S_2)\). This gives an upper bound \((m-1)(2^{|Q_1|} + 2^{|Q_2|}) + 2^{|Q_1|-1}+2^{|Q_2|} +1 \le (m-1)(2^{n-1}) + 2^{n-2} +1\).

For tightness, consider the DFAs A and B from Fig. 5. The automaton A recognizes a left ideal since, after adding a loop on each input symbol in its initial state and performing determinization and minimization, we get the isomorphic automaton. The automaton B recognizes a right ideal since it has a loop on each input symbol in its unique final state. Thus both automata recognize symmetric definite languages, and we can show that they meet the desired upper bound.

Fig. 5.
figure 5

Symmetric definite witnesses for concatenation (\(m2^{n-1}-2^{n-2}+1\)).

(b) If K and L are star languages, then \(\varepsilon \in K\), so the initial state of the DFA for K is final. It follows that the MNFA for KL has two initial states, and so the initial state of the corresponding subset automaton is \((s_A,\{s_B\})\). This means that the states \((p,\emptyset )\) are unreachable. This gives the desired upper bound.

For tightness, consider the DFAs A and B from Fig. 6. Since the initial state is the unique final state, both automata recognize star languages.

Fig. 6.
figure 6

Star witnesses for concatenation (\(m2^n-2^{n-1}-m+1\)).

Construct the MNFA M for L(A)L(B) from A and B by adding the transitions \((q_1,b,1)\) and \((q_m,a,1)\) and making the state \(q_1\) non-final; the set of initial states is \(\{q_1, 1\}\). Notice that in the reversed automaton \(M^R\), each singleton \(\{j\}\) is reached from \(\{1\}\) by a string in \(cb^*\), and each singleton \(\{q_i\}\) is reached by a string in \(ba^*\). It follows that for each state p of M, there is a string \(w_p\) such that \(\{p\}\) is reached from \(\{1\}\) by \(w_p\) in \(M^R\). It follows that the string \(w_p^R\) is accepted by M from and only from the state p. This means that all states of the subset automaton \(\mathcal {D}(M)\) are pairwise distinguishable. Let us denote the (reachable) state \(\{q_i\}\cup S\) with \(S\subseteq [1,n]\) by (iS), and let us show that the set of reachable states is \(\{(1,S) \mid S\subseteq [1,n] \text { with }1\in S\} \cup \{(i,S) \mid i=2,3,\ldots ,m \text { and }S\subseteq [1,n] \text { with }|S|\ge 1\}\). The initial state is \((1,\{1\})\) and it is sent to \((i,\{j\})\) by \(a^{i-1} b^{j-1}\). Next, each (iS) with \(1\in S\) is reached from \((m,S\setminus \{1\})\) by \(a^{i}\) and each (iS) with \(i\ne 1\), \(S\ne \emptyset \), and \(\min S = j\ge 2\) is reached from a state \((i,b^{j-1}S)\) with \(1\in b^{j-1}S\) by \(b^{j-1}\).    \(\square \)

Remark 10

If A and B are the binary DFAs shown in Fig. 7, cf. [12], then we have \({{\,\textrm{sc}\,}}(L(A)L(B))=m2^n-2^{n-1}\) since in the NFA for concatenation, each singleton set is co-reachable, and the reachability of the desired number of states is shown similarly as in Theorem 9(b). Since \(L(A)=b^*L(A)\) and \(L(B)=a^*L(B)\), automata A and B recognize comet languages. Moreover we have COM \(\subseteq \) 2COM, so the state complexity of concatenation on comets and two-sided comets is the same as in the class of regular languages.    \(\square \)

Fig. 7.
figure 7

Comet witnesses for concatenation (\(m2^n-2^{n-1}\)).

Remark 11

The star-free witness languages that meet the regular upper bound for concatenation are described in [5, Theorem 2], and they are recognized by the DFAs shown in Fig. 8. Notice that both these DFAs are ordered. Since we have \(\text {ORD}\subseteq \text {STFR}\subseteq \text {PSEP}\), the state complexity of concatenation on ordered and power-separating languages is \(m2^n-2^{n-1}\).    \(\square \)

Fig. 8.
figure 8

Ordered witnesses for concatenation (\(m2^n-2^{n-1}\)).

6 Star and Reversal

In this section, we consider the star and reversal operations. The state complexity of star is \(\frac{3}{4}2^n\) with binary witnesses [17]. Our aim is to show that the complexity of star in our classes is always the same as in the regular case, except for classes of finitely generated left ideal, symmetric definite, and star languages, where it is \(n+1\), \(n+1\), and n, respectively. Recall that \(\text {FGLID}\subseteq \text {SYDEF}\).

Theorem 12

Let \(n\ge 3\). Let L be a symmetric definite language accepted by an n-state DFA. Then \(L^*\) is accepted by a DFA with \(n+1\) states. This upper bound is met by the finitely generated left ideal \(L=(a+b)^*a^{n-1}\).

Proof

If \(L\in \text {SYDEF}\) then \(L=G\varSigma ^*H\). Hence \(L^+ = L L^* =G\varSigma ^*H (G\varSigma ^*H)^* = G\varSigma ^* (HG\varSigma ^*)^* H = G\varSigma ^*H = L\). It follows that \(L^* = \{\varepsilon \}\cup L\). This gives the upper bound \(n+1\). The reader may verify that \(L=(a+b)^*a^{n-1}\) is accepted by an n-state DFA and \({{\,\textrm{sc}\,}}(\{\varepsilon \}\cup L)=n+1\).    \(\square \)

Remark 13

If L is a star language, then \(L=L^*\), so \({{\,\textrm{sc}\,}}(L^*)={{\,\textrm{sc}\,}}(L)\).    \(\square \)

Remark 14

The regular witness for star [17, Theorem 3.3] is recognized by the DFA A from Fig. 9. Since \(L(A)=b^*L(A)\), the language L(A) is a comet, so also a two-sided comet. Thus the complexity of star on COM and 2COM is \(\frac{3}{4}2^n\).    \(\square \)

Fig. 9.
figure 9

A comet witness for star (\(\frac{3}{4}2^n\)).

Remark 15

The regular upper bound \(\frac{3}{4}2^n\) is met by the quaternary star-free language [5, Theorem 6] which is accepted by the DFA shown in Fig. 10. Since this DFA is ordered, and every ordered language is power-separating, the state complexity of star on ORD and PSEP is the same as in the regular case.    \(\square \)

Fig. 10.
figure 10

An ordered witness for star; transitions (idn) for each i are not shown (\(\frac{3}{4}2^n\)).

We conclude our investigation with the reversal operation, the state complexity of which is \(2^n\) with binary witnesses [11]. First, using Lemmas 3 and 7, we get tight upper bounds for finitely generated left ideals and symmetric definite languages. Then, by proving that the set of all non-final states of a DFA accepting a power-separating language cannot be co-reachable, we get the tight upper bound \(2^n-1\) for reversal on ordered, star-free, and power-separating languages, which shows that a lower bound for star-free languages [5, Theorem 7] is tight. Finally, we show that the complexity of reversal on star, comet, and two-sided comet languages is the same as in the regular case.

Theorem 16

Let L be a finitely generated left ideal over \(\varSigma \) accepted by an n-state DFA. Then \({{\,\textrm{sc}\,}}(L^R)\le 2^{n-2}+2\), and this bound is tight if \(|\varSigma |\ge 2^{n-2}+2\).

Proof

Let A be the minimal DFA for L. By Lemma 3, A has two states \(q,q'\) with \(qa=q'a\) for each input a, and such that exactly one of them, say q, is final. Since L is a left ideal, adding a loop on each symbol in the initial state of A does not change the language; denote the resulting NFA by N. Thus all final sets of the subset automaton of \(N^R\) are equivalent. Moreover, except for possible initial set \(\{q\}\), each reachable set contains either both \(q,q'\), or none of them.

For tightness, let \(\varSigma =\{a,b\} \cup \{c_S\mid S\subseteq [2,n-1]\}\). Consider an n-state NFA \(N=([1,n],\varSigma ,\cdot ,1,\{n\})\) where \(1\cdot a=\{1,2\}\), \(1\cdot b = \{1\}\) and \(i\cdot b =\{i+1\}\) if \(2\le i \le n-1\), \(1\cdot c_S=\{1\}\) and \(i\cdot c_S=\{n\}\) if \(i\in S\), and all the remaining transitions go to the empty set. The NFA N recognizes a finitely generated left ideal since state 1 has a loop on each symbol and for any other transition \((i,\sigma ,j)\) we have \(i<j\). Since the subset automaton of N has n reachable states, the language L(N) is accepted by an n-state DFA. In the reverse \(N^R\), the initial set is \(\{n\}\), and it is sent to a subset S of \([2,n-1]\) by \(c_S\). The set \(\{1\}\) is reached from \(\{2\}\) by a. For distinguishability, notice that \(\{1\}\) is the unique final state. If \(S,T\subseteq [2,n]\) and \(s\in S\setminus T\), then \(b^{s-2}a\) sends S to the final set \(\{1\}\) and it sends T to the empty set.    \(\square \)

Theorem 17

Let \(L\subseteq \varSigma ^*\) be a symmetric definite language accepted by an n-state DFA. Then \({{\,\textrm{sc}\,}}(L^R)\le 2^{n-1}+1\), and this upper bound is tight if \(|\varSigma |\ge 3\).

Proof

We get an upper bound in a similar way as in the proof of Theorem 9(a). The ternary left ideal from [3, Theorem 11(2), Fig. 17] meets this bound.    \(\square \)

Theorem 18

Let L be a power-separating language accepted by an n-state DFA. Then \({{\,\textrm{sc}\,}}(L^R)\le 2^n-1\), and this bound is met by an ordered language over an alphabet of size \(n-1\).

Proof

Let \(A=(Q,\varSigma ,\cdot ,s,F)\) be a minimal DFA for a power-separating language L. Our aim is to show that the set \(Q\setminus F\) of all non-final states of A cannot be co-reachable in A. Assume that \(Q\setminus F\) is co-reachable by a string w. Since w sends F to \(Q\setminus F\) in \(A^R\), the string w must be rejected by A from each final state, and accepted from each non-final state, so there is no \(k\ge 0\) such that \(\bigcup _{i\ge k}\{w^i\}\subseteq L\) or \(\bigcup _{i\ge k}\{w^i\}\subseteq L^c\). This means that L is not power-separating, a contradiction.

To get tightness, let \(\varSigma =\{a,b\}\cup \{c_i \mid i=3,4,\ldots ,n-1\}\). Consider the n-state DFA \(A=([1,n], \varSigma , \cdot , 1, F)\) where \(F=\{i\in [1,n] \mid i\bmod 2=0\}\), and for each i in [1, n] and each \(j=3,4,\ldots ,n-1\), let \(i \cdot a = i+1\) if \(i\le n-1\) and \(n\cdot a = n\), \(i \cdot b = i-1\) if \(i\ge 2\) and \(1\cdot b = 1\), \(i \cdot c_j = i\) if \(i\ne j\) and \(j\cdot c_j = j-1\). Then A is ordered. It is shown in [5, Theorem 7] that every DFA for the reverse of L(A) has at least \(2^n-1\) states.    \(\square \)

Remark 19

The DFA A from Fig. 11 differs from Šebej’s witness for reversal meeting the upper bound \(2^n\) [10, Theorem 5] only in the set of final states, and Šebej’s proof works for this DFA as well. Since the initial state of A is a unique final state, A recognizes a star, so also comet and two-sided comet, language.    \(\square \)

Fig. 11.
figure 11

A star witness for reversal (\(2^n\)).

7 Conclusion

We investigated the state complexity of basic regular operations in the classes of combinational, singleton, finitely generated left ideal, symmetric definite, star, comet, two-sided comet, ordered, and power-separating languages. Our results are summarized in Table 1 where the sizes of alphabets used to describe witnesses are given by numbers in parentheses. Except for the ordered and finitely generated left ideal witnesses for reversal, all the remaining witnesses are described over a fixed alphabet, and whenever a binary alphabet is used, it is always optimal. By showing that the upper bound for reversal on power-separating languages is \(2^n-1\), we proved that the same lower bound for reversal on star-free languages from [5, Theorem 7] is tight.

In [1], the classes of definite, generalized definite, locally testable, and strictly locally testable languages were also considered. We leave the operational complexity in these classes for future research. The optimality of alphabet sizes greater than two remains open as well.

Table 1. Operational complexity in subregular classes; we have \(\bullet = m2^{n-1}-2^{n-2}+1\) and \(\circ = m2^{n}-2^{n-1}-m+1\).