Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

The state complexity of a regular language L, denoted \({\text {sc}}(L)\), is the least number of states needed to recognize the language with a deterministic finite automaton. The state complexity of an operation on regular languages is the worst-case state complexity of the result of the operation, expressed as a function of the maximal allowed state complexity of the input languages.

To establish the state complexity of an operation, there are two steps. First, one derives an upper bound on the state complexity. Next, one searches for witnesses, that is, families of languages which attain the upper bound. One must not only find these witnesses but also prove that the desired state complexity bound is reached. Such proofs are the subject of this paper.

We are interested in the case where the operation is concatenation of languages. We assume that one is working within some subclass of the regular languages, and has derived an upper bound f(mn) for the worst-case state complexity of concatenation within this subclasses. We also assume one has found candidate witnesses for this upper bound, in the form of two sequences of languages \((L_m : m \ge 1)\) and \((K_n : n \ge 1)\) such that \({\text {sc}}(L_m) \le m\) and \({\text {sc}}(K_n) \le n\). The goal is to prove that for each pair (mn), the concatenation \(L_mK_n\) has state complexity f(mn). We may divide such a proof into three steps:

  1. 1.

    Construct a deterministic automaton \({\mathcal A}\) for \(L_mK_n\) in the standard way.

  2. 2.

    Show that \({\mathcal A}\) contains at least f(mn) reachable states.

  3. 3.

    Show exactly f(mn) reachable states in \({\mathcal A}\) are pairwise distinguishable.

We present a new technique for dealing with step (2) of this process. The standard way to construct a deterministic finite automaton \({\mathcal A}\) for the concatenation of two languages yields an automaton in which the states are sets; to show a particular set is reachable, one typically proceeds by induction on the size of the set. We prove a result that seems to generalize many of these ad-hoc induction arguments, and can be used to establish reachability of sets without directly using induction. Additionally, we prove some helpful lemmas that make our main result easier to apply.

We have tested our technique by applying it to a variety of concatenation witnesses taken from the literature. The state complexity of concatenation has been studied in the class of all regular languages, as well as many subclasses. Table 1 lists some examples of subclasses that have been studied, and the state complexity of concatenation in each subclass. See the cited papers for definitions of each subclass and derivations/proofs of each complexity.

Table 1. Subclasses of regular languages and the state complexity of the concatenation operation within each subclass. Bold type indicates that the complexity grows exponentially in terms of n.

If the state complexity of concatenation grows exponentially with n (indicated in Table 1 by bold type), it is typical to use an induction argument to prove the desired number of states is reachable. It is cases like this in which our technique is most likely to be useful. We selected 16 concatenation witnesses, all from subclasses in which the state complexity of concatenation is exponential in n, and tried to apply our technique to these witnesses. (In the interest of space, we present just two of these applications; the other 14 can be found in the arXiv version of this paper [11].) In many cases we were able to produce shorter and simpler proofs than the original authors, and we only found two cases in which our technique did not work or was not useful. This suggests that our technique is widely applicable and should be considered as a viable alternative to the traditional induction argument when attempting reachability proofs in concatenation automata.

The rest of the paper is structured as follows. Section 2 contains background material and definitions needed to understand the paper. Section 3 describes our new technique and proves the relevant results. Section 4 concludes the paper.

2 Preliminaries

2.1 Relations and Functions

A binary relation \(\rho \) between X and Y is a subset of \(X \times Y\). If \(\rho \subseteq X \times Y\) and \(\tau \subseteq Y \times Z\), the composition of \(\rho \) and \(\tau \) is the relation

$$\begin{aligned} \rho \tau = \{(x,z) \in X \times Z : \text { there exists } y \in Y \text { such that } (x,y) \in \rho \text { and } (y,z) \in \tau \}. \end{aligned}$$

For \(x \in X\) and \(\rho \subseteq X \times Y\), the image of x under \(\rho \) is the set \(x\rho = \{ y \in Y : (x,y) \in \rho \}\). For \(x \not \in X\) we define \(x\rho = \emptyset \). The converse of a binary relation \(\rho \subseteq X \times Y\) is the relation \(\rho ^{-1} = \{ (y,x) : (x,y) \in \rho \} \subseteq Y \times X\). The set \(y\rho ^{-1} = \{ x \in X : (x,y) \in \rho \}\) is called the preimage of y under \(\rho \). Elements of this set are called preimages of y; for example, if \(x \in y\rho ^{-1}\) we say that x is a preimage of y.

If we write \({\mathcal P}(S)\) for the power set of a set S (that is, the set of all subsets of S), then we can view \(\rho \) as a map \(\rho :X {\rightarrow }{\mathcal P}(Y)\). We may also extend \(\rho \) by union to a map \(\rho :{\mathcal P}(X) {\rightarrow }{\mathcal P}(Y)\) as follows: for \(S \subseteq X\), we define

$$\begin{aligned} S\rho = \bigcup _{s \in S} s\rho . \end{aligned}$$

We thus have two ways to make sense of an expression like \(x\rho \tau \): it is the image of x under the composite relation \(\rho \tau \subseteq X \times Z\), and it is also the image of the set \(x\rho \subseteq Y\) under the map \(\tau :{\mathcal P}(Y) {\rightarrow }{\mathcal P}(Z)\). Additionally, we have a way to make sense of a composition \(\rho \tau :X {\rightarrow }{\mathcal P}(Z)\) of maps \(\rho :X {\rightarrow }{\mathcal P}(Y)\) and \(\tau :Y {\rightarrow }{\mathcal P}(Z)\): take the composition of the corresponding relations.

A function \(f :X {\rightarrow }Y\) is a binary relation \(f \subseteq X \times Y\) such that \(|xf| = 1\) for all \(x \in X\). Following our notation for binary relations, we write functions to the right of their arguments. Composition of functions is defined by composing the corresponding relations. Thus the order of composition is left-to-right; in a composition fg, first f is applied and then g.

A transformation of a set X is a function \(t :X {\rightarrow }X\), that is, a function from X into itself. We say t is a permutation of X if \(Xt = X\). We say t acts as a permutation on \(S \subseteq X\) if \(St = S\). If t acts as a permutation on S, then every element of S has at least one preimage under t, that is, for all \(s \in S\), the set \(st^{-1} = \{ x \in X : xt = s\}\) is non-empty.

A cyclic permutation of a set \(\{x_1,\cdots ,x_k\} \subseteq X\) is a permutation p such that \(x_ip = x_{i+1}\) for \(1 \le i < k\), \(x_kp = x_1\), and \(xp = x\) for all \(x \in X \setminus \{x_1,\cdots ,x_k\}\). We denote such a permutation as \((x_1,\cdots ,x_k)\). A transposition is a cyclic permutation of a two-element set. We denote the identity transformation by \(\text {id}\).

The notation \((S {\rightarrow }x)\) for \(S \subseteq X\) and \(x \in X\) denotes a transformation that sends every element of S to x and fixes every element of \(S \setminus X\). For example, \((\{i\} {\rightarrow }j)\) maps i to j and fixes everything else, and \((X {\rightarrow }x)\) is a constant transformation that maps every element of X to x. In the case where \(X = \{1,2,\cdots ,n\}\), the notation \(({}_i^j x {\rightarrow }x+1)\) denotes a transformation such that for each x with \(i \le x \le j\), the transformation sends x to \(x+1\), and every other x is fixed. For example, the transformation \(({}_2^{n-1} x {\rightarrow }x+1)\) fixes 1, sends x to \(x+1\) for \(2 \le x \le n-1\), and fixes n. The notation \(({}_i^j x {\rightarrow }x-1)\) is defined similarly.

2.2 Automata

A finite automaton (FA) is a tuple \({\mathcal A}= (Q,\varSigma ,T,I,F)\) where Q is a finite set of states, \(\varSigma \) is a finite set of letters called an alphabet, \(T \subseteq Q \times \varSigma \times Q\) is a set of transitions, \(I \subseteq Q\) is a set of initial states, and \(F \subseteq Q\) is a set of final states.

We now define a binary relation \(T_w \subseteq Q \times Q\) for each \(w \in \varSigma ^*\). Define \(T_{\varepsilon }= \{(q,q) : q \in Q\}\); in terms of maps, this is the identity map on Q. For \(a \in \varSigma \), define \(T_a = \{ (p,q) \in Q \times Q : (p,a,q) \in T\}\). For \(w = a_1\cdots a_k\) with \(a_1,\cdots ,a_k \in \varSigma \), define \(T_w = T_{a_1} \cdots T_{a_k}\). The relation \(T_w\) is called the relation induced by w. The set \(\{T_w : w \in \varSigma ^*\}\) is a monoid under composition, called the transition monoid of \({\mathcal A}\). If w is a word but is not a word over \(\varSigma \), we define \(T_w\) to be the empty relation. We sometimes write \(p \xrightarrow {w} q\) to mean \(q \in pT_w\).

If \({\mathcal A}= (Q,\varSigma ,T,I,F)\) is a finite automaton such that \(|I| = 1\) and \(T_a\) is a function for each \(a \in \varSigma \), we say \({\mathcal A}\) is deterministic. We abbreviate “deterministic finite automaton” to DFA.

Let \({\mathcal A}= (Q,\varSigma ,T,I,F)\) be an FA. A word \(w \in \varSigma ^*\) is accepted by \({\mathcal A}\) if we have \(IT_w \cap F \ne \emptyset \). If \({\mathcal A}\) is a DFA with \(I = \{i\}\), this condition becomes \(iT_w \in F\). The language of \({\mathcal A}\), denoted \(L({\mathcal A})\), is the set of all words it accepts. A language recognized by an FA is called a regular language.

Given two regular languages L and K with DFAs \({\mathcal A}= (Q^{\mathcal A},\varSigma ^{\mathcal A},T^{\mathcal A},i^{\mathcal A},F^{\mathcal A})\) and \({\mathcal B}= (Q^{\mathcal B},\varSigma ^{\mathcal B},T^{\mathcal B},i^{\mathcal B},F^{\mathcal B})\), we may construct an FA \({\mathcal A}{\mathcal B}= (Q,\varSigma ,T,I,F)\) that accepts the concatenation LK as follows:

  • \(Q = Q^{\mathcal A}\cup Q^{\mathcal B}\). We assume without loss of generality that \(Q^{\mathcal A}\cap Q^{\mathcal B}= \emptyset \).

  • \(\varSigma = \varSigma ^{\mathcal A}\cup \varSigma ^{\mathcal B}\).

  • \(T = T^{\mathcal A}\cup T^{\mathcal B}\cup \{(q,a,i^{\mathcal B}) : q T^{\mathcal A}_a \in F^{\mathcal A}, a \in \varSigma ^{\mathcal A}\}\).

  • \(I = \{i^{\mathcal A}\}\) if \(i^{\mathcal A}\not \in F^{\mathcal A}\), and otherwise \(I = \{i^{\mathcal A},i^{\mathcal B}\}\).

  • \(F = F^{\mathcal B}\).

Next, we convert the FA \({\mathcal A}{\mathcal B}= (Q,\varSigma ,T,I,F)\) to a DFA recognizing the same language. We apply the usual subset construction to obtain the DFA \({\mathcal C}= ({\mathcal P}(Q),\varSigma ,T^{\mathcal C},I,F^{\mathcal C})\), where \(T^{\mathcal C}= \{ (S,a,ST_a) : S \subseteq Q, a \in \varSigma \}\), and \(S \subseteq Q\) is in \(F^{\mathcal C}\) if \(S \cap F \ne \emptyset \). We call \({\mathcal C}\) the concatenation DFA for \({\mathcal A}\) and \({\mathcal B}\).

We make some observations and introduce some conventions to make it easier to work with the concatenation DFA.

  • Since we are assuming \({\mathcal A}\) and \({\mathcal B}\) are DFAs, the only reachable states in \({\mathcal C}\) have the form \(S^{\mathcal A}\cup S^{\mathcal B}\), where \(S^{\mathcal A}\subseteq Q^{\mathcal A}\), \(S^{\mathcal B}\subseteq Q^{\mathcal B}\), and \(|S^{\mathcal A}| \le 1\). Without loss of generality, we can assume the state set of \({\mathcal C}\) consists of states of this form, rather than all of \({\mathcal P}(Q)\).

  • We mark the states of \({\mathcal A}\) with primes so they can be distinguished from the states of \({\mathcal B}\). So a variable named p or q generally means an element of \(Q^{\mathcal B}\), while \(p'\) or \(q'\) means an element of \(Q^{\mathcal A}\).

  • We identify the set \(S^{\mathcal A}\cup S^{\mathcal B}\) with the ordered pair \((S^{\mathcal A},S^{\mathcal B})\). Hence we can view the states of \({\mathcal C}\) as these ordered pairs. Reachable states are either of the form \((\emptyset ,S)\) or \((\{q'\},S)\) with \(q' \in Q^{\mathcal A}\), \(S \subseteq Q^{\mathcal B}\).

  • For convenience, we frequently make no distinction between singleton sets and the elements they contain, and so write \((q',S)\) rather than \((\{q'\},S)\).

  • Rather than \(T_w\), \(T^{\mathcal A}_w\) and \(T^{\mathcal B}_w\), we simply write w when it is clear from context which relation is meant. For example, \((q',S)w\) means \((q',S)T_w\) since \((q',S)\) is a state of \({\mathcal C}\), and thus \(T_w\) is the natural relation to apply. From our convention for marking the states of \({\mathcal A}\) and \({\mathcal B}\) with primes, one can infer that \(q'w\) means \(q'T^{\mathcal A}_w\) and qw means \(qT^{\mathcal B}_w\).

  • Rather than \(i^{\mathcal A}\) and \(i^{\mathcal B}\), let \(1'\) denote the initial state of \({\mathcal A}\) and let 1 denote the initial state of \({\mathcal B}\). We also assume without loss of generality that \(Q^{\mathcal A}= \{1',2',\cdots ,m'\}\) and \(Q^{\mathcal B}= \{1,2,\cdots ,n\}\) for some m and n.

Under these conventions, the transitions of \({\mathcal C}\) can be described as follows:

$$\begin{aligned} (q',S)a = {\left\{ \begin{array}{ll} (\emptyset ,Sa), &{} \text {if }a \in \varSigma ^{\mathcal B}\setminus \varSigma ^{\mathcal A}; \\ (q'a,\emptyset ), &{} \text {if }a \in \varSigma ^{\mathcal A}\setminus \varSigma ^{\mathcal B}\text { and }q'a \not \in F^{\mathcal A}; \\ (q'a,1), &{} \text {if }a \in \varSigma ^{\mathcal A}\setminus \varSigma ^{\mathcal B}\text { and } q'a \in F^{\mathcal A}; \\ (q'a,Sa), &{} \text {if }a \in \varSigma ^{\mathcal A}\cap \varSigma ^{\mathcal B}\text { and } q'a \not \in F^{\mathcal A}; \\ (q'a,Sa \cup 1), &{} \text {if }a \in \varSigma ^{\mathcal A}\cap \varSigma ^{\mathcal B}\text { and } q'a \in F^{\mathcal A}. \end{array}\right. } \end{aligned}$$

Recall that \(T^{\mathcal A}_w\) is the empty relation if w is not a word over \(\varSigma ^{\mathcal A}\), and a similar statement can be made for \({\mathcal B}\). Thus the transitions admit a simpler description:

$$\begin{aligned} (q',S)a = {\left\{ \begin{array}{ll} (q'a,Sa \cup 1), &{} \text {if }a \in \varSigma ^{\mathcal A}\text { and } q'a \in F^{\mathcal A};\\ (q'a,Sa), &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

2.3 State Complexity

We say a DFA \({\mathcal A}\) is minimal if it has the least number of states among all DFAs that recognize \(L({\mathcal A})\). It is well known that each regular language has a unique minimal DFA (up to renamings of the states). The state complexity of a regular language L, denoted \({\text {sc}}(L)\), is the number of states in its minimal DFA.

The following characterization of minimality is useful. Let \({\mathcal D}= (Q,\varSigma ,T,i,F)\) be a DFA. A state \(q \in Q\) is reachable if \(iw = q\). For \(p,q \in Q\), we say q is reachable from p if \(pw = q\). Two states \(p,q \in Q\) are indistinguishable if they are equivalent under the following equivalence relation: \(p \sim q\) if for all \(w \in \varSigma ^*\), we have \(pw \in F \iff qw \in F\). Otherwise they are distinguishable by some word w such that \(pw \in F \iff qw \not \in F\). A DFA is minimal if and only if all of its states are reachable and pairwise distinguishable.

Let \(\circ \) be a binary operation on regular languages. The state complexity of the operation \(\circ \) is the following function, where m and n are positive integers:

$$\begin{aligned} (m,n) \mapsto \max \{ {\text {sc}}(L \circ K) : {\text {sc}}(L) \le m, {\text {sc}}(K) \le n\}. \end{aligned}$$

This is the worst-case state complexity of the result of the operation, expressed as a function of the maximal allowed state complexities of the input languages. When computing state complexity of operations we assume the input languages L and K are languages over the same alphabet. The case where the inputs are allowed to have different alphabets has been studied by Brzozowski [4]. An example of how to apply our results in this case is given in [11, Theorem 4].

3 Results

Let \({\mathcal A}= (Q^{\mathcal A},\varSigma ^{\mathcal A},T^{\mathcal A},1',F^{\mathcal A})\) and \({\mathcal B}= (Q^{\mathcal B},\varSigma ^{\mathcal B},T^{\mathcal B},1,F^{\mathcal B})\) be DFAs, with \(Q^{\mathcal A}= \{1',2',\cdots ,m'\}\) and \(Q^{\mathcal B}= \{1,2,\cdots ,n\}\) for positive integers m and n. Let \({\mathcal C}= (Q,\varSigma ,T,I,F)\) denote the concatenation DFA of \({\mathcal A}\) and \({\mathcal B}\) as defined in Sect. 2.2.

Remark 1

Let \(p',q' \in Q^{\mathcal A}\), let \(X,Y,Z \subseteq Q^{\mathcal B}\), and let \(w \in \varSigma ^*\). Then in \({\mathcal C}\), if \((p',X)w = (q',Y)\), then \((p',X \cup Z)w = (q',Y \cup Zw)\).

Indeed, recall that the pair \((p',X)\) stands for the set \(\{p'\} \cup X\). Thus \((\{p'\} \cup X)w = \{p'w\} \cup Xw = \{q'\} \cup Y\). It follows that \(p'w = q'\) and \(Xw = Y\). Hence \((\{p'\} \cup X \cup Z)w = \{p'w\} \cup Xw \cup Zw = \{q'\} \cup Y \cup Zw\), which in our pair notation is \((q',Y \cup Zw)\). We will readily use this basic fact in proofs.

Before stating our main result formally, we give some motivating exposition. Fix a state \(s' \in Q^{\mathcal A}\) and a subset B of \(Q^{\mathcal B}\). The state \(s'\) is called the focus state or simply focus; it is often taken to be the initial state \(1'\) but in general can be any state. The subset B is called the base. Fix a set T with \(B \subseteq T \subseteq Q^{\mathcal B}\), called the target. Our goal is to give sufficient conditions under which starting from \((s',B)\), we can reach \((s',S)\) for all sets S with \(B \subseteq S \subseteq T\). That is, we can reach any state of the concatenation DFA \({\mathcal C}\) in which the first component is the focus and the second component lies between the base and the target.

The idea is to first assume we can reach \((s',B)\), the state consisting of the focus and the base. Now, for \(q \in Q\), define a q-word to be a word w such that \((s',B)w = (s',B \cup q)\). We can think of this as a word that “adds” the state q to the base B. Our next assumption is that we have a q-word for each state q in the target T. To reach a set S with \(B \subseteq S \subseteq T\), we will repeatedly use q-words to add each missing element of S to the base B.

There is a problem with this idea, which we illustrate with an example. Suppose \(w_p\) is a p-word and \(w_q\) is a q-word, and we want to reach \((s',B \cup \{p,q\})\). Starting from \((s',B)\) we may apply \(w_p\) to reach \((s',B \cup p)\). But now if we apply \(w_q\), we reach \((s',B \cup \{pw_q,q\})\). There is no guarantee that we have \(pw_q = p\), and in many cases we will not. What we should really do is find a state r such that \(rw_q = p\), use an r-word to reach \((s',B \cup r)\), and then apply \(w_q\) to reach \((s',B \cup \{p,q\})\). But this idea only works if p has a preimage under \(w_q\), which may not be the case.

We resolve this by making a technical assumption, which ensures that preimages will always exist when we attempt constructions like the above. First, define a construction set for the target T to be a set of words consisting of exactly one q-word for each \(q \in T\). If W is a construction set for T, we write W[q] for the unique q-word in W.

We say a construction set is complete if there is a total order \(\prec \) on the target T such that for all \(p,q \in T\) with \(p \prec q\), the state q has at least one preimage under the unique p-word W[p], and at least one of these preimages lies in T. More formally, whenever \(p \prec q\), the set \(qW[p]^{-1} = \{s \in Q^{\mathcal B}: sW[p] = q\}\) intersects T non-trivially. Our final assumption is that we have a complete construction set for T.

Note that the definition of a q-word depends not only on q, but also on \(s'\) and B. Since a construction set for T is a set of q-words, the definition of construction set also depends on \(s'\) and B. For simplicity, we omit this dependence on \(s'\) and B from the notation for q-words and construction sets.

We summarize the definitions that have just been introduced:

  • Fix a state \(s' \in Q^{\mathcal A}\), called the focus, and a set \(B \subseteq Q^{\mathcal B}\) called the base.

  • For \(q \in Q^{\mathcal B}\), a q-word is a word w such that \((s',B)w = (s',B \cup q)\).

  • Given a target set T with \(B \subseteq T \subseteq Q^{\mathcal B}\), a construction set for T is a set of words that contains exactly one q-word for each \(q \in T\).

  • The unique q-word in a construction set W is denoted by W[q].

  • A construction set for T is complete if there exists a total order \(\prec \) on T such that for all \(p,q \in T\) with \(p \prec q\), we have

    $$\begin{aligned} qW[p]^{-1} \cap T = \{s \in Q^{\mathcal B}: sW[p] = q\} \cap T \ne \emptyset . \end{aligned}$$

Now, we state our main theorem, which formalizes the above construction.

Theorem 1

Fix a state \(s' \in Q^{\mathcal A}\) and sets B and T such that \(B \subseteq T \subseteq Q^{\mathcal B}\). If there is a complete construction set for T, then all states of the form \((s',S)\) with \(B \subseteq S \subseteq T\) are reachable from \((s',B)\) in \({\mathcal C}\). In particular, if \((s',B)\) itself is reachable, then all states \((s',S)\) with \(B \subseteq S \subseteq T\) are reachable.

Proof

Note that if \(B \subseteq S \subseteq T\), we can write \(S = R \cup B\) with \(R \cap B = \emptyset \) and \(R \subseteq T\). Thus it suffices to show that all states of the form \((s',R \cup B)\) with \(R \cap B = \emptyset \) and \(R \subseteq T\) are reachable from \((s',B)\). We proceed by induction on |R|. When \(|R| = 0\), the only state of this form is \((s',B)\) itself.

Now suppose every state \((s',R \cup B)\) with \(R \cap B = \emptyset \), \(R \subseteq T\) and \(0 \le |R| < k\) is reachable from \((s',B)\). We want to show this also holds for \(|R| = k\). Let W be a complete construction set for T and let \(\prec \) be the corresponding total order on T. Let p be the minimal element of R under \(\prec \). Let w be W[p], the unique p-word in W. For all \(q \in R \setminus p\), we have \(p \prec q\) and thus \(qw^{-1}\) contains an element of T (since W is complete).

Construct sets X and Y as follows: starting with \(X = \emptyset \), for each \(q \in R \setminus p\), choose an element of \(qw^{-1} \cap T\) and add it to X. Then set \(Y = X \setminus B\). Observe that X is a subset of T of size \(|R \setminus p| = k-1\). Hence Y is a subset of T of size at most \(k-1\) such that \(Y \cap B = \emptyset \). It follows by the induction hypothesis that \((s',Y \cup B)\) is reachable from \((s',B)\). But \(Y \cup B = X \cup B\), so \((s',X \cup B)\) is reachable from \((s',B)\). By the definition of X, we have \(Xw = R \setminus p\). Since w is a p-word, we have \((s',B)w = (s',B \cup p)\), and thus

$$\begin{aligned} (s',X \cup B)w = (s',Xw \cup B \cup p) = (s',(R \setminus p) \cup B \cup p) = (s',R \cup B). \end{aligned}$$

Hence \((s',R \cup B)\) is reachable from \((s',B)\), as required.    \(\square \)

The definition of completeness is somewhat complicated, which makes it difficult to use Theorem 1. Thus, we prove some results giving simpler sufficient conditions for a construction set to be complete.

Before stating our first such result, we introduce some notation. Define \(\varSigma _0 = \varSigma ^{\mathcal A}\cap \varSigma ^{\mathcal B}\). We call \(\varSigma _0\) the shared alphabet of \({\mathcal A}\) and \({\mathcal B}\). The following remark shows that when \(\varSigma ^{\mathcal A}\ne \varSigma ^{\mathcal B}\), it is important to work exclusively with the shared alphabet when looking for complete construction sets. Of course, in the case where \(\varSigma ^{\mathcal A}= \varSigma ^{\mathcal B}\) there is nothing to worry about.

Remark 2

A construction set for a non-empty target cannot be complete unless it is a subset of \(\varSigma _0^*\). To see this, suppose W is a construction set and let \(w \in W\). If w contains a letter from \(\varSigma ^{\mathcal A}\setminus \varSigma ^{\mathcal B}\), then w is not a word over \(\varSigma ^{\mathcal B}\). Recall that if w is not a word over \(\varSigma ^{\mathcal B}\), then \(T^{\mathcal B}_w\) is defined to be the empty relation. Thus the converse relation \((T^{\mathcal B}_w)^{-1}\) is also empty, which means \(qw^{-1}\) is empty for all q. It follows W cannot be complete. On the other hand, suppose w contains a letter from \(\varSigma ^{\mathcal B}\setminus \varSigma ^{\mathcal A}\). Then \((s',B)w = (\emptyset ,Bw)\). Hence w is not a q-word for any q, and so w cannot be an element of a construction set, which is a contradiction. Thus all words in a complete construction set are words over the shared alphabet \(\varSigma _0\).

Lemma 1

Fix \(s' \in Q^{\mathcal A}\) and sets \(B \subseteq T \subseteq Q^{\mathcal B}\). Let \(x_1,\cdots ,x_j\) be words over \(\varSigma _0\) that act as permutations on T, and let y be an arbitrary word over \(\varSigma _0\). Choose \(x_0 \in \{{\varepsilon },x_1,\cdots ,x_j\}\). Define \(W = \{x_1,x_2,\cdots ,x_j\} \cup \{x_0y,x_0y^2,\cdots ,x_0y^k\}\). If W is a construction set for T, then it is complete.

Proof

For \(1 \le i \le j\), let \(w_i = x_i\). For \(1 \le i \le k\), let \(w_{j+i} = x_0y^i\). Let \(\ell = j+k\). Then we have \(W = \{w_1,\cdots ,w_\ell \}\). Let \(q_i\) be the state in T such that \((s',B)w_i = (s',B \cup q_i)\). Define an order \(\prec \) on T so that \(q_1 \prec q_2 \prec \cdots \prec q_{\ell }\). We claim this order makes W complete. Notice that \(w_r = W[q_r]\), the unique \(q_r\)-word in W. Thus we must show that whenever \(q_r \prec q_s\), we have \(q_sw_r^{-1} \cap T \ne \emptyset \).

Suppose \(r < s\) and \(r \le j\). Then \(w_r = x_r\) acts as a permutation on T. Thus \(q_sw_r^{-1} \cap T\) is non-empty, since \(q_s \in T\).

Suppose \(r < s\) and \(r > j\). Since \(s-r > 0\), we can write \(w_s = x_0y^{s-j} = x_0y^{s-r}y^{r-j} = w_{j+s-r}y^{r-j}\). Thus \((s',B)w_s = (s',B \cup q_{j+s-r})y^{r-j} = (s',B \cup q_s)\). There are two possibilities: \(q_{j+s-r}y^{r-j} = q_s\), or \(qy^{r-j} = q_s\) for some \(q \in B\).

In either case, \(q_s(y^{r-j})^{-1} \cap T\) is non-empty. That is, there exists \(q \in T\) such that \(qy^{r-j} = q_s\). Since \(x_0\) acts as a permutation on T, there exists \(p \in T\) such that \(px_0 = q\). Thus \(px_0y^{r-j} = pw_r = q_s\). It follows that \(q_sw_r^{-1} \cap T\) is non-empty, as required.    \(\square \)

Usually, we will use one of the following corollaries instead of Lemma 1 itself.

Corollary 1

Fix \(s' \in Q^{\mathcal A}\) and sets \(B \subseteq T \subseteq Q^{\mathcal B}\). Let x and y be words over \(\varSigma _0\) such that x acts as a permutation on T. Suppose W is one of the following sets:

  1. 1.

    \(\{y,y^2,\cdots ,y^k\}\).

  2. 2.

    \(\{{\varepsilon },y,y^2,\cdots ,y^k\}\).

  3. 3.

    \(\{x,xy,xy^2,\cdots ,xy^k\}\).

  4. 4.

    \(\{{\varepsilon },x,xy,xy^2,\cdots ,xy^k\}\).

If W is a construction set for T, then it is complete.

Proof

All statements follow easily from Lemma 1:

  1. 1.

    Set \(j = 0\).

  2. 2.

    Set \(j = 1\) and \(x_0 = x_1 = {\varepsilon }\).

  3. 3.

    Set \(j = 1\) and \(x_0 = x_1 = x\).

  4. 4.

    Set \(j = 2\), \(x_1 = {\varepsilon }\) and \(x_0 = x_2 = x\).    \(\square \)

Corollary 2

Fix \(s' \in Q^{\mathcal A}\) and sets \(B \subseteq T \subseteq Q^{\mathcal B}\). Let \(W \subseteq \varSigma _0^*\) be a construction set for T.

  1. 1.

    If every word in W acts as a permutation on T, then W is complete.

  2. 2.

    If there is a word \(w \in W\) such that every word in \(W \setminus w\) acts as a permutation on T, then W is complete.

Proof

Both statements follow easily from Lemma 1:

  1. 1.

    Set \(k = 0\) in Lemma 1.

  2. 2.

    Set \(k = 1\), \(x_0 = {\varepsilon }\) and \(y = w\) in Lemma 1.    \(\square \)

In the special case where W contains \({\varepsilon }\), Corollary 2 admits the following generalization, which we found occasionally useful.

Lemma 2

Fix \(s' \in Q^{\mathcal A}\) and sets \(B \subseteq T \subseteq Q^{\mathcal B}\). Let \(W = \{{\varepsilon },w_1,\cdots ,w_k\}\) be a construction set for T, where \(w_1,\cdots ,w_k\) are non-empty words over \(\varSigma _0\). Suppose that for every word \(w \in W\), there exists a set S with \(T \setminus B \subseteq S \subseteq T\) such that w acts as a permutation on S. Then W is complete.

Proof

Write \(B = \{q_1,\cdots ,q_j\}\). Note that \({\varepsilon }\) is a \(q_i\)-word for \(1 \le i \le j\). Thus by the definition of a construction set, \({\varepsilon }\) is the unique \(q_i\)-word in W for each \(q_i \in B\), that is, \(W[q_i] = {\varepsilon }\) for \(1 \le i \le j\). In particular, each non-empty word in W is a q-word for some \(q \in T \setminus B\). For \(1 \le i \le k\), let \(q_{j+i}\) be the state such that \((s',B)w_i = (s',B \cup q_{j+i})\). Then \(T = \{q_1,\cdots ,q_{j+k}\}\). Note that \(W[q_i] = {\varepsilon }\) if \(1 \le i \le j\), and \(W[q_i] = w_{i-j}\) if \(j+1 \le i \le j+k\).

Define an order \(\prec \) on T by \(q_1 \prec q_2 \prec \cdots \prec q_{j+k}\). We claim this order makes W complete. Choose \(q_r,q_s \in T\) with \(q_r \prec q_s\); we want to show that \(q_sW[q_r]^{-1} \cap T \ne \emptyset \). Suppose \(q_r \in B\). Then \(W[q_r] = {\varepsilon }\), and we have \(q_s{\varepsilon }^{-1} \cap T\) non-empty as required. Now if \(q_r \not \in B\), then since \(q_r \prec q_s\) we also have \(q_s \not \in B\). In this case, \(W[q_r] = w_{r-j}\), which acts as a permutation on some superset S of \(T \setminus B\). Since \(q_s \in T \setminus B\), it follows that \(q_s\) has a preimage under \(w_{r-j}\), and furthermore this preimage lies in T, since S is a subset of T. Thus \(q_sw_{r-j}^{-1} \cap T \ne \emptyset \) as required. This proves that W is complete.    \(\square \)

Note that all words referred to in the above lemmas and corollaries are words over \(\varSigma _0\), the shared alphabet of \({\mathcal A}\) and \({\mathcal B}\). When working with automata that have different alphabets, it is important to use only words over the shared alphabet when trying to find a complete construction set.

The following “master theorem” summarizes the results of this section. We have attempted to state this theorem so that it can be cited without having to first define all the notions introduced in this section, such as q-words and construction sets and completeness.

Theorem 2

Let \({\mathcal A}= (Q^{\mathcal A},\varSigma ^{\mathcal A},T^{\mathcal A},i^{\mathcal A},F^{\mathcal A})\) and \({\mathcal B}= (Q^{\mathcal B},\varSigma ^{\mathcal B},T^{\mathcal B},i^{\mathcal B},F^{\mathcal B})\) be DFAs. Let \({\mathcal C}= (Q,\varSigma ,T,I,F)\) denote the concatenation DFA of \({\mathcal A}\) and \({\mathcal B}\), as defined in Sect. 2.2. Let \(\varSigma _0 = \varSigma ^{\mathcal A}\cap \varSigma ^{\mathcal B}\).

Fix a state \(s' \in Q^{\mathcal A}\) and sets \(B \subseteq T \subseteq Q^{\mathcal B}\). Suppose that for each \(q \in T\), there exists a word \(w_q \in \varSigma _0^*\) such that \((s',B) \xrightarrow {w_q} (s',B \cup q)\) in \({\mathcal C}\). Let \(W = \{w_q : q \in T\}\). Suppose that one of the following conditions holds:

  1. 1.

    There exist words \(x,y \in \varSigma _0^*\), where x acts as a permutation on T, such that W can be written in one of the following forms:

    • \(W = \{y,y^2,\cdots ,y^k\}\).

    • \(W = \{{\varepsilon },y,y^2,\cdots ,y^k\}\).

    • \(W = \{x,xy,xy^2,\cdots ,xy^k\}\).

    • \(W= \{{\varepsilon },x,xy,xy^2,\cdots ,xy^k\}\).

  2. 2.

    Every word in W acts as a permutation on T.

  3. 3.

    There exists \(w \in W\) such that every word in \(W \setminus w\) acts as a permutation on T.

  4. 4.

    W contains \({\varepsilon }\), and for every non-empty word \(w \in W\), there exists a set S such that \(T \setminus B \subseteq S \subseteq T\) and w acts as a permutation on S.

  5. 5.

    There exists a total order \(\prec \) on T such that for all \(p,q \in T\) with \(p \prec q\), the set \(qw_p^{-1} = \{ s \in Q^{\mathcal B}: s \xrightarrow {w_p} q\}\) contains an element of T.

If one of the above conditions holds, then every state of the form \((s',X)\) with \(B \subseteq X \subseteq T\) is reachable from \((s',B)\) in \({\mathcal C}\).

To close this section, we give two examples of how our results can be applied. Many more examples can be found in [11].

Theorem 3

(Yu, Zhuang and Salomaa, 1994 [20]). Define \({\mathcal A}\) and \({\mathcal B}\) as follows:

$$ \begin{array}{lcccc} &{}a&{}b&{}c&{}\text {Final States}\\ {\mathcal A}:&{}(1',\cdots ,m')&{}(Q^{\mathcal A}{\rightarrow }1')&{}\mathrm{id}&{}\{m'\}\\ {\mathcal B}:&{}\mathrm{id}&{}(1,\cdots ,n)&{}(Q^{\mathcal B}{\rightarrow }2)&{}\{n\} \end{array} $$

Then \({\mathcal C}\) has \((m-1)2^{n}+2^{n-1}\) reachable and pairwise distinguishable states.

Proof

The initial state of \({\mathcal C}\) is \((1',\emptyset )\). For \(k \le n-2\) we have

$$\begin{aligned} (1',\emptyset ) \xrightarrow {a^m} (1',1) \xrightarrow {b^k} (1',1+k). \end{aligned}$$

It follows that \(\{a^m,a^mb,\cdots ,a^mb^{n-1}\}\) is a construction set for \(Q^{\mathcal B}\) (with \(s' = 1'\) and \(B = \emptyset \)). By Corollary 1, it is complete (taking \(x = a^m\) and \(y = b\)). Hence all states \((1',S)\) with \(S \subseteq Q^{\mathcal B}\) are reachable. We can reach \((q',S)\) for \(q' \ne m'\) and \((m',S \cup 1)\) by words in \(a^*\).

Let \((p',S)\) and \((q',T)\) be distinct states of \({\mathcal C}\). If \(S \ne T\), let r be a state in the symmetric difference of S and T. Then \(b^{n-r}\) distinguishes the states. If \(S = T\) and \(p' < q'\), then \(ca^{m-q}b^{n-2}\) distinguishes the states.    \(\square \)

Theorem 4

(Maslov, 1970 [16]). Define \({\mathcal A}\) and \({\mathcal B}\) as follows:

$$ \begin{array}{lcccc} &{}a&{}b&{}\text {Final States}\\ {\mathcal A}:&{}(1',\cdots ,m')&{}\mathrm{id}&{}\{m'\}\\ {\mathcal B}:&{}(n-1,n)&{}({}_1^{n-1} q {\rightarrow }q+1)&{}\{n\} \end{array} $$

Then \({\mathcal C}\) has \((m-1)2^{n}+2^{n-1}\) reachable and pairwise distinguishable states.

Proof

The initial state is \((1',\emptyset )\). We have

$$\begin{aligned} (1',\emptyset ) \xrightarrow {a^m} (1',1) \xrightarrow {b^k} (1',1+k). \end{aligned}$$

Thus \(\{a^m,a^mb,a^mb^2,\cdots ,a^mb^{n-1}\}\) is a construction set for \(Q^{\mathcal B}\) (with \(s' = 1'\) and \(B = \emptyset \)). By Corollary 1, it is complete. Hence \((1',S)\) is reachable for all \(S \subseteq Q^{\mathcal B}\). We can reach \((q',S)\) for \(q' \ne m'\) and \((m',S \cup 1)\) by words in \(a^*\).

Let \((p',S)\) and \((q',T)\) be distinct states of \({\mathcal C}\). If \(S \ne T\), let r be a state in the symmetric difference of S and T. Then \(b^{n-r}\) distinguishes the states. If \(S = T\) and \(p' < q'\), by \(b^n\) we reach \((p',n)\) and \((q',n)\). Then by \(a^{m-q}\) we reach \(((p+m-q)',na^{m-q})\) and \((m',na^{m-q} \cup 1)\). These states differ in their second component, so they are distinguishable.    \(\square \)

4 Conclusions

We have introduced a new technique for demonstrating the reachability of states in DFAs for the concatenation of two regular languages, and tested this technique in a wide variety of cases. In addition to the examples of Theorems 3 and 4, we demonstrate in [11] that our results are applicable to three other regular language concatenation witnesses [3, 4, 9], a star-free witness [7], two non-returning witnesses [5, 12], a prefix-closed witness [2], two suffix-free witnesses [8, 14], and three right ideal witnesses [1, 6, 9]. This suggests our technique is worth considering as an alternative to traditional induction proofs when working with concatenation automata.