1 Introduction and Preliminaries

A deterministic finite automaton (DFA) over a finite alphabet \(\varSigma \) is called synchronizing, if it admits a synchronizing word. A word \(w \in \varSigma ^*\) is called synchronizing (or directed, or reset), if, starting in any state q, after reading w, one always ends in one particular state \(q_s\). So reading w acts as a reset button: no matter in which state the system is, it always moves to the particular state \(q_s\). Now Černý’s conjecture [2] states:

Every synchronizing DFA on n states admits a synchronizing word of length \(\le (n-1)^2\).

Surprisingly, despite extensive effort, this conjecture is still open, and even the best known upper bounds are still cubic in n. In 1983 Pin [8] established the bound \(\frac{1}{6}(n^3-n)\). Only very recently a slight improvement was claimed by Szykuła [11]. For a survey on synchronizing automata and Černý’s conjecture, we refer to [13].

Formally, a deterministic finite automaton (DFA) over a finite alphabet \(\varSigma \) consists of a finite set Q of states and a map \(\delta : Q \times \varSigma \rightarrow Q\).Footnote 1 For \(w \in \varSigma ^*\) and \(q \in Q\), we define qw inductively by \(q \lambda = q\) and \(q w a = \delta (qw,a)\) for \(a \in \varSigma \), where \(\lambda \) is the empty word. So qw is the state where one ends, when starting in q and reading the symbols in w consecutively, and qa is a short hand notation for \(\delta (q,a)\). A word \(w \in \varSigma ^*\) is called synchronizing, if a state \(q_s \in Q\) exists such that \(q w = q_s\) for all \(q \in Q\).

In [2], Černý already gave DFAs for which the bound of the conjecture is attained: for \(n \ge 2\) the DFA \(C_n\) is defined to consist of n states \(1,2,\ldots ,n\), and two symbols ab, acting by \(qa = q+1\) for \(q = 1,\ldots ,n-1\), \(\delta (n,a) = 1\), and \(qb = q\) for \(q = 2,\ldots ,n\), \(1b = 2\).

figure a

For \(n=4\), this is depicted on the right. For \(C_n\), the string \(w = b (a^{n-1}b)^{n-2}\) of length \(|w| = (n-1)^2\) satisfies \(qw = 2\) for all \(q \in Q\), so w is synchronizing. No shorter synchronizing word exists for \(C_n\), as is shown in [2], showing that the bound in Černý’s conjecture is sharp.

A DFA on n states is critical, if its shortest synchronizing word has length \((n-1)^2\). One goal of this paper is to investigate all critical DFAs up to some size. To exclude infinitely many trivial extensions, we only consider basic DFAs: no two distinct symbols act in the same way in the automaton, and no symbol acts as the identity. Obviously, adding the identity or copies of existing symbols has no influence on synchronization.

An extensive investigation was already done by Trahtman in [12]: by computer support and clever algorithms, all critical DFAs on n states and k symbols were investigated for \(3 \le n \le 7\) and \(k \le 4\), and for \(n=8,9,10\) and \(k=2\). Here, a minimality requirement was added: examples were excluded if criticality may be kept after removing one symbol. Then up to isomorphism there are exactly 8 of them, apart from the basic Černý examples: 3 with 3 states, 3 with 4, one with 5 and one with 6. In [4], the minimality requirement and restrictions on alphabet size were dropped and several more examples were found that Trahtman originally expected not to exist. All these are extensions of known examples: in total there are exactly 15 basic critical DFAs for \(n=3\) and exactly 12 basic critical DFAs for \(n=4\). In this paper, we show that for \(n=5,6\), no more critical DFAs exist than the four known ones, without any restriction on the number of symbols.

A generalization of a DFA is a Partial Finite Automaton (PFA); the only difference is that now the transition function \(\delta \) is allowed to be partial. In a PFA, qw may be undefined, in fact it is only defined if every step is defined. A word \(w \in \varSigma ^*\) is called carefully synchronizing for a PFA, if a state \(q_s \in Q\) exists such that qw is defined and \(q w = q_s\) for all \(q \in Q\). Stated in words: starting in any state q and reading w, every step is defined and one always ends in state \(q_s\). As being a generalization of DFAs, the shortest carefully synchronizing word may be longer. For \(n=4,5,6\) we show that this is indeed the case by finding the maximal shortest carefully synchronizing word length to be 10, 21 and 37, respectively. The maximal length grows exponentially in n, as was already observed by Rystsov [10]. Martyugin [7] established the lower bound \(\varOmega (3^{n/3})\) with a construction in which the number of symbols is linear in n. In a recent paper, the upper bound \(O((3+\varepsilon )^\frac{n}{3})\) was proved [5].

Until recently it was an open question if exponential lower bounds can be achieved with a constant alphabet size. We answer this question by giving a construction of a PFA on n states and three symbols with exponential shortest synchronizing word length. The key idea is that synchronization is forced to mimic exponentially many string rewrite steps, similar to binary counting. Our three-symbol PFA can be transformed to a two-symbol PFA by a standard construction for which we develop a substantial improvement. Independent of our work, recently in [14] it was shown that exponential bounds exist for every constant alphabet size and for two symbols the bound \(\varOmega (2^{n/35})\) was given. Our basic construction strongly improves this and gives length \(\varOmega (\phi ^{n/3})\) for the three-symbol PFA and length \(\varOmega (\phi ^{n/5})\) for the two-symbol PFA, where \(\phi = \frac{1+ \sqrt{5}}{2}\). Some optimizations yield further improvements.

The basic tool to analyze (careful) synchronization is the power automaton. For any DFA or PFA \((Q,\varSigma , \delta )\), its power automaton is the DFA \((2^Q,\varSigma , \delta ')\) where \(\delta ' : 2^Q \times \varSigma \rightarrow 2^Q\) is defined by \(\delta '(V,a) = \{q \in Q \mid \exists p \in V : \delta (p,a) = q \}\), if \(\delta (p,a)\) is defined for all \(p \in V\), otherwise \(\delta '(V,a) = \emptyset \). For any \(V \subseteq Q, w \in \varSigma ^*\), we define Vw as above, using \(\delta '\) instead of \(\delta \). From this definition, one easily proves that \(Vw = \{ qw \mid q \in V \}\) if qw is defined for all \(q \in V\), otherwise \(Vw = \emptyset \), for any \(V \subseteq Q, w \in \varSigma ^*\). A set of the shape \(\{q\}\) for \(q \in Q\) is called a singleton. So a word w is (carefully) synchronizing, if and only if Qw is a singleton. Hence a DFA (PFA) is (carefully) synchronizing, if and only if its power automaton admits a path from Q to a singleton, and the shortest length of such a path corresponds to the shortest length of a (carefully) synchronizing word.

This paper is organized as follows. In Sect. 2 we describe our exhaustive analysis of DFAs on at most 6 states. In Sect. 3 we give our results for PFAs on at most 6 states. In Sect. 4 we present our construction of PFAs on three symbols with exponential shortest carefully synchronizing word length. In Sect. 5 we improve the transformation used by Martyugin [7] and Vorel [14] to reduce to alphabet size two. Section 6 discusses optimizations. We conclude in Sect. 7.

2 Critical DFAs on at Most 6 States

A natural question when studying Černý’s conjecture is: what can be said about automata in which the bound of the conjecture is actually attained, the so-called critical automata? Throughout this section we restrict ourselves to basic DFAs. As has already been noted by several authors [4, 12, 13], critical DFAs are rare. There is only one construction known which gives a critical DFA for each n, namely the well-known sequence \(C_n\), discovered by and named after Černý [2]. Apart from this sequence, all known critical DFAs have at most 6 states. In [4], all critical DFAs on less than 5 states were identified, without restriction on the size of the alphabet. For \(n=5\) and 6 it was still an open question if there exist critical (or even supercritical) DFAs, other than those already discovered by Černý, Roman [9] and Kari [6]. In this paper we verify that this is not the case, so for \(n=5\) only two critical DFAs exist (Černý, Roman) and also for \(n=6\) only two exist (Černý, Kari). In fact our results also prove the following theorem (previously only known for \(n\le 5\), see [3]):

Theorem 1

Every synchronizing DFA on \(n\le 6\) states admits a synchronizing word of length at most \((n-1)^2\).

As the number of DFAs on n states grows like \(2^{n^{n}}\), an exhaustive search is a non-trivial affair, even for small values of n. The problem is that the alphabet size in a basic DFA can be as large as \(n^n-1\). Up to now for \(n=5,6\) only DFAs with at most four symbols were checked by Trahtman [12]. Here we give describe our algorithm to investigate all DFAs on 5 and 6 states, without restriction on the alphabet size.

Before explaining the algorithm, we introduce some terminology. A DFA \(\mathcal {B}\) obtained by adding some symbols to a DFA \(\mathcal {A}\) will be called an extension of \(\mathcal {A}\). If \(\mathcal {A} = (Q,\varSigma ,\delta )\), then \(S\subseteq Q\) will be called reachable if there exists a word \(w\in \varSigma ^*\) such that \(Qw=S\). We say that S is reducible if there exists a word w such that \(|Sw|<|S|\), and we call w a reduction word for S. Our algorithm is mainly based on the following observation:

Property 2

If a DFA \(\mathcal {A}\) is synchronizing, and \(\mathcal {B}\) is an extension of \(\mathcal {A}\), then \(\mathcal {B}\) is synchronizing as well and its shortest synchronizing word is at most as long as the shortest synchronzing word for \(\mathcal {A}\).

The algorithm roughly runs as follows. We search for (super)critical DFAs on n states, so a DFA is discarded if it synchronizes faster, or if it does not synchronize at all. For a given DFA \(\mathcal {A} = (Q,\varSigma ,\delta )\) which is not yet discarded or investigated, the algorithm does the following:

  1. 1.

    If \(\mathcal {A}\) is synchronizing and (super)critical, we have identified an example we are searching for.

  2. 2.

    If \(\mathcal {A}\) is synchronizing and subcritical, it is discarded, together with all its possible extensions (justified by Property 2).

  3. 3.

    If \(\mathcal {A}\) is not synchronizing, then find an upper bound L for how fast any synchronizing extension of \(\mathcal {A}\) will synchronize (see below). If \(L < (n-1)^2\), then discard \(\mathcal {A}\) and all its extensions. Otherwise, discard only \(\mathcal {A}\) itself.

The upper bound L for how fast any synchronizing extension of \(\mathcal {A}\) will synchronize, is found by analyzing distances in the directed graph of the power automaton of \(\mathcal {A}\). For \(S,T\subseteq Q\), the distance from S to T in this graph is equal to the length of the shortest word w for which \(Sw=T\), if such a word exists. We compute L as follows:

  1. 1.

    Determine the size |S| of a smallest reachable set. Let m be the minimal distance from Q to a set of size |S|.

  2. 2.

    For each \(k\le |S|\), partition the collection of irreducible sets of size k into strongly connected components. Let \(m_k\) be the number of components plus the sum of their diameters.

  3. 3.

    For each reducible set of size \(k\le |S|\), find the length of its shortest reduction word. Let \(l_k\) be the maximum of these lengths.

  4. 4.

    Now note that a synchronizing extension of \(\mathcal {A}\) will have a synchronizing word of length at most

    $$\begin{aligned} L \; = \; m+\sum _{k=2}^{|S|}(m_k+l_k). \end{aligned}$$

The algorithm performs a depth-first search. So after investigating a DFA, first all its extensions (not yet considered) are investigated before moving on. Still, we can choose which extension to pick first. We would like to choose an extension that is likely to be discarded immediately together with all its extensions. Therefore, we apply the following heuristic: for each possible extension \(\mathcal {B}\) by one symbol, we count how many pairs of states in \(\mathcal {B}\) would be reducible. The extension for which this is maximal is investigated first. The motivation is that a DFA is synchronizing if and only if each pair is reducible [2].

Finally, we note that we have described a primitive version of the algorithm here. The algorithm which has actually been used also takes symmetries into account, making it almost n! times faster. For the source code, we refer to [1].

3 PFAs on at Most 6 States

In the remainder of this paper, we study PFAs and shortest carefully synchronizing word lengths. In this section, we focus on PFAs on at most 6 states. In the next section, we construct PFAs with shortest carefully synchronizing words of exponential lengths for general n.

To find PFAs with small number of states and long shortest carefully synchronizing word, we exploit that Property 2 also holds for PFAs. However, for PFAs it is not true that reducibility of all pairs of states guarantees careful synchronization. Therefore, we apply a different search algorithm. Technical details are discussed in [1].

For \(n\le 6\), our algorithm has identified the maximal length of a shortest carefully synchronizing word in a PFA on n states. The results are:

figure b

We observe that PFAs exist for \(n=4,5,6\) with shortest carefully synchronizing word lengths exceeding \((n-1)^2\). Note that for \(n=5,6\) this even exceeds the Pin-Frankl bound \(\frac{1}{6}(n^3-n)\) for DFAs from [8]. Where for \(n\ge 5\) no critical DFAs are known with more than three symbols, PFAs with long shortest carefully synchronizing word lengths tend to have more symbols: for \(n=4,5,6\) states the minimal numbers of symbols achieving the maximal shortest carefully synchronizing word lengths 10, 21 and 37 are 3, 6, 6, respectively. Below we give examples of PFAs on 4, 5 and 6 states reaching these lengths.

figure c

The left one has two synchronizing words of length 10: \(abcabab\) \((b+c)ca\). The right one has unique shortest synchronizing word \( abcabdbebcabdbfbcdeca \) of length 21.

figure d

The shortest synchronizing word is \(ab^2ab^2cb^2ab^2db^2eb^2cb^2ab^2db^2fb^2cdecb^2a\) for this PFA on 6 states. It is unique and has length 37.

4 Exponential Bounds for PFAs

In this section, we construct for any \(k \ge 3\) a strongly connected PFA on \(n = 3k\) states and three symbols, for which we show that it is carefully synchronizing, and the shortest carefully synchronizing word has length \(\varOmega (\phi ^{n/3})\) for \(\phi = \frac{1+ \sqrt{5}}{2} = 1.618\cdots \). The set of states is \(Q = \{A_i, B_i, C_i \mid i = 1,\ldots ,k\}\). If a set \(S \subseteq Q\) contains exactly one element of \(\{A_i, B_i, C_i\}\) for every i, it can be represented by a string over \(\{A,B,C\}\) of length k. The idea of our construction is that the PFA will mimic rewriting the string \(C^2 A^{k-2}\) to the string \(C^2 A^{k-3} B\) with respect to the rewrite system R, which consists of the following three rules

$$\begin{aligned} BBA \rightarrow AAB, \; CBA \rightarrow CAB, \; CCA \rightarrow CCB. \end{aligned}$$

The key argument is that this rewriting is possible, but requires an exponential number of steps. This is elaborated in the following lemma, in which we use \(\rightarrow _R\) for rewriting with respect to R, that is, \(u \rightarrow _R v\), if and only if \(u = u_1 \ell u_2\) and \(v = u_1 r u_2\), for strings \(u_1,u_2\) and a rule \(\ell \rightarrow r\) in R. Its transitive closure is denoted by \(\rightarrow _R^+\). We write \({\textsf {fib}}\) for the standard fibonacci function, defined by \({\textsf {fib}}(i) = i\) for \(i=0,1\), and \({\textsf {fib}}(i) = {\textsf {fib}}(i-1) + {\textsf {fib}}(i-2)\) for \(i > 1\). It is well-known that \({\textsf {fib}}(n) = \varTheta (\phi ^n)\).

Lemma 3

For \(k \ge 3\), we have \(CCA^{k-2} \rightarrow _R^+ CCA^{k-3}B\). Furthermore, the smallest possible number of steps for rewriting \(CCA^{k-2}\) to a string ending in B, is exactly \( {\textsf {fib}}(k)-1\).

Proof

For the first claim we do induction on k. For \(k=3\), we have \(CCA \rightarrow _R CCB\). For \(k=4\), we have \(CCAA \rightarrow _R CCBA \rightarrow _R CCAB\). For \(k>4\), applying the induction hypothesis twice, we obtain

$$\begin{aligned} CCA^{k-2} \rightarrow _R^+ CCA^{k-4}BA \rightarrow _R^+ CCA^{k-5}BBA \rightarrow _R CCA^{k-3}B. \end{aligned}$$

For the second claim, we define the weight W(u) of a string \(u = u_1 u_2 \cdots u_k\) over \(\{A,B,C\}\) of length k by

$$ W(u) = \sum _{i : u_i = B} ({\textsf {fib}}(i)-1). $$

So every B on position i in u contributes \({\textsf {fib}}(i)-1\) to the weight, and the other symbols have no weight.

Now we claim that \(W(v) = W(u)+1\) for all strings uv with \(u \rightarrow _R v\) and uv only having C’s in the first two positions. Since the Cs only occur at positions 1 and 2, by applying \(CCA \rightarrow CCB\), the weight increases by \({\textsf {fib}}(3)-1 = 1\) by the creation of B on position 3, and by applying \(CBA \rightarrow CAB\), it increases by \({\textsf {fib}}(4)-1 -({\textsf {fib}}(3)-1) = 1\) since B on position 3 is replaced by B on position 4. By applying \(BBA \rightarrow AAB\), the contributions to the weight \({\textsf {fib}}(i)-1\) and \({\textsf {fib}}(i+1)-1\) of the two Bs are replaced by \({\textsf {fib}}(i+2)-1\) of the new B, which is an increase by 1 according to the definition of \({\textsf {fib}}\).

So this weight increases by exactly 1 at every rewrite step, hence it requires exactly \({\textsf {fib}}(k)-1\) steps, to go from the initial string \(CCA^{k-2}\) of weight 0 to the weight \({\textsf {fib}}(k)-1\) of a B symbol on the last position k, if that is the only B, and more steps if there are more Bs.    \(\square \)

Now we are ready to define the PFA on \(Q = \{A_i, B_i, C_i \mid i = 1,\ldots ,k\}\) and three symbols. The three symbols are a start symbol s, a rewrite symbol r and a cyclic shift symbol c. The transitions are defined as follows (writing \(\bot \) for undefined):

figure e

A shortest carefully synchronizing word starts by s, since r is not defined on all states and c permutes all states. After s, the set of reached states is \(S(CCA^{k-2}) = \{C_1,C_2,A_3,\ldots ,A_k\}\). Here, for a string \(u = a_1 a_2 \cdots a_k\) of length k over \(\{A,B,C\}\), we write S(u) for the set of k states, containing \(A_i\) if and only if \(a_i = A\), containing \(B_i\) if and only if \(a_i = B\), and containing \(C_i\) if and only if \(a_i = C\), for \(i = 1,2,\ldots ,k\). Note that for \(x \in \{A,B,C\}\) and \(v \in \{A,B,C\}^{k-1}\), we have \(S(vx)c = S(xv)\), so c performs a cyclic shift on strings of length k.

The next lemma states that the symbol r indeed mimicks rewriting: applied on sets of the shape S(u), up to cyclic shift it acts as rewriting on u with respect to R defined above.

Lemma 4

Let u be a string of the shape CCw, where \(w \in \{A,B\}^{k-2}\). If \(u \rightarrow _R v\) for a string v, then \(S(u)c^irc^{k-i} = S(v)\) for some \(i<k\).

Conversely, if u does not end in B and there exists an i such that r is defined on \(S(u)c^i\), then \(u \rightarrow _R v\) for a string v of the shape CCw, where \(w \in \{A,B\}^{k-2}\).

Proof

First assume that \(u \rightarrow _R v\). If \(u = u_1 BBA u_2\) and \(v = u_1 AAB u_2\), then let \(i = |u_2| + 3\), so

$$\begin{aligned} S(u) c^i r c^{k-i}&= S(u_1 BBA u_2) c^i r c^{k-i} = S(BBA u_2 u_1) r c^{k-i} \\&= S(AAB u_2 u_1) c^{k-i} = S(u_1 AAB u_2) = S(v). \end{aligned}$$

If \(u = u_1 CBA u_2\) and \(v = u_1 CAB u_2\), then again let \(i = |u_2| + 3\), so

$$\begin{aligned} S(u) c^i r c^{k-i}&= S(u_1 CBA u_2) c^i r c^{k-i} = S(CBA u_2 u_1) r c^{k-i} \\&= S(CAB u_2 u_1) c^{k-i} = S(u_1 CAB u_2) = S(v). \end{aligned}$$

Finally, if \(u = u_1 CCA u_2\) and \(v = u_1 CCB u_2\), then \(u_1 = \epsilon \) and the result follows for \(i=0\).

Conversely, suppose that \(S(u)c^ir\) is defined. Since \(S(u)c^k = S(u)\), we may assume that \(i < k\) and can write \(u = u_1 u_2\), such that \(|u_2| = i\). Then \(S(u)c^i = S(w)\), where \(w = u_2 u_1\). Write \(w = a_1 a_2 \cdots a_k\). Since \(S(u_2 u_1)r\) is defined, we get \(a_1 \ne A\), \(a_2 \ne A\) and \(a_3 \ne B\). Among these 8 cases, \(a_1 = a_2 = a_3 = C\) does not occur since u only contains 2 Cs, and \(a_1 a_2 = BC\) or \(a_2 a_3 = BC\) does not occur since u does not end in B. The remaining 3 cases are

$$ a_1a_2a_3 = BBA, \qquad a_1a_2a_3 = CBA, \qquad \text{ and } \qquad a_1a_2a_3 = CCA, $$

where \(a_1a_2a_3\) is replaced by the corresponding right hand side of the rule by the action of r. Then in \(S(u)c^irc^{k-i}\), the two Cs are on positions 1 and 2 again, and we obtain \(S(u)c^ir c^{k-i} = S(v)\) for a string v of the given shape, satisfying \(u {\rightarrow _R}v\).    \(\square \)

Combining Lemmas 3 and 4 and the fact that \({\textsf {fib}}(n) = \varOmega (\phi ^n)\), we obtain the following.

Corollary 5

There is a word w such that \(S(CCA^{k-2})w = S(CCA^{k-3}B)\); the shortest word w for which \(S(CCA^{k-2})w\) is of the shape \(S(u) c^i\) for u ending in B has length \(\varOmega (\phi ^k)\).

Now we are ready to prove the lower bound:

Lemma 6

If w is carefully synchronizing, then \(|w| = \varOmega (\phi ^k)\).

Proof

Assume that w is a shortest carefully synchronizing word. Then we already observed that the first symbol of w is s, and w yields \(S(CCA^{k-2})\) after the first step in the power automaton. By applying only c-steps and r-steps, according to Lemma 4, only sets of the shape \(S(u)c^i\) for which \(CCA^{k-2} \rightarrow _R^+ u\) can be reached, until u ends in B. In this process, each r-step corresponds to a rewrite step. Applying the third symbol s does not make sense, since then we go back to \(S(CCA^{k-2})\). According to Corollary 5, in the power automaton at least \(\varOmega (\phi ^k)\) steps are required to reach a set which is not of the shape \(S(u) c^i\). So for reaching a singleton, the total number of steps is at least \(\varOmega (\phi ^k)\).    \(\square \)

Note that for the reasoning until now, the definition of \(C_3 r = B_2\) did not play a role, and by sr all states were replaced by states having the same index. But after the last symbol of u has become B, this \(C_3 r = B_2\) will be applied, leading to a subset in which no state of the group \(A_3, B_3, C_3\) occurs any more. Now we arrive at the main theorem.

Theorem 7

For every n there is a carefully synchronizing PFA on n states and three symbols with shortest carefully synchronizing word length \(\varOmega (\phi ^{n/3})\).

Proof

If \(n = 3k\) we take our automaton, otherwise we add one or two states on which rc are undefined and s maps to \(A_1\), having no influence on the argument. The bound was proved in Lemma 6; it remains to prove that the automaton is carefully synchronizing, that is, it is possible to end up in a singleton in the power automaton.

Let w be the word from Corollary 5. Since \(S(CCA^{k-2})w = S(CCA^{k-3}B)\) and the number of c’s in w is divisible by k, we have \(C_1 w = C_1\), \(C_2 w = C_2\), \(A_3 w = A_3, \ldots , A_{k-1} w = A_{k-1}\), \(A_k w = B_k\). Hence

figure f

So for all \(i \ne 2\), \(\{A_i,B_i,C_i\}swcr\) is contained in the cyclic successor \(\{A_i,B_i,C_i\}c\) of \(\{A_i,B_i,C_i\}\). \(\{A_2,B_2,C_2\}swcr\) is just contained in \(\{A_2,B_2,C_2\}\) itself. Since for any i, one can take the cyclic successor of \(\{A_i,B_i,C_i\}\) at most \(k-1\) times before ending up in \(\{A_2,B_2,C_2\}\), we deduce that

$$ \{A_i,B_i,C_i\}(swcr)^{k-1} \subseteq \{A_2,B_2,C_2\} \quad \text{ for } i=1,2,\ldots ,k. $$

As \(\{A_2,B_2,C_2\}s = \{C_2\}\), we obtain the carefully synchronizing word \((s w c r)^{k-1}s\) of the PFA.    \(\square \)

5 Reduction to Two Symbols

In this section we construct PFAs with two symbols and exponential shortest carefully synchronizing word length. We do this by a general transformation to two-symbol PFAs, as was done before, e.g. in [14]. There a PFA on n states and m symbols was transformed to a PFA on mn states and two symbols, preserving synchronization length. In the next theorem, we improve this resulting number of states to \((m-1)n\) or even less, only needing a mild extra condition. Using this result, we reduce our 3-symbol PFA with synchronizing length \(\varOmega (\phi ^{n/3})\) to a 2-symbol PFA with synchronizing length \(\varOmega (\phi ^{n/5})\).

Theorem 8

Let \(P = (Q, \varSigma )\) be a carefully synchronizing PFA with \(|Q| = n\), \(|\varSigma | = m\), and shortest carefully synchronizing word length f(n). Assume \(s \in \varSigma \) and \(Q' \subseteq Q\) satisfy the following properties.

  1. 1.

    there is some number p such that all symbols are defined on \(Q s^p\) for a complete symbol s,

  2. 2.

    \(qs = q\) for all \(q \in Q'\), and

  3. 3.

    \(qa = qb\) for all \(q \in Q'\) and all \(a,b \in \varSigma \setminus \{s\}\).

Let \(n' = n - |Q'|\). Then there exists a carefully synchronizing PFA on \(n + n'\) \((m-2)\) states and 2 symbols, with shortest carefully synchronizing word length at least f(n).

Note that if \(Q' = \emptyset \) then only requirement 1 remains, and the resulting number of states is \(n + n'\) \((m-2) = (m-1)n\).

Proof

Write \(Q = \{1,2,\ldots ,n\}\), \(Q' = \{n'+1,\ldots ,n\}\), and \(\varSigma = \{s,a_1,\ldots ,a_{m-1}\}\). Let the states of the new PFA be \(P_{1,j}\) for \(j = 1,\ldots ,n\) and \(P_{i,j}\) for \(i = 2,\ldots ,m-1\), \(j = 1,\ldots ,n'\). Define the following two symbols ab on these states:

and \(P_{i,j} b = P_{1,j a_i}\), for all \(i = 1,\ldots ,m-1\) and \(j = 1,\ldots ,n\) for which \(P_{i,j}\) exists and \(j a_i\) is defined.

If we arrange the states as indicated above, then on the leftmost \(n'\) columns, a moves the states one step downward if possible, and for the bottom row jumps to the top row and acts there as s. For the remainder of the top row a also acts as s (which is the identity). On the leftmost \(n'\) columns, the symbol b acts as \(a_i\) on row i and then jumps to the top line. For the remainder of the top row, all \(a_i\) act in the same way and b acts likewise.

Define \(\psi (a_i) = a^{i-1} b\) for \(i = 1,\ldots ,m-1\), and \(\psi (s) = a^{m-1}\). Then on the top line \(\psi (a_i)\) acts in the same way as \(a_i\) in the original PFA. Similarly, \(\psi (s)\) acts as s. On any other row, \(\psi (s)\) acts as s, too. Since every symbol \(a_i\) is defined on \(qs^p\) for every \(q\in Q\), we obtain that \(\psi (s)^pb=a^{(m-1)p}b\) is defined on every state and ends up in the top row.

Assume that w is carefully synchronizing in the original PFA. Then by the above observations, \(a^{(m-1)p} b \psi (w)\) is carefully synchronizing in the new PFA. Conversely, any carefully synchronizing word of the new PFA can be written as \(\psi (w)a^j\), where \(0\le j\le m-2\) and \(\psi (w)\) is a concatenation of blocks of the form \(\psi (l),l\in \varSigma \). Now note that \(a^j\) can never synchronize two distinct states in the top row. Therefore, \(\psi (w)\) synchronizes the top row and consequently w is synchronizing in the original PFA. Clearly \(|\psi (w)a^j|\ge |w|\ge f(n)\).    \(\square \)

We apply Theorem 8 to our basic construction with 3k states and \(m=3\) symbols; note that sc are defined on all states and r is defined on Qs, so the requirements of Theorem 8 hold for \(p=1\). As r and c act differently on all states, the only option for \(Q'\) is \(Q' = \emptyset \). Hence we obtain a carefully synchronizing PFA on \((m-1)3k = 6k\) states and two symbols, with shortest carefully synchronizing word length \(\varOmega (\phi ^k)\). For n being the number of states of the new PFA, this is \(\varOmega (\phi ^{n/6})\).

However, instead of our three symbols scr we also get careful synchronization on the three symbols scrc with careful synchronization length of the same order. But then for \(i = 4,\ldots ,k\) we have \(A_i s = A_i\) and \(A_i c = A_i rc\), so we may choose \(Q' = \{A_4,\ldots ,A_k\}\) in Theorem 8, by which \(n' = 3k - (k-3) = 2k+3\), yielding a PFA on two symbols and \(5k+3\) states. This results in the following theorem, where for n not of the shape \(5k+3\) we add \(\le 4\) extra states to achieve this shape, where b is undefined on the new states and a maps the new states to existing states.

Theorem 9

For every n there is a carefully synchronizing PFA on n states and two symbols with shortest carefully synchronizing word length \(\varOmega (\phi ^{n/5})\).

6 Further Optimizations

Some further optimizations are possible. For instance, for any \(h \ge 2\) we can take \(h+1\) rewrite rules

$$ C^i B^{h-i} A \rightarrow C^i A^{h-i} B $$

for \(i = 0,\ldots ,h\), and construct a PFA on the \(n = 3k\) states \(A_i,B_i,C_i\) for \(i = 1,\ldots ,k\) with a similar s, c, and r, mimicking the rewrite rules in which the rewriting takes place in the states with indexes \(\le h+1\). For \(h=2\), this coincides with our construction, but for \(h > 2\), this gives a better bound \(\varOmega (a^k)\), where a is the real zero of \(x^h - x^{h-1} - \cdots - x - 1\) in between 3 / 2 and 2. As this value tends to 2 for increasing h, for every \(\epsilon > 0\), we achieve the bound \(\varOmega ((2 - \epsilon )^{n/3})\) for three symbols and \(\varOmega ((2 - \epsilon )^{n/5})\) for two symbols. For further improvements to \(\varOmega ((4 - \epsilon )^{n/5}) = \varOmega ((2 - \epsilon )^{2n/5})\) for three symbols and \(\varOmega ((4 - \epsilon )^{n/6}) = \varOmega ((2 - \epsilon )^{n/3})\) for two symbols, we refer to the extended version [1].

7 Conclusions

For every n we constructed a PFA on n states and 3 symbols for which careful synchronization is forced to mimic rewriting with respect to a string rewriting system. This system requires an exponential number of steps to reach a string of a particular shape. The resulting exponential synchronization length is much larger than the cubic upper bound for synchronization length of DFAs. We show that for \(n=4\) the shortest synchronization length for a PFA already can exceed the maximal shortest synchronization length for a DFA. For \(n=4,5,6\) we found greatest possible shortest synchronization lengths, both for DFAs and PFAs, where for DFAs until now this was only fully investigated for \(n \le 4\), that is, by not assuming any bound on the number of symbols. Both for DFAs and PFAs better techniques are needed to do the same analysis for \(n=7\) or higher.