Deciding FO-definability of Regular Languages

Kurucz, Agi; Ryzhikov, Vladislav; Savateev, Yury; Zakharyaschev, Michael

doi:10.1007/978-3-030-88701-8_15

Agi Kurucz¹²,
Vladislav Ryzhikov¹³,
Yury Savateev^13,14 &
…
Michael Zakharyaschev^13,14

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 13027))

Included in the following conference series:

International Conference on Relational and Algebraic Methods in Computer Science

493 Accesses

Abstract

We prove that, similarly to known $\textsc {PSpace}$-completeness of recognising $\mathsf {FO}(<)$-definability of the language ${\boldsymbol{L}}(\mathfrak A)$ of a DFA $\mathfrak A$, deciding both $\mathsf {FO}(<,\equiv )$- and $\mathsf {FO}(<,\mathsf {MOD})$-definability (corresponding to circuit complexity in ${\textsc {AC}^0}$ and ${\textsc {ACC}^0}$) are $\textsc {PSpace}$-complete. We obtain these results by first showing that known algebraic characterisations of FO-definability of ${\boldsymbol{L}}(\mathfrak A)$ can be captured by ‘localisable’ properties of the transition monoid of $\mathfrak A$. Using our criterion, we then generalise the known proof of $\textsc {PSpace}$-hardness of $\mathsf {FO}(<)$-definability, and establish the upper bounds not only for arbitrary DFAs but also for 2NFAs.

Access provided by Autonomous University of Puebla. Download conference paper PDF

On Decidability of Regular Languages Theories

Deciding Open Definability via Subisomorphisms

Complexity of Proper Prefix-Convex Regular Languages

1 Introduction

This paper gives answers to some open questions related to finite automata, logic and circuit complexity. Research in this area goes back (at least) to the early1960 s when Büchi [8], Elgot [12] and Trakhtenbrot [28] showed that $\text {MSO}(<)$ (monadic second-order) sentences over finite strict linear orders define exactly the class of regular languages.

$\mathsf {FO}(<)$-definable regular languages were proven to be the same as star-free languages [19], and their algebraic characterisation as languages with aperiodic syntactic monoids was obtained in [23]. Algebraic characterisations of FO-definability in other signatures, and circuit and descriptive complexity of regular languages were investigated in [3, 4, 26], which established an ${\textsc {AC}^0}$/${\textsc {ACC}^0}$/${{\textsc {NC}^1}}$ trichotomy. In particular, the regular languages decidable in $\textsc {AC}^0$ are definable by $\mathsf {FO}(<,\equiv )$-sentences with unary predicates $x \equiv 0\, (\text {mod}\ n)$; those in ${\textsc {ACC}^0}$ are definable by $\mathsf {FO}(<,\mathsf {MOD})$-sentences with quantifiers $\exists ^n x\, \psi (x)$ checking whether the number of positions satisfying $\psi $ is divisible by n; and all regular languages are definable in $\mathsf {FO}(\mathsf {RPR})$ with relational primitive recursion [11]; see Table 1.

The problem of deciding whether the language of a given DFA $\mathfrak A$ is $\mathsf {FO}(<)$-definable is known to be $\textsc {PSpace}$-complete [7, 10, 25] (which is also a special case of general results on finite monoids [5, 13]). As shown in [4], the algebraic criteria of Table 1 yield algorithms deciding whether a given regular language is in ${\textsc {AC}^0}$ and $\mathsf {FO}(<,\equiv )$-definable, or in ${\textsc {ACC}^0}$ and $\mathsf {FO}(<,\mathsf {MOD})$-definable, or ${{\textsc {NC}^1}}$-complete and is not $\mathsf {FO}(<,\mathsf {MOD})$-definable (unless ${\textsc {ACC}^0}= {{\textsc {NC}^1}}$). However, these ‘brute force’ algorithms are not optimal, requiring the generation of the whole transition monoid of $\mathfrak A$, which can be of exponential size [14]. As far as we know, the precise complexity of these decision problems has remained open.

Table 1. Definability, algebraic characterisations and circuit complexity of a regular language ${\boldsymbol{L}}$, where $M({\boldsymbol{L}})$ is the syntactic monoid and $\eta _{\boldsymbol{L}}$ the syntactic morphism of ${\boldsymbol{L}}$.

Full size table

Our interest in the exact complexity of these problems is motivated by recent advances in ontology-based data access (OBDA) with linear time temporal logic LTL [1, 2]. The classical (atemporal) OBDA paradigm [20, 30] relies on a reduction of answering a query mediated by an ontology under the open-world semantics to evaluating a database query in a standard language such as SQL or its extension—that is, essentially, an extension of first-order logic—under the closed-world semantic. In the context of temporal OBDA, answering LTL ontology-mediated queries is equivalent to deciding certain regular languages given by an NFA or 2NFA of (possibly) exponential size, which gives rise to the circuit complexity and FO-definability problems for those languages. For further details the reader is referred to [22], which relies on the results we obtain below.

Our contribution in this paper is as follows. Let $\mathcal {L}$ be one of the languages $\mathsf {FO}(<,\equiv )$ or $\mathsf {FO}(<,\mathsf {MOD})$. First, using the algebraic characterisation results of [3, 4, 26], we obtain criteria for the $\mathcal {L}$-definability of the language ${\boldsymbol{L}}(\mathfrak A)$ of any given DFA $\mathfrak A$ in terms of a limited part of the transition monoid of $\mathfrak A$ (Theorem 1). Then, by using our criteria and generalising the construction of [10], we show that deciding $\mathcal {L}$-definability of ${\boldsymbol{L}}(\mathfrak A)$ for any minimal DFA $\mathfrak A$ is $\textsc {PSpace}$-hard (Theorem 2). Finally, we apply our criteria to give a $\textsc {PSpace}$-algorithm deciding $\mathcal {L}$-definability of ${\boldsymbol{L}}(\mathfrak A)$ for not only any DFA but any 2NFA $\mathfrak A$ (Theorem 3).

2 Preliminaries

We begin by briefly reminding the reader of the basic algebraic and automata-theoretic notions required in the remainder of the paper.

2.1 Monoids and Groups

A semigroup is a structure $\mathfrak S=(S,\cdot )$ where $\cdot $ is an associative binary operation. Given $s,s'\in S$ and $n>0$, we write $s^n$ for $s\cdot $ $\dots $ $\cdot s$ n-times, and often write $ss'$ for $s\cdot s'$. An element s in a semigroup $\mathfrak S$ is idempotent if $s^2=s$. An element e in $\mathfrak S$ is an identity if $e\cdot x=x\cdot e=x$ for all $x\in S$. (It is easy to see that such an e is unique, if exists.) The identity element is clearly idempotent. A monoid is a semigroup with an identity element. For any element s in a monoid, we set $s^0=e$. A monoid $\mathfrak S=(S,\cdot )$ is a group if, for any $x\in S$, there is $x^-\in S$—the inverse of x—such that $x\cdot x^-=x^-\cdot x=e$ (every element of a group has a unique inverse). A group is trivial if it has one element, and nontrivial otherwise.

Given two groups $\mathfrak G=(G,\cdot )$ and $\mathfrak G'=(G',\cdot ')$, a map $h:G\rightarrow G'$ is a group homomorphism from $\mathfrak G$ to $\mathfrak G'$ if $h(g_1\cdot g_2)=h(g_1)\cdot ' h(g_2)$ for all $g_1,g_2\in G$. (It is easy to see that any group homomorphism maps the identity of $\mathfrak G$ to the identity of $\mathfrak G'$ and preserves the inverses. The set $\{h(g)\mid g\in G\}$ is closed under $\cdot '$, and so is a group, the image of $\mathfrak G$ under h.) $\mathfrak G$ is a subgroup of $\mathfrak G'$ if $G\subseteq G'$ and the identity map $\mathsf {id}_G$ is a group homomorphism. Given $X\subseteq G$, the subgroup of $\mathfrak G$ generated by X is the smallest subgroup of $\mathfrak G$ containing X. The order $o_\mathfrak G(g)$ of an element g in $\mathfrak G$ is the smallest positive number n with $g^n=e$, which always exists. Clearly, $o_\mathfrak G(g)=o_\mathfrak G(g^-)$ and, if $g^k=e$ then $o_\mathfrak G(g)$ divides k. Also,

$$\begin{aligned} \text {if }g\text { is a nonidentity element in a group }\mathfrak G\text {, then }g^k\ne g^{k+1}\text { for any }k. \end{aligned}$$

(1)

A semigroup $\mathfrak S'=(S',\cdot ')$ is a subsemigroup of a semigroup $\mathfrak S=(S,\cdot )$ if $S'\subseteq S$ and $\cdot '$ is the restriction of $\cdot $ to $S'$. Given a monoid $\boldsymbol{M}=(M,\cdot )$ and a set $S\subseteq M$, we say that S contains the group $\mathfrak G=(G,\cdot ')$, if $G\subseteq S$ and $\mathfrak G$ is a subsemigroup of $\boldsymbol{M}$. Note that we do not require the identity of $\boldsymbol{M}$ to be in $\mathfrak G$, even if it is in S. If $S=M$, we also say that $\boldsymbol{M}$ contains the group $\mathfrak G$, or $\mathfrak G$ is in $\boldsymbol{M}$. We call a monoid $\boldsymbol{M}$ aperiodic if it does not contain any nontrivial groups.

Let $\mathfrak S=(S,\cdot )$ be a finite semigroup and $s\in S$. By the pigeonhole principle, there exist $i,j\ge 1$ such that $i+j\le |S|+1$ and $s^{i}=s^{i+j}$. Take the minimal such numbers, that is, let $i_s,j_s\ge 1$ be such that $i_s+j_s\le |S|+1$ and $s^{i_s}=s^{i_s+j_s}$ but $s^{i_s},s^{i_s+1},\dots ,s^{i_s+j_s-1}$ are all different. Then clearly $\mathfrak G_s=(G_s,\cdot )$, where $G_s=\{s^{i_s},s^{i_s+1},\dots ,s^{i_s+j_s-1}\}$, is a subsemigroup of $\mathfrak S$. It is easy to see that there is $m\ge 1$ with $i_s\le m\cdot j_s<i_s+j_s\le |S|+1$, and so $s^{m\cdot j_s}$ is idempotent. Thus, for every element s in a semigroup $\mathfrak S$, we have the following:

$$\begin{aligned}&\text {there is }n\ge 1\text { such that }s^n\text { is idempotent;}\end{aligned}$$

(2)

$$\begin{aligned}&\mathfrak G_s\text { is a group in }\mathfrak S\text { (isomorphic to the cyclic group }\mathbb Z_{j_s}\text {);}\end{aligned}$$

(3)

$$\begin{aligned}&\mathfrak G_s\text { is nontrivial iff }s^n\ne s^{n+1}\text { for any }n. \end{aligned}$$

(4)

Let $\delta :Q\rightarrow Q$ be a function on a finite set $Q \ne \emptyset $. For any $p\in Q$, the subset $\{\delta ^k(p)\mid k<\omega \}$ with the obvious multiplication is a semigroup, and so we have:

$$\begin{aligned}&\text {for every }p\in Q\text {, there is }n_p\ge 1\text { such that }\delta ^{n_p}\bigl (\delta ^{n_p}(p)\bigr )=\delta ^{n_p}(p)\text {;}\end{aligned}$$

(5)

$$\begin{aligned}&\text {there exist }q\in Q\text { and }n\ge 1\text { such that }q=\delta ^n(q)\text {;}\end{aligned}$$

(6)

$$\begin{aligned}&\text {for every }q\in Q\text {, if }q=\delta ^k(q)\text { for some }k\ge 1, \nonumber \\&\qquad \qquad \qquad \qquad \qquad \text {then there is }n\text {, }1\le n\le |Q|\text {, with }q=\delta ^n(q). \end{aligned}$$

(7)

For a definition of solvable and unsolvable groups the reader is referred to [21]. Here, we only need the fact that any homomorphic image of a solvable group is solvable and the Kaplan–Levy criterion [16] (generalising Thompson’s [27, Cor.3]) according to which a finite group $\mathfrak G$ is unsolvable iff it contains three elements a, b, c, such that $o_\mathfrak G(a)=2$, $o_\mathfrak G(b)$ is an odd prime, $o_\mathfrak G(c)>1$ and coprime to both 2 and $o_\mathfrak G(b)$, and abc is the identity element of $\mathfrak G$.

A one-to-one and onto function on a finite set S is called a permutation on S. The order of a permutation $\delta $ is its order in the group of all permutations on S (whose operation is composition, and its identity element is the identity permutation $\mathsf {id}_S$). We use the standard cycle notation for permutations.

Suppose $\mathfrak G$ is a monoid of $Q\rightarrow Q$ functions, for some finite set $Q \ne \emptyset $. Let $S=\{q\in Q\mid e_\mathfrak G(q)=q\}$, where $e_\mathfrak G$ the identity element in $\mathfrak G$. For every function $\delta $ in $\mathfrak G$, let $\delta \!\!\restriction _S$ denote the restriction of $\delta $ to S. Then we have the following:

$$\begin{aligned}&\mathfrak G\text { is a group iff }\delta \!\!\restriction _S\text { is a permutation on }S\text {, for every }\delta \text { in }\mathfrak G;\end{aligned}$$

(8)

$$\begin{aligned} \nonumber&\text {if }\mathfrak G\text { is a group and }\delta \text { is a nonindentity element in it, then }\delta \!\!\restriction _S\ne \mathsf {id}_S\text { and}\\&\qquad \qquad \qquad \qquad \, \, \text {the order of the permutation }\delta \!\!\restriction _S\text { divides }o_\mathfrak G(\delta ). \end{aligned}$$

(9)

2.2 Automata: DFAs, NFAs, 2NFAs

A two-way nondeterministic finite automaton is a quintuple $\mathfrak A= (Q, \varSigma , \delta , Q_0, F)$ that consists of an alphabet $\varSigma $, a finite set Q of states with a subset $Q_0 \ne \emptyset $ of initial states and a subset F of accepting states, and a transition function $\delta :Q \times \varSigma \rightarrow 2^{Q \times \{-1,0,1\}}$ indicating the next state and whether the head should move left ($-1$), right (1), or stay put. If $Q_0 = \{q_0\}$ and $|\delta (q, a)| = 1$, for all $q \in Q$ and $a \in \varSigma $, then $\mathfrak A$ is deterministic, in which case we write $\mathfrak A= (Q, \varSigma , \delta , q_0, F)$. If $\delta (q, a) \subseteq Q \times \{1\}$, for all $q \in Q$ and $a \in \varSigma $, then $\mathfrak A$ is a one-way automaton, and we write $\delta :Q \times \varSigma \rightarrow 2^Q$. As usual, DFA and NFA refer to one-way deterministic and non-deterministic finite automata, respectively, while 2DFA and 2NFA to the corresponding two-way automata. Given a 2NFA $\mathfrak A$, we write $q \rightarrow _{a,d} q'$ if $(q', d) \in \delta (q,a)$; given an NFA $\mathfrak A$, we write $q \rightarrow _{a} q'$ if $q' \in \delta (q,a)$. A run of a 2NFA $\mathfrak A$ is a word in $(Q \times \mathbb N)^*$. A run $(q_0, i_0), \dots , (q_m, i_m)$ is a run of $\mathfrak A$ on a word $w = a_0 \dots a_n \in \varSigma ^*$ if $q_0 \in Q_0$, $i_0 = 0$ and there exist $d_0, \dots , d_{m-1} \in \{-1,0,1\}$ such that $q_j \rightarrow _{a_j, d_j} q_{j+1}$ and $i_{j+1} = i_j+d_j$ for all j, $0 \le j < m$. The run is accepting if $q_m \in F$, $i_m = n+1$. $\mathfrak A$ accepts $w \in \varSigma ^*$ if there is an accepting run of $\mathfrak A$ on w; the language ${\boldsymbol{L}}(\mathfrak A)$ of $\mathfrak A$ is the set of all words accepted by $\mathfrak A$.

Given an NFA $\mathfrak A$, states $q,q' \in Q$, and $w = a_0 \dots a_n \in \varSigma ^*$, we write $q \rightarrow _w q'$ if either $w = \varepsilon $ and $q' = q$ or there is a run of $\mathfrak A$ on w that starts with $(q_0, 0)$ and ends with $(q', n+1)$. We say that a state $q \in Q$ is reachable if $q' \rightarrow _w q$, for some $q' \in Q_0$ and $w \in \varSigma ^*$.

Given a DFA $\mathfrak A= (Q, \varSigma , \delta , q_0, F)$ and a word $w \in \varSigma ^*$, we define a function $\delta _w :Q \rightarrow Q$ by taking $\delta _w(q) = q'$ iff $q \rightarrow _w q'$. We also define an equivalence relation $\sim $ on the set $Q^r\subseteq Q$ of reachable states by taking $q\sim q'$ iff, for every $w \in \varSigma ^*$, we have $\delta _w(q)\in F$ just in case $\delta _w(q')\in F$. We denote the $\sim $-class of q by $q/_{\mathop {\sim }}$, and let $X/_{\mathop {\sim }}=\{q/_{\mathop {\sim }}\mid q\in X\}$ for any $X\subseteq Q^r$. Define $\tilde{\delta }_w:Q^r\!/_{\mathop {\sim }}\rightarrow Q^r\!/_{\mathop {\sim }}$ by taking $\tilde{\delta }_w(q/_{\mathop {\sim }})=\delta _w(q)/_{\mathop {\sim }}$. Then $\bigl (Q^r\!/_{\mathop {\sim }},\varSigma ,\tilde{\delta },q_0/_{\mathop {\sim }},(F\cap Q^r)/_{\mathop {\sim }}\bigr )$ is the minimal DFA whose language coincides with the language of $\mathfrak A$. Given a regular language ${\boldsymbol{L}}$, we denote by $\mathfrak A_{{\boldsymbol{L}}}$ the minimal DFA whose language is ${\boldsymbol{L}}$.

The transition monoid of a DFA $\mathfrak A$ is $M(\mathfrak A) = (\{ \delta _w \mid w \in \varSigma ^*\},\cdot )$ with $\delta _v\cdot \delta _w = \delta _{vw}$, for any v, w. The syntactic monoid $M({\boldsymbol{L}})$ of ${\boldsymbol{L}}$ is the transition monoid $M(\mathfrak A_{{\boldsymbol{L}}})$ of $\mathfrak A_{{\boldsymbol{L}}}$. The syntactic morphism of ${\boldsymbol{L}}$ is the map $\eta _{\boldsymbol{L}}$ from $\varSigma ^*$ to the domain of $M({\boldsymbol{L}})$ defined by $\eta _{\boldsymbol{L}}(w) = \tilde{\delta }_w$. We call $\eta _{\boldsymbol{L}}$ quasi-aperiodic if $\eta _L(\varSigma ^t)$ is aperiodic for every $t<\omega $.

Suppose $\mathcal {L}\in \{ \mathsf {FO}(<), \mathsf {FO}(<,\equiv ), \mathsf {FO}(<,\mathsf {MOD})\}$. A language ${\boldsymbol{L}}$ over an alphabet $\varSigma $ is $\mathcal {L}$-definable if there is an $\mathcal {L}$-sentence $\varphi $ in the signature $\varSigma $, whose symbols are treated as unary predicates, such that, for any $w \in \varSigma ^*$, we have $w=a_0\ldots a_n \in {\boldsymbol{L}}$ iff $\mathfrak S_w \models \varphi $, where $\mathfrak S_w$ is an FO-structure with domain $\{0,\dots ,n\}$ ordered by <, in which $\mathfrak S_w \models a(i)$ iff $a=a_i$, for $0\le i\le n$.

Table 1 summarises the known results that connect definability of a regular language ${\boldsymbol{L}}$ with properties of the syntactic monoid $M({\boldsymbol{L}})$ and syntactic morphism $\eta _{\boldsymbol{L}}$ (see [4] for details) and with its circuit complexity under a reasonable binary encoding of ${\boldsymbol{L}}$’s alphabet (see, e.g., [7, Lemma 2.1]) and the assumption that ${\textsc {ACC}^0}\ne {{\textsc {NC}^1}}$. We also remind the reader that a regular language is $\mathsf {FO}(<)$-definable iff it is star-free [26], and that ${\textsc {AC}^0}\subsetneqq {\textsc {ACC}^0}\subseteq {{\textsc {NC}^1}}$ [15, 26].

3 Criteria of $\mathcal {L}$-definability

In this section, we show that the algebraic characterisations of FO-definability of ${\boldsymbol{L}}(\mathfrak A)$ given in Table 1 can be captured by ‘localisable’ properties of the transition monoid of $\mathfrak A$, for any given DFA $\mathfrak A$. Note that Theorem 1 (i) was already observed in [25] and used in proving that $\mathsf {FO}(<)$-definability of ${\boldsymbol{L}}(\mathfrak A)$ is $\textsc {PSpace}$-complete [7, 10, 25]; while criteria (ii) and (iii) seem to be new.

Theorem 1

For any DFA $\mathfrak A=(Q,\varSigma ,\delta ,q_0,F)$, the following criteria hold:

(i)
${\boldsymbol{L}}(\mathfrak A)$ is not $\mathsf {FO}(<)$-definable iff $\mathfrak A$ contains a nontrivial cycle, that is, there exist a word $u\in \varSigma ^*$, a state $q\in Q^r$, and a number $k\le |Q|$ such that $q\not \sim \delta _u(q)$ and $q= \delta _{u^k}(q)$;
(ii)
${\boldsymbol{L}}(\mathfrak A)$ is not $\mathsf {FO}(<,\equiv )$-definable iff there are words $u,v\in \varSigma ^*$, a state $q\in Q^r$, and a number $k\le |Q|$ such that $q\not \sim \delta _u(q)$, $q=\delta _{u^k}(q)$, $|v|=|u|$, and $\delta _{u^i}(q)=\delta _{u^iv}(q)$, for every $i<k$;
(iii)
${\boldsymbol{L}}(\mathfrak A)$ is not $\mathsf {FO}(<,\mathsf {MOD})$-definable iff there exist words $u,v\in \varSigma ^*$, a state $q\in Q^r$ and numbers $k,l\le |Q|$ such that k is an odd prime, $l>1$ and coprime to both 2 and k, $q\not \sim \delta _u(q)$, $q\not \sim \delta _v(q)$, $q\not \sim \delta _{uv}(q)$ and, for all $x\in \{u,v\}^*$, we have $\delta _{x}(q)\sim \delta _{xu^{2}}(q)\sim \delta _{xv^{k}}(q)\sim \delta _{x(uv)^{l}}(q)$.

Proof

Throughout, we use the algebraic criteria of Table 1 for ${\boldsymbol{L}}={\boldsymbol{L}}(\mathfrak A)$. Thus, $M({\boldsymbol{L}})$ is the transition monoid of the minimal DFA $\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)}$, whose transition function we denote by $\tilde{\delta }$.

$(i)~(\Rightarrow )$ Suppose $\mathfrak G$ is a nontrivial group in $M(\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)})$. Let $u\in \varSigma ^*$ be such that $\tilde{\delta }_u$ is a nonidentity element in $\mathfrak G$. We claim that there is $p\in Q^r$ such that $\tilde{\delta }_{u^n}(p/_{\mathop {\sim }})\ne \tilde{\delta }_{u^{n+1}}(p/_{\mathop {\sim }})$ for any $n>0$. Indeed, otherwise for every $p\in Q^r$ there is $n_p>0$ with $\tilde{\delta }_{u^{n_p}}(p/_{\mathop {\sim }})=\tilde{\delta }_{u^{n_p+1}}(p/_{\mathop {\sim }})$. Let $n=\max \{n_p\mid p\in Q^r\}$. Then $\tilde{\delta }_{u^n}=\tilde{\delta }_{u^{n+1}}$, contrary to (1).

By (5), there is $m\ge 1$ with $\tilde{\delta }_{u^{2m}}(p/_{\mathop {\sim }})=\tilde{\delta }_{u^{m}}(p/_{\mathop {\sim }})$. Let $s/_{\mathop {\sim }}=\tilde{\delta }_{u^m}(p/_{\mathop {\sim }})$. Then $s/_{\mathop {\sim }}=\tilde{\delta }_{u^m}(s/_{\mathop {\sim }})$, and so the restriction of $\delta _{u^m}$ to the subset $s/_{\mathop {\sim }}$ of $Q^r$ is an $s/_{\mathop {\sim }}\rightarrow s/_{\mathop {\sim }}$ function. By (6), there exist $q\in s/_{\mathop {\sim }}$ and $n\ge 1$ such that $(\delta _{u^{m}})^n(q)=q$. Thus, $\delta _{u^{mn}}(q)=q$, and so by (7), there is $k\le |Q|$ with $\delta _{u^{k}}(q)=q$. As $s/_{\mathop {\sim }}\ne \tilde{\delta }_{u}(s/_{\mathop {\sim }})$, we also have $q\not \sim \delta _u(q)$, as required.

$(i)~(\Leftarrow )$ Suppose the condition holds for $\mathfrak A$. Then there are $u\in \varSigma ^*$, $q\in Q^r\!/_{\mathop {\sim }}$, and $k<\omega $ such that $q\ne \tilde{\delta }_u(q)$ and $q=\tilde{\delta }_{u^k}(q)$. Then $\tilde{\delta }_{u^n}\ne \tilde{\delta }_{u^{n+1}}$ for any $n>0$. Indeed, otherwise we would have some $n>0$ with $\tilde{\delta }_{u^n}(q)=\tilde{\delta }_{u^{n+1}}(q)$. Let i, j be such that $n=i\cdot k+j$ and $j<k$. Then

$$ q=\tilde{\delta }_{u^k}(q)=\tilde{\delta }_{u^{(i+1)k}}(q)=\tilde{\delta }_{u^n u^{k-j}}(q)=\tilde{\delta }_{u^{n+1}u^{k-j}}(q)=\tilde{\delta }_{u^{(i+1)k}u}(q)=\tilde{\delta }_u(q). $$

So, by (3) and (4), $\mathfrak G_{\tilde{\delta }_u}$ is a nontrivial group in $M(\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)})$.

$(ii)~(\Rightarrow )$ Let $\mathfrak G$ be a nontrivial group in $\eta _{\boldsymbol{L}}(\varSigma ^t)$, for some $t<\omega $, and let $u\in \varSigma ^t$ be such that $\tilde{\delta }_u$ is a nonidentity element in $\mathfrak G$. As shown in the proof of $(i)~(\Rightarrow )$, there exist $s\in Q^r$ and $m\ge 1$ such that $s/_{\mathop {\sim }}\ne \tilde{\delta }_{u}(s/_{\mathop {\sim }})$ and $s/_{\mathop {\sim }}=\tilde{\delta }_{u^m}(s/_{\mathop {\sim }})$. Now let $v\in \varSigma ^t$ be such that $\tilde{\delta }_v$ is the identity element in $\mathfrak G$, and consider $\delta _v$. By (2), there is $\ell \ge 1$ such that $\delta _{v^\ell }$ is idempotent. Then $\delta _{v^{2\ell -1}v^{2\ell }}=\delta _{v^{2\ell -1}}$. Thus, if we let $\bar{u}=uv^{2\ell -1}$ and $\bar{v}=v^{2\ell }$, then $|\bar{u}|=|\bar{v}|$ and $\delta _{\bar{u}^i}=\delta _{\bar{u}^i\bar{v}}$ for any $i<\omega $. Also, $\tilde{\delta }_{u^i}=\tilde{\delta }_{\bar{u}^i}$ for every $i\ge 1$, and so the restriction of $\delta _{\bar{u}^m}$ to $s/_{\mathop {\sim }}$ is an $s/_{\mathop {\sim }}\rightarrow s/_{\mathop {\sim }}$ function. By (6), there exist $q\in s/_{\mathop {\sim }}$ and $n\ge 1$ such that $(\delta _{\bar{u}^{m}})^n(q)=q$. Thus, $\delta _{\bar{u}^{mn}}(q)=q$, and so by (7), there is some $k\le |Q|$ with $\delta _{\bar{u}^{k}}(q)=q$. As $s/_{\mathop {\sim }}\ne \tilde{\delta }_u(s/_{\mathop {\sim }})=\tilde{\delta }_{\bar{u}}(s/_{\mathop {\sim }})$, we also have $q\not \sim \delta _{\bar{u}}(q)$, as required.

$(ii)~(\Leftarrow )$ If the condition holds for $\mathfrak A$, then there exist $u,v\in \varSigma ^*$, $q\in Q^r\!/_{\mathop {\sim }}$, and $k<\omega $ such that $q\ne \tilde{\delta }_u(q)$, $q=\tilde{\delta }_{u^k}(q)$, $|v|=|u|$, and $\tilde{\delta }_{u^i}(q)=\tilde{\delta }_{u^iv}(q)$, for every $i<k$. As $M(\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)})$ is finite, it has finitely many subsets. So there exist $i,j\ge 1$ such that $\eta _{\boldsymbol{L}}(\varSigma ^{i|u|})=\eta _{\boldsymbol{L}}(\varSigma ^{(i+j)|u|})$. Let z be a multiple of j with $i\le z<i+j$. Then $\eta _{\boldsymbol{L}}(\varSigma ^{z|u|})=\eta _{\boldsymbol{L}}(\varSigma ^{(z|u|)^2})$, and so $\eta _{\boldsymbol{L}}(\varSigma ^{z|u|})$ is closed under the composition of functions (that is, the semigroup operation of $M(\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)})$). Let $w=uv^{z-1}$ and consider the group $\mathfrak G_{\tilde{\delta }_w}$ (defined above (2)–(4)). Then $G_{\tilde{\delta }_w}\subseteq \eta _{\boldsymbol{L}}(\varSigma ^{z|u|})$. We claim that $\mathfrak G_{\tilde{\delta }_w}$ is nontrivial. Indeed, we have $\tilde{\delta }_w(q)=\tilde{\delta }_{uv^{z-1}}(q)=\tilde{\delta }_u(q)\ne q$. On the other hand, $\tilde{\delta }_{w^{k}}(q)=\tilde{\delta }_{u^{k}}(q)=q$. By the proof of $(i)~(\Leftarrow )$, $\mathfrak G_{\tilde{\delta }_w}$ is nontrivial.

$(iii)~(\Rightarrow )$ Suppose $\mathfrak G$ is an unsolvable group in $M(\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)})$. By the Kaplan–Levy criterion, $\mathfrak G$ contains three functions a, b, c such that $o_\mathfrak G(a)=2$, $o_\mathfrak G(b)$ is an odd prime, $o_\mathfrak G(c)>1$ and coprime to both 2 and $o_\mathfrak G(b)$, and $c\circ b\circ a=e_\mathfrak G$ for the identity element $e_\mathfrak G$ of $\mathfrak G$. Let $u,v\in \varSigma ^*$ be such that $a=\tilde{\delta }_u$, $b=\tilde{\delta }_v$ and $c=(\tilde{\delta }_{uv})^-$, and let $k=o_\mathfrak G(\tilde{\delta }_{v})$ and $r=o_\mathfrak G(c)=o_\mathfrak G(\tilde{\delta }_{uv})$. Then $r>1$ and coprime to both 2 and k. Let $S=\bigl \{p\in Q^r\!/_{\mathop {\sim }}\mid e_\mathfrak G(p)=p\bigr \}$. As $\tilde{\delta }_x$ is $\mathfrak G$ for every $x\in \{u,v\}^*$, we have $e_\mathfrak G\circ \tilde{\delta }_x=\tilde{\delta }_x$. Thus,

$$\begin{aligned}&\tilde{\delta }_{xu^2}(q)=\tilde{\delta }_{u^2}\bigl (\tilde{\delta }_x(q)\bigr )=e_\mathfrak G\bigl (\tilde{\delta }_x(q)\bigr )=(e_\mathfrak G\circ \tilde{\delta }_x)(q)=\tilde{\delta }_x(q),\quad \text {and}\\&\tilde{\delta }_{xv^k}(q)=\tilde{\delta }_{v^k}\bigl (\tilde{\delta }_x(q)\bigr )=e_\mathfrak G\bigl (\tilde{\delta }_x(q)\bigr )=(e_\mathfrak G\circ \tilde{\delta }_x)(q)=\tilde{\delta }_x(q),\quad \text {for every }q\in S. \end{aligned}$$

Then, by (8), each of $\tilde{\delta }_u\!\!\restriction _S$, $\tilde{\delta }_v\!\!\restriction _S$ and $\tilde{\delta }_{uv}\!\!\restriction _S$ is a permutation on S. By (9), the order of $\tilde{\delta }_u\!\!\restriction _S$ is 2, the order of $\tilde{\delta }_v\!\!\restriction _S$ is k, and the order l of $\tilde{\delta }_{uv}\!\!\restriction _S$ is a $>1$ divisor of r, and so it is coprime to both 2 and k. Also, we have $k,l\le |S|\le |Q|$. Further, for every x, if q is in S then $\tilde{\delta }_x(q)\in S$ as well. So we have

$$ \tilde{\delta }_{x(uv)^l}(q)=\tilde{\delta }_{(uv)^l}\bigl (\tilde{\delta }_x(q)\bigr )=(\tilde{\delta }_{uv}\!\!\restriction _S)^l\bigl (\tilde{\delta }_x(q)\bigr )=\mathsf {id}_S\bigl (\tilde{\delta }_x(q)\bigr )=\tilde{\delta }_x(q),\ \ \text {for all }q\in S. $$

It remains to show that there is $q\in S$ with $q\ne \tilde{\delta }_u(q)$, $q\ne \tilde{\delta }_u(q)$, and $q\ne \tilde{\delta }_{uv}(q)$. Recall that the length of any cycle in a permutation divides its order. First, we show there is $q\in S$ with $q\ne \tilde{\delta }_u(q)$ and $q\ne \tilde{\delta }_u(q)$. Indeed, as $\tilde{\delta }_{u}\!\!\restriction _S\ne \mathsf {id}_S$, there is $q\in S$ such that $\tilde{\delta }_u(q)=q'\ne q$. As the order of $\tilde{\delta }_{u}\!\!\restriction _S$ is 2, $\tilde{\delta }_u(q')=q$. If both $\tilde{\delta }_v(q)=q$ and $\tilde{\delta }_v(q')=q'$ were the case, then $\tilde{\delta }_{uv}(q)=q'$ and $\tilde{\delta }_{uv}(q')=q$ would hold, and so $(qq')$ would be a cycle in $\tilde{\delta }_{uv}\!\!\restriction _S$, contrary to l being coprime to 2. So take some $q\in S$ with $\tilde{\delta }_u(q)=q'\ne q$ and $\tilde{\delta }_v(q)\ne q$. If $\tilde{\delta }_v(q')\ne q$ then $\tilde{\delta }_{uv}(q)\ne q$, and so q is a good choice. Suppose $\tilde{\delta }_v(q')=q$, and let $q''=\tilde{\delta }_v(q)$. Then $q''\ne q'$, as k is odd. Thus, $\tilde{\delta }_{uv}(q')\ne q'$, and so $q'$ is a good choice.

$(iii)~(\Leftarrow )$ Suppose $u,v\in \varSigma ^*$, $q\in Q^r$, and $k,l<\omega $ are satisfying the conditions. For every $x\in \{u,v\}^*$, we define an equivalence relation $\approx _x$ on $Q^r\!/_{\mathop {\sim }}$ by taking $p\approx _x p'$ iff $\tilde{\delta }_x(p)=\tilde{\delta }_x(p')$. Then we clearly have that $\approx _x\subseteq \approx _{xy}$, for all $x,y\in \{u,v\}^*$. As Q is finite, there is $z\in \{u,v\}^*$ such that $\approx _z= \approx _{zy}$ for all $y\in \{u,v\}^*$. Take such a z. By (2), $\tilde{\delta }_z^n$ is idempotent for some $n\ge 1$. We let $w=z^n$. Then $\tilde{\delta }_w$ is idempotent and we also have that

$$\begin{aligned} \approx _w\,=\, \approx _{wy}\quad \text {for all }y\in \{u,v\}^*. \end{aligned}$$

(10)

Let $G_{\{u,v\}}=\bigl \{\tilde{\delta }_{wxw}\mid x\in \{u,v\}^*\bigr \}$. Then $G_{\{u,v\}}$ is closed under composition. Let $\mathfrak G_{\{u,v\}}$ be the subsemigroup of $M(\mathfrak A_{{\boldsymbol{L}}(\mathfrak A)})$ with universe $G_{\{u,v\}}$. Then $\tilde{\delta }_w=\tilde{\delta }_{w\varepsilon w}$ is an identity element in $\mathfrak G_{\{u,v\}}$. Let $S=\{p\in Q^r\!/_{\mathop {\sim }}\mid \tilde{\delta }_w(p)=p\}$. We show that

$$\begin{aligned} \text {for every }\tilde{\delta }\text { in }\mathfrak G_{\{u,v\}}\text {, }\tilde{\delta }\!\!\restriction _S\text { is a permutation on }S, \end{aligned}$$

(11)

and so $\mathfrak G_{\{u,v\}}$ is a group by (8). Indeed, take some $x\in \{u,v\}^*$. As $\tilde{\delta }_w\bigl (\tilde{\delta }_{wxw}(p)\bigr )=\tilde{\delta }_{wxww}(p)=\tilde{\delta }_{wxw}(p)$ for any $p\in Q^r\!/_{\mathop {\sim }}$, $\tilde{\delta }_{wxw}\!\!\restriction _S$ is an $S\rightarrow S$ function. Also, if $p,p'\in S$ and $\tilde{\delta }_{wxw}(p)=\tilde{\delta }_{wxw}(p')$ then $p\approx _{wxw}p'$. Thus, by (10), $p\approx _w p'$, that is, $p=\tilde{\delta }_w(p)=\tilde{\delta }_w(p')=p'$, proving (11).

We show that $\mathfrak G_{\{u,v\}}$ is unsolvable by finding an unsolvable homomorphic image of it. Let $R=\bigl \{p\in Q^r\!/_{\mathop {\sim }}\mid p=\tilde{\delta }_x(q)\text { for some }x\in \{u,v\}^*\bigr \}$. We claim that, for every $\tilde{\delta }$ in $\mathfrak G_{\{u,v\}}$, $\tilde{\delta }\!\!\restriction _R$ is a permutation on R, and so the function h mapping every $\tilde{\delta }$ to $\tilde{\delta }\!\!\restriction _R$ is a group homomorphism from $\mathfrak G_{\{u,v\}}$ to the group of all permutations on R. Indeed, by (11), it is enough to show that $R\subseteq S$. Let $\overline{w}=\overline{z}_{m}\dots \overline{z}_1$, where $w=z_1\dots z_m$ for some $z_i\in \{u,v\}$, $\overline{u}=u$ and $\overline{v}=v^{k-1}$. Since $\tilde{\delta }_{x}(q)=\tilde{\delta }_{x(u)^{2}}(q)=\tilde{\delta }_{x(v)^{k}}(q)$ for all $x\in \{u,v\}^*$, we obtain that

$$\begin{aligned} \tilde{\delta }_{yw\overline{w}}(q)=&\tilde{\delta }_{\overline{z}_{m-1}\dots \overline{z}_1}\bigl (\tilde{\delta }_{yz_1\dots z_m\overline{z}_m}(q)\bigr ) =\tilde{\delta }_{\overline{z}_{m-1}\dots \overline{z}_1}\bigl (\tilde{\delta }_{yz_1\dots z_{m-1}}(q)\bigr )=\dots \nonumber \\&\ \dots =\tilde{\delta }_{\overline{z}_1}\bigl (\tilde{\delta }_{yz_1}(q)\bigr )=\tilde{\delta }_{xz_1\overline{z}_1}(q)=\tilde{\delta }_y(q),\quad \text {for all }y\in \{u,v\}^*. \end{aligned}$$

(12)

Now suppose $p\in R$, that is, $p=\tilde{\delta }_x(q)$ for some $x\in \{u,v\}^*$. Then, by (12),

$$ \tilde{\delta }_w(p)=\tilde{\delta }_w\bigl (\tilde{\delta }_x(q)\bigr )=\tilde{\delta }_{xw}(q)=\tilde{\delta }_{xww\overline{w}}(q)=\tilde{\delta }_{xw\overline{w}}(q)=\tilde{\delta }_x(q)=p, $$

and so $p\in S$, as required.

Now let $\mathfrak G$ be the image of $\mathfrak G_{\{u,v\}}$ under h. We prove that $\mathfrak G$ is unsolvable by finding three elements a, b, c in it such that $o_\mathfrak G(a)=2$, $o_\mathfrak G(b)=k$, $o_\mathfrak G(c)$ is coprime to both 2 and $o_\mathfrak G(b)$, and $c\circ b\circ a=\mathsf {id}_R$ (the identity element of $\mathfrak G$). So let $a=h(\tilde{\delta }_{wuw})$, $b=h(\tilde{\delta }_{wvw})$, and $c=h(\tilde{\delta }_{wuvw})^-$. Observe that, for every $x\in \{u,v\}^*$, $h(\tilde{\delta }_{wxw})=\tilde{\delta }_x\!\!\restriction _R$, and so $c\circ b\circ a=\mathsf {id}_R$. Also, for any $\tilde{\delta }_x(q)\in R$, $a^2\bigl (\tilde{\delta }_x(q)\bigr )=(\tilde{\delta }_u\!\!\restriction _R)^2\bigl (\tilde{\delta }_x(q)\bigr )=\tilde{\delta }_{xu^2}(q)=\tilde{\delta }_x(q)$ by our assumption, so $a^2=\mathsf {id}_R$. On the other hand, $q\in R$ as $\tilde{\delta }_\varepsilon (q)=q$, and $\mathsf {id}_R(q)=q\ne \tilde{\delta }_u(q)$ by assumption, so $a\ne \mathsf {id}_R$. As $o_\mathfrak G(a)$ divides 2, $o_\mathfrak G(a)=2$ follows. Similarly, we can show that $o_\mathfrak G(b)=k$ (using that $\tilde{\delta }_{xv^k}(q)=\tilde{\delta }_x(q)$ for every $x\in \{u,v\}^*$, and $u\ne \tilde{\delta }_v(q)$). Finally (using that $\tilde{\delta }_{x(uv)^l}(q)=\tilde{\delta }_x(q)$ for every $x\in \{u,v\}^*$, and $u\ne \tilde{\delta }_{uv}(q)$), we obtain that $h(\tilde{\delta }_{wuvw})^l=\mathsf {id}_R$ and $h(\tilde{\delta }_{wuvw})\ne \mathsf {id}_R$. Therefore, it follows that $o_\mathfrak G(c)=o_\mathfrak G\bigl (h(\tilde{\delta }_{wuvw})^-\bigr )=o_\mathfrak G\bigl (h(\tilde{\delta }_{wuvw})\bigr )>1$ and divides l, and so coprime to both 2 and k, as required.

4 Deciding FO-definability: $\textsc {PSpace}$-hardness

Kozen [18] showed that deciding whether the intersection of the languages recognised by a set of given deterministic DFAs is non-empty is $\textsc {PSpace}$-complete. By carefully analysing Kozen’s lower bound proof and using the criterion of Theorem 1 (i), Cho and Huynh [10] established that deciding $\mathsf {FO}(<)$-definability of ${\boldsymbol{L}}(\mathfrak A)$ is $\textsc {PSpace}$-hard, for any given minimal DFA $\mathfrak A$. We generalise their construction and use the criteria in Theorem 1 (ii)–(iii) to cover $\mathsf {FO}(<,\equiv )$- and $\mathsf {FO}(<,\mathsf {MOD})$-definability as well.

Theorem 2

For any $\mathcal {L}\in \{ \mathsf {FO}(<), \mathsf {FO}(<,\equiv ), \mathsf {FO}(<,\mathsf {MOD})\}$, deciding $\mathcal {L}$-definability of the language ${\boldsymbol{L}}(\mathfrak A)$ of a given minimal DFA $\mathfrak A$ is $\textsc {PSpace}$-hard.

Proof

Let $\boldsymbol{M}$ be a deterministic Turing machine that decides a language using at most $N=P_{\boldsymbol{M}}(n)$ tape cells on any input of size n, for some polynomial $P_{\boldsymbol{M}}$. Given such an $\boldsymbol{M}$ and an input $\boldsymbol{x}$, our aim is to define three minimal DFAs whose languages are, respectively, $\mathsf {FO}(<)$-, $\mathsf {FO}(<,\equiv )$-, and $\mathsf {FO}(<,\mathsf {MOD})$-definable iff $\boldsymbol{M}$ rejects $\boldsymbol{x}$, and whose sizes are polynomial in $N$ and the size $|\boldsymbol{M}|$ of $\boldsymbol{M}$.

Suppose $\boldsymbol{M}=(Q, \varGamma , \gamma , \mathsf {b}, q_0, q_{\textit{acc}})$ with a set Q of states, tape alphabet $\varGamma $ with $\mathsf {b}$ for blank, transition function $\gamma $, initial state $q_0$ and accepting state $q_{\textit{acc}}$. Without loss of generality we assume that $\boldsymbol{M}$ erases the tape before accepting, its head is at the left-most cell in an accepting configuration, and if $\boldsymbol{M}$ does not accept the input, it runs forever. Given an input word $\boldsymbol{x}=x_1\dots x_n$ over $\varGamma $, we represent configurations $\mathfrak c$ of the computation of $\boldsymbol{M}$ on $\boldsymbol{x}$ by the $N$-long word written on the tape (with sufficiently many blanks at the end) in which the symbol y in the active cell is replaced by the pair (q, y) for the current state q. The accepting computation of $\boldsymbol{M}$ on $\boldsymbol{x}$ is encoded by a word $\sharp \, \mathfrak c_1 \, \sharp \, \mathfrak c_2 \, \sharp \, \dots \, \sharp \, \mathfrak c_{k-1} \, \sharp \, \mathfrak c_{k} \flat $ over the alphabet $\varSigma =\varGamma \cup (Q\times \varGamma )\cup \{\sharp ,\flat \}$, with $\mathfrak c_1,\mathfrak c_2,\dots ,\mathfrak c_k$ being the subsequent configurations. In particular, $\mathfrak c_1$ is the initial configuration on $\boldsymbol{x}$ (so it is of the form $(q_0,x_1)x_2\dots x_n\mathsf {b}\dots \mathsf {b}$), and $\mathfrak c_k$ is the accepting configuration (so it is of the form $(q_{\textit{acc}},\mathsf {b})\mathsf {b}\dots \mathsf {b}$). As usual for this representation of computations, we may regard $\gamma $ as a partial function from $\bigl (\varGamma \cup (Q\times \varGamma )\cup \{\sharp \}\bigr )^3$ to $\varGamma \cup (Q\times \varGamma )$ with $\gamma (\sigma ^j_{i-1},\sigma ^j_i,\sigma ^j_{i+1})=\sigma ^{j+1}_i$ for each $j< k$, where $\sigma ^j_i$ is the ith symbol of $\mathfrak c^j$.

Let $p_{\boldsymbol{M},\boldsymbol{x}}=p$ be the first prime such that $p\ge N+2$ and $p\not \equiv \pm 1\ (\text {mod}\ 10)$. By [6, Corollary 1.6], $p$ is polynomial in $N$. Our first aim is to construct a $p+1$-long sequence $\mathfrak A_i$ of disjoint minimal DFAs over $\varSigma $. Each $\mathfrak A_i$ has size polynomial in $N$ and $|\boldsymbol{M}|$, and it checks certain properties of an accepting computation on $\boldsymbol{x}$ such that $\boldsymbol{M}$ accepts $\boldsymbol{x}$ iff the intersection of the ${\boldsymbol{L}}(\mathfrak A_i)$ is not empty and consists of the single word encoding the accepting computation on $\boldsymbol{x}$.

We define each $\mathfrak A_i$ as an NFA, and assume that it can be turned to a DFA by adding a ‘trash state’ $\textit{tr}_i$ looping on itself with every $\sigma \in \varSigma $, and adding the missing transitions leading to $\textit{tr}_i$. The DFA $\mathfrak A_{0}$ checks that an input starts with the initial configuration on $\boldsymbol{x}$ and ends with the accepting configuration:

When $1 \le i \le N$, the DFA $\mathfrak A_{i}$ checks, for all $j< k$, whether the ith symbol of $\mathfrak c^j$ changes ‘according to $\gamma $’ in passing to $\mathfrak c^{j+1}$. The non-trash part of its transition function $\delta ^i$ is as follows, for $1<i<N$. (For $i=1$ and $i=N$ some adjustments are needed.) For all $u,u',v,w,w',y,z\in \varGamma \cup (Q\times \varGamma )$,

$$\begin{aligned}&\delta ^i_\sharp (t_i)=q^0,\quad \delta ^i_u(q^j)=q^{j+1},\ \text {for }j=0,...,i-3,\quad \delta ^i_u(q^{i-2})=r_u,\quad \delta ^i_v(r_u)=r_{uv},\\&\delta ^i_w(r_{uv})=q^0_{\gamma (u,v,w)},\quad \delta ^i_y(q^j_z)=q^{j+1}_z,\ \text {for }j=0,...,N-3\text {, }j\ne N-i-1,\\&\delta ^i_\sharp (q^{N-i-1}_z)=q^{N-i}_z,\quad \delta ^i_\flat (q^{N-i-1}_z)=f_i,\quad \delta ^i_{u'}(q^{N-2}_z)=p_{u'z},\ \ \delta ^i_z(p_{u'z})=r_{u'z},\\&\text {see below, where }z=\gamma (u,v,w)\text { and }z'=\gamma (u',z,w'): \end{aligned}$$

Finally, if $N+1\le i\le p$ then $\mathfrak A_i$ accepts all words over $\varSigma $ with a single occurrence of $\flat $, which is the input’s last character:

Note that $\mathfrak A_{p-1}=\mathfrak A_p$ as $p\ge N+2$. It is not hard to check that each of the $\mathfrak A_i$ is a minimal DFA that does not contain nontrivial cycles and the following holds:

Lemma 1

$\boldsymbol{M}$ accepts $\boldsymbol{x}$ iff $\bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i) \ne \emptyset $, in which case this language consists of a single word that encodes the accepting computation of $\boldsymbol{M}$ on $\boldsymbol{x}$.

Next, we require three sequences of DFAs $\mathfrak B^p_<$, $\mathfrak B^p_\equiv $ and $\mathfrak B^p_\mathsf {MOD}$, where $p>5$ is a prime number with $p\not \equiv \pm 1\ (\text {mod}\ 10)$; see the picture below for $p=7$.

In general, the first sequence is $\mathfrak B^p_< = \bigl (\{s_{i} \mid i< p \},\{a\},\delta ^{{\mathfrak B}_{<}^{p}},s_0,\{s_0\} \bigr )$, where $\delta ^{{\mathfrak B}_{<}^{p}}_a(s_i)=s_j$ if $i,j<p$ and $j\equiv i+1\ (\text {mod}\ p)$. Then ${\boldsymbol{L}}(\mathfrak B^p_<)$ comprises all words of the form $(a^{p})^*$, $\mathfrak B^p_<$ is the minimal DFA for ${\boldsymbol{L}}(\mathfrak B^p_<)$, and the syntactic monoid $M(\mathfrak B^p_<)$ is the cyclic group of order p (generated by the permutation $\smash {\delta ^{{\mathfrak B}_{<}^{p}}_a}$).

The second sequence is $\mathfrak B^p_\equiv = \bigl (\{s_{i} \mid i < p \},\{a,\natural \},\delta ^{{\mathfrak B}_{\equiv }^{p}},s_0,\{s_0\} \bigr )$, where $\delta ^{{\mathfrak B}_{\equiv }^{p}}_\natural (s_i)=s_i$ and $\delta ^{{\mathfrak B}_{\equiv }^{p}}_a(s_i)=s_j$ if $i,j<p$ and $j\equiv i+1\ (\text {mod}\ p)$. One can check that ${\boldsymbol{L}}(\mathfrak B^p_\equiv )$ comprises all words of a’s and $\natural $’s where the number of a’s is divisible by p, $\mathfrak B^p_\equiv $ is the minimal DFA for this language, and $M(\mathfrak B^p_\equiv )$ is also the cyclic group of order p (generated by the permutation $\delta ^{{\mathfrak B}_{\equiv }^{p}}_a$).

The third sequence is $\mathfrak B^p_\mathsf {MOD}= \bigl (\{s_{i} \mid i \le p \},\{a,\natural \},\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}},s_0,\{s_0\} \bigr )$, where

$\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_a(s_p)=s_p$, and $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_a(s_i)=s_j$ whenever $i,j<p$ and $j\equiv i+1\ (\text {mod}\ p)$;
$\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_\natural (s_0)=s_p$, $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_\natural (s_p)=s_0$, and $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_\natural (s_i)=s_j$ whenever $1\le i,j<p$ and $i\cdot j\equiv p-1\ (\text {mod}\ p)$, that is, $j = -1/i$ in the finite field $\mathbb F_p$.

One can check that $\mathfrak B^p_\mathsf {MOD}$ is the minimal DFA for its language, and the syntactic monoid $M(\mathfrak B^p_\mathsf {MOD})$ is the permutation group generated by $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_a$ and $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_\natural $.

Lemma 2

For any prime $p>5$ with $p\not \equiv \pm 1\ (\text {mod}\ 10)$, the group $M(\mathfrak B^p_\mathsf {MOD})$ is unsolvable, but all of its proper subgroups are solvable.

Proof

One can check that the order of the permutation $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_\natural $ is 2, that of $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_a$ is p, while the order of the inverse of $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{\natural a}$ is the same as the order of $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{\natural a}$, which is 3. So $M(\mathfrak B^p_\mathsf {MOD})$ is unsolvable, for any prime p, by the Kaplan–Levy criterion. To prove that all proper subgroups of $M(\mathfrak B^p_\mathsf {MOD})$ are solvable, we show that $M(\mathfrak B^p_\mathsf {MOD})$ is a subgroup of the projective special linear group $\text {PSL}_2(p)$. If p is a prime with $p>5$ and $p\not \equiv \pm 1\ (\text {mod}\ 10)$, then all proper subgroups of $\text {PSL}_2(p)$ are solvable; see, e.g., [17, Theorem 2.1]. (So $M(\mathfrak B^p_\mathsf {MOD})$ is in fact isomorphic to the unsolvable group $\text {PSL}_2(p)$.) Consider the set $P=\{0,1,\dots ,p-1,\infty \}$ of all points of the projective line over the field $\mathbb F_p$. By identifying $s_i$ with i for $i<p$, and $s_p$ with $\infty $, we may regard the elements of $M(\mathfrak B^p_\mathsf {MOD})$ as $P\rightarrow P$ functions. The group $\text {PSL}_2(p)$ consists of all $P\rightarrow P$ functions of the form $i\mapsto \frac{w\cdot i + x}{y\cdot i+z}$, where $w\cdot z-x\cdot y=1$, with the field arithmetic of $\mathbb F_p$ extended by $i+\infty =\infty $ for any $i\in P$, $0\cdot \infty =1$ and $i\cdot \infty =\infty $ for $i\ne 0$. One can check that the two generators of $M(\mathfrak B^p_\mathsf {MOD})$ are in $\text {PSL}_2(p)$: take $w=1$, $x=1$, $y=0$, $z=1$ for $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_a$, and $w=0$, $x=1$, $y=p-1$, $z=0$ for $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_\natural $.

Finally, we define three automata $\mathfrak A_<$, $\mathfrak A_\equiv $, $\mathfrak A_\mathsf {MOD}$ over the same tape alphabet $\varSigma _+= \varSigma \cup \{a_1,a_2,\natural \}$, where $a_1,a_2$ are fresh symbols. We take, respectively, $\mathfrak B^{p}_<$, $\mathfrak B^{p}_\equiv $, $\mathfrak B^{p}_\mathsf {MOD}$ and replace each transition $s_i\rightarrow _a s_j$ in them by a fresh copy of $\mathfrak A_i$, for $i \le p$, as shown in the picture below.

We make $\mathfrak A_<$, $\mathfrak A_\equiv $, $\mathfrak A_\mathsf {MOD}$ deterministic by adding a trash state $\textit{tr}$ looping on itself with every $y\in \varSigma _+$, and adding the missing transitions leading to $\textit{tr}$. It follows that $\mathfrak A_<$, $\mathfrak A_\equiv $, and $\mathfrak A_\mathsf {MOD}$ are minimal DFAs of size polynomial in $N$, $|\boldsymbol{M}|$.

Lemma 3

(i)
${\boldsymbol{L}}(\mathfrak A_<)$ is $\mathsf {FO}(<)$-definable iff $\bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i)= \emptyset $.
(ii)
${\boldsymbol{L}}(\mathfrak A_\equiv )$ is $\mathsf {FO}(<,\equiv )$-definable iff $\bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i)= \emptyset $.
(iii)
${\boldsymbol{L}}(\mathfrak A_\mathsf {MOD})$ is $\mathsf {FO}(<,\mathsf {MOD})$-definable iff $\bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i)= \emptyset $.

Proof

As $\mathfrak A_<,\mathfrak A_\equiv ,\mathfrak A_\mathsf {MOD}$ are minimal, we can replace $\sim $ by $=$ in the conditions of Theorem 1. For the ($\Rightarrow $) directions, given some $w\in \bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i) $, in each case we show how to satisfy the corresponding condition of Theorem 1: (i) take $u=a_1wa_2$, $q=s_0$, and $k=p$; (ii) take $u=a_1wa_2$, $v=\natural ^{|u|}$, $q=s_0$, and $k=p$; (iii) take $u=\natural $, $v=a_1wa_2$, $q = s_0$, $k = p$ and $l = 3$.

$(\Leftarrow )$ We show that the corresponding condition of Theorem 1 implies non-emptiness of $\bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i) $. To this end, we define a $\varSigma _+^*\rightarrow \{a,\natural \}^*$ homomorphism by taking $h(\natural )=\natural $, $h(a_1)=a$, and $h(b)=\varepsilon $ for all other $b\in \varSigma _+$.

(i) and (ii): Let $\circ \in \{<,\equiv \}$ and suppose q is a state in $\mathfrak A^p_\circ $ and $u'\in \varSigma _+^*$ such that $q\ne \delta ^{{\mathfrak A}_{\circ }^{p}}_{u'}(q)$ and $q=\delta ^{{\mathfrak A}_{\circ }^{p}}_{(u')^k}(q)$ for some k. Let $S=\{s_0,s_1,\dots ,s_{p-1}\}$. We claim that there exist $s\in S$ and $u\in \varSigma _+^*$ such that

$$\begin{aligned}&s\ne \delta ^{{\mathfrak A}_{\circ }^{p}}_{u}(s),\end{aligned}$$

(13)

$$\begin{aligned}&\delta ^{{\mathfrak A}_{\circ }^{p}}_{x}(s)\in S,\quad \text {for every }x\in \{u\}^*. \end{aligned}$$

(14)

Indeed, observe that none of the states along the cyclic $q\rightarrow _{(u')^k} q$ path $\varPi $ in $\mathfrak A^p_\circ $ is $\textit{tr}$. So there is some state along $\varPi $ that is in S, as otherwise one of the $\mathfrak A_i$ would contain a nontrivial cycle. Therefore, $u'$ must be of the form $w\natural ^n a_1w'$ for some $w\in \varSigma ^*$, $n<\omega $ and $w'\in \varSigma _+^*$. It is easy to see that $s=\delta ^{{\mathfrak A}_{\circ }^{p}}_{(u')^{k-1}w}(q)$ and $u=\natural ^na_1w'w$ is as required in (13) and (14).

As $M(\mathfrak B^p_\circ )$ is a finite group, the set $\bigl \{\delta ^{{\mathfrak B}_{\circ }^{p}}_{h(x)}\mid x\in \{u\}^*\bigr \}$ forms a subgroup $\mathfrak G$ in it (the subgroup generated by $\delta ^{{\mathfrak B}_{\circ }^{p}}_{h(u)}$). We show that $\mathfrak G$ is nontrivial by finding a nontrivial homomorphic image of it. To this end, (14) implies that, for every $x\in \{u\}^*$, the restriction $\delta ^{{\mathfrak A}_{\circ }^{p}}_{x}\!\!\restriction _{S'}$ of $\delta ^{{\mathfrak A}_{\circ }^{p}}_{x}$ to the set $S'=\bigl \{ \delta ^{{\mathfrak A}_{\circ }^{p}}_{y}(s)\mid y\in \{u\}^*\bigr \}$ is an $S'\rightarrow S'$ function and $\delta ^{{\mathfrak A}_{\circ }^{p}}_{x}\!\!\restriction _{S'}=\delta ^{{\mathfrak B}_{\circ }^{p}}_{h(x)}\!\!\restriction _{S'}$. As $M(\mathfrak B^p_\circ )$ is a group of permutations on a set containing $S'$, $\delta ^{{\mathfrak B}_{\circ }^{p}}_{h(x)}\!\!\restriction _{S'}$ is a permutation of $S'$, for every $x\in \{u\}^*$. Thus, $\bigl \{\delta ^{{\mathfrak B}_{\circ }^{p}}_{h(x)}\!\!\restriction _{S'}\mid x\in \{u\}^*\bigr \}$ is a homomorphic image of $\mathfrak G$ that is nontrivial by (13).

As $\mathfrak G$ is a nontrivial subgroup of the cyclic group $M(\mathfrak B^p_\circ )$ of order $p$ and $p$ is a prime, $\mathfrak G=M(\mathfrak B^p_\circ )$. Then there is $x\in \{u\}^*$ with $\delta ^{{\mathfrak B}_{\circ }^{p}}_{h(x)}=\delta ^{{\mathfrak B}_{\circ }^{p}}_{a}$ (a permutation containing the $p$-cycle $(s_0 s_1\dots s_{p-1})$ ‘around’ all elements of S), and so $S'=S$ and $x=\natural ^n a_1wa_2w'$ for some $n<\omega $, $w\in \varSigma ^*$, and $w'\in \varSigma _+^*$. As $n=0$ when $\circ =<$ and $\delta ^{{\mathfrak A}_{\equiv }^{p}}_{\natural ^n}(s)$ for every $s\in S$, $S'=S$ implies that $w\in \bigcap _{i=0}^{p-1} {\boldsymbol{L}}(\mathfrak A_i)=\bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i)$.

(iii) Suppose q is a state in $\mathfrak A^p_\mathsf {MOD}$ and $u',v'\in \varSigma _+^*$ such that $q\ne \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{u'}(q)$, $q\ne \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{v'}(q)$, $q\ne \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{u'v'}(q)$, and $\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}(q)=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x(u')^2}(q)=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x(v')^k}(q)=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x(u'v')^l}(q)$ for some odd prime k and number l that is coprime to both 2 and k. Take $S=\{s_0,s_1,\dots ,s_{p}\}$. We claim that there exist $s\in S$ and $u,v\in \varSigma _+^*$ such that

$$\begin{aligned}&s\ne \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{u}(s),\ s\ne \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{v}(s),\ s\ne \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{uv}(s),\end{aligned}$$

(15)

$$\begin{aligned}&\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}(s)\in S,\quad \text {for every }x\in \{u,v\}^*,\end{aligned}$$

(16)

$$\begin{aligned}&\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}(s)=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{xu^2}(s)=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{xv^k}(s)=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x(uv)^l}(s),\quad \text {for every }x\in \{u,v\}^*. \end{aligned}$$

(17)

Indeed, by an argument similar to the one in the proof of (i) and (ii) above, we must have $u'=w_u\natural ^n a_1w'_u$ and $v'=w_v\natural ^m a_1w'_v$ for some $w_u,w_v\in \varSigma ^*$, $n,m<\omega $ and $w'_u,w'_v\in \varSigma _+^*$. For every $x\in \{u,v\}^*$, as both $\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{xw_u}(q)$ and $\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{xw_v}(q)$ are in S, they must be the same state. Using this it is not hard to see that $s=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{u'w_u}(q)$, $u=\natural ^na_1w'_uw_u$ and $v=\natural ^ma_1w'_vw_v$ are as required in (15)–(17).

As $M(\mathfrak B^p_\mathsf {MOD})$ is a finite group, the set $\bigl \{\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(x)}\mid x\in \{u,v\}^*\bigr \}$ forms a subgroup $\mathfrak G$ in it (the subgroup generated by $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(u)}$ and $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(v)}$). We show that $\mathfrak G$ is unsolvable by finding an unsolvable homomorphic image of it. To this end, we let $S'=\bigl \{ \delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{y}(s)\mid y\in \{u,v\}^*\bigr \}$. Then (16) implies that $S'\subseteq S$ and

$$\begin{aligned} \delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(x)}(s')=\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}(s')\in S',\quad \text {for all }s'\in S\text { and }x\in \{u,v\}^*, \end{aligned}$$

(18)

and so the restriction $\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}\!\!\restriction _{S'}$ of $\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}$ to $S'$ is an $S'\rightarrow S'$ function and $\delta ^{{\mathfrak A}_{\mathsf {MOD}}^{p}}_{x}\!\!\restriction _{S'}=\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(x)}\!\!\restriction _{S'}$. As $M(\mathfrak B^p_\mathsf {MOD})$ is a group of permutations on a set containing $S'$, $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(x)}\!\!\restriction _{S'}$ is a permutation of $S'$, for any $x\in \{u,v\}^*$. So $\{\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(x)}\!\!\restriction _{S'}\mid x\in \{u,v\}^*\!\}$ is a homomorphic image of $\mathfrak G$ that is unsolvable by the Kaplan–Levy criterion: By (15), (17), and 2 and k being primes, the order of the permutation $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(u)}\!\!\restriction _{S'}$ is 2, the order of $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(v)}\!\!\restriction _{S'}$ is k, and the order of $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(uv)}\!\!\restriction _{S'}$ (which is the same as the order of its inverse) is a $>1$ divisor of l, and so coprime to both 2 and k.

As $\mathfrak G$ is an unsolvable subgroup of $M(\mathfrak B^p_\mathsf {MOD})$, it follows from Lemma 2 that $\mathfrak G=M(\mathfrak B^p_\mathsf {MOD})$, and so $\{u,v\}^*\not \subseteq \natural ^*$. We claim that $S'=S$ also follows. Indeed, let $x\in \{u,v\}^*$ be such that $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(x)}=\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{a}$. As $|S'|\ge 2$ by (15), $s\in \{s_0,\dots ,s_{p-1}\}$ must hold, and so $\{s_0,\dots ,s_{p-1}\}\subseteq S'$ follows by (18). As there is $y\in \{u,v\}^*$ with $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{h(y)}=\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{\natural }$, $s_p\in S'$ also follows by (18). Finally, as $\{u,v\}^*\not \subseteq \natural ^*$, there is $x\in \{u,v\}^*$ of the form $\natural ^na_1wa_2w'$, for some $n<\omega $, $w\in \varSigma $ and $w'\in \varSigma _+^*$. As $S'=S$, $\delta ^{{\mathfrak B}_{\mathsf {MOD}}^{p}}_{x}(s_i)\in S$ for every $i\le p$, and so $w\in \bigcap _{i=0}^{p} {\boldsymbol{L}}(\mathfrak A_i)$.

Theorem 2 clearly follows from Lemmas 1 and 3.

5 Deciding $\mathcal {L}$-definability of 2NFAs in $\textsc {PSpace}$

Using the criterion Theorem 1 (i), Stern [25] showed that deciding whether the language of any given DFA is $\mathsf {FO}(<)$-definable can be done in $\textsc {PSpace}$. In this section, we also use the criteria of Theorem 1 to provide $\textsc {PSpace}$-algorithms deciding whether the language of any given 2NFA is $\mathcal {L}$-definable, whenever $\mathcal {L}\in \{ \mathsf {FO}(<), \mathsf {FO}(<,\equiv ), \mathsf {FO}(<,\mathsf {MOD})\}$. Let $\mathfrak A= (Q, \varSigma , \delta , Q_0, F)$ be a 2NFA. Following [9], we first construct a(n exponential size) DFA $\mathfrak A'$ such that ${\boldsymbol{L}}(\mathfrak A)={\boldsymbol{L}}(\mathfrak A')$. To this end, for any $w \in \varSigma ^+$, we introduce four binary relations $\mathsf {b}_{lr}(w)$, $\mathsf {b}_{rl}(w)$, $\mathsf {b}_{rr}(w)$, and $\mathsf {b}_{ll}(w)$ on Q describing the left-to-right, right-to-left, right-to-right, and left-to-left behaviour of $\mathfrak A$ on w. Namely,

$(q,q') \in \mathsf {b}_{lr}(w)$ if there is a run of $\mathfrak A$ on w from (q, 0) to $(q', |w|)$;
$(q,q') \in \mathsf {b}_{rr}(w)$ if there is a run of $\mathfrak A$ on w from $(q, |w|-1)$ to $(q', |w|)$;
$(q,q') \in \mathsf {b}_{rl}(w)$ if, for some $a \in \varSigma $, there is a run on aw from $(q, |aw|-1)$ to $(q', 0)$ such that no $(q'',0)$ occurs in it before $(q', 0)$;
$(q,q') \in \mathsf {b}_{ll}(w)$ if, for some $a \in \varSigma $, there is a run on aw from (q, 1) to $(q', 0)$ such that no $(q'',0)$ occurs in it before $(q', 0)$.

For $w = \varepsilon $ (the empty word), we define the $\mathsf {b}_{ij}(w)$ as the identity relation on Q. Let $\mathsf {b} = (\mathsf {b}_{lr}, \mathsf {b}_{rl}, \mathsf {b}_{rr}, \mathsf {b}_{ll})$, where the $\mathsf {b}_{ij}$ are the behaviours of $\mathfrak A$ on some $w \in \varSigma ^*$, in which case we can also write $\mathsf {b}(w)$, and let $\mathsf {b}' = \mathsf {b}(w')$, for some $w' \in \varSigma ^*$. We define the composition $\mathsf {b} \cdot \mathsf {b}' = \mathsf {b}''$ with components $\mathsf {b}_{ij}''$ as follows. Let X and Y be the transitive closure of $\mathsf {b}_{ll}' \circ \mathsf {b}_{rr}$ and $\mathsf {b}_{rr} \circ \mathsf {b}_{ll}'$, respectively. Then we set:

$$\begin{aligned}&\mathsf {b}_{lr}'' = \mathsf {b}_{lr} \circ \mathsf {b}_{lr}' \cup \mathsf {b}_{lr} \circ X \circ \mathsf {b}_{lr}',\qquad \mathsf {b}_{rl}'' = \mathsf {b}_{rl}' \circ \mathsf {b}_{rl} \cup \mathsf {b}_{rl}' \circ Y \circ \mathsf {b}_{rl}, \\&\mathsf {b}_{rr}'' = \mathsf {b}_{rr}' \cup \mathsf {b}_{rl}' \circ Y \circ \mathsf {b}_{rr} \circ \mathsf {b}_{lr}',\qquad \mathsf {b}_{ll}'' = \mathsf {b}_{ll} \cup \mathsf {b}_{lr} \circ X \circ \mathsf {b}_{ll}' \circ \mathsf {b}_{rl}. \end{aligned}$$

One can check that $\mathsf {b}'' = \mathsf {b}(ww')$. Define a DFA $\mathfrak A' = (Q', \varSigma , \delta ', q_0', F')$ by taking

$$\begin{aligned}&Q' = \bigl \{ (B_{lr}, B_{rr}) \mid B_{lr} \subseteq Q_0 \times Q, \ B_{rr} \subseteq Q \times Q \bigr \},\ \ q_0' = \bigl (\bigl \{(q,q) \mid q \in Q_0\bigr \}, \emptyset \bigr ),\\&F' = \bigl \{(B_{lr}, B_{rr}) \mid (q_0, q) \in B_{lr}, \text { for some }q_0 \in Q_0\text { and }q\in F \bigr \},\\&\delta '_a\bigl ((B_{lr}, B_{rr})\bigr ) =(B_{lr}', B_{rr}'), \text { with}\ B_{lr}' = B_{lr} \circ X(a) \circ \mathsf {b}_{lr}(a),\\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \ \ \, B_{rr}' = B_{rr} \cup \mathsf {b}_{rl}(a) \circ Y(a) \circ \mathsf {b}_{lr}(a), \end{aligned}$$

where X(a) and Y(a) are the reflexive and transitive closures of $\mathsf {b}_{ll}(a) \circ B_{rr}$ and $B_{rr} \circ \mathsf {b}_{ll}(a)$, respectively. It is not hard to see that, for any $w\in \varSigma ^*$,

$$\begin{aligned}&\delta '_w\bigl ((B_{lr}, B_{rr})\bigr ) =(B_{lr}', B_{rr}')\ \text {iff}\ B_{lr}' = B_{lr} \circ X(w) \circ \mathsf {b}_{lr}(w) \text { and} \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \ \ \ \ B_{rr}' = B_{rr} \cup \mathsf {b}_{rl}(w) \circ Y(w) \circ \mathsf {b}_{lr}(w), \end{aligned}$$

(19)

where X(w) and Y(w) are the reflexive and transitive closures of $\mathsf {b}_{ll}(w) \circ B_{rr}$ and $B_{rr} \circ \mathsf {b}_{ll}(w)$, respectively. Also, one can show in a way similar to [24, 29] that

$$\begin{aligned} {\boldsymbol{L}}(\mathfrak A) = {\boldsymbol{L}}(\mathfrak A'). \end{aligned}$$

(20)

Next, we show that, even if the size of $\mathfrak A'$ is exponential in $\mathfrak A$, we can still use Theorem 1 to decide $\mathcal {L}$-definability of ${\boldsymbol{L}}(\mathfrak A)$ in $\textsc {PSpace}$:

Theorem 3

For $\mathcal {L}\in \{ \mathsf {FO}(<), \mathsf {FO}(<,\equiv ), \mathsf {FO}(<,\mathsf {MOD})\}$, deciding $\mathcal {L}$-definability of ${\boldsymbol{L}}(\mathfrak A)$, for any 2NFA $\mathfrak A$, is in $\textsc {PSpace}$.

Proof

Let $\mathfrak A'$ be the DFA defined above for the given 2NFA $\mathfrak A$. By Theorem 1 (i) and (20), ${\boldsymbol{L}}(\mathfrak A)$ is not $\mathsf {FO}(<)$-definable iff there exist a word $u\in \varSigma ^*$, a reachable state $q \in Q'$, and a number $k\le |Q'|$ such that $q\not \sim \delta '_u(q)$ and $q= \delta '_{u^k}(q)$. We guess the required k in binary, q and a quadruple $\mathsf {b}(u)$ of binary relations on Q. Clearly, they all can be stored in polynomial space in $|\mathfrak A|$. To check that our guesses are correct, we first check that $\mathsf {b}(u)$ indeed corresponds to some $u \in \varSigma ^*$. This is done by guessing a sequence $\mathsf {b}_0, \dots , \mathsf {b}_n$ of distinct quadruples of binary relations on Q such that $\mathsf {b}_0 = \mathsf {b}(u_0)$ and $\mathsf {b}_{i+1} = \mathsf {b}_{i} \cdot \mathsf {b}(u_{i+1})$, for some $u_0, \dots , u_n \in \varSigma $. (Any sequence with a subsequence starting after $\mathsf {b}_i$ and ending with $\mathsf {b}_{i+m}$, for some i and m such that $\mathsf {b}_i = \mathsf {b}_{i+m}$, is equivalent, in the context of this proof, to the sequence with such a subsequence removed.) Thus, we can assume that $n \le 2^{O(|Q|)}$, and so n can be guessed in binary and stored in $\textsc {PSpace}$. So, the stage of our algorithm checking that $\mathsf {b}(u)$ corresponds to some $u \in \varSigma ^*$ makes n iterations and continues to the next stage if $\mathsf {b}_n = \mathsf {b}(u)$ or terminates with an answer $\mathsf {no}$ otherwise. Now, using $\mathsf {b}(u)$, we compute $\mathsf {b}(u^k)$ by means of a sequence $\mathsf {b}_0, \dots , \mathsf {b}_k$, where $\mathsf {b}_0 = \mathsf {b}(u)$ and $\mathsf {b}_{i+1} = \mathsf {b}_{i} \cdot \mathsf {b}(u)$. With $\mathsf {b}(u)$ ($\mathsf {b}(u^k)$), we compute $\delta '_{u}(q)$ (respectively, $\delta '_{u^k}(q)$) in $\textsc {PSpace}$ using (19). If $\delta '_{u^k}(q) \ne q$, the algorithm terminates with an answer $\mathsf {no}$. Otherwise, in the final stage of the algorithm, we check that $\delta '_{u}(q) \not \sim q$. This is done by guessing $v \in \varSigma ^*$ such that $\delta '_v(q) = q_1$, $\delta '_v\bigl (\delta '_{u}(q)\bigr ) = q_2$, and $q_1 \in F'$ iff $q_1 \not \in F'$. We guess such a v (if exists) in the form of $\mathsf {b}(v)$ using an algorithm analogous to that for guessing u above.

By Theorem 1 (ii) and (20), ${\boldsymbol{L}}(\mathfrak A)$ is not $\mathsf {FO}(<,\equiv )$-definable iff there there exist words $u,v\in \varSigma ^*$, a reachable state $q \in Q'$, and a number $k\le |Q'|$ such that $q\not \sim \delta '_u(q)$, $q=\delta '_{u^k}(q)$, $|v|=|u|$, and $\delta '_{u^i}(q)=\delta '_{u^iv}(q)$, for all $i<k$. We outline how to modify the algorithm for $\mathsf {FO}(<)$ above to check $\mathsf {FO}(<,\equiv )$-definability. First, we need to guess and check v in the form of $\mathsf {b}(v)$ in parallel with guessing and checking u in the form of $\mathsf {b}(u)$, making sure that $|v| = |u|$. For that, we guess a sequence of distinct pairs $(\mathsf {b}_0, \mathsf {b}_0'), \dots , (\mathsf {b}_n, \mathsf {b}_n')$ such that the $\mathsf {b}_i$ are as above, $\mathsf {b}_0' = \mathsf {b}(v_0)$ and $\mathsf {b}_{i+1}' = \mathsf {b}_{i}' \cdot \mathsf {b}(v_{i+1})$, for some $v_0, \dots , v_n \in \varSigma $. (Any such sequence with a subsequence starting after $(\mathsf {b}_i, \mathsf {b}_i')$ and ending with $(\mathsf {b}_{i+m}, \mathsf {b}_{i+m}')$, for some i and m such that $(\mathsf {b}_i, \mathsf {b}_i') = (\mathsf {b}_{i+m}, \mathsf {b}_{i+m}')$, is equivalent to the sequence with that subsequence removed.) So $n \le 2^{O(|Q|)}$. For each $i<k$, we can then compute $\delta '_{u^i}(q)$ and $\delta '_{u^iv}(q)$, using (19), and check whether whether they are equal.

Finally, by Theorem 1 (iii) and (20), ${\boldsymbol{L}}(\mathfrak A)$ is not $\mathsf {FO}(<,\mathsf {MOD})$-definable iff there exist $u,v\in \varSigma ^*$, a reachable state $q\in Q'$ and $k,l\le |Q'|$ such that k is an odd prime, $l>1$ and coprime to both 2 and k, $q\not \sim \delta '_u(q)$, $q\not \sim \delta '_v(q)$, $q\not \sim \delta '_{uv}(q)$, and $\delta '_{x}(q)\sim \delta '_{xu^{2}}(q)\sim \delta '_{xv^{k}}(q)\sim \delta '_{x(uv)^{l}}(q)$, for all $x\in \{u,v\}^*$. We start by guessing $u,v \in \varSigma ^*$ in the form of $\mathsf {b}(u)$ and $\mathsf {b}(u)$, respectively. Also, we guess k and l in binary and check that k is an odd prime and l is coprime to both 2 and k. By (19), $\delta '_x$ is determined by $\mathsf {b}(x)$, for any $x\in \{u,v\}^*$. Thus, we can proceed as follows to verify that u, v, k and l are as required. We perform the following steps, for each quadruple $\mathsf {b}$ of binary relations on Q. First, we check whether $\mathsf {b} = \mathsf {b}(x)$, for some $x \in \{u,v\}^*$ (we discuss the algorithm for this below). If this is not the case, we construct the next quadruple $\mathsf {b}'$ and process it as this $\mathsf {b}$. If it is the case, we compute all the states $\delta '_{x}(q)$, $\delta '_{xu^{2}}(q)$, $\delta '_{xv^{k}}(q)$, $\delta '_{x(uv)^{l}}(q)$, $\delta '_{u}(q)$, $\delta '_{v}(q)$, $\delta '_{uv}(q)$, and check their required (non)equivalences w.r.t. $\sim $, using the same method as for checking $\delta '_{u}(q) \not \sim q$ above. If they do not hold as required, our algorithm terminates with an answer $\mathsf {no}$. Otherwise, we construct the next quadruple $\mathsf {b}'$ and process it as this $\mathsf {b}$. When all possible quadruples $\mathsf {b}$ of binary relations of Q have been processed, the algorithm terminates with an answer $\mathsf {yes}$.

Now, to check that a given quadruple $\mathsf {b}$ is equal to $\mathsf {b}(x)$, for some $x \in \{u,v\}^*$, we simply guess a sequence $\mathsf {b}_0, \dots , \mathsf {b}_n$ of quadruples of binary relations on Q such that $\mathsf {b}_0 = \mathsf {b}(w_0)$, $\mathsf {b}_n = \mathsf {b}$ and $\mathsf {b}_{i+1} = \mathsf {b}_{i} \cdot \mathsf {b}(w_{i+1})$, where $w_i \in \{u, v\}$. It follows from the argument above that it is enough to consider $n \le 2^{O(|Q|)}$.

6 Further Research

The results obtained in this paper have been used for deciding the rewritability type of ontology-mediated queries (OMQs) given in linear temporal logic LTL [22]. As mentioned in the introduction, LTL OMQs can be simulated by automata. In the worst case, the automata are of exponential size, and deciding FO-rewritability of some OMQs may become $\textsc {ExpSpace}$-complete. On the other hand, there are natural and practically important fragments of LTL with automata of special forms whose FO-rewritability can be decided in $\textsc {PSpace}$, $\varPi ^p_2$ or $\textsc {coNP}$. However, it remains to be seen whether the corresponding algorithms, even in the simplest case of $\mathsf {FO}(<)$-definability, are efficient enough for applications in temporal OBDA. Note that the problems considered in this paper are also relevant to the optimisation problem for recursive SQL queries.

References

Artale, A., et al.: Ontology-mediated query answering over temporal data: a survey. In: Schewe, S., Schneider, T., Wijsen, J. (eds.) TIME, vol. 90, pp. 1–37 (2017)
Google Scholar
Artale, A., et al.: First-order rewritability of ontology-mediated queries in linear temporal logic. Artif. Intell. 299, 103536 (2021)
Article MathSciNet Google Scholar
Barrington, D.: Bounded-width polynomial-size branching programs recognize exactly those languages in NC${^1}$. J. Comput. Syst. Sci. 38(1), 150–164 (1989)
Article MathSciNet Google Scholar
Barrington, D., Compton, K., Straubing, H., Thérien, D.: Regular languages in NC${^1}$. J. Comput. Syst. Sci. 44(3), 478–499 (1992)
Article MathSciNet Google Scholar
Beaudry, M., McKenzie, P., Thérien, D.: The membership problem in aperiodic transformation monoids. J. ACM 39(3), 599–616 (1992)
Article MathSciNet Google Scholar
Bennett, M., Martin, G., O’Bryant, K., Rechnitzer, A.: Explicit bounds for primes in arithmetic progressions. Illinois J. of Math. 62(1–4), 427–532 (2018)
MathSciNet MATH Google Scholar
Bernátsky, L.: Regular expression star-freeness is PSPACE-complete. Acta Cybern. 13(1), 1–21 (1997)
MathSciNet MATH Google Scholar
Büchi, J.R.: Weak second-order arithmetic and finite automata. Z. Math. Logik und Grundlagen der Math. 6(1–6), 66–92 (1960)
Article MathSciNet Google Scholar
Carton, O., Dartois, L.: Aperiodic two-way transducers and fo-transductions. In: Kreutzer, S. (ed.) CSL 2015, vol. 41 pp. 160–174 (2015)
Google Scholar
Cho, S., Huynh, D.: Finite-automaton aperiodicity is PSPACE-complete. Theor. Comp. Sci. 88(1), 99–116 (1991)
Article MathSciNet Google Scholar
Compton, K., Laflamme, C.: An algebra and a logic for NC${^1}$. Inf. Comput. 87(1/2), 240–262 (1990)
MathSciNet MATH Google Scholar
Elgot, C.: Decision problems of finite automata design and related arithmetics. Trans. Am. Math. Soc. 98, 21–51 (1961)
Article MathSciNet Google Scholar
Fleischer, L., Kufleitner, M.: The intersection problem for finite monoids. In: Niedermeier, R., Vallée, B. (eds.) STACS 2018, vol. 96, pp. 1–14 (2018)
Google Scholar
Holzer, M., König, B.: Regular languages, sizes of syntactic monoids, graph colouring, state complexity results, and how these topics are related to each other. Bull. EATCS 38, 139–155 (2004)
MathSciNet MATH Google Scholar
Jukna, S.: Boolean Function Complexity. Advances and Frontiers, vol. 27. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-24508-4
Kaplan, G., Levy, D.: Solvability of finite groups via conditions on products of 2-elements and odd p-elements. Bull. Austr. Math. Soc. 82(2), 265–273 (2010)
Article MathSciNet Google Scholar
King, O.: The subgroup structure of finite classical groups in terms of geometric configurations. In: Webb, B. (ed.) Surveys in Combinatorics, Society Lecture Note Series, vol. 327, pp. 29–56. Cambridge University Press (2005)
Google Scholar
Kozen, D.: Lower bounds for natural proof systems. In: Proceedings of FOCS 1977, pp. 254–266. IEEE Computer Society Press (1977)
Google Scholar
McNaughton, R., Papert, S.: Counter-free automata. The MIT Press, Cambridge (1971)
MATH Google Scholar
Poggi, A., Lembo, D., Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: Linking data to ontologies. J. Data Seman. 10, 133–173 (2008)
MATH Google Scholar
Rotman, J.: An introduction to the theory of groups. Springer-Verlag, New York (1999). https://doi.org/10.1007/978-1-4612-4176-8
Book Google Scholar
Ryzhikov, V., Savateev, Y., Zakharyaschev, M.: Deciding FO-rewritability of ontology-mediated queries in linear temporal logic. In: Combi, C., Eder, J., Reynolds, M. (eds.) TIME 2021, LIPIcs, pp. 6:1–7:15 (2021)
Google Scholar
Schützenberger, M.: On finite monoids having only trivial subgroups. Inf. Control 8(2), 190–194 (1965)
Article MathSciNet Google Scholar
Shepherdson, J.C.: The reduction of two-way automata to one-way automata. IBM J. Res. Dev. 3(2), 198–200 (1959)
Article MathSciNet Google Scholar
Stern, J.: Complexity of some problems from the theory of automata. Inf. Control 66(3), 163–176 (1985)
Article MathSciNet Google Scholar
Straubing, H.: Finite Automata, Formal Logic, and Circuit Complexity. Birkhauser Verlag, Basel (1994)
Book Google Scholar
Thompson, J.: Nonsolvable finite groups all of whose local subgroups are solvable. Bull. Amer. Math. Soc. 74(3), 383–437 (1968)
Google Scholar
Trakhtenbrot, B.: Finite automata and the logic of one-place predicates. Siberian Math. J. 3, 103–131 (1962)
MathSciNet MATH Google Scholar
Vardi, M.: A note on the reduction of two-way automata to one-way atuomata. Inf. Process. Lett. 30(5), 261–264 (1989)
Article Google Scholar
Xiao, G., et al.: Ontology-based data access: a survey. In: Lang, J. (ed.) Proceedings of IJCAI 2018, pp. 5511–5519 (2018). ijcai.org
Google Scholar

Download references

Acknowledgements

This work was supported by UK EPSRC EP/S032282.

Author information

Authors and Affiliations

King’s College London, London, UK
Agi Kurucz
Birkbeck, University of London, London, UK
Vladislav Ryzhikov, Yury Savateev & Michael Zakharyaschev
HSE University, Moscow, Russia
Yury Savateev & Michael Zakharyaschev

Authors

Agi Kurucz
View author publications
You can also search for this author in PubMed Google Scholar
Vladislav Ryzhikov
View author publications
You can also search for this author in PubMed Google Scholar
Yury Savateev
View author publications
You can also search for this author in PubMed Google Scholar
Michael Zakharyaschev
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Vladislav Ryzhikov .

Editor information

Editors and Affiliations

École polytechnique, Palaiseau, France
Uli Fahrenberg
Université Côte d’Azur, Nice, France
Mai Gehrke
Aix-Marseille University, Marseille, France
Luigi Santocanale
Brock University, St Catharines, ON, Canada
Michael Winter

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Kurucz, A., Ryzhikov, V., Savateev, Y., Zakharyaschev, M. (2021). Deciding FO-definability of Regular Languages. In: Fahrenberg, U., Gehrke, M., Santocanale, L., Winter, M. (eds) Relational and Algebraic Methods in Computer Science. RAMiCS 2021. Lecture Notes in Computer Science(), vol 13027. Springer, Cham. https://doi.org/10.1007/978-3-030-88701-8_15

Download citation

DOI: https://doi.org/10.1007/978-3-030-88701-8_15
Published: 22 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88700-1
Online ISBN: 978-3-030-88701-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Deciding FO-definability of Regular Languages

Abstract

Similar content being viewed by others

On Decidability of Regular Languages Theories

Deciding Open Definability via Subisomorphisms

Complexity of Proper Prefix-Convex Regular Languages

1 Introduction