1 Introduction

Motivations. An important aspect of automata theory is the definition of automata subclasses with particular properties, of algorithmic interest for instance. As an example, the inclusion problem for non-deterministic finite automata is PSpace-c but becomes PTime if the automata are k-ambiguous for a fixed k [21].

By automata theory, we mean automata in the general sense of finite state machines processing finite words. This includes what we call automata with outputs, which may also produce output values in a fixed monoid \(\mathbb {M}= (D,\oplus ,\mathbb {0})\). In such an automaton, the transitions are extended with an (output) value in D, and the value of an accepting path is the sum (for \(\oplus \)) of all the values occurring along its transitions. Automata over finite words in \(\varLambda ^*\) and with outputs in \(\mathbb {M}\) define subsets of \(\varLambda ^*\times D\) as follows: to any input word \(w\in \varLambda ^*\), we associate the set of values of all the accepting paths on w. For example, transducers are automata with outputs in a free monoid: they process input words and produce output words and therefore define binary relations of finite words [15].

The many decidability properties of finite automata do not carry over to transducers, and many restrictions have been defined in the literature to recover decidability, or just to define subclasses relevant to particular applications. The inclusion problem for transducer is undecidable [13], but decidable for finite-valued transducers [23]. Another well-known subclass is that of the determinisable transducers [5], defining sequential functions of words. Finite-valuedness and determinisability are two properties decidable in PTime, i.e., it is decidable in PTime, given a transducer, whether it is finite-valued (resp. determinisable). As a second example of automata with outputs, we also consider sum-automata, i.e. automata with outputs in \((\mathbb {Z},+,0)\), which defines relations from words to \(\mathbb {Z}\). Properties such as functionality, determinisability, and k-valuedness (for a fixed k) are decidable in PTime for sum-automata [10, 11].

In our experience, it is quite often the case that deciding a subclass goes in two steps: (1) define a characterisation of the subclass through a “simple” pattern, (2) show how to decide the existence of a such a pattern. For instance, the determinisable transducers have been characterised via the so called twinning property [4, 6, 24], which, said briefly, asks that the output words produced by any two different paths on input words of the form \(uv^n\) cannot differ unboundedly when n grows, with a suitable definition of “differ”. Quite often, the most difficult part is step (1) and step (2) is technical but less difficult to achieve, as long as we do not seek for optimal complexity bounds (by this we mean that PTime is good enough, and obtaining the best polynomial degree is not the objective). We even noticed that in transducer theory, even though step (2) share common techniques (reduction to emptiness of reversal-bounded counter machines for instance), the algorithms are often ad-hoc to the particular subclass considered. Here is a non-exhaustive list of subclasses of transducers which are decidable in PTime: determinisable transducers [1, 4,5,6,7, 24], functional transducers [4, 5], k-sequential transducers (for a fixed k) [8], multi-sequential transducers [7, 16], k-valued transducers (for a fixed k) [14], finite-valued transducers [18, 23]. Our goal in this paper is to define a common tool for step (2), i.e., define a generic way of deciding a subclass characterised through a structural pattern. More precisely, we want to define logics, tailored to particular monoids \(\mathbb {M}\), able to express properties of automata with outputs in \(\mathbb {M}\), such that model-checking these properties on given automata can be done in PTime.

Contributions. We define a general logic, denoted \(\textsf {PL}[\mathcal {O}]\) for “pattern logic”, to express properties of automata with outputs in a fixed monoid \(\mathbb {M}= (D,\oplus ,\mathbb {0})\). This logic is parameterised by a set of predicates \(\mathcal {O}\) interpreted on D. We first give sufficient conditions under which the problem of model-checking an automaton with outputs in \(\mathbb {M}\) against a formula in this logic is decidable. Briefly, these conditions require the existence of a machine model accepting tuples of runs which satisfy the atomic predicates of the logic, is closed under union and intersection, and has decidable emptiness problem.

Then, we study three particular classes of automata with outputs: finite automata (which can be seen as automata with outputs in a trivial monoid with a single element), transducers (automata with outputs in a free monoid), and sum-automata (automata with outputs in \((\mathbb {Z},+,0)\)). For each of them, we define particular logics, called \(\textsf {PL}_\text {NFA}\), \(\textsf {PL}_\text {Trans}\) and \(\textsf {PL}_\text {Sum}\) to express properties of automata with outputs in these particular monoids. Formulas in these logics have the following form:

$$ \exists \pi _1:p_1\xrightarrow {u_1\mid v_1} q_1,\dots ,\exists \pi _n:p_n\xrightarrow {u_n\mid v_n} q_n, \mathcal {C}$$

where the \(\pi _i\) are path variables, the \(p_i,q_i\) are state variables, the \(u_i\) are (input) word variables and the \(v_i\) are output value variables (interpreted in D). The subformula \(\mathcal {C}\) is a quantifier free Boolean combinations of constraints talking about states, paths, input words and output values. Such a formula expresses the fact that there exists a path \(\pi _1\) from some state \(p_1\) to some state \(q_1\), over some input word \(u_1\), producing some value \(v_1\), some path \(\pi _2\) etc. such that they all satisfy the constraints in \(\mathcal {C}\). In the three logics, paths can be tested for equality. Input words can be compared with the prefix relation, w.r.t. their length, and their membership to a regular language be tested. States can be compared for equality, and it can be expressed whether they are initial or final.

The predicates we take for the output values depends on the monoids. For transducers, output words can be compared with the non-prefix relation (and by derivation \(\ne \)), a predicate which cannot be negated (otherwise model-checking becomes undecidable), and can also be compared with respect to their length, and membership to a regular language can be tested. For sum-automata, the output values can be compared with < (and by derivation \(=,\ne ,\le \)). As an example, a transducer (resp. sum-automaton) is not \((n-1)\)-valued iff it satisfies the following \(\textsf {PL}_\text {Trans}\)-formula (resp. \(\textsf {PL}_\text {Sum}\)-formula):

$$ \exists \pi _1:p_1\xrightarrow {u\mid v_1} q_1,\dots ,\exists \pi _{n}:p_{n}\xrightarrow {u\mid v_n} q_n, \bigwedge _{i=1}^n \textsf {init}(p_i)\wedge \textsf {final}(q_i)\wedge \bigwedge _{1\le i<j\le n} v_i\ne v_j. $$

For the three logics, we show that deciding whether a given automaton satisfies a given formula is PSPace-c. When the formula is fixed, the model-checking problem becomes NLogSpace-c for \(\textsf {PL}_\text {NFA}\) and \(\textsf {PL}_\text {Trans}\), and NP-c for \(\textsf {PL}_\text {Sum}\). If output values can only be compared via disequality \(\ne \) (which cannot be negated), then \(\textsf {PL}_\text {Sum}\) admits PTime model-checking. We show that many of the properties from the literature, including all the properties mentioned before, can be expressed in these logics. As a consequence, we show that most of the PTime upper-bounds obtained for deciding subclasses of finite automata in [2, 25], of transducers in [5,6,7,8, 14, 16, 18, 22, 24] and sum-automata in [3, 8, 10, 11], can be directly obtained by expressing in our logics the structural patterns given in these papers, which characterise these subclasses.

Related works. In addition to the results already mentioned, we point out that the syntax of our logic is close to a logic, defined in [9] by Figueira and Libkin, to express path queries in graph databases (finite graphs with edges labelled by a symbol). In this work, there is no disjunction nor negation, and no distinction between input and output values. By making such a distinction, and by adding negation and disjunction, we were able to tailor our logics to particular automata models and add enough power to be able to directly express classical structural automata properties.

2 Finite Automata with Outputs

In this section, we define a general model of finite automata defining functions from the free monoid \(\varLambda ^*\) (where \(\varLambda \) is a finite input alphabet) to any monoids \(\mathbb {M}= (D,\oplus ,\mathbb {0})\). More precisely, they are parametrised by a monoid of output values, read input words over some alphabet and output elements of the output monoid, obtained by summing the output values met along accepting paths.

Formally, a monoid \(\mathbb {M}\) is a tuple \((D,\oplus _\mathbb {M},\mathbb {0}_\mathbb {M})\) where D is a set of elements which we call here values or sometimes outputs, \(\oplus _\mathbb {M}\) is an associative binary operation on D, for which \(\mathbb {0}_\mathbb {M}\in D\) is neutral. Monoids of interest in this paper are the free monoid \((\varLambda ^*,\cdot ,\varepsilon )\) for some finite alphabet of symbols \(\varLambda \) (where \(\cdot \) denotes the concatenation), and the monoid \((\mathbb {Z},+,0)\). We also let \(\varLambda _\varepsilon = \varLambda \cup \{\varepsilon \}\). For \(w \in \varLambda ^*\), |w| denotes its length, in particular \(|\varepsilon | = 0\). The set of positions of w is \(\{1,\dots ,|w|\}\) (and empty if \(w = \epsilon \)). We let w[i] be the ith symbol of w. Given \(w_1,w_2\), we write \(w_1\mathrel {\sqsubseteq }w_2\) whenever \(w_1\) is a prefix of \(w_2\). All over this paper, the input alphabet is denoted by the letter \(\varLambda \).

Definition 1

(Automata with outputs). An automaton A with outputs over an (output) monoid \(\mathbb {M}=(D, \oplus _\mathbb {M}, \mathbb {0}_\mathbb {M})\) is a tuple \(\langle Q, I,F, \varDelta , \gamma \rangle \) where Q is a non-empty finite set of states, \(I\subseteq Q\) the set of initial states, \(F\subseteq Q\) the set of final states, \(\varDelta \subseteq Q \times \varLambda _\varepsilon \times Q\) the set of transitions labelled with some element of \(\varLambda _\varepsilon \), and \(\gamma :\varDelta \rightarrow D\) a mapping from transitions to output valuesFootnote 1. The set of automata over \(\mathbb {M}\) is written \(\mathcal {A}_{out}(\mathbb {M})\).

We write to refer to the number of states of A. A path in A is a sequence \(\pi = q_0 a_1 d_1 q_1 \dots a_n d_n q_n \in Q(\varLambda _\varepsilon D Q)^*\), for \(n\ge 0\), such that for all \(1 \le i \le n\) we have \((q_{i-1}, a_i, q_i) \in \varDelta \) and \(\gamma (q_{i-1}, a_i, q_i) = d_i\). The input of \(\pi \) is defined as the word \(\text {in}(\pi ) = a_1\dots a_n\) (and \(\varepsilon \) if \(\pi \in Q\)), the output of \(\pi \) as the element \(\text {out}(\pi ) = d_1\oplus _\mathbb {M}\dots \oplus _\mathbb {M}d_n\) (and \(\mathbb {0}_\mathbb {M}\) if \(\pi \in Q\)), and the size of \(\pi \) as \(|\pi | = n\). We may write \(\pi : q_0\xrightarrow {\text {in}(\pi )\mid \text {out}(\pi )} q_n\) to denote that \(\pi \) is a path from \(q_0\) to \(q_n\) on input \(\text {in}(\pi )\) and output \(\text {out}(\pi )\). For convenience we write \(\pi ^\triangleleft , \pi ^\triangleright \) to denote respectively the starting state \(q_0\) and the ending state \(q_n\) of the path \(\pi \). The set of all paths of A is written \(\text {Paths}(A)\). A path \(\pi : q_0\xrightarrow {u\mid v} q_n\) is initial if \(q_0\in I\), final if \(q_n\in F\) and accepting if it is both initial and final. The set of accepting paths of A is denoted by \(\text {Paths}_{acc}(A)\). The input/output relation (or just relation) defined by A is the set of pairs \(R(A) \subseteq \varLambda ^*\times D\) defined by

$$ R(A) = \{ (u,v)\mid \exists \pi \in \text {Paths}_{acc}(A)\cdot \text {in}(\pi ) = u\wedge \text {out}(\pi ) = v\} $$

Finite automata, transducers and sum-automata. In this paper, we consider three instances of automata with outputs. First, finite automata (over \(\varLambda \)), are seen as automata with outputs in a trivial monoid (and which is therefore ignored). Transducers are automata with outputs in the free monoid \(\varGamma ^*\). They define relations from \(\varLambda ^*\) to \(\varGamma ^*\). Finally, sum-automata are automata with outputs in the monoid \((\mathbb {Z},+,0)\).

3 A Pattern Logic for Automata with Outputs

In this section, we introduce a generic pattern logic. It is built over four kind of variables, namely path, state, input and output variables. More precisely, we let \(X_P= \{ \pi ,\pi _1,\dots \}\), \(X_Q= \{ q,q_1,p\dots ,\}\), \(X_I= \{ u,u_1,\dots \}\) and \(X_O= \{ v,v_1,\dots \}\) be disjoint and countable sets of resp. path, state, input and output variables. We define \(Terms(X_O,\oplus ,\mathbb {0})\) as the set of terms built over variables of \(X_O\), a binary function symbol \(\oplus \) (representing the monoid operation) and constant symbol \(\mathbb {0}\) (neutral element).

The logic syntax is parametrised by a set of output predicates \(\mathcal {O}\). Output predicates of arity 0 are called constant symbols, and we denote by \(\mathcal {O}|_n\) the predicates of arity n. Predicates talking about states, paths and input words are however fixed in the logic.

Definition 2

A pattern formula \(\varphi \) over a set of output predicates \(\mathcal {O}\) is of the form

$$ \varphi \ =\ \exists \pi _1:p_1\xrightarrow {u_1\mid v_1} q_1, \dots , \exists \pi _n:p_n\xrightarrow {u_n\mid v_n} q_n, \mathcal {C}$$

where for all \(1\le i\le n\), \(\pi _i\in X_P\) and they are all pairwise different, \(p_i,q_i\in X_Q\), \(u_i\in X_I\), \(v_i\in X_O\), and \(\mathcal {C}\) is a Boolean combination of atoms amongst

$$ \begin{array}{lcl@{\quad }r} \text {Input constraints} &{} :&{} u \mathrel {\sqsubseteq }u' \mid u \in L \mid |u| \le |u'| &{} u,u'\in X_I\\ \text {Output constraints} &{} :&{} p(t_1,\dots ,t_n) &{} p \in \mathcal {O}|_n, t_{i} \in Terms(X_O, \oplus ,\mathbb {0}) \\ \text {State constraints} &{} :&{} \textsf {init}(q) \mid \textsf {final}(q)\mid q = q' &{} q,q'\in X_Q\\ \text {Path constraints} &{} :&{} \pi = \pi ' &{} \pi ,\pi '\in X_P\end{array} $$

where L is a regular language of words over \(\varLambda \) (assumed to be represented as an NFA). The sequence of existential quantifiers before \(\mathcal {C}\) in \(\varphi \) is called the prefix of \(\varphi \). We denote by \(\textsf {PL}(\mathcal {O})\) the set of pattern formulas over \(\mathcal {O}\), and by \(\textsf {PL}^+(\mathcal {O})\) the fragment where output predicates does not occur under an odd number of negations.

The size of a formula is the number of its symbols plus the number of states of all NFA representing the membership constraints. We denote by \(\text {Var}(\varphi )\) the variables occurring in any pattern formula \(\varphi \), and by \(\text {Var}_P(\varphi )\) (resp. \(\text {Var}_Q(\varphi )\), \(\text {Var}_I(\varphi )\), \(\text {Var}_O(\varphi )\)) its restriction to path (resp. state, input, output) variables. We finally let , , .

Semantics. To define the semantics of a pattern formula \(\varphi \), we first fix some monoid \(\mathbb {M}= (D,\oplus _\mathbb {M},\mathbb {0}_\mathbb {M})\) together with an interpretation \({p}^\mathbb {M}\) of each output predicates \(p \in \mathcal {O}\) of arity \(\alpha (p)\), such that \(p^\mathbb {M}\in D\) if p is a constant and \(p^\mathbb {M}\subseteq D^{\alpha (p)}\) otherwise. Given a valuation \(\nu :X_O\rightarrow D\), the interpretation \(.^\mathbb {M}\) can be inductively extended to terms t by letting \(\mathbb {0}^{\nu , \mathbb {M}} = \mathbb {0}_\mathbb {M}\), \((t_1\oplus t_2)^{\nu , \mathbb {M}} = t_1^{\nu , \mathbb {M}}\oplus _\mathbb {M}t_2^{\nu , \mathbb {M}}\) and \(x^{\nu , \mathbb {M}} = \nu (x)\).

Then, a formula \(\varphi \in \textsf {PL}(\mathcal {O})\) is interpreted in an automaton with outputs \(A\in \mathcal {A}_{out}(\mathbb {M})\) as a set of valuations \([\![\varphi ]\!]_A\) of \(\text {Var}(\varphi )\) which we now define. Each valuation \(\nu \in [\![\varphi ]\!]_A\) maps state variables to states of A, path variables to paths of A, etc. Such a valuation \(\nu \) satisfies an atom \(u\mathrel {\sqsubseteq }u'\) if \(\nu (u)\) is a prefix of \(\nu (u')\), \(u\in L\) if \(\nu (u)\in L\), \(|u|\le |u'|\) if \(|\nu (u)|\le |\nu (u')|\). Given a predicate \(p \in \mathcal {O}\) of arity \(\alpha (p)\), an atom \(p(t_1,\dots ,t_{\alpha (p)})\) is satisfied by \(\nu \) if \((t_1^{\nu ,\mathbb {M}},\dots ,t_{\alpha (p)}^{\nu ,\mathbb {M}})\in p^\mathbb {M}\). Finally, \(\nu \) satisfies \(\textsf {init}(q)\) (resp. \(\textsf {final}(q)\)) if \(\nu (q)\) is initial (resp. \(\nu (q)\) is final). The satisfiability relation is naturally extended to Boolean combinations of atoms. Finally, assume that \(\varphi \) is of the form \( \exists \pi _1:p_1\xrightarrow {u_1\mid v_1} q_1, \dots , \exists \pi _n :p_n\xrightarrow {u_n\mid v_n} q_n, \mathcal {C}\), we say that A satisfies \(\varphi \), denoted by \(A \models \varphi \), if there exists a valuation \(\nu \) of \(\text {Var}(\varphi )\) such that for all \(i\in \{1,\dots ,n\}\), \(\nu (\pi _i) :\nu (p_i)\xrightarrow {\nu (u_i)\mid \nu (v_i)}\nu (q_i)\) and \(\nu \) satisfies \(\mathcal {C}\) (\(\nu \models \mathcal {C}\)). Given a pattern formula \(\varphi \) and an automaton with outputs A, the model-checking problem consists in deciding whether A satisfies \(\varphi \), i.e. \(A \models \varphi \).

Example 1

Given \(k\in \mathbb {N}\), the k-valuedness property has been already expressed in Introduction (assuming \({=}\in \mathcal {O}\)). The formula \( \exists \pi _0:p_0\xrightarrow {u|v_0} q_0, \dots , \exists \pi _k :p_k\xrightarrow {u|v_{k}} q_k,\mathcal {C}_0\) where \(\mathcal {C}_0 = \bigwedge _{0\le i<j\le k} \pi _i\ne \pi _j\wedge \bigwedge _{i=0}^k \textsf {init}(p_i)\wedge \textsf {final}(q_i)\) expresses the fact that an automaton is not \((k-1)\)-ambiguous (has at least k accepting paths for some input).

4 Model-Checking Problem

In this section, we give sufficient conditions on the output monoid \(\mathbb {M}\) and the set of output predicates \(\mathcal {O}\) by which the model-checking of automata with outputs in \(\mathbb {M}\) against pattern formulas over the output predicates \(\mathcal {O}\) is decidable. In the next sections, we study the precise complexity of the model-checking problem for particular monoids \(\mathbb {M}\).

Tuple acceptors. Since automata with outputs can get their output values in arbitrary monoids, to get an effective model-checking algorithm, we will assume the existence of machines, called tuple acceptors, that can recognise sets of word tuples. These machines will be required to satisfy some key properties, forming the notion of good class of tuple acceptors. First, what we call a tuple acceptor is a machine M whose semantics is a set of tuples of words \([\![M]\!]\subseteq (\varSigma ^*)^n\), for some alphabet \(\varSigma \) and some arity \(n\ge 1\). The notion of good class, formally defined later, require (i) that any regular set of tuples is recognised by some machine, for a regularity notion that we will make clear (roughly, by seeing tuples of words as words resulting from the overlapping of all components), (ii) all output predicates (and their negation) are recognised by some machine, (iii) the class is closed under union and intersection.

Regular sets of word tuples. Let \(\varSigma \) be some alphabet containing some symbol \(\bot \), \(\pi \in \varSigma ^*\) and \(m\ge |\pi |\). The padding of \(\pi \) with respect to m is the word \(\pi ' = \pi \bot ^{m-|\pi |}\). Let \(\pi _1,\pi _2\in \varSigma ^*\) and let \(m = \text {max}(|\pi _1|,|\pi _2|)\). For \(j=1,2\), let \(\pi '_j\) the padding of \(\pi _j\) with respect to m. Note that \(|\pi '_1| = |\pi '_2| = m\). The convolution \(\pi _1\mathbin {\otimes }\pi _2\) is the word of length m defined for all \(1\le i\le n\) by \((\pi _1\mathbin {\otimes }\pi _2)[i] = (\pi '_1[i],\pi '_2[i])\). E.g. \(q_1\lambda _1d_1q_2\otimes p_1 = (q_1,p_1)(\lambda _1,\bot )(d_1,\bot )(q_2,\bot )\). The convolution can be naturally extended to multiple words as follows: \(\mathop {\bigotimes }_{i=1}^n \pi _i = \pi _1\mathbin {\otimes }(\pi _2\mathbin {\otimes }\dots \mathbin {\otimes }\pi _n)\).

Definition 3

A set of n-ary word tuples \(P\subseteq (\varSigma ^*)^n\) is regular if \(L = \{ \mathop {\bigotimes }_{i=1}^n \pi _i \mid (\pi _1, \dots , \pi _n)\in P \}\) is a regular language over \(\varSigma ^n\). We often identify L and P.

Good class of tuple acceptors. First, any valuation \(\nu \) of a set of path variables X into paths of some automaton with values in some monoid \(\mathbb {M}\) gives a way to interpret terms \(t\in \text {Terms}(X, \oplus , \mathbb {0})\) as follows: for \(\pi \in X\), \(\pi ^{\nu ,\mathbb {M}} = \text {out}(\nu (\pi ))\), \(\mathbb {0}^{\nu ,\mathbb {M}} = \mathbb {0}_\mathbb {M}\) and \((t_1\oplus t_2)^{\nu ,\mathbb {M}} = t_1^{\nu ,\mathbb {M}}\oplus _\mathbb {M}t_2^{\nu ,\mathbb {M}}\). Then, for a class \(\mathcal {C}\) (i.e. a set) of tuple acceptors, we denote by \(\mathcal {C}|_n\) its restriction to acceptors of arity n.

Definition 4

(Good class). A class of tuple acceptors \(\mathcal {C}\) is said to be good for an output monoid \(\mathbb {M}= (D,\oplus _\mathbb {M},\mathbb {0}_\mathbb {M})\), a set of output predicates \(\mathcal {O}\) and an interpretation \(p^\mathbb {M}\subseteq D^{\alpha (p)}\) for all \(p\in \mathcal {O}\) of arity \(\alpha (p)\), if the following conditions are satisfied:

  1. 1.

    for all automata with outputs \(A\in \mathcal {A}_{out}(\mathbb {M})\) with a set of states Q we have:

    1. (a)

      \(\forall n\ge 1\),\(\forall R\subseteq \text {Paths}(A)^n\) regular, \(R = [\![M]\!]\) for some \(M\in \mathcal {C}|_n\).

    2. (b)

      all \(p\in \mathcal {O}\) of arity \(\alpha (p)\), all \(X = \{ \pi _1, \dots , \pi _n \}\) finite sets of path variables and all \(t_1, \dots , t_{\alpha (p)} \in \text {Terms}(X, \oplus , \mathbb {0})\), there exist \(M,M'\in \mathcal {C}|_{n}\) such that

      1. i.

        \([\![M]\!] = \{ \left( \nu (\pi _1), \dots , \nu (\pi _n) \right) | \nu :X\rightarrow \text {Paths}(A) \wedge (t^{\nu , \mathbb {M}}_1, \dots , t^{\nu , \mathbb {M}}_{\alpha (p)}) \in p^\mathbb {M}\}\)

      2. ii.

        \([\![M']\!] = \text {Paths}(A)^n\setminus [\![M]\!]\).

  2. 2.

    \(\forall n\ge 1\), \(\forall M_1,M_2\in \mathcal {C}|_n\), there exist \(M, M'\in \mathcal {C}|_n\) such that \([\![M]\!]=[\![M_1]\!]\cap [\![M_2]\!]\) and \([\![M']\!]=[\![M_1]\!]\cup [\![M_2]\!]\).

We say that \(\mathcal {C}\) is effective if all properties are effective and moreover it is decidable whether \([\![M]\!]\ne \varnothing \) for any (effectively represented) \(M\in \mathcal {C}\). We say that \(\mathcal {C}\) is weakly good if all properties hold except 1(b)ii.

Effectiveness of a good class gives effective model-checking, as announced.

Theorem 1

Let \(\mathbb {M}\) be a monoid and \(\mathcal {O}\) be a set of output predicates, interpreted over \(\mathbb {M}\). If there exists an effective good class \(\mathcal {C}\) (resp. effective weakly good class) of tuple acceptors for \(\mathbb {M}\) and \(\mathcal {O}\), then the model-checking problem of automata with outputs in \(\mathbb {M}\) against pattern formulas \(\psi \in \textsf {PL}[\mathcal {O}]\) (resp. \(\psi \in \textsf {PL}^+[\mathcal {O}]\)) is decidable.

Proof

(sketch). First, the formula is put in negation normal form: negation is pushed down to the atoms. Then, given an automaton with outputs in \(\mathbb {M}\), we show that any tuple of paths which satisfy state, input and path predicates and their negations is a regular set of path tuples (this is doable even for input equality as well as input length comparison thanks to the way paths are overlapped by the definition of convolution). By condition 1a, these sets of tuples are accepted by acceptors of \(\mathcal {C}\). By conditions 1(b)i and ii, tuples of paths satisfying output predicates and their negations are also accepted by acceptors of \(\mathcal {C}\). Then, the closure properties (condition 2) allows us to construct an acceptor for the tuples of paths satisfying the whole formula inductively.    \(\square \)

5 A Pattern Logic for Finite Automata

Finite automata can be seen as automata with outputs in a trivial monoid (with a single element). As the monoid is trivial, there is no need for predicates over it and so we specialize our pattern logic into \(\textsf {PL}_\text {NFA}=\textsf {PL}[\varnothing ]\).

Definition 5

(Pattern logic for NFA). The logic \(\textsf {PL}_\text {NFA}\) is the set of formulas

where for all \(i\ne j\), \(\pi _i\ne \pi _j\), L is a regular language over \(\varLambda \) (assumed to be represented as an NFA), \(u,u'\in \{u_1,\dots ,u_n\}\), \(q,q'\in \{q_1,\dots ,q_n\}\) and \(\pi ,\pi '\in \{\pi _1,\dots ,\pi _n\}\).

As a yardstick to measure the expressiveness of \(\textsf {PL}_\text {NFA}\), we have considered the structural properties of NFA studied in two classical papers: [25] by Weber and Seidl and in [2] by Allauzen et al. The authors of these two papers give PTime membership algorithms for k-ambiguity, finite ambiguity, polynomial ambiguity and exponential ambiguity (with as applications the approximation of the entropy of probabilistic automata for example). We refer the interested readers to these papers for the formal definitions of those classes. The solutions to these membership problems follow a recurrent schema: one defines (1) a pattern that identifies the members of the class and (2) an algorithm to decide if an automaton satisfies the pattern. The next theorem states that all these membership problems can be reduced to the model-checking problem of \(\textsf {PL}_\text {NFA}\) using a constant space reduction. The proof of this theorem is obtained by showing how the patterns identified in [25], can be succinctly and naturally encoded into (fixed) \(\textsf {PL}_\text {NFA}\) formulas. As a corollary, we get that all the class membership problems are in NLogSpace, using a model-checking algorithm that we defined below for \(\textsf {PL}_\text {NFA}\).

Theorem 2

The membership problem to the subclasses of k-ambiguous, finitely ambiguous, polynomially ambiguous and exponentially ambiguous NFA can be reduced to the model-checking problem of \(\textsf {PL}_\text {NFA}\) with constant space reduction. The obtained formulas are constant (for fixed k).

Proof

For each membership problem, our reduction copies (in constant space) the NFA and considers the model-checking for this NFA against a fixed \(\textsf {PL}_\text {NFA}\) (one for each class). As illustration, k-ambiguity has already been expressed in Example 1. As a second example, an automaton is not polynomially ambiguous iff there exists a state p which is reachable from an initial state, and the source of two different cycles labelled identically by a word v. With \(\textsf {PL}_\text {NFA}\) this gives: \(\exists \pi _0 :q_0 \xrightarrow {u_1} p, \exists \pi _1 :p\xrightarrow {u_2} p, \exists \pi _2:p\xrightarrow {u_2} p, \exists \pi _3:p\xrightarrow {u_3} q, \textsf {init}(q_0) {\wedge } \pi _1\ne \pi _2 {\wedge } \textsf {final}(q)\)

   \(\square \)

The model-checking problem asks if a given NFA A satisfies a given \(\textsf {PL}_\text {NFA}\)-formula \(\varphi \).

Theorem 3

The model-checking problem of NFA against formulas in \(\textsf {PL}_\text {NFA}\) is PSpace-C. It is in NLogSpace-C if the formula is fixed.

Proof

(sketch). We use NFA as acceptors for tuples of paths. The algorithm presented in the proof of Theorem 1 yields an exponentially large NFA (and polynomial if the formula is fixed). We show that it does not need to be constructed explicitly and that a short non-emptiness witness can be searched non-deterministically on-the-fly. For PSpace-hardness, we notice that the non-emptiness of the intersection of n DFA can be easily expressed in \(\textsf {PL}_\text {NFA}\), by seeing the n DFA as a disjoint union, and by asking for the existence of n different accepting paths over the same input in this union.    \(\square \)

Corollary 1

(of Theorems 2 and 3). The membership problem to the classes of k-ambiguous, finitely ambiguous, polynomially ambiguous and exponentially ambiguous NFA is in NLogSpace.

6 A Pattern Logic for Transducers

Transducers are automata with outputs in a free monoid \(\mathbb {M}_{Trans} = (\varGamma ^*,\cdot ,\varepsilon )\) and therefore define subsets of \(\varLambda ^*\times \varGamma ^*\). Since our general pattern logic can test for output equalities (by repeating twice an output variable in the prefix), the model-checking is easily shown to be undecidable by encoding PCP:

Theorem 4

The model-checking problem of transducers against formulas in \(\textsf {PL}[\varnothing ]\) is undecidable.

To obtain a decidable logic for transducers, we need to exclude equality tests on the output words in the logic. However, as we will see, we can instead have inequality test \(\ne \) as long as it is not under an odd number of negations in the formula. We also allow to test (non) membership of output word concatenations to a regular language, as well as comparison of output word concatenations wrt their length. Formally:

Definition 6

(Pattern logic for transducers). The logic \(\textsf {PL}_{\text {Trans}}\) is the set of formulas of the form

where for all \(1\le i<j\le n\), \(\pi _i\ne \pi _j\) and \(v_i\ne v_j\) (no implicit output equality tests), L (resp. N) is a regular language over \(\varLambda \) (resp. \(\varGamma \)), assumed to be represented as an NFA, \(u,u'\in \{u_1,\dots ,u_n\}\), \(q,q'\in \{q_1,\dots ,q_n\}\), \(t,t'\in Terms(\{v_1,\dots ,v_n\},\cdot ,\epsilon )\), \(\pi ,\pi '\in \{\pi _1,\dots ,\pi _n\}\), and does not occur under an odd number of negations.

We define the macros , and

Let us explain the latter macro. Many properties of transducers are based on the notion of output delays, by which to compare output words. Formally, for any two words \(v_1,v_2\), \(\text {delay}(v_1,v_2) = (\alpha _1,\alpha _2)\) such that \(v_1 = \ell \alpha _1\) and \(v_2=\ell \alpha _2\) where \(\ell \) is the longest common prefix of \(v_1\) and \(v_2\). It can be seen that for any words \(v_1,v'_1,v_2,v'_2\), if we have \(\text {SDel}_{\ne }(v_1,v'_1,v_2,v'_2)\), then \(\text {delay}(v_1,v_2)\ne \text {delay}(v_1v'_1,v_2v'_2)\), but the converse does not hold. But, if \(\text {delay}(v_1,v_2)\ne \text {delay}(v_1v'_1,v_2v'_2)\), then \(\text {SDel}_{\ne }(v_1(v'_1)^i,v'_1,v_2(v'_2)^i,v'_2)\) holds for some \(i\ge 0\). These two facts allows us to express all the known transducer properties from the literature relying on the notion of delays. We leave however as open whether our logic can express a constraint such as \(\text {delay}(v_1,v_2)\ne \text {delay}(v_3,v_4)\).

We review here some of the main transducer subclasses studied in the literature. We refer the reader to the mentioned references for the formal definitions. As for the NFA subclasses of the previous section, deciding them usually goes in two steps: (1) identify a structural pattern characterising the property, (2) decide whether such as pattern is satisfied by a given transducer. The class of determinisable transducers are the transducers which define sequential functions [5, 6, 24]. The k-sequential transducers are the transducers defining unions of (graphs) of k sequential functions [8]. The multi-sequential ones are the union of all k-sequential transducers for all k [7, 16]. Finally, the k-valued transducers are the transducers for which any input word has at most k output words [14, 19], and the finite-valued ones are all the k-valued transducers for all k [18, 22, 23]. All these classes, according to the given references, are decidable in PTime.

Theorem 5

The membership problem of transducers to the classes of determinisable, functional, k-sequential, multi-sequential, k-valued, and finite-valued transducers can be reduced to the model-checking problem of \(\textsf {PL}_\text {Trans}\) with a constant space reduction. The obtained formulas are constant (as long as k is fixed).

Proof

Without going through all the properties, let us remind the reader that the formula for k-valuedness has been given in the introduction. We also give the \(\textsf {PL}_\text {Trans}\) formulas for the class of determinisable transducer. It is known that a transducer is determinisable iff it satisfies the twinning property, which is literally the negation of:

$$ \begin{array}{lr} \qquad \exists \pi _1:q_1\xrightarrow {u\mid v_1} p_1,\exists \pi '_1: p_1\xrightarrow {u'\mid v'_1} p_1, \exists \pi ''_1:p_1\xrightarrow {u''\mid v''_1} r_1,\\ \qquad \exists \pi _2:q_2\xrightarrow {u\mid v_2} p_2,\exists \pi '_2: p_2\xrightarrow {u'\mid v'_2} p_2,\exists \pi ''_2: p_2\xrightarrow {u''\mid v''_2} r_2, \\ \qquad \qquad \textsf {init}(q_1)\wedge \textsf {init}(q_2)\wedge \textsf {final}(r_1)\wedge \textsf {final}(r_2)\wedge \text {SDel}_{\ne }(v_1,v'_1,v_2,v'_2) \end{array} $$

   \(\square \)

Theorem 6

The model checking of transducers against formulas in \(\textsf {PL}_{Trans}\) is PSpace-C. It is in NLogSpace-C if the formula is fixed.

Proof

(sketch). We use Parikh automata as acceptors for tuples of paths. They extend automata with counters that can only be incremented and never tested for zero. The acceptance condition is given by a semi-linear set (represented for instance by an existential Presburger formula). The formal definition can be found e.g. in [9]. The counters allow us to compare the output length of paths, or to identify some output position of two paths with different labels (to test ). The counters are needed because this position may not occur at the same location in the convolution encoding of path tuples.    \(\square \)

Corollary 2

(of Theorems 5 and 6). The membership problem of transducers to the classes of determinisable, functional, k-sequential, multi-sequential, k-valued, and finite-valued transducers (for fixed k) is decidable in NLogSpace.

7 A Pattern Logic for Sum-Automata

We remind the reader that sum-automata are automata with outputs in the monoid \(\mathbb {M}_\text {Sum}=(\mathbb {Z},+,0)\) (assumed to be encoded in binary) and therefore define subsets of \(\varLambda ^* \times \mathbb {Z}\). We consider in this section two logics for expressing structural properties of sum-automata: the logic \(\textsf {PL}_\text {Sum}\) which is obtained as \(\textsf {PL}[\{\le \}]\) where the output predicate \(\le \) is interpreted by the natural total order over integers, and a subset of this logic \(\textsf {PL}_\text {Sum}^{\ne }\) obtained as \(\textsf {PL}^+[\{\ne \}]\) where the predicate \(\ne \) never appears in the scope of an odd number of negations (to avoid the expressibility of the equality predicate). We show that the fragment \(\textsf {PL}_\text {Sum}^{\ne }\) enjoys better complexity results. Formally, those two logics are defined as follows:

Definition 7

(Two pattern logics for sum-automata). The logic \(\textsf {PL}_\text {Sum}\) is the set of formulas of the form

where for all \(1\le i<j\le n\), \(\pi _i\ne \pi _j\), L is a regular language over \(\varLambda \) assumed to be represented as an NFA, \(u,u'\in \{u_1,\dots ,u_n\}\), \(q,q'\in \{q_1,\dots ,q_n\}\), \(t,t'\in Terms(\{v_1,\dots ,v_n\},\cdot ,\epsilon )\) and \(\pi ,\pi '\in \{\pi _1,\dots ,\pi _n\}\).

The logic \(\textsf {PL}^{\ne }_\text {Sum}\) is defined as above but the constraint \(t \le t'\) is replaced by \(t \ne t'\) and this constraint does not occur under an odd number of negations, and moreover \(v_i\ne v_j\) for all \(1\le i<j\le n\) (no implicit output equality tests).

We review here some of the main sum-automata subclasses decidable in PTime studied in the literature. We refer the reader to the mentioned references for the formal definitions. The class of functional sum-automata [11] are those such that all accepting paths associated with a given word return the same value. The classes of k-valued [10] and k-sequential sum-automata [8] are defined similarly as for transducers.

Theorem 7

The membership problem of sum-automata in the class of functional, k-valued, and k-sequential automata can be reduced to the model-checking problem of \(\textsf {PL}^{\ne }_\text {Sum}\). Moreover, the obtained \(\textsf {PL}^{\ne }_\text {Sum}\) formulas are constant (as long as k is fixed).

Proof

We have already shown in the introduction that functionality [11] and more generally k-valuedness [10] are expressible in \(\textsf {PL}^{\ne }_\text {Sum}\). The twinning property [1, 11] is as well expressible in \(\textsf {PL}^{\ne }_\text {Sum}\), just by replacing in the formula expressing it for transducers (proof of Theorem  5) the atom \(\text {SDel}_{\ne }(v_1,v'_1,v_2,v'_2)\) by \(v'_1\ne v'_2\). In [8], a generalization of the twinning property is shown to be complete for testing k-sequentiality.    \(\square \)

The proof of the results below for \(\textsf {PL}_\text {Sum}\) follows arguments that are similar to those developed for transducers in the proof of Theorem 6, and for the PTime result for \(\textsf {PL}^{\ne }_\text {Sum}\), we use a reduction to the k-valuedness problem of sum-automata [10].

Theorem 8

The model checking of sum-automata against formulas in \(\textsf {PL}_\text {Sum}\) is PSpace-C, NP-C when the formula is fixed, and NLogSpace-C if in addition the values of the automaton are encoded in unary. The model checking of sum-automata against formulas in \(\textsf {PL}^{\ne }_\text {Sum}\) is PSpace-C, and in PTime when the formula is fixed (even if the values of the automaton are encoded in binary).

Corollary 3

(of Theorems 7 and 8). The membership problem of sum-automata in the class of functional, k-valued, and k-sequential automata is decidable in PTime.

Note that we have shown that the k-valuedness property is expressible in \(\textsf {PL}^{\ne }_\text {Sum}\), and so the k-valuedness property is reducible to the model-checking problem of \(\textsf {PL}^{\ne }_\text {Sum}\). Nevertheless, this result does not provide a new algorithm for k-valuedness as our model-checking algorithm is based on a reduction to k-valuedness [10].

8 Extensions and Future Work

The logics we have presented can be extended in two ways by keeping the same complexity results, no matter what the output monoid is. The first extension allows to express properties of automata whose states can be coloured by an arbitrary (but fixed) set of colours. This is useful for instance to express properties of disjoint unions of automata, the colours allowing to identify the subautomata. The second extension is adding a bunch of universal state quantifiers before the formula. This does not change the complexity, and allow for instance to express properties such as whether an automaton is trim (all its states are accessible and co-accessible). As future work, we would like to investigate other monoids (discounted sum group for instance [11]), and other data structures for which transducers and weighted automata have been defined: nested words, infinite words and trees are the main structures we want to work on.