Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

Weighted automata are a generalization of non-deterministic automata introduced by Schützenberger [14]. Every transition is associated with an input letter from an alphabet A and a weight expressing the cost (or probability, time, resources needed) of its execution. This weight is typically an element of a semiring. The multiplication of the semiring is used to accumulate the weight of a path by multiplying the weights of each transition in the path, while the addition of the semiring computes the weight of a string w by summing up the weights of the paths labeled with w [11]. In this way, the behaviour of weighted automata is given in terms of formal power series, i.e. functions assigning a weight to each finite string w over A.

Weighted automata may have a non-deterministic behaviour because different transitions from the same state may be labeled by the same input letter, with possibly different weights. However, they can be determinized by assigning a linear structure to the state-space using a generalization of the powerset construction for non-deterministic automata [5]. As such, determinized weighted automata are typically infinite-state, but determinization allows us to study weighted automata both from an algebraic perspective and a coalgebraic one. From the algebraic perspective, a (determinized) weighted automaton is just an algebra with a unary operation for each input symbol, whereas coalgebraically, a weighted automaton is a deterministic transition system with output weights associated to each state.

In this context, and building on the work by  [3] on ordinary deterministic automata and on the duality between reachability and observability [2, 4, 6], we study equations and coequations for weighted automata. In general, equations characterize classes of algebras called varieties [7], whereas coequations characterize classes of coalgebras, so-called covarieties [13]. Using the algebraic perspective, an equation for a weighted automaton is just a pair of words (uv). Satisfaction becomes reachability: a weighted automaton satisfies an equation (uv) if from any state, the linear combination of states reached after reading u is the same as the one reached after reading v [3].

Dually, the coalgebraic perspective allows us to define coequations for weighted automata as subsets (or predicates) of power series. A weighted automaton M satisfies a coequation S if for any function associating an output weight to each state, the behaviour of M from any initial state is a power series in S.

Our main result is a duality between sets of equations called congruences and sets of coequations called closed subsystems. More precisely, we prove that the classes of weighted automata defined by congruences are in a one-to-one correspondence with classes of weighted automata defined by closed subsystems. This allows, for example, to give equational characterizations to some specific subsets of power series and, vice versa, to define the least subset of power series specified by a set of equations.

In the second part of the paper, we retain the linear structure of the transitions of the determinized weighted automata, allowing for a more general kind of equations, called linear. In general, a full duality result does not hold anymore, but when the weights come from a field, we still have one direction of it: a variety defined by a linear congruence can be recovered from a corresponding covariety. As an example, we show that linear congruences (under certain conditions) are finitely generated by a set of equations, using the fact that by Hilbert basis theorem [10, VIII, Theorem 4.9], linear congruences correspond to ideals of polynomials.

We will proceed as follows. After some mathematical preliminaries, in Sect. 3 we show how to construct the set of equations and coequations for a given weighted automaton. In Sect. 4, we give the duality result between congruences and closed subsystems. In Sect. 5, we move from \({{\mathrm{Set}}}\) to vector spaces and linear equations. We conclude, in Sect. 6, with a discussion of related work.

2 Preliminaries

In this section, we will define the main concepts and notation used in this paper. Given two sets X and Y we define \(Y^X=\{f\ |\ f:X\rightarrow Y\}\). For a function \(f\in Y^X\), we define the kernel and the image of f by

$$ \ker (f)=\{(x_1,x_2)\in X\times X\ |\ f(x_1)=f(x_2)\}\ \ \ {{\mathrm{Im}}}(f)=\{f(x)\ |\ x\in X\}. $$

For any set A we denote by \(A^*\) the free monoid on A, its identity element will be denoted by \(\epsilon \). Given an element \(f\in X^{A^*}\) and \(w\in A^*\), we define the right derivative \(f_w\) of f with respect to w and the left derivative \(_wf\) of f with respect to w as the elements \(_wf,f_w\in X^{A^*}\) such that for every \(u\in A^*\),

$$ f_w(u)=f(wu), \text{ and } _wf(u)=f(uw). $$

Let A be an alphabet (not necessarily finite), a deterministic automaton on A is a pair \((X,\alpha )\) where \(\alpha :X\times A\rightarrow X\) is a function. We can add an initial state \(x_0\in X\) or an output function \(c:X\rightarrow O\) to get a pointed automaton \((X,x_0,\alpha )\) or a Moore automaton \((X,c,\alpha )\) with outputs in O, respectively. For any \(x\in X\) and \(u\in A^*\) we define u(x) inductively as \(\epsilon (x)=x\) and \(wa(x)=\alpha (w(x),a)\).

We have that pointed automata are F-algebras for the endofunctor F on \({{\mathrm{Set}}}\) given by \(F(X)=1+(A\times X)\). By using the correspondence \(\alpha :X\times A\rightarrow X \ \Leftrightarrow \ \alpha ':X\rightarrow X^A\), given by \(\alpha (x,a)=\alpha '(x)(a)\), Moore automata with outputs in O are G-coalgebras for the endofunctor G on \({{\mathrm{Set}}}\) given by \(G(X)=O\times X^A\). The initial F–algebra is \((A^*,\epsilon , \tau )\), where the transition function \(\tau \) is concatenation, that is \(\tau (w,a)=wa\), for all \(w\in A^*\) and \(a\in A\), and for any pointed automaton \((X,x_0,\alpha )\) the unique F–algebra morphism \(r_{x_0}:(A^*,\epsilon , \tau )\rightarrow (X,x_0,\alpha )\) is given by \(r_{x_0}(w)=w(x_0)\). Dually, the final G–coalgebra is \((O^{A^*},\hat{\epsilon },\hat{\tau })\), where \(\hat{\epsilon }(f)=f(\epsilon )\) and \(\hat{\tau }(f)(a)=f_a\), and for any Moore automaton \((X,c,\alpha )\) with outputs in O, the unique G–coalgebra morphism \(o_c:(X,c,\alpha )\rightarrow (O^{A^*},\hat{\epsilon },\hat{\tau })\) is given by \(o_c(x)=\lambda w. c(w(x))\in O^{A^*}\).

A semiring \(\mathbb {S}\) is an algebra \(\mathbb {S}=(S,+,\cdot ,0,1)\), where \(+\) and \(\cdot \) are binary operations and 0 and 1 are nullary operations, such that \((S,+,0)\) is a commutative monoid, \((S,\cdot ,1)\) is a monoid, \(\cdot \) distributes over \(+\) on the left and on the right, and \(0\cdot s=s\cdot 0=0\) for every \(s\in S\). Given a semiring \(\mathbb {S}\), a semimodule over \(\mathbb {S}\), or \(\mathbb {S}\)semimodule, is a commutative monoid \(\mathsf {R}=(R,+,0)\) together with an \(\mathbb {S}\)-left-action \(\cdot : S\times R\rightarrow R\) such that

$$ \begin{array}{lll} (s+s')\cdot r= s\cdot r+s'\cdot r \qquad &{} 0\cdot r= 0 \qquad &{} 1\cdot r= r\\ s\cdot (r+r')= s\cdot r+s\cdot r' &{} s\cdot 0= 0 &{} s\cdot (s'\cdot r)= (s\cdot s')\cdot r \end{array} $$

for any \(s,s'\in S\) and \(r,r'\in R\). We will often write sr instead of \(s\cdot r\). For an alphabet A and a semiring \(\mathbb {S}\), elements in \(\mathbb {S}^{A^*}\) are called power series. Given a set X, the free \(\mathbb {S}\)–semimodule on the generators X, denoted by V(X), is the \(\mathbb {S}\)–semimodule whose underlying set is \(V(X)=\{\phi \in \mathbb {S}^X\ |\ {{\mathrm{supp}}}(\phi ) \text{ is } \text{ finite } \}\), where \({{\mathrm{supp}}}(\phi )\), the support of \(\phi \), is defined as \({{\mathrm{supp}}}(\phi )=\{x\in X\ |\ \phi (x)\ne 0\}\). Addition in V(X) is component-wise, \(0\in V(X)\) is the constant function with 0 as its value, and the action of \(\mathbb {S}\) over V(X) is multiplication of a constant by a function. For \(\phi \in V(X)\) we have the correspondence \(\phi \,\Leftrightarrow \, s_1\cdot x_1+\cdots +s_n\cdot x_n\), where \({{\mathrm{supp}}}(\phi )=\{x_1,\ldots ,x_n\}\) and \(\phi (x_i)=s_i\), \(i=1,\ldots ,n\). Notice that we are using formal sums here. According to this correspondence there is a copy of X in V(X), namely \(x\mapsto 1 \cdot x\); in this case, \(1 \cdot x\) will be simply denoted as x.

A linear map between \(\mathbb {S}\)–semimodules \(\mathsf {S}_1\) and \(\mathsf {S}_2\) is a function \(h:S_1\rightarrow S_2\) such that for any \(x,y\in S_1\) and \(c,d\in \mathbb {S}\), \(h(cx+dy)=ch(x)+dh(y)\). Let \(\mathsf {S}\) be an \(\mathbb {S}\)–semimodule, X a set, and \(f:X\rightarrow S\) a function. We define the linear extension of f as the linear map \(\bar{f}: V(X) \rightarrow S\) given by

$$ \bar{f}(c_1\cdot x_1+\cdots +c_n\cdot x_n)=c_1f(x_1)+\cdots +c_nf(x_n). $$

A weighted automaton with input alphabet A and weights over a semiring \(\mathbb {S}\) is a pair \((X, \alpha )\), where X is a set (not necessarily finite) and \(\alpha :X\rightarrow V(X)^A\) is a function. We can add an initial state \(i:1\rightarrow V(X)\) and/or a final state function, or colouring, \(f:X\rightarrow \mathbb {S}\), yielding the following situation:

figure a

Notice that if we remove \(\bar{f}\) we have a pointed automaton, and if we remove i we have a Moore automaton with outputs in \(\mathbb {S}\). In particular, every weighted automaton \((X,\alpha )\) gives rise to a deterministic automaton \((V(X),\bar{\alpha })\).

An equation is a pair \((u,v)\in A^*\times A^*\), sometimes also denoted by \(u=v\). Given a weighted automaton \((X,\alpha )\) and an equation \((u,v)\in A^*\times A^*\), we define \((X,\alpha )\models (u,v)\); and say \((X,\alpha )\) satisfies the equation (uv), as follows:

$$ (X,\alpha )\models (u,v)\ \Leftrightarrow \ \forall \phi \in V(X)\ \ u(\phi )=v(\phi ), $$

recall the definition of u(x) at the beginning of this section. For any set of equations \(E\subseteq A^*\times A^*\) we write \((X,\alpha )\models E\) if \((X,\alpha )\models (u,v)\) for every \((u,v)\in E\). Observe that since \(\bar{\alpha }\) is the linear extension of \(\alpha \), the condition \( \forall \phi \in V(X)\ u(\phi )=v(\phi )\) is equivalent to \(\forall x\in X\ u(x)=v(x)\). In other words, \(u=v\) is satisfied if the linear combination of states in V(X) reached after u from every \(x \in X\) is equal to that reached after v.

A set of coequations is a subset \(D\subseteq \mathbb {S}^{A^*}\). We define \((X,\alpha )\models D\); and say \((X,\alpha )\) satisfies the set of coequations D, as follows:

$$ (X,\alpha )\models D \ \Leftrightarrow \ \forall f\in \mathbb {S}^X, \phi \in V(X)\ \ o_{\bar{f}}(\phi ) \in D $$

The power series \(o_{\bar{f}}(\phi )\) is the behaviour of the state \(\phi \) in \((X,\alpha )\) with respect to the colouring f. According to this, \((X,\alpha )\models D\) means that D includes all the possible behaviours for the automaton \((X,\alpha )\).

An equivalence relation C on \(A^*\) is a congruence on \(A^*\) if for any \(t,u,v,w\in A^*\), \((t,v)\in C\) and \((u,w)\in C\) imply \((tu,vw)\in C\). If C is a congruence on \(A^*\), the congruence quotient \(A^*/C\) has a pointed automaton structure \(A^*/C=(A^*/C, [\epsilon ], [\tau ])\) with transition function given by \([\tau ] ([w],a)=[\tau (w,a)]=[wa]\), which is well defined since C is a congruence.

3 Free and Cofree Construction for Weighted Automata

In this section we will show how to construct the maximum set of equations and the minimum set of coequations satisfied by a weighted automaton \((X,\alpha )\) (thereby generalizing the approach of [3, Sect.5]). We use the notation \(\mathbf {free}_w\) and \(\mathbf {cofree}_w\) to distinguish between the free and cofree construction for weighted automata from the construction \(\mathbf {free}\) and \(\mathbf {cofree}\) for deterministic automata defined in [3], respectively.

To get the maximum set of equations of \((X,\alpha )\) we are going to construct the pointed deterministic automaton \(\mathbf {free}_w(X,\alpha )\) by taking the following steps:

  1. 1.

    Define the pointed deterministic automaton \(\prod (X,\alpha )=(\prod _{x\in X} V(X), \varDelta , \hat{\alpha })\) where \(\hat{\alpha }\) is the product of \(\bar{\alpha }\) |X| times, that is \(\hat{\alpha }(\theta )(a)(x)=\bar{\alpha } (\theta (x))(a)\), and \(\varDelta \in \prod _{x\in X} V(X)\) is given by \(\varDelta (x)=x\). Then, by initiality of \(A^*=(A^*,\epsilon ,\tau )\), we get a unique F-algebra morphism \(r_\varDelta :A^*\rightarrow \prod (X,\alpha )\).

  2. 2.

    Define \(\mathbf {free}_w(X,\alpha )\) and \(\mathbf {Eq}_w(X,\alpha )\) as

    $$ \mathbf {free}_w(X,\alpha ):=A^*/\ker (r_\varDelta ) \text{ and } \mathbf {Eq}_w(X,\alpha ):=\ker (r_\varDelta ) $$

By construction, we have the following theorem.

Theorem 1

\(\mathbf {Eq}_w(X,\alpha )\) is the maximum set of equations satisfied by \((X,\alpha )\).

Example 2

Consider the weighted automaton with input alphabet \(A=\{a,b\}\) and weights on the semiring (field) \(\mathbb {Z}_3\) given by the following diagram:

figure b

According to the definition \(\mathbf {free}_w(X,\alpha )=A^*/\ker (r_\varDelta )\cong {{\mathrm{Im}}}(r_\varDelta )\), so in order to construct \(\mathbf {free}_w(X,\alpha )\) we only need to construct the reachable part of \(\prod (X,\alpha )\) from the state \(\varDelta =(x,y)\). By doing that we get the automaton:

figure c

Thus \(\mathbf {Eq}_w(X,\alpha )\) is the congruence generated by \(\{a=\epsilon ,bb=bbb\}\) and \(\mathbf {free}_w(X,\alpha )\) is the automaton with states \([\epsilon ]=a^*\), \([b]=a^*ba^*\), and \([bb]=a^*ba^*b(a+b)^*\).    \(\square \)

Now we will show how to get the minimum set of coequations satisfied by \((X,\alpha )\). First some notation: for any family of sets \(\{X_i\}_{i\in I}\) we denote by \(\coprod _{i\in I}X_i\) the disjoint union (coproduct in \({{\mathrm{Set}}}\)) of the family which is given by \(\coprod _{i\in I}X_i=\bigcup _{i\in I}\{i\}\times X_i\). We will construct the Moore automaton \(\mathbf {cofree}_w(X,\alpha )\) by taking the following steps:

  1. 1.

    Define the Moore automaton \(\coprod (X,\alpha )=(\coprod _{f\in \mathbb {S}^X}V(X),\varPhi , \tilde{\alpha })\) where \(\tilde{\alpha }\) and \(\varPhi \) are given by \(\tilde{\alpha }(f,\phi )(a)=(f,a(\phi ))\) and \(\varPhi (f,\phi )=\bar{f}(\phi )\). Then, by finality of \(\mathbb {S}^{A^*}=\left( \mathbb {S}^{A^*},\hat{\epsilon },\hat{\tau }\right) \), we get a unique G-coalgebra morphism \(o_\varPhi :\coprod (X,\alpha )\rightarrow ~\mathbb {S}^{A^*}\).

  2. 2.

    Define \(\mathbf {cofree}_w(X,\alpha )\) and \(\mathbf {coEq}_w(X,\alpha )\) as

    $$ \mathbf {cofree}_w(X,\alpha )=\mathbf {coEq}_w(X,\alpha ):={{\mathrm{Im}}}(o_\varPhi ). $$

Similarly as in the case of equations we have the following theorem.

Theorem 3

\(\mathbf {coEq}_w(X,\alpha )\) is the minimum set of coequations satisfied by \((X,\alpha )\).

4 Duality Between Equations and Coequations

In this section, we will use the free and cofree construction given in [3, Sect. 5], to show a duality result between equations and coequations (here the we are going to use the semiring \(\mathbb {S}\) as a set of colours on the coalgebraic side). In the sequel \(\mathbb {S}\) will be a fixed semiring.

Proposition 4

For every congruence quotient \(A^*/C\),

$$ \mathbf {cofree}(A^*/C)=\{f\in \mathbb {S}^{A^*}\ |\ C\subseteq \ker (f)\}. $$

Observe that the previous proposition is a generalization of [3, Proposition 14]. As a consequence, we have:

Theorem 5

For any congruence quotient \(A^*/C\), \(\mathbf {free}\circ \mathbf {cofree}(A^*/C)=A^*/C\).

Now we will define one of the main concepts that will lead us to the duality. A subset \(S\subseteq \mathbb {S}^{A^*}\) is called a closed subsystem if

  1. (i)

    S is closed under left and right derivatives.

  2. (ii)

    \(\mathbf {B}(S)\mathop {=}\limits ^{\mathrm{def}}\left( \{{{\mathrm{supp}}}(f)|f\in S\}, \cap , (\ )', A^*\right) \) is a complete atomic Boolean algebra.

  3. (iii)

    Let \({{\mathrm{At}}}(\mathbf {B}(S))\) be the set of atoms of \(\mathbf {B}(S)\), then

    $$ S=\left\{ f\in \mathbb {S}^{A^*}\ |\ \forall P\in {{\mathrm{At}}}(\mathbf {B}(S)) \ f{\upharpoonright }_P\text { is constant} \right\} . $$

Notice that condition i) implies that in fact S is a subsystem of \(\mathbb {S}^{A^*}\), i.e. a subcoalgebra. For the case of the Boolean semiring, that is \(\mathbb {S}=\mathbb {B}\), the notion of closed subsystem coincides with that of preformation of languages defined in [3], which was one of the reasons and motivations to state the previous definition in its final form. We can think of closed subsystems as \(\mathbb {S}\)-colourings of the atoms of \(\mathbf {B}(S)\), according to condition iii), which in the case of preformations of languages are only 2-colourings, but we also need those colourings to be well-behaved in the sense of being closed under left and right derivatives to get a subsystem of the final coalgebra \(\mathbb {S}^{A^*}\), and also to induce a congruence defining the system in the following sense.

Theorem 6

Let S be a subset of \(\mathbb {S}^{A^*}\). S is a closed subsystem if and only if \(S=\mathbf {cofree}(A^*/C)\) for some congruence quotient \(A^*/C\).

Observe that by Theorem 5, the congruence C of the previous theorem is unique.

Example 7

Let \(S\subseteq \mathbb {S}^{A^*}\) be the set \(S=\left\{ f\in \mathbb {S}^{A^*}\ \big |\ \forall w\in A^* f(w)=f\left( b^{|w|_b}\right) \right\} \), where \(|w|_b\) is the number of b’s in the word w. Then one can verify that S is a closed subsystem in which \({{\mathrm{At}}}(\mathbf {B}(S))=\{a^*,a^*ba^*,a^*(ba^*)^2,a^*(ba^*)^3,\ldots \}\) and \(S=\mathbf {cofree}(A^*/C)\), if we take for C the congruence generated by \(\{a=\epsilon \}\). \(\square \)

Corollary 8

For any closed subsystem S of \(\mathbb {S}^{A^*}\), \(\mathbf {cofree}\circ \mathbf {free}(S)=S\).

Example 9

Consider the congruence quotient \(A^*/C=\{[\epsilon ],[b],[bb]\}\) from Example 2, that is \(A=\{a,b\}\) and C is the congruence generated by \(\{a=\epsilon ,bb=bbb\}\). If \(\mathbb {S}=\mathbb {Z}_3\), then by Proposition 4 we know that

$$ \mathbf {cofree}(A^*/C)=\left\{ f\in (\mathbb {Z}_3)^{A^*}\ |\ \forall [w]\in A^*/C,\ f{\upharpoonright }_{[w]} \text { is constant }\right\} \!, $$

that is \(\mathbf {cofree}(A^*/C)\) has 27 elements. For any partition \(\mathcal {A}=\{[w_i]\ |\ 1\le i\le n\}\) of \(A^*\) and \(c_i\in \mathbb {S}\) we define the function \(c_1\chi _{[w_1]}+\cdots +c_n\chi _{[w_n]}\in \mathbb {S}^{A^*}\) as \((c_1\chi _{[w_1]}+\cdots +c_n\chi _{[w_n]})(w)=c_i\) if and only if \(w\in [w_i]\), which is well-defined since \(\mathcal {A}\) is a partition of \(A^*\). Using the previous notation, we have that every element in \(\mathbf {cofree}(A^*/C)\) is of the form \(c_1\chi _{[\epsilon ]}+c_2\chi _{[b]}+c_3\chi _{[bb]}\), \(c_i\in \mathbb {Z}_3\). Here are some examples:

  1. (i)

    Consider the colouring \(c:A^*/C\rightarrow \mathbb {Z}_3\) given by \(c([\epsilon ])=0\), \(c([b])=1\), and \(c([bb])=2\), then we get that \(o_c([b])\in (\mathbb {Z}_3)^{A^*}\) is given by

    $$ o_c([b])(w)={\left\{ \begin{array}{ll} 1 &{}\text {if } w\in a^*,\\ 2 &{}\text {otherwise}. \end{array}\right. } $$

    That is, \(o_c([b])=\chi _{[\epsilon ]}+2\chi _{[b]}+2\chi _{[bb]}\).

  2. (ii)

    For any \(c_i\in \mathbb {Z}_3\), the reader can easily verify the following identities for left derivatives

    $$\begin{aligned} _a\left( c_1\chi _{[\epsilon ]}+c_2\chi _{[b]}+c_3\chi _{[bb]}\right)&= c_1\chi _{[\epsilon ]}+c_2\chi _{[b]}+c_3\chi _{[bb]}\\ _b\left( c_1\chi _{[\epsilon ]}+c_2\chi _{[b]}+c_3\chi _{[bb]}\right)&= c_2\chi _{[\epsilon ]}+c_3\chi _{[b]}+c_3\chi _{[bb]}. \end{aligned}$$

\(\square \)

Next we define the category \(\mathcal {C}\) of congruence quotients and the category \(\mathcal {K}\) of closed subsystems, for a fixed semiring \(\mathbb {S}\), as follows:

$$\begin{aligned} \mathrm{objects}(\mathcal {C})&=\{(A^*/C,[\tau ])\ |\ A^*/C \text { is a congruence quotient}\}\\ \mathrm{arrows}(\mathcal {C})&=\{e:A^*/C\rightarrow A^*/D\ |\ e \,\text { is an epimorphism}\}\\ \mathrm{objects}(\mathcal {K})&=\{(K,\alpha )\ |\ K \text { is a closed subsystem of}\,\mathbb {S}^{A^*}\}\\ \mathrm{arrows}(\mathcal {K})&=\{m:K\rightarrow K'\ |\ m \,\text { is a monomorphism}\} \end{aligned}$$

By Theorem 5 and Corollary 8, we have the following.

Theorem 10

\(\mathbf {cofree}:\ \mathcal {C}\cong \mathcal {K}^{op}\ :\mathbf {free}\)

Example 11

Let \(C_1\) be the congruence generated by \(\{a=\epsilon ,bb=bbb\}\) and let \(C_2\) be the congruence generated by \(\{a=\epsilon ,b=bb\}\). Clearly we have that \(C_1\subseteq C_2\), \(A^*/C_1=\{[\epsilon ]_1,[b]_1,[bb]_1\}\), and \(A^*/C_2=\{[\epsilon ]_2,[b]_2\}\) where

$$ \begin{array}{ll} \ [\epsilon ]_1=[\epsilon ]_2=a^* &{} [b]_1=a^*ba^*\\ \ [bb]_1=a^* ba^* b(a+b)^* &{} [b]_2=a^*b(a+b)^*=[b]_1\cup [bb]_1. \end{array} $$

By the duality, we have the following situation:

figure d

where the epimorphism \(e:A^*/C_1\rightarrow A^*/C_2\) is given by \(e([w]_1)=[w]_2\), and the monomorphism \(m:\mathbf {cofree}(A^*/C_2)\rightarrow \mathbf {cofree}(A^*/C_1)\) is given by \(m(c_1\chi _{[\epsilon ]_2}+c_2\chi _{[b]_2})=c_1\chi _{[\epsilon ]_1}+c_2\chi _{[b]_1}+c_2\chi _{[bb]_1}\). \(\square \)

There is also the following consequence of the duality, which basically tells us that we can either work with congruences or closed subsystems in weighted automata.

Theorem 12

Let \(S\subseteq \mathbb {S}^{A^*}\) be a closed subsystem, C a congruence on \(A^*\), and \((X,\alpha )\) a weighted automata. Then

  1. (i)

    \((X,\alpha )\models C\) if and only if \((X,\alpha )\models \mathbf {coEq}(A^*/C)\).

  2. (ii)

    \((X,\alpha )\models S\) if and only if \((X,\alpha )\models \mathbf {Eq}(S)\).

Example 13

(Example 7 continued) Let \(A=\{a,b\}\). We showed the correspondence between the congruence C on \(A^*\) generated by \(\{a=\epsilon \}\) and the closed subsystem \(S\subseteq \mathbb {S}^{A^*}\) given by \(S=\left\{ f\in \mathbb {S}^{A^*}\ \big |\ \forall w\in A^* f(w)=f\left( b^{|w|_b}\right) \right\} \), that is \(\mathbf {cofree}(A^*/C)=S\) or, equivalently, \(\mathbf {free}(S)=A^*/C\). In this case, we have, for a weighted automaton \((X,\alpha )\) on A

$$ (X,\alpha )\models C\ \Leftrightarrow \ (X,\alpha )\models S\ \Leftrightarrow \ \forall x\in X \ \alpha (x,a)=x. \square $$

5 Linear Equations and Coequations

In this section, we work with a more general kind of equations and a more general kind of automata. In the previous sections we defined equations to be pairs \((u,v)\in A^*\times A^*\) and we said that an automaton \((X,\alpha )\) with weights in \(\mathbb {S}\) satisfies (uv) if for any \(x\in X\), \(u(x)=v(x)\). Since \(u(x)\in V(X)\) (the free \(\mathbb {S}\)-semimodule generated by X) for any \(u\in A^*\), it makes sense to define expressions like

$$\begin{aligned} (s_1w_1+\cdots +s_nw_n)(x) := s_1w_1(x)+\cdots +s_nw_n(x) \end{aligned}$$

for elements \(s_i\in \mathbb {S}\) and \(w_i\in A^*\), that is, for any \(\varphi = \sum _{i=1}^ns_iw_i\in V(A^*)\) and \(x\in X\) we get an element \(\varphi (x)\in V(X)\). In this case we can ask whether \(\varphi (x)=\psi (x)\) holds or not for some given \(\varphi ,\psi \in V(A^*)\) and \(x\in X\). A pair \((\varphi ,\psi )\in V(A^*)\) will be called a linear equation, and note that this is now a more general kind of equation since \(A^*\subseteq V(A^*)\). We write \((X,\alpha )\models (\varphi ,\psi )\); \((X,\alpha )\) satisfies the equation \((\varphi ,\psi )\), if for every \(x\in X\), \(\varphi (x)=\psi (x)\).

Example 14

Let \((X,\alpha )\) be the weighted automata on \(A=\{a,b,c\}\) with weights on \(\mathbb {Z}_3\) given by the following diagram:

figure e

Then one easily verifies the following linear equations are satisfied by \((X,\alpha )\): \(ac=c\), \(a+c=b\), and \(2c+b=b^2\).

We have that \(\mathbf {free}_w(X,\alpha )\) is the following automaton:

figure f

Hence, \(\mathbf {Eq}_w(X,\alpha )\) is the congruence generated by

$$ \{c^2=c^3,b^3=\epsilon ,a^3=\epsilon ,ab=\epsilon ,ba=\epsilon ,ac=c,bc=c,ca=c,cb=c\}. $$

Observe that none of the linear equations \(a+c=b\) nor \(2c+b=b^2\) can be deduced from \(\mathbf {Eq}_w(X,\alpha )\), even though \((X,\alpha )\) satisfies both of them. \(\square \)

As the previous example shows, equations of the form \((\varphi , \psi )\in V(A^*)\times V(A^*)\) that are satisfied by \((X,\alpha )\) are interesting. We now turn to define the notion of linear automata.

For a field \(\mathbb {K}\), let \({{\mathrm{Vec_\mathbb {K}}}}\) denote the category of vector spaces over \(\mathbb {K}\) with linear maps as morphisms. A \(\mathbb {K}\)linear automaton, or \({{\mathrm{Vec_\mathbb {K}}}}\) automaton, on an alphabet A is a pair \((S,\alpha )\), where S is an vector space and \(\alpha :S\rightarrow S^A\) is a linear map. As in weighted automata, we can have an initial state \(s_0\in {{\mathrm{Vec_\mathbb {K}}}}(\mathbb {K},S)\) and/or a colouring \(c\in {{\mathrm{Vec_\mathbb {K}}}}(S,\mathbb {K})\). Notice that \(s_0\) is completely determined by its value at 1, so we can identify \(s_0\) with the element \(s_0:=s_0(1)\in S\). Observe that every weighted automaton over a field \(\mathbb {K}\) yields a \({{\mathrm{Vec_\mathbb {K}}}}\) automaton and conversely. A pointed \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \((S,s_0,\alpha )\) is a H-algebra for the endofunctor \(H: {{\mathrm{Vec_\mathbb {K}}}}\rightarrow {{\mathrm{Vec_\mathbb {K}}}}\) given by \(H(S)=\mathbb {K}+A\times S\) (formally, H(S) is the sum of the space \(\mathbb {K}\) and A copies of the space S), and a coloured \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \((S,c,\alpha )\) is an I-coalgebra for the endofunctor \(I:{{\mathrm{Vec_\mathbb {K}}}}\rightarrow {{\mathrm{Vec_\mathbb {K}}}}\) given by \(I(S)=\mathbb {K}\times S^A\).

The initial H-algebra is \((V(A^*),\epsilon ,\tau )\) where \(\tau \) is given by \(\tau \left( \sum s_iw_i\right) (a)=\sum s_i (w_ia)\), and for any pointed \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \((S,s_0,\alpha )\) the unique H-algebra morphism \(r_{s_0}:V(A^*)\rightarrow S\) is given by \(r_{s_0}\left( \sum s_iw_i\right) =\sum s_i w_i(s_0)\). Dually, the final I-coalgebra is \(\left( \mathbb {K}^{A^*}, \hat{\epsilon },\hat{\tau }\right) \), where \(\hat{\epsilon }(f)=f(\epsilon )\) and \(\hat{\tau }(f)(a)=f_a\), and for any coloured \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \((S,c,\alpha )\) the unique I-coalgebra morphism \(o_c:S\rightarrow \mathbb {K}^{A^*}\) is given by \(o_c(s)(w)=c(w(s))\).

A \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \((S,\alpha )\) satisfies the linear equation \((\phi ,\psi )\in V(A^*)\times V(A^*)\), denoted by \((S,\alpha )\models (\phi ,\psi )\), if for any \(s\in S\) we have that \(\phi (s)=\psi (s)\).

An equivalence relation C on \(V(A^*)\) is a linear congruence on \(V(A^*)\) if:

  1. (i)

    \((\varphi _1,\psi _1),(\varphi _2,\psi _2)\in C\) imply \((\varphi _1+\varphi _1,\psi _1+\psi _2)\in C\).

  2. (ii)

    \((\varphi ,\psi )\in C\), \(k \in \mathbb {K}\), and \(w\in A^*\) imply \((k\varphi ,k\psi ),(w\varphi ,w\psi ),(\varphi w,\psi w)\in C\). Where, for \(\phi =\sum _{i=1}^nk_iw_i\in V(A^*)\), \(k\phi \), \(w\phi \), and \(\phi w\) are defined as

    $$ k\phi :=\sum _{i=1}^nkk_iw_i,\ \ \ \ w\phi =\sum _{i=1}^nk_iww_i,\ \ \ \ \phi w=\sum _{i=1}^nk_iw_iw. $$

Observe that if C is a linear congruence on \(V(A^*)\) then the set \(V(A^*)/C\) has the structure of a \({{\mathrm{Vec_\mathbb {K}}}}\) pointed automaton.

Similarly to the case of weighted automata, we will define \(\mathbf {free}_l(S,\alpha )\) and \(\mathbf {Eq}_l(S,\alpha )\) by taking the following steps:

  1. 1.

    Define the pointed \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \(\prod (S,\alpha )=(\prod _{s\in S} S, \varDelta , \hat{\alpha })\) where \(\hat{\alpha }\) is the product of \(\alpha \), |S| times, that is \(\hat{\alpha }(x)(a)(s)=\alpha (x(s))(a)\), and \(\varDelta \in \prod _{s\in S} S\) is given by \(\varDelta (s)=s\). Then, by initiality of \((V(A^*),\epsilon ,\tau )\), we get a unique H-algebra morphism \(r_\varDelta :V(A^*)\rightarrow \prod (S,\alpha )\).

  2. 2.

    Define \(\mathbf {free}_l(S,\alpha )\) and \(\mathbf {Eq}_l(S,\alpha )\) as

    $$ \mathbf {free}_l(S,\alpha ):=V(A^*)/\ker (r_\varDelta ), \text{ and } \mathbf {Eq}_l(S,\alpha ):= \ker (r_\varDelta ). $$

Notice that \(\mathbf {Eq}_l(S,\alpha )\) is a linear congruence on \(V(A^*)\).

Theorem 15

\(\mathbf {Eq}_l(S,\alpha )\) is the maximum set of equations satisfied by \((S,\alpha )\).

A set of coequations is a subspace \(D\subseteq \mathbb {K}^{A^*}\). We define \((S,\alpha )\models D\), and say \((S,\alpha )\) satisfies the set of coequations D, as follows:

$$ (S,\alpha )\models D \ \Leftrightarrow \ \forall c\in {{\mathrm{Vec_\mathbb {K}}}}(S,\mathbb {K}),s\in S\ o_c(s)\in D. $$

For a family \(\{S_i\}_{i\in I}\) of vector spaces, we denote the coproduct of that family by \(\coprod _{i\in I}S_i\). The minimum set of coequations satisfied by \((S,\alpha )\) is obtained by the following construction:

  1. 1.

    Define the coloured \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \(\coprod (S,\alpha )=(\coprod _{c\in {{\mathrm{Vec_\mathbb {K}}}}(S,\mathbb {K})} S, \tilde{c}, \tilde{\alpha })\) where \(\tilde{\alpha }\) is the coproduct of the morphism \(\alpha \), \(|{{\mathrm{Vec_\mathbb {K}}}}(S,\mathbb {K})|\) times in \({{\mathrm{Vec_\mathbb {K}}}}\), that is \(\tilde{\alpha }\left( \sum (c_i,s_i)\right) (a)=\sum (c_i,a(s_i))\), and \(\tilde{c}\) is the coproduct of all \(c\in {{\mathrm{Vec_\mathbb {K}}}}(S,\mathbb {K})\) which is given by \(\tilde{c}\left( \sum (c_i,s_i)\right) =\sum c_i(s_i)\). Then, by finality of \(\left( \mathbb {K}^{A^*}, \hat{\epsilon },\hat{\tau }\right) \), we get a unique I-coalgebra morphism \(o_{\tilde{c}}:\coprod (S,\alpha )\rightarrow \mathbb {K}^{A^*}\).

  2. 2.

    Define \(\mathbf {cofree}_l(S,\alpha )\) and \(\mathbf {coEq}_l(S,\alpha )\) as

    $$ \mathbf {cofree}_l(S,\alpha )=\mathbf {coEq}_l(S,\alpha )={{\mathrm{Im}}}(o_{\tilde{c}}). $$

Theorem 16

\(\mathbf {cofree}_l(S,\alpha )\) is the minimum set of coequations satisfied by \((S,\alpha )\).

From the previous constructions we get the following result.

Theorem 17

For a linear congruence C, \(\mathbf {free}_l\circ \mathbf {cofree}_l(V(A^*)/C)=V(A^*)/C\).

The inclusion from right to left above does not hold in general when considering semimodules (over a semiring) instead of vector spaces. Let \(\mathbb {N}\) be the semiring of natural numbers with the usual sum and product, \(A=\{a,b\}\), and let C be the linear congruence on \(V(A^*)\) associated to the partition \(\{\{0\},V(A^*)\backslash \{0\}\}\) of \(V(A^*)\). Then \(V(A^*)/C\cong \mathbb {B}\), the Boolean semiring, where the action of \(\mathbb {N}\) on \(\mathbb {B}\) is given by \(n\cdot x=0\) if and only if \(n=0\) or \(x=0\). However, one can verify that \(\mathbf {cofree}_l(V(A^*)/C)\) has only one element and therefore it satisfies any identity. It follows that \(V(A^*)/C\) cannot be a subset of \(\mathbf {free}_l\circ \mathbf {cofree}_l(V(A^*)/C) = 1\).

Similarly to the case of weighted automata we can show that for a \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \((S,\alpha )\) and a linear congruence C we have that

$$ (S,\alpha )\models C \text{ if } \text{ and } \text{ only } \text{ if } (S,\alpha )\models \mathbf {coEq}_l(V(A^*)/C). $$

Example 18

Let \(A=\{x,y\}\), and let \(C=\langle xy=yx \rangle \) be the linear congruence generated by the equation \(xy=yx\). Then the \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \(V(A^*)/C\) is isomorphic to \(\mathbb {K}[x,y]\), the ring of polynomials on indeterminates x and y with coefficients in \(\mathbb {K}\). Here the transition function on \(\mathbb {K}[x,y]\) is (right) multiplication, that is, for a polynomial \(p(x,y)\in \mathbb {K}[x,y]\) we have that \(x(p(x,y))=p(x,y)x\), and \(y(p(x,y))=p(x,y)y\). Then, \(\mathbf {cofree}_l\left( V(A^*)/C\right) \) is the set

$$ \left\{ f\in \mathbb {K}^{A^*}\ \bigg | \ \forall w_1,w_2 \Big (|w_1|_x=|w_2|_x \wedge |w_1|_y=|w_2|_y \Rightarrow f(w_1)= f(w_2) \Big )\right\} , $$

where \(|w|_x\) is the number of x’s in the word \(w\in A^*\). We can go a little bit further and notice that \(\mathbf {cofree}_l\left( V(A^*)/C\right) \cong \mathbb {K}^{\mathcal {M}(\mathbb {K}[x,y])}\) where \(\mathcal {M}(\mathbb {K}[x,y])\) are the monic monomials in \(\mathbb {K}[x,y]\). \(\square \)

Example 19

Let \(A=\{x,y\}\), and, for a fixed \(k\in \mathbb {K}\), let \(C=\langle xy=yx, y-k=0 \rangle \) be the linear congruence generated by the equations \(xy=yx\) and \(y-k=0\). Then the \({{\mathrm{Vec_\mathbb {K}}}}\) automaton \(V(A^*)/C\) is isomorphic to \(\mathbb {K}[x,y]/\langle y-k \rangle \cong \mathbb {K}[x]\), where \(\langle y-k\rangle \) is the ideal generated by \(y-k\). A similar calculation as in the previous example shows that \(\mathbf {cofree}_l\left( V(A^*)/C\right) \cong \mathbb {K}^{\{x\}^*}\cong \mathbb {K}^\mathbb {N}\). \(\square \)

Example 20

Let A be a finite alphabet, and C a linear congruence on \(V(A^*)\) such that for any \(x,y\in A\), \((xy,yx)\in C\). We claim that \(C=\langle C_0\rangle \) for some finite \(C_0\subseteq C\). In fact, if \(A=\{x_1,\ldots ,x_n\}\) then \(V(A^*)/C\cong \mathbb {K}[x_1,\ldots ,x_n]/I\) for the ideal I of \(\mathbb {K}[x_1,\ldots ,x_n]\) given by \(I=\{\phi - \psi \ |\ (\phi ,\psi )\in C\}\), which is an ideal of \(\mathbb {K}[x_1,\ldots ,x_n]\) since C is a linear congruence (here \(\phi - \psi \) is calculated as in \(\mathbb {K}[x_1,\ldots ,x_n]\)). Then by Hilbert basis theorem we have that I is finitely generated, say \(I=\langle \varphi _1,\ldots ,\varphi _m\rangle \). It follows that

$$ C=\langle \{x_ix_j=x_jx_i\ |\ 1\le i<j\le n\}\cup \{\varphi _l=0\ |\ 1\le l\le m\}\rangle . $$

\(\square \)

Proposition 21

Let \(S\subseteq \mathbb {K}^{A^*}\) be a set that is closed under left derivatives. Then \((S,\alpha )\subseteq \mathbf {cofree}_l\circ \mathbf {free}_l (S,\alpha )\).

To complete a duality result in this case we still need to characterize systems S of the form \(S=\mathbf {cofree}_l(V(A^*)/C)\) for some linear congruence C. The previous proposition is a first step, but a full duality result is an open problem.

6 Conclusion

All the results we proved concerning weighted automata can be easily adapted to the case of probabilistic automata, that is, we can replace V(X) by \(\mathcal {D}(X)\), where \(\mathcal {D}(X)=\{f:X\rightarrow [0,1]\ |\ {{\mathrm{supp}}}(f) \text { is finite and }\sum _{x\in {{\mathrm{supp}}}(f)} f(x)=1\}\), and replace the semiring \(\mathbb {S}\) by the real interval [0, 1] to get similar results. In fact, the linear extension \(\bar{\alpha }:V(X)\rightarrow V(X)^A\) of a function \(\alpha : X\rightarrow V(X)^A\) is well-defined if we replace V(X) by \(\mathcal {D}(X)\) and the elements 0 and 1 in \(\mathbb {S}\) are the elements 0 and 1 in [0, 1], respectively. In the future we plan to study other types of automata such as tree automata. Furthermore, we also want to understand the duality for linear weighted automata.

The closest related work is [3] which can be considered as a special case of our study here. While we allow automata with infinite sets of states, other dualities have concentrated on the finitary case [1, 8, 9, 12]. Another difference with those approaches is that we present a local duality, in which the alphabet is (possibly infinite but) fixed, while [8, 9, 12] consider all finite alphabets at the same time.