1 Introduction

Regular model checking [2, 3, 13, 24, 31] is a symbolic exploration and verification technique where sets of configurations are expressed as regular languages and transition relations are encoded as rational relations [27,28,29] in the form of generalized sequential machines. A generalized sequential machine (GSM) is essentially a finite state machine with output capability; on every transition an input symbol is read, the state changes, and a finite string is appended to an output string (see Fig. 1, for instance, where the label \(\alpha /s\) indicates that the machine reads the symbol \(\alpha \) and writes the string s on any such transition). While regular model checking is undecidable in general, a number of approximation schemes and heuristics [1, 8, 12, 13, 18, 22, 23, 30] have made it a practical verification approach. It has, for example, been applied to verify programs with unbounded data structures such as lists and stacks [3, 13]. Moreover, since infinite strings over a finite alphabet can be naturally interpreted as real numbers in the unit interval, regular model checking over infinite strings provides a framework [7, 9, 10, 14, 25, 26] to analyze properties of dynamical systems.

Fig. 1.
figure 1

A GSM that shifts a string to the right by 1 or 2, or equivalently realizing division of the binary encoding of real numbers in [0, 1] by 2 or 4.

This paper generalizes the regular model checking approach so that transition relations can be expressed using regular relations over infinite strings. We propose the computational model of nondeterministic streaming string transducers on infinite strings (\(\omega \)-NSST), and explore theoretical properties of \(\omega \)-NSSTs required to effectively carry out regular model checking.

Regular Relations. While rational relations are capable of modelling a rich set of transition systems, their limitations can be observed by noting their inability to express common transformations such as \(\mathsf {copy}~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~w \mapsto ww\) and \(\mathsf {reverse}{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~w \mapsto \overleftarrow{w}\), where the string \(\overleftarrow{w}\) is the reverse of the string w. Courcelle [16, 17] initiated the use of monadic second-order logic (MSO) in defining deterministic and nondeterministic graph-to-graph transformations which are known to include some non-rational transformations like copy and reverse. Engelfriet and Hoogeboom [20] showed that deterministic MSO-definable transformations (DMSOT) over finite strings coincide exactly with the transformations that can be realized by generalizations of GSMs that can read inputs in two directions (2GSM). Furthermore, they showed that this correspondence does not extend to the set of nondeterministic MSO-definable transformations (NMSOT) and nondeterministic 2GSMs (N2GSM).

Fig. 2.
figure 2

SST implementing reverse. Here, x is a string variable and input strings ending in the final state \(s_0\) output variable x (as shown by the label on the outgoing arrow from \(s_0\).)

Alur and Černý [4] proposed a one-way machine capable of realizing the same transformations as DMSOTs. These machines, known as streaming string transducers (SST), work by storing and combining partial outputs in a finite set of variables, and enjoy a number of appealing properties including decidability of functional equivalence and type-checking (see Fig. 2 for an SST realization of reverse). Alur and Deshmukh followed up this work by introducing nondeterministic streaming string transducers (NSST) as a natural generalization [5] and proved this model captures precisely the same set of relations as NMSOTs. Since the connection between automata and logic is often used as a yardstick for regularity, MSO-definable functions and relations over finite strings are often called regular functions and regular relations.

Regular Relations over Infinite Strings. The expressiveness of SSTs and MSO-definable transformations also coincide when representing functions over infinite strings [6]. Deterministic SSTs operating on infinite strings are known as \(\omega \)-DSSTs, however, for regular relations of infinite strings, no existing computational model exists. We combine and generalize results in the literature on NSSTs and \(\omega \)-DSSTs to propose the computational model of nondeterministic streaming \(\omega \)-string transducers (\(\omega \)-NSST) capturing regular relations of \(\omega \)-strings.

Fig. 3.
figure 3

An \(\omega \text {-NSST}{}\) implementing the relation \(R_{\overleftarrow{u}u}\) from Example 1. Let \(\alpha \) denote all symbols in A, excluding #. Variable w remembers the string since the last #, while x and y store the chosen suffix and its reverse. The output variable is z.

Example 1

Let A be a finite alphabet and \(\#\) be a special separator not in A. For \(u, v \in A^*\), we say that \(v \preceq u\) if v is a suffix of u. Consider a relation \(R_{\overleftarrow{u}u}\) that transforms strings in \((A \cup \left\{ \# \right\} )^\omega \) such that each maximal \(\#\)-free finite substring u occurring in the input string is transformed into \(\overleftarrow{v}v\) for some suffix v of u. Formally, \(R_{\overleftarrow{u}u}\) is defined as

$$\begin{aligned} \left\{ \left( u_1\#\cdots \#u_n\#w, \overleftarrow{v_1}v_1\#\cdots \#\overleftarrow{v_n}v_n\#w \right) : u_i, v_i \in A^*, w \in A^\omega , \text { and } v_i \preceq u_i \right\} \\ \cup \left\{ \left( u_1\#u_2\#\ldots , \overleftarrow{v_1}v_1\#\overleftarrow{v_2}v_2\#\ldots \right) : u_i, v_i \in A^* \text { and } v_i \preceq u_i \right\} , \end{aligned}$$

and can be implemented as an \(\omega \text {-NSST}{}\) with Büchi acceptance condition (accepting states are visited infinitely often for accepting strings) as shown in Fig. 3.

Contributions and Outline. In Sect. 2 we introduce \(\omega \)-NSSTs and their semantics as a computational model for regular relations. In Sect. 3 we prove that the \(\omega \)-NSST-definable relations coincide exactly with MSO-definable relations of infinite strings. In Sect. 4 we consider regular model checking with regular relations. To enable regular model checking with regular relations, we study the following key verification problem. The type checking problem for \(\omega \)-NSSTs asks to decide, given two \(\omega \)-regular languages \(L_1,L_2\) and an \(\omega \)-NSST, whether \([\![T]\!](L_1) \subseteq L_2\), where \([\![T]\!]\) is the regular relation implemented by T. We show that type checking for \(\omega \)-NSSTs is decidable in Pspace.

2 Regular Relations for Infinite Strings

An alphabet A is a finite set of letters. A string w over an alphabet A is a finite sequence of symbols in A. We denote the empty string by \(\varepsilon \). We write \(A^*\) for the set of all finite strings over A, and for \(w \in A^*\) we write |w| for its length. A language L over A is a subset of \(A^*\). An \(\omega \)-string x over A is a function \(x: \mathbb {N}{\rightarrow } A\), and written as \(x = x(0) x(1) \cdots \). We write \(A^\omega \) for the set of all \(\omega \)-strings over A, and \(A^\infty \) for \(A^* \cup A^\omega \). An \(\omega \)-language L over A is a subset of \(A^\omega \).

2.1 MSO Definable Relations

Strings may be viewed as ordered structures encoded over the signature \(\mathcal {S}_A = \{(a)_{a \in A}, <\}\) and interpreted with respect to \(A^*\) or \(A^\omega \). The domain of a string in this context refers to the set of valid positions in the string, and the relation < in \(\mathcal {S}_A\) ranges over this domain. The expression a(x) holds true if the symbol at position x is a, and \(x < y\) holds if x is a lesser index than y.

Formulae in MSO over \(\mathcal {S}_A\) are defined relative to a countable set of first-order variables \(x, y, z, \ldots \) that range over individual elements of the domain and a countable set of second-order variables \(X, Y, Z, \ldots \) that range over subsets of the domain. The syntax for well-formed formulae is given as:

$$\begin{aligned} \phi \,::\,= \exists X.\ \phi \mid \exists x.\ \phi \mid \phi \wedge \phi \mid \phi \vee \phi \mid \lnot \phi \mid a(x) \mid x < y \mid x \in X \end{aligned}$$

MSO transducers are particular specifications in this logic that define transformations between strings. Intuitively, each such transducer copies each input string some fixed number of times and treats the positions in each copy as nodes in a graph, which are then relabeled and and rearranged in accordance with the formulae of the transducer to produce an output.

Definition 1

A deterministic MSO \(\omega \)-string transducer (\(\omega \)-DMSOT) is a tuple

$$\begin{aligned} \left( A, B, \mathsf {dom}, N, (\phi ^n_b(x))^{n \in N}_{b \in B}, (\psi ^{n,m}(x,y))^{n,m \in N} \right) , \end{aligned}$$

where A and B are input and output alphabets, \(N = \left\{ 1,\ldots ,n \right\} \) is a set of copy indices, \(\mathsf {dom}\) is an MSO sentence that defines an input language, the node formulae \(\left( \phi ^n_b(x) \right) ^{n \in N}_{b \in B}\) specify the labels of positions in the output, and the edge formulae \((\psi ^{n,m}(x,y))^{n,m \in N}\) specify which positions in the output will be adjacent.

A \(\omega \)-DMSOT operates over N disjoint copies of the string graph of an input. Each formula \(\phi ^n_b\) has a single free variable and should be interpreted such that if a position satisfies \(\phi ^n_b\), then that position will be labeled by the symbol b in the \(n^{th}\) disjoint string graph comprising the output. Each formula \(\psi ^{(n, m)}\) has two free variables and a satisfying pair of indices indicates that there is a link between the former index in copy n and the latter index in copy m.

Nondeterminism is introduced through additional set variables \(X_1, \dots , X_k\) called parameters. Fixing a valuation—sets of positions of the input graph satisfying the domain formula—of these parameters determines an output graph, just as in the deterministic case. Each possible valuation may result in a different output graph for the same input graph, and thus nondeterminism arises from the choice of valuation.

Definition 2

A nondeterministic MSO \(\omega \)-string transducer (\(\omega \)-NMSOT) with k free set variables \(\mathbf {X}_k = (X_1,\ldots ,X_k)\) is given as a tuple

$$\begin{aligned} \left( A, B, \mathsf {dom}(\mathbf {X}_k), N, (\phi ^n_b(x, \mathbf {X}_k))^{n \in N}_{b \in B}, (\psi ^{n,m}(x, y, \mathbf {X}_k))^{n,m \in N} \right) , \end{aligned}$$

where all formulae are parameterized by the free second-order variables in addition to the required first-order parameters.

A relation between strings is a regular relation if it is definable by a \(\omega \)-NMSOT. Since \(\omega \)-DMSOTs can map each input to at most one output, the relations definable by \(\omega \)-DMSOTs are called the regular functions.

Fig. 4.
figure 4

Two possible outputs of the relation given in Example 1 constructed according ot the \(\omega \)-NMSOT from Example 2.

Example 2

We now describe a \(\omega \)-NMSOT capturing the relation given in Example 1. Set \(A = \left\{ a,b,\# \right\} = B\), \(N = \left\{ 1, 2 \right\} \), and consider a single parameter \(\mathbf {X}_1 = \left\{ X_1 \right\} \). The domain of the relation is simply \(A^\omega \), so we omit the formula. For all symbols \(\beta \in B\) and copy indices \(n \in N\), the node formulae labels each position with the same symbol as the corresponding position in the input string: \(\phi ^n_\beta (x, X_1)~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~\beta (x)\). We omit formal specifications of the edge formulae (which can be found in the extended version of this work [19]) and describe them informally. The formula for edges from copy 1 to copy 1 connects adjacent non-\(\#\) positions that belong to \(X_1\) in the reverse order. The formula for edges from copy 1 to copy 2 connects non-\(\#\) positions to themselves when the predecessor position is not in \(X_1\). The formula for edges from copy 2 to copy 2 links the right-most sequence of positions in \(X_1\) that preceed a \(\#\) symbol and also connect all those positions coming after the final \(\#\) if required. Finally, the formula for edges from copy 2 to copy 1 links \(\#\) symbols to the last position in \(X_1\) left of the next \(\#\).

Two possible outputs from the relation of Example 1 are displayed in Fig. 4 which shows how the above \(\omega \)-NMSOT constructs an output string for two different valuations of \(X_1\). A 1 in the blue (resp. green) row signifies that the position at that column is in \(X_1\), while a 0 indicates that it is not in \(X_1\).

2.2 Nondeterministic Streaming String Transducers

Definition 3

A nondeterministic streaming string transducer T over \(\omega \)-strings (\(\omega \)-NSST) is a tuple \((A, B, S, I, \mathsf {Acc}, \varDelta , f, X, U)\), where

  • A and B are finite input and output alphabets,

  • S is a finite set of states,

  • \(I \subseteq Q\) is a set of initial states,

  • \(\mathsf {Acc}\) is an acceptance condition,

  • X is a finite set of string variables,

  • U is a finite set of variable update functions of type \(X \rightarrow (X \cup B)^*\),

  • \(\varDelta \) is a transition function of type \((S \times A) \rightarrow 2^{U \times S}\), and

  • \(f \in X\) is an append-only output variable.

Such a machine is deterministic (a \(\omega \)-DSST) if \(|\varDelta (s,a)| = 1\), for all states \(s \in S\) and symbols \(a \in A\), and \(|I| = 1\); it is nondeterministic otherwise.

On each transition \(s_k \xrightarrow [u_k]{a_k} s_{k+1}\), the transducer changes state and applies the update \(u_k\) to each variable of X in parallel. An \(\omega \)-NSST is copyless if every variable in X occurs at most once in the image \(\mathsf {im}(u)\) of every update \(u \in U\). Alternately stated, an update \(u \in U\) is copyless if the string \(u(x_0)u(x_1) \ldots u(x_{n-1})\) has at most one occurrence of each \(x \in X\), and an \(\omega \)-NSST is copyless if all of its updates are copyless.

A run of an \(\omega \)-NSST on an infinite string \(a_1a_2\cdots \in A^\omega \) is an infinite sequence of states and transitions \(s_0 \xrightarrow [u_0]{a_0} s_1 \xrightarrow [u_1]{a_1} \ldots \) where \(s_0 \in I\) and \((s_{k+1}, u_k) \in \varDelta (s_k, a_k)\) for all \(k \in \mathbb {N}\). Let \(\mathsf {Runs}_{T} (w)\) be the set of all runs in T, given input w. An update function \(u : X \rightarrow (X \cup B)^*\) can easily be extended to \(\widehat{u} : (X \cup B)^* \rightarrow (X \cup B)^*\) such that \(\widehat{u}(w)~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}{}~\varepsilon \) if \(w = \varepsilon \), \(\widehat{u}(w)~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}{}~b\widehat{u}(w')\) if \(w = bw'\), and \(u(x)\widehat{u}(w')\) if \(w = xw'\). The effect of two updates \(u_1, u_2 \in U\) in sequence can be summarized by the function composition \(\widehat{u}_1 \circ \widehat{u}_2\); likewise a sequence of updates of arbitrary length would be summarized by \(\widehat{u}_0 \circ \widehat{u}_1 \circ \ldots \circ \widehat{u}_{n-1}\). For notational convenience, we often omit the hats when the extension is clear from context. Notice that if all updates in a sequence of compositions are copyless, then so is the entire summary.

A valuation is a function \(X \rightarrow B^*\) mapping each variable to a string value. The initial valuation \(\mathsf {val}_{\varepsilon }\) of all variables is the empty string \(\varepsilon \). A valuation is well-defined after any finite prefix \(r_n\) of a run r and is computed as a composition of updates occurring on this prefix: \(\mathsf {val}_{r_n} = \mathsf {val}_{\epsilon } \circ u_0 \circ u_1 \circ \cdots \circ u_{n-1}\). The output \(T(r)~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~\lim _{n \rightarrow \infty } \mathsf {val}_{r_n}(f)\) of T on r is well-defined only if r is accepted by T. Since the output variable f is only ever appended to and never prepended, this limit exists and is an \(\omega \)-string whenever r is accepted, otherwise we set \(T(r) = \bot \). The relation \([\![T]\!]\) realized by an \(\omega \)-NSST T is given by \([\![T]\!]~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~\left\{ (w, T(r)) : r \in \mathsf {Runs}_{T} (w) \right\} \). An \(\omega \)-NSST T is functional if for every w the set \(\left\{ w' : (w,w') \in [\![T]\!] \right\} \) has cardinality at most 1.

We consider both Büchi and Muller acceptance conditions for \(\omega \)-NSSTs and reference these classes of machines by the initialisms NBT and NMT (DBT and DMT for their deterministic versions), respectively. For a run \(r \in \mathsf {Runs}_{T} (w)\), let \(\mathsf {Inf}(r) \subseteq S\) denote the set of states visited infinitely often.

  1. 1.

    A Büchi acceptance condition is given by a set of states \(F \subseteq S\) and is interpreted such that a NBT is defined on an input \(w \in A^\omega \) if there exists a run \(r \in \mathsf {Runs}_{T} (w)\) for which \(\mathsf {Inf}(r) \cap F \ne \emptyset \).

  2. 2.

    A Muller acceptance condition is given as a set of sets \(\mathbb {F}= \left\{ F_0,\ldots ,F_n \right\} \subseteq 2^S\), interpreted such that a NMT is defined on input \(w \in A^\omega \) if there exists a run \(r \in \mathsf {Runs}_{T} (w)\) for which \(\mathsf {Inf}(r) \in \mathbb {F}\).

Proposition 1

A relation is NBT definable if, and only if, it is NMT definable.

The equivalence of NBT and NMT -definable relations follows from a straightforward application of the equivalence of nondeterministic Büchi automata and nondeterministic Muller automata. Equivalence of these acceptance conditions in transducers allows us to switch between them whenever convenient.

Remark 1

Observe that DMTs and functional NMTs, both of which were introduced in [6], have a slightly different output mechanism, which is defined as a function \(\varOmega : 2^S \rightharpoonup X^*\) such that the output string \(\varOmega (S')\) is copyless and of the form \(x_1 \dots x_n\), for all \(S' \subseteq S\) for which \(\varOmega (S') \ne \bot \). Furthermore, there is the condition that if \(s, s' \in S'\) and \(a \in A\) s.t. \((u,s') \in \varDelta (s, a)\), then (1) \(u(x_k) = x_k\) for all \(k < n\) and (2) \(u(x_n) = x_n w\) for some \(w \in (X \cup B)^*\).

In contrast, our definition has a unique append-only output variable \(f \in X\). However, our model with the Muller acceptance is as expressive as that studied in [6]. One can use nondeterminism to guess a position in the input after which states in a Muller accepting set \(S'\) will be visited infinitely often. The output function can be defined by guessing a Muller set, and keeping an extra variable for the output. Upon making the guess, it will move the contents of \(x_1 \ldots x_n\) to the variable f and make a transition to a copy \(T_{S'}\) of the transducer where \(\mathsf {Acc}= \left\{ S' \right\} \). If any state outside the set \(S'\) is visited, or the variables \(x_1\ldots ,x_{n-1}\) are updated, or the variable f is assigned in non-appending fashion, then \(T_{S'}\) makes a transition to a rejecting sink state. Alur, Filiot, and Trivedi [6] showed the equivalence of functional NMT with DMT. This implies that the transductions definable using functional NMTs or functional NBTs (in our definition) are precisely those definable by \(\omega \)-DMSOT.

3 Equivalence of \(\omega \)-NMSOT and \(\omega \)-NSST

Alur and Deshmukh [5] showed that relations over finite strings definable by nondeterministic MSO transducers coincide with those definable by nondeterministic streaming string transducers. We generalize this result by proving that a relation is definable by an \(\omega \)-NMSOT if, and only if, it is definable by an \(\omega \)-NSST. We provide symmetric arguments to connect \(\omega \)-NSST, \(\omega \)-DSST and \(\omega \)-NMSOT, \(\omega \)-DMSOT, resulting in a simple proof.

Our arguments use the concept of a relabeling relation, following Engelfriet and Hoogeboom [20]. A relation \(\rho \subseteq A^\omega \times B^\omega \) is a relabeling, if there exists another relation \(\rho ' \subseteq A \times B\) such that \((aw, bv) \in \rho \) iff \((a,b) \in \rho '\) and \((w,v) \in \rho \). In other words, \(\rho \) is obtained by lifting the letter-to-letter relation \(\rho '\), in a straight-forward manner, to \(\omega \)-strings. Let \(\mathsf {Let}(\rho )\) denote the letter to letter relation \(\rho ' \subseteq A \times B \) corresponding to \(\rho \) and let RL be the set of all such relabelings.

Theorem 1

\(\omega \text {-NMSOT}{} = \omega \text {-NSST}{}\).

The proof of Theorem 1 proceeds in two stages. In the first part (Lemma 1), we show that every \(\omega \)-NSST is equivalent to the composition of a nondeterministic relabeling and a \(\omega \)-DSST. In the second part (Lemma 2), we show that every \(\omega \)-NMSOT is equivalent to the composition of a nondeterministic relabeling and a \(\omega \)-DMSOT. These two lemmas, in conjunction with the equivalence of DMTs and functional NMTs [6], allow us to equate these two models of transformation via a simple assignment.

Lemma 1

\(\omega \text {-NSST}{} = \omega \text {-DSST}{} \circ \text {RL}{}\)

Proof

We first show \(\omega \text {-DSST}{} \circ \text {RL}{} \subseteq \omega \text {-NSST}{}\) by proving that for every DMT \(T~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(B,C,S,I,\mathbb {F},\varDelta ,f,X,U)\) and nondeterministic relabeling \(\rho \subseteq A^\omega \times B^\omega \), there is a NMT \(T'~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(A,C,S,I,\mathbb {F},\varDelta _\rho ,f,X,U)\) such that \([\![T']\!] = [\![T]\!] \circ \rho \). As indicated by the tuple given to specify \(T'\), the only distinct components between the two machines are their input alphabets and their transition functions \(\varDelta \) and \(\varDelta _\rho \). The latter is given as \(\varDelta _\rho ~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}{}~(s, a) \mapsto \bigcup \limits _{(a, b) \in \mathsf {Let}(\rho )} \varDelta (s, b)\). The nondeterminism of \(\rho \) is therefore captured in \(\varDelta _\rho \). This results in a unique run through \(T'\), for every possible relabeling of inputs for T. Since the remaining pieces of T are untouched in the process of constructing \(T'\), it is clear that \([\![T']\!] = [\![T]\!] \circ \rho \).

What remains to be shown is the inclusion \(\omega \text {-NSST}{} \subseteq \omega \text {-DSST}{} \circ \text {RL}{}\): for any NMT \(T~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(A,B,S,I,\mathbb {F},\varDelta ,f,X,U)\), there exists a DMT \(T'\) and a nondeterministic relabeling \(\rho \) such that \([\![T]\!] = [\![T']\!] \circ \rho \). From T, we can construct a nondeterministic, letter-to-letter relation \(\rho ' \subseteq A \times (U \times S)\) as follows: \(\rho '~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}{}~\left\{ (a,(u,s')) : (u,s') \in \varDelta (s,a) \right\} \). Now let \(\rho \subseteq A^\omega \times (U \times S)^\omega \) be the extension of \(\rho '\) as described previously. The relation \(\rho \) contains the set of all possible runs through T for any possible input in \(A^\omega \).

Next, we construct a DMT \(T'~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(U \times S,B,S,I,\mathbb {F},\varDelta _\rho ,f,X,U)\) with transition function \(\varDelta _\rho ~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(s,(u,s')) \mapsto \left\{ (u,s') \,:\, (u,s') \in \varDelta (s,a) \text { for some } a\in A \right\} \). Consequently, \(T'\) retains only the pairs in \(\rho \) which correspond to valid runs T and encodes them as \(\omega \)-strings over the alphabet \(S \times U\). The DMT \(T'\) then simply follows the instructions encoded in its input and thereby simulates only legitimate runs through T. Thus, we may conclude that \([\![T]\!] = [\![T']\!] \circ \rho \).    \(\square \)

Lemma 2

\(\omega \text {-NMSOT}{} = \omega \text {-DMSOT}\circ \text {RL}{}\).

Proof

We begin by showing the inclusion \(\omega \text {-NMSOT}{} \subseteq \omega \text {-DMSOT}\circ \text {RL}{}\): for any \(\omega \)-NMSOT T, there exists an \(\omega \)-DMSOT \(T'\) and a relabeling \(\rho \) such that \([\![T]\!] = [\![T']\!] \circ \rho \). Nondeterministic choice in T is determined by the choice of assignment to free variables in \(\mathbf {X}_k\). Alternatively, the job of facilitating nondeterminism can be placed upon a relabeling relation, thereby allowing us to remove the parameter variables. Define a letter-to-letter relation \(\rho ' \subseteq A \times (A \times \left\{ 0,1 \right\} ^k)\) as follows: \(\rho '~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~\left\{ (a, (a,b)) : b \in \left\{ 0,1 \right\} ^k \right\} \), and let the relabeling \(\rho \subseteq A^\omega \times (A \times \left\{ 0,1 \right\} ^k)^\omega \) be its extension. This relabeling essentially gives us a new alphabet such that each symbol from A is tagged with encodings of its membership status for each set parameter from \(\mathbf {X}_k\). Now, we can construct an \(\omega \)-DMSOT \(T'\) that is identical to T, apart from two distinctions. Firstly, \(T'\) is deterministic (i.e. it has no free set variables), and every occurrence of a subformula \(x \in X_i\) in T is replaced by a subformula \(\bigvee \limits _{\begin{array}{c} b \in \left\{ 0,1 \right\} ^k \wedge b[i] = 1 \end{array}} (a, b)(x)\) in \(T'\). As a result of this encoding, the equality \([\![T]\!] = [\![T']\!] \circ \rho \) holds.

The converse inclusion, \(\omega \text {-DMSOT}\circ \text {RL}\subseteq \omega \text {-NMSOT}\), is much simpler. Every relabeling \(\rho \) in RL is \(\omega \)-NMSOT definable: consider \(\rho ' = \mathsf {Let}(\rho ) \subseteq A \times B\). The \(\omega \text {-NMSOT}\) specifying \(\rho \) is similar to identity/copy, except that here we have that the output label is b iff the input label is a and \((a, b) \in \rho '\). This can be implemented using second-order variables \(X_{b}\) for all \(b \in B\). Let \(\mathbf {X}_B\) represent this set. Only a single copy is required to produce the output. Node formulae are given by \(\phi ^1_b(x,\mathbf {X}_B)~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~\bigvee \limits _{a \in A} \bigvee \limits _{(a, b) \in \rho '} (a(x) \wedge x \in X_b)\), and the edge formulae by \(\psi ^{1,1}(x, y,\mathbf {X}_B)~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~x < y\). It is known that \(\omega \text {-NMSOT}{}\) are closed under composition [17]. Thus, we conclude that any composition of a nondeterministic relabeling and a \(\omega \)-DMSOT is definable by a \(\omega \)-NMSOT and that \(\omega \text {-MSOT}\circ \text {RL}\subseteq \omega \text {-NMSOT}\).    \(\square \)

In conjunction Lemmas 1 and 2 along with the results of [6] allow us to write the following equation, thereby proving Theorem 1.

$$\begin{aligned} \omega \text {-NMSOT}{} = \omega \text {-DMSOT}{} \circ \text {RL}{} = \text {DMT}{} \circ \text {RL}{} = \text {NMT}{} = \omega \text {-NSST}{} \end{aligned}$$

4 MSO-Definable Regular Model Checking

In this section, we explain how algorithms for deciding properties of regular relations can be used to perform regular model checking. Given two relations \(T_1\) and \(T_2\), their sequential composition is \([\![T_2 \circ T_1]\!]~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~\left\{ (x,z) : (x,y) \in [\![T_1]\!], (y,z)\right. \left. \in [\![T_2]\!]\right\} \). Let \(T^k\) denote the k-fold composition of a relation T with itself. Let \(T^*\) denote the transitive closure of T.

Suppose that \(\textsc {init}\) and \(\textsc {bad}\) are regular languages representing sets of states in some system that are initial, and unsafe, respectively. Given a generic transition relation T which captures the dynamics of the system, the regular model checking problem asks to decide whether any element of \(\textsc {bad}\) is reachable from any element of \(\textsc {init}\) via repeated applications of T. In precise terms, the regular model checking problem asks to decide whether the equation \([\![T^*]\!](\textsc {init}) \cap \textsc {bad}= \emptyset \) holds. Bounded model checking, in this setting, asks to decide, given \(n \in \mathbb {N}\), whether \([\![T^k]\!](\textsc {init}) \cap \textsc {bad}= \emptyset \) holds, for all \(k \le n\). Unbounded model checking is undecidable (cf. [19] for a proof), so we focus on bounded model checking.

When T is a rational relation, its image is always a regular language, and this permits the approach of iteratively applying T from \(\textsc {init}\) and checking whether this set intersects with \(\textsc {bad}\) by standard automata-theoretic methods. If T is a regular relation, its image may not be a regular language, and we must iteratively compute compositions of T with itself and test whether these compositions enter the \(\textsc {bad}\) language. To allow this, we establish decidability of the type checking problem for \(\omega \)-NSSTs: given two \(\omega \)-regular languages \(L_1,L_2\) and an \(\omega \)-NSST T, decide if the inclusions \(L_1 \subseteq \mathsf {dom}(T)\) and \([\![T]\!](L_1) \subseteq L_2\) hold.

Theorem 2

The type checking problem for \(\omega \)-NSSTs is decidable in Pspace.

Proof

Suppose that \(T~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(A, B, S, I, F, \varDelta , f, X, U)\) is an NBT and \(L_1 \subseteq A^\omega \) and \(L_2 \subseteq B^\omega \) are \(\omega \)-regular languages, encoded, respectively, as deterministic Muller automata (DMA) \(M_1\) and \(M_2\). We first check whether T is defined for all \(\omega \)-strings \(w \in L_1\), i.e. whether \(L_1 \subseteq \mathsf {dom}(T)\). A nondeterministic Büchi automaton (NBA) \(\mathcal {C}\) that recognizes the domain of T can be constructed in linear time by ignoring variables and output mechanism. The inclusion \(L_1 \subseteq \mathsf {dom}(T)\) can be decided in Pspace by checking emptiness of \(M_1' \cap \mathcal {C}\) where \(M_1'\) is the NBA equivalent to \(M_1\) and \(\mathcal {C}\) is the NBA representing the complement language of \(\mathsf {dom}(T)\). It is known that an NBA can be constructed from a DMA with exponential blowup in the number of states [11]. A complement automaton can be constructed for an NBA with exponential increase in the number of states as well [11]. Hence \(\mathcal {C}\) has exponentially many states relative to T and \(M_1\). Intersection of \(M_1'\) and \(\mathcal {C}\) is a standard product construction with a flag so that both \(M_1'\) and \(\mathcal {C}\) visit good states infinitely often. Thus the intersection NBA \(M_1' \cap \mathcal {C}\) has exponentially many states relative to T and \(M_1\). Thanks to the fact that emptiness of NBA can be checked in NLogSpace [11], the emptiness of this product automaton, can be decided in NPspace = Pspace.

We now assume that T is well-defined on \(L_1\) and construct a nondeterministic Muller automaton (NMA) \(\mathcal {A}\) such that the language of \(\mathcal {A}\) is defined as \(\left\{ w \in L_1 : \exists w' \in [\![T]\!](w) \text { s.t. } w' \not \in L_2 \right\} \). Next, we construct a DMA \(\overline{M_2}\) for \(\overline{L_2}\) by complementing the \(\mathsf {Acc}\) set. The automaton \(\mathcal {A}\) simulates \(M_1\), T and \(\overline{M_2}\) in parallel. Next, we construct an NMT \(T'\) corresponding to the NBT T in order to homogenize the acceptance condition accross these machines. Let us fix the definition for all three machines: (i) \(M_1~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(A, S_{1}, p_0, \mathbb {F}_{1}, \varDelta _{1})\), (ii) \(T'~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(A, B, S, I, \mathbb {F}', \varDelta , {f}, X, U)\), (iii) \(\overline{M_2}~{\mathop {=}\limits ^{\mathrm{{\tiny def}}}}~(B, S_{2}, r_{0}, \mathbb {F}_2, \varDelta _{2})\).

The NMA \(\mathcal {A}\) is defined as the product of \(M_1\) and \(T'\) (without the output mechanism), and it stores a state summary map—i.e. the effect of running current valuation of each variable starting from all states of \(\overline{M_2}\)—in each of its own states. Formally, the states of \(\mathcal {A}\) comprise a finite subset of \(S_1 \times S \times \left( S_{2} \times X \rightarrow S_2 \cup \left\{ \bot \right\} \right) \). A state (qpg) with \(g(r, x) = r'\) represents that, starting from state r, if we read the current value of variable x, then we reach state \(r'\). If \(g(r, x) = \bot \), it indicates that there is no run on valuation of x starting from r. This information can be updated along the run of \(\mathcal {A}\). For instance, if a transition of T updates x as aybx, then the summary map g is updated to \(g'\) such that \(g'(r, x) = g(\varDelta _2(g(\varDelta _2(r, a), y), b), x)\), and summarizes the effect of reading \(x = aybx\) in \(\overline{M_2}\) starting from state r.

The set of states of \(\mathcal {A}\) is \(S_{\mathcal {A}} = S_{1} \times S \times \left( S_2 \times X \rightarrow S_2 \cup \left\{ \perp \right\} \right) \), in which \(S_1\), S, and \(S_2\) represent the state sets of \(M_1\), \(T'\), and \(\overline{M_2}\), respectively. The transition relation \(\varDelta _\mathcal {A}\) is defined such that \((q', p', g') \in \varDelta _{\mathcal {A}}((q, p, g), a)\) iff (i) \(\varDelta _1(q, a) = q'\), (ii) \((u, p') \in \varDelta _{1}(p, a)\), and (iii) \(g'(r, x) = r'\) and \(\varDelta _2(r, \mathsf {val}_{u(x)}) = r'\), for all \(x \in X\) and \(r \in S_2\),. Initial states are the product of initial states i.e. a set \(I_{\mathcal {A}} = \{(q_0, p_0, r_0) : q_0 \in I \}\). The Muller accepting set of \(\mathcal {A}\) is defined as the collection of all \(P \subseteq S_{\mathcal {A}}\) such that (i) \(\pi _1(P) \in \mathbb {F}_{1}\), (ii) \(\pi _2(P) \in \mathbb {F}\), and (iii) \((\pi _3(P))(r_0, {f}) \in \mathbb {F}_{2}\), where \(\pi _i\) is the \(i^{th}\) projection. The size of \(\text {NMA}{}\) \(\mathcal {A}\) is exponential in the number variables of T, polynomial in the number of states of \(M_1\) and T. Thanks to the fact that emptiness of an NMA can be determined in NLogSpace [11], emptiness of \(\mathcal {A}\) having exponential states in the inputs T, \(M_1\) and \(M_2\), can be decided in NPspace and thus, by Savitch’s theorem, also in Pspace.    \(\square \)

Since regular relations are definable in MSO, they are closed under sequential composition. In combination with Theorems 1 and 2, this establishes the necessary conditions for bounded regular model checking with regular relations to be possible. Thus, we have the following corollary.

Corollary 1

Bounded model checking with regular relations is decidable.

Despite the fact that unbounded regular model checking is undecidable, bounded regular model checking provides a refutation procedure. That is, it allows us to search for a witness for proving the system unsafe. Unfortunately, we cannot use bounded model checking of this kind to decide if the system does satisfy the desired property. On the other hand, we identify several special cases of the problem which permit the safety of the system to be verified in finite time. In general, we assume that \(\textsc {init}\subseteq \overline{\textsc {bad}}\), where \(\overline{\textsc {bad}}\) is the complement of \(\textsc {bad}\).

Functional Fixed Points. The first instance applies when T is functional, i.e. \([\![T]\!]\) is a function, and relies on the following result of Alur, Filiot, and Trivedi [6].

Theorem 3

Given an \(\omega \)-NSST T, it is decidable if \([\![T]\!]\) is a function. Given a pair of functional \(\omega \)-NSSTs \(T_1\) and \(T_2\), it is decidable if \([\![T_1]\!] = [\![T_2]\!]\).

At every step of the bounded regular model checking procedure, one can check if \(T^k\) is functional, if \(T^{k+1}\) is functional, and if \([\![T^k]\!] = [\![T^{k+1}]\!]\). If these three conditions hold, then, for all \(m \ge 0\), we have that \([\![T^k]\!] = [\![T^{k+m}]\!]\). When this occurs and \([\![T^k]\!](\textsc {init}) \subseteq \overline{\textsc {bad}}\) holds, it follows that \([\![T^k]\!] = [\![T^*]\!]\) and therefore that \([\![T^*]\!](\textsc {init}) \subseteq \overline{\textsc {bad}}\) which implies \([\![T^*]\!](\textsc {init}) \cap \textsc {bad}= \emptyset \). Note that \(T^k\) can be functional even when T is not. To see this, consider a non-functional \(\omega \)-NSST T such that \([\![T]\!](a^\omega ) = \left\{ b^\omega , c^\omega \right\} \), and \([\![T]\!](b^\omega ) = d^\omega = [\![T]\!](c^\omega )\). If \(a^\omega \in \textsc {init}\) and \(|[\![T]\!](w)| = 1\) for every other input w and \(a^\omega \notin \mathsf {im}(T)\), then \(T^2\) is functional.

Inductive Invariants. An alternative approach involves showing that \([\![T]\!]\) satisfies some inductive invariant. Select, as a candidate invariant, a regular or \(\omega \)-regular language L which is contained in the set of safe states \(L \subseteq \overline{\textsc {bad}}\). Now, L provides a witness to the unbounded safety of the system if the following pair of conditions are met: (i) \(\textsc {init}\subseteq L\) and (ii) \([\![T]\!](L) \subseteq L\). Together, (i) and (ii) imply that \([\![T^*]\!](\textsc {init}) \subseteq L\), and in combination with the assumption that \(L \subseteq \overline{\textsc {bad}}\) this yields that \([\![T^*]\!](\textsc {init}) \cap \textsc {bad}= \emptyset \). The necessary inclusions can be formulated as instances of the type checking problem, and so, given an appropriately chosen inductive invariant in the form of an \(\omega \)-regular language, the global safety of such a system may be verified in polynomial space. This method is easily generalized by searching for k-inductive invariants: \(\omega \)-regular languages for which there is a \(k \in \mathbb {N}\) such that \([\![T^k]\!](L) \subseteq L\). The k-inductive approach complements bounded regular model checking, since, for a given k, bounded regular model checking lets us decide if the system is safe for up to k transitions while k-induction lets us decide if it is safe after at least k transitions.

Fig. 5.
figure 5

An \(\omega \text {-SST}{}\) squaring a number with binary expansion of the form \(1^n0^\omega \). The output at \(s_1\) and \(s_2\) is x. Notice that this function can not be expressed as a GSM.

5 Conclusion

We introduced \(\omega \)-NSSTs as a computational model for regular relations over infinite strings, and showed that the relations definable by \(\omega \)-NSST coincide exactly with those definable in MSO. Motivated by potential applications in formal verification, we studied algorithmic properties of these objects and established the minimal theoretical results required for bounded regular model checking to be possible with regular transition relations.

Regular functions and relations provide an intriguing class of models for real valued functions, see Fig. 5 for example. In [15, 21] analytic properties such as continuity and differentiability of real functions encoded by \(\omega \)-automata have been studied. Extending this line of research by going beyond standard \(\omega \)-automata is both theoretically interesting and could be leveraged towards applications involving verification and control of dynamical systems. The present work indicates the viability of generalizing the automata-theoretic approach to modeling real functions. With this application in mind, it would be worthwhile to study the approximation techniques developed for traditional regular model checking to see if they generalize to handle regular relations.