1 Introduction

Model checking is a widely used technique for the verification of hardware and software systems, scaling from case studies in academic publications to real systems in industry; the importance of model checking was recognized by the 2007 Turing Award, given to Edmund M. Clarke, E. Allen Emerson, and Joseph Sifakis for “their roles in developing model checking into a highly effective verification technology, widely adopted in the hardware and software industries”.

Large systems are usually obtained by developing several small components that interact concurrently with each other so as to globally achieve the desired functionality. The main obstacle in applying model checking to concurrent systems is the well-known state explosion problem [31]: the number of global states of such systems can be enormous, namely of the form \(n^{p}\), where p is the number of processes and n is the number of states of each process. Several approaches have been proposed in the literature to combat the state explosion problem, such as symbolic model checking based on BDDs [78], bounded model checking [18], and learning-based compositional verification [34]. The latter approach tries to learn models of the individual components that are smaller than the original processes while preserving their behavior. The learning algorithm used in [34] is the well-known \(\textsf {L}^{*}\) algorithm proposed by Dana Angluin [8], which learns deterministic finite automata (DFAs).

Automata learning algorithms have received significant attention from the verification community in the past two decades. Besides being used to improve the efficiency and scalability of compositional verification, automata learning has also been successfully applied to other aspects of verification: among others, it has been used to automatically generate interface models of computer programs [7], to learn a model of the traces of system errors for diagnosis purposes [27], to find bugs in implementations of network protocols [88], to extract behavior models of programs for statistical program analysis [29], and to perform model-based testing and verification [81, 107]. In 2017, Frits Vaandrager [103] surveyed the concept of model learning underlying the above applications.

In order to be of practical use, learning algorithms have to be computationally efficient and easily adaptable to different learning scenarios. On the one hand, to cope with increasingly complex tasks, researchers have proposed several optimizations to improve the efficiency of finite automata learning, such as learning algorithms based on classification trees [59, 63], efficient counterexample analysis for learning algorithms [87], the learning algorithm \(\textsf {NL}^{*}\) for nondeterministic finite automata (NFAs) [20], and learning algorithms for alternating finite automata [10].

On the other hand, driven by the demands of different verification tasks, researchers have also developed and applied learning algorithms for richer models. For example, there are learning algorithms for I/O automata [2], event-recording automata [50], register automata [57, 58], timed systems [75], probabilistic systems [43], and nominal automata [79]. In particular, van Heerdt et al. [53] proposed an automata learning framework based on category theory which unifies the learning of several automata models, including DFAs and weighted automata.

However, the aforementioned learning algorithms are all designed for automata accepting finite words; such automata are used to model the finite behaviors of systems, which are usually characterized by safety properties expected to hold. For instance, one can use a DFA to recognize all possible bad behaviors, i.e., behaviors leading in a finite number of steps to a state violating a safety property. For characterizing the infinite behaviors of systems, generally corresponding to liveness properties, automata accepting infinite words are used instead.

In his seminal work [25], Büchi introduced automata accepting infinite words to prove the decidability of a restricted monadic second order (MSO) logic; such automata are now widely known as Büchi automata (BAs). A Büchi automaton has the same structure as an NFA, except that it operates on infinite words: instead of accepting a finite word whose run ends in an accepting state, a BA accepts an infinite word whose run visits an accepting state infinitely often. Büchi automata are nowadays very popular in the model checking field, in particular when the specification is given as a linear temporal logic (LTL) formula; see the introductory paper by Vardi [104] on the use of BAs for LTL analysis.

Besides being used in LTL verification and synthesis, Büchi automata have also been used as a standard model to describe the liveness properties of distributed systems [6]. Therefore, in order to verify whether a concurrent system satisfies a liveness property, one can model every process of the system as a Büchi automaton. It follows that if one can learn smaller Büchi automata for the processes, then performing compositional verification on the given concurrent system can become less expensive, similarly to the DFA case. Motivated by this, Farzan et al. presented in [42] the first learning algorithm for the complete class of \(\omega \)-regular languages represented as Büchi automata; the algorithm is able to automatically extract a Büchi automaton as an assumption from a component of a concurrent system for compositional verification. Note that already in 1995 Maler and Pnueli [76] introduced the first learning algorithm for Büchi automata, but it learns Büchi automata accepting only a proper subset of the \(\omega \)-regular languages. In 2014, Angluin and Fisman proposed in [11] a learning algorithm for \(\omega \)-regular languages by means of a formalism called a family of DFAs (FDFAs). Later, Li et al. [73] proposed to use classification trees to learn FDFAs rather than the observation tables used by Angluin and Fisman. Learning algorithms based on classification trees usually need less runtime memory and can be much more efficient than their observation table based counterparts [59]. Furthermore, Li et al. presented in [73] a learning algorithm for Büchi automata, based on FDFAs and classification trees, that is more efficient than the one in [42].

There are already a few learning algorithms for Büchi automata available in the literature, yet they are not widely used in the model checking community. One reason for this is that the learning algorithms are technically demanding and not so easy to follow and understand; in this paper we give a simple presentation of one BA learning algorithm, with simple but complete examples, to introduce the reader to such a learning framework. Another reason is that there are fewer learning libraries available for Büchi automata than for automata accepting finite words: for the latter, there are robust and publicly available libraries such as libalf [21] and LearnLib [60]. To the best of our knowledge, there is only one publicly available library for learning Büchi automata, named ROLL [73], which also implements the BA learning algorithm described in this paper.

In this paper, we review the BA learning algorithm proposed in [73] by learning the simple \(\omega \)-regular language \((ab)^{\omega }\): besides giving the reader an overview of the algorithm, this guides them step by step through how the algorithm works. Our main goal in this work is to give an intuitive explanation of the different learning algorithms for both regular and \(\omega \)-regular languages, so that the reader can grasp the ideas underlying the learning algorithms before getting involved in their formalism, as presented in the related literature. We achieve this by means of examples carefully chosen to be simple while still exposing the different challenges that learning algorithms for Büchi automata face and the solution techniques that have been adopted.

Further, we discuss two interesting applications of the BA learning algorithm. The complementation problem for Büchi automata is a challenging problem in the research community, both in theory and in practice. We show that the BA learning algorithm can be easily applied to complement Büchi automata; experimental results show that the learning-based complementation algorithm can, in some cases, yield much smaller complement automata than classical algorithms. Lastly, we discuss how the learning algorithm can also be applied to proving the termination of C programs. Heizmann et al. [55] proposed a novel termination analysis algorithm based on Büchi automata; interestingly, the efficiency and scalability of this termination analysis highly depend on obtaining small complement automata of Büchi automata, for which one can naturally use the learning-based complementation algorithm.

Organization of the Paper. We first set up notions and notation for this work in Sect. 2. We then introduce some basic operations on Büchi automata in Sect. 3, together with their complexity analysis, before turning to the learning algorithms in the following sections. In order to ease the presentation, we first present the learning algorithm for DFAs in Sect. 4 and then move on to the learning of Büchi automata in Sect. 5. After that, we show how to apply our learning algorithm to the complementation problem of Büchi automata in Sect. 6. Before concluding the paper in Sect. 8, we consider the application of the BA learning algorithm to program termination analysis in Sect. 7.

2 Preliminaries

Let X and Y be two sets; we use \(X \ominus Y\) to denote their symmetric difference, i.e., the set \((X \setminus Y) \cup (Y \setminus X)\). We use \([i \cdots j]\) to denote the set \(\{ i, i+1, \cdots , j \}\).

Let \(\varSigma \) denote a finite non-empty set of letters called alphabet. A word is a finite or infinite sequence \(w = w_{1} w_{2} \cdots \) of letters in \(\varSigma \); we denote by \(|w|\) the length of the word w, i.e., the number of letters in w. If w is infinite, then \(|w| = \infty \), and we call it an \(\omega \)-word. We use \(\varepsilon \) to denote the word of length 0, i.e., the empty word. We denote by \(\varSigma ^{*}\) and \(\varSigma ^{\omega }\) the sets of all finite and infinite words, respectively. Moreover, we use \(\varSigma ^{+}\) to represent the set \(\varSigma ^{*}\setminus \{ \varepsilon \}\).

We denote by \(w{[i]}\) the i-th letter of a word w. We use \(w {[i..k]}\) to denote the subword of w starting at the i-th letter and ending at the k-th letter, inclusive, when \(i\le k\) and the empty word \(\varepsilon \) when \(i > k\). For \(u \in \varSigma ^{*}\), we denote by \(\text {Pref}(u)\) the set of its prefixes, i.e., \(\text {Pref}(u) = \{ \varepsilon , u{[1]}, u {[1..2]}, \cdots , u {[1..|u|]} \}\). Similarly, we denote by \(\text {Suf}(u)\) the set of its suffixes, i.e., \(\text {Suf}(u) = \{ u {[1..|u|]}, u {[2..|u|]}, \cdots , u{[|u|]}, \varepsilon \}\). Given a finite word \(u = u_{1} \cdots u_{k}\) and a word w, we denote by \(u \cdot w\) the concatenation of u and w, i.e., the finite or infinite word \(u \cdot w = u_{1} \cdots u_{k} w_{1} \cdots \). We may just write uw instead of \(u \cdot w\).

Definition 1

An acceptor automaton is a tuple \(A= (\varSigma , Q, \bar{q}, \delta , F)\) consisting of the following components: a finite alphabet \(\varSigma \), a finite set \(Q\) of states, an initial state \(\bar{q} \in Q\), a transition relation \(\delta \subseteq Q\times \varSigma \times Q\), and an accepting condition F.

For convenience, we also use \(\delta (q, a)\) to denote the set \(\{\, q' \in Q \mid (q, a, q') \in \delta \,\}\).

In the remainder of the paper, we assume that all automata share the same alphabet \(\varSigma \), which we may omit from their definitions.

A run of an acceptor automaton on a finite word \(v = a_{1} a_{2} a_{3} \cdots a_{n}\), \(n \ge 1\), is a sequence of states \(q_{0}, q_{1}, \cdots , q_{n}\) such that \(q_{0} = \bar{q}\) and \((q_{i}, a_{i+1}, q_{i+1}) \in \delta \) for every \(0 \le i < n\); similarly, a run of an acceptor automaton on an infinite word \(w = a_{1} a_{2} a_{3} \cdots \) is a sequence of states \(q_{0}, q_{1}, \ldots \) such that \(q_{0} = \bar{q}\) and \((q_{i}, a_{i+1}, q_{i+1}) \in \delta \) for each \(i \in \mathbb {N}\). The run on a word is accepting if it satisfies the accepting condition F. A word is accepted by an acceptor automaton \(A\) if \(A\) has an accepting run on it.

A finite language is a subset of \(\varSigma ^{*}\) while an \(\omega \)-language is a subset of \(\varSigma ^{\omega }\); the language of an acceptor automaton \(A\), denoted by \(\mathcal {L}(A)\), is the set \(\{\, u \in \varSigma ^{*}\cup \varSigma ^{\omega } \mid u \text { is accepted by } A\,\}\).

A deterministic acceptor automaton is an acceptor automaton such that \(|\delta (q, a)| \le 1\) for any \(q \in Q\) and \(a \in \varSigma \). For deterministic acceptor automata, we may write \(\delta (q, a) = q'\) instead of \(\delta (q, a) = \{ q' \}\). The transition relation of a deterministic acceptor automaton can be lifted to finite words by defining \(\delta (q, \varepsilon ) = q\) and \(\delta (q,av) = \delta (\delta (q,a), v)\) for each \(q \in Q\), \(a \in \varSigma \), and \(v \in \varSigma ^{*}\). We also use \(A(v)\) as a shorthand for \(\delta (\bar{q}, v)\).

A finite automaton (FA) is an acceptor automaton where \(F \subseteq Q\) and a finite word v is accepted if there is a run \(q_{0}, q_{1}, \cdots , q_{n}\) on v such that \(q_{n} \in F\); no infinite word is accepted. A deterministic finite automaton (DFA) is an FA which is also a deterministic acceptor automaton. A complement DFA \(A^{\mathcal {C}}\) of a DFA \(A\) is a DFA such that \(\mathcal {L}(A^{\mathcal {C}}) = \varSigma ^{*}\setminus \mathcal {L}(A)\). Complementing a DFA is easy: it is enough to add an accepting sink state collecting all missing transitions and to complement the original set of accepting states. Let \(A\) and \(B\) be two FAs; one can construct a product FA, denoted by \(A \times B\), accepting the language \(\mathcal {L}(A) \cap \mathcal {L}(B)\) using a standard product construction; see, e.g., [56].
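This complementation recipe is easy to make concrete. The following Python sketch is our own illustration (not a construction taken from the cited literature); the encoding of a DFA as a transition dictionary is an assumption made only for this example.

def complement_dfa(states, init, delta, accepting, alphabet):
    """Complement a DFA given as (states, init, delta, accepting).
    delta is a (possibly partial) map (state, letter) -> state; missing
    transitions are redirected to a fresh non-accepting sink state, and
    the set of accepting states is complemented afterwards."""
    sink = "__sink__"                              # assumed not to be an existing state
    all_states = set(states) | {sink}
    total_delta = dict(delta)
    for q in all_states:
        for a in alphabet:
            total_delta.setdefault((q, a), sink)   # complete the transition map
    return all_states, init, total_delta, all_states - set(accepting)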

A Büchi automaton (BA) is an acceptor automaton where \(F \subseteq Q\) and an infinite word w is accepted if there is a run \(\rho = q_{0}, q_{1}, \ldots \) on w such that for each \(i \in \mathbb {N}\), there exists \(j > i\) such that \(\rho {[j]} \in F\); no finite word is accepted. Intuitively, an infinite word w is accepted by a BA if there exists a run on w visiting at least one accepting state in F infinitely often. A deterministic Büchi automaton (DBA) is a BA which is also a deterministic acceptor automaton.

A BA is a limit deterministic Büchi automaton (LDBA) if its set of states \(Q\) can be partitioned into two disjoint sets \(Q_{N}\) and \(Q_{D}\), such that (1) \(\delta (q, a) \subseteq Q_{D} \) and \(|\delta (q, a)| \le 1\) for each \(q \in Q_{D}\) and \(a \in \varSigma \), and (2) \(F \subseteq Q_{D}\). It is trivial to note that each DBA is also an LDBA, by taking \(Q_{N} = \emptyset \) and \(Q_{D} = Q\).

Fig. 1. Examples of Büchi automata and their accepted languages

Example 1

As examples of Büchi automata, consider the two automata shown in Fig. 1. The automaton \(A\) is a DBA with alphabet \(\varSigma = \{ a, b \}\), set of states \(Q= \{ q_{0}, q_{1} \}\), initial state \(\bar{q} = q_{0}\) (marked by the small incoming arrow), transition relation \(\delta = \{ (q_{0}, a, q_{1}), (q_{0}, b, q_{0}), (q_{1}, a, q_{1}), (q_{1}, b, q_{0}) \}\), and \(F = \{ q_{1} \}\) (denoted as a double-circled state). The language accepted by \(A\) is the \(\omega \)-regular language \(\mathcal {L}(A) = \{\, w \in \varSigma ^{\omega } \mid w\) has infinitely many a’s}.

The automaton \(B\) is a nondeterministic BA that accepts the language \(\mathcal {L}(B) = \{\, w \in \varSigma ^{\omega } \mid w \text { has finitely many } a\)’s}, which is the complement of \(\mathcal {L}(A)\). Note that \(B\) is also a limit deterministic Büchi automaton, where the corresponding partition of \(Q\) is given by \(Q_{N} = \{ q_{0} \}\) and \(Q_{D} = \{ q_{1} \}\).

We call the language of an FA a regular language. An \(\omega \)-language \(L\subseteq \varSigma ^{\omega }\) is \(\omega \)-regular if there exists a BA \(A\) such that \(L= \mathcal {L}(A)\). Words of the form \(uv^{\omega }\), where \(u \in \varSigma ^{*}\) and \(v \in \varSigma ^{+}\), are called ultimately periodic words. We use a pair of finite words (u, v) to denote the ultimately periodic word \(w = uv^{\omega }\). We also call (u, v) a decomposition of w; note that an ultimately periodic word can have several decompositions: for instance (u, v), (uv, v), and (u, vv) are all decompositions of \(uv^{\omega }\). For an \(\omega \)-language \(L\), let \(\text {UP}(L) = \{\, uv^{\omega } \in L \mid u \in \varSigma ^{*}, v \in \varSigma ^{+} \,\}\) denote the set of all ultimately periodic words in \(L\). Note that the set of ultimately periodic words of an \(\omega \)-regular language \(L\) can be seen as the fingerprint of \(L\), as stated by the following theorem.

Theorem 1

(Ultimately Periodic Words of \(\omega \)-Regular Languages [25, 26]). (1) Every non-empty \(\omega \)-regular language \(L\) contains at least one ultimately periodic word. (2) Let \(L\), \(L'\) be two \(\omega \)-regular languages. Then \(L= L'\) if and only if \(\text {UP}(L) = \text {UP}(L')\).

We refer the interested reader to [25, 26] for the proof of Theorem 1. An immediate consequence of Theorem 1 is that, for any two \(\omega \)-regular languages \(L\) and \(L'\), if \(L\ne L'\) then there must exist some ultimately periodic word \(uv^{\omega } \in \text {UP}(L) \ominus \text {UP}(L')\).
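Such a distinguishing ultimately periodic word can be checked mechanically, at least against a deterministic BA. The following Python sketch is our own illustration: it decides whether \(uv^{\omega }\), given as a decomposition (u, v), is accepted by a deterministic BA, exploiting the fact that the states reached at the boundaries of the v-blocks must eventually repeat; the dictionary encoding of transitions, used here for the automaton \(A\) of Example 1, is an assumption of this example.

def dba_accepts(delta, init, accepting, u, v):
    """Decide whether the deterministic BA (delta, init, accepting)
    accepts the ultimately periodic word u v^omega;
    delta maps (state, letter) -> state and v must be non-empty."""
    def run(q, word):
        seen_acc = False
        for a in word:
            q = delta[(q, a)]
            seen_acc = seen_acc or q in accepting
        return q, seen_acc

    q, _ = run(init, u)              # read the finite prefix u
    boundary = {}                    # state at a v-block boundary -> block index
    hits = []                        # did block i visit an accepting state?
    while q not in boundary:
        boundary[q] = len(hits)
        q, acc = run(q, v)
        hits.append(acc)
    return any(hits[boundary[q]:])   # the run loops over exactly these v-blocks

# The DBA A of Fig. 1 (words with infinitely many a's), encoded as assumed above:
delta_A = {("q0", "a"): "q1", ("q0", "b"): "q0",
           ("q1", "a"): "q1", ("q1", "b"): "q0"}
print(dba_accepts(delta_A, "q0", {"q1"}, "", "ab"))   # True:  (ab)^omega
print(dba_accepts(delta_A, "q0", {"q1"}, "a", "b"))   # False: a b^omega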

3 Operations on Büchi Automata

In this section we present how nondeterministic Büchi automata support the standard set operations on their languages, namely, union, intersection, and complementation, as well as derived operations and decision problems. The main result is that nondeterministic Büchi automata are closed under such operations, e.g., given two Büchi automata \(A_{0}\) and \(A_{1}\), we can construct another Büchi automaton \(A\) such that \(\mathcal {L}(A) = \mathcal {L}(A_{0}) \cap \mathcal {L}(A_{1})\). Deterministic Büchi automata, however, are strictly less expressive than nondeterministic ones, since there are \(\omega \)-regular languages accepted by a nondeterministic BA for which there does not exist a deterministic BA accepting them; DBAs are also not closed under complementation, i.e., there is a DBA whose complement language can only be accepted by a nondeterministic BA.

There are several resources available in the literature for readers interested in more details on \(\omega \)-languages and their automata; see, e.g., [30, 32, 49, 62, 94, 96, 97].

3.1 Union of Büchi Automata

Given two Büchi automata \(A_{0}\) and \(A_{1}\), it is rather easy to construct a Büchi automaton \(A_{0 \cup 1}\) such that \(\mathcal {L}(A_{0 \cup 1}) = \mathcal {L}(A_{0}) \cup \mathcal {L}(A_{1})\). In fact, since by the definition of the language of a Büchi automaton a word w belongs to the language if there exists an accepting run on w, it is enough to create an automaton having all runs of \(A_{0}\) and \(A_{1}\): this can be easily achieved by just considering \(A_{0}\) and \(A_{1}\) as a single automaton, up to some minor adaptation of the initial state.

Proposition 1

Given two Büchi automata \(A_{0} = (Q_{0}, \bar{q}_{0}, \delta _{0}, F_{0})\) and \(A_{1} = (Q_{1}, \bar{q}_{1}, \delta _{1}, F_{1})\) such that \(Q_{0} \cap Q_{1} = \emptyset \), let \(A_{0 \cup 1} = (Q, \bar{q}, \delta , F)\) be the Büchi automaton whose components are defined as follows:

  • \(Q= Q_{0} \cup Q_{1} \cup \{ \bar{q} \}\) where \(\bar{q}\) is a fresh state such that \(\bar{q} \notin Q_{0} \cup Q_{1}\),

  • \(\delta = \delta _{0} \cup \delta _{1} \cup \{\, (\bar{q}, a, q_{0}) \mid q_{0} \in \delta _{0}(\bar{q}_{0}, a) \,\} \cup \{\, (\bar{q}, a, q_{1}) \mid q_{1} \in \delta _{1}(\bar{q}_{1}, a) \,\}\), and

  • \(F = F_{0} \cup F_{1}\).

Then, \(\mathcal {L}(A_{0 \cup 1}) = \mathcal {L}(A_{0}) \cup \mathcal {L}(A_{1})\) with \(|Q| = |Q_{0}| + |Q_{1}| + 1\).

The proof of the above proposition is rather trivial: given an \(\omega \)-word w, except for the initial state \(\bar{q}\), a run on w of the automaton \(A_{0 \cup 1}\) is identical to a run on w of either \(A_{0}\) or \(A_{1}\).

Note that the requirement that \(A_{0}\) and \(A_{1}\) must have disjoint sets of states can be easily fulfilled by simply renaming their states, since the actual state names play no role in accepting a word. Moreover, if we had allowed a set of initial states instead of a single initial state, then the union automaton would simply be the component-wise union of the two given Büchi automata.
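The construction of Proposition 1 translates almost literally into code. In the following Python sketch, a BA is represented as a tuple (states, init, delta, accepting) with delta a set of triples; this encoding, and the choice of the fresh initial state, are assumptions made only for this illustration.

def ba_union(ba0, ba1):
    """Union of two BAs as in Proposition 1; the state sets of the two
    input automata are assumed to be disjoint."""
    (Q0, q0bar, d0, F0), (Q1, q1bar, d1, F1) = ba0, ba1
    qbar = ("union-init",)                     # fresh initial state, assumed not in Q0 or Q1
    delta = set(d0) | set(d1)
    delta |= {(qbar, a, q) for (p, a, q) in d0 if p == q0bar}   # copy the initial moves of A0
    delta |= {(qbar, a, q) for (p, a, q) in d1 if p == q1bar}   # copy the initial moves of A1
    return Q0 | Q1 | {qbar}, qbar, delta, F0 | F1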

3.2 Intersection of Büchi Automata

The construction of a Büchi automaton accepting the intersection of the languages of \(A_{0}\) and \(A_{1}\) is slightly more involved than the one for their union. The main idea underlying the intersection construction is to run both \(A_{0}\) and \(A_{1}\) in parallel on the input word, by means of a product construction similar to the one for the intersection of finite automata; as accepting condition, we require that the accepting states of \(A_{0}\) and \(A_{1}\) are reached in an alternating mode, i.e., every time we reach an accepting state in \(A_{c}\) for \(c \in \{ 0,1 \}\), we then have to reach an accepting state in \(A_{1 - c}\). If we can alternate infinitely often, then both automata accept the input word, i.e., it is in the intersection of their languages; if we alternate only finitely often, this means that the BA where we get stuck does not accept the word, so the intersection automaton must reject it as well. This is different from the accepting condition for finite automata, where a product state is accepting if both states in the pair are accepting: in fact, for infinite words it does not matter whether the two BAs reach an accepting state at exactly the same moment, since it can also be the case that both automata accept an \(\omega \)-word w but \(A_{0}\) reaches an accepting state only once every ten times \(A_{1}\) has reached an accepting state.

Proposition 2

Given two Büchi automata \(A_{0} = (Q_{0}, \bar{q}_{0}, \delta _{0}, F_{0})\) and \(A_{1} = (Q_{1}, \bar{q}_{1}, \delta _{1}, F_{1})\), let \(A_{0 \cap 1} = (Q, \bar{q}, \delta , F)\) be the Büchi automaton whose components are defined as follows:

  • \(Q= Q_{0} \times Q_{1} \times \{ 0,1 \}\);

  • \(\bar{q} = (\bar{q}_{0}, \bar{q}_{1}, 0)\);

  • \(\delta = \{\, ((q_{0}, q_{1}, c), a, (q'_{0}, q'_{1}, next (q_{0}, q_{1}, c))) \mid q'_{0} \in \delta _{0}(q_{0},a), q'_{1} \in \delta _{1}(q_{1},a) \,\}\) where \( next :Q_{0} \times Q_{1} \times \{ 0,1 \} \rightarrow \{ 0,1 \}\) is defined as

    $$ next (q_{0}, q_{1}, c) = {\left\{ \begin{array}{ll} 1 - c &{} \text { if } q_{c} \in F_{c}, \\ c &{} \text { otherwise; } \end{array}\right. } $$
  • \(F = F_{0} \times Q_{1} \times \{ 0 \}\).

Then, \(\mathcal {L}(A_{0 \cap 1}) = \mathcal {L}(A_{0}) \cap \mathcal {L}(A_{1})\) with \(|Q| = 2 \cdot |Q_{0}| \cdot |Q_{1}|\).

The above construction is based on the transformation of generalized Büchi automata into Büchi automata. Generalized BAs differ from BAs only in the fact that they have multiple accepting sets; an \(\omega \)-word w is accepted if there exists a run on w reaching a state in each accepting set infinitely often. Since generalized BAs have the same expressive power as ordinary BAs and are not used in this work, we refer the interested reader to, e.g., [32] for more details.
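A direct implementation of the product of Proposition 2 is equally short. The Python sketch below keeps the alternation bit as the third component of each product state and uses the same tuple encoding of BAs assumed in the union sketch above.

def ba_intersection(ba0, ba1, alphabet):
    """Intersection of two BAs as in Proposition 2."""
    (Q0, q0bar, d0, F0), (Q1, q1bar, d1, F1) = ba0, ba1

    def nxt(q0, q1, c):
        # switch the bit as soon as the currently tracked automaton visits an accepting state
        return 1 - c if (q0 in F0 if c == 0 else q1 in F1) else c

    states = {(q0, q1, c) for q0 in Q0 for q1 in Q1 for c in (0, 1)}
    delta = set()
    for a in alphabet:
        succ0 = {q: {p for (r, b, p) in d0 if r == q and b == a} for q in Q0}
        succ1 = {q: {p for (r, b, p) in d1 if r == q and b == a} for q in Q1}
        for (q0, q1, c) in states:
            for p0 in succ0[q0]:
                for p1 in succ1[q1]:
                    delta.add(((q0, q1, c), a, (p0, p1, nxt(q0, q1, c))))
    accepting = {(q0, q1, 0) for q0 in F0 for q1 in Q1}   # F = F0 x Q1 x {0}
    return states, (q0bar, q1bar, 0), delta, accepting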

3.3 Complementation of Büchi Automata

Complementing Büchi automata is the most difficult operation on their languages. First of all, the usual approach used for nondeterministic finite automata, namely converting them into equivalent deterministic finite automata via the subset construction and then easily complementing the resulting DFAs, cannot be adapted to Büchi automata, since DBAs are strictly less expressive than BAs:

Proposition 3

(cf. [68]). There exists an \(\omega \)-regular language \(L\) that is recognizable by a BA but not by a DBA.

This means that for such a language \(L\), we can find a BA \(A\) such that \(\mathcal {L}(A) = L\) but there does not exist a DBA \(D\) such that \(\mathcal {L}(D) = L\). As a consequence, applying a subset construction to \(A\) does not lead to a DBA accepting the same language.

Note that the language witnessing the correctness of the above result is rather simple: \(L= \varSigma ^{*}\cdot b^{\omega }\), that is, \(L\) is the language of all words in which only b occurs infinitely often. For \(\varSigma = \{ a, b \}\), this language is recognized by the BA \(B\) shown in Fig. 1; its complement, i.e., the language whose words contain infinitely many a’s, is easily recognized by the DBA \(A\) also shown in Fig. 1. This means that DBAs are not closed under complementation, while BAs are indeed closed, as witnessed by the several complementation algorithms that have been proposed in the literature.

Before presenting such algorithms, we want to introduce the main result about the complexity of complementing Büchi automata.

Proposition 4

(cf. [90]). Given a BA \(A\) with n states, it is possible to construct a BA \(A^{\mathcal {C}}\) such that \(\mathcal {L}(A^{\mathcal {C}}) = \varSigma ^{\omega }\setminus \mathcal {L}(A)\) whose number of states is in \(\varOmega ( tight (n-1))\) and \(\mathcal {O}( tight (n+1))\), where \( tight (n) \approx (0.76n)^{n}\).

Currently, the above is the best known complexity result for the complementation of Büchi automata, where the lower and upper bounds on the number of states of the complement Büchi automaton differ only by a factor in \(\mathcal {O}(n^{2})\).

There are mainly four types of complementation algorithms, according to the classification proposed in [19, 98]: Ramsey-based [24, 25, 93], rank-based [46, 51, 65, 90], determinization-based [44, 82, 89, 91], and slice-based [5, 61, 98, 106] complementation. A complementation construction unifying the rank-based and slice-based approaches can be found in [45]. All these algorithms construct the complement Büchi automata based on the transition structures of the input Büchi automata. Besides the complementation algorithms proposed for general nondeterministic Büchi automata, there are also complementation algorithms specialized for limit deterministic Büchi automata [19, 28] and for deterministic Büchi automata [66].

Given the highly demanding technicalities involved in the above complementation algorithms for Büchi automata, we refer the interested reader to the cited literature for more details on the different approaches and algorithms.

3.4 Difference of Büchi Automata

The BA language difference operation is tightly connected to the complementation operation, from which it derives its super-exponential complexity, as stated by the following proposition.

Proposition 5

Given two BAs \(A_{0}\) and \(A_{1}\) with \(n_{0}\) and \(n_{1}\) states, respectively, it is possible to construct a BA \(A_{0 \setminus 1}\) such that \(\mathcal {L}(A_{0 \setminus 1}) = \mathcal {L}(A_{0}) \setminus \mathcal {L}(A_{1})\) whose number of states is in \(\varOmega (n_{0} \cdot tight (n_{1}-1))\) and \(\mathcal {O}(n_{0} \cdot tight (n_{1}+1))\).

The language difference operation is based on the complementation operation: in order to get an automaton \(A_{0 \setminus 1}\) such that \(\mathcal {L}(A_{0 \setminus 1}) = \mathcal {L}(A_{0}) \setminus \mathcal {L}(A_{1})\), it is enough to construct the automaton for the language \(\mathcal {L}(A_{0}) \cap \mathcal {L}(A_{1}^{\mathcal {C}})\). Thus, the complexity result follows from Propositions 2 and 4.

Note that we cannot improve the complexity of the language difference operation below \(\varOmega ( tight (n_{1}-1))\), since otherwise we would be able to improve the complexity of the complementation operation as well: trivially, we have that \(\mathcal {L}(A_{1}^{\mathcal {C}}) = \varSigma ^{\omega }\setminus \mathcal {L}(A_{1})\), where \(\varSigma ^{\omega }\) is the language of the BA \(A_{0}\) having exactly one state, the initial state, which is accepting and has only self-loops as transitions; in Proposition 5 we would thus have \(n_{0} = 1\).

3.5 Decision Problems on Büchi Automata

Besides the three main operations presented above, namely union, intersection, and complementation, there are three main decision problems concerning the languages of Büchi automata: emptiness, universality, and language inclusion.

Given a BA \(A\), the emptiness problem asks to decide whether \(\mathcal {L}(A) = \emptyset \), while the universality problem asks whether \(\mathcal {L}(A) = \varSigma ^{\omega }\). Finally, the language inclusion problem asks to decide whether \(\mathcal {L}(A_{0}) \subseteq \mathcal {L}(A_{1})\) for two given BAs \(A_{0}\) and \(A_{1}\). These problems have different complexities, which are summarized by the following propositions; the corresponding proofs can be found in the cited papers or in [32, Sect. 4.4].

Proposition 6

(cf. [40, 41, 93]). Given a BA \(A\), the emptiness problem \(\mathcal {L}(A) = \emptyset \) is decidable in linear time and is NLOGSPACE-complete.

The proof of the linear time complexity is based on finding a strongly connected component, i.e., a set of states in which each state is reachable from every other, that is reachable from the initial state and contains a state in F. This can be easily done by a simple graph exploration based on depth-first search. Alternatively, we can nondeterministically find an accepting state and an accepting run of \(A\) visiting that accepting state infinitely often, which can be done in NLOGSPACE. In fact, it is enough to guess an accepting state \(q_{f} \in F\) and two paths: a stem path from \(\bar{q}\) to \(q_{f}\) and a lasso path from \(q_{f}\) back to \(q_{f}\) itself, both of length at most \(|Q|\). Clearly, storing \(q_{f} \in F\) requires space logarithmic in \(|Q|\); for the paths, it is enough to store the current state q and a counter \( cnt \) keeping track of the length of the path so far; both require logarithmic space.

The algorithm works as follows: initially, \(q = \bar{q}\) and the following steps are repeated to find a stem path from \(\bar{q}\) to \(q_{f}\): (1) from q, a successor is chosen nondeterministically and \( cnt \) is increased; (2) if \( cnt \) exceeds \(|Q|\), then “no” is returned; (3) if \(q = q_{f}\) and \( cnt \le |Q|\), then the algorithm turns to look for a lasso path. Starting with \(q = q_{f}\) and \( cnt \) reset to 0, the following steps are repeated to find a lasso path from \(q_{f}\) back to \(q_{f}\) itself: (1) from q, a successor is chosen nondeterministically and \( cnt \) is increased; (2) if \( cnt \) exceeds \(|Q|\), then “no” is returned; (3) if \(q = q_{f}\) and \( cnt \le |Q|\), then “yes” is returned. We refer the interested reader to [40, 41, 93] for the proof of the NLOGSPACE-hardness result.
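For readers who prefer code to prose, the following Python sketch decides emptiness deterministically: it checks whether some accepting state is reachable from the initial state and lies on a cycle. It is a simple reachability-based variant, not the linear-time algorithm based on strongly connected components nor the NLOGSPACE guessing procedure described above; the tuple encoding of BAs is the same assumption as in the earlier sketches.

from collections import deque

def reachable(delta, sources):
    """Set of states reachable from the given source states (BFS)."""
    succ = {}
    for (q, _, p) in delta:
        succ.setdefault(q, set()).add(p)
    seen, todo = set(sources), deque(sources)
    while todo:
        q = todo.popleft()
        for p in succ.get(q, ()):
            if p not in seen:
                seen.add(p)
                todo.append(p)
    return seen

def ba_is_empty(ba):
    """L(A) is non-empty iff some accepting state is reachable from the
    initial state and can reach itself again, i.e., it closes a lasso."""
    states, init, delta, accepting = ba
    for qf in set(accepting) & reachable(delta, {init}):
        one_step = {p for (q, _, p) in delta if q == qf}    # successors of qf
        if qf in reachable(delta, one_step):
            return False            # accepting lasso found: the language is not empty
    return True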

Proposition 7

(cf. [93]). Given a BA \(A\), the universality problem \(\mathcal {L}(A) = \varSigma ^{\omega }\) is decidable in exponential time and is PSPACE-complete.

The universality problem is decided by means of a reduction to the emptiness problem: in order to decide \(\mathcal {L}(A) = \varSigma ^{\omega }\), it is enough to check \(\mathcal {L}(A^{\mathcal {C}}) = \emptyset \), where \(A^{\mathcal {C}}\) is the complement BA of \(A\). Since \(A^{\mathcal {C}}\) is exponentially larger than \(A\), the complexity results follow from Proposition 6.

Proposition 8

(cf. [93]). Given two BAs \(A_{0}\) and \(A_{1}\), the language inclusion problem \(\mathcal {L}(A_{0}) \subseteq \mathcal {L}(A_{1})\) is decidable in exponential time and is PSPACE-complete.

The language inclusion problem is decided by means of a reduction to the emptiness problem: in order to decide \(\mathcal {L}(A_{0}) \subseteq \mathcal {L}(A_{1})\), it is enough to check whether \(\mathcal {L}(A_{0}) \cap \mathcal {L}(A_{1}^{\mathcal {C}}) = \emptyset \), where \(A_{1}^{\mathcal {C}}\) is the complement BA of \(A_{1}\). Since \(A_{1}^{\mathcal {C}}\) is exponentially larger than \(A_{1}\), the complexity results follow from Propositions 2 and 6.

4 Learning Finite Automata

In this section, we present a variant of the learning algorithm for finite automata used in [11]. In 1987, in her seminal work [8], Angluin proposed the \(\textsf {L}^{*}\) algorithm to learn a DFA accepting a target regular language; \(\textsf {L}^{*}\) belongs to the class of active automata learning algorithms [103], in which the learner can interact with an oracle until the correct automaton is constructed.

Fig. 2. DFA active automata learning framework

4.1 Overview of the DFA Learning Algorithm

As depicted in Fig. 2, in the active automata learning setting presented in [8], there is a teacher and a learner. The teacher knows the target language \(L\), which can be a regular language or an \(\omega \)-regular language. The learner wants to learn the target language, represented by an automaton, from the teacher by means of two kinds of queries: membership queries and equivalence queries. A membership query \(\mathrm {MQ}(w)\) asks whether a word w belongs to \(L\), while an equivalence query \(\mathrm {EQ}(A)\) asks whether the conjectured automaton A accepts exactly \(L\). Depending on whether the conjectured automaton A is correct, the teacher replies with either “yes” or “no”. In case of a positive answer, the learner outputs A and completes his job. In case of a negative answer, the teacher also provides a witness \(w \in L\ominus \mathcal {L}(A)\), which allows the learner to further refine the conjectured automaton A.

In this paper, the learner uses a data structure called observation table to store all answers to the membership queries, since it is easy to present and understand. We remark that observation tables were originally adopted by Angluin for her \(\textsf {L}^{*}\) algorithm [8]. Instead of observation tables, the learner can use a tree-based data structure called classification tree to store such answers, which is usually more compact than observation tables; we refer the interested reader to [59, 63, 73] for the details on classification trees.

Fig. 3. A DFA \(\mathcal {M}\), its regular language \(R\), and an observation table \(\mathcal {T}\) for \(R\)

In the following, we play the role of the learner to learn a regular language represented by a DFA from a teacher. Regular language learning is actually a procedure for a learner to gradually identify the states in the minimal DFA \(\mathcal {M}\) recognizing the target language. As an example, consider the regular language \(R\) accepted by the DFA \(\mathcal {M}\) shown in Fig. 3, where \(R= \{\, u \in \{ a, b \}^{+} \mid \text { the number of } b \text { in } u \text { is } 4n + 3 \text { for some } n \in \mathbb {N} \,\}\). We observe that for any pair of words \(u_{1}, u_{2} \in \{ a, b \}^{*}\), \(\mathcal {M}(q_{0}, u_{1}) \ne \mathcal {M}(q_{0}, u_{2})\) if there exists some word \(v \in \{ a, b \}^{*}\) such that \(\mathcal {M}(q_{0}, u_{1} v) = q_{3}\) while \(\mathcal {M}(q_{0}, u_{2} v) \ne q_{3}\). That is, in the DFA \(\mathcal {M}\), for any pair of words \(u_{1}, u_{2} \in \{ a, b \}^{*}\), if there exists some word \(v \in \{ a, b \}^{*}\) such that \(u_{1} v \in \mathcal {L}(\mathcal {M})\) while \(u_{2} v \notin \mathcal {L}(\mathcal {M})\), then \(\mathcal {M}(q_{0}, u_{1})\) and \( \mathcal {M}(q_{0}, u_{2})\) must be two different states. Our goal is to develop a learner who can identify the states in the DFA \(\mathcal {M}\); using such a word extension v to distinguish words \(u_{1}\) and \(u_{2}\) is a good means to identify two different states in \(\mathcal {M}\).

4.2 Right Congruences and Myhill-Nerode Theorem

This idea of distinguishing words by extensions is formalized by the notion of right congruence. A right congruence is an equivalence relation \(\backsim \) on \(\varSigma ^{*}\) such that \(x \backsim y\) implies \(xv \backsim yv\) for every \(x, y, v \in \varSigma ^{*}\). The right congruence relation is the theoretical foundation for the DFA learning algorithms to discover the states in a target DFA \(\mathcal {M}\).

We denote by \(|\mathord {\backsim }|\) the index of \(\backsim \), i.e., the number of equivalence classes of \(\backsim \). We use \(\varSigma ^{*}/_{\backsim }\) to denote the equivalence classes of the right congruence \(\backsim \). A finite right congruence is a right congruence with a finite index. The following theorem guarantees that every regular language has a right congruence relation of finite index.

Theorem 2

(Myhill-Nerode Theorem [56]). For a language \(R\) over \(\varSigma \), the following statements are equivalent:

  1. \(R\) is a regular language.

  2. \(R\) is the union of some equivalence classes of a right congruence equivalence relation of finite index.

  3. The right congruence relation \(\backsim _{R}\) is of finite index, where \(x \backsim _{R}y\) if and only if for each \(v\in \varSigma ^{*}\), \(xv \in R\iff yv \in R\).

The theorem basically states that, given a regular language \(R\) over \(\varSigma \), the whole set of finite words \(\varSigma ^{*}\) can be partitioned into a finite number of equivalence classes by the right congruence relation \(\backsim _{R}\). For a word \(u \in \varSigma ^{*}\), we denote by \([u]_{\backsim }\) the equivalence class of the right congruence \(\backsim \) that u belongs to.

Given a right congruence relation \(\backsim _{R}\) for the language \(R\), we can construct an automaton accepting \(R\) by means of \(\backsim _{R}\): as the set of states \(Q\), we just use the equivalence classes induced by \(\backsim _{R}\); the initial state \(\bar{q}\) is simply the class of the empty word \(\varepsilon \); the transition relation maps the class of u, on reading a letter a, to the class of ua; finally, the accepting states F are the classes of the words in \(R\).

Definition 2

(DFA induced by \(\backsim _{R}\)). Given a right congruence relation \(\backsim _{R}\) for the language \(R\), the corresponding DFA \(A_{\backsim _{R}}\) is the tuple \(A_{\backsim _{R}} = (Q, \bar{q}, \delta , F)\) where

  • \(Q= \varSigma ^{*}/_{\backsim _{R}}\);

  • \(\bar{q} = [\varepsilon ]_{\backsim _{R}}\);

  • for each \(u \in \varSigma ^{*}\) and \(a \in \varSigma \), \(\delta ([u]_{\backsim _{R}}, a) = [ua]_{\backsim _{R}}\); and

  • \(F = \{\, [u]_{\backsim _{R}} \in Q \mid u \in R \,\}\).

As an example, consider the regular language \(R\) shown in Fig. 3; we have four equivalence classes in \(\varSigma ^{*}/_{\backsim _{R}}\), namely \([\varepsilon ]_{\backsim _{R}}\), \([b]_{\backsim _{R}}\), \([bb]_{\backsim _{R}}\), and \([bbb]_{\backsim _{R}}\), which intuitively correspond to how many b’s have been seen so far, modulo 4; in particular, the regular language \(R\) is exactly the equivalence class \([bbb]_{\backsim _{R}}\). The automaton constructed from \(\backsim _{R}\) is \(\mathcal {M}\), whose states \(q_{0}\), \(q_{1}\), \(q_{2}\), and \(q_{3}\) represent the four equivalence classes \([\varepsilon ]_{\backsim _{R}}\), \([b]_{\backsim _{R}}\), \([bb]_{\backsim _{R}}\), and \([bbb]_{\backsim _{R}}\), respectively.

We can use the word bab to distinguish the words in the equivalence class \([b]_{\backsim _{R}}\) from the words in the other three equivalence classes \([\varepsilon ]_{\backsim _{R}}\), \([bb]_{\backsim _{R}}\), and \([bbb]_{\backsim _{R}}\). For instance, \(\varepsilon \cdot bab \notin R\) while \(b \cdot bab \in R\), hence \(\varepsilon \not \backsim _{R} b\). One can check, as hinted by the column headers of the table in Fig. 3, that it is enough to use the word extensions \(\varepsilon \), bab, and ab to distinguish the words from the four equivalence classes in \(\varSigma ^{*}/_{\backsim _{R}}\). We can use any other word as extension, as long as it distinguishes words: for instance, we could use \((aba)^{12}\) instead of \(\varepsilon \) or \(a(ababa)^{300}b\) instead of ab. Note however that longer extensions slow down the learning algorithm, whose complexity depends also on the length of the distinguishing words (cf. Theorem 4).
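The distinguishing power of these extensions can be checked directly against the language \(R\). The short Python sketch below (the helper names are ours) encodes membership in \(R\) and verifies that the experiments \(\varepsilon \), bab, and ab indeed separate the four representatives \(\varepsilon \), b, bb, and bbb.

def in_R(word):
    """Membership in R: a non-empty word over {a, b} whose number of b's is 4n + 3."""
    return len(word) > 0 and word.count("b") % 4 == 3

representatives = ["", "b", "bb", "bbb"]
experiments = ["", "bab", "ab"]

# each representative gets a distinct vector of membership results, so the
# three experiments suffice to tell the four equivalence classes apart
rows = {u: tuple(in_R(u + v) for v in experiments) for u in representatives}
assert len(set(rows.values())) == 4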

Assume that we want to design a learner to learn the regular language \(R= \{\, u \in \{ a, b \}^{+} \mid \text { the number of } b \text { in } u \text { is } 4n + 3, \text { for some } n \in \mathbb {N} \,\}\), i.e., to discover all states in the target automaton \(\mathcal {M}\) shown in Fig. 3. By Theorem 2, we know that the right congruence relation \(\backsim _{R}\), by means of word extensions, can help us to distinguish the equivalence classes of \(\varSigma ^{*}\) generating \(R\), which intuitively correspond to the states of \(\mathcal {M}\). However, we know neither \(R\) nor \(\backsim _{R}\); in order to learn them, the idea is to ask, for a few words, whether they belong to \(R\), and to use the obtained information to conjecture a DFA which is supposed to accept \(R\). Yet there are still several things we are missing:

  1. How does an observation table organize the results of membership queries we have collected so far?

  2. How can we build a DFA from an observation table correctly?

  3. How can we update an observation table and discover new states from the returned counterexample if the conjectured DFA is incorrect?

The answers to these three questions are the key cornerstones of the DFA learning algorithm. In the following, we first show how an observation table organizes the results of membership queries. Then we show how to build a DFA from an observation table. Afterwards we explain how to analyze a returned counterexample to update an observation table so as to discover new states in the target DFA \(\mathcal {M}\). Finally, we present our DFA learner for regular languages.

4.3 Observation Tables

An observation table is a tuple \(\mathcal {T}= (U, V, T)\) where \(U\) is a prefix-closed set of words called access strings, \(V\) is a set of words called experiments, and \(T:(U\cup U\varSigma )V\rightarrow \{ 0, 1 \}\) is a total mapping.

As the name suggests, an observation table \(\mathcal {T}\) is represented by a table, where rows and columns are labelled with words taken from \(U\,\cup \, U\varSigma \) and \(V\), respectively, and the table entries are the values assigned by \(T\) to them. Consider for instance the observation table \(\mathcal {T}= (U, V, T)\) shown in Fig. 3; the labels of the four rows in the upper part of the table correspond to the set \(U= \{ \varepsilon , b, bb, bbb \}\); the labels of the five rows in the bottom part of the table are those in \(U\varSigma \setminus U\), so \(U\,\cup \, U\varSigma \) is exactly the set of labels of the rows of the table; the labels of the three columns in the table correspond to the set \(V= \{ \varepsilon , bab, ab \}\). The entry value of row u and column v represents the value assigned by \(T(\,\cdot \,)\) to the word uv, i.e., \(T(uv)\); such a value is 1 if \(uv \in R\) and 0 otherwise. As depicted in Fig. 3, the entry value of row b and column bab is \(T(bbab) = 1\) since \(b \cdot bab \in R\), while the entry value of row b and column ab is \(T(bab) = 0\) since \(b \cdot ab \notin R\).

Given an observation table for a language \(R\) with right congruence \(\backsim _{R}\), like the one in Fig. 3, we can see that every equivalence class \([u]_{\backsim _{R}}\) of \(\varSigma ^{*}/_{\backsim _{R}}\) has a representative word u in \(U\). Therefore we also use a representative word, such as \(\varepsilon \), to denote the corresponding equivalence class \([\varepsilon ]_{\backsim _{R}}\). A word in \(V\) is a word extension or experiment used to distinguish the words belonging to different equivalence classes. For instance, consider the column ab in Fig. 3: the entry value at row \(\varepsilon \) is 0 while the entry value at row bb is 1, which indicates that the words from those two equivalence classes can be distinguished by the word ab. Hence, any two rows with different entry values in the table are classified as different equivalence classes, while any two rows with the same entry values are seen as one equivalence class. For instance, in Fig. 3, row b in the upper table and row ba in the lower table are seen as one equivalence class since they have the same entry values for each experiment.

We remark that rows which are currently seen as one equivalence class may later be classified into different equivalence classes, as a result of a counterexample returned by the teacher.

The domain of the mapping \(T\) also contains the set \(U\varSigma \), i.e., there are also some rows labelled by the words from \(U\varSigma \) in the table. The existence of this set \(U\varSigma \) of rows in the table makes it possible for the learner to look for the next equivalence class or the successor state \([ua]_{\backsim _{R}}\) in the DFA construction after reading a letter \(a \in \varSigma \) at the equivalence class or state \([u]_{\backsim _{R}}\), where \(u \in U\). For example, suppose that we want to compute the a-successor of state \(\varepsilon \) in Fig. 3: the expected successor is \(\varepsilon \cdot a\). In order to find its actual representative, we first look for the row \(\varepsilon \cdot a\) in the table and then get the successor state \(\varepsilon \in U\) which has the same entry values as row \(\varepsilon \cdot a\). From the table we see that the row \(\varepsilon \cdot a = a\) has entry values 000; the same entry values occur for the row \(\varepsilon \in U\), thus the a-successor of \(\varepsilon \) is \(\varepsilon \) itself.

In order to efficiently check whether two rows represent the same equivalence class, we formally define the rows as the total function \( row :(U\cup U\varSigma ) \rightarrow (V\rightarrow \{ 0, 1 \})\). To a word \(u \in U\cup U\varSigma \) we assign a total function \( row (u) :V\rightarrow \{ 0, 1 \}\) such that \( row (u)(v) = T(uv)\) for each \(v \in V\). We call such a function a row of \(\mathcal {T}\) and we denote by \( Rows (\mathcal {T})\) the set of rows of \(\mathcal {T}\); similarly, we denote by \( Rows _{ upp }(\mathcal {T}) = \{\, row (u) \mid u \in U \,\}\) the set of rows in the “upper” part of the table and \( Rows _{ low }(\mathcal {T}) = \{\, row (u) \mid u \in U\varSigma \setminus U \,\}\) the set of rows appearing in the “lower” part. Consider again the observation table in Fig. 3: we have \( Rows _{ low }(\mathcal {T}) = \{ row (a), row (ba), row (bba), row (bbba), row (bbbb) \}\), where, e.g., \( row (a)\) is the constant function \( row (a)(v) = 0\) for each \(v \in V\). In practice, we can identify each function \( row (u) :V\rightarrow \{ 0, 1 \}\) with the content of \(\mathcal {T}\) in the row labelled by u.

In the learning framework depicted in Fig. 2, the learner can ask the teacher two types of queries, namely, membership and equivalence queries. In order to pose an equivalence query, the learner has to generate a DFA from the information stored in the observation table, which therefore has to contain all information needed to build such a DFA. As Angluin proposed in [8], a table contains the needed information when it is closed and consistent.

A table \(\mathcal {T}\) is closed if for any \(u \in U\) and \(a \in \varSigma \), there exists \(u' \in U\) such that \( row (ua) = row (u')\); similarly, a table is consistent if for any \(u_{1}, u_{2} \in U\) and \(a \in \varSigma \), \( row (u_{1}) = row (u_{2})\) implies \( row (u_{1} a) = row (u_{2} a)\). Intuitively, the closedness of a table makes sure that every successor of a state is in the set of discovered states, while the consistency of a table ensures that words which have been classified into the same equivalence class behave consistently, i.e., they have the same successor equivalence classes when extended with the same letter.
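Both conditions are easy to state in code. In the following Python sketch, an observation table is represented, as an assumption of this illustration, by the sets U and V together with a dictionary T mapping each word of \((U\cup U\varSigma )V\) to 0 or 1.

def row(T, V, u):
    """The row function of u: the tuple of entries T(u v) for v in V (in a fixed order)."""
    return tuple(T[u + v] for v in sorted(V))

def is_closed(U, V, T, alphabet):
    """Every one-letter extension of an upper row matches some upper row."""
    upper = {row(T, V, u) for u in U}
    return all(row(T, V, u + a) in upper for u in U for a in alphabet)

def is_consistent(U, V, T, alphabet):
    """Equal upper rows remain equal after appending any letter."""
    return all(row(T, V, u1 + a) == row(T, V, u2 + a)
               for u1 in U for u2 in U
               if row(T, V, u1) == row(T, V, u2)
               for a in alphabet)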

Fig. 4. Table \(\mathcal {T}_{0}\) and DFA \(A_{0}\)

We now present how the learner proceeds in the learning algorithm to learn the target language \(R\) represented by a DFA. Let the alphabet be \(\varSigma = \{ a, b \}\). At the beginning, the learner has no information, so he initializes both \(U\) and \(V\) to \(\{ \varepsilon \}\) and defines \(T(u v)\) for every \(u \in U\cup U\varSigma \) and \(v \in V\) according to the results of membership queries, that is, he asks the teacher the membership queries \(\mathrm {MQ}(\varepsilon )\), \(\mathrm {MQ}(a)\), and \(\mathrm {MQ}(b)\); the teacher answers “no” to all of them, so the learner sets \(T(\,\cdot \,)\) to be the constant function 0. The result is shown as \(\mathcal {T}_{0}\) in Fig. 4. Since \(\mathcal {T}_{0}\) is closed and consistent, we can build the DFA \(A_{0}\), also depicted in Fig. 4, according to Definition 3.

In case the current \(\mathcal {T}\) is not closed, the learner makes it closed by repeatedly updating \(\mathcal {T}\) as follows: he looks for a word \(u \in U\varSigma \) such that there is no \(u' \in U\) with \( row (u) = row (u')\); he then moves u to \(U\) and, for every \(a \in \varSigma \), adds ua to \(U\varSigma \) whenever needed, setting \(T(ua v)\) for each \(v \in V\) by means of membership queries. According to [11], whenever \(\mathcal {T}\) is closed, it is also consistent, since by construction there do not exist \(u_{1}, u_{2} \in U\), \(u_{1} \ne u_{2}\), with \( row (u_{1}) = row (u_{2})\).

Instead of moving a single word u from \(U\varSigma \) to \(U\) when \(\mathcal {T}\) is not closed, the learner can also add all its prefixes \(\text {Pref}(u)\) to \(U\) just as the \(\textsf {L}^{*}\) algorithm does in [8]. This may result in a quicker growth of \(\mathcal {T}\), which anyway does not change the correctness (cf. Theorem 3) and complexity (cf. Theorem 4) of the DFA learning algorithm.
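The closing procedure just described can be written as the following Python sketch, which continues the dictionary-based table encoding of the previous sketch; the lower part \(U\varSigma \setminus U\) is kept in a separate set UA, and mq stands for the teacher's answer to \(\mathrm {MQ}(\,\cdot \,)\), both assumptions of this illustration.

def close_table(U, UA, V, T, alphabet, mq):
    """Repeatedly move an unmatched lower row u to U, add its one-letter
    extensions to the lower part UA, and fill the new entries by
    membership queries mq, until the table (U, V, T) is closed."""
    def row(u):
        return tuple(T[u + v] for v in sorted(V))
    while True:
        upper = {row(u) for u in U}
        unmatched = [u for u in UA if row(u) not in upper]
        if not unmatched:
            return
        u = unmatched[0]
        UA.discard(u)
        U.add(u)
        for a in alphabet:
            ua = u + a
            UA.add(ua)
            for v in V:
                T.setdefault(ua + v, mq(ua + v))   # new entries via membership queries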

4.4 DFA Construction from an Observation Table

Definition 3 answers the second question, namely how to build a conjectured DFA \(A\) from an observation table correctly.

Definition 3

(DFA of a Table). Let \(\mathcal {T}\) be a closed and consistent observation table. We can construct a DFA \(A= (Q, \bar{q}, \delta , F)\) from \(\mathcal {T}\) as follows.

  • \(Q= Rows _{ upp }(\mathcal {T}) = \{ row (u) \mid u \in U\}\),

  • \(\bar{q} = row (\varepsilon )\),

  • \(\delta ( row (u), a) = row (ua)\), and

  • \(F = \{\, row (u) \in Rows _{ upp }(\mathcal {T}) \mid row (u)(\varepsilon ) = 1 \,\}\).

Consider the observation table \(\mathcal {T}_{0}\) shown in Fig. 4. The learner can construct from \(\mathcal {T}_{0}\) the DFA \(A_{0} = (Q_{0}, \bar{q}, \delta _{0}, F_{0})\) where \(Q_{0} = \{ row (\varepsilon ) = 0 \}\), \(\bar{q} = row (\varepsilon )\), \(F_{0} = \emptyset \), and \(\delta _{0}\) as depicted in Fig. 4. Note that in the whole paper we use the representative words \(u \in U\) instead of the row functions \( row (u)\) defined in Definition 3 to mark the states in a DFA; for instance, we mark the single state of \(A_{0}\) with the representative word \(\varepsilon \) instead of the row function \( row (\varepsilon )\). In this way, it is easier for the reader to relate the equivalence classes to the states in the conjectured automaton.
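Definition 3 itself takes only a few lines of code. The sketch below builds the conjectured DFA from a closed and consistent table, using the dictionary encoding of tables assumed in the previous sketches and labelling each state by a representative word of its row, as done in the figures; it assumes, as in the algorithm, that \(\varepsilon \) belongs to both U and V.

def dfa_from_table(U, V, T, alphabet):
    """Build the DFA of Definition 3 from a closed and consistent table;
    states are labelled by representative words taken from U."""
    def row(u):
        return tuple(T[u + v] for v in sorted(V))
    rep = {}                                       # row value -> representative word in U
    for u in sorted(U, key=len):
        rep.setdefault(row(u), u)
    states = set(rep.values())
    init = rep[row("")]                            # the class of the empty word
    delta = {(rep[row(u)], a): rep[row(u + a)]     # closedness guarantees row(u + a) is an upper row
             for u in U for a in alphabet}
    accepting = {rep[row(u)] for u in U if T[u] == 1}   # row(u)(epsilon) = T(u)
    return states, init, delta, accepting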

Now the conjectured DFA \(A_{0}\) is constructed and the learner can pose the equivalence query \(\mathrm {EQ}(A_{0})\) to the teacher. \(A_{0}\) is clearly not the right conjecture, so the teacher answers “no” together with a counterexample, say \(bbab \in \mathcal {L}(\mathcal {M}) \ominus \mathcal {L}(A_{0})\). In the following we provide the answer to the third question, that is, how to update the observation table from the received counterexample.

4.5 Counterexample Analysis

On receiving the counterexample w, the learner has to analyze w in order to update the observation table; this would then allow the learner to expand the conjectured DFA by adding new states to correctly classify the received counterexample. To discover new states in \(\mathcal {M}\), we essentially need new experiments for the table; the following lemma provides a way to find such new experiments.

Lemma 1

Let \(R\) be the target language and \(A\) be the conjectured DFA. On receiving a counterexample \(v \in R\ominus \mathcal {L}(A)\), we can always find an experiment \(v' \in \text {Suf}(v)\), words \(u, u' \in U\), and letter \(a \in \varSigma \) such that \( row (ua) = row (u')\) and \(uav' \in R\iff u' v' \notin R\).

As a notation, we use \(\mathrm {MQ}(s, w)\) to denote the membership query \(\mathrm {MQ}(s \cdot w)\), in order to give a clear presentation of the analysis procedure for the returned counterexample v, explained in the following. On receiving a counterexample v with \(v \in R\) and \(v \notin \mathcal {L}(A)\), the learner can check whether the membership queries return different results for v and \(\tilde{v}\), where \(\tilde{v} = A(v)\). Let \(n = |v|\) and, for \(i \in [1 \cdots n]\), let \(s_{i} = A(v {[1..i]})\) be the state reached after reading the first i letters of v; recall that \(s_{i} \in U\) is the representative word of that state in the upper part of the observation table. In particular, \(s_{0} = A(\varepsilon ) = \varepsilon \). Therefore, \(\tilde{v} = s_{n}\) and there is a sequence of membership queries \(\mathrm {MQ}(s_{0}, {v {[1..n]}} = v)\), \(\mathrm {MQ}(s_{1}, {v {[2..n]}})\), \(\mathrm {MQ}(s_{2}, {v {[3..n]}})\), and so on, up to \(\mathrm {MQ}(s_{n}, \varepsilon ) = \mathrm {MQ}(\tilde{v}, \varepsilon )\). This sequence has different results for the first and the last query since \(s_{0} \cdot v \in R\) while \(\tilde{v} \cdot \varepsilon \notin R\) by assumption. It follows that, for the earliest \(1 \le i \le n\) at which the result changes, the experiment \(v {[i+1..n]}\) distinguishes \(s_{i-1} \cdot v{[i]}\) from \(s_{i}\). Let \(u = s_{i-1}\), \(u' = s_{i}\), \(a = v{[i]}\), and \(v' = v {[i+1..n]}\). According to Definition 3, we have \( row (ua) = row (u')\) since \(A(s_{i-1} a) = A(ua) = u' = s_{i}\), and \(uav' \in R\) while \(u'v' \notin R\). The handling of the other case, when \(v \notin R\) and \(v \in \mathcal {L}(A)\), is symmetric.
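The analysis just described can be coded as a short loop over the counterexample. In the following Python sketch (our own illustration), mq is the membership oracle and run_dfa(w) is an assumed helper returning the representative word \(s = A(w)\) of the state reached by the current conjecture on w; the function returns the new experiment together with the pair of words it separates, as in Lemma 1.

def find_experiment(v, mq, run_dfa):
    """Given a counterexample v, i.e., mq(v) != mq(run_dfa(v)), return
    (u, a, v') with v' a suffix of v such that u a v' and run_dfa(u a) v'
    receive different answers from the membership oracle (cf. Lemma 1)."""
    previous = mq(run_dfa("") + v)           # equals mq(v), since A(epsilon) = epsilon
    for i in range(1, len(v) + 1):
        s_i = run_dfa(v[:i])                 # representative of the state reached on v[1..i]
        current = mq(s_i + v[i:])
        if current != previous:
            u = run_dfa(v[:i - 1])           # s_{i-1}
            return u, v[i - 1], v[i:]        # the experiment is v[i+1..n]
        previous = current
    raise AssertionError("v is not a counterexample for the current conjecture")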

According to Lemma 1, on receiving a counterexample \(bbab \in R\ominus \mathcal {L}(A_{0})\), the learner poses a sequence of membership queries \(\mathrm {MQ}(s_{0} = \varepsilon , bbab)\), \(\mathrm {MQ}(s_{1} = \varepsilon , bab)\), \(\mathrm {MQ}(s_{2} = \varepsilon , ab)\), \(\mathrm {MQ}(s_{3} = \varepsilon , b)\), and \(\mathrm {MQ}(s_{4} = \varepsilon , \varepsilon )\); it is easy to check that the experiment \(v {[2..4]} = bab\) distinguishes \(s_{0} b = b\) from \(s_{1} = \varepsilon \). Therefore, the learner adds bab into the set \(V\) and updates the mapping \(T\) via membership queries, until obtaining the observation table \({\mathcal {T}}'_{0}\) shown in Fig. 5. As \({\mathcal {T}}'_{0}\) is not closed since there is no \(u \in U\) such that \( row (u) = row (b)\), the learner moves the row b to the upper table, i.e., to the set \(U\), and adds the rows ba and bb—the one letter extensions of b—to the lower part of the table as mentioned before. The learner then fills the missing entry values by means of membership queries; the resulting observation table is \(\mathcal {T}_{1}\) shown in Fig. 5. As \(\mathcal {T}_{1}\) is closed and also consistent, the learner can build the DFA \(A_{1}\) from \(\mathcal {T}_{1}\), depicted in Fig. 5.

Fig. 5. Tables \(\mathcal {T}'_{0}\), \(\mathcal {T}_{1}\), and DFA \(A_{1}\)

We remark that instead of finding just one experiment \(v' \in \text {Suf}(v)\), our learner may also add all the suffixes \(\text {Suf}(v)\) into \(V\), just as the algorithm in [76] does. This may also result in a quicker growth of \(\mathcal {T}\), which anyway does not change the correctness (cf. Theorem 3) and complexity (cf. Theorem 4) of the DFA learning algorithm we are presenting.

The learner now poses the equivalence query \(\mathrm {EQ}(A_{1})\) to the teacher; since \(\mathcal {L}(A_{1}) \ne R\), the teacher returns “no” and a counterexample, say again \(bbab \in \mathcal {L}(\mathcal {M}) \ominus \mathcal {L}(A_{1})\). Similarly to the previous counterexample analysis, the learner asks the sequence of membership queries \(\mathrm {MQ}(s_{0} = \varepsilon , bbab)\), \(\mathrm {MQ}(s_{1} = b, bab)\), \(\mathrm {MQ}(s_{2} = \varepsilon , ab)\), \(\mathrm {MQ}(s_{3} = \varepsilon , b)\), and \(\mathrm {MQ}(s_{4} = b, \varepsilon )\), which allows the learner to find the experiment \(v {[3..4]} = ab\) distinguishing \(s_{1} b = bb\) from \(s_{2} = \varepsilon \). The learner adds ab into the set \(V\) and updates \(T\) by further membership queries, resulting in the observation table \(\mathcal {T}'_{1}\) shown in Fig. 6.

\(\mathcal {T}'_{1}\) is not closed, since there is no row in the upper part corresponding to \( row (bb)\), so the learner moves bb to the upper part, adds bba and bbb to the lower part, and fills the content of the table by means of membership queries. The result of these operations is table \(\mathcal {T}''_{1}\) which is still not closed since there does not exist \(u \in U\) such that \( row (bbb) = row (u)\). Therefore, as before, the learner moves bbb to the upper part, adds the missing words bbba and bbbb to the lower part, and fills the content, obtaining the table \(\mathcal {T}_{2}\) depicted in Fig. 6, which is now closed. The DFA \(A_{2}\) constructed from \(\mathcal {T}_{2}\) is depicted in Fig. 6 and the learner gets the answer “yes” from the teacher after posing the equivalence query \(\mathrm {EQ}(A_{2})\), which means that he has completed his learning task.

Fig. 6.
figure 6

Tables \(\mathcal {T}'_{1}\), \(\mathcal {T}''_{1}\), \(\mathcal {T}_{2}\), and DFA \(A_{2}\)

4.6 The Learner

In the previous part of this section we have introduced a regular language learning algorithm by means of a running example. We now give the formal definition of the learner by means of Algorithm 1 for completeness of presentation; we can see that it agrees with the learning procedure we presented above. The function \( CloseTable \) is responsible for closing a table \(\mathcal {T}\), so it needs to perform membership queries \(\mathrm {MQ}(\,\cdot \,)\) to fill the missing entry values in \(\mathcal {T}\). Moreover, as we have seen, it may repeatedly move rows from the lower to the upper part of the input table and add new rows to the lower part, until the table becomes closed. All conjectured DFAs are constructed from the table \(\mathcal {T}\) by calling the function \( Aut (\mathcal {T})\) based on Definition 3. On receiving a counterexample v, the function \( FindDistinguishingExperiment (v)\) gets a new experiment which is later added into the set \(V\). The refinement loop of the conjecture \(A\) terminates once we get a positive answer from the teacher.

figure a
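Algorithm 1 itself is given above; purely as an illustration of the loop structure described in this section, the following Python sketch implements a table-based learner along the same lines. The teacher interface (mq, eq), the helper names, and the representation of the conjecture are all our own choices, not the notation of Algorithm 1; the counterexample handling reuses the find_distinguishing_experiment sketch given after Lemma 1.

```python
def close_table(U, V, T, alphabet, mq):
    """Fill the missing entries of T via membership queries and close the table:
    while some lower row u+a has no equal row in the upper part U, promote it.
    (Consistency is not an issue here since all upper rows stay pairwise distinct.)"""
    def row(w):
        return tuple(T[(w, e)] for e in sorted(V))
    while True:
        for w in set(U) | {u + a for u in U for a in alphabet}:
            for e in V:
                if (w, e) not in T:
                    T[(w, e)] = mq(w + e)                  # MQ(w . e)
        upper_rows = {row(u) for u in U}
        unclosed = next((u + a for u in U for a in alphabet
                         if row(u + a) not in upper_rows), None)
        if unclosed is None:
            return
        U.add(unclosed)                                    # move the row up

def conjecture(U, V, T, alphabet):
    """Aut(T): one representative word per distinct upper row (cf. Definition 3).
    Returns run(w) -> representative of the state reached on w, and accepting(state)."""
    def row(w):
        return tuple(T[(w, e)] for e in sorted(V))
    rep = {}
    for u in sorted(U, key=lambda w: (len(w), w)):
        rep.setdefault(row(u), u)
    def run(w):
        state = rep[row('')]
        for a in w:
            state = rep[row(state + a)]
        return state
    def accepting(state):
        return T[(state, '')] == 1                         # epsilon in V decides acceptance
    return run, accepting

def learn_dfa(mq, eq, alphabet):
    """Main refinement loop: eq(run, accepting) returns (True, None) or
    (False, counterexample)."""
    U, V, T = {''}, {''}, {}
    while True:
        close_table(U, V, T, alphabet, mq)
        run, accepting = conjecture(U, V, T, alphabet)
        correct, cex = eq(run, accepting)
        if correct:
            return run, accepting
        _, _, experiment = find_distinguishing_experiment(cex, run, mq)
        V.add(experiment)                                  # refine the table
```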

The soundness and completeness of Algorithm 1 are guaranteed by Theorem 3.

Theorem 3

Assume that \(R\) is the target regular language. Algorithm 1 terminates and returns a DFA \(A\) such that \(\mathcal {L}(A) = R\).

The DFA \(A\) returned by Algorithm 1 is a correct conjecture simply because the teacher has approved it; the remaining problem is to show that Algorithm 1 terminates. Algorithm 1 terminates because: (1) by Lemma 1, we discover a new state, i.e., a new equivalence class in \(\varSigma ^{*}/_{\backsim _{R}}\), whenever we receive a counterexample from the teacher, and (2) the index of \(\backsim _{R}\) is finite according to Theorem 2. The complexity result in Theorem 4 then follows immediately.

Theorem 4

Let \(R\) be the target regular language and \(n = |\mathord {\backsim _{R}}|\); let m be the maximum length of any counterexample returned by the teacher.

  1. Algorithm 1 terminates on receiving at most n counterexamples.

  2. The number of membership queries is in \(\mathcal {O}(n^{2} \cdot |\varSigma | + n \cdot m)\).

5 Learning Büchi Automata

After presenting the DFA learning algorithm in Sect. 4, we are now ready to introduce the learning algorithm for Büchi automata. Throughout this section, unless stated otherwise, we let the \(\omega \)-regular language \(L\) be the target language.

We have seen that, for learning a regular language \(R\), the right congruence relation \(\backsim _{R}\) plays an important role in identifying the equivalence classes in \(\varSigma ^{*}/_{\backsim _{R}}\), so it is natural to consider extending this approach to the \(\omega \)-regular setting. It would be easy to learn \(\omega \)-regular languages by means of BAs if we could characterize each given \(\omega \)-regular language \(L\) by a right congruence relation \(\backsim _{L}\) of finite index. There are, however, a few questions to be answered for such an extension:

  • How can we use finite memory to represent an \(\omega \)-word, which has infinite length?

  • Is there a right congruence relation \(\backsim _{L}\) of finite index for a given \(\omega \)-regular language \(L\)?

The answer to the first question is easy: we only need to learn the set of ultimately periodic words \(\text {UP}(L)\) for a given \(\omega \)-regular language \(L\), since by Theorem 1 the set \(\text {UP}(L)\) is the fingerprint of \(L\); given that every ultimately periodic word w can be written as a pair of finite words (u, v) with \(w = uv^{\omega }\), only finite memory is needed for storing w.

5.1 Right Congruences for \(\omega \)-Regular Languages

In contrast, the answer to the second question is more tricky: a first proposal for extending the right congruence relation \(\backsim _{R}\) with respect to the \(\omega \)-regular language \(L\) replaces the extension \(v \in \varSigma ^{*}\) with the ultimately periodic extension \(xy^{\omega }\) for \(x \in \varSigma ^{*}\) and \(y \in \varSigma ^{+}\).

Definition 4

Let \(u_{1}\) and \(u_{2}\) be words in \(\varSigma ^{*}\). \(u_{1} \backsim _{L}u_{2}\) if and only if for every \(x \in \varSigma ^{*}\) and \(y \in \varSigma ^{+}\), \(u_{1} xy^{\omega } \in L\iff u_{2} xy^{\omega } \in L\).

Based on the right congruence \(\backsim _{L}\), Maler and Pnueli [76] introduced a learning algorithm to learn a strict subset of \(\omega \)-regular languages. Nonetheless, the right congruence relation \(\backsim _{L}\) is in general not enough to learn an \(\omega \)-regular language \(L\), as the following example shows.

Example 2

Assume \(L= \{ a, b \}^{*} \cdot b^{\omega }\). The index of \(\backsim _{L}\) is 1 and the only equivalence class is \([\varepsilon ]_{\backsim _{L}}\). This follows from the fact that for any \(u \in \varSigma ^{*}\), we have \(u \cdot x y^{\omega } \in L\) if \(y^{\omega } = b^{\omega }\), otherwise \(u \cdot x y^{\omega } \notin L\). Therefore, we only have one state with self-loops in the conjectured BA \(A\) which certainly does not recognize the target language \(L\), since \(A\) accepts either \(\varSigma ^{\omega }\) or \(\emptyset \), depending on whether the single state is accepting or not, respectively.

The reason why it is so difficult to learn \(\omega \)-regular languages via Büchi automata is that there is a lack of right congruence for Büchi automata compared to DFAs and regular languages. Farzan et al. in [42] proposed the first learning algorithm to learn the complete class of \(\omega \)-regular languages by means of Büchi automata; their algorithm circumvents the lack of right congruence by first using \(\textsf {L}^{*}\) to learn the DFA \(D_{\$}\), as defined in [26], and then transforming \(D_{\$}\) to a BA. Basically, the DFA \(D_{\$}\) captures the set of ultimately periodic words of \(L\) by means of the regular language \(\mathcal {L}(D_{\$}) = \{\, u \$ v \mid u \in \varSigma ^{*}, v \in \varSigma ^{+}, u v^{\omega } \in L \,\}\), where \(\$ \notin \varSigma \).

Another way to solve the lack of right congruence is to define a Myhill-Nerode like theorem for \(\omega \)-regular languages. Inspired by the work of Arnold [12], Maler and Staiger [77] proposed the notion of family of right-congruences (FORC for short) and presented a “Myhill-Nerode” theorem for \(\omega \)-languages. The idea underlying the definition of FORC is based on the fact that every \(\omega \)-regular language \(L\) can be written in the form of an \(\omega \)-regular expression \(\bigcup _{i = 1}^{n} U_{i} \cdot V_{i}^{\omega }\) for some \(n \in \mathbb {N}\), where for any \(i \in [1 \cdots n]\), \(U_{i}\) and \(V_{i}\) are regular languages. So the intuition of using FORC is to first define a right congruence \(\backsim \) to distinguish all finite word prefixes, and then define a right congruence \(\approx _{u}\) for the finite word periods for each equivalence class \([u]_{\backsim }\) of the finite word prefixes. Hence we see that \([u_{i}]_{\backsim } = U_{i}\) with \(u_{i} \in U_{i}\) being an equivalence class in \(\varSigma ^{*}/_{\backsim }\) and \([v_{i}]_{\approx _{u_{i}}}= V_{i}\) being an equivalence class in \(\varSigma ^{*}/_{\approx _{u_{i}}}\) such that \(u_{i} \cdot V_{i}^{\omega } \subseteq L\).

5.2 Family of Deterministic Finite Automata

Based on this idea of FORC, Angluin and Fisman [11] recently proposed to learn \(\omega \)-regular languages via a formalism called family of DFAs (FDFA for short), in which every DFA corresponds to a right congruence of finite index. Further, Angluin et al. [9] suggest to use FDFAs as language acceptors of \(\omega \)-regular languages. The BA learning algorithm described in this section first learns an FDFA and then transforms it to a BA. The formal definition of an FDFA is as follows.

Definition 5

(Family of DFAs [9]). A family of DFAs \(\mathcal {F}= (M, \{ A^{q} \})\) consists of a leading DFA \(M= (Q, \bar{q}, \delta , \emptyset )\) and a set of progress DFAs \(\{\, A^{q} = (Q_{q}, \bar{q}_{q}, \delta _{q}, F_{q}) \mid q \in Q \,\}\).

Fig. 7.
figure 7

An example of an FDFA \(\mathcal {F}= (M, \{ A^{\varepsilon } \})\)

An example of FDFA \(\mathcal {F}\) is depicted in Fig. 7 where the leading DFA \(M\) has only one state \(\varepsilon \) and the progress DFA corresponding to the state \(\varepsilon \) is \(A^{\varepsilon }\).

Each FDFA \(\mathcal {F}\) characterizes a set of ultimately periodic words \(\text {UP}(\mathcal {F})\) by the acceptance condition defined as follows.

Definition 6

(Acceptance condition of FDFA). Let \(\mathcal {F}= (M, \{ A^{q} \})\) be an FDFA and w be an ultimately periodic word. We say that

  • w is accepted by \(\mathcal {F}\) if there exists a decomposition (u, v) of w accepted by \(\mathcal {F}\);

  • a decomposition (u, v) is accepted by \(\mathcal {F}\) if \(M(uv) = M(u)\) and the decomposition (u, v) is captured by \(\mathcal {F}\); and

  • a decomposition (u, v) is captured by \(\mathcal {F}\) if \(v \in \mathcal {L}(A^{q})\) where \(q = M(u)\).

Consider the FDFA \(\mathcal {F}\) in Fig. 7: \((ab)^{\omega }\) is accepted by \(\mathcal {F}\) since there exists the decomposition (a, ba) of \((ab)^{\omega }\) such that \(M(a \cdot ba) = M(a) = \varepsilon \) and \(ba \in \mathcal {L}(A^{M(a)}) = \mathcal {L}(A^{\varepsilon })\). Note that the decomposition (ab, ab) of \((ab)^{\omega }\) is not accepted by \(\mathcal {F}\) since (ab, ab) is not captured by \(\mathcal {F}\), i.e., \(ab \notin \mathcal {L}(A^{M(ab)}) = \mathcal {L}(A^{\varepsilon })\).
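The acceptance check of Definition 6 on a single decomposition is straightforward to implement; the following sketch assumes an interface lead_run(w) for \(M(w)\) and a predicate progress[q](w) for membership in \(\mathcal {L}(A^{q})\) (the interface and the toy stand-in for \(A^{\varepsilon }\) below are ours, chosen only to reproduce the two checks of this example).

```python
def fdfa_accepts_decomposition(lead_run, progress, u, v):
    """Definition 6 on one decomposition (u, v): it must be captured,
    i.e. v in L(A^{M(u)}), and satisfy M(u v) = M(u)."""
    q = lead_run(u)
    return lead_run(u + v) == q and progress[q](v)

# Fig. 7 example: one leading state; we only assume ba in L(A^eps), ab not in L(A^eps).
lead_run = lambda w: 'eps'
progress = {'eps': lambda w: w == 'ba'}              # toy stand-in for A^eps
assert fdfa_accepts_decomposition(lead_run, progress, 'a', 'ba')        # (a, ba) accepted
assert not fdfa_accepts_decomposition(lead_run, progress, 'ab', 'ab')   # (ab, ab) rejected
```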

In the following, we recall the definition of the complement of an FDFA \(\mathcal {F}\).

Definition 7

(Complement of FDFA [9]). Given an FDFA \(\mathcal {F}= (M, \{ A^{q} \})\), the complement \(\mathcal {F}^{\mathcal {C}}\) of \(\mathcal {F}\) is the FDFA \(\mathcal {F}^{\mathcal {C}} = (M, \{ (A^{q})^{\mathcal {C}} \})\).

It is easy to see that the complement FDFA \(\mathcal {F}^{\mathcal {C}}\) captures every decomposition (u, v) in \(\varSigma ^{*}\times \varSigma ^{+}\) which is not captured by \(\mathcal {F}\).

It is shown in [11] that for every \(\omega \)-regular language \(L\), there exists an FDFA \(\mathcal {F}\) such that \(\text {UP}(\mathcal {F}) = \text {UP}(L)\). More precisely, Angluin and Fisman [11] suggest to use three kinds of FDFAs as canonical representations of \(\omega \)-regular languages, namely periodic FDFAs, syntactic FDFAs, and recurrent FDFAs. In this work, we only consider the periodic FDFAs to simplify the presentation of the BA learning algorithm; we refer the interested reader to [11, 73] for more details on the other two canonical FDFAs.

The definition of periodic FDFAs provided in [11] is given in terms of right congruences.

Definition 8

(Periodic FDFA [11]). Let \(L\) be an \(\omega \)-regular language.

Given \(u \in \varSigma ^{*}\), the periodic right congruence \(\approx ^{u}_{P}\) is an equivalence relation on \(\varSigma ^{*}\) such that for each \(x,y \in \varSigma ^{*}\), \(x \approx ^{u}_{P}y\) if and only if for each \(v \in \varSigma ^{*}\), it holds \(u(xv)^{\omega } \in L\iff u(yv)^{\omega } \in L\).

The periodic FDFA \(\mathcal {F}\) of \(L\) is the FDFA \(\mathcal {F}= (M, \{ A^{u} \})\) where:

  • the DFA \(M= (\varSigma ^{*}/_{\backsim _{L}}, [\varepsilon ]_{\backsim _{L}}, \delta , \emptyset )\) is the leading DFA, where \(\delta ([u]_{\backsim _{L}}, a) = [ua]_{\backsim _{L}}\) for each \(u \in \varSigma ^{*}\) and \(a \in \varSigma \);

  • for each \([u]_{\backsim _{L}} \in \varSigma ^{*}/_{\backsim _{L}}\), the DFA \(A^{u} = (\varSigma ^{*}/_{\approx ^{u}_{P}}, [\varepsilon ]_{\approx ^{u}_{P}}, \delta _{u}, F_{u})\) is a progress DFA, where \(\delta _{u}([v]_{\approx ^{u}_{P}}, a) = [va]_{\approx ^{u}_{P}}\) for each \(v \in \varSigma ^{*}\) and \(a \in \varSigma \), and \(F_{u} = \{\, [v]_{\approx ^{u}_{P}} \in \varSigma ^{*}/_{\approx ^{u}_{P}} \mid uv^{\omega } \in L \,\}\).

As shown in [11], given an \(\omega \)-regular language \(L\), \(\backsim _{L}\) and \(\approx ^{u}_{P}\) for any \(u \in \varSigma ^{*}\) are all right congruences of finite index, so DFAs can be built from them.

Fig. 8.
figure 8

A periodic FDFA \(\mathcal {F}= (M, \{ A^{\varepsilon }, A^{a}, A^{b} \})\) with \(\text {UP}(\mathcal {F}) = (ab)^{\omega }\)

We remark that the set of ultimately periodic words \(\text {UP}(\mathcal {F})\) accepted by the periodic FDFA \(\mathcal {F}\) is consistent with those characterized by the regular language \(\mathcal {L}(D_{\$})\) defined in [26].

Consider the periodic FDFA \(\mathcal {F}\) depicted in Fig. 8 where \(\mathcal {F}\) characterizes the \(\omega \)-regular language \(L= (ab)^\omega \). The leading DFA \(M\) of \(\mathcal {F}\) has three states, namely \(\varepsilon \), a, and b which correspond to the equivalence classes \([\varepsilon ]_{\backsim _{L}}\), \([a]_{\backsim _{L}}\), and \([b]_{\backsim _{L}}\), respectively, given by the upper part \(U\) of \(\mathcal {T}\). The set of experiments \(V\) contains decompositions of ultimately periodic words, which play the role of x and y in Definition 4 introducing \(\backsim _{L}\). As shown in table \(\mathcal {T}\) for \(M\), the experiments \((\varepsilon , ab)\) and \((\varepsilon , ba)\) are enough to distinguish the equivalence classes in \(\varSigma ^{*}/_{\backsim _{L}}\). In fact, for any \(u_1, u_2 \in \varSigma ^{*}\), an experiment \(xy^{\omega }\) with \(xy^{\omega } \notin ((ab)^{+})^{\omega }\) and \(xy^{\omega } \notin ((ba)^{+})^{\omega }\) cannot distinguish \(u_1\) and \(u_2\), since then we have \(u_1 \cdot x y^{\omega } \notin L\) and \(u_2 \cdot x y^{\omega } \notin L\). Since the leading DFA has three states \(\varepsilon \), a, and b, there are three progress DFAs in \(\mathcal {F}\) associated to them, namely \(A^{\varepsilon }\), \(A^{a}\), and \(A^{b}\), respectively.

In the following, we show how the periodic FDFA \(\mathcal {F}\) corresponds to the \(\omega \)-regular expression \(\varepsilon (ab)^{*} \cdot ((ab)^{+})^{\omega } \cup a (ba)^{*} \cdot ((ba)^{+})^{\omega } \cup \{ b, a(ba)^{*}a \} \{ a, b \}^{*} \cdot \emptyset \), where for clarity of presentation we use \(\cup \) instead of the usual symbol \(+\) to distinguish the different \(\omega \)-regular expressions.

Let us consider the first part of the \(\omega \)-regular expression, i.e., \(\varepsilon (ab)^{*} \cdot ((ab)^{+})^{\omega }\), which corresponds to the state \(\varepsilon \) of \(M\) and its progress DFA \(A^{\varepsilon }\), whose language is clearly \(\mathcal {L}(A^{\varepsilon }) = (ab)^{+}\). Let the components of \(M\) be \((Q, \varepsilon , \delta , \emptyset )\) and consider the DFA \(M^{\varepsilon } = (Q, \varepsilon , \delta , \{ \varepsilon \})\) obtained by setting the accepting set of \(M\) to \(\{ \varepsilon \}\). It is easy to see that \(\mathcal {L}(M^{\varepsilon }) = \varepsilon (ab)^{*}\). Let \(U_1 = \mathcal {L}(M^{\varepsilon }) = \varepsilon (ab)^{*}\) and \(V_1 = \mathcal {L}(A^{\varepsilon }) = (ab)^{+}\). It follows that the expression \(U_1 \cdot V_1^{\omega }\) is exactly \(\varepsilon (ab)^{*} \cdot ((ab)^{+})^{\omega }\). Similarly we can get the other two \(\omega \)-regular expressions \(a (ba)^{*} \cdot ((ba)^{+})^{\omega }\) and \(\{ b, a(ba)^{*}a \} \{ a, b \}^{*} \cdot \emptyset \) from the remaining two states a and b of \(M\) and their corresponding progress DFAs \(A^{a}\) and \(A^{b}\), respectively. Note that \(\varepsilon (ab)^{*} \cdot ((ab)^{+})^{\omega } \cup a (ba)^{*} \cdot ((ba)^{+})^{\omega } \cup \{ b, a(ba)^{*}a \} \{ a, b \}^{*} \cdot \emptyset = (ab)^{\omega }\), that is, the induced \(\omega \)-regular expression corresponds to the language accepted by \(\mathcal {F}\). In general, we can construct from the periodic FDFA \(\mathcal {F}\) accepting \(L\) a unique \(\omega \)-regular expression representing \(L\).

We remark that by fixing a state of \(M\), say state \(\varepsilon \), the right congruence \(\approx ^{\varepsilon }_{P}\) is actually the same as the right congruence \(\backsim _{R}\) for the regular language \(R= V_{1}\). Recall that the idea underlying FORC is to first define a right congruence \(\backsim \) distinguishing all finite word prefixes, and then define, for each equivalence class \([u]_{\backsim }\) of the finite word prefixes, a right congruence \(\approx _{u}\) for the finite word periods. Thus after fixing the equivalence class \([\varepsilon ]_{\backsim _{L}}\) for the finite word prefixes of \(L\), we can define the right congruence \(\backsim _{R}\) for the regular language \(R= V_{1}\) of finite word periods defined as \(\{\, v \in \{ a, b \}^{+} \mid \varepsilon \cdot v^{\omega } \in L \,\}\) and call it \(\approx ^{\varepsilon }_{P}\). These right congruences allow for the development of a learning algorithm for \(\omega \)-regular languages represented by FDFAs, where the FDFA learner can be seen as a procedure to simultaneously run an instance of the DFA learner for each DFA in the FDFA. In the remaining part of this section we first introduce a periodic FDFA learner and then present the learning algorithm for BAs.

5.3 Learning a Family of DFAs

In order to present the periodic FDFA learner, we first need to introduce the observation tables for each internal DFA learner. In this work, we often use FDFA learner as a shorthand for the periodic FDFA learner since we only consider periodic FDFAs. We remark that the FDFA learner introduced in this section is specialized for the periodic FDFAs and differs from the FDFA learner specified in [11, 73] by requiring that the received counterexamples satisfy Definition 9.

Observation Tables for a Family of DFAs. An observation table \(\mathcal {T}\) for the leading DFA learner, called leading table, has the same structure \((U, V, T)\) as the one for the DFA learner presented in Sect. 4.3 except that \(T\) and \(V\) are adapted to handle \(\omega \)-words: \(V\) is a set of decompositions rather than a set of finite words; \(T:(U\cup U\varSigma ) \times V\rightarrow \{ 0, 1 \}\) is still a mapping but the entry value of row u and column (x, y), denoted by \(T(u, (x, y))\), is 1 if \(uxy^{\omega } \in L\) and 0 otherwise. Consider for instance the leading table \(\mathcal {T}\) shown in Fig. 8: we have that \(V= \{ (\varepsilon , ab), (\varepsilon , ba) \}\) is the set of experiments and \(T(a, (\varepsilon , ab)) = 0\) since \(a \cdot \varepsilon \cdot (ab)^{\omega } \notin L\) while \(T(a, (\varepsilon , ba)) = 1\) since \(a \cdot \varepsilon \cdot (ba)^{\omega } \in L\). The \( row \) function remains unchanged, thus we still have \( row :(U\cup U\varSigma ) \rightarrow (V\rightarrow \{ 0, 1 \})\) being a total function such that for each word \(u \in U\cup U\varSigma \), \( row (u) :V\rightarrow \{ 0, 1 \}\) is a total function defined as \( row (u)(x, y) = T(u, (x, y))\) for each \((x, y) \in V\).

For every \(u \in U\) of the leading table \(\mathcal {T}\), there exists an observation table \(\mathcal {T}_{u}\) for the progress DFA learner called progress table. \(\mathcal {T}_{u}\) has the same structure \((U_{u}, V_{u}, T_{u})\) as the one for the DFA learner (cf. Sect. 4.3) except that the entry value of row x and column v, denoted by \(T_{u}(x, v)\), is 1 if \(u \cdot (xv)^{\omega } \in L\) and 0 otherwise. Consider for instance the table \(\mathcal {T}_{a}\) shown in Fig. 8: \(T_{a}(\varepsilon , ba) = 1\) since \(a\cdot (\varepsilon ba)^{\omega } \in L\) while \(T_{a}(\varepsilon , a) = 0\) since \(a\cdot (\varepsilon a)^{\omega } \notin L\).
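To make the two kinds of entries concrete, the following sketch fills a leading-table entry and a progress-table entry, given a decision procedure in_L(u, v) for \(uv^{\omega } \in L\); the oracle, its bounded-prefix implementation for \(L = (ab)^{\omega }\), and all names are our own illustration.

```python
def fill_leading_entry(in_L, u, x, y):
    """Leading table: T(u, (x, y)) = 1 iff u . x . y^omega is in L."""
    return 1 if in_L(u + x, y) else 0

def fill_progress_entry(in_L, u, x, v):
    """Progress table of state u: T_u(x, v) = 1 iff u . (x v)^omega is in L."""
    return 1 if in_L(u, x + v) else 0

def in_L(u, v):
    """Decide u . v^omega = (ab)^omega for the running example L = (ab)^omega.
    Both words are eventually periodic, so comparing prefixes of length
    len(u) + 2 * len(v) (>= |u| + lcm(2, |v|)) is enough."""
    if not v:
        return False
    n = len(u) + 2 * len(v)
    w = (u + v * (n // len(v) + 1))[:n]
    return w == ('ab' * n)[:n]

# Entries of Fig. 8 discussed above:
assert fill_leading_entry(in_L, 'a', '', 'ab') == 0   # a . (ab)^omega not in L
assert fill_leading_entry(in_L, 'a', '', 'ba') == 1   # a . (ba)^omega in L
assert fill_progress_entry(in_L, 'a', '', 'ba') == 1  # a . (ba)^omega in L
assert fill_progress_entry(in_L, 'a', '', 'a') == 0   # a . a^omega not in L
```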

Unless stated otherwise, all remaining notions for the table of a DFA learner can be also directly applied to the leading table and progress tables, such as the DFA construction from a table and the closeness and consistency of a table.

The Learning Procedure of the FDFA Learner. After the introduction of the observation tables for the FDFA learner, we are now ready to give the intuition about how the FDFA learner works by learning the \(\omega \)-regular language \(L= (ab)^{\omega }\) over \(\varSigma = \{ a, b \}\).

As for the DFA learner, at the beginning the FDFA learner has no information so he initializes the components \(U\) and \(V\) of the leading table \(\mathcal {T}\) to \(\{ \varepsilon \}\) and \(\{ (\varepsilon , \varepsilon ) \}\), respectively. Then he turns to fill the content of \(T\), so for each \(u \in U\cup U\varSigma \) and \((x, y) \in V\), he makes a membership query \(\mathrm {MQ}(u \cdot x, y)\) whose answer is stored as \(T(u, (x, y))\); the membership query \(\mathrm {MQ}(f, g)\) is used to ask the FDFA teacher whether the word \(fg^{\omega }\) belongs to \(L\).

Once \(T\) is fully defined, the learner checks whether \(\mathcal {T}\) is closed; if it is not closed, he repeatedly moves rows from the lower part to the upper part, adds the new rows in \(U\varSigma \) as needed, and fills \(T\), as done by the DFA learner (see Sect. 4.3), until \(\mathcal {T}\) becomes closed.

As soon as \(\mathcal {T}\) is closed, the learner constructs the corresponding leading DFA \(M\) and then turns to the progress tables: for each \(u \in U\) of \(\mathcal {T}\), he first creates a progress table \(\mathcal {T}_{u}\) and then initializes both \(U_{u}\) and \(V_{u}\) to \(\{ \varepsilon \}\). For every \(x \in U_{u} \cup U_{u} \varSigma \) and \(v \in V_{u}\), \(T_{u}(x, v)\) is defined according to the result of the membership query \(\mathrm {MQ}(u, xv)\). Then the learner makes sure that each progress table \(\mathcal {T}_{u}\) is closed before constructing the corresponding progress DFA \(A^{u}\). Once all DFAs are constructed, he is ready to pose the first equivalence query \(\mathrm {EQ}(\mathcal {F}_{0})\) for the conjectured \(\mathcal {F}_{0}\) to the FDFA teacher; \(\mathcal {F}_{0}\) is shown in Fig. 9 together with its corresponding tables.

Fig. 9.
figure 9

The initial FDFA \(\mathcal {F}_{0}\) and its corresponding tables while learning \((ab)^{\omega }\)

On receiving \(\mathrm {EQ}(\mathcal {F})\), the teacher has to decide whether the conjectured FDFA \(\mathcal {F}\) is an appropriate periodic FDFA of the target language \(L\). \(\mathcal {F}_{0} = (M, \{ A^{\varepsilon } \})\) is clearly not the right conjecture so she answers “no” and provides a counterexample, say the decomposition \((\varepsilon , ab)\). Note that the counterexample (x, y) returned by the teacher is not just an ultimately periodic word \(xy^{\omega } \in \text {UP}(\mathcal {F}) \ominus \text {UP}(L)\), but it needs to satisfy additional requirements given in the following definition, in order to be useful for the learner to refine the conjectured FDFA.

Definition 9

(Counterexample for the FDFA learner). Let \(L\) be the target language and \(\mathcal {F}\) be the conjectured FDFA. We say that a counterexample (u, v) is

  • positive if (u, v) is not captured by \(\mathcal {F}\) and \(uv^{\omega } \in \text {UP}(L)\), and

  • negative if (u, v) is captured by \(\mathcal {F}\) and \(uv^{\omega } \notin \text {UP}(L)\).

Remark 1

Besides the periodic FDFAs, Angluin and Fisman [11] introduced also the recurrent and the syntactic FDFAs, which make use of a different definition of right congruence. Similarly to the periodic case, also for these two FDFAs it is possible to define positive and negative counterexamples, which are however more involved. We refer the interested reader to [73] for more details on these two other types of FDFAs.

Refinement of the Conjectured FDFA \(\mathcal {F}\). In order to decide which DFA in the conjectured \(\mathcal {F}\) has to be refined, the learner acts differently depending on whether the received counterexample (u, v) is positive or negative.

If (u, v) is a positive counterexample, the learner proceeds as follows: let \(\tilde{u} = M(u)\); if \(\tilde{u} \cdot v^{\omega } \in \text {UP}(L)\), then the progress DFA \(A^{\tilde{u}}\) is refined, otherwise the leading DFA \(M\) is refined. In case (u, v) is a negative counterexample, the learner just acts symmetrically: if \(\tilde{u} \cdot v^{\omega } \in \text {UP}(L)\), then \(M\) is refined, otherwise \(A^{\tilde{u}}\) is refined.

Consider again the conjectured FDFA \(\mathcal {F}\) shown in Fig. 9 and the returned counterexample \((\varepsilon , ab)\): \((\varepsilon , ab)\) is clearly a positive counterexample so the conjectured progress DFA \(A^{\varepsilon }\) has to be refined since \(\tilde{u} = \varepsilon = M(\varepsilon )\) and \(\varepsilon \cdot (ab)^{\omega } \in \text {UP}(L)\).
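Implemented directly, this dispatch rule is a small case analysis; the following sketch assumes lead_run(w) for \(M(w)\) and reuses the in_L stand-in from the earlier table sketch (is_positive and all other names are ours).

```python
def refinement_target(is_positive, lead_run, in_L, u, v):
    """Choose which DFA of the conjectured FDFA to refine on counterexample (u, v):
    returns 'progress' for A^{M(u)} or 'leading' for M, following the rule above."""
    u_tilde = lead_run(u)
    if is_positive:                         # (u, v) positive: uv^omega in UP(L), not captured
        return 'progress' if in_L(u_tilde, v) else 'leading'
    return 'leading' if in_L(u_tilde, v) else 'progress'

# Fig. 9: the positive counterexample (eps, ab) with M(eps) = eps and
# eps . (ab)^omega in L leads to refining the progress DFA A^eps.
assert refinement_target(True, lambda w: '', in_L, '', 'ab') == 'progress'
```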

Refinement of the Progress DFA \(A^{\tilde{u}}\). Assume that (u, v) is a positive counterexample: by definition we have that \(\tilde{u} \cdot v^{\omega } \in \text {UP}(L)\) and \(A^{\tilde{u}}\) has to be refined so to accept v.

The counterexample analysis is similar to Lemma 1 due to the close relation of \(\approx ^{u}_{P}\) with \(\backsim _{R}\): let \(n = |v|\) and for each \(i \in [1 \cdots n]\), let \(s_{i} = A^{\tilde{u}}(v {[1..i]})\) be the state in \(A^{\tilde{u}}\) after reading the first i letters of v; recall that \(s_{0} = \varepsilon \). There exists a sequence of membership queries \(\mathrm {MQ}(\tilde{u}, s_{0} \cdot {v {[1..n]}})\), \(\mathrm {MQ}(\tilde{u}, s_{1} \cdot {v {[2..n]}})\), and so on, up to \(\mathrm {MQ}(\tilde{u}, s_{n} \cdot \varepsilon )\). By assumption we have \(\tilde{u} \cdot (s_{0} \cdot v {[1..n]})^{\omega } \in L\) while \(\tilde{u} \cdot (s_{n} \cdot \varepsilon )^{\omega } \notin L\) due to the fact that (u, v) is not captured by \(\mathcal {F}\) and thus \(s_{n}\) is not an accepting state. Recall that by Definition 8, the accepting set \(F_{\tilde{u}}\) of the progress DFA \(A^{\tilde{u}}\) is the set of equivalence classes \(\{\, [v]_{\approx ^{\tilde{u}}_{P}} \mid \tilde{u} \cdot v^{\omega } \in L \,\}\). Therefore, the learner can find the first experiment \(v {[j+1..n]}\) such that \(\tilde{u} \cdot (s_{j - 1} v{[j]} \cdot v {[j+1..n]})^{\omega } \in L\) while \(\tilde{u} \cdot (s_{j} \cdot v {[j+1..n]})^{\omega } \notin L\); this means that \(s_{j - 1} v{[j]}\) and \(s_{j}\) do not represent the same equivalence class and must be split.
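The search itself is the same linear scan as for the DFA learner, now over ω-word membership queries; the sketch below again assumes an interface progress_run(w) for \(A^{\tilde{u}}(w)\) and the in_L oracle introduced earlier (all names are ours).

```python
def refine_progress_experiment(u_tilde, v, progress_run, in_L):
    """Positive counterexample (u, v) with u_tilde = M(u): return the experiment
    v[j+1..n] that splits s_{j-1} v[j] from s_j in the progress table of u_tilde."""
    n = len(v)
    s = [progress_run(v[:i]) for i in range(n + 1)]     # s[0] == ''
    prev = in_L(u_tilde, s[0] + v)                      # MQ(u_tilde, s_0 . v[1..n]), True here
    for j in range(1, n + 1):
        curr = in_L(u_tilde, s[j] + v[j:])              # MQ(u_tilde, s_j . v[j+1..n])
        if curr != prev:
            return v[j:]                                # the new experiment for V_{u_tilde}
        prev = curr
    raise ValueError("(u, v) does not refine the progress DFA of u_tilde")

# F_0 scenario discussed next: u_tilde = eps, v = ab, A^eps has a single state,
# and the scan yields the experiment b.
assert refine_progress_experiment('', 'ab', lambda w: '', in_L) == 'b'
```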

Consider again the conjectured FDFA \(\mathcal {F}_{0}\) shown in Fig. 9 that has to be refined by means of the positive counterexample \((\varepsilon , ab)\), which requires refining the progress DFA \(A^{\varepsilon }\): from the sequence of membership queries \(\mathrm {MQ}(\varepsilon , \langle s_{0} = \varepsilon \rangle \cdot ab)\), \(\mathrm {MQ}(\varepsilon , \langle s_{1} = \varepsilon \rangle \cdot b)\), and \(\mathrm {MQ}(\varepsilon , \langle s_{2} = \varepsilon \rangle \cdot \varepsilon )\), the learner finds the experiment b distinguishing \(\varepsilon \cdot a\) from \(\varepsilon \). So he first adds b to \(V_{\varepsilon }\) of table \(\mathcal {T}_{\varepsilon }\), then fills the missing entries, makes the table \(\mathcal {T}_{\varepsilon }\) closed, and constructs from \(\mathcal {T}_{\varepsilon }\) a new DFA \(A^{\varepsilon }\), resulting in a new conjectured FDFA \(\mathcal {F}_{1}\), shown in Fig. 10.

Fig. 10.
figure 10

The refined FDFA \(\mathcal {F}_{1}\) and its corresponding tables while learning \((ab)^{\omega }\)

With \(\mathcal {F}_{1}\) at hand, the learner can ask the teacher the equivalence query \(\mathrm {EQ}(\mathcal {F}_{1})\); she answers “no” with, for instance, the counterexample \((\varepsilon , bab)\). According to Definition 9, \((\varepsilon , bab)\) is a negative counterexample, since it is captured by \(\mathcal {F}_{1}\) but clearly \(\varepsilon (bab)^{\omega } \notin L= (ab)^{\omega }\). The learner has to refine again the progress DFA \(A^{\varepsilon }\): after asking the sequence of membership queries \(\mathrm {MQ}(\varepsilon , \langle s_{0} = \varepsilon \rangle \cdot bab)\), \(\mathrm {MQ}(\varepsilon , \langle s_{1} = \varepsilon \rangle \cdot ab)\), \(\mathrm {MQ}(\varepsilon , \langle s_{2} = a \rangle \cdot b)\), and \(\mathrm {MQ}(\varepsilon , \langle s_{3} = ab \rangle \cdot \varepsilon )\), he finds the experiment ab distinguishing \(\varepsilon \cdot b\) from \(\varepsilon \). In general, on receiving a negative counterexample (u, v), the sequence of membership queries has different results for the first query \((\tilde{u}, \varepsilon \cdot v)\) and the last query \((\tilde{u}, A^{\tilde{u}}(v) \cdot \varepsilon )\). This is because \(\tilde{u} \cdot (\varepsilon \cdot v)^{\omega } \notin L\) by assumption while \(\tilde{u} \cdot (A^{\tilde{u}}(v) \cdot \varepsilon )^{\omega } \in L\) since \(A^{\tilde{u}}(v)\) is an accepting state. The learner thus uses the experiment ab to update the table \(\mathcal {T}_{\varepsilon }\) as seen before and constructs a new progress DFA \(A^{\varepsilon }\) out of \(\mathcal {T}_{\varepsilon }\), which are shown in Fig. 11.

Fig. 11.
figure 11

The intermediate FDFA \(\mathcal {F}_{2}\) and its corresponding tables while learning \((ab)^{\omega }\)

The learner is ready to ask the equivalence query \(\mathrm {EQ}(\mathcal {F}_{2})\), obtaining one more time “no” as answer, together with a counterexample, say (a, ab), which is again a negative counterexample. Since \(\tilde{u} = M(a) = \varepsilon \) and \(\varepsilon \cdot (ab)^{\omega } \in L\), the learner this time has to refine the leading DFA \(M\).

Refinement of the Leading DFA \(M\). Assume that the learner has received a negative counterexample (u, v); the case of positive counterexamples is symmetric and thus omitted here. Let \(\tilde{u} = M(u)\); by definition we have \(u v^{\omega } \notin L\) while \(\tilde{u} v^{\omega } \in L\). Let \(n = |u|\) and for every \(i \in [1 \cdots n]\), let \(s_{i} = M(u {[1..i]})\) be the state in \(M\) after reading the first i letters of u. In particular, \(s_{0} = \varepsilon \). As in the previous analyses, there is the sequence of membership queries \(\mathrm {MQ}(s_0 \cdot {u {[1..n]}}, v)\), \(\mathrm {MQ}(s_{1} \cdot {u {[2..n]}}, v)\), and so on, up to \(\mathrm {MQ}(s_{n} \cdot \varepsilon , v)\). This sequence has different results for the first and the last query since \(s_0 \cdot u {[1..n]} \cdot v^{\omega } \notin L\) while \(s_{n} \cdot \varepsilon \cdot v^{\omega } \in L\). Therefore, the learner can find the first experiment \((u {[j+1..n]}, v)\) such that \(s_{j - 1} \cdot u{[j]} \cdot u {[j+1..n]} \cdot v^{\omega } \notin L\) while \(s_{j} \cdot u {[j+1..n]} \cdot v^{\omega } \in L\), which means that the experiment \((u {[j+1..n]}, v)\) can be used to distinguish \(s_{j - 1} \cdot u{[j]}\) from \(s_{j}\).

Consider again the FDFA \(\mathcal {F}_{2}\) shown in Fig. 11 and the negative counterexample (a, ab): the learner finds the experiment \((\varepsilon , ab)\) to distinguish \(\varepsilon \cdot a\) from \(\varepsilon \). As usual, after updating the leading table \(\mathcal {T}\) by adding the experiment \((\varepsilon , ab)\) and closing \(\mathcal {T}\), the learner constructs a new conjectured leading DFA \(M\), which is depicted in Fig. 12. Moreover, for every new state \(u \in U\) of \(\mathcal {T}\), he initializes a new progress table \(\mathcal {T}_{u}\) and builds the corresponding progress DFA \(A^{u}\) as before; see for example the progress table \(\mathcal {T}_{a}\) and the progress DFA \(A^{a}\) depicted in Fig. 12.

The learner asks the teacher whether \(\mathcal {F}_{3}\) is correct. Assume that the teacher answers “no” with the counterexample (bb, ab), which is negative. By following the same procedure as above, he finds the experiment (b, ab) to distinguish \(\varepsilon \cdot b\) from a, which is used to update the leading table \(\mathcal {T}\) with the experiment (b, ab) and to add the new progress DFA \(A^{b}\) for the state b of \(M\), obtaining the FDFA \(\mathcal {F}_{4}\) shown in Fig. 13.

Fig. 12.
figure 12

The intermediate FDFA \(\mathcal {F}_{3}\) and its corresponding tables while learning \((ab)^{\omega }\)

Fig. 13.
figure 13

The FDFA \(\mathcal {F}_{4}\) and its corresponding tables while learning \((ab)^{\omega }\)

By comparing the leading DFA \(M\) in \(\mathcal {F}_{4}\) in Fig. 13 with the one in Fig. 8, we can see that they are the same, so \(M\) is not going to be changed anymore since it is already consistent with the one induced by \(\backsim _{L}\) in Definition 8. However, the progress DFA \(A^{a}\) is still not correct so the teacher answers “no” to the equivalence query \(\mathrm {EQ}(\mathcal {F}_{4})\) posed by the learner. Assume that the teacher returns the counterexample (a, ba), which is positive. The learner then finds the experiment a to refine the progress DFA \(A^{a}\) and finally he generates the new conjectured FDFA \(\mathcal {F}_{5}\) depicted in Fig. 14.

Fig. 14.
figure 14

The intermediate FDFA \(\mathcal {F}_{5}\) and its corresponding tables while learning \((ab)^{\omega }\)

The FDFA \(\mathcal {F}_{5}\) is still not the right conjecture and the teacher answers again “no” to the equivalence query for it. Assume that the returned counterexample is (a, aba), which is clearly negative. As before, the learner refines the progress DFA \(A^{a}\) and gets a new conjectured FDFA \(\mathcal {F}_{6}\) shown in Fig. 15.

The teacher now answers “yes” to the equivalence query \(\mathrm {EQ}(\mathcal {F}_{6})\) and the learner has completed his job.

The FDFA Learner. By means of the previous example, we have introduced informally the \(\omega \)-regular language learning algorithm, which is formalized in Algorithm 2 as the periodic FDFA learner. We can note that the learning procedure we described in the running example follows exactly the steps of the algorithm. In Algorithm 2 we have functions acting on DFAs that are specialized for the leading DFA \(M\) (those with subscript l) and functions specialized for the progress DFAs (those with subscript p). Since for the refinement of the progress DFA \(A^{\tilde{u}}\) the learner does not need the word u but just \(\tilde{u}\), the function \( FindDistinguishingExperiment _{p}\) takes \(\tilde{u}\) as parameter instead of u.

The soundness and completeness of Algorithm 2 are guaranteed by Theorem 5.

Theorem 5

Assume that \(L\) is the target \(\omega \)-regular language. Algorithm 2 terminates and returns a periodic FDFA \(\mathcal {F}\) capturing the set of decompositions \(\{\, (u, v) \in \varSigma ^{*}\times \varSigma ^{+} \mid u v^{\omega } \in L \,\}\).

Fig. 15.
figure 15

The final FDFA \(\mathcal {F}_{6}\) and its corresponding tables while learning \((ab)^{\omega }\)

Clearly, the FDFA \(\mathcal {F}\) returned by Algorithm 2 is a correct conjecture because the teacher has approved it; Algorithm 2 terminates because: (1) we can discover new states, i.e., new equivalence classes in \(\varSigma ^{*}/_{\backsim _{L}}\) or in \(\varSigma ^{*}/_{\approx ^{\tilde{u}}_{P}}\), whenever receiving a counterexample (u, v) from the teacher, where \(\tilde{u} = M(u)\) (cf. [72]); and (2) \(\backsim _{L}\) and \(\approx ^{\tilde{u}}_{P}\) for any \(u \in \varSigma ^{*}\) are all right congruences of finite index (cf. [11]).

figure b

The complexity of Algorithm 2 is stated in Theorem 6; let the length of a decomposition (u, v) be the sum of the lengths of u and v, i.e., \(|(u,v)| = |u| + |v|\).

Theorem 6

Given a target \(\omega \)-regular language \(L\), let n be the sum of the indexes of the right congruences, i.e., \(n = |\mathord {\backsim _{L}}| + \sum _{[u]_{\backsim _{L}} \in \varSigma ^{*}/_{\backsim _{L}}} |\mathord {\approx ^{u}_{P}}|\), and m be the maximum length of any counterexample (uv) returned by the teacher.

  1. Algorithm 2 terminates on receiving at most n counterexamples.

  2. The number of membership queries is in \(\mathcal {O}(n^{2} \cdot |\varSigma | + n \cdot m)\).

The reason why Algorithm 2 terminates on receiving at most n counterexamples is obvious since there are n states in the periodic FDFA of \(L\). The reason why the number of membership queries is in \(\mathcal {O}(n^{2} \cdot |\varSigma | + n \cdot m)\) is the following: let l be the number of states in the leading DFA \(M\) and \(p_{1}, p_{2}, \cdots , p_{l}\) be the number of states in the progress DFAs, respectively; we have \(l + \sum _{i=1}^{l} p_{i} = n\). By Theorem 4, a DFA with \(k_{1}\) states and longest counterexample of length \(k_{2}\) can be learned with a number of membership queries in \(\mathcal {O}(k^{2}_{1} \cdot |\varSigma | + k_{1} \cdot k_{2})\); thus, up to constant factors, we need at most \(l^{2} \cdot |\varSigma | + l \cdot m + \sum _{i=1}^{l} (p_{i}^{2} \cdot |\varSigma | + p_{i} \cdot m)\) membership queries, which is in \(\mathcal {O}(n^{2} \cdot |\varSigma | + n \cdot m)\).
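Spelling out the last step (our own arithmetic, using only that all quantities involved are non-negative):

\(l^{2} \cdot |\varSigma | + l \cdot m + \sum _{i=1}^{l}\bigl (p_{i}^{2} \cdot |\varSigma | + p_{i} \cdot m\bigr ) = \bigl (l^{2} + \sum _{i=1}^{l} p_{i}^{2}\bigr ) \cdot |\varSigma | + \bigl (l + \sum _{i=1}^{l} p_{i}\bigr ) \cdot m \le \bigl (l + \sum _{i=1}^{l} p_{i}\bigr )^{2} \cdot |\varSigma | + n \cdot m = n^{2} \cdot |\varSigma | + n \cdot m\).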

5.4 Learning Büchi Automata

In the previous section we presented the FDFA learning algorithm, which is our secret ingredient in learning a BA: we first learn an FDFA \(\mathcal {F}\) and then transform the learned \(\mathcal {F}\) to a BA. This is just a one-sentence introduction to the BA learning algorithm; there are however several details that need to be concretized in order to get a working algorithm.

Overview of the BA Learning Framework. In the following we begin with an introduction of the framework presented in [73] for learning BAs, as depicted in Fig. 16. In this section, we let the \(\omega \)-regular language \(L\) be the target language and we assume that we already have a BA teacher who knows the language \(L\) and can answer membership and equivalence queries about \(L\). In order to distinguish membership and equivalence queries posed by the FDFA learner and the BA learner, we use a superscript like \(\text {FDFA}\) and \(\text {BA}\) to mark queries from the FDFA learner and the BA learner, respectively. For instance, the membership query \(\mathrm {{MQ}^{FDFA}}(\,\cdot \,)\) is posed by the FDFA learner while \(\mathrm {{MQ}^{BA}}(\,\cdot \,)\) is asked by the BA learner.

The BA learner, shown in Fig. 16 surrounded by the dashed box, has three components, namely the FDFA learner, the component transforming an FDFA \(\mathcal {F}\) to a BA \(B_{\mathcal {F}}\), and the counterexample analysis component. The BA learner first uses the FDFA learner to learn an FDFA \(\mathcal {F}\) by means of membership and equivalence queries. This creates a problem the BA learner has to solve: on the one hand, in order to answer the queries posed by the FDFA learner, the BA learner needs an FDFA teacher to answer membership and equivalence queries about the target periodic FDFA of \(L\); on the other hand, there is only a BA teacher who can answer queries about the target language.

Fig. 16.
figure 16

Overview of the BA learning framework based on FDFA learning

In this situation, the BA learner acts as an interface between the BA teacher and the FDFA learner and tries to pretend to be an FDFA teacher when he has to answer the queries from the FDFA learner. In other words, the BA learner becomes the FDFA teacher by interacting with the BA teacher. To that end, the FDFA teacher answers a membership query \(\mathrm {{MQ}^{FDFA}}(u, v)\) by simply forwarding to the FDFA learner the answer to the membership query \(\mathrm {{MQ}^{BA}}(uv^{\omega })\) obtained from the BA teacher, which is trivial.

It is, however, more tricky for the FDFA teacher to answer an equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F})\) posed by the FDFA learner. The FDFA teacher first needs to transform the conjectured FDFA \(\mathcal {F}\) to a BA \(B_{\mathcal {F}}\) and then poses the equivalence query \(\mathrm {{EQ}^{BA}}(B_{\mathcal {F}})\) to the BA teacher. If the BA teacher answers “yes”, the BA learner first receives the answer and then outputs the BA \(B_{\mathcal {F}}\) as he has completed the learning task. Otherwise the BA teacher returns “no” together with a counterexample \(uv^{\omega }\) given as a decomposition (uv). The BA learner then performs the counterexample analysis and, by acting as an FDFA teacher, he feeds the FDFA learner with a valid decomposition \((u', v')\) which satisfies Definition 9, so that the FDFA learner can further refine the current FDFA \(\mathcal {F}\).
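A sketch of this mediation, with the BA teacher's queries, the FDFA-to-BA construction and the counterexample analysis abstracted as function parameters (all interface names are ours, not the framework's API):

```python
class FDFATeacherViaBA:
    """The BA learner impersonating an FDFA teacher, as described above.
    ba_mq((u, v)) and ba_eq(ba) stand for the BA teacher's queries about u.v^omega,
    to_ba(fdfa) for the FDFA-to-BA construction, and analyze(cex, fdfa) for the
    counterexample analysis; all four are assumed interfaces."""

    def __init__(self, ba_mq, ba_eq, to_ba, analyze):
        self.ba_mq, self.ba_eq = ba_mq, ba_eq
        self.to_ba, self.analyze = to_ba, analyze

    def mq(self, u, v):
        # MQ^FDFA(u, v) is answered by forwarding MQ^BA(u v^omega)
        return self.ba_mq((u, v))

    def eq(self, fdfa):
        # EQ^FDFA(F): build B_F, ask EQ^BA(B_F), adapt the counterexample
        b_f = self.to_ba(fdfa)
        correct, cex = self.ba_eq(b_f)
        if correct:
            return True, None
        u_prime, v_prime = self.analyze(cex, fdfa)   # valid w.r.t. Definition 9
        return False, (u_prime, v_prime)
```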

Note that in Fig. 16 there is a dashed arrow labeled with \(\mathcal {F}\) entering the counterexample analysis block: it indicates the fact that the FDFA teacher needs to use the current conjectured FDFA \(\mathcal {F}\) in the analysis of the counterexample, as we will see later. We want to remark that, according to Fig. 16, the BA teacher is oblivious of the FDFA learner, since she only sees a BA learner interacting with her and similarly, the FDFA learner does not know that there is a BA teacher since it is the FDFA teacher that is answering his queries.

From the framework depicted in Fig. 16, we get the rough idea about how to build a BA learner out of an FDFA learner. Yet there are still a few details we have to sort out:

  • How can we transform an FDFA \(\mathcal {F}\) to a BA \(B_{\mathcal {F}}\)?

  • How can we get a valid counterexample \((u', v')\) for the FDFA learner out of a counterexample (u, v) returned by the BA teacher?

The answers to the above questions are the missing bricks we need to build a BA learner based on an FDFA learner. In the following, we first answer the question on how to do the transformation from an FDFA to a BA and then introduce the counterexample analysis through an example.

From FDFA \(\mathcal {F}\) to BA \(B_{\mathcal {F}}\). Assume that we want to learn a BA which accepts the \(\omega \)-regular language \(L= (ab)^{\omega }\) over \(\varSigma = \{ a, b \}\). To that end, the BA learner first initializes an FDFA learner which constructs the initial conjectured FDFA \(\mathcal {F}_{0}\) as depicted in Fig. 9 via membership queries. On receiving the conjectured FDFA \(\mathcal {F}_{0}\), the BA learner has to construct a BA \(B_{\mathcal {F}_{0}}\) from \(\mathcal {F}_{0}\) which we illustrate in the following.

To answer an equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F})\), the BA learner first needs to convert \(\mathcal {F}\) into a BA \(B_{\mathcal {F}}\) in order to exploit the BA teacher to answer the query. The first question one may ask in doing this is:

  • Is it possible to construct a precise BA \(B_{\mathcal {F}}\) for each given FDFA \(\mathcal {F}\) such that \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) = \text {UP}(\mathcal {F})\)?

The answer is actually no, as the following example shows.

Fig. 17.
figure 17

An FDFA \(\mathcal {F}\) such that \(\text {UP}(\mathcal {F})\) does not characterize an \(\omega \)-regular language

Example 3

(Non-regular \(\omega \)-language accepted by an FDFA [73]). Consider the FDFA \(\mathcal {F}\) depicted in Fig. 17 where \(\text {UP}(\mathcal {F}) = \bigcup _{n = 0}^{\infty } \{ a, b \}^{*} \cdot (ab^{n})^{\omega }\). Assume that \(\text {UP}(\mathcal {F})\) characterizes an \(\omega \)-regular language \(L\). It is claimed in [11] that for every \(\omega \)-regular language, there exists a periodic FDFA recognizing it and the index of each right congruence of the periodic FDFA is finite. Therefore we let \(\mathcal {F}{'}\) be the periodic FDFA of \(L\) and we know that the right congruence \(\approx ^{\varepsilon }_{P}\) of \(\mathcal {F}{'}\) is of finite index. However, we can show that the right congruence \(\approx ^{\varepsilon }_{P}\) of \(\mathcal {F}{'}\) has to be of infinite index. Observe that \(ab^{k} \not \approx ^{\varepsilon }_{P} ab^{j}\) for any \(k, j \ge 1\) and \(k \ne j\), since \(\varepsilon \cdot (ab^{k} \cdot ab^{k})^{\omega } \in \text {UP}(\mathcal {F})\) and \(\varepsilon \cdot (ab^{j} \cdot ab^{k})^{\omega } \notin \text {UP}(\mathcal {F})\) according to Definition 8. It follows that \(\approx ^{\varepsilon }_{P}\) is of infinite index. Contradiction. Thus we conclude that \(\text {UP}(\mathcal {F})\) cannot characterize an \(\omega \)-regular language.

Therefore, in general, one can not construct a BA \(B_{\mathcal {F}}\) from an FDFA \(\mathcal {F}\) such that \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) = \text {UP}(\mathcal {F})\). The authors of [73] suggested two BA constructions to approximate the set of ultimately periodic words \(\text {UP}(\mathcal {F})\): the under-approximation and the over-approximation construction. In this work, we only introduce the under-approximation construction from [73], which produces a BA \(B_{\mathcal {F}}\) that under-approximates \(\text {UP}(\mathcal {F})\), i.e., \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) \subseteq \text {UP}(\mathcal {F})\). This construction was originally proposed by Calbrix et al. in [26].

We first give the main idea behind the under-approximation method and then give its formal definition. Let \(\mathcal {F}\) be the FDFA \(\mathcal {F}= (M, \{ A^{u} \})\) with \(M= (Q, \bar{q}, \delta , \emptyset )\) and \(A^{u} = (Q_{u}, s_{u}, \delta _{u}, F_{u})\) for each \(u \in Q\). Let \(M^{s}_{v} = (Q, s, \delta , \{ v \})\) and \((A^{u})^{s}_{v} = (Q_{u}, s, \delta _{u}, \{ v \})\) be the DFAs obtained from \(M\) and \(A^{u}\) by setting their initial state and accepting states to s and \(\{ v \}\), respectively. We define \(N_{(u,v)} = \{\, v^{\omega } \mid M(uv) = M(u) \wedge v \in \mathcal {L}((A^{u})^{s_{u}}_{v}) \,\}\), which contains the \(\omega \)-words \(v^{\omega }\) for those \(v \in \mathcal {L}((A^{u})^{s_{u}}_{v})\) such that \(u = M(u) = M(uv)\). Recall that we use words u and v to represent the states in the DFAs. Therefore, according to the acceptance condition of FDFAs in Definition 6, we have that \(\text {UP}(\mathcal {F}) = \bigcup _{u \in Q, v \in F_{u}} \mathcal {L}(M^{\bar{q}}_{u}) \cdot N_{(u,v)}\) where \(\mathcal {L}(M^{\bar{q}}_{u})\) contains the set of finite prefixes and \(N_{(u,v)}\) the \(\omega \)-powers of the finite periodic words, for every state pair (u, v).

We construct \(B_{\mathcal {F}}\) by approximating the set \(N_{(u,v)}\), i.e., the set of finite periodic words. We first define the FA \(P_{(u,v)} = (Q_{(u,v)}, s_{(u,v)}, \delta _{(u,v)}, \{ f_{(u,v)} \}) = M^{u}_{u} \times (A^{u})^{s_{u}}_{v} \times (A^{u})^{v}_{v}\) and let \(\underline{N}_{(u,v)} = \mathcal {L}(P_{(u,v)})^{\omega }\). Recall that the notation \(\times \) here is the intersection operation of FAs. Then we can construct the BA \((Q_{(u,v)} \cup \{ f \}, s_{(u,v)}, \delta _{(u,v)} \cup \delta _{f}, \{ f \})\) recognizing \(\underline{N}_{(u,v)}\) where f is a “fresh” state and \(\delta _{f} = \{ (f, \varepsilon , s_{(u,v)}), (f_{(u, v)}, \varepsilon , f) \}\). Note that \(\varepsilon \) transitions can be taken without consuming any letters and can be removed by standard methods in automata theory, see, e.g., [56]. Intuitively, we under-approximate the set \(N_{(u,v)}\) as \(\underline{N}_{(u,v)}\) by only keeping \(v^{\omega } \in N_{(u,v)}\) if \(A^{u} (v) = A^{u} (v \cdot v)\) where \(v \in \varSigma ^{+}\).

In Definition 10 we provide the construction procedure for a BA \(B_{\mathcal {F}}\) such that \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) = \bigcup _{u \in Q, v \in F_{u}} \mathcal {L}(M^{\bar{q}}_{u}) \cdot \underline{N}_{(u,v)} = \bigcup _{u \in Q, v \in F_{u}} \mathcal {L}(M^{\bar{q}}_{u}) \cdot (\mathcal {L}(P_{(u,v)}))^{\omega }\), as originally proposed in [26].

Definition 10

([73]). Let \(\mathcal {F}= (M, \{ A^{u} \})\) be an FDFA where \(M= (Q, \bar{q}, \delta , \emptyset )\) and \(A^{u} = (Q_{u}, s_{u}, \delta _{u}, F_{u})\) for each \(u \in Q\). Let \((Q_{(u,v)}, s_{(u,v)}, \delta _{(u,v)}, \{ f_{(u,v)} \})\) be a BA recognizing \(\underline{N}_{(u,v)}\). Then the BA \(B_{\mathcal {F}}\) is defined as the tuple \(B_{\mathcal {F}}= (Q_{B_{\mathcal {F}}}, \bar{q}_{B_{\mathcal {F}}}, \delta _{B_{\mathcal {F}}}, F_{B_{\mathcal {F}}})\) where

  • \(Q_{B_{\mathcal {F}}} = Q\cup \bigcup \limits _{ u \in Q, v \in F_{u}} Q_{(u,v)}\),

  • \(\bar{q}_{B_{\mathcal {F}}} = \bar{q}\),

  • \(\delta _{B_{\mathcal {F}}} = \delta \cup \bigcup \limits _{u \in Q, v \in F_{u}} \delta _{(u,v)} \cup \bigcup \limits _{u \in Q, v \in F_{u}} \{ (u, \varepsilon , s_{(u,v)}) \}\), and

  • \(F_{B_{\mathcal {F}}} = \bigcup \limits _{u \in Q, v \in F_{u}} \{ f_{(u, v)} \}\)

Intuitively, we connect the leading DFA \(M\) to the BA recognizing \(\underline{N}_{(u,v)}\) by linking the state u of \(M\) and the initial state \(s_{(u,v)}\) of the BA with an \(\varepsilon \)-transition for every state pair (u, v) where \(v \in F_{u}\).
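The only non-standard ingredient in this construction is the product DFA \(P_{(u,v)} = M^{u}_{u} \times (A^{u})^{s_{u}}_{v} \times (A^{u})^{v}_{v}\); as a reminder of how such a product is computed, here is a generic sketch over complete DFAs represented as (states, initial, delta, finals) tuples (this representation is our own choice).

```python
def dfa_product(*dfas):
    """Product of complete DFAs sharing one alphabet; delta is a dict
    (state, letter) -> state.  The product accepts the intersection of the
    component languages, e.g. L(M^u_u), L((A^u)^{s_u}_v) and L((A^u)^v_v)."""
    alphabet = {a for _, _, delta, _ in dfas for (_, a) in delta}
    init = tuple(d[1] for d in dfas)
    states, delta, todo = {init}, {}, [init]
    while todo:
        qs = todo.pop()
        for a in alphabet:
            nxt = tuple(d[2][(q, a)] for d, q in zip(dfas, qs))
            delta[(qs, a)] = nxt
            if nxt not in states:
                states.add(nxt)
                todo.append(nxt)
    finals = {qs for qs in states if all(q in d[3] for d, q in zip(dfas, qs))}
    return states, init, delta, finals
```

Definition 10 then glues these products into \(B_{\mathcal {F}}\) by means of the fresh accepting states and the \(\varepsilon \)-transitions described above.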

We now present Lemma 2 which is used later for the counterexample analysis.

Lemma 2

(cf. [73, Lemma 4]). Let \(\mathcal {F}\) be an FDFA, and \(B_{\mathcal {F}}\) be the BA constructed from \(\mathcal {F}\) according to Definition 10. If \((u, v^{k})\) is accepted by \(\mathcal {F}\) for every \(k \ge 1\), then \(uv^{\omega } \in \text {UP}(\mathcal {L}(B_{\mathcal {F}}))\).

The following theorem is the main result of our BA construction. We refer the interested reader to [72] for the proofs of Lemma 2 and Theorem 7.

Theorem 7

(cf. [73, Lemma 3]). Let \(\mathcal {F}\) be the current conjectured FDFA and \(B_{\mathcal {F}}\) be the BA constructed from \(\mathcal {F}\) according to Definition 10. Let n and k be the number of states in the leading DFA and the largest progress DFA of \(\mathcal {F}\), respectively. Then

  • the number of states in \(B_{\mathcal {F}}\) is in \(\mathcal {O}(n^{2} k^{3})\);

  • \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) \subseteq \text {UP}(\mathcal {F})\);

  • \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) = \text {UP}(\mathcal {F})\) if \(\mathcal {F}\) is the periodic FDFA accepting \(\text {UP}(\mathcal {F})\).

For instance, the initial BA \(B_{\mathcal {F}_{0}}\) constructed from the FDFA \(\mathcal {F}_{0}\) shown in Fig. 9 is depicted in Fig. 18. No set of states \(Q_{(u, v)}\) is added to \(B_{\mathcal {F}_{0}}\) in Definition 10 since \(F_{\varepsilon }\) of \(A^{\varepsilon }\) in \(\mathcal {F}_{0}\) is empty.

Fig. 18.
figure 18

The BA \(B_{\mathcal {F}_{0}}\) constructed for answering the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{0})\), with \(\mathcal {F}_{0}\) shown in Fig. 9

The BA \(B_{\mathcal {F}_{0}}\) is clearly not a right conjecture so the BA teacher answers “no” for the equivalence query \(\mathrm {{EQ}^{BA}}(B_{\mathcal {F}_{0}})\) together with a counterexample, say \((ab)^{\omega } \in \mathcal {L}(B_{\mathcal {F}_{0}}) \ominus L\), given by the decomposition \((\varepsilon , ab)\). Since the counterexample \((\varepsilon , ab)\) from the BA teacher is a positive counterexample for the FDFA learner according to Definition 9, the FDFA teacher, impersonated by the BA learner, just sets \((u', v')\) to be \((\varepsilon , ab)\) in the counterexample analysis and returns it to the FDFA learner as the counterexample accompanying the “no” answer. We remark that if the BA learner applies the counterexample analysis to the valid positive counterexample \((\varepsilon , ab)\) for the FDFA learner, the procedure also outputs a valid positive counterexample which satisfies Definition 9. In other words, in practice the counterexample analysis on the received counterexample directly generates a valid counterexample, so the BA learner does not have to decide whether the counterexample received from the BA teacher is valid for the FDFA learner.

Since he has received a negative answer with a counterexample for the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{0})\), the FDFA learner refines the current FDFA \(\mathcal {F}_{0}\) according to the received counterexample, as we have seen in Sect. 5.3, and then poses the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{1})\) for the new conjectured FDFA \(\mathcal {F}_{1}\), which is shown in Fig. 10.

The BA learner then builds from \(\mathcal {F}_{1}\) the under-approximation BA \(B_{\mathcal {F}_{1}}\), which is depicted in Fig. 19: observe that the \(\omega \)-word \((bab)^{\omega } \in \mathcal {L}(B_{\mathcal {F}_{1}})\) is accepted by \(\mathcal {F}_{1}\) since the decomposition \((\varepsilon , bab)\) is accepted by \(\mathcal {F}_{1}\); the BA \(\underline{N}_{(\varepsilon , ab)}\) is defined as the DFA \(M^{\varepsilon }_{\varepsilon } \times (A^{\varepsilon })^{\varepsilon }_{ab} \times (A^{\varepsilon })^{ab}_{ab}\) augmented with an extra state f and it is shown in the dashed box in Fig. 19.

Fig. 19.
figure 19

The BA \(B_{\mathcal {F}_{1}}\) constructed for answering the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{1})\), with \(\mathcal {F}_{1}\) shown in Fig. 10

Again, the conjectured BA \(B_{\mathcal {F}_{1}}\) is not the right conjecture. The BA teacher answers the equivalence query for the BA \(B_{\mathcal {F}_{1}}\) with “no” and, say, the counterexample \((bab)^{\omega } \in \mathcal {L}(B_{\mathcal {F}_{1}}) \ominus L\), given as the decomposition (b, abb).

The counterexample (b, abb) is however not a valid counterexample for the FDFA learner according to Definition 9, since \((bab)^{\omega } \notin \text {UP}(L)\) and (b, abb) is not captured by the current FDFA \(\mathcal {F}_{1}\). Suppose that the BA learner feeds the FDFA learner with the counterexample (b, abb); it is easy to verify that he is not able to identify new states with the help of (b, abb), so he is going to conjecture again the FDFA \(\mathcal {F}_{1}\). Therefore, if the FDFA learner repeatedly poses the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{1})\) for \(\mathcal {F}_{1}\) and the BA teacher always answers (b, abb), the learning procedure is going to get stuck in an infinite loop. This motivates the need for the counterexample analysis, which ensures that the counterexample returned by the BA teacher can be adapted to a valid counterexample for the FDFA learner which allows him to refine the conjectured FDFA.

Counterexample Analysis for the FDFA Teacher. In order to ensure the termination of the learning procedure, the BA learner has to execute the counterexample analysis so to get a valid counterexample for the FDFA learner out of (b, abb).

To distinguish the different counterexamples from the different teachers, we define the counterexample from the BA teacher as follows.

Definition 11

(Counterexample for the FDFA teacher). Let \(L\) be the target \(\omega \)-regular language and \(B_{\mathcal {F}}\) be the current conjectured BA. We say a counterexample (u, v) with \(uv^{\omega } \in \mathcal {L}(B_{\mathcal {F}}) \ominus L\) is

  • positive if \(uv^{\omega } \in L\) and \(uv^{\omega } \notin \mathcal {L}(B_{\mathcal {F}})\), and

  • negative if \(uv^{\omega } \notin L\) and \(uv^{\omega } \in \mathcal {L}(B_{\mathcal {F}})\).

This is a symmetric definition when compared with the counterexample for the FDFA learner given in Definition 9.

We call a positive counterexample \(uv^{\omega }\) spurious if \(uv^{\omega } \in \text {UP}(\mathcal {F})\). A spurious positive counterexample (u, v) witnesses that \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) \subset \text {UP}(\mathcal {F})\) holds; the reason is the following: according to Theorem 7, we have \(\text {UP}(\mathcal {L}(B_{\mathcal {F}})) \subseteq \text {UP}(\mathcal {F})\); by the definition of positive counterexample for the FDFA teacher, we have \(uv^{\omega } \notin \text {UP}(\mathcal {L}(B_{\mathcal {F}}))\) yet \(uv^{\omega } \in \text {UP}(\mathcal {F})\) holds.

Fig. 20.
figure 20

The cases for the counterexample analysis

In order to analyze the counterexample (u, v), it is useful to know how the received counterexample relates to the conjectured FDFA \(\mathcal {F}\). For instance, it may be that \(\mathcal {F}\) captures (u, v) but \(uv^{\omega }\) is not in the target language, so (u, v) should be rejected; symmetrically, we can have that \(\mathcal {F}\) rejects any decomposition of \(uv^{\omega }\) but \(uv^{\omega }\) is in the target language, so at least the decomposition (u, v) should be captured; moreover, it may be that \(\mathcal {F}\) rejects just (u, v) while capturing another decomposition \((u', v')\) of \(uv^{\omega }\), but \(B_{\mathcal {F}}\) does not accept \((u', v')\). To distinguish the above cases, which are shown in Fig. 20, we use three DFAs accepting different decompositions of \(uv^{\omega }\).

Let \(\mathcal {F}= (M, \{ A^{u} \})\) be the current conjectured FDFA. In order to analyze the counterexample \((u, v)\) for the FDFA teacher, we define the following three DFAs, where \({\$}\) is a letter not in \(\varSigma \).

  • a DFA \(D_{\$}\) with \(\mathcal {L}(D_{\$}) = \{\, u' {\$} v' \mid u' \in \varSigma ^{*}, v' \in \varSigma ^{+}, u'v'^{\omega } = uv^{\omega } \,\}\),

  • a DFA \(\mathcal {D}_1\) with \(\mathcal {L}(\mathcal {D}_1) = \{\, u' {\$} v' \mid u' \in \varSigma ^{*}, v' \in \varSigma ^{*}, v' \in \mathcal {L}(A^{M(u')}) \,\}\), and

  • a DFA \(\mathcal {D}_2\) with \(\mathcal {L}(\mathcal {D}_2) = \{\, u' {\$} v' \mid u' \in \varSigma ^{*}, v' \in \varSigma ^{*}, v' \notin \mathcal {L}(A^{M(u')}) \,\}\).

Intuitively, \(D_{\$}\) accepts every possible decomposition \((u', v')\) of \(uv^{\omega }\); \(\mathcal {D}_1\) recognizes every decomposition \((u', v')\) which is captured by \(\mathcal {F}\); and \(\mathcal {D}_2\) accepts every decomposition \((u', v')\) which is not captured by \(\mathcal {F}\).

The DFA \(D_{\$}\) can be constructed according to the procedure presented in [26, 72]; the DFA \(\mathcal {D}_1\) can be obtained from \(\mathcal {F}\) by simply connecting each state u of \(M\) to the initial state \(s_{u}\) of \(A^{u}\) via the letter \({\$}\); finally, the construction of the DFA \(\mathcal {D}_2\) is similar to the one of the DFA \(\mathcal {D}_1\), except that we use the complement FDFA \(\mathcal {F}^{\mathcal {C}}\) of \(\mathcal {F}\) instead of \(\mathcal {F}\). Note that the DFAs \(\mathcal {D}_1\) and \(\mathcal {D}_2\) in this section are specialized for the periodic FDFA and differ from those defined for all three kinds of FDFAs in [73]. We refer the interested reader to [73] for more details on the different constructions of \(\mathcal {D}_1\) and \(\mathcal {D}_2\) for the recurrent and syntactic FDFAs. A construction of \(\mathcal {D}_1\) is sketched below.
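
The following is a minimal sketch of the construction of \(\mathcal {D}_1\) for a periodic FDFA, assuming a simple dictionary-based DFA representation (fields states, init, acc, and a partial transition function delta); it is only an illustration, not the implementation of [73]. The DFA \(\mathcal {D}_2\) can be obtained in the same way, for instance by first swapping accepting and non-accepting states of every (complete) progress DFA.

```python
# A sketch (not the construction of [73]) of the DFA D_1 for a periodic FDFA
# F = (M, {A^u}): a word u'$v' is accepted iff v' is in L(A^{M(u')}).
# DFAs are dicts {"states", "init", "acc", "delta"} with a partial transition
# function delta[(state, letter)] = state'; missing transitions mean rejection.

def build_d1(m_dfa, progress, dollar="$"):
    """m_dfa: the leading DFA M; progress: maps every state u of M to A^u."""
    states, acc, delta = set(), set(), {}
    for q in m_dfa["states"]:                       # copy M, tagging its states
        states.add(("M", q))
    for (q, a), q2 in m_dfa["delta"].items():
        delta[(("M", q), a)] = ("M", q2)
    for u, a_u in progress.items():                 # copy every progress DFA A^u
        for p in a_u["states"]:
            states.add((u, p))
            if p in a_u["acc"]:
                acc.add((u, p))
        for (p, a), p2 in a_u["delta"].items():
            delta[((u, p), a)] = (u, p2)
        # the letter $ jumps from the state u of M to the initial state of A^u
        delta[(("M", u), dollar)] = (u, a_u["init"])
    return {"states": states, "init": ("M", m_dfa["init"]),
            "acc": acc, "delta": delta}
```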

The analysis for the returned counterexample has first to identify the kind of counterexample it is analyzing and then to deal with it accordingly. More concretely, the three kinds of counterexamples, shown in Fig. 20 by means of different shapes, are the following:

  • case U1: \(uv^{\omega } \in \text {UP}(L)\) while \(uv^{\omega } \notin \text {UP}(\mathcal {F})\) (square). The \(\omega \)-word \(uv^{\omega }\) is a positive counterexample for the FDFA teacher since \(uv^{\omega } \notin \text {UP}(\mathcal {L}(B_{\mathcal {F}}))\). On the one hand, \(uv^{\omega } \in \text {UP}(L)\) holds, so all decompositions of \(uv^{\omega }\) should be accepted by \(\mathcal {F}\); on the other hand, \(uv^{\omega } \notin \text {UP}(\mathcal {F})\) holds, which indicates that no decomposition of \(uv^{\omega }\) is accepted by \(\mathcal {F}\). Therefore, in order to further refine \(\mathcal {F}\), the FDFA learner needs to make \(\mathcal {F}\) accept at least one decomposition \((u', v')\) of \(uv^{\omega }\). That is, the analysis has to find a valid positive counterexample \((u', v')\) out of \(uv^{\omega }\) for the FDFA learner such that \(v' \notin \mathcal {L}(A^{M(u')})\). This can be easily done by taking a word \(u' {\$} v' \in \mathcal {L}(D_{\$}) \cap \mathcal {L}(\mathcal {D}_2)\), since the DFA \(D_{\$}\) accepts every decomposition of \(uv^{\omega }\) and \(\mathcal {D}_2\) accepts all decompositions which are not captured by \(\mathcal {F}\). Therefore, a word \(u' {\$} v' \in \mathcal {L}(D_{\$}) \cap \mathcal {L}(\mathcal {D}_2)\) is a valid positive counterexample for the FDFA learner. Note that the analysis looks for a decomposition \((u', v')\) which is not captured by \(\mathcal {F}\) rather than for one which is not accepted by \(\mathcal {F}\): by Definition 6, a decomposition \((u', v')\) not captured by \(\mathcal {F}\) is in particular not accepted by \(\mathcal {F}\). We refer the interested reader to [72, Appendix D.3] for more details on case U1.

  • case U2: \(uv^{\omega } \notin \text {UP}(L)\) while \(uv^{\omega } \in \text {UP}(\mathcal {F})\) (circle). The \(\omega \)-word \(uv^{\omega }\) is a negative counterexample for the FDFA teacher since \(uv^{\omega } \in \text {UP}(\mathcal {L}(B_{\mathcal {F}}))\). Since \(uv^{\omega } \notin \text {UP}(L)\), the FDFA \(\mathcal {F}\) should reject every decomposition of \(uv^{\omega }\); in other words, in order to further refine \(\mathcal {F}\), the FDFA learner needs to make \(\mathcal {F}\) no longer capture some decomposition \((u', v')\) of \(uv^{\omega }\) that it currently captures. Therefore, the analysis needs to find a valid negative counterexample \((u', v')\) out of \(uv^{\omega }\) for the FDFA learner, i.e., a decomposition that is currently captured by \(\mathcal {F}\). This can be done by taking a word \(u' {\$} v' \in \mathcal {L}(D_{\$}) \cap \mathcal {L}(\mathcal {D}_1)\).

  • case U3: \(uv^{\omega } \in \text {UP}(L)\) and \(uv^{\omega } \in \text {UP}(\mathcal {F})\) (diamond). The \(\omega \)-word \(uv^{\omega }\) is a spurious positive counterexample since \(uv^{\omega } \in \text {UP}(\mathcal {F})\) but \(uv^{\omega } \notin \text {UP}(\mathcal {L}(B_{\mathcal {F}}))\). On the one hand, this case is quite similar to case U1, since \(uv^{\omega } \in \text {UP}(L)\) and normally we should make \(\mathcal {F}\) accept \(uv^{\omega }\); on the other hand, differently from U1, \(\mathcal {F}\) already accepts at least one decomposition of \(uv^{\omega }\) in this case. Suppose that the decomposition \((x, y)\) of \(uv^{\omega }\) is accepted by \(\mathcal {F}\); according to Lemma 2, there must exist some \(k\ge 1\) such that \((x, y^{k})\) is not accepted by \(\mathcal {F}\), since otherwise we would have \(uv^{\omega } \in \mathcal {L}(B_{\mathcal {F}})\). Therefore, similarly to case U1, the analysis can find a valid positive counterexample \((u', v')\) out of \(uv^{\omega }\) for the FDFA learner such that \(v' \notin \mathcal {L}(A^{M(u')})\); the word \(u' {\$} v'\) can again be taken from \(\mathcal {L}(D_{\$}) \cap \mathcal {L}(\mathcal {D}_2)\). We refer the interested reader to [72, Appendix D.3] for more details on case U3.

We remark that from an implementation point of view, checking whether \(uv^{\omega } \in \text {UP}(L)\) can be replaced by a membership query \(\mathrm {{MQ}^{BA}}(u, v)\) while testing whether \(uv^{\omega } \in \text {UP}(\mathcal {F})\) reduces to checking whether \(u {\$} v \in \mathcal {L}(\mathcal {D}_1)\).
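
A possible implementation of this case analysis is sketched below, assuming the dictionary-based DFA representation used above; the membership query \(\mathrm {{MQ}^{BA}}\) is passed in as a callable and the decompositions are found by a breadth-first search on the product of the two DFAs. This is only an illustration of cases U1-U3, not the actual procedure of [72, 73].

```python
# A sketch of the counterexample analysis (cases U1-U3); mq_ba(u, v) is an
# assumed callable answering whether u v^omega belongs to UP(L).
from collections import deque

def dfa_accepts(d, word):
    q = d["init"]
    for a in word:
        q = d["delta"].get((q, a))
        if q is None:
            return False
    return q in d["acc"]

def common_word(d_a, d_b, alphabet):
    """Some word in L(d_a) and L(d_b), found by BFS on the product, or None."""
    start = (d_a["init"], d_b["init"])
    seen, queue = {start}, deque([(start, "")])
    while queue:
        (qa, qb), w = queue.popleft()
        if qa in d_a["acc"] and qb in d_b["acc"]:
            return w
        for a in alphabet:
            na, nb = d_a["delta"].get((qa, a)), d_b["delta"].get((qb, a))
            if na is not None and nb is not None and (na, nb) not in seen:
                seen.add((na, nb))
                queue.append(((na, nb), w + a))
    return None

def analyze_ce(u, v, mq_ba, d_dollar, d1, d2, alphabet):
    """Adapt the counterexample (u, v) from the BA teacher into a valid one
    for the FDFA learner; the returned word has the shape u'$v'."""
    ext = list(alphabet) + ["$"]
    in_L = mq_ba(u, v)                        # u v^omega in UP(L)?
    in_F = dfa_accepts(d1, u + "$" + v)       # u v^omega captured by F? (cf. the remark above)
    if in_L and not in_F:                     # case U1
        return "positive", common_word(d_dollar, d2, ext)
    if not in_L and in_F:                     # case U2
        return "negative", common_word(d_dollar, d1, ext)
    if in_L and in_F:                         # case U3: spurious positive counterexample
        return "positive", common_word(d_dollar, d2, ext)
    raise ValueError("(u, v) is not a counterexample")
```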

Fig. 21. The three DFAs \(D_{\$}\), \(\mathcal {D}_1\), and \(\mathcal {D}_2\) for the FDFA \(\mathcal {F}_{1}\) from Fig. 10 and the counterexample (b, abb)

Consider again the conjectured FDFA \(\mathcal {F}_{1}\) and the counterexample (b, abb) returned by the BA teacher, which is a negative counterexample since \(b \cdot (abb)^{\omega } \in \mathcal {L}(B_{\mathcal {F}_{1}})\) and \(b \cdot (abb)^{\omega } \notin L\). The counterexample (b, abb) falls into case U2 since \(b \cdot (abb)^{\omega } \notin \text {UP}(L)\) while \(b \cdot (abb)^{\omega } \in \text {UP}(\mathcal {F}_{1})\). In order to analyze it, the BA learner needs to construct the DFAs \(D_{\$}\) and \(\mathcal {D}_1\), which are depicted in Fig. 21. For completeness of presentation, Fig. 21 also shows the DFA \(\mathcal {D}_2\), so as to allow the reader to compare it with \(\mathcal {D}_1\). Assume that the BA learner gets the word \(\varepsilon {\$} bab \in \mathcal {L}(D_{\$}) \cap \mathcal {L}(\mathcal {D}_1)\), which gives a negative counterexample \((\varepsilon , bab)\) for the FDFA learner.

As seen in Sect. 5.3, with the negative counterexample \((\varepsilon , bab)\) at hand, the FDFA learner is able to get the refined conjecture \(\mathcal {F}_{2}\) depicted in Fig. 11. To answer the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{2})\), the BA learner has again to construct a BA \(B_{\mathcal {F}_{2}}\), depicted in Fig. 22, from \(\mathcal {F}_{2}\); then he poses the equivalence query \(\mathrm {{EQ}^{BA}}(B_{\mathcal {F}_{2}})\) to the BA teacher. Note that we have \(\text {UP}(\mathcal {F}_{2}) = \text {UP}(\mathcal {L}(B_{\mathcal {F}_{2}})) = \{ a, b \}^{*} (ab)^{\omega }\) for the constructed BA \(B_{\mathcal {F}_{2}}\). This follows from the fact that \(\mathcal {F}_{2}\) is a periodic FDFA accepting \(\{ a, b \}^{*} (ab)^{\omega }\), which results in \(\text {UP}(\mathcal {F}_{2}) = \text {UP}(\mathcal {L}(B_{\mathcal {F}_{2}}))\) according to Theorem 7.

The BA teacher answers again “no” with a counterexample, say (a, ab), which is already a negative counterexample for the FDFA learner. However, the BA learner is not aware of this fact, thus he performs a counterexample analysis on (a, ab). In order to analyze the counterexample \(a \cdot (ab)^{\omega }\), the BA learner first builds the two DFAs \(D_{\$}\) and \(\mathcal {D}_1\) shown in Fig. 23.

Fig. 22. The BA \(B_{\mathcal {F}_{2}}\) constructed for answering the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{2})\), with \(\mathcal {F}_{2}\) shown in Fig. 11

Fig. 23. The DFAs \(D_{\$}\) and \(\mathcal {D}_1\) for the FDFA \(\mathcal {F}_{2}\) from Fig. 11 and the counterexample (a, ab)

Suppose that this time the BA learner takes the word \(a {\$} ab \in \mathcal {L}(D_{\$}) \,\cap \, \mathcal {L}(\mathcal {D}_1)\), which means that the counterexample the FDFA learner receives is again (a, ab). After the refinement of the FDFA \(\mathcal {F}_{2}\), the FDFA learner asks an equivalence query for the FDFA \(\mathcal {F}_{3}\) shown in Fig. 12. On receiving the query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{3})\), the BA learner constructs the BA \(B_{\mathcal {F}_{3}}\) given in Fig. 24 and poses another equivalence query to the BA teacher.

Clearly, \(B_{\mathcal {F}_{3}}\) is not a correct conjecture and the BA teacher returns “no” as well as a counterexample, say (b, baba), which is a negative counterexample for the FDFA teacher. The BA learner builds the two DFAs \(D_{\$}\) and \(\mathcal {D}_1\) depicted in Fig. 25 in order to get a valid counterexample for the FDFA learner from \(\mathcal {L}(D_{\$}) \cap \mathcal {L}(\mathcal {D}_1)\).

Fig. 24. The BA \(B_{\mathcal {F}_{3}}\) constructed for answering the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{3})\), with \(\mathcal {F}_{3}\) shown in Fig. 12

Fig. 25. The DFAs \(D_{\$}\) and \(\mathcal {D}_1\) for the FDFA \(\mathcal {F}_{3}\) from Fig. 12 and the counterexample (b, baba)

Assume that the FDFA learner receives the counterexample (bb, ab) from the BA learner acting as the FDFA teacher. The FDFA learner is now able to get the refined FDFA \(\mathcal {F}_{4}\) shown in Fig. 13, so he can ask \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{4})\). As usual, the BA learner constructs a BA from \(\mathcal {F}_{4}\) in order to solve the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{4})\) posed by the FDFA learner.

The BA \(B_{\mathcal {F}_{4}}\) constructed by the BA learner is shown in Fig. 26 and we can see that \(B_{\mathcal {F}_{4}}\) accepts exactly the target language \(L= (ab)^{\omega }\), which means that \(B_{\mathcal {F}_{4}}\) is already a correct conjecture. Therefore, the BA teacher answers “yes” to the BA learner as the result of the equivalence query \(\mathrm {{EQ}^{BA}}(B_{\mathcal {F}_{4}})\). The BA learner then outputs the learned BA \(B_{\mathcal {F}_{4}}\), as he has finally completed the learning task.

We notice that the current FDFA \(\mathcal {F}_{4}\) is still not a periodic FDFA of \(L\) yet we can build a BA such that \(\mathcal {L}(B_{\mathcal {F}_{4}}) = L\). In practice, the BA learning algorithm very often infers a BA recognizing \(L\) before converging to a periodic FDFA of \(L\). In the worst case, the FDFA learner inside the BA learner has to learn a periodic FDFA of \(L\) in order to get a right conjectured BA according to Theorem 7.

Fig. 26. The BA \(B_{\mathcal {F}_{4}}\) constructed for answering the equivalence query \(\mathrm {{EQ}^{FDFA}}(\mathcal {F}_{4})\), with \(\mathcal {F}_{4}\) shown in Fig. 13

The BA Learner. By means of the previous example, we have informally introduced the \(\omega \)-regular language learning algorithm, which is formalized in Algorithm 3 as the BA learner; the learning procedure described in the running example follows exactly the steps of Algorithm 3. The function \( constructBA \) is an implementation of the under-approximation BA construction and \( ceAnalysis \) is the procedure for analyzing counterexamples from the BA teacher. The refinement loop of the conjecture \(B_{\mathcal {F}}\) terminates once we get a positive answer from the BA teacher.

Algorithm 3. The BA learner
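
The sketch below mirrors the structure of Algorithm 3, with the FDFA learner and the BA teacher hidden behind assumed interfaces; construct_ba and ce_analysis stand for the functions \( constructBA \) and \( ceAnalysis \) mentioned above. It is only an illustration of the refinement loop, not the implementation used in ROLL.

```python
# A high-level sketch of the BA learner of Algorithm 3; all helpers are
# assumed interfaces, not real APIs.

def learn_ba(fdfa_learner, construct_ba, eq_ba, ce_analysis):
    fdfa = fdfa_learner.initial_conjecture()        # built from membership queries
    while True:
        ba = construct_ba(fdfa)                     # under-approximation B_F of F
        ce = eq_ba(ba)                              # equivalence query to the BA teacher
        if ce is None:                              # "yes": L(B_F) is the target language
            return ba
        u, v = ce                                   # "no": u v^omega witnesses the difference
        kind, (u2, v2) = ce_analysis(u, v, fdfa)    # cases U1-U3 give a valid counterexample
        fdfa = fdfa_learner.refine(kind, u2, v2)    # refined conjecture for the next round
```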

As discussed before, we can construct from an FDFA \(\mathcal {F}\) a BA \(B_{\mathcal {F}}\) such that \(\text {UP}(\mathcal {F}) = \text {UP}(\mathcal {L}(B_{\mathcal {F}}))\) if \(\mathcal {F}\) is a periodic FDFA of \(\text {UP}(\mathcal {F})\). Thus, in the worst case, the FDFA learner inside the BA learner needs to learn a periodic FDFA of the target language \(L\) in order to get a correct conjectured BA. The main result of this section then follows.

Theorem 8

(Correctness and Termination). The BA learning algorithm based on the FDFA learner and the under-approximation BA construction always terminates and returns a BA recognizing the target \(\omega \)-regular language \(L\) in polynomial time.

6 Learning to Complement Büchi Automata

As we have seen in Sect. 3.3, the complementation of Büchi automata [25] is a classic problem that has been extensively studied for more than half a century; see [105] for a survey. The complementation of Büchi automata is a valuable tool in formal verification (cf. [67]), in particular when the property to be satisfied by a model is given by means of a Büchi automaton, in program termination analysis (cf. Sect. 7), and when studying language inclusion problems of \(\omega \)-regular languages [1, 3, 4]. As Proposition 4 shows, the complementation of Büchi automata is super-exponential, i.e., it can be really expensive in practice as well. While this is generally unavoidable [108], we believe that there is no inherent reason to assume that the complement language is harder than the initial language: in model checking, when the property is given as a formula \(\phi \), the typical approach assumes that the translation into a Büchi automaton is equally efficient for the formula and its negation, so instead of translating \(\phi \) to \(A_{\phi }\) and then complementing \(A_{\phi }\), it first negates \(\phi \) and then translates \(\lnot \phi \) to \(A_{\lnot \phi }\) so that \(\mathcal {L}(A_{\lnot \phi }) = \varSigma ^{\omega }\setminus \mathcal {L}(A_{\phi })\). If the complement language of \(\phi \) were indeed more complex than the language of \(\phi \), this approach would suffer when translating the negation of the formula, since such a negation corresponds to the complement of the original property’s language. Besides this, complementing a language \(L\) twice gives \(L\) itself, while complementing a Büchi automaton twice would generate an automaton of enormous size: for instance, complementing twice a BA with 10 states would result in a BA accepting the same language with roughly at least \(10^{7 \cdot 10^{7}}\) states, according to the approximation given by Proposition 4.

This begs the question of whether we can disentangle the complement BA from the syntactic representation of the BA accepting the language we want to complement. Taking inspiration from the regular-language setting, where the minimal DFA accepting a given regular language can be learned by the DFA learning algorithm, in this section we show how to learn a BA accepting the complement of a given target \(\omega \)-regular language \(L\).

Fig. 27. The learning framework for complementing a Büchi automaton \(A\)

6.1 The Complement BA Learning Framework

The learning framework for complementing a Büchi automaton is shown in Fig. 27; it has been proposed in [74], to which we refer the interested reader for more details. It is based on a variation of the FDFA learning algorithm to learn \(\mathcal {F}\), explained in Sects. 4 and 5. As we can see from Fig. 27, the learner is exactly the FDFA learner used to learn BAs (cf. Fig. 16). This means that the learner first uses membership queries for \(\mathcal {F}\) until a consistent automaton is created and then turns to equivalence queries, while being oblivious to the fact that he is actually learning \(\varSigma ^{\omega }\setminus \mathcal {L}(A)\) instead of \(\mathcal {L}(A)\). The difference with the learning algorithm for BAs shown in Fig. 16 lies completely in the teacher: for membership queries, the teacher uses the standard, cheap, queries [11, 73]; the real novelty is a careful design of the answer to the equivalence queries that makes use of cheap operations whenever possible.

These equivalence queries are not executed with the FDFA \(\mathcal {F}\) and its complement \(\mathcal {F}^{\mathcal {C}}\), but with the Büchi automata \(B_{\mathcal {F}}\) and \(B_{\mathcal {F}^{\mathcal {C}}}\) that under-approximate them. The teacher first checks whether \(\mathcal {L}(B_{\mathcal {F}})\) is disjoint from the language \(\mathcal {L}(A)\) we want to complement. This step is cheap, and if the two languages are not disjoint, then she returns to the learner an ultimately periodic word \(uv^{\omega } \in \mathcal {L}(A)\) such that at least some decomposition of \(uv^{\omega }\) is (wrongly) accepted by \(\mathcal {F}\).

In case \(\mathcal {L}(B_{\mathcal {F}}) \,\cap \, \mathcal {L}(A) = \emptyset \), the teacher checks whether the language of \(B_{\mathcal {F}^{\mathcal {C}}}\) is included in the language of \(A\). This is an interesting twist, since language inclusion is one of the traditional justifications for complementing Büchi automata, as mentioned in Sect. 3.5. But while the problem is PSPACE-complete (cf. Proposition 8), it can usually be handled well by using efficient tools like RABIT [1, 3, 4]. In particular, RABIT makes use of a powerful set of computationally effective preprocessing and automata-exploration-based heuristics that usually allow the language inclusion problem to be answered very efficiently.

Non-inclusion comes with a witness in the form of an ultimately periodic word \(xy^{\omega }\) accepted by \(B_{\mathcal {F}^{\mathcal {C}}}\) but not by \(A\). Thus, some decomposition \((u, v)\) of \(xy^{\omega }\) is (incorrectly) rejected by \(\mathcal {F}\), and she returns it to the learner. In case \(\mathcal {L}(B_{\mathcal {F}^{\mathcal {C}}}) \subseteq \mathcal {L}(A)\) holds, the teacher concludes that \(\mathcal {L}(B_{\mathcal {F}}) = \varSigma ^{\omega }\setminus \mathcal {L}(A)\) and terminates the algorithm with \(B_{\mathcal {F}}\) as the complement of \(A\). Note that the learner is not required to construct an FDFA \(\mathcal {F}\) such that \(\mathcal {L}(\mathcal {F}) = \varSigma ^{\omega }\setminus \mathcal {L}(A)\); it is enough that \(\mathcal {L}(B_{\mathcal {F}}) = \varSigma ^{\omega }\setminus \mathcal {L}(A)\), which can spare the framework further membership and equivalence queries.
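
The teacher’s answer to an equivalence query can thus be summarized as in the sketch below, where construct_ba, complement_fdfa, intersection_word (an emptiness check of the intersection returning a witness), and inclusion_counterexample (the inclusion check, delegated to RABIT in [74]) are assumed operations passed as parameters; the sketch is only an illustration of Fig. 27, not the implementation of [74].

```python
# A sketch of the teacher of the complementation framework (Fig. 27); all
# automata operations are assumed to be provided from the outside.

def answer_eq(fdfa, aut_a, construct_ba, complement_fdfa,
              intersection_word, inclusion_counterexample):
    b_f = construct_ba(fdfa)
    w = intersection_word(b_f, aut_a)          # cheap: is L(B_F) and L(A) disjoint?
    if w is not None:
        u, v = w                               # u v^omega in L(A) wrongly accepted by F
        return "no", ("negative", u, v)
    b_fc = construct_ba(complement_fdfa(fdfa))
    w = inclusion_counterexample(b_fc, aut_a)  # does L(B_F^C) subset of L(A) hold?
    if w is not None:
        u, v = w                               # u v^omega wrongly rejected by F
        return "no", ("positive", u, v)
    return "yes", b_f                          # B_F is the complement of A
```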

More details about the correctness of the proposed complementation framework, its complexity, and its experimental evaluation can be found in [74]. We want, however, to give some more detail about the use of RABIT to solve the language inclusion problem the teacher may need to answer in an equivalence query \(\mathrm {EQ}(\mathcal {F})\). As said above, RABIT is equipped with a powerful set of heuristics; among others, RABIT makes use of the following ones, in increasing order of efficacy and required computation: (1) try simple automata-pruning algorithms, which help in reducing the size of the considered automata; (2) try delayed simulations, which are intended to prove the language inclusion by analyzing the structure of the automata; (3) if inclusion was not established in step 2, then try to find a counterexample to inclusion by the Ramsey-based method [3, 4] with a small timeout value; (4) if no counterexample was found in step 3, then try the automata minimization algorithms proposed in [33], which simplify the two automata by changing their languages while preserving their language inclusion relationship.

Since these heuristics are not complete, RABIT uses as a last resort the Ramsey-based inclusion testing algorithms already used in step 3, this time without timeout, to finally decide whether the language inclusion holds. From the experimental evaluation presented in [74] we can see that the learning-based complementation algorithm is really effective in getting the complement automaton, in particular when the automaton to be complemented is large. One interesting thing we noted in the experiments is that the automaton \(B_{\mathcal {F}}\) used in the check \(\mathcal {L}(B_{\mathcal {F}}) \cap \mathcal {L}(A) = \emptyset \) can change considerably between one equivalence query and the next, which makes it difficult to predict how much RABIT is able to exploit its heuristics. In any case, only in very few cases did RABIT need to resort to the Ramsey-based inclusion testing algorithms, which usually consume most of the running time of the corresponding experiment.
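
The staged use of cheap but incomplete techniques before the complete one can be illustrated with the generic sketch below; all helper functions are hypothetical parameters and do not correspond to RABIT’s actual interface.

```python
# A generic sketch of a staged inclusion check in the spirit described above:
# cheap, incomplete techniques first, the complete (expensive) Ramsey-based
# test only as a last resort. All helpers are hypothetical parameters.

def staged_inclusion(a, b, prune, delayed_simulation, ramsey_search, minimize):
    a, b = prune(a), prune(b)                   # (1) cheap automata pruning
    if delayed_simulation(a, b):                # (2) structural proof of inclusion
        return True, None
    ce = ramsey_search(a, b, timeout=1.0)       # (3) look for a counterexample, small timeout
    if ce is not None:
        return False, ce                        # u v^omega in L(a) but not in L(b)
    a, b = minimize(a, b)                       # (4) inclusion-preserving minimization
    ce = ramsey_search(a, b, timeout=None)      # last resort: complete Ramsey-based test
    return (ce is None), ce
```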

6.2 The Complement BA Learning Framework in Action

Suppose that we want to learn the complement of the NBA \(B\) depicted in Fig. 1; recall that \(\mathcal {L}(B) = \varSigma ^{*}\cdot b^{\omega }\). The learning algorithm works as follows: the learner first poses several membership queries and constructs the initial conjectured FDFA \(\mathcal {F}_{1}\) shown in Fig. 28.

Fig. 28. Initial FDFA \(\mathcal {F}_{1} = (M_{1}, \{ A^{\varepsilon }_{1} \})\) and the corresponding under-approximation Büchi automaton \(B_{\mathcal {F}_{1}}\)

Afterwards, the learner performs the equivalence query \(\mathrm {EQ}(\mathcal {F}_{1})\) to verify whether \(\mathcal {F}_{1}\) is correct. In order to answer this equivalence query, the teacher first constructs the Büchi automaton \(B_{\mathcal {F}_{1}}\), also shown in Fig. 28, and then checks the emptiness of \(\mathcal {L}(B_{\mathcal {F}_{1}}) \cap \mathcal {L}(B)\). This check fails: assume that the teacher gets the \(\omega \)-word \(b(bb)^{\omega } \in \mathcal {L}(B_{\mathcal {F}_{1}}) \cap \mathcal {L}(B)\); by means of the counterexample analysis, the teacher is able to answer negatively to the query \(\mathrm {EQ}(\mathcal {F}_{1})\) posed by the learner by returning the negative counterexample \((\varepsilon , b)\), a decomposition of \(b(bb)^{\omega }\).

Fig. 29. Second FDFA \(\mathcal {F}_{2} = (M_{2}, \{ A^{\varepsilon }_{2} \})\) and the corresponding under-approximation Büchi automaton \(B_{\mathcal {F}_{2}}\)

Upon receiving \((\varepsilon , b)\), the learner refines the current FDFA \(\mathcal {F}_{1}\) to \(\mathcal {F}_{2}\), shown in Fig. 29, by means of membership queries; then he poses the equivalence query \(\mathrm {EQ}(\mathcal {F}_{2})\) for \(\mathcal {F}_{2}\). As before, the teacher first transforms \(\mathcal {F}_{2}\) to \(B_{\mathcal {F}_{2}}\) and then checks the emptiness of \(\mathcal {L}(B_{\mathcal {F}_{2}}) \cap \mathcal {L}(B)\). It is easy to see that \(\mathcal {L}(B_{\mathcal {F}_{2}})\) is indeed disjoint from \(\mathcal {L}(B)\). Therefore, the teacher has first to compute the Büchi automaton \(B_{\mathcal {F}_{2}^{\mathcal {C}}}\) under-approximating \(\mathcal {F}_{2}^{\mathcal {C}}\), shown in Fig. 30, and then to check the language inclusion \(\mathcal {L}(B_{\mathcal {F}_{2}^{\mathcal {C}}}) \subseteq \mathcal {L}(B)\); this check fails.

Assume that the teacher finds \(b(ab)^{\omega } \in \mathcal {L}(B_{\mathcal {F}_{2}^{\mathcal {C}}}) \setminus \mathcal {L}(B)\); she then answers negatively to \(\mathrm {EQ}(\mathcal {F}_{2})\) by means of the positive counterexample (b, ab) obtained from \(b(ab)^{\omega }\).

Fig. 30. Under-approximation Büchi automata \(B_{\mathcal {F}_{2}^{\mathcal {C}}}\) and \(B_{\mathcal {F}_{3}^{\mathcal {C}}}\) for \(\mathcal {F}_{2}^{\mathcal {C}}\) (depicted in Fig. 29) and \(\mathcal {F}_{3}^{\mathcal {C}}\) (shown in Fig. 31), respectively

The learner uses the received counterexample (b, ab) to further refine the current FDFA \(\mathcal {F}_{2}\); after asking several membership queries, he generates the candidate FDFA \(\mathcal {F}_{3}\) and then asks an equivalence query for it. As in the previous cases, the teacher starts by constructing the Büchi automaton \(B_{\mathcal {F}_{3}}\) for \(\mathcal {F}_{3}\), shown in Fig. 31. Since \(\mathcal {L}(B_{\mathcal {F}_{3}}) \cap \mathcal {L}(B)\) is empty, the teacher moves to the second check: she constructs the BA \(B_{\mathcal {F}_{3}^{\mathcal {C}}}\), shown in Fig. 30, and verifies whether \(\mathcal {L}(B_{\mathcal {F}_{3}^{\mathcal {C}}}) \subseteq \mathcal {L}(B)\), which is indeed the case. Thus, the teacher terminates the learning algorithm by returning \(B_{\mathcal {F}_{3}}\), shown in Fig. 31, as the complement of \(B\).

Fig. 31. Final FDFA \(\mathcal {F}_{3} = (M_{3}, \{ A^{\varepsilon }_{3} \})\) and the corresponding under-approximation Büchi automaton \(B_{\mathcal {F}_{3}}\)

6.3 Experimental Evaluation

To support our claim that the size of the complement \(A^{\mathcal {C}}\) of a BA \(A\) with \(\mathcal {L}(A) = L\) does not have to depend super-exponentially on the language \(L\) we want to complement, we briefly recall the experiments we conducted in [74], where the complementation learning framework has been presented.

There we implemented our learning approach as Buechic, based on the ROLL learning library [73]; the inclusion check \(\mathcal {L}(B_{\mathcal {F}^{\mathcal {C}}}) \subseteq \mathcal {L}(A)\) (cf. Fig. 27) is delegated to RABIT [1, 3, 4]. In the experiments, we compared Buechic with two tools: GOAL [99], a mature and well-known tool for manipulating Büchi automata, for which we consider four of its implemented complementation algorithms; and SPOT [39], the state-of-the-art platform for manipulating \(\omega \)-automata, including Büchi automata. All tools accept as input automata represented in the Hanoi Omega-Automata (HOA) format [13]. Recall that SPOT does not provide a complementation function for generic Büchi automata directly, thus we first use SPOT to get a deterministic automaton from the given NBA, then complement the resulting deterministic automaton (for parity automata this just means adding 1 to all priorities), and finally transform the resulting complement automaton to an equivalent NBA. Since SPOT is a highly optimized tool that uses effective heuristics, it very often produces very small automata, but the heavy use of heuristics makes the comparison lopsided. Moreover, SPOT uses symbolic data structures called OBDDs, which provide a more efficient way to manipulate automata compared to GOAL and Buechic.

Table 1. Comparison between GOAL, SPOT, and Buechic on complementing the automata from the Büchi Store. Note that SPOT represents transitions more densely, so the same automaton has a lower transition count.
Fig. 32. Comparison between the number of states and transitions of automata generated by SPOT and Buechic on 72 automata corresponding to formulas from [92]

Table 1 reports the results of the complementation on the automata from the Büchi Store [100], which contains 295 NBAs with 1 to 17 states and with 0 to 123 transitions. However, since one of these automata has only one state without transitions and GOAL fails to recognize it as a Büchi automaton, we decided to exclude it from the experiments and consider only the remaining 294 cases.

By inspecting the entries in Table 1 we can see that our learning-based complementation method always outperforms the complementation methods offered by GOAL when we consider the number of states and transitions. When compared with SPOT, we see that the optimizations in SPOT are really effective, in both runtime and size of the generated automata, for the small input automata (cf. block 1), but the transformation to parity automata starts to show its effects for larger automata (cf. blocks 2 and 3).

We have also considered 72 further Büchi automata generated from 72 formulas from [92]. In summary, Ramsey-based, Determinization-based, Rank-based, and Slice-based GOAL approaches solve 49, 58, 61, and 62 complementation tasks, respectively, within 5 min, while SPOT solves 66 tasks and Buechic solves 65 tasks. Among these, there are 64 tasks solved by both SPOT and Buechic, while the remaining cases are disjoint, which implies that our algorithm complements existing complementation approaches very well.

Fig. 32 considers the 64 commonly solved tasks: for each task, a point's y coordinate gives the number of states (left plot) or transitions (right plot) in the complement automaton produced by Buechic, and its x coordinate the corresponding number for SPOT. Points below the dotted diagonal thus correspond to cases where the complement automaton learned by our algorithm is smaller than the one constructed by SPOT, which is the case for almost all large examples. We recall that SPOT merges transitions sharing the same source and target states into a single transition, so in the right scatter plot of Fig. 32 many points are above the diagonal line. Nevertheless, we can learn from the plots that only SPOT produces automata with more than \(10^{3}\) states or \(10^{4}\) transitions, which indicates that the reduction optimizations of SPOT do not work well on large automata, where our algorithm performs much better.

Fig. 33. States comparison of GOAL, SPOT, and Buechic on the formula pattern \(\bigwedge _{i=1}^k (\mathsf {GF} a_{i}) \rightarrow \mathsf {GF} b\)

In order to show how the number of states in the complement automata produced by the different complementation algorithms can grow when we increase the size of the given Büchi automata, we consider as one such case the Büchi automata generated for the formula pattern \(\bigwedge _{i=1}^k (\mathsf {GF} a_{i}) \rightarrow \mathsf {GF} b\). The growth of the number of states in the complement automata for the approaches in GOAL, SPOT, and Buechic is plotted in Fig. 33. The number of states in the complement automata constructed by GOAL and SPOT grows exponentially with respect to the parameter k, while the number of states in the complement automaton learned by our algorithm grows linearly. The experimental results show that the performance of our algorithm can be much more stable as the input automata grow. Thus, an advantage of our learning approach is that it has potentially better performance on large automata compared to classic complementation techniques.

7 Application of Büchi Automata in Termination Analysis

In this section we present how Büchi automata and their complements are used in practice for complex verification tasks, such as program termination analysis.

Termination analysis of programs is a challenging area of formal verification, which has attracted the interest of many researchers approaching the problem from different angles; see, e.g., [14, 22, 36,37,38, 47, 48, 52, 55, 64, 69, 70, 80, 84,85,86, 95, 101, 102]. In general, while analyzing the termination of a program, we need to deal with the following challenge: when a program contains loops with branching or nesting, how can we devise a termination argument that holds for any possible interleaving of the different paths through the loop body?

Due to the difficulty of solving the general problem, many researchers have focused on its simplified version that addresses only lasso-shaped programs, i.e., programs where the control flow consists of a stem followed by a simple loop without any branching. Proving termination for this class of programs can be done rather efficiently [15,16,17, 23, 35, 54, 71, 83], but its extension to general programs is not easy.

7.1 Automata-Based Termination Analysis

In order to simplify our presentation, we consider only C programs without function calls and pointers; the variable updates are restricted to linear combinations. Since our goal in this section is to describe the modular termination analysis for a given program \(P\), we assume that every sampled path can be proved to be terminating. Therefore, in the end, we can prove that \(P\) always terminates.

Fig. 34. The flow of the automata-based termination analysis

The approach of Heizmann et al. [55] proposes a modular construction of termination proofs for a general program \(P\) from termination proofs of lasso-shaped programs obtained from its concrete paths, as depicted in Fig. 34. On a high level, the approach repeatedly performs the following sequence of operations: first, it samples a path \(\tau = uv^{\omega }\) from the possible behaviours of \(P\) and attempts to prove its termination [15,16,17, 23, 35, 54, 71, 83] by using an off-the-shelf termination checker, like LassoRanker, part of the Ultimate Automizer suite [55]. The result of this step is either a termination argument for the sampled path, a non-termination argument for it, or “unknown”, which indicates that the termination checker failed to decide the termination of the sampled path. Second, it generalizes \(\tau \) into a (potentially infinite) set of paths \(\mathcal {M}\), called a certified module, that all share the same termination proof with \(\tau \). Finally, it checks whether the behaviour of \(P\) contains a path \(\tau '\) not covered by any certified module generated so far and, if so, the procedure is repeated. This sequence is repeated until either a non-terminating path is found, “unknown” is returned, or all behaviours of \(P\) are covered by the modules.
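
A high-level sketch of this refinement loop is given below; sample_lasso, check_lasso (e.g., a call to an external checker such as LassoRanker), generalize, difference, and is_empty are assumed interfaces, so the sketch only mirrors the flow of Fig. 34, not the actual implementation of [55].

```python
# A sketch of the modular termination analysis of Fig. 34; all helpers are
# assumed interfaces.

def analyze_termination(program_ba, sample_lasso, check_lasso,
                        generalize, difference, is_empty):
    remaining, modules = program_ba, []
    while not is_empty(remaining):
        stem, loop = sample_lasso(remaining)          # a lasso path u v^omega not yet covered
        verdict, proof = check_lasso(stem, loop)      # termination / non-termination / unknown
        if verdict != "terminating":
            return verdict, (stem, loop)              # non-terminating path found, or give up
        module = generalize(stem, loop, proof)        # certified module (A_i, f_i, I_i)
        modules.append(module)
        remaining = difference(remaining, module)     # remove the covered paths (BA difference)
    return "terminating", modules                     # all behaviours of P are covered
```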

Fig. 35. An example of program and its BA representation

7.2 Automata-Based Termination Analysis: An Example

As an example of the above approach, consider the insertion sort program \(P^{\mathtt {ins}}\) shown in Fig. 35(a); Fig. 35(b) shows the control flow graph (CFG) of \(P^{\mathtt {ins}}\) as a Büchi automaton \(\mathcal {A}^{P^{\mathtt {ins}}}\).

The alphabet of \(\mathcal {A}^{P^{\mathtt {ins}}}\) is the set of all statements occurring in \(P^{\mathtt {ins}}\), like assignments and guards, while the states of \(\mathcal {A}^{P^{\mathtt {ins}}}\) are the locations of \(P^{\mathtt {ins}}\); the initial state is the first location of the program, i.e., its entry point. The transitions connect states according to the way each location is reachable from another: for instance, there is a transition from \(\ell _{1}\) to \(\ell _{2}\) labelled with the initialization statement at location \(\ell _{1}\), since \(\ell _{2}\) is reached after executing such initialization; similarly, there is a transition from \(\ell _{5}\) to \(\ell _{8}\) labelled with the negation of the guard of the while statement at location \(\ell _{5}\), since \(\ell _{8}\) is reached when that guard is not satisfied. All states of \(\mathcal {A}^{P^{\mathtt {ins}}}\) are accepting, so each feasible infinite sequence of statements of the program corresponds to an infinite word in the language \(\mathcal {L}(\mathcal {A}^{P^{\mathtt {ins}}})\).

The aim of the termination analysis is to cover the executions of \(\mathcal {A}^{P^{\mathtt {ins}}}\) by the accepted words of a finite set of BAs \(\{ \mathcal {A}_{1}, \cdots , \mathcal {A}_{n} \}\) such that \(\mathcal {L}(\mathcal {A}^{P^{\mathtt {ins}}}) \subseteq \mathcal {L}(\mathcal {A}_{1}) \cup \dots \cup \mathcal {L}(\mathcal {A}_{n})\), which is reduced to checking whether \(\mathcal {L}(\mathcal {A}^{P^{\mathtt {ins}}}) \cap \mathcal {L}(\mathcal {A}_{1}^{\mathcal {C}}) \cap \cdots \cap \mathcal {L}(\mathcal {A}_{n}^{\mathcal {C}}) = \emptyset \), as we have seen in Sect. 3.5; this check is sketched below. If we can prove that each BA \(\mathcal {A}_{i}\) represents a program with a termination argument, then, since every execution of \(P^{\mathtt {ins}}\) is represented by a word in \(\mathcal {L}(\mathcal {A}^{P^{\mathtt {ins}}})\), \(P^{\mathtt {ins}}\) is guaranteed to terminate by the arguments for the single BAs \(\mathcal {A}_{i}\).
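
The sketch below illustrates this reduction, with complement, intersect, and find_lasso (an emptiness check returning an ultimately periodic witness) assumed as given BA operations; its cost is dominated by the complementation steps.

```python
# A sketch of the coverage check L(A^P) subset of L(A_1) ∪ ... ∪ L(A_n), reduced
# to emptiness of L(A^P) ∩ L(A_1^C) ∩ ... ∩ L(A_n^C); the BA operations are
# assumed to be given.

def uncovered_lasso(program_ba, module_bas, complement, intersect, find_lasso):
    rest = program_ba
    for a_i in module_bas:
        rest = intersect(rest, complement(a_i))   # the expensive step (cf. Propositions 4 and 5)
    return find_lasso(rest)                       # None iff every path of P is covered
```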

Fig. 36. A certified module for the lasso word

In order to have a termination argument, each BA \(\mathcal {A}_{i}\) is associated with a ranking function \(f_{i}\) and a rank certificate \(\mathcal {I}_{i}\) mapping each state to a predicate over the program variables. The triple \(\mathcal {M}_{i} = (\mathcal {A}_{i}, f_{i}, \mathcal {I}_{i})\) is called a certified module. The construction of the set \(\{ \mathcal {M}_{1}, \cdots , \mathcal {M}_{n} \}\) is progressive (see Fig. 34). First, we sample an ultimately periodic word \(uv^{\omega } \in \mathcal {L}(\mathcal {A}^{P^{\mathtt {ins}}})\), which is essentially a lasso-shaped program, and use an off-the-shelf tool to check whether it admits a termination argument. In our example, we start by sampling a word \(uv^{\omega }\) whose periodic part iterates the inner loop. We can prove termination of the path corresponding to \(uv^{\omega }\) by finding, e.g., the ranking function \(f_{1}(\mathtt {i}, \mathtt {j}, \mathtt {n}) = \mathtt {j} + 1\), for which it holds that at each iteration of the inner loop the value of \(f_{1}(\mathtt {i}, \mathtt {j}, \mathtt {n})\) decreases, since \(\mathtt {j}\) is decreased by 1. The resulting certified module is shown in Fig. 36, where \(\texttt {oldrnk}\) is a fresh variable used to keep track of the value of the ranking function at the previous visit of the accepting state.
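
The role of \(\texttt {oldrnk}\) can be illustrated with the small self-contained sketch below: it only assumes, as stated above, that the inner loop decrements \(\mathtt {j}\), and it checks at every visit of the accepting loop head that the ranking function \(f_{1}(\mathtt {i}, \mathtt {j}, \mathtt {n}) = \mathtt {j} + 1\) is bounded from below and strictly smaller than its value at the previous visit; it is not the actual rank certificate of Fig. 36.

```python
# An illustration of the oldrnk instrumentation: the ranking function must be
# bounded from below and must strictly decrease between consecutive visits of
# the accepting state, hence the inner loop terminates. The loop body is only
# assumed here to decrement j, as in the text.

def inner_loop_with_rank_check(i, j, n):
    def f1(i, j, n):
        return j + 1
    oldrnk = None                     # no previous visit of the accepting state yet
    while j >= 1:                     # assumed guard of the inner loop
        rnk = f1(i, j, n)
        assert rnk >= 0               # the ranking function is bounded from below
        if oldrnk is not None:
            assert rnk < oldrnk       # and strictly decreases at every visit
        oldrnk = rnk
        j = j - 1                     # the inner loop decreases j by 1
    return j
```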

In the following, we denote the inner loop of \(\mathcal {A}^{P^{\mathtt {ins}}}\) as and its outer loop as . We can observe that \(f_{1}\) is also a ranking function for the set of paths obtained by generalizing \(uv^{\omega }\) into the set of words that correspond to all paths that eventually stay in the inner loop, i.e., words from . The language \(\mathcal {L}_{1}\) together with a ranking function \(f_{1}\) and a rank certificate \(\mathcal {I}_{1}\) can be represented by the certified module \(\mathcal {M}_{1} = (\mathcal {A}_{1}, f_{1}, \mathcal {I}_{1})\), depicted in Fig. 37; the transitions labelled with action or are a shorthand for the corresponding sequences of transitions and states: for instance, the self-loop on \(q_{5}\) with action stands for the states and transitions reachable from \(q_{5}\) in Fig. 36.

Fig. 37. The certified module \(\mathcal {M}_{1}\) for the language

Fig. 38. A certified module for the lasso word

Fig. 39. The certified module \(\mathcal {M}_{2}\) for the language

We proceed by removing all paths covered by \(\mathcal {L}_{1}\) from \(\mathcal {A}^{P^{\mathtt {ins}}}\) to know which paths still need to be examined. The removal can be performed by executing a BA difference algorithm, presented in Sect. 3.4, followed by checking language emptiness (potentially finding a new counterexample \(uv^{\omega }\) on failure). In our example, the difference corresponds to the (non-empty) language represented by \(\mathcal {A}^{P^{\mathtt {ins}}}_{|\mathcal {A}_{1}}\). Suppose the next sampling gives us , for which, e.g., the ranking function \(f_{2}(\mathtt {i}, \mathtt {j}, \mathtt {n}) = \mathtt {n} - \mathtt {i}\) is applicable; the corresponding certified module is shown in Fig. 38.

Note that \(f_{2}\) is also a valid ranking function for all paths taking the outer while loop infinitely often, i.e., all paths corresponding to words from \(\mathcal {L}_{2} =\) . We represent these paths by the certified module \(\mathcal {M}_{2} = (\mathcal {A}_{2}, f_{2}, \mathcal {I}_{2})\) where \(\mathcal {L}(\mathcal {A}_{2}) = \mathcal {L}_{2}\), shown in Fig. 39.

After removing the words of \(\mathcal {A}_{2}\) from \(\mathcal {L}(\mathcal {A}^{P^{\mathtt {ins}}}_{|\mathcal {A}_{1}})\), we, finally, obtain the BA \(\mathcal {A}^{P^{\mathtt {ins}}}_{|\mathcal {A}_{1},\mathcal {A}_{2}}\), whose language is empty. This means that the modules \(\mathcal {M}_{1}\) and \(\mathcal {M}_{2}\) cover all possible paths of the program \(P^{\mathtt {ins}}\) and, because each of them comes with a termination argument, we can conclude that all paths of \(P^{\mathtt {ins}}\) are guaranteed to terminate.

7.3 Automata-Based Termination Analysis: Difficulties

As we have seen in the example above, the general termination analysis involves several operations on Büchi automata, such as emptiness checks, complementation, and language difference. While emptiness is really cheap (cf. Proposition 6), complementation and language difference are in general extremely expensive (cf. Propositions 4 and 5), so it would be better to limit their use as much as possible.

Note, however, that the complementation of a lasso-shaped automaton corresponding to a lasso-shaped word is really easy: it is enough to add an accepting sink state collecting all missing transitions and to make the original accepting state no longer accepting. While this is cheap, its net effect on the termination analysis is really limited: in this way we remove only one infinite word at a time, so the progress of the termination analysis is negligible.
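
For a deterministic lasso-shaped BA, represented again as a dictionary with a partial transition function, this cheap complementation can be sketched as follows; it is only an illustration of the construction described above, not a general complementation procedure.

```python
# A sketch of the cheap complementation of a lasso-shaped BA: complete the
# automaton with a fresh accepting sink collecting all missing transitions and
# remove acceptance from the original states; every word different from the
# single lasso word eventually deviates and gets trapped in the accepting sink.

def complement_lasso_ba(ba, alphabet, sink="SINK"):
    """ba: {'states', 'init', 'acc', 'delta'} spelling a single word u v^omega."""
    delta = dict(ba["delta"])
    states = set(ba["states"]) | {sink}
    for q in states:
        for a in alphabet:
            delta.setdefault((q, a), sink)     # missing transitions go to the sink
    return {"states": states, "init": ba["init"],
            "acc": {sink},                     # only the sink is accepting now
            "delta": delta}
```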

The generalizations we have seen before are useful to avoid such negligible progress, since they allow us to remove a possibly very large set of words at each iteration. This, however, comes at a price: the corresponding Büchi automaton can now suffer from the super-exponential complexity of the language difference or complementation operations.

It is easy to recognize that there is a trade-off between the size of the set of paths of the program \(P\) covered by the current certified module \(\mathcal {M}_{i}\) and the complexity of complementing \(\mathcal {M}_{i}\) itself. There are several techniques to balance these two aspects, together with specialized algorithms for them; see [28] for more details on the different generalization techniques, their effectiveness in covering the paths of the input program, and further explanations and references about the creation of certified modules from a lasso-shaped word.

We are confident that learning the complement of Büchi automata, as shown in Sect. 6, is a useful technique that can complement the existing proposals, in particular for tackling the more challenging cases where the ordinary techniques start suffering from the super-exponential growth of the complement BA. This is left to future work.

8 Conclusion

In this work, we have presented a learning algorithm for Büchi automata by showing how it learns the simple \(\omega \)-regular language \((ab)^{\omega }\). We have also demonstrated how the learning algorithm can be used for classical automata operations, such as complementation, and in the context of termination analysis. We believe that the intuitive explanation of the different learning algorithms for both finite and \(\omega \)-regular languages will benefit both the learning community and the model checking community.