1 Introduction

The language inclusion checking of Büchi automata is a fundamental problem in the field of automated verification. Specifically, in the automata-based model checking framework [25], when both the system and the specification are given as Büchi automata, the model checking problem of verifying whether some behavior of the system violates the specification reduces to a language inclusion problem between the corresponding Büchi automata.

In this paper, we target the language inclusion checking problem of Büchi automata. Since this problem is PSPACE-complete [18], researchers have been focusing on devising algorithms to reduce its practical cost. A naïve approach to checking the inclusion between Büchi automata \(\mathcal {A}\) and \(\mathcal {B}\) is to first construct a complement automaton \({\mathcal {B}}^{\mathsf {c}}\) such that \(\mathcal {L}({\mathcal {B}}^{\mathsf {c}}) = \varSigma ^{\omega }\setminus \mathcal {L}(\mathcal {B})\) and then to check the language emptiness of \(\mathcal {L}(\mathcal {A})\,\cap \,\mathcal {L}({\mathcal {B}}^{\mathsf {c}})\); this is the algorithm implemented in SPOT [11], a highly optimized symbolic tool for manipulating LTL formulas and \(\omega \)-automata.

The bottleneck of this approach is computing the automaton \({\mathcal {B}}^{\mathsf {c}}\), which can be exponentially larger than \(\mathcal {B}\) [26]. As a result, various optimizations, such as subsumption and simulation, have been proposed to avoid exploring the whole state-space of \({\mathcal {B}}^{\mathsf {c}}\), see, e.g., [1, 2, 9, 10, 13, 14]. For instance, RABIT, currently the state-of-the-art tool for checking language inclusion between Büchi automata, integrates the simulation and subsumption techniques proposed in [1, 2, 9]. All these techniques, however, focus on proving inclusion. In particular, the simulation techniques in [9, 13] are specialized algorithms mainly proposed to obtain such a proof, which ensures that for every initial state \(q_{a}\) of \(\mathcal {A}\), there is an initial state \(q_{b}\) of \(\mathcal {B}\) that simulates every possible behavior from \(q_{a}\).

From a practical point of view, it is widely believed that the witness of a counterexample (or bug) found by a verification tool is as valuable as a proof of the correctness of a program; we would argue that showing why a program violates the specification is also intuitive for a programmer, since it gives a clear way to identify and correct the error. Thus, the search for a proof and the search for counterexamples (bugs) are complementary activities that need to be pursued at the same time in order to maximize the practical use of verification tools. This is well-understood in the termination analysis of programs, where the techniques for finding termination proofs [6, 7, 20] and counterexamples [12, 16, 21] evolve concurrently. Counterexamples to Büchi automata language inclusion, instead, are byproducts of a failure while proving language inclusion. Such a failure may be recognized only after a considerable amount of effort has been spent on proving inclusion, in particular when the proposed improvements are not effective. In this work, instead, we focus directly on the problem of finding a counterexample to language inclusion.

The main contribution is a novel algorithm called \(\mathsf {IMC}^{2}\) for showing language non-inclusion based on sampling and statistical hypothesis testing. Our algorithm is inspired by the Monte Carlo approach proposed in [15] for model checking systems against LTL specifications. The algorithm proposed in [15] takes as input a Büchi automaton \(\mathcal {A}\) as the system and an LTL formula \(\varphi \) as the specification and then checks whether \(\mathcal {A}\,\not \models \,\varphi \) by equivalently checking \(\mathcal {L}(\mathcal {A}) \not \subseteq \mathcal {L}(\mathcal {B}_{\varphi })\), where \(\mathcal {B}_{\varphi }\) is the Büchi automaton constructed for \(\varphi \). The main idea of the algorithm for showing \(\mathcal {L}(\mathcal {A}) \not \subseteq \mathcal {L}(\mathcal {B}_{\varphi })\) is to sample lasso words from the product automaton \(\mathcal {A} \times \mathcal {B}^{\mathsf {c}}_{\varphi }\) for \(\mathcal {L}(\mathcal {A}) \cap \mathcal {L}(\mathcal {B}^{\mathsf {c}}_{\varphi })\); lasso words are of the form \(uv^{\omega }\) and are obtained as soon as a state is visited twice. If one of these lasso words is accepted by \(\mathcal {A} \times \mathcal {B}^{\mathsf {c}}_{\varphi }\), then it is surely a witness to \(\mathcal {L}(\mathcal {A}) \not \subseteq \mathcal {L}(\mathcal {B}_{\varphi })\), i.e., a counterexample to \(\mathcal {A} \,\models \, \varphi \). Since in [15] the algorithm gets an LTL formula \(\varphi \) as input, the construction of \(\mathcal {B}^{\mathsf {c}}_{\varphi }\) reduces to the construction of \(\mathcal {B}_{\lnot \varphi }\), and it is widely assumed that the translation into a Büchi automaton is equally efficient for a formula and its negation. In this paper, we consider the general case, namely the specification is given as a generic Büchi automaton \(\mathcal {B}\), where the construction of \({\mathcal {B}}^{\mathsf {c}}\) from \(\mathcal {B}\) can be very expensive [26].

To avoid the heavy generation of \({\mathcal {B}}^{\mathsf {c}}\), the algorithm \(\mathsf {IMC}^{2}\) we propose directly samples lasso words in \(\mathcal {A}\), without constructing the product \(\mathcal {A} \times {\mathcal {B}}^{\mathsf {c}}\). We first show that ordinary lasso words, like the ones used in [15], do not suffice in our case, and then propose a rather intriguing sampling procedure. We allow the lasso word \(uv^{\omega }\) to visit each state of \(\mathcal {A}\) multiple times, i.e., the run \(\sigma \) of \(\mathcal {A}\) on the finite word uv can contain small cycles in both the u part and the v part of the lasso word. We achieve this by setting a bound K on the number of times a state can be visited: each state in \(\sigma \) is visited at most \(K-1\) times, except for the last state of \(\sigma \), which is visited at most K times. We show that \(\mathsf {IMC}^{2}\) gives a probabilistic guarantee in terms of finding a counterexample to inclusion when K is sufficiently large, as described in Theorem 4. This notion of generalized lasso allows our approach to find counterexamples that are not valid lassos in the usual sense. The extensive experimental evaluation shows that our approach is generally very fast and reliable in finding counterexamples to language inclusion. In particular, the prototype tool we developed is able to easily manage Büchi automata with very large state spaces and alphabets on which state-of-the-art tools such as RABIT and SPOT fail. This makes our approach fit very well among tools that make use of Büchi automata language inclusion tests, since it can quickly provide counterexamples before one has to resort to the possibly time- and resource-consuming structural methods, in case an absolute guarantee about the result of the inclusion test is desired.

Organization of the Paper. In the remainder of this paper, we briefly recall some known results about Büchi automata in Sect. 2. We then present the algorithm \(\mathsf {IMC}^{2}\) in Sect. 3 and give the experimental results in Sect. 4, before concluding the paper with some remarks in Sect. 5.

All missing proofs can be found in the report [23].

2 Preliminaries

Büchi Automata. Let \(\varSigma \) be a finite set of letters called alphabet. A finite sequence of letters is called a word. An infinite sequence of letters is called an \(\omega \)-word. We use \(|\alpha |\) to denote the length of the finite word \(\alpha \) and we use \(\lambda \) to represent the empty word, i.e., the word of length 0. The set of all finite words on \(\varSigma \) is denoted by \(\varSigma ^{*}\), and the set of all \(\omega \)-words is denoted by \(\varSigma ^{\omega }\). Moreover, we also denote by \(\varSigma ^{+}\) the set \(\varSigma ^{*}\setminus \{\lambda \}\).

A nondeterministic Büchi automaton (NBA) is a tuple \(\mathcal {B}= (\varSigma , Q, Q_{I}, \mathrm {T}, Q_{F})\), consisting of a finite alphabet \(\varSigma \) of input letters, a finite set \(Q\) of states with a non-empty set \(Q_{I}\subseteq Q\) of initial states, a set \(\mathrm {T}\subseteq Q\times \varSigma \times Q\) of transitions, and a set \(Q_{F}\subseteq Q\) of accepting states.

A run of an NBA \(\mathcal {B}\) over an \(\omega \)-word \(\alpha = a_{0} a_{1} a_{2} \cdots \in \varSigma ^{\omega }\) is an infinite alternation of states and letters \(\rho = q_{0} a_{0} q_{1} a_{1} q_{2} \cdots \in (Q\times \varSigma ) ^{\omega }\) such that \(q_{0} \in Q_{I}\) and, for each \(i \ge 0\), \(\big (\rho (i), a_{i}, \rho (i+1)\big ) \in \mathrm {T}\) where \(\rho (i) = q_{i}\). A run \(\rho \) is accepting if it contains infinitely many accepting states, i.e., \(\mathop {\mathrm {Inf}}({\rho }) \cap Q_{F}\ne \emptyset \), where \(\mathop {\mathrm {Inf}}({\rho }) = \{\, q \in Q \mid \forall i \in \mathbb {N}. \exists j > i : \rho (j) = q \,\}\). An \(\omega \)-word \(\alpha \) is accepted by \(\mathcal {B}\) if \(\mathcal {B}\) has an accepting run on \(\alpha \), and the set of words \(\mathcal {L}(\mathcal {B}) = \{\, \alpha \in \varSigma ^{\omega } \mid \alpha \text { is accepted by } \mathcal {B}\,\}\) accepted by \(\mathcal {B}\) is called its language.

We call a subset of \(\varSigma ^{\omega }\) an \(\omega \)-language and the language of an NBA an \(\omega \)-regular language. Words of the form \(uv^{\omega }\) are called ultimately periodic words. We use a pair of finite words \((u, v)\) to denote the ultimately periodic word \(w = uv^{\omega }\). We also call \((u, v)\) a decomposition of w. For an \(\omega \)-language \(L\), let \(\text {UP}(L) = \{\, uv^{\omega } \in L \mid u \in \varSigma ^{*}, v \in \varSigma ^{+} \,\}\) be the set of all ultimately periodic words in \(L\). The set of ultimately periodic words can be seen as the fingerprint of \(L\):

Theorem 1

(Ultimately Periodic Words [8]). Let \(L\), \(L'\) be two \(\omega \)-regular languages. Then \(L= L'\) if, and only if, \(\text {UP}(L) = \text {UP}(L')\).

An immediate consequence of Theorem 1 is that, for any two \(\omega \)-regular languages \(L_{1}\) and \(L_{2}\), if \(L_{1} \ne L_{2}\) then there is an ultimately periodic word \(xy^{\omega } \in \big (\text {UP}(L_{1}) \setminus \text {UP}(L_{2})\big ) \cup \big (\text {UP}(L_{2}) \setminus \text {UP}(L_{1})\big )\). It follows that \(xy^{\omega } \in L_{1} \setminus L_{2}\) or \(xy^{\omega } \in L_{2} \setminus L_{1}\). Let \(\mathcal {A}\), \(\mathcal {B}\) be two NBAs and assume that \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B}) \ne \emptyset \). One can find an ultimately periodic word \(xy^{\omega } \in \mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B})\) as a counterexample to \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\).

Language inclusion between NBAs can be reduced to complementation, intersection, and emptiness problems on NBAs. The complementation operation of an NBA \(\mathcal {B}\) is to construct an NBA \({\mathcal {B}}^{\mathsf {c}}\) accepting the complement language of \(\mathcal {L}(\mathcal {B})\), i.e., \(\mathcal {L}({\mathcal {B}}^{\mathsf {c}}) = \varSigma ^{\omega }\setminus \mathcal {L}(\mathcal {B})\).

Lemma 1

(cf. [17, 19]). Let \(\mathcal {A}\), \(\mathcal {B}\) be NBAs with \(n_{a}\) and \(n_{b}\) states, respectively.

  1. It is possible to construct an NBA \({\mathcal {B}}^{\mathsf {c}}\) such that \(\mathcal {L}({\mathcal {B}}^{\mathsf {c}}) = \varSigma ^{\omega }\setminus \mathcal {L}(\mathcal {B})\) whose number of states is at most \((2n_{b}+2)^{n_{b}} \times 2^{n_{b}}\), by means of the complement construction.

  2. It is possible to construct an NBA \(\mathcal {C}\) such that \(\mathcal {L}(\mathcal {C}) = \mathcal {L}(\mathcal {A}) \cap \mathcal {L}({\mathcal {B}}^{\mathsf {c}})\) whose number of states is at most \(2 \times n_{a} \times (2n_{b}+2)^{n_{b}} \times 2^{n_{b}}\), by means of the product construction. Note that \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\) holds if and only if \(\mathcal {L}(\mathcal {C}) = \emptyset \) holds.

  3. \(\mathcal {L}(\mathcal {C}) = \emptyset \) is decidable in time linear in the number of states of \(\mathcal {C}\).

Further, testing whether an \(\omega \)-word w is accepted by a Büchi automaton \(\mathcal {B}\) can be done in time polynomial in the size of the decomposition \((u, v)\) of \(w = uv^{\omega }\).

Lemma 2

(cf. [17]). Let \(\mathcal {B}\) be an NBA with n states and let \((u, v)\) be a decomposition of an ultimately periodic word with \(|u| + |v| = m\). Then checking whether \(uv^{\omega }\) is accepted by \(\mathcal {B}\) is decidable in time and space linear in \(n \times m\).
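To make the membership test of Lemma 2 concrete, the following Python sketch (all names are ours; ROLL's actual implementation is in Java and differs) checks \(uv^{\omega } \in \mathcal {L}(\mathcal {B})\) by exploring the product of \(\mathcal {B}\) with the m-position lasso structure of \(uv^{\omega }\): the word is accepted if and only if some accepting product state in the v-part is reachable and lies on a cycle. For simplicity, the sketch uses repeated reachability instead of a single SCC decomposition, trading the linear bound of Lemma 2 for a quadratic one.

```python
from collections import deque

def member(u, v, init, trans, acc):
    """Check whether u v^w is accepted by the NBA (init, trans, acc).
    u, v: lists of letters (v non-empty); init, acc: sets of states;
    trans: dict mapping a state to a list of (letter, successor) pairs.
    A product state (q, i) pairs a state q with a position i in the
    lasso structure of u v^w; positions in the v-part form a cycle."""
    m = len(u) + len(v)

    def letter(i):  # letter read at lasso position i
        return u[i] if i < len(u) else v[i - len(u)]

    def nxt(i):  # successor position; the v-part loops back
        return i + 1 if i + 1 < m else len(u)

    def steps(q, i):  # product successors of (q, i)
        return {(q2, nxt(i)) for a, q2 in trans.get(q, ()) if a == letter(i)}

    def reach(sources):  # forward BFS over product states
        seen, todo = set(sources), deque(sources)
        while todo:
            for p in steps(*todo.popleft()):
                if p not in seen:
                    seen.add(p)
                    todo.append(p)
        return seen

    for q, i in reach({(q0, 0) for q0 in init}):
        # accepted iff a reachable accepting product state in the v-part
        # can reach itself again, i.e., lies on a cycle
        if q in acc and i >= len(u) and (q, i) in reach(steps(q, i)):
            return True
    return False
```

An SCC-based check, as used in ROLL (cf. Sect. 4.1), achieves the linear bound stated by the lemma.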

Random Sampling and Hypothesis Testing. Statistical hypothesis testing is a method for assigning a confidence level to the correctness of an interpretation of a small set of data sampled from a population, when this interpretation is extended to the whole population.

Let Z be a Bernoulli random variable with parameter \(p_{Z}\) and let X be the random variable whose value is the number of independent trials required until we see \(Z = 1\). Let \(\delta \) be the significance level, i.e., a bound on the probability that \(Z = 1\) does not appear within N trials. Then \(N = \lceil \ln \delta /\ln (1 - p_{Z}) \rceil \) is the number of attempts needed to see \(Z = 1\), i.e., to get a counterexample, with probability at least \(1 - \delta \).

If the exact value of \(p_{Z}\) is unknown, given an error probability \(\varepsilon \) such that \(p_{Z}\ge \varepsilon \), we have that \(M = \lceil \ln \delta /\ln (1 - \varepsilon ) \rceil \ge N = \lceil \ln \delta /\ln (1 - p_{Z}) \rceil \) ensures that \(p_{Z}\ge \varepsilon \implies \mathbf {Pr}[X \le M] \ge 1 - \delta \). In other words, M is the minimal number of attempts required to find a counterexample with probability at least \(1 - \delta \), under the assumption that \(p_{Z}\ge \varepsilon \). See, e.g., [15, 27] for more details about statistical hypothesis testing in the context of formal verification.
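As a tiny illustration (the function name is ours), the number M of samples can be computed directly from the formula above:

```python
import math

def num_samples(eps, delta):
    """Minimal number of trials M such that, if p_Z >= eps, a witness
    is found within M trials with probability at least 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1.0 - eps))

# The statistical parameters used in Sect. 4: eps = 0.1% and delta = 2%.
print(num_samples(0.001, 0.02))  # 3911, the "roughly 4000" words of Sect. 4
```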

3 Monte Carlo Sampling for Non-inclusion Testing

In this section we present our Monte Carlo sampling algorithm \(\mathsf {IMC}^{2}\) for testing non-inclusion between Büchi automata.

3.1 \(\mathsf {MC}^{2}\): Monte Carlo Sampling for LTL Model Checking

In [15], the authors proposed a Monte Carlo sampling algorithm \(\mathsf {MC}^{2}\) for verifying whether a given system A satisfies a Linear Temporal Logic (LTL) specification \(\varphi \). \(\mathsf {MC}^{2}\) works directly on the product Büchi automaton \(\mathcal {P}\) that accepts the language \(\mathcal {L}(A) \cap \mathcal {L}(\mathcal {B}_{\lnot \varphi })\). It essentially checks whether \(\mathcal {L}(\mathcal {P})\) is empty.

First, \(\mathsf {MC}^{2}\) takes two statistical parameters \(\varepsilon \) and \(\delta \) as input and computes the number of samples M for the experiment. Since every ultimately periodic word \(xy^{\omega } \in \mathcal {L}(\mathcal {P})\) corresponds to some cyclic run (or “lasso”) in \(\mathcal {P}\), \(\mathsf {MC}^{2}\) just needs to find an accepting lasso, whose corresponding ultimately periodic word \(xy^{\omega }\) satisfies \(xy^{\omega } \in \mathcal {L}(\mathcal {P})\). In each sampling procedure, \(\mathsf {MC}^{2}\) starts from a randomly chosen initial state and performs a random walk on \(\mathcal {P}\)’s transition graph until a state has been visited twice, which gives a lasso in \(\mathcal {P}\). \(\mathsf {MC}^{2}\) then checks whether there exists an accepting state in the repeated part of the sampled lasso. If so, \(\mathsf {MC}^{2}\) reports it as a counterexample to the verification, otherwise it continues with another sampling round if necessary. The correctness of \(\mathsf {MC}^{2}\) is straightforward, as the product automaton \(\mathcal {P}\) is non-empty if and only if there is an accepting lasso.
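For illustration, one sampling round of \(\mathsf {MC}^{2}\) can be sketched in Python as follows, assuming the product \(\mathcal {P}\) is given explicitly as a transition map (names are ours, not those of [15]); every state is assumed to have at least one outgoing transition.

```python
import random

def mc2_sample(init, trans, acc):
    """One MC^2 sample: random walk on the product P until some state is
    visited twice; returns the lasso word (u, v) and whether the cycle
    contains an accepting state (i.e., whether the lasso is accepting).
    trans: dict mapping a state to a list of (letter, successor) pairs."""
    q = random.choice(list(init))        # uniform choice of an initial state
    path, seen, word = [q], {q: 0}, []
    while True:
        a, q2 = random.choice(trans[q])  # uniform outgoing transition
        word.append(a)
        if q2 in seen:                   # state visited twice: lasso closed
            i = seen[q2]
            return (word[:i], word[i:]), any(s in acc for s in path[i:])
        seen[q2] = len(path)
        path.append(q2)
        q = q2
```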

3.2 The Lasso Construction Fails for Language Inclusion

The Monte Carlo sampling algorithm in [15] operates directly on the product automaton. For language inclusion, as discussed in the introduction, constructing this product is the bottleneck. Thus, we aim at a sampling algorithm operating on the automata \(\mathcal {A}\) and \(\mathcal {B}\) separately. With this in mind, we first show that directly applying \(\mathsf {MC}^{2}\) can be incomplete for language inclusion checking.

Fig. 1. Two NBAs \(\mathcal {A}\) and \(\mathcal {B}\).

Example 1

Consider checking the language inclusion of the Büchi automata \(\mathcal {A}\) and \(\mathcal {B}\) in Fig. 1. As we want to exploit \(\mathsf {MC}^{2}\) to find a counterexample to the inclusion, we need to sample a word from \(\mathcal {A}\) that is accepted by \(\mathcal {A}\) but not accepted by \(\mathcal {B}\). In [15], the sampling procedure is terminated as soon as a state is visited twice. Thus, the set of lassos that can be sampled by \(\mathsf {MC}^{2}\) is \(\{s_{1} a s_{1}, s_{1} b s_{2} b s_{2}\}\), which yields the set of words \(\{a^{\omega }, b^{\omega }\}\). It is easy to see that neither of these two words is a counterexample to the inclusion. The inclusion, however, does not hold: the word \(ab^{\omega } \in \mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B})\) is a counterexample.    \(\lozenge \)

According to Theorem 1, if \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B}) \ne \emptyset \), then there must be an ultimately periodic word \(xy^{\omega } \in \mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B})\) as a counterexample to the inclusion. It follows that there exists some lasso in \(\mathcal {A}\) whose corresponding ultimately periodic word is a counterexample to the inclusion. The limitation of \(\mathsf {MC}^{2}\) in checking the inclusion is that it only samples simple lasso runs, which may miss the non-trivial lassos of \(\mathcal {A}\) that correspond to counterexamples to the inclusion. The reason simple lassos suffice for checking non-emptiness of the product automaton is that the product automaton already synchronizes the behaviors of \(\mathcal {A}\) and \(\mathcal {B}_{\lnot \varphi }\).

In the remainder of this section, we shall propose a new definition of lassos by allowing multiple occurrences of states, which is the key point of our extension.

3.3 \(\mathsf {IMC}^{2}\): Monte Carlo Sampling for Inclusion Checking

We now present our Monte Carlo sampling algorithm called \(\mathsf {IMC}^{2}\) specialized for testing the language inclusion between two given NBAs \(\mathcal {A}\) and \(\mathcal {B}\).

We first define the lassos of \(\mathcal {A}\) in Definition 1 and show how to compute the probability of a sampled lasso in Definition 2. Then we prove that, with our definition of the lasso probability space of \(\mathcal {A}\), the probability of sampling a lasso whose corresponding ultimately periodic word \(xy^{\omega }\) is a counterexample to the inclusion is greater than 0 under the hypothesis \(\mathcal {L}(\mathcal {A}) \not \subseteq \mathcal {L}(\mathcal {B})\). Thus, if inclusion does not hold, we will eventually sample from \(\mathcal {A}\) a counterexample to the inclusion; in other words, we are able to obtain a counterexample to the inclusion with high probability from a sufficiently large number of samples.

In practice, a lasso of \(\mathcal {A}\) is sampled via a random walk on \(\mathcal {A}\)’s transition graph, starting from a randomly chosen initial state and picking uniformly at random one of the outgoing transitions at each step. In the following, we fix a natural number \(K \ge 2\) unless explicitly stated otherwise and two NBAs \(\mathcal {A} = (\varSigma , Q, Q_{I}, \mathrm {T}, Q_{F})\) and \(\mathcal {B}\). We assume that each state in \(\mathcal {A}\) can reach an accepting state and has at least one outgoing transition. Note that each NBA \(\mathcal {A}\) with \(\mathcal {L}(\mathcal {A}) \ne \emptyset \) can be pruned to satisfy this assumption, as sketched below; only NBAs \(\mathcal {A}'\) with \(\mathcal {L}(\mathcal {A}') = \emptyset \) fail to satisfy it, but for these automata the problem \(\mathcal {L}(\mathcal {A}') \subseteq \mathcal {L}(\mathcal {B})\) is trivial.
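A minimal Python sketch of such pruning (names are ours; a simple quadratic-time illustration, not ROLL's implementation): we keep exactly the states from which some accepting state lying on a cycle is reachable, which guarantees both parts of the assumption whenever \(\mathcal {L}(\mathcal {A}) \ne \emptyset \) and does not change the accepted language.

```python
from collections import deque

def prune(states, init, trans, acc):
    """Prune an NBA so that every remaining state can reach an accepting
    state and has an outgoing transition; the pruned set of initial
    states is empty exactly when the language of the NBA is empty.
    trans: dict mapping a state to a list of (letter, successor) pairs."""
    def reach(sources):  # forward BFS
        seen, todo = set(sources), deque(sources)
        while todo:
            for _, t in trans.get(todo.popleft(), ()):
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
        return seen

    # accepting states that can reach themselves in at least one step
    live = {f for f in acc if f in reach({t for _, t in trans.get(f, ())})}
    # keep the states from which some such "live" accepting state is reachable
    keep = {s for s in states if reach({s}) & live}
    return (keep, init & keep,
            {s: [(a, t) for a, t in trans.get(s, ()) if t in keep]
             for s in keep},
            acc & keep)
```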

Definition 1 (Lasso)

Given two NBAs \(\mathcal {A}\), \(\mathcal {B}\) and a natural \(K \ge 2\), a finite run \(\sigma = q_{0} a_{0} q_{1} \cdots a_{n-1} q_{n} a_{n} q_{n+1}\) of \(\mathcal {A}\) is called a K-lasso if (1) each state in \(\{q_{0}, \dotsc , q_{n}\}\) occurs at most \(K - 1\) times in \(q_{0} a_{0} q_{1} \cdots a_{n-1} q_{n}\) and (2) \(q_{n+1} = q_{i}\) for some \(0 \le i \le n\) (thus, \(q_{n+1}\) occurs at most K times in \(\sigma \)). We write \(\sigma \bot \) for the terminating K-lasso \(\sigma \), where \(\bot \) is a fresh symbol denoting termination. We denote by \(S^{K}_{\mathcal {A}}\) the set of all terminating K-lassos for \(\mathcal {A}\).

We call \(\sigma \bot \in S^{K}_{\mathcal {A}}\) a witness for \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B}) \ne \emptyset \) if the associated \(\omega \)-word \((a_{0} \cdots a_{i-1}, a_{i} \cdots a_{n})\) is accepted by \(\mathcal {A}\) but not accepted by \(\mathcal {B}\).

It is worth noting that not every finite cyclic run of \(\mathcal {A}\) is a valid K-lasso. Consider, for instance, the NBA \(\mathcal {A}\) shown in Fig. 1: the run \(s_{1} a s_{1} b s_{2} b s_{2}\) is not a lasso when \(K = 2\) since, by Definition 1, every state except the last one is allowed to occur at most \(K - 1 = 1\) time; \(s_{1}\) clearly violates this requirement since it occurs twice and is not the last state of the run. The run \(s_{1} b s_{2} b s_{2}\), instead, is obviously a valid lasso when \(K = 2\).

Remark 1

A K-lasso \(\sigma \) is also a \(K'\)-lasso for any \(K' > K\). Moreover, a terminating K-lasso can be a witness without being an accepting run: according to Definition 1, a terminating K-lasso \(\sigma \bot \) is a witness if its corresponding word \(uv^{\omega }\) is accepted by \(\mathcal {A}\) but not accepted by \(\mathcal {B}\). This does not imply that \(\sigma \) is an accepting run, since there may be another run \(\sigma '\) on the same word \(uv^{\omega }\) that is accepting.

In order to define a probability space over \(S^{K}_{\mathcal {A}}\), we first define the probability of a terminating K-lasso of \(\mathcal {A}\). We denote by \(\#(\sigma , q)\) the number of occurrences of the state q in the K-lasso \(\sigma \).

Definition 2 (Lasso Probability)

Given an NBA \(\mathcal {A}\), a natural number \(K \ge 2\), and a stopping probability \(p_{\bot }\in (0,1)\), the probability \(\mathbf {Pr}_{p_{\bot }}[\sigma \bot ]\) of a terminating K-lasso \(\sigma \bot = q_{0} a_{0} \cdots q_{n} a_{n} q_{n+1}\bot \in S^{K}_{\mathcal {A}}\) is defined as follows:

$$\begin{aligned}&\mathbf {Pr}_{p_{\bot }}[\sigma \bot ] = {\left\{ \begin{array}{ll} \mathbf {Pr}'_{p_{\bot }}[\sigma ] &{} \text {if } \#(\sigma , q_{n+1}) = K,\\ p_{\bot }\cdot \mathbf {Pr}'_{p_{\bot }}[\sigma ] &{} \text {if } \#(\sigma , q_{n+1}) < K; \end{array}\right. } \\&\mathbf {Pr}'_{p_{\bot }}[\sigma ] = {\left\{ \begin{array}{ll} \frac{1}{|Q_{I}|} &{} \text {if } \sigma = q_{0};\\ \mathbf {Pr}'_{p_{\bot }}[\sigma '] \cdot \pi [q_{l} a_{l} q_{l+1}] &{} \text {if } \sigma = \sigma ' a_{l} q_{l+1} \text { and } \#(\sigma ', q_{l}) = 1;\\ (1 {-} p_{\bot }) \cdot \mathbf {Pr}'_{p_{\bot }}[\sigma '] \cdot \pi [q_{l} a_{l} q_{l+1}] &{} \text {if } \sigma = \sigma ' a_{l} q_{l+1} \text { and } \#(\sigma ', q_{l}) > 1, \end{array}\right. } \end{aligned}$$

where \(\pi [q a q'] = \frac{1}{m}\) if \((q, a, q') \in \mathrm {T}\) and \(m = |\mathrm {T}(q)|\) is the number of transitions leaving q, and \(\pi [q a q'] = 0\) otherwise.

We extend \(\mathbf {Pr}_{p_{\bot }}\) to sets of terminating K-lassos in the natural way, i.e., for \(S \subseteq S^{K}_{\mathcal {A}}\), \(\mathbf {Pr}_{p_{\bot }}[S] = \sum _{\sigma \bot \in S} \mathbf {Pr}_{p_{\bot }}[\sigma \bot ]\).
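For a concrete terminating K-lasso, \(\mathbf {Pr}_{p_{\bot }}\) can be computed directly from the definition, as in the following Python sketch (names are ours):

```python
def lasso_probability(states, n_init, trans, K, p_stop):
    """Pr of Definition 2 for a terminating K-lasso given by its state
    sequence q_0 .. q_{n+1}; n_init = |Q_I|;
    trans: dict mapping a state to a list of (letter, successor) pairs."""
    prob = 1.0 / n_init                # uniform choice of the initial state
    visits = {states[0]: 1}
    for q, q2 in zip(states, states[1:]):
        if visits[q] > 1:              # continued from a repeated state
            prob *= 1.0 - p_stop
        prob *= 1.0 / len(trans[q])    # uniform transition choice pi
        visits[q2] = visits.get(q2, 0) + 1
    if visits[states[-1]] < K:         # explicit stop, taken with prob. p_stop
        prob *= p_stop
    return prob
```

On the lasso \(s_{1} a s_{1} b s_{2} b s_{2}\bot \) of Fig. 2 (with \(|Q_{I}| = 1\), two transitions leaving \(s_{1}\) and one leaving \(s_{2}\)), this yields \(1 \cdot \frac{1}{2} \cdot (1-p_{\bot }) \cdot \frac{1}{2} \cdot 1 \cdot p_{\bot } = \frac{1}{16}\) for \(p_{\bot }= \frac{1}{2}\) and \(K = 3\), matching Example 2 below.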

Intuitively, assume that the current (i.e., last) state of the run \(\sigma \) is q. If q has already been visited at least twice but fewer than K times, the run \(\sigma \) can either terminate at q with probability \(p_{\bot }\) or continue with probability \(1 - p_{\bot }\) by taking uniformly one of the outgoing transitions of q. As soon as q has been visited K times, however, the run \(\sigma \) has to terminate.
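This sampling process can be rendered directly in Python (names are ours; ROLL's implementation is in Java); it assumes, as stated above, that every state has at least one outgoing transition:

```python
import random

def sample_k_lasso(init, trans, K, p_stop):
    """Sample one terminating K-lasso per Definitions 1 and 2.
    Returns the run as (states, letters); the walk may only stop at a
    repeated state, must stop at the K-th visit, and otherwise stops
    with probability p_stop and continues with probability 1 - p_stop."""
    q = random.choice(list(init))        # uniform choice of an initial state
    states, letters = [q], []
    visits = {q: 1}
    while True:
        if visits[q] >= 2 and (visits[q] == K or random.random() < p_stop):
            return states, letters       # terminating K-lasso
        a, q2 = random.choice(trans[q])  # uniform outgoing transition
        letters.append(a)
        states.append(q2)
        visits[q2] = visits.get(q2, 0) + 1
        q = q2

def lasso_word(states, letters):
    """Decomposition (u, v) of the lasso word u v^w; we take the first
    occurrence of the closing state as the split point i of Definition 1."""
    i = states.index(states[-1])
    return letters[:i], letters[i:]
```

Setting \(K = 2\) and p_stop = 1 degenerates to the simple lassos of [15] (cf. the remark at the end of Example 2).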

Fig. 2. An instance \(\mathcal {T}\) of the trees used in the proof of Theorem 2. Each leaf node is labeled with a terminating 3-lasso \(\sigma \bot \in S^{3}_{\mathcal {A}}\) for the NBAs \(\mathcal {A}\) and \(\mathcal {B}\) shown in Fig. 1, together with its corresponding probability value \(\mathbf {Pr}_{\frac{1}{2}}[\sigma \bot ]\).

Theorem 2 (Lasso Probability Space)

Let \(\mathcal {A}\) be an NBA, \(K \ge 2\) a natural number, and \(p_{\bot }\in (0,1)\) a stopping probability. The \(\sigma \)-field \((S^{K}_{\mathcal {A}}, 2^{S^{K}_{\mathcal {A}}})\) together with \(\mathbf {Pr}_{p_{\bot }}\) defines a discrete probability space.

Proof

(sketch). The facts that \(\mathbf {Pr}_{p_{\bot }}[\sigma \bot ]\) is a non-negative real value for each \(\sigma \bot \in S^{K}_{\mathcal {A}}\) and that \(\mathbf {Pr}_{p_{\bot }}[S_{1} \cup S_{2}] = \mathbf {Pr}_{p_{\bot }}[S_{1}] + \mathbf {Pr}_{p_{\bot }}[S_{2}]\) for each \(S_{1}, S_{2} \subseteq S^{K}_{\mathcal {A}}\) such that \(S_{1} \cap S_{2} = \emptyset \) are both immediate consequences of the definition of \(\mathbf {Pr}_{p_{\bot }}\).

The interesting part of the proof is about showing that \(\mathbf {Pr}_{p_{\bot }}[S^{K}_{\mathcal {A}}] = 1\). To prove this, we make use of a tree \(\mathcal {T}= (N, \langle \lambda , 1 \rangle , E)\), like the one shown in Fig. 2, whose nodes are labelled with finite runs and probability values. In particular, we label the leaf nodes of \(\mathcal {T}\) with the terminating K-lassos in \(S^{K}_{\mathcal {A}}\), while we use their finite run prefixes to label the internal nodes. Formally, let \(P = \{\, \sigma ' \in Q\times (\varSigma \times Q)^{*} \mid \sigma ' \text { is a prefix of some } \sigma \bot \in S^{K}_{\mathcal {A}} \,\}\) be the set of prefixes of the K-lassos in \(S^{K}_{\mathcal {A}}\). Then \(\mathcal {T}\)’s components are defined as follows:

  • \(N = \big (P \times (0, 1]\big ) \cup \big (S^{K}_{\mathcal {A}} \times (0, 1]\big ) \cup \{\langle \lambda , 1 \rangle \}\) is the set of nodes,

  • \(\langle \lambda , 1 \rangle \) is the root of the tree, and

  • \(E \subseteq \Big (\{\langle \lambda , 1 \rangle \} \times \big (P \times (0, 1]\big )\Big ) \cup \Big (P \times (0, 1]\Big )^{2} \cup \Big (\big (P \times (0, 1]\big ) \times \big (S^{K}_{\mathcal {A}} \times (0, 1]\big )\Big )\) is the set of edges, defined according to Definition 2: the root \(\langle \lambda , 1 \rangle \) has a child \(\langle q, \frac{1}{|Q_{I}|} \rangle \) for each \(q \in Q_{I}\); each node \(\langle \sigma , p \rangle \) with \(\#(\sigma , \sigma _{l}) < K\) has a child \(\langle \sigma a q', p \cdot c \cdot \pi [\sigma _{l} a q'] \rangle \) for each transition \((\sigma _{l}, a, q') \in \mathrm {T}\), where \(c = 1 - p_{\bot }\) if \(\#(\sigma , \sigma _{l}) > 1\) and \(c = 1\) otherwise; moreover, each node \(\langle \sigma , p \rangle \) with \(\sigma \bot \in S^{K}_{\mathcal {A}}\) has the leaf child \(\langle \sigma \bot , p \cdot p_{\bot } \rangle \) if \(\#(\sigma , \sigma _{l}) < K\), and the leaf child \(\langle \sigma \bot , p \rangle \) if \(\#(\sigma , \sigma _{l}) = K\),

    where \(\sigma _{l}\) denotes the last state \(s_{n}\) of the finite run \(\sigma = s_{0} a_{0} s_{1} \dotsc a_{n-1} s_{n}\).

Then we show a correspondence between the reachable leaf nodes and the terminating K-lassos with their \(\mathbf {Pr}_{p_{\bot }}\) probability values, and that the probability value in each internal node equals the sum of the probabilities of its children. By the finiteness of the reachable part of the tree we then derive \(\mathbf {Pr}_{p_{\bot }}[S^{K}_{\mathcal {A}}] = 1\).    \(\square \)

Example 2

(Probability of lassos). Consider the Büchi automaton \(\mathcal {A}\) of Fig. 1 and \(p_{\bot }= \frac{1}{2}\). For \(K = 2\), there are only two terminating 2-lassos, namely \(s_{1} a s_{1}\bot \) and \(s_{1} b s_{2} b s_{2}\bot \). According to Definition 2, each lasso occurs with probability \(\frac{1}{2}\), and neither is a witness since the corresponding ultimately periodic words \(a^{\omega }\) and \(bb^{\omega }\) do not belong to the language \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B})\). If we set \(K = 2\) to check whether \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\), we end up concluding that the inclusion holds with probability 1, since the probability of finding some lasso of \(\mathcal {A}\) related to the \(\omega \)-word \(ab^{\omega } \in \mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B})\) is 0. If we want to find a witness K-lasso, we need to set K to at least 3, since then the terminating 3-lasso \(s_{1} a s_{1} b s_{2} b s_{2}\bot \) with corresponding \(\omega \)-word \(abb^{\omega } \in \mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B})\) can be found with probability \(\frac{1}{16} > 0\).

We remark that the Monte Carlo method proposed in [15] uses lassos that are the special instance of Definitions 1 and 2 obtained by letting \(K = 2\) and \(p_{\bot }= 1\); thus, their method is not complete for NBA language inclusion checking.    \(\lozenge \)

As in the proof of Theorem 2, the probability space of the sampled terminating K-lassos of \(\mathcal {A}\) can be organized as a tree like the one shown in Fig. 2. It is then easy to see that the probability of finding the witness 3-lasso \(s_{1} a s_{1} b s_{2} b s_{2}\bot \) of \(\mathcal {A}\) is \(\frac{1}{16}\), as indicated by the leaf node \(\langle s_{1} a s_{1} b s_{2} b s_{2}\bot , \frac{1}{16} \rangle \).

Definition 3 (Lasso Bernoulli Variable)

Let \(K\ge 2\) be a natural number and \(p_{\bot }\) a stopping probability. The Bernoulli random variable Z associated with the probability space \((S^{K}_{\mathcal {A}}, 2^{S^{K}_{\mathcal {A}}}, \mathbf {Pr}_{p_{\bot }})\) for the NBAs \(\mathcal {A}\) and \(\mathcal {B}\) is defined as follows: \(p_{Z} = \mathbf {Pr}_{p_{\bot }}[Z = 1] = \sum _{\sigma \bot \in S_{w}} \mathbf {Pr}_{p_{\bot }}[\sigma \bot ]\) and \(q_{Z} = \mathbf {Pr}_{p_{\bot }}[Z = 0] = \sum _{\sigma \bot \in S_{n}} \mathbf {Pr}_{p_{\bot }}[\sigma \bot ]\), where \(S_{w}, S_{n} \subseteq S^{K}_{\mathcal {A}}\) are the sets of witness and non-witness lassos, respectively.

Under the assumption \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B}) \ne \emptyset \), some witness K-lasso \(\sigma \bot \in S_{w}\) can be sampled with positive probability whenever \(\mathbf {Pr}_{p_{\bot }}[Z = 1] > 0\), as illustrated by Example 3.

Example 3

For the NBAs \(\mathcal {A}\) and \(\mathcal {B}\) shown in Fig. 1, \(K = 3\), and \(p_{\bot }= \frac{1}{2}\), the lasso probability space is organized as in Fig. 2. The lasso Bernoulli variable has associated probabilities \(p_{Z} = \frac{1}{8}\) and \(q_{Z} = \frac{7}{8}\) since the only witness lassos are \(s_{1} a s_{1} b s_{2} b s_{2}\bot \) and \(s_{1} a s_{1} b s_{2} b s_{2} b s_{2}\bot \), both occurring with probability \(\frac{1}{16}\).    \(\lozenge \)

Therefore, if we set \(K=3\) and \(p_{\bot }= \frac{1}{2}\) to check the inclusion between \(\mathcal {A}\) and \(\mathcal {B}\) from Fig. 1, each sample finds with probability \(\frac{1}{8}\) the \(\omega \)-word \(ab^{\omega }\) as a counterexample to the inclusion \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\). It follows that the probability of not finding any witness 3-lasso after 50 trials is less than 0.002, which can be made even smaller with a larger number of trials.
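Indeed, the probability that 50 independent samples all miss the two witness lassos is

$$\begin{aligned} (1 - p_{Z})^{50} = \left( \frac{7}{8} \right) ^{50} \approx 0.0013 < 0.002. \end{aligned}$$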

Fig. 3. NBA \(\mathcal {K}_{K}\) making \(p_{Z} = 0\) when checking \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {K}_{K})\) by means of sampling terminating K-lassos from \(\mathcal {A}\) shown in Fig. 1.

As we have seen in Example 2, a counterexample may not be sampled with positive probability if K is not sufficiently large; this is the main problem with the \(\mathsf {MC}^{2}\) algorithm from [15] when applied to checking language inclusion. The natural question is then: how large should K be for checking the inclusion? First, let us discuss whether K can be chosen without taking the automaton \(\mathcal {B}\) into account. Consider the NBA \(\mathcal {A}\) of Fig. 1: no matter how large K is, one can always construct an NBA \(\mathcal {K}_{K}\) with \(K + 1\) states that makes the probability \(p_{Z} = 0\), as the counterexample \(a^{l} b^{\omega } \in \mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {K}_{K})\) cannot be sampled for any \(l \ge K\). Figure 3 depicts such an NBA \(\mathcal {K}_{K}\), for which we have \(\mathcal {L}(\mathcal {K}_{K}) = \{ b^{\omega }, ab^{\omega }, aab^{\omega }, \dotsc , a^{K-1}b^{\omega }\}\). One can easily verify that the counterexample \(a^{l}b^{\omega }\) cannot be sampled from \(\mathcal {A}\) when \(l \ge K\), as sampling this word requires the state \(s_{1}\) to occur \(l+1\) times in the run, which does not yield a valid K-lasso. This means that K is a value that depends on the size of \(\mathcal {B}\). To get a K sufficiently large for every \(\mathcal {A}\) and \(\mathcal {B}\), one can just take the product of \(\mathcal {A}\) with the complement of \(\mathcal {B}\) and check how many times, in the worst case, a state of \(\mathcal {A}\) occurs in the shortest accepting run of the product.

Lemma 3

(Sufficiently Large K). Let \(\mathcal {A}\), \(\mathcal {B}\) be NBAs with \(n_{a}\) and \(n_{b}\) states, respectively, and Z be the random variable defined in Definition 3. Assume that \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B}) \ne \emptyset \). If \(K \ge 2 \times (2n_{b} + 2)^{n_{b}} \times 2^{n_{b}} + 1\), then \(\mathbf {Pr}_{p_{\bot }}[Z = 1] > 0\).

Remark 2

We want to stress that choosing K as given in Lemma 3 is a sufficient condition for sampling a counterexample with positive probability; choosing this value, however, is not a necessary condition. In practice, we can find counterexamples with positive probability with K being set to a value much smaller than \(2 \times (2n_{b} + 2)^{n_{b}} \times 2^{n_{b}} + 1\), as experiments reported in Sect. 4 indicate.

Algorithm 1. The \(\mathsf {IMC}^{2}\) algorithm.

Now we are ready to present our \(\mathsf {IMC}^{2}\) algorithm, given in Algorithm 1. On input the two NBAs \(\mathcal {A}\) and \(\mathcal {B}\), the bound K, the stopping probability \(p_{\bot }\), and the statistical parameters \(\varepsilon \) and \(\delta \), the algorithm at line 2 first computes the number M of samples according to \(\varepsilon \) and \(\delta \). Then, for each \(\omega \)-word \((u,v) = uv^{\omega }\) associated with a terminating lasso sampled at line 4 according to Definitions 1 and 2, it checks whether the lasso is a witness by first (line 5) verifying whether \(uv^{\omega } \in \mathcal {L}(\mathcal {A})\) and then (line 6) whether \(uv^{\omega } \notin \mathcal {L}(\mathcal {B})\). If the sampled lasso is indeed a witness, a counterexample to \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\) has been found, so the algorithm terminates at line 7 with the correct answer \( false \) and the counterexample \((u, v)\). If none of the M sampled lassos is a witness, then the algorithm returns \( true \) at line 8, which indicates that the hypothesis \(\mathcal {L}(\mathcal {A}) \not \subseteq \mathcal {L}(\mathcal {B})\) has been rejected and \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\) is assumed to hold. It follows that \(\mathsf {IMC}^{2}\) gives a probabilistic guarantee in terms of finding a counterexample to inclusion when K is sufficiently large, as formalized by the following proposition.
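Putting the pieces together, the overall loop of Algorithm 1 can be sketched as follows in Python, relying on the sample_k_lasso, lasso_word, and member sketches given earlier (the NBA record and all names are ours):

```python
import math
from collections import namedtuple

NBA = namedtuple("NBA", "init trans acc")  # toy automaton representation

def imc2(A, B, K, p_stop, eps, delta):
    """Sketch of Algorithm 1: returns (True, None) when no witness was
    found among the M samples (inclusion assumed to hold), or
    (False, (u, v)) with a counterexample word u v^w to the inclusion."""
    M = math.ceil(math.log(delta) / math.log(1.0 - eps))              # line 2
    for _ in range(M):
        states, letters = sample_k_lasso(A.init, A.trans, K, p_stop)  # line 4
        u, v = lasso_word(states, letters)
        # The sampled run need not be accepting (Remark 1), so membership
        # in L(A) is checked explicitly before non-membership in L(B).
        if member(u, v, A.init, A.trans, A.acc) and \
                not member(u, v, B.init, B.trans, B.acc):         # lines 5-6
            return False, (u, v)                                  # line 7
    return True, None                                             # line 8
```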

Proposition 1

Let \(\mathcal {A}\), \(\mathcal {B}\) be two NBAs and K be a sufficiently large number. If \(\mathcal {L}(\mathcal {A}) \setminus \mathcal {L}(\mathcal {B}) \ne \emptyset \), then \(\mathsf {IMC}^{2}\) finds a counterexample to the inclusion \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\) with positive probability.

In general, the exact value of \(p_{Z}\), the probability of finding a word accepted by \(\mathcal {A}\) but not accepted by \(\mathcal {B}\), is unknown or at least very hard to compute. Thus, we summarize our results about \(\mathsf {IMC}^{2}\) in Theorems 3 and 4 with respect to the choice of the statistical parameters \(\varepsilon \) and \(\delta \).

Theorem 3 (Correctness)

Let \(\mathcal {A}\), \(\mathcal {B}\) be two NBAs, K be a sufficiently large number, and \(\varepsilon \) and \(\delta \) be statistical parameters. If \(\mathsf {IMC}^{2}\) returns \( false \), then \(\mathcal {L}(\mathcal {A}) \not \subseteq \mathcal {L}(\mathcal {B})\) is certain. Otherwise, if \(\mathsf {IMC}^{2}\) returns \( true \), then the probability that we would find a counterexample by continuing to sample, under the hypothesis \(p_{Z} \ge \varepsilon \), is less than \(\delta \).

Theorem 4 (Complexity)

Given two NBAs \(\mathcal {A}\), \(\mathcal {B}\) with \(n_{a}\) and \(n_{b}\) states, respectively, and statistical parameters \(\varepsilon \) and \(\delta \), let \(M = \lceil \ln \delta / \ln (1 - \varepsilon ) \rceil \) and \(n = \max (n_{a}, n_{b})\). Then \(\mathsf {IMC}^{2}\) runs in time \(\mathcal {O}(M \cdot K \cdot n^{2})\) and space \(\mathcal {O}(K \cdot n^{2})\).

4 Experimental Evaluation

We have implemented the Monte Carlo sampling algorithm proposed in Sect. 3 in ROLL [22] in order to evaluate it. We performed our experiments on a desktop PC equipped with a 3.6 GHz Intel i7-4790 processor and 16 GB of RAM, of which 4 GB were assigned to the tool. We imposed a timeout of 300 s (5 min) for each inclusion test. In the experiments, we compare our sampling inclusion test algorithm with RABIT 2.4.5 [1, 2, 9] and SPOT 2.8.4 [11]. ROLL and RABIT are written in Java while SPOT is written in C/C++. This gives SPOT some advantage in the running time, since it avoids the overhead caused by the Java Virtual Machine. For RABIT we used the option -fastc, while for ROLL we set the parameters \(\varepsilon = 0.1\%\) and \(\delta = 2\%\), resulting in sampling roughly \(4\,000\) words for testing inclusion, \(p_{\bot }= \frac{1}{2}\), and K to the maximum of the numbers of states of the two automata. The automata we used in the experiments are represented in two formats: the BA format used by GOAL [24] and the HOA format [4]. RABIT supports only the former, SPOT only the latter, while ROLL supports both. We used ROLL to translate between the two formats and then we compared ROLL (denoted ROLL\({}_{H}\)) with SPOT on the HOA format and ROLL (denoted ROLL\({}_{B}\)) with RABIT on the BA format. When we present the outcome of the experiments, we distinguish them depending on the automata format used. This allows us to take into account the possible effects of the automata representation, on both the language they represent and the running time of the tools.

Table 1. Experiment results on random automata with fixed state space and alphabet.

4.1 Experiments on Randomly Generated Büchi Automata

To run the different tools on randomly generated automata, we used SPOT to generate 50 random HOA automata for each combination of state space size \(|Q| \in \{10, 20, \dotsc , 90, 100, 125, \dotsc , 225, 250\}\) and alphabet size \(|\varSigma | \in \{2, 4, \dotsc , 18, 20\}\), for a total of \(8\,000\) automata, which we then translated to the BA format. We then considered 100 different pairs of automata for each combination of state space size and alphabet size (say, for instance, 100 pairs of automata with 50 states and 10 letters, or 100 pairs with 175 states and 4 letters). The resulting \(16\,000\) experiments are summarized in Table 1.

For each tool, we report the number of inclusion test instances that resulted in an answer of language inclusion or non-inclusion, as well as the number of cases where a tool timed out, ran out of memory, or failed for any other reason. For the “included” case, we indicate in parentheses how many times ROLL failed to reject the hypothesis \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\), that is, ROLL returned “included” instead of the expected “not included”. For the “non included” case, instead, we split the count into the number of experiments on which both tools returned “not included” and the number of times only one tool returned “not included”; for instance, both SPOT and ROLL\({}_{H}\) returned “not included” in \(10\,177\) cases, only SPOT did so in 53 more experiments (for a total of \(10\,230\) “not included” results), and only ROLL\({}_{H}\) identified non-inclusion in \(3\,194\) additional experiments (for a total of \(13\,371\) “not included” results).

We can see in Table 1 that both ROLL\({}_{H}\) and ROLL\({}_{B}\) were able to solve many more cases than their counterparts SPOT and RABIT, respectively, on both “included” and “not included” outcomes. In particular, both ROLL\({}_{H}\) and ROLL\({}_{B}\) have been able to find a counterexample to the inclusion in many cases (\(3\,194\) and \(1\,052\), respectively) where SPOT on the HOA format and RABIT on the BA format failed.

On the other hand, there are only a few cases where SPOT or RABIT proved non-inclusion while ROLL failed to do so. In particular, since ROLL implements a statistical hypothesis testing algorithm for deciding language inclusion, we can expect a few experiments where ROLL fails to reject the hypothesis \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\). In the experiments this happened 5 (ROLL\({}_{H}\)) and 45 (ROLL\({}_{B}\)) times; this corresponds to a failure rate of less than \(0.6\%\), well below the chosen statistical parameter \(\delta = 2\%\).

Regarding the 13 failures of ROLL\({}_{H}\) and the 9 of ROLL\({}_{B}\), they are all caused by a stack overflow in the strongly connected components (SCC) decomposition procedure used for checking membership \(uv^{\omega } \in \mathcal {L}(\mathcal {A})\) or \(uv^{\omega } \in \mathcal {L}(\mathcal {B})\) (i.e., deciding whether \(\mathcal {L}(\mathcal {A})\,\cap \,\{uv^{\omega }\} = \emptyset \) or \(\mathcal {L}(\mathcal {B})\,\cap \,\{uv^{\omega }\} = \emptyset \), cf. [17]) at lines 5 and 6 of Algorithm 1, since checking whether the sampled lasso is an accepting run of \(\mathcal {A}\) does not suffice (cf. Remark 1). The 119 timeouts of ROLL\({}_{H}\) occurred for 3 pairs of automata with 200 states and 20 letters, for 12/21 pairs of automata with 225 states and 18/20 letters, respectively, and for 40/43 pairs of automata with 250 states and 18/20 letters, respectively. We plan to investigate why ROLL\({}_{H}\) suffered from these timeouts while ROLL\({}_{B}\) avoided them, in order to improve ROLL’s performance.

Fig. 4. Experiment running time on the random automata with fixed state space and alphabet.

Regarding the running time, the tools are usually rather fast in giving an answer, as we can see from the plot in Fig. 4. In this plot, we show on the y axis the total number of experiments, each one completed within the time marked on the x axis; the vertical gray line marks the timeout limit. The plot refers to the “included” and “not included” outcomes combined; the shape of the plots for the two outcomes kept separate is similar to the combined one we present in Fig. 4, the only difference being that in the “not included” case the plots for ROLL\({}_{B}\) and ROLL\({}_{H}\) would terminate earlier, since all experiments returning “not included” complete within a smaller time than those for the “included” case. As we can see, ROLL rather quickly overtakes the other tools in giving an answer. This is likely explained by the fact that, on randomly generated automata, structure-based tools such as RABIT and SPOT cannot take advantage of the symmetries or other structural properties one can find in automata obtained from, e.g., logical formulas. From the results of the experiments presented in Table 1 and Fig. 4, we conclude that the use of a sampling-based algorithm is a very fast, effective, and reliable way to disprove that \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\) holds. Moreover, we also conclude that \(\mathsf {IMC}^{2}\) complements existing approaches rather well, as it finds counterexamples to the language inclusion in many instances that other approaches fail to manage.

4.2 Effect of the Statistical Parameters \(\varepsilon \) and \(\delta \)

To analyze the effect of the choice of \(\varepsilon \) and \(\delta \) on the execution of the sampling algorithm we proposed, we randomly took 100 pairs of automata where, for each pair \((\mathcal {A}, \mathcal {B})\), the automata \(\mathcal {A}\) and \(\mathcal {B}\) have the same alphabet but possibly different state spaces. On these 100 pairs of automata, we repeatedly ran ROLL\({}_{H}\) 10 times with different values of \(\varepsilon \) in the set \(\{0.00001, 0.00051, \dotsc , 0.00501\}\) and of \(\delta \) in the set \(\{0.0001, 0.0051, \dotsc , 0.0501\}\), for a total of \(121\,000\) inclusion tests.

The choice of \(\varepsilon \) and \(\delta \) plays essentially no role in the running time for the cases where a counterexample to the language inclusion is found: the average running time is between 1.67 and 1.77 s. This is to be expected, since ROLL stops its sampling procedure as soon as a counterexample is found (cf. Algorithm 1). If we consider the number of experiments in which a counterexample is found, again there is almost no difference, since for all combinations of the parameters it ranges between 868 and 870.

On the other hand, \(\varepsilon \) and \(\delta \) indeed affect the running time for the “included” cases, since they determine the number M of sampled words and all such words have to be sampled and tested before rejecting the “non included” hypothesis. The average running time is 1 s or less for all choices of \(\varepsilon \ne 0.00001\) and \(\delta \), while for \(\varepsilon = 0.00001\), the average running time ranges between 12 and 36 s when \(\delta \) moves from 0.0501 to 0.0001, which corresponds to testing roughly \(300\,000\) to \(1\,000\,000\) sample words, respectively.
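These sample counts match the formula for M: since \(\ln (1-\varepsilon ) \approx -\varepsilon \) for small \(\varepsilon \), we have

$$\begin{aligned} M = \left\lceil \frac{\ln \delta }{\ln (1-\varepsilon )} \right\rceil \approx \frac{-\ln \delta }{\varepsilon }, \end{aligned}$$

which for \(\varepsilon = 0.00001\) gives \(M \approx 9.21/0.00001 \approx 921\,000\) samples at \(\delta = 0.0001\) and \(M \approx 3.0/0.00001 \approx 300\,000\) samples at \(\delta = 0.0501\).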

4.3 Effect of the Lasso Parameters K and \(p_{\bot }\)

Lastly, we also experimented with different values of K and \(p_{\bot }\) while keeping the statistical parameters unchanged: we generated another 100 pairs of automata as in Sect. 4.2 and then checked inclusion 10 times for each pair and for each combination of \(K \in \{2, 3, 4, 5, 6, 8, 11, 51, 101, 301\}\) and \(p_{\bot }\in \{0.05, 0.1, \dotsc , 0.95\}\).

As one can expect, low values of \(p_{\bot }\) and large values of K allow \(\mathsf {IMC}^{2}\) to find more counterexamples, at the cost of a higher running time. It is worth noting that \(K = 2\) is still rather effective in finding counterexamples: out of the \(1\,000\) executions on the pairs, \(\mathsf {IMC}^{2}\) returned “non included” between 906 and 910 times; for \(K = 3\), it ranged between 914 and 919 for \(p_{\bot }\le 0.5\) and between 909 and 912 for \(p_{\bot }> 0.5\). Larger values of K showed similar behavior. Regarding the running time, except for \(K=2\) the running time of \(\mathsf {IMC}^{2}\) depends only loosely on the choice of K for a given \(p_{\bot }\); this is likely explained by the fact that imposing, e.g., \(K=51\) still allows \(\mathsf {IMC}^{2}\) to sample lassos that are, for instance, 4-lassos. The running time is instead affected by the choice of \(p_{\bot }\) for a given \(K \ge 3\): as one can expect, the smaller \(p_{\bot }\) is, the longer \(\mathsf {IMC}^{2}\) takes to give an answer; a small \(p_{\bot }\) makes the sampled words \(uv^{\omega }\) longer, which in turn makes the membership checks on \(\mathcal {A}\) and \(\mathcal {B}\) more expensive.

Experiments suggest that taking \(0.25 \le p_{\bot }\le 0.5\) and \(3 \le K \le 11\) gives a good tradeoff between running time and the number of “non included” outcomes. Very large values of K, such as \(K > 50\), are usually not needed, also because lassos with several repetitions usually occur with rather low probability.

5 Conclusion and Discussion

We presented \(\mathsf {IMC}^{2}\), a sample-based algorithm for proving language non-inclusion between Büchi automata. The experimental evaluation showed that \(\mathsf {IMC}^{2}\) is very fast and reliable in finding witnesses of non-inclusion, sampling them in many cases where traditional structure-based algorithms fail or take too long to complete the analysis. We believe that \(\mathsf {IMC}^{2}\) is a very good technique to disprove \(\mathcal {L}(\mathcal {A}) \subseteq \mathcal {L}(\mathcal {B})\) and complements well the existing techniques for checking Büchi automata language inclusion. As future work, our algorithm can be applied to scenarios like black-box testing and PAC learning [3], in which inclusion provers are either not applicable in practice or not strictly needed. A uniform word sampling algorithm was proposed in [5] for concurrent systems with multiple components; we believe that extending our sampling algorithm to such systems is worthy of study.