1 Introduction

In recent work [10,11,12] we applied bounded model checking to verify reachability properties of threshold-based fault-tolerant distributed algorithms (FTDA), which are parameterized in the number of processes n and the fraction of faults t. FTDAs typically work only under arithmetic resilience conditions such as \(n>3t\). Our methods allow us to do parameterized verification of sophisticated FTDAs [3, 5, 6, 18, 20, 21] that have not been automatically verified before. Our bounded model checking technique produces a number of queries to a Satisfiability Modulo Theories solver (SMT). These queries correspond to different execution patterns.

In [12], we conjectured that, by design, this technique allows many SMT queries to be checked in parallel. In this paper, we present a parallel extension of ByMC that executes SMT queries in a computer cluster.

The contributions of this paper are as follows:

  1. 1.

    We present the tool ByMC 2.4.1 that implements sequential and parallel verification [10, 12]. The parallel verification is implemented with MPI (Message Passing Interface).

  2. 2.

    We introduce the details of the parallel extension of the technique and perform experimental evaluation, both for the sequential and parallel versions of the tool.

  3. 3.

    We report the experimental results both for the abstractions that are automatically constructed from Promela code (as in [10, 12]) and for manual abstractions in terms of threshold automata, which we use as a direct input for the first time. Our experiments show that explicit modeling of fault-tolerant distributed algorithms with threshold automata leads to a dramatic speed up in most cases.

2 Distributed Algorithm Example: Naïve Voting

In order to describe what kind of distributed algorithms our tool ByMC is designed for, we start with a simple threshold-guarded algorithm. In this section, we take the point of view of an algorithm designer and apply the arguments that can be found in the distributed algorithms literature [1, 16]. Consider a distributed system of n processes, whose goal is to unanimously decide on a binary value \(v \in \{0, 1\}\). We would like to design a distributed algorithm that satisfies the following three propertiesFootnote 1:

  • Agreement. No two correct processes decide on different values (0 and 1).

  • Validity. If a correct process decides on a value \(v \in \{0, 1\}\), then there is a process i, whose initial value \(u_i\) equals v.

  • Termination. All correct processes eventually decide.

Fig. 1.
figure 1

Naïve Voting algorithm

Fig. 2.
figure 2

A threshold automaton for Naïve Voting

Figure 1 shows a naïve attempt to solve this problem by majority voting. As usual in the distributed algorithms literature, we give a solution in pseudo-code, which is supposed to work as follows. Each process starts with a binary value \(u_i \in \{0, 1\}\) and sends \(u_i\) to all processes, including itself. When a process receives a value \(v \in \{0, 1\}\) from a majority of processes, it decides on v.

Does Naïve Voting satisfy agreement, validity, and termination? Unfortunately, the pseudo-code does not provide us with sufficient detail to answer the question: Assumptions about the process scheduler, message-delivery, possible faults, etc. are missing. For instance, if messages can be lost, a process may never receive sufficiently many messages to get over the guard in line 4. Thus we have to specify systems assumptions. Let us consider an asynchronous model [1, 16] with crashes and Byzantine faults [19]:

  • Asynchronous computations. Every correct process is scheduled infinitely often, and there are no assumptions on the relative processor speeds. The process steps are interleaved.

  • Reliable communication. The processes communicate via message passing. Every message sent by a correct process is eventually delivered, although there are no timing or ordering assumptions about message delivery.

  • Faults. A fraction f of processes may fail. For instance, they can crash or behave Byzantine — the faulty processes do not follow the algorithm. There is an upper bound \(t \ge f\) on the number of faults. We assume \(n > 3t\) for the Byzantine faults, and \(n > 2t\) for the crash faults.

Manual proofs. Below, we manually reason about the algorithm’s correctness. Such proofs are common in the distributed algorithms literature, cf. [21, 22, 24].

  • Validity. We consider the Byzantine case here, which is more complicated. In order to decide on a value v in line 6, a correct process has to receive \(\lceil \frac{n+1}{2}\rceil \) messages carrying v. By the assumption on the number of faults (\(n > 3t\) and \(t \ge f\)), we have \(f < \lceil \frac{n+1}{2}\rceil \), and if a process decides on v in 6, there is at least one correct process that has sent the value v in line 3. Thus, the algorithm satisfies “Validity”.

  • Agreement. Whether the algorithm satisfies “Agreement” depends on the considered fault model:

    • No faults or crash faults. By line 4, a process has to receive the same value from \(\lceil \frac{n+1}{2}\rceil \) distinct processes. Since \(2 \cdot \lceil \frac{n+1}{2}\rceil > n\), and each process sends only one value (line 3), no two processes \(i, j:\ 1 \le i < j \le n\) can reach line 6 with different values \(v_i \ne v_j\). Thus, the processes cannot decide differently, and agreement is satisfied.

    • Byzantine faults. When \(f > 0\), the Byzantine processes can send value 0 to a process i and value 1 to a process \(j: j \ne i\). If the initial states of the correct processes are split into two equal sets, that is, \(n - f = 2 \cdot |\{k \in \{1 \dots n\} :k \text { is correct} \text { and } u_k = 0 \}|\), then the processes i and j reach line 6 with the values \(v_i=0\) and \(v_j=1\). As a result, agreement can be violated, and a verification tool must produce a counterexample.

  • Termination. Assume that there are no faults (\(f=0\)) and the initial states are equally partitioned, that is, \(n = 2 \cdot |\{k \in \{1 \dots n\} :k \text { is correct} \text { and } u_k = 0 \}|\). No process can pass beyond line 4, as none of the initial value sets form a majority. Therefore, Naïve Voting violates liveness, namely, “Termination”. This subtle bug renders the algorithm useless! A tool should thus not only check invariants, but also find counterexamples to liveness specifications.

The manual proofs are tricky, as they combine several kinds of reasoning: temporal reasoning, local reasoning about process code, global reasoning about the number of messages, correct and faulty processes, etc. Our tool ByMC automatically proves temporal properties (or finds counterexamples) of distributed algorithms that (i) communicate by sending to all, and (ii) contain actions that are guarded by comparison of the number of received messages against a linear combination of parameter values (e.g., for a majority).

3 Inputs: Parametric Promela and Threshold Automata

The algorithm in Fig. 1 looks quite simple. However, as one can see from the assumptions on, e.g., faults and communication in Sect. 2, many details (that are often deemed “non-essential” by algorithm designers) are missing in the pseudo code. Our tool addresses this challenge by supporting two formal languages that are tailored for modeling of threshold-guarded distributed algorithms and the system assumptions: parametric Promela [8, 9] and threshold automata [11]. Parametric Promela offers modeling that closely mimicks the behavior of the pseudo code statements, whereas threshold automata are an abstraction that allows for efficient model checking techniques [10, 12]. When given code in parametric Promela, ByMC internally applies data abstraction to construct a threshold automaton, as explained in [13]. However, the automatically computed threshold automata are usually much larger than those constructed manually by a distributed algorithms expert. For this reason, the user can directly give a threshold automaton as the input to the tool.

Fig. 3.
figure 3

Modeling Naïve Voting in Parametric Promela

3.1 Parametric Promela

Promela is the input language of the Spin model checker [7]. As it is designed to specify concurrent systems, several features are suitable for capturing distributed algorithms. However, Spin is a finite state model checker, and so Promela only allows us to specify finite state systems. We have thus extended Promela in order to have a parametric number of processes and faults, etc. In the following we will discuss some of our extensions.

Figure 3 shows a model of the Naïve Voting algorithm from Fig. 1. This example contains all the essential features of parametric Promela. In line 2, we declare parameters: the number of processes n, the number of Byzantine processes f, and the minimal size of a majority set, that is, \(\lceil \frac{n+1}{2} \rceil \). In line 3, we declare two shared integer variables nsnt0 and nsnt1 that store the number of zeroes and ones sent by the correct processes. The expressions assume(...) in lines 4–5 restrict the choice of parameter values.

The behavior of the \(n-f\) correct processes is modeled in lines 6–32. To describe a process state, we introduce the following local variables:

  • pc to store the algorithm’s control state, that is, whether a process is initialized with values 0 and 1 (i.e., pc=V0 and pc=V1 resp.), sent a message (pc=SE), decided on values 0 and 1 (i.e., pc=D0 and pc=D1 resp.)

  • nrcvd0 and nrcvd1 to store the number of zeroes and ones received from the correct and Byzantine processes; and

  • next-state variables next_pc, next_nrcvd0, and next_nrcvd1 that are used to perform a process step.

An initial process state is chosen non-deterministically in lines 12–13.

A single process step is encoded as an atomic block in lines 14–31, which corresponds to an indivisible receive-compute-send step. In lines 15–18, a process possibly receives new messages: by invoking havoc(x), we forget the contents of a variable x, and by writing assume(e), we restrict the variable values to those that satisfy a logical expression e. Note that the statements havoc and assume do not belong to the standard Promela; they belong to parametric Promela and are inspired by the similar statements in Boogie [2]. Lines 20–27 encode the computations that can be found in pseudo-code in Fig. 1. Like in Promela, a process non-deterministically picks an option of the form “:: guard -> actions”, if guard evaluates to true, and executes actions.

To specify temporal properties, we first define atomic propositions in lines 34–38. The keywords some and all correspond to existential and universal quantification over the processes; they belong to parametric Promela. In lines 40–43, define LTL formulas that capture the properties of consensus (cf. Sect. 2).

Promela code in Fig. 3 models the informal pseudo code of Naïve Voting. Note that the manual translation from pseudo code is straightforward, except for one thing: It may seem more honest to maintain sets of sent and received messages, instead of storing only integer message counters such as nrcvd0 and nsnt0. It has been proven that modeling with sets is equivalent (bisimilar) to modeling with message counters [14]. Obviously, modeling with message counters produces smaller transition systems (cf. [9]).

3.2 Threshold Automata

Our code in parametric Promela has several features: (i) each atomic step is encoded as an imperative sequence of statements, (ii) and the processes explicitly store the number of received messages in local variables such as nrcvd0 and nrcvd1. One can argue that this level of detail is not necessary, and it makes the verification problem harder. Threshold automata are a more abstract model for threshold-guarded fault-tolerant distributed algorithms [11], as they enable guarded transitions as soon as sufficiently many messages have been sent. Intuitively, the reception variables nrcvd0 and nrcvd1 are bypassed by such modeling. In this section, we introduce threshold automata in a way that explains how automata capture local transitions of individual processes. The semantics of threshold automata are then defined via counter systems in Sect. 4 that model runs of collections of processes, that is, distributed computations.

Fig. 4.
figure 4

A threshold automaton for Naïve Voting in the .ta format

We model Naïve Voting with the threshold automaton shown in Fig. 2. Its code in the .ta input format of ByMC is shown in Fig. 4. We are running \(n-f\) instances of the threshold automaton; each instance is modelling a correct process. The automata operate on shared variables such as \(\mathsf {nsnt_0}\) and \(\mathsf {nsnt_1}\), which can be only incremented. A threshold automaton resides in a local state from a finite set \({\mathcal L}\), e.g., in our example, \({\mathcal L}=\{\textsf {V0}, \textsf {V1}, \textsf {SE}, \textsf {D0}, \textsf {D1}\}\). A rule (corresponding to an edge in Fig. 2) can move an automaton from one local state to another, provided that the shared variables in the current global state satisfy the rule’s threshold guard, e.g., \(2 \cdot (\mathsf {nsnt_0}+ f) \ge n + 1\). If a rule is labeled with an increment of a shared variable, e.g., \(\mathsf {nsnt_0}\small {\texttt {++}}\), then the shared variable is updated accordingly.

4 Theoretical Background

4.1 System

We assume fixed three finite sets: the set \({\mathcal L}\) contains the local states, the set \(\varGamma \) contains the shared variables that range over non-negative integers, and the set \(\varPi \) contains the parameters that range over non-negative integers.

Configurations \(\varSigma \) and \(I\). A configuration is a vector \(\sigma =({\mathbf {\varvec{\kappa }}},\mathbf {g},\mathbf {p})\), where \(\sigma .{\mathbf {\varvec{\kappa }}}\) is a vector of counter values, \(\sigma .\mathbf {g}\) is a vector of shared variable values, and \(\sigma .\mathbf {p}= \mathbf {p}\) is a vector of parameter values. In \(\sigma .{\mathbf {\varvec{\kappa }}}\) we store for each local state \(\ell \), how many processes are in this state. All values are non-negative integers. In every initial configuration global variables have value zero, and all “modelled” processes are in initial locations. If specifications do not limit the behavior of faulty processes (which is typically the case with Byzantine faults), we only model the correct processes explicitly, while the impact of faulty processes is modelled as non-determinism in the environment.

Threshold Guards are defined according to the following grammar:

figure a

Transition relation \(R\). A transition is a pair \(t=({ rule },{ factor })\) of a rule of the \({\textsf {TA}}\) and a non-negative integer called the acceleration factor, or just factor for short. If the factor is always 1, this corresponds that at each step exactly one processes takes a step, that is, interleaving semantics. Having factors greater than 1 permits a specific form of acceleration where an arbitrary number of processes that are ready to execute a rule can do that at the same time.

Transition t is applicable (or enabled) in configuration \(\sigma \), if the guard of \(t.{ rule }\) evaluates to true, and \(\sigma .{\mathbf {\varvec{\kappa }}}[t.{ from }] \ge t.{ factor }\). Configuration \(\sigma '\) is the result of applying the enabled transition t to \(\sigma \), and write \(\sigma ' = t(\sigma )\), if

  • \(\sigma '.\mathbf {g}= \sigma .\mathbf {g}+ t.{ factor }\cdot t.\mathbf {u}\) and \(\sigma '.\mathbf {p}= \sigma .\mathbf {p}\)

  • if \(t.{ from }\ne t.{ to }\) then

    • \(\sigma '.{\mathbf {\varvec{\kappa }}}[t.{ from }]=\sigma .{\mathbf {\varvec{\kappa }}}[t.{ from }] - t.{ factor }\),

    • \(\sigma '.{\mathbf {\varvec{\kappa }}}[t.{ to }]=\sigma .{\mathbf {\varvec{\kappa }}}[t.{ to }] + t.{ factor }\), and

    • \(\forall \ell \in {\mathcal L}\setminus \{t.{ from }, t.{ to }\}\) it holds that \(\sigma '.{\mathbf {\varvec{\kappa }}}[\ell ]=\sigma .{\mathbf {\varvec{\kappa }}}[\ell ]\)

  • if \(t.{ from }= t.{ to }\) then \(\sigma '.{\mathbf {\varvec{\kappa }}}= \sigma .{\mathbf {\varvec{\kappa }}}\)

Finally, the transition relation \(R\subseteq \varSigma \times \varSigma \) of the counter system is defined as follows: \((\sigma , \sigma ') \in R\) iff there is a rule \(r\in {\mathcal R}\) and a factor \(k\in {\mathbb N}_0\) such that \(\sigma ' = t(\sigma )\) for \(t=(r,k)\).

Observe that configurations, transitions, guard, etc. can be encoded in linear integer arithmetic.

4.2 Safety and Liveness Specifications

Using counter systems, we can also easily express the temporal properties, e.g., those of Naïve Voting. To this end, for every local state \(\ell \in {\mathcal L}\), we introduce a proposition “\({\mathbf {\varvec{\kappa }}}_\ell =0\)”, which tests that there are no processes in \(\ell \). Since threshold automata do not explicitly track received messages, the assumption of reliable communication is modeled as a fairness assumption over local states and actions. The following formula captures the required fairness, that is, (i) eventually all processes leave their initial state \(\textsf {V0}\) or \(\textsf {V1}\), and (ii) if threshold guards become true, then eventually all processes fire the corresponding rules and thus evacuate the local state \(\textsf {SE}\) (the latter implication is written as disjunction):

Agreement (A), Validity (V), and Termination (T) can be written as follows:

$$\begin{aligned}&\qquad \qquad \qquad \qquad \qquad \qquad {{\mathbf {\mathsf{{G}}}}}\,({\mathbf {\varvec{\kappa }}}_\textsf {D0}= 0 \vee {\mathbf {\varvec{\kappa }}}_\textsf {D1}= 0) \quad \qquad \qquad \qquad \quad \quad \qquad \mathrm{{(A)}} \end{aligned}$$
Table 1. The syntax of \({\textsf {ELTL}}_{\textsf {FT}}\)-formulas [10]: \( pform \) defines propositional formulas, and \(\psi \) defines temporal formulas. We assume that \( Locs \subseteq {\mathcal L}\) and \( guard \in \varPhi ^{\mathrm {rise}}\cup \varPhi ^{\mathrm {fall}}\).

In [12], we have introduced a bounded model checking technique with SMT that checks reachability in counter systems of threshold automata for all combinations of the parameters. We proved that if a configuration is reachable, then there is a short schedule that reaches this configuration. As a result, bounded model checking is a complete method for reachability checking in our case. In [10], this technique was extended to \({\textsf {ELTL}}_{\textsf {FT}}\) — a fragment of \({\textsf {ELTL}}({\textsf {F}}, {\textsf {G}})\), which allows us to verify safety and liveness of counter systems of threshold automata. The syntax of \({\textsf {ELTL}}_{\textsf {FT}}\) is given in Table 1. We use this logic to express counterexamples, that is, negations of the safety and liveness specifications from above.

For instance, the negation of agreement and termination in Equations (A) and (T) fit into \({\textsf {ELTL}}_{\textsf {FT}}\), and can be written as follows:

Technically, the negation of the formula for validity given in Equation (V) does not belong to the fragment \({\textsf {ELTL}}_{\textsf {FT}}\). However, it can be easily rewritten as two formulas, for the values of i equal to 0 and 1:

5 Parameterized Model Checking by Schema Enumeration

Our verification technique consists of the following steps: From the \({\textsf {ELTL}}_{\textsf {FT}}\) specifications, our tool enumerates all shapes counterexamples can have. Each of these shapes is encoded as an SMT query, and using SMT solvers, our tool checks for each shape, whether there exists a run of the system that has this shape. Such a run would then be a witness to the violation of a specification.

Consider the agreement property (A) of Naïve Voting. A counterexample is a run of the system that starts in an initial state and satisfies its negation:

Each counterexample thus (i) satisfies the constraints for initial states, and (ii) is a sequence of applicable transitions, that (iii) end up in a state where \(({\mathbf {\varvec{\kappa }}}_\textsf {D0}\ne 0 \wedge {\mathbf {\varvec{\kappa }}}_\textsf {D1}\ne 0)\) holds. Indeed checking (A) boils down to checking reachability of a state that satisfies \(({\mathbf {\varvec{\kappa }}}_\textsf {D0}\ne 0 \wedge {\mathbf {\varvec{\kappa }}}_\textsf {D1}\ne 0)\). Our technique from [12] enumerates all shapes of such counterexamples.

The central notion is a simple schema:

$$\begin{aligned} \{ pre \}r_1^*, \dots , r^*_k\{ post \} \end{aligned}$$

where \( pre , post \subseteq \) are constraints that encode evaluation of guards, and constraints on the counters (e.g., \({\mathbf {\varvec{\kappa }}}_\textsf {D0}\ne 0\)). Thus, the schema captures that \( pre \) holds, then some transitions with rules \(r_1^*, \dots , r^*_k\) are executed to reach a state where \( post \) holds. We denote a simple schema by S. A schema is then a concatenation of simple schemas \(S_1, S_2, \dots S_k\), for some k.

For our example, the technique from [12] would generate among others, a schema like the following

$$\begin{aligned}&S_1, S_2, S_3 =\\&\qquad \qquad \qquad \qquad \qquad \quad \,\,\,\,\{{\mathbf {\varvec{\kappa }}}_\textsf {V0}+ {\mathbf {\varvec{\kappa }}}_\textsf {V1}= n \} r_1^*, \dots , r^*_4 \\&\qquad \qquad \qquad \qquad \qquad \{2 \cdot (\mathsf {nsnt_0}+ f) \ge n + 1\} r_1^*, \dots , r^*_4 \\&\qquad \qquad \{(2 \cdot (\mathsf {nsnt_0}+ f) \ge n + 1), (2 \cdot (\mathsf {nsnt_1}+ f) \ge n + 1)\} r_1^*, \dots , r^*_4\\&\qquad \{(2 \cdot (\mathsf {nsnt_0}+ f) \ge n + 1), (2 \cdot (\mathsf {nsnt_1}+ f) \ge n + 1), ({\mathbf {\varvec{\kappa }}}_\textsf {D0}\ne 0 \wedge {\mathbf {\varvec{\kappa }}}_\textsf {D1}\ne 0)\} \end{aligned}$$

that is, initially, all of the n processes are in the initial locations \(\textsf {V0}\) and \(\textsf {V1}\), then after application of some rules one of the threshold guards becomes true, then after another application of some rules both guards are true and finally a bad state is reached. The SMT solver now has to find whether an executions exists that has that form. This is done by replacing each Kleene star by a distinct variable that encodes how often a rule r is applied.

A different schema can be obtained by changing the order in which the two threshold guards become true. In general each possible order generates a different schema. The number of different schemas to be checked is factorial in the number of guards [12]. As our benchmarks have only a small number of guards, the number of calls to the SMT solver is still practical.

5.1 Checking a Single Lasso Schema with SMT

In [10] we prove that for our counter systems, a counterexample to a liveness specification has lasso shape, that is:

$$\begin{aligned} S_1 \dots S_k (S_{k+1}\dots S_{k+m})^\omega \end{aligned}$$

In this way we obtain a finite representation of an infinite execution, which again can be checked with an SMT solver.

Thus, our tool generates multiple schemas: for each safety or liveness specification, a different schema is obtained by changing the order in which the threshold guards become true. A detailed algorithm for constructing schemas is presented in [10, Fig. 10]. In a nutshell, the algorithm constructs a graph that represents the partial order on when propositions and threshold guards evaluate to true in an execution, e.g., the one in Fig. 5. Each linear extension of this partial order then defines a sequence on which propositions and guards become true. Two neighboring elements in the sequence are the \( pre \) and \( post \) of a simple schema; the concatenation of all these simple schemas is the schema our tool checks for satisfiability.

Fig. 5.
figure 5

The graph constructed from the automata guards and formula (\(\text {NV}_0\))

Our tool encodes each schema in SMT and then calls a back-end solver in order to check whether the schema generates a counterexample. In [10], we explained the SMT encoding. As the schemas are independent, these checks can be done in parallel. We have implemented and exploited this feature in [15]. As [15] was concerned with synthesis, we did not discuss the effects of parallelization there. In the following we discuss and compare the sequential and the parallel approaches.

Sequential Schema Enumeration. In the sequential mode, the schemas are simply checked one-by-one until either a counterexample is found, or all schemas have been enumerated and no counterexample has been found. (Detailed pseudo-code of the function check_one_order can be found in Fig. 10 of [10].)

figure c

Parallel Schema Enumeration. In the MPI mode, the tool runs as a system of N processes, one per CPU; the physical arrangement of the CPUs depends on the cluster configuration. Every process is assigned a unique value \( rank \) from 0 to \(N-1\): The process with \( rank =0\) is the master, whereas the other processes are the workers. Every process is enumerating the schemas as in the sequential mode but checks a schema only if the schema’s sequence number i matches the rule: \((i \mod N) = rank \). In order to terminate quickly when one process has found a bug, the workers asynchronously communicate with the master. After leaving the loop, the workers communicate with the master to deliver a counterexample, if one was found. For presentation, we assume that the master can send to and receive messages from itself.

figure d
Table 2. The experiments with the sequential (SEQ) and parallel (MPI) techniques on two kinds of inputs: Promela (white rows) and threshold automata (gray rows). The sequential experiments were run with GNU parallel [23] at AMD Opteron® 6272, 32 cores, 192 GB. The MPI benchmarks were run at Vienna Scientific Cluster 3 using 16 nodes \(\times \) 16 cores (256 processes). The symbol “” indicates timeout of 24 h.

6 Benchmarks and Experiments

Byzantine model checker is written in OCaml. Its source code and the virtual machines are available from the tool web pageFootnote 2. For the experiments conducted in this paper, we used Z3 4.6.0 [4] as a back-end SMT solver, which was linked to ByMC via Z3 OCaml bindings.

In earlier work [9], we encoded our benchmarks in Parametric Promela, using a shared variable to record the number of processes that have sent a message, and using for each process a local variable that records how many messages a process received. For this modeling we presented a data abstraction and counter abstraction in [8]. To compare later verification techniques with these initial results, we kept that encoding, although the newer techniques rest on a more abstract model of threshold automata, which have finitely many local states.

The threshold automata constructed by data abstraction are significantly larger than threshold automata constructed by a human expert. To see the influence of these modeling decisions on the verification results, we manually encoded our benchmarks as threshold automata. These benchmarks are available from our benchmark repository Footnote 3. Table 2 compares the size of the threshold automata that are: (1) produced automatically by abstraction and (2) hand-coded. The essential features of the automata are: the number of local states \(|{\mathcal L}|\), the number of rules \(|{\mathcal R}|\), and the numbers of the guards \(|\varPhi ^{\mathrm {rise}}|\) and \(|\varPhi ^{\mathrm {fall}}|\), that is, the guards of the form \(x \ge \dots \) and \(x < \dots \) respectively. Moreover, due to data abstraction, we had to consider several cases that differ in the order between the thresholds. They are mentioned in the column “Case”.

Table 2 shows the verification results for benchmarks in Promela as well as threshold automata. We ran the sequential schema enumeration (SEQ,  [10]) and the parallel schema checking technique (MPI) that is presented in this paper. The parallel experiments were run at Vienna Scientific Cluster using 256 CPU cores. For each benchmark, we picked the most challenging specifications —many of them are liveness properties —and show experimental results for them. (Needless to say, we did not run the MPI technique on the benchmarks that could be enumerated with the sequential technique in seconds.) Two columns show the essential features of the enumerated schemas: “number” displays the total number of explored schemas, and “length avg” displays the average length of schemas. For both techniques, we report the computation times and maximal memory usage during a run. For the MPI experiments, we report the average time per CPU core (column “MPI avg”) as well as the maximum time per CPU core (column “MPI max”). The deviation from the average case is negligible.

As expected, the hand-coded benchmarks are usually verified much faster. Interestingly, the manually constructed threshold automaton for one-step consensus (c1cs [3]) has more threshold guards than the abstract one: We had to more accurately encode algorithm’s decisions, crash faults, and fairness. The sequential technique times out on this benchmark. The parallel technique takes about seven times longer than with the automatic abstraction.

The parallel technique benefits from running on multiple cores, though the actual gains from parallelism depend on the benchmark. As in our experiments the verification times of a single schema negligibly deviate from the average case, the uniform distribution of schemas among the nodes seems sufficient. However, one can construct threshold automata that produce schemas whose verification times significantly vary from each other. We conjecture that an implementation with a dynamic balancer would make better use of cluster resources.

7 Conclusions

We presented our tool ByMC, and compared its sequential verification implementation to its parallel one. Moreover, by experimental evaluation we showed that manual abstractions give us threshold automata that can be verified significantly faster than those that result from automatic abstraction.

We observe that the sizes of the manually constructed threshold automata are not significantly larger than the (manually crafted) models of round-based distributed consensus presented in [17]. In their theory, threshold-guarded expressions also play a central role. Our gains in efficiency in this paper—due to manual encodings—show that the discrepancy was a result of automatic abstraction and not of the technique that uses threshold automata as its input.

We needed from one to three hours per benchmark to specify and debug a threshold automaton, while it usually took us less than 30 min to specify the same benchmark in Parametric Promela. The most difficult part of the encoding with threshold automata was to faithfully express fairness constraints over shared variables and process counters. In case of Parametric Promela, fairness constraints were much easier to write, as one could refer to the shared and local variables, which count the number of sent and received messages respectively.