1 Introduction

Classical probabilistic verification techniques rely on iterative approximation algorithms for linear equation systems and linear programs, such as value iteration (VI), e.g. [Put14]. However, the scalability of such numeric analyses is severely limited compared to standard non-quantitative (hardware or software) verification, since exact transformations, such as abstraction or partial-order reduction, are more difficult to apply. Consequently, weaker guarantees such as probably approximately correct (PAC) results become acceptable even for completely known systems (white box), and not only in contexts where the system is executable but unknown (black box) and absolute guarantees are thus impossible in principle.

Example 1

Consider the task of model checking a reachability property of a probabilistic communication protocol, which starts by generating a few, say k, random bits. Thus the execution immediately branches into \(2^k\) states. If there are only a few or hard-to-find symmetries in the behaviour, standard analysis quickly becomes infeasible. In the following, we discuss drawbacks of previously studied alternative approaches; then we suggest a new one that overcomes the difficulties for a wide class of models.

The exponential state-space explosion quickly renders explicit VI unable to propagate information by more than a single step. Besides, if the transition probabilities depend on the generated bits, even the symbolic variants of VI  [BCH+97] cannot help much. There have been two major alternatives proposed, both relying on extensive use of simulations.

  • (I) For large and possibly unknown systems, statistical model checking (SMC)  [YS02] reincarnates the Monte Carlo method. It runs simulations of the system; the resulting statistics then yields confidence intervals, i.e. PAC results. However, for unbounded-horizon properties, such as reachability or linear temporal logic (LTL)  [Pnu77], performing simulations of finite length requires some information about the model  [Kře16]:

    1. Either the second eigenvalue of the transition matrix can be bounded [LP08, YCZ10], which requires essentially complete knowledge of the system (white box) and is as hard as solving the model-checking problem itself, or

    2. the topology of the underlying state graph is known [YCZ10, HJB+10] (sometimes called grey box, e.g. [AKW19]) and the whole system is preprocessed, which defeats the purpose of sublinear analysis, or

    3. a bound on the minimum transition probability \(p_{\min }\) is known, as is the case in [BCC+14, DHKP17]. This is the closest to black box and is thus called black SMC here.

    In black SMC, simulations can be run long enough to ensure that the system passes from the transient to the recurrent part, so that reliable information on the whole infinite run is obtained. While the a priori required length is practically infeasible [BCC+14], early detection of recurrent behaviour has been proposed [DHKP17] as follows. Based on the observed part of a simulation run, a hypothesis on the topology of the system is made, answering which bottom strongly connected component (BSCC) this run ends up in. With repeated observations of transitions over the run, the confidence grows that what currently looks like a BSCC indeed is a BSCC. Since quite a few repetitions of all transitions in the BSCC are required, this approach turns out to be practical only for systems with small BSCCs and not too small \(p_{\min }\).

    In this paper, assuming knowledge of the system (white-box setting), we twist the technique into a more efficient one as follows. After quickly gaining low (and thus unreliable) confidence that the run entered a BSCC, we use the knowledge of the topology to confirm this information—again very quickly, since not the whole model is built but only the local BSCC. Consequently, BSCCs are detected fast even for larger BSCCs or small \(p_{\min }\). As the information used turns out to be quite limited, corresponding to the grey-box setting, we call this approach grey SMC.

  • (II) The other alternative to VI, now in the context of large but known systems, is asynchronous value iteration, e.g. [BT89], a generalization of the Gauss-Seidel method and the core of reinforcement learning and approximate dynamic programming. There, the VI updates on states of the system are performed in varying orders, in particular possibly entirely skipping some states. The class of algorithms providing guarantees is represented by bounded real-time dynamic programming (BRTDP) [MLG05, BCC+14, AKW19], where the states to be updated at each moment are those appearing on a current simulation run. Consequently, states with a low probability of being visited, and thus low impact on the overall value, are ignored. While this allows for treating very "wide" systems with lots of unimportant branches, the scalability problem persists as soon as the branching is very uniform (see also Example 5 and Fig. 2b). From this perspective, grey SMC relaxes the rigorous approximation in the transient part and replaces it with a statistical estimate.

Overall, grey SMC fills the gap in the following spectrum:

[Figure: the spectrum of approaches, ranging from numeric analysis (VI) and BRTDP to grey SMC and SMC.]

On the one end, numeric analysis (VI) provides reliable results; in BRTDP, simulations are additionally used in the analysis to improve the performance while preserving the guarantees. On the other end, simulations (SMC) provide PAC guarantees; grey SMC then improves the performance by additional analysis in the simulation.

Our contribution can be summarized as follows:

  • We modify the black SMC for unbounded properties of  [DHKP17] to perform better in the white-box (and actually also in the so-called grey-box) setting.

  • We compare our grey SMC to black SMC, BRTDP and VI both conceptually, illustrating advantages on examples, as well as experimentally, comparing the runtimes on standard benchmarks.

  • We present all algorithms within a unified framework, which in our opinion eases understanding and comparison, provides a more systematic insight, and is pedagogically more valuable.

Outline of the Paper: After recalling necessary definitions in Sect. 2, we describe and exemplify the algorithms in Sect. 3 and the respective key sub-procedure in Sect. 4. Then we compare the algorithms and other related work in Sect. 5, discussing the expected implications, which we confirm experimentally in Sect. 6. For a broader account on related work on SMC in the context of unbounded-horizon properties, we refer the interested reader to the survey  [Kře16].

2 Preliminaries

A probability distribution on a finite set X is a mapping \(\delta : X \rightarrow [0,1]\), such that \(\sum _{x\in X} \delta (x) = 1\). The set of all probability distributions on X is denoted by \(\mathcal {D}(X)\).

Definition 1 (MC)

A Markov chain (MC) is a tuple \(\mathsf M = (\mathsf S, \mathsf s_0, \delta )\), where \(\mathsf S\) is a finite set of states with a designated initial state \(\mathsf s_0 \in \mathsf S\), and \(\delta : \mathsf S \rightarrow \mathcal {D}(\mathsf S)\) is a transition function that given a state \(\mathsf s\) yields a probability distribution over successor states. For ease of notation, we write \(\delta (\mathsf s, \mathsf t)\) instead of \(\delta (\mathsf s)(\mathsf t)\) and \(\mathsf {Post}(\mathsf s) = \{\mathsf t \mid \delta (\mathsf s, \mathsf t) > 0\}\) to denote the set of successors of a state.

The semantics of an MC is given in the usual way by the probability space on paths. An infinite path is an infinite sequence \(\rho = \mathsf s_0 \mathsf s_1 \mathsf s_2 \cdots \), such that for every \(i \in \mathbb {N}\) we have \(\mathsf s_{i+1} \in \mathsf {Post}(\mathsf s_i)\). A finite path is a finite prefix of an infinite path. The Markov chain together with a state \(\mathsf s\) induces a unique probability distribution \(\mathbb P_{\mathsf s}\) over measurable sets of infinite paths [BK08, Ch. 10].
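To make the notation concrete for the algorithm sketches in the following sections, consider a minimal Python encoding of Definition 1. The dictionary representation and the state names are our own illustrative choices (loosely mimicking the shape of Fig. 1a), not part of the formalism.

```python
# Hypothetical toy MC: each state maps to its distribution over successors.
mc = {
    "s0": {"s1": 0.5, "u": 0.5},
    "s1": {"t": 1.0},
    "u":  {"u": 1.0},   # a BSCC that cannot reach the target
    "t":  {"t": 1.0},   # target, made absorbing
}
init = "s0"

def post(s):
    """Post(s): the set of successors t with delta(s, t) > 0."""
    return {t for t, p in mc[s].items() if p > 0}

# Sanity check: every delta(s) is a probability distribution.
assert all(abs(sum(d.values()) - 1.0) < 1e-9 for d in mc.values())
```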

Definition 2 (Reachability probability)

For a target set \(\mathsf T \subseteq \mathsf S\), we write \(\Diamond \mathsf T\) to denote the (measurable) set of all infinite paths which eventually reach \(\mathsf T\). For each \(\mathsf s \in \mathsf S\), we define the value in \(\mathsf s\) as \(\mathsf V(\mathsf s) := \mathbb P_{\mathsf s}(\Diamond \mathsf T)\).

The reachability probability is then the value of the initial state, \(\mathsf V(\mathsf s_0)\).

The value function \(\mathsf V\) satisfies the following system of equations, which is referred to as the Bellman equations:

\(\mathsf V(\mathsf s) = 1\) if \(\mathsf s \in \mathsf T\), and \(\mathsf V(\mathsf s) = \sum _{\mathsf t \in \mathsf {Post}(\mathsf s)} \delta (\mathsf s, \mathsf t) \cdot \mathsf V(\mathsf t)\) otherwise. (1)

Moreover, \(\mathsf V\) is the least solution to the Bellman equations, see e.g. [CH08].

Certain parts of the state space are of special interest for the analysis of MCs with respect to unbounded-horizon properties, such as reachability:

Definition 3 (SCC, BSCC)

A non-empty set of states \(T \subseteq \mathsf S\) is strongly connected if for every pair \(\mathsf s, \mathsf s' \in T\) there is a path (of non-zero length) from \(\mathsf s\) to \(\mathsf s'\). Such a set T is a strongly connected component (SCC) if it is maximal w.r.t. set inclusion, i.e. there exists no strongly connected \(T'\) with \(T \subsetneq T'\). An SCC T is called bottom (BSCC) if for all states \(\mathsf s \in T\) we have \(\mathsf {Post}(\mathsf s) \subseteq T\), i.e. no transition leaves the SCC.

Note that the SCCs of an MC are disjoint and that, with probability 1, the set of states visited infinitely often on a path forms a BSCC.

We consider algorithms that have limited information about the MC:

Definition 4 (Black box and grey box setting)

An algorithm inputs an MC as black box if it cannot access the whole tuple, but

  • it knows the initial state,

  • for a given state \(\mathsf s\), it can sample a successor \(\mathsf t\) according to \(\delta (\mathsf s)\),

  • it knows \(p_{\min }\), an under-approximation of the minimum transition probability.

When input as grey box, it additionally knows the number of successors \(|\mathsf {Post}(\mathsf s)|\) for each state \(\mathsf s\).

3 Description of Algorithms

In this section, we describe all of the algorithms that we compare in this paper. They all use the framework of Algorithm 1. The differences are in the instantiations of the functions (written in capital letters). This allows for an easy and modular comparison.

[Algorithm 1: the common framework, parameterized by the functions INITIALIZE, GET_STATES, UPDATE and TERM_CRIT.]

3.1 Value Iteration

Value iteration (VI), e.g. [Put14], computes the value for all states in the MC. As memory, it saves a rational number (the current estimate of the value) for every state. In \(\mathsf {INITIALIZE}\), the estimate is set to 1 for target states in \(\mathsf {T}\) and to 0 for all others. \(\mathsf {GET\_STATES}\) returns the whole state space, as the estimates of all values are updated simultaneously. \(\mathsf {UPDATE}\) works by performing a so-called Bellman backup, i.e. given the current estimate function \(\mathsf L_i\), the next estimate is computed by applying the Bellman Equation (1) as follows: \(\mathsf L_{i+1}(\mathsf s) = \sum _{\mathsf t \in \mathsf {Post}(\mathsf s)} \delta (\mathsf s, \mathsf t) \cdot \mathsf L_i(\mathsf t)\) for \(\mathsf s \notin \mathsf T\).

Fig. 1. (a) Example Markov chain. (b) Under-approximations computed by value iteration, see Example 2. (c) Under- and over-approximations computed by bounded value iteration, see Example 3.

Example 2

Consider the MC from Fig. 1a with the reachability objective \(\{\mathsf t\}\). The estimates that VI computes in the first 4 iterations are depicted in Fig. 1b. The target state \(\mathsf t\) is initialized to 1, everything else to 0. The estimate of the remaining non-target state stays at 0, as it forms a BSCC with no possibility to reach the target state. Since these two states do not change, they are omitted in the figure. In every iteration, the estimates of the transient states are updated and become more precise, coming closer to their true value 0.5. However, they converge to 0.5 only in the limit, as for any finite number of iterations there is a positive probability to remain in the transient part. Note that the estimate of the state farthest from \(\mathsf t\) is always two steps behind that of the state closest to \(\mathsf t\), as it takes two iterations to backpropagate the current estimate.

VI converges to the true value only in the limit, hence we need some termination criterion \(\mathsf {TERM\_CRIT}\) to stop when we are close enough. However, to be certain that the estimate is close, one has to perform an exponential number of iterations  [CH08], which is infeasible. Hence, usually this version of VI does not give convergence guarantees, but instead just runs until the difference between two successive iterations is small. The result of this heuristic is guaranteed to be a lower bound, but can be arbitrarily imprecise  [HM18], as we will also see in Example 3.
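The following sketch instantiates the framework for plain VI on the dictionary encoding from Sect. 2; the function names mirror Algorithm 1, and the threshold tol is a hypothetical parameter for the unguaranteed stopping heuristic just described.

```python
def value_iteration(mc, targets, tol=1e-6):
    """Plain synchronous VI.  INITIALIZE: 1 on targets, 0 elsewhere.
    GET_STATES: all states.  UPDATE: Bellman backup, Eq. (1).
    TERM_CRIT: stop once the change between successive iterations drops
    below tol -- as discussed above, this yields a lower bound with no
    guarantee on the actual error."""
    L = {s: 1.0 if s in targets else 0.0 for s in mc}
    while True:
        new = {s: 1.0 if s in targets
                  else sum(p * L[t] for t, p in mc[s].items())
               for s in mc}
        if max(abs(new[s] - L[s]) for s in mc) < tol:
            return new
        L = new

# On the toy MC from Sect. 2: value_iteration(mc, {"t"})["s0"] == 0.5.
```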

3.2 Bounded Value Iteration

To be able to give convergence guarantees, bounded value iteration (BVI, also called interval iteration) was introduced, more generally for Markov decision processes, in [BCC+14, HM18]. In this paper, we only focus on Markov chains, i.e. Markov decision processes with a single action in every state. In addition to the under-approximation computed by VI, this approach also computes a convergent over-approximation. For this, it stores a second rational number for every state. Dually to the under-approximation, \(\mathsf {INITIALIZE}\) sets the upper estimate to 0 in states that cannot reach the target and to 1 everywhere else. Note that to find the states with value 0, i.e. the BSCCs that do not contain the target, BVI has to perform a graph analysis, e.g. a backwards search from the targets. BVI still works on the whole state space and the update is completely analogous to VI, only this time updating both approximations. As \(\mathsf {TERM\_CRIT}\), BVI checks that the difference between the over- and under-approximation in the initial state is smaller than a given precision \(\varepsilon \). This guarantees that the returned value is \(\varepsilon \)-precise.

Example 3

Consider the MC from Fig. 1a with the same objective, but this time additionally computing the over-approximation. Note that by pre-processing we set the over-approximation of the non-target sink to 0, as it is a BSCC with no possibility of reaching the target. The estimates BVI computes in the first 4 iterations are depicted in Fig. 1c.

If we were running VI only from below, we might stop after iteration 4, as the lower bound changes by less than 0.01 between the last two iterations and hence seems to have converged close to the value. However, the difference between the upper and lower bound is still very large, so BVI knows that there is still huge uncertainty in the values: the true value could be anything between 0.048 and 0.961. Eventually, both estimates converge close enough to 0.5; for example, after around 400 iterations the lower bound is 0.49 and the upper bound 0.51. Then BVI can return the value 0.5 (the center of the interval) with a precision of 0.01, as this value is off by at most that much.
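A corresponding BVI sketch, again on the dictionary encoding; the set of value-0 states (here called zeros) is assumed to be precomputed by the backwards graph analysis mentioned above.

```python
def bounded_value_iteration(mc, init, targets, zeros, eps=0.01):
    """BVI sketch: 'zeros' is the set of states that cannot reach the
    target, found beforehand by graph analysis.  Stops once the interval
    at the initial state is narrower than eps; the returned center is
    then off by at most eps/2."""
    def backup(V, s):
        if s in targets:
            return 1.0
        if s in zeros:
            return 0.0
        return sum(p * V[t] for t, p in mc[s].items())

    L = {s: 1.0 if s in targets else 0.0 for s in mc}  # lower bounds
    U = {s: 0.0 if s in zeros else 1.0 for s in mc}    # upper bounds
    while U[init] - L[init] >= eps:                    # TERM_CRIT
        L = {s: backup(L, s) for s in mc}
        U = {s: backup(U, s) for s in mc}
    return (L[init] + U[init]) / 2

# E.g. bounded_value_iteration(mc, "s0", {"t"}, {"u"}) == 0.5.
```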

3.3 Simulation-Based Asynchronous Value Iteration

The biggest drawback of the two variants introduced so far is that they always work on the whole state space. Because of the state-space explosion, this is often infeasible. In contrast, asynchronous value iteration only updates parts of the state space in every iteration of the loop, i.e. \(\mathsf {GET\_STATES}\) does not return the whole state space, but instead heuristically selects the states to update next. This not only speeds up the main loop, but also allows the algorithm to reduce the memory requirements: instead of storing estimates for all states, one stores estimates only for the partial model consisting of previously updated states. In [BBS95, MLG05, BCC+14], the heuristic for selecting the states is based on simulation: a path is sampled in the model, and only the states on that path are updated. The partial model contains all states that have been encountered during some simulation. If the part of the state space that is relevant for convergence of value iteration is small, this can lead to enormous speed-ups [BCC+14, KM19]. For more details on why this happens and a formal definition of the 'state space relevant for convergence', we refer the interested reader to [KM19].

[Algorithm 2: sampling a path through simulation, using NEXT_STATE and STUCK.]

Algorithm 2 shows how states can be sampled through simulations, as done in [BCC+14]: starting from the initial state, in every step of the simulation a successor is chosen from the distribution of the last state on the path. Note that this choice depends on another heuristic, \(\mathsf {NEXT\_STATE}\). The successor can be chosen according to the transition probabilities \(\delta \), but it has proven advantageous to additionally consider the difference between the upper and lower bounds in the successor states [MLG05, BCC+14]. In consequence, states where we already know a lot (under- and over-approximations close to each other) are given less priority than states where we still need information.

The simulation is stopped in two cases: either (i) it reaches a target state, or (ii) it is stuck in a BSCC with no path to the target. Different heuristics for checking whether the simulation is stuck are discussed in depth in Sect. 4. Note that being able to differentiate between targets and non-target BSCCs during the simulations allows us not to do anything in \(\mathsf {INITIALIZE}\); we can set the value to 1 when reaching a target and to 0 in the other case. The \(\mathsf {UPDATE}\) function for simulation-based asynchronous value iteration again uses the Bellman equation (1) to update the estimates of all states on the path; moreover, it can utilize additional information: since \(\mathsf {GET\_STATES}\) returns a path, there is a notion of order on the states, and updating them in reverse order backpropagates information faster.
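A sketch of the sampling procedure in the spirit of Algorithm 2 follows; the particular weighting in the successor choice is our own illustrative guess, not the exact heuristic of [MLG05, BCC+14].

```python
import random

def simulate(mc, init, targets, L, U, stuck):
    """Sample a path from init.  NEXT_STATE picks a successor with
    probability proportional to delta(s, t) weighted by the remaining
    uncertainty U[t] - L[t], so well-understood successors are given
    less priority.  'stuck' is the heuristic discussed in Sect. 4."""
    path = [init]
    while path[-1] not in targets and not stuck(path):
        s = path[-1]
        succs = list(mc[s])
        # Small floor keeps the weights valid once some bounds coincide.
        weights = [mc[s][t] * max(U[t] - L[t], 1e-12) for t in succs]
        path.append(random.choices(succs, weights=weights)[0])
    return path

# UPDATE then performs Bellman backups on reversed(path).
```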

Fig. 2. (a) A Markov chain where exploring the whole state space can be avoided; \(\varepsilon \) denotes a transition probability, and the cloud represents an arbitrarily large state space. (b) A Markov chain with high branching: from the initial state, there is a uniform probabilistic choice between \(n = \frac{1}{\varepsilon }\) successors.

Example 4

Consider the MC in Fig. 2a, again with reachability objective \(\{\mathsf t\}\). The cloud represents an arbitrarily large state space. However, since it is only reachable with a very small probability \(\varepsilon \) (and we are interested in an \(\varepsilon \)-precise solution), it need not be explored. Let the first sampled path be the one leading from the initial state straight to \(\mathsf t\). This happens with high probability, as the only other possibility is to select a successor from the cloud, but since the selection process depends on the transition probabilities, staying outside the cloud is far more likely. After the simulation reaches \(\mathsf t\), this value is backpropagated in reverse order: first the lower estimate of the state closest to \(\mathsf t\) is set to 1, then that of its predecessor to 1, and finally that of the initial state to \(1-\varepsilon \). At this point the algorithm has converged, as the difference between the lower and upper bound is \(\varepsilon \).

So in this example, sampling the most probable path a single time gives a good approximation. The algorithm avoids exploring the large cloud and backpropagates values faster than synchronous VI.

Example 5

As an adversarial example, consider the MC in Fig. 2b. Here, the model exhibits high branching, so every single path has a low probability, and only by aggregating all paths do we actually get a high value. Unlike the previous example, there is no part of the state space that is clearly irrelevant. In fact, to achieve a precision of \(\varepsilon \) the algorithm has to see so many paths that their cumulative probability is \(1-\varepsilon \), which in this case means seeing all but one transition from the initial state. This requires at least \(\frac{1}{\varepsilon }\) simulations, and since the successors are chosen probabilistically, most likely a lot more.

Note that, similarly to synchronous VI, there are versions of asynchronous VI without (RTDP [BBS95]) and with (BRTDP [MLG05, BCC+14]) guaranteed error bounds.

3.4 Statistical Model Checking

Algorithms for statistical model checking (SMC) [YS02] differ from all previously described ones in two ways, namely in what they store and what they return. The VI-based algorithms store estimates for every (seen) state and update these values to be ever more precise; thus, the returned bounds on the values are certainly correct, although possibly quite loose. In contrast, SMC stores only a single accumulator (for the value of the initial state) and the returned value is probably approximately correct (PAC [Val84]). Being PAC with probability \(\alpha \) and approximation \(\varepsilon >0\) guarantees the following: with high probability \(\alpha \), the returned value is close to the true value (off by at most \(\varepsilon \)). However, the returned confidence interval is not guaranteed to be a valid under- and over-approximation; if we are unlucky (i.e. with the remaining probability \(1-\alpha \)), there is no guarantee whatsoever on the returned value.

SMC does not need to do anything in \(\mathsf {INITIALIZE}\). It only stores a single accumulator to remember how often a target state was reached. \(\mathsf {GET\_STATES}\) works as in Algorithm 2, with \(\mathsf {NEXT\_STATE}\) typically sampling the successor according to the transition probabilities (in some settings, importance sampling may also be possible, e.g. [JLS12, BDH17]). \(\mathsf {UPDATE}\) remembers whether we reached the target or not; in the end we divide the number of runs reaching the target by the total number of samples to get the probability estimate. \(\mathsf {TERM\_CRIT}\) is reaching a (typically low) number of samples that depends on the required probability of the guarantee and the width of the confidence interval; see [DHKP17, Section 2.2] for details or [JSD19] for more advanced techniques.

Example 6

Consider again the MC depicted in Fig. 1a. Let the first sampled path be one that reaches the target \(\mathsf t\). At this point the simulation stops, as we have reached a target state, and we remember that we have seen a target once. Let the second path be one that cycles several times through the transient states before entering the non-target BSCC. On the one hand, the \(\mathsf {STUCK}\) function has to let the simulation continue while the transient cycle is seen several times and merely looks like a BSCC. On the other hand, it has to detect that the simulation will loop forever in the non-target BSCC and stop it. Ways to detect this are discussed in Sect. 4. After detecting that we are stuck, we remember that the simulation did not reach the target.

Let the required probability of the guarantee be \(\alpha =0.9\) and the width of the confidence interval \(\varepsilon =0.1\). Using Hoeffding's inequality [Hoe63] we can show that the required number of samples for this is 461. So assume that after 461 simulations we have seen the target 223 times. Then we know that with probability at least 0.9, the value is in the interval \([\frac{223}{461} - \frac{\varepsilon }{2},\, \frac{223}{461} + \frac{\varepsilon }{2}]\), i.e. [0.434, 0.534]. Increasing the number of simulations can increase the confidence or decrease the width of the interval.

Note that this number of simulations is independent of the system. While 461 simulations are a lot for this small system, the number would be the same if we were considering a model with several billion states where value iteration is impossible.
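To make the arithmetic of Example 6 concrete, the following snippet computes the sample count from a Hoeffding bound; the one-sided form below is our reading of how the 461 arises, and [DHKP17, Section 2.2] has the authoritative statement.

```python
from math import ceil, log

def hoeffding_samples(alpha, eps):
    """Samples n with exp(-2 * n * (eps/2)**2) <= 1 - alpha, i.e. the
    empirical frequency is within eps/2 of the true probability with
    confidence at least alpha."""
    return ceil(log(1 / (1 - alpha)) / (2 * (eps / 2) ** 2))

n = hoeffding_samples(0.9, 0.1)        # -> 461, as in Example 6
p_hat = 223 / n                        # ~ 0.484
print(n, [round(p_hat - 0.05, 3), round(p_hat + 0.05, 3)])
# prints: 461 [0.434, 0.534]
```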

4 STUCK

In this section, we discuss heuristics for detecting whether a simulation is stuck in a BSCC with no path to a target state. We also propose one new such heuristic with convenient theoretical properties.

For simulation-based asynchronous value iteration, previous work either excluded the existence of non-target BSCCs by assumption [BBS95, MLG05] or used a heuristic with no false negatives, but with the possibility of false positives [BCC+14]. This means that if the simulation is stuck in a BSCC, it definitely is stopped, which is required for termination. However, if the simulation is not stuck in a BSCC, it might still be stopped, guessing that the value of the last state on the path is 0, although it might not be. The \(\mathsf {STUCK}\)-heuristics used in previous work either depend on the path length [BCC+14, Ujm15, Chapter 7.5] or simply stop exploring when any state is seen twice [AKW19, Appendix A.3].

SMC has to be sure with high probability that the simulation is stuck, as otherwise it loses the probabilistic guarantee. In  [YCZ10], two approaches are described. The first approach requires knowledge of the second eigenvalue of the MC in order to guarantee asymptotic convergence. However, getting the second eigenvalue is as hard as the verification problem itself. The second approach works in the grey-box setting and pre-processes the MC so that all potentially infinite paths are eliminated. A similar transformation, using white-box information, was suggested in  [HJB+10]. However, both of these approaches transform the whole model and thus face problems in the case of very large models.

An alternative was suggested in  [DHKP16]. It monitors the finite path sampled during the simulation, implicitly constructing a graph with all seen states as nodes and all seen transitions as edges. The candidate of the current path is the (possibly empty) set of states forming the maximal BSCC of this graph. Intuitively, it is what we believe to be a BSCC given the observation of the current simulation. This candidate has to be validated, because as we saw in Example 6, a state set can look like a BSCC for several steps before being exited. In the black-box setting, this validation works by continuing the simulation until the probability of overlooking some transition exiting the candidate becomes very small  [DHKP16].

In this paper, we pinpoint that in the grey-box or white-box setting, this costly type of validation is not necessary. Instead of validating the candidate by running around in it for a huge number of steps, one can verify it using the additional information on the model. If no successor of any state in the candidate is outside of the candidate, then it indeed is a BSCC. Formally, for a candidate T, we check that \(\mathsf {Post}(\mathsf s) \subseteq T\) for all \(\mathsf s \in T\) (if the topology is known), or alternatively that \(|\mathsf {Post}(\mathsf s)| = |\mathsf {Post}^{T}(\mathsf s)|\) for all \(\mathsf s \in T\) (in what we defined as the grey-box setting), where \(\mathsf {Post}^{T}(\mathsf s)\) yields the successors observed within the candidate.
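Spelled out as code, assuming the simulation records for every state the set of successors observed so far (the names and data layout are our own):

```python
def candidate_is_bscc_grey(candidate, seen_succs, num_succ):
    """Grey-box check: a candidate T is confirmed as a BSCC iff for every
    s in T the number of distinct successors observed inside T,
    |Post^T(s)|, equals the known successor count |Post(s)|.
    seen_succs[s] is the set of successors observed during simulation;
    num_succ(s) is the grey-box information |Post(s)|."""
    return all(
        len({t for t in seen_succs.get(s, set()) if t in candidate})
        == num_succ(s)
        for s in candidate
    )

def candidate_is_bscc_white(candidate, post):
    """White-box variant: the full successor sets Post(s) are known,
    so we directly check that no successor leaves the candidate."""
    return all(post(s) <= candidate for s in candidate)
```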

Example 7

Consider again the MC depicted in Fig. 1a. When a simulation enters the non-target BSCC, \(\mathsf {STUCK}\) should return true in order to stop the simulation, as it has reached a BSCC with no path to a target. In the black-box setting of [DHKP16], this is only possible after continuing the simulation for another huge number of steps. For example, even in a BSCC with only a single state, hundreds of further steps can be necessary to reach the required confidence. Given the grey-box information, the algorithm can determine that all successors of the states in the candidate have been seen and conclude that the candidate is indeed a BSCC.

However, this check stops the simulation and can incur an overhead if there are many SCCs in the transient part of the state space. Hence, we can delay it, not checking at the first occurrence of a cycle, but e.g. only once every state in the candidate has been seen twice. Alternatively, one can allow the check only every n (e.g. hundred) steps of the simulation. Depending on the model and the implementation of the algorithm, these heuristics can have some impact on the runtime.

One might modify this heuristic even further. If a state of the BSCC is reached only with low probability, it takes many steps for the simulation to reach it; when we check whether the current candidate is a BSCC, this state might not have occurred in the simulation yet. Instead of concluding that the information is insufficient and the simulation has to continue, one could deterministically explore the unknown successors and compute the BSCC. On the one hand, for small to medium-sized BSCCs, this could result in a speed-up. On the other hand, it increases the overhead when transient SCCs are checked by \(\mathsf {STUCK}\). In the available benchmarks, this heuristic did not prove advantageous, hence we do not report on it in the evaluation section.

5 Discussion

5.1 Dependency of Simulation Length on Topology

Although the number of samples in SMC is independent of the model size, the length of the simulations is highly dependent on the model size and even more on the structure. Indeed, any kind of cyclic behaviour in the transient part of the state space increases the simulation time for two reasons. Firstly, the simulation loops in transient SCCs and does not make progress towards a target or a BSCC. Secondly, the check whether the simulation is stuck in transient SCCs incurs an overhead. An adversarial handcrafted worst-case example where simulations struggle is given in  [HM18, Figure 3]. Moreover, the structure of BSCCs affects the length of the simulation. For cyclic BSCCs, the simulation easily encounters all states of the BSCC and can quickly terminate. For more complex topologies, some states are typically only seen with very low frequency and thus the simulation takes longer.

If the model exhibits many transient SCCs, using any simulation-based technique is problematic.

5.2 Black, Grey and White SMC

The difference between the variants of SMC we report on is their knowledge of the transition system: \(p_{\min }\) corresponds to the black-box, the number of successors to the grey-box, and the exact successors and probabilities to the white-box setting. This information can be used in the \(\mathsf {STUCK}\)-check; apart from that, the algorithms are the same.

Comparing grey and black box, it is apparent that simulations in grey box can be much shorter: upon detection of a candidate that is a BSCC, the simulation is immediately stopped, whereas in the black-box setting it has to continue for a number of steps. This number of steps depends on two things: (i) the size of the BSCC, as larger BSCCs take longer to explore, especially since all states, no matter how improbable, need to be seen a certain number of times, and (ii) the given under-approximation of the minimum transition probability \(p_{\min }\), as this determines how often every state in the candidate has to be seen until the probability of a false positive is small enough.
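The following back-of-the-envelope sketch (our own simplification of the idea behind [DHKP16]; the actual bound there is more refined) illustrates how the required number of visits per state in black SMC scales with \(p_{\min }\):

```python
from math import ceil, log

def visits_per_state(p_min, delta=0.01):
    """An exit transition of probability at least p_min stays unobserved
    after k visits of its source state with probability at most
    (1 - p_min)**k; pushing this below delta needs roughly
    k >= log(delta) / log(1 - p_min) visits."""
    return ceil(log(delta) / log(1 - p_min))

for p in (0.1, 0.01, 0.001):
    print(p, visits_per_state(p))   # 44, 459, 4603 -- roughly 1/p_min
```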

Thus, for large BSCCs or small \(p_{\min }\), grey SMC is clearly better, as we also experimentally validate in Table 3 (large BSCC) and Table 2 (various \(p_{\min }\)) in the next section. For small BSCCs (e.g. only of size 1) and not so small \(p_{\min }\), black and grey SMC become more comparable, but grey SMC still has shorter simulations. However, practically, the overhead of verifying the candidates in grey SMC can be so large that black SMC can even be slightly faster than grey SMC (see e.g., leader6_11 in Table 1).

Heuristically reducing the number of checks in grey SMC (as described in Sect. 4) can make it faster again, but the effectiveness of the heuristics depends on the models. So, if it is known that the BSCC-detection is very easy for black SMC (e.g. they are of size 1 or cyclic and \(p_{\min }\) is not too small), black SMC can be a viable choice. However, as black SMC is never far better, using grey SMC is the safer variant when facing models with uncertain topology.

5.3 Comparison of Algorithms

Finally, we compare the (dis-)advantages of the different algorithms, giving practical decision guidance. If hard guarantees are required, then BVI or BRTDP are to be used. The latter is simulation-based, and thus good if only a small part of the state space is relevant for convergence. Additionally, if the model is too large for BVI, BRTDP still has a chance, but quite possibly the partial model will also be too large. Conversely, if the model contains lots of transient SCCs, BVI is preferable, as simulation-based approaches fail on this kind of model, see Sect. 5.1. Note that, if small probabilities are present, it might take very long for BVI and BRTDP to converge, see Example 3.

For a quick estimate, or if PAC guarantees are sufficient, or if the system is too large to provide hard guarantees, SMC is to be used, if possible (i.e. in the white- or grey-box setting) in our grey variant. As both the memory and the termination criterion are independent of the size of the system, SMC always has a chance to yield an estimate, which additionally comes with a probabilistic guarantee.

There is no case in which unguaranteed (synchronous or asynchronous) VI is preferable, as it suffers from the same drawbacks as BVI and BRTDP, but additionally provides no guarantees. Whenever hard guarantees are not of interest and the system is not strongly connected, grey SMC should be used for a quick estimate.

5.4 Extensions to Other Unbounded-Horizon Properties

For more complex unbounded-horizon properties  [BK08], such as Until (avoid-reach), LTL or long-run average reward, (B)VI pre-processes the state space to analyze the BSCCs  [BK08] and BRTDP  [BCC+14] can either do the same or analyze the encountered BSCCs only. Black SMC of  [DHKP17] is applicable through additional analysis of the BSCC candidates after they have been found likely to be BSCCs. This is directly inherited by grey SMC and makes it available for these specifications with low overhead.

6 Experimental Evaluation

We implemented grey SMC in a branch of the PRISM Model Checker [KNP11], extending the implementation of black SMC [DHKP17]. We ran experiments on (both discrete- and continuous-time) Markov chains from the PRISM Benchmark Suite [KNP12a]. In addition to a comparison to black SMC, we also provide comparisons to VI and BVI of PRISM and to BRTDP of [BCC+14]. An interested reader may also want to refer to [DHKP17, Table II] for a comparison of black SMC against two unbounded SMC techniques of [YCZ10].

For every configuration, we run 5 experiments and report the median. In black SMC, the check for candidates is performed every 1000 steps during path simulations, while in grey SMC the check is performed every 100 steps. Additionally, grey SMC checks whether a candidate is indeed a BSCC once every state of the candidate has been seen at least twice. In all our tables, 'TO' denotes a timeout of 15 min and 'OOM' indicates that the tool ran out of memory when restricted to 1 GB of RAM.

Table 1. Runtime (in seconds) comparison of black and grey SMC for various benchmarks. BVI runtimes are also presented as a baseline.

6.1 Comparison of Black and Grey SMC

Table 1 compares black SMC and grey SMC on multiple benchmarks. One can see that, except in the cases of leader6_11 and brp_nodl, grey SMC finishes at least as soon as black SMC. In bluetooth, gridworld, leader and tandem, both SMC methods are able to terminate without encountering any candidate (i.e. either the target is seen or the left side of the until formula is falsified). In brp_nodl, crowds_nodl and nand, the SMC methods encounter a candidate; however, since the candidate has only a single state (all BSCCs are trivial), black SMC is quickly able to confidently conclude that the candidate is indeed a BSCC. The only interesting behaviour is observed on the herman-17 benchmark. In this case, every path eventually encounters the only BSCC existing in the model. Grey SMC is able to quickly conclude that the candidate is indeed a BSCC, while black SMC has to sample for a long time in order to be sufficiently confident.

Table 2. Effect of \(p_{\min }\) on black SMC runtimes (in seconds) on some of the benchmarks. Lower \(p_{\min }\) demands stronger candidates, due to which black SMC has to sample longer paths.

The performance of black SMC is also a consequence of \(p_{\min }\) being quite small. Table 2 shows that black SMC is very sensitive to \(p_{\min }\). Note that grey SMC is not affected by changes in \(p_{\min }\), as it always checks whether a candidate is a BSCC as soon as all states in the candidate have been seen twice.

6.2 Grey SMC vs. Black SMC/BRTDP/BVI/VI

We now look more closely at the self-stabilization protocol herman [KNP12b, Her90]. The protocol works as follows: herman-N contains N processes, each possessing a local Boolean variable \(x_i\). A token is assumed to be in place i if \(x_i = x_{i-1}\). The protocol proceeds in rounds: in each round, if the current values of \(x_i\) and \(x_{i-1}\) are equal, the next value of \(x_i\) is set uniformly at random, and otherwise it is set equal to the current value of \(x_{i-1}\). The number of states in herman-N is therefore \(2^N\). The goal of the protocol is to reach a stable state where there is exactly one token. For example, in the case of herman-5, a stable state might be \((x_1=0, x_2=0, x_3=1, x_4=0, x_5=1)\), which indicates that there is a token in place 2. In every herman model, all stable states belong to the single BSCC. The number of states in the BSCC ranges from 10 states in herman-5 to 2,000,000 states in herman-21.
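For concreteness, here is a toy simulator of these dynamics (our own sketch; indices are 0-based and cyclic, so x[0] is compared with x[N-1]):

```python
import random

def herman_round(x):
    """One synchronous round of herman-N: if x[i] == x[i-1] (a token at
    place i), the next x[i] is chosen uniformly at random; otherwise
    x[i] copies its left neighbour x[i-1]."""
    return [random.randint(0, 1) if x[i] == x[i - 1] else x[i - 1]
            for i in range(len(x))]

def token_places(x):
    """Places i carrying a token, i.e. x_i = x_{i-1} (cyclically)."""
    return [i for i in range(len(x)) if x[i] == x[i - 1]]

x = [random.randint(0, 1) for _ in range(5)]   # herman-5
rounds = 0
while len(token_places(x)) != 1:               # stabilize to one token
    x, rounds = herman_round(x), rounds + 1
print(x, token_places(x), rounds)
```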

Table 3. Runtime (in seconds) of the various algorithms on the Herman self-stabilization protocol  [KNP12b] with the property 4tokens. The median runtimes are reported for grey SMC, black SMC  [DHKP17], BRTDP  [BCC+14], Bounded value iteration (BVI) and Value iteration (VI). The SMC algorithms use SPRT method with parameters \(\alpha = 0.01\) and \(\beta = 0.01\). BRTDP, BVI and VI run until a relative error of 0.01 is obtained.

For all herman models in Table 3, we are interested in checking whether the probability of reaching an unstable state with tokens in places 2–5, i.e. \((x_1=1, x_2=1, x_3=1, x_4=1, x_5=1)\), is less than 0.05. This property, which we name 4tokens, identifies \(2^{N-5}\) states as targets in herman-N. The results in Table 3 show how well grey SMC scales when compared to black SMC, BRTDP, BVI and VI. Black SMC times out for all models where \(N \ge 11\). This is due to the fact that the larger models have a smaller \(p_{\min }\), thereby requiring black SMC to sample extremely long paths in order to confidently identify candidates as BSCCs. BVI and VI perform well on small models, but as the model sizes grow and transition probabilities become smaller, propagating values becomes extremely slow. Interestingly, we found that in both grey SMC and black SMC, approximately 95% of the time is spent in computing the next transitions, which grow exponentially in number; an improvement in the simulator implementation could mitigate this blow-up in runtime, allowing for a fairer comparison with the extremely performant symbolic value iteration algorithms.

Finally, we comment on the exceptionally poor performance of BRTDP on herman models. In Table 4, we run BRTDP on three different properties: (i) tokens in places 2–3 (2tokens); (ii) tokens in places 2–4 (3tokens); and (iii) tokens in places 2–5 (4tokens). The number of states satisfying the property decreases when going from 2 tokens to 4 tokens. The table shows that BRTDP is generally better in situations where the target set is larger.

In summary, the experiments reveal the following:

  • For most benchmarks, black SMC and grey SMC perform similarly, as seen in Table 1. As expected, the advantages of grey SMC do not show up in these examples, (almost) all of which contain only trivial BSCCs.

  • The advantage of grey SMC is clearly visible on the herman-N benchmarks, in which there are non-trivial BSCCs. Here, black SMC quickly fails while grey SMC is extremely competitive.

  • Classical techniques such as VI and BVI fail when either the model is too large or the transition probabilities are too small. However, they are still to be used for strongly connected systems, where the whole state space needs to be analysed for every run in both SMC approaches, but only once for (B)VI.

Table 4. Effect of restrictive properties (satisfied in fewer states) on the runtime (in seconds) of BRTDP in the herman benchmarks.

7 Conclusion

While SMC has found its use also in the white-box setting as a scalable alternative, we introduce the first approach that utilizes the knowledge in a local way, without globally processing the state space, and thus preserves the efficiency advantages of black-box SMC. We call this approach grey SMC since we utilize only the topological information and not the quantitative information (sometimes referred to as grey box). On the one hand, this is useful as the quantitative information is often unavailable or imprecise w.r.t. the modelled reality. On the other hand, while the full quantitative information is irrelevant in BSCCs, it plays a major role in the transient phase and could be used to further enhance the approach. For instance, it could be used for importance sampling in order to handle rare events efficiently [JLS12, BDH17] even in the context of unbounded-horizon properties.