
1 Introduction

An important feature of modern systems is their complexity. This characteristic makes the design, implementation and verification of such systems extremely difficult. The difficulty is heightened by the often critical role these systems play (avionics control processes, nuclear power plants, etc.). Probabilistic verification is a set of techniques for the formal modelling and analysis of such systems. Probabilistic model checking [1,2,3] involves the construction of a finite-state model augmented with probabilistic information, such as a Markov chain or a probabilistic automaton [17, 26]. This model is then checked against properties specified in probabilistic extensions of temporal logic, such as Probabilistic Computation Tree Logic (PCTL) [18].

Formal methods, including probabilistic model checking [1,2,3], suffer from the state space explosion problem. Even after several years of research, this problem remains the main obstacle to probabilistic model checking. Compositional verification [14, 15, 19, 24] and symbolic model checking [7, 27] are two promising approaches to cope with it. Compositional verification applies a divide-and-conquer strategy to reduce the verification task into simpler subtasks. A popular approach is the assume-guarantee paradigm [9, 11, 29], in which individual system components are verified under assumptions about their environment. Once it has been verified that the other system components do indeed satisfy these assumptions, proof rules can be used to combine the individual verification results and establish correctness properties of the overall system. The success of assume-guarantee reasoning depends on discovering appropriate assumptions. The process of generating assumptions automatically can be addressed using machine learning [9, 14], such as the CDNF learning algorithm [6]. Symbolic model checking is also a useful technique to cope with the state explosion problem. In symbolic model checking, the system states, as well as the initial states and the transition relation of the system, are implicitly represented by Boolean functions. To verify probabilistic systems encoded as Boolean functions, these functions must be converted to other data structures, such as Binary Decision Diagrams (BDD) or Multi-Terminal BDDs (MTBDD) [16], owing to the absence of SAT-based model checking for probabilistic systems.

In this paper, we present a novel approach to the compositional verification of probabilistic systems through implicit learning. Our aim is to reduce the size of the state space. To this end, we propose to encode the system components using Boolean functions and Multi-Terminal BDDs. This encoding allows a large number of states to be stored and explored efficiently [9]. We use the Boolean functions as input to the CDNF learning algorithm. This algorithm generates an assumption which simulates a set of MDP components. The idea is to use this assumption for the verification instead of the real system components. Thus, if the size of this assumption is much smaller than that of the corresponding MDP components, we can expect a significant gain in verification performance. In our work, Interval Markov Decision Processes (IMDP) are used to represent assumptions. To establish the verification process and guarantee that the generated assumption simulates all possible behaviours of the set of components, we propose a sound and complete symbolic assume-guarantee reasoning rule. This rule defines and establishes the compositional verification process. We illustrate our approach with a simple example, and we have applied it to several case studies derived from the PRISM benchmarks. Experimental results suggest a promising outlook for the implicit learning of compositional verification.

The remainder of this paper is organized as follows: in Sect. 2, we review the works most relevant to ours. Section 3 provides some background knowledge about MDP, Interval MDP and the parallel composition MDP \(\parallel \) IMDP. In Sect. 4, we present our approach, detailing the process of encoding MDP using Boolean functions, our symbolic assume-guarantee reasoning proof rule and the application of the CDNF learning algorithm to generate assumptions. Section 6 concludes the paper and discusses future work.

2 Related Works

In this section, we review research related to symbolic probabilistic model checking, compositional verification and assume-guarantee reasoning. Verification of probabilistic systems has been addressed by Vardi and Wolper [32,33,34], then by Pnueli and Zuck [30], and by Baier and Kwiatkowska [3]. Symbolic probabilistic model checking algorithms have been proposed in [10, 28]; these algorithms have been implemented in the symbolic probabilistic model checker PRISM [22]. The main techniques used to generate counterexamples are detailed in [21]. A recent work [12] proposed to use causality in order to generate small counterexamples; its authors propose to use the tool DiPro to generate counterexamples, then apply an aided-diagnostic method to generate the most indicative one. For the compositional verification of non-probabilistic systems, several frameworks have been developed using the assume-guarantee reasoning approach [9, 11, 29]. The compositional verification of probabilistic systems has seen significant progress in recent years [14, 15, 19, 23]. Our approach is inspired by the work of [14, 15], which considers the verification of discrete-time Markov chains and proposes to use the CDNF learning algorithm to infer assumptions. Another work relevant to ours is [19], which proposed the first sound and complete learning-based compositional verification technique for probabilistic safety properties, using an adapted \(L^{*}\) learning algorithm to learn weighted automata as assumptions and then transforming them into MTBDD.

3 Preliminaries

In this section, we give some background knowledge about MDP and IMDP. MDP are often used to describe and study systems that exhibit both nondeterministic and stochastic behaviour.

Definition 1

A Markov Decision Process (MDP) is a tuple \(M=(States_{M},s_{0}^{M}, \) \(\varSigma _{M},\delta _{M})\) where \(States_{M}\) is a finite set of states, \(s_{0}^{M} \in States_{M}\) is an initial state, \(\varSigma _{M}\) is a finite set of actions, and \(\delta _{M} \subseteq States_{M} \times (\varSigma _{M} \cup \{\tau \}) \times Dist(States_{M})\) is a probabilistic transition relation, where \(\tau \) denotes a “silent” (or “internal”) action.

In a state s of an MDP M, one or more transitions, denoted \((s,a) \rightarrow \mu \), are available, where \(a \in \varSigma _{M}\) is an action label, \(\mu \) is a probability distribution over states with \(\mu \ne 0\), and \((s, a, \mu ) \in \delta _{M}\). A path through an MDP is a (finite or infinite) sequence \((s_{0},a_{0},\mu _{0}) \rightarrow (s_{1},a_{1},\mu _{1}) \rightarrow \ldots \). An example of two MDP, \(M_{0}\) and \(M_{1}\), is shown in Fig. 1.
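The MDP of Definition 1 can be sketched as a small Python structure. Only the probability 0.7 on \(send_{M_{0}}\) is taken from the paper (Example 3); the remaining probabilities, the dictionary layout and the validity check are illustrative.

```python
from math import isclose

# delta maps (state, action) -> a distribution over successor states.
# The 0.7 on send is from the paper; other probabilities are illustrative.
M0 = {
    "states": ["s0", "s1", "s2", "s3"],
    "init": "s0",
    "actions": ["open", "send", "check", "done", "fail"],
    "delta": {
        ("s0", "send"): {"s1": 0.7, "s2": 0.3},
        ("s0", "open"): {"s2": 1.0},
        ("s1", "check"): {"s2": 0.98, "s3": 0.02},
        ("s2", "done"): {"s2": 1.0},
        ("s3", "fail"): {"s3": 1.0},
    },
}

def is_valid_mdp(m):
    """Check each distribution sums to 1 and uses only declared states/actions."""
    for (s, a), mu in m["delta"].items():
        if s not in m["states"] or a not in m["actions"]:
            return False
        if not isclose(sum(mu.values()), 1.0):
            return False
        if any(t not in m["states"] for t in mu):
            return False
    return True
```
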

Interval Markov Decision Processes (IMDP) generalize ordinary MDP by allowing interval-valued transition probabilities rather than single probability values. In this paper, we use IMDP to represent the assumptions used in our compositional verification.

Fig. 1.

Example of two MDP, \(M_{0}\) (left) and \(M_{1}\) (right).

Definition 2

An Interval Markov Decision Process (IMDP) is a tuple \(I=(States_{I},i_{0}^{I},\varSigma _{I}, P^{l},P^{u})\) where \(States_{I}\), \(i_{0}^{I}\) and \(\varSigma _{I}\) are, respectively, the set of states, the initial state and the set of actions. \(P^{l},P^{u}: States_{I} \times \varSigma _{I} \times States_{I} \mapsto [0,1]\) are matrices representing the lower/upper bounds of the transition probabilities, such that \(P^{l}(i,a)(i') \le P^{u}(i,a)(i')\) for all states \(i,i' \in States_{I}\) and \(a \in \varSigma _{I}\).

An example of an IMDP I is shown in Fig. 2, where all represents the set of actions \(\{open_{M_{0}},\) \(send_{M_{0}},\) \(check_{M_{0}},\) done, \(fail\}\), each with the probability interval [0, 1].
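The IMDP of Definition 2, instantiated as in Fig. 2 with the full interval [0, 1] on every transition, can be sketched as follows (state and action names are illustrative):

```python
from itertools import product

# (state, action, next state) -> (lower bound, upper bound);
# the example of Fig. 2 carries [0, 1] on every transition.
STATES = ["i0", "i1", "i2", "i3"]
ACTIONS = ["open", "send", "check", "done", "fail"]
I = {
    "states": STATES,
    "init": "i0",
    "actions": ACTIONS,
    "P": {(s, a, t): (0.0, 1.0)
          for s, a, t in product(STATES, ACTIONS, STATES)},
}

def well_formed(imdp):
    """Definition 2 requires P^l(i,a)(i') <= P^u(i,a)(i') everywhere."""
    return all(lo <= hi for lo, hi in imdp["P"].values())
```
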

Fig. 2.

Example of IMDP I.

In Definition 3, we describe how an MDP and an IMDP are composed. This is done using the asynchronous parallel operator (\(\parallel \)) defined in [31], where the MDP and the IMDP synchronise over shared actions and interleave otherwise.

Definition 3

Parallel composition \(MDP \parallel IMDP\)

Let M and I be an MDP and an Interval MDP, respectively, and let \(s_{i}\), \(s'_{i}\) \(\in States_{M}\) and \(s_{j}\), \(s'_{j}\) \( \in States_{I}\). Their parallel composition, denoted by \(M \parallel I\), is an Interval MDP MI, \(MI= \{ States_{M} \times States_{I}, (s_{0}^{M},s_{0}^{I}), \varSigma _{M} \cup \varSigma _{I}, P^{l},P^{u} \},\) where \(P^{l},P^{u}\) are defined such that \((s_{i}, s_{j})\) \(\xrightarrow []{a}\) \([P^{l}(s_{i}, a)(s_{j}) \times \mu _{i}, P^{u}(s_{i}, a)(s_{j}) \times \mu _{i}] \) if and only if one of the following conditions holds: (i) \(s_{i} \xrightarrow []{a,\mu _{i}} s'_{i} \) and \( s_{j} \xrightarrow []{P^{l}(s_{j}, a)(s'_{j}), P^{u}(s_{j}, a)(s'_{j})} s'_{j} \), where \(a \in \varSigma _{M} \cap \varSigma _{I}\); (ii) \(s_{i} \xrightarrow []{a,\mu _{i}} s'_{i} \), where \(a \in \varSigma _{M} \setminus \varSigma _{I}\); or (iii) \(s_{j} \xrightarrow []{P^{l}(s_{j}, a)(s'_{j}), P^{u}(s_{j}, a)(s'_{j})} s'_{j} \), where \(a \in \varSigma _{I} \setminus \varSigma _{M}\).
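The composition rule of Definition 3 can be sketched on toy inputs; the `compose` function and its data are an illustrative reading of the rule, not the paper's implementation. Shared actions synchronise (the interval bounds are scaled by the MDP probability), private actions interleave.

```python
# Toy inputs: a 2-state MDP and a 2-state IMDP sharing action "a".
M = {"states": ["s0", "s1"], "actions": ["a", "b"],
     "delta": {("s0", "a"): {"s1": 0.6, "s0": 0.4},
               ("s0", "b"): {"s1": 1.0}}}
I = {"states": ["i0", "i1"], "actions": ["a"],
     "P": {("i0", "a", "i1"): (0.2, 0.9)}}

def compose(mdp, imdp):
    shared = set(mdp["actions"]) & set(imdp["actions"])
    P = {}
    for (s, a), mu in mdp["delta"].items():
        if a in shared:
            # (i) synchronise: scale the interval bounds by the MDP probability
            for (si, b, sj), (lo, hi) in imdp["P"].items():
                if b == a:
                    for t, p in mu.items():
                        P[((s, si), a, (t, sj))] = (lo * p, hi * p)
        else:
            # (ii) private MDP action: IMDP state unchanged, point interval
            for si in imdp["states"]:
                for t, p in mu.items():
                    P[((s, si), a, (t, si))] = (p, p)
    # (iii) private IMDP action: MDP state unchanged
    for (si, b, sj), (lo, hi) in imdp["P"].items():
        if b not in shared:
            for s in mdp["states"]:
                P[((s, si), b, (s, sj))] = (lo, hi)
    return P

MI = compose(M, I)
```
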

In this work, we use symbolic model checking to verify whether a system \(M_{0} \parallel I\) satisfies a probabilistic safety property. Symbolic model checking uses BDD and MTBDD to encode the state space. It is straightforward to convert a Boolean function to a BDD/MTBDD.

Definition 4

A Binary Decision Diagram (BDD) is a rooted, directed acyclic graph with its vertex set partitioned into non-terminal and terminal vertices (also called nodes). A non-terminal node d is labelled by a variable \(var(d) \in X\), where X is a finite ordered set of Boolean variables. Each non-terminal node has exactly two children nodes, denoted then(d) and else(d). A terminal node d is labelled by a Boolean value val(d) and has no children. The Boolean variable ordering < is imposed onto the graph by requiring that a child \(d'\) of a non-terminal node d is either terminal, or is non-terminal and satisfies \(var(d) < var(d')\).

Definition 5

A Multi-Terminal Binary Decision Diagram (MTBDD) is a BDD where the terminal nodes are labelled by a real number.
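Definitions 4 and 5 can be illustrated with a minimal node structure and evaluation routine. This is a hypothetical sketch: real BDD/MTBDD packages additionally enforce the variable ordering and share isomorphic subgraphs.

```python
# Non-terminal nodes test a Boolean variable; terminal nodes carry a real
# value (a BDD is the special case where terminals are 0 or 1).
class Node:
    def __init__(self, var=None, then=None, els=None, val=None):
        self.var, self.then, self.els, self.val = var, then, els, val

def evaluate(node, valuation):
    """Follow then/else edges under the valuation down to a terminal."""
    while node.var is not None:
        node = node.then if valuation[node.var] else node.els
    return node.val

# MTBDD for f(x0, x1) = 0.7 if x0 = 1 and x1 = 1, else 0 (illustrative)
zero, p = Node(val=0.0), Node(val=0.7)
n1 = Node(var="x1", then=p, els=zero)
root = Node(var="x0", then=n1, els=zero)
```
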

4 ACVuIL Approach

Our approach, probabilistic symbolic compositional verification using implicit learning (ACVuIL), aims to mitigate the state explosion problem. Figure 3 illustrates an overview of ACVuIL. The first step consists of encoding the system component \(M_{0}\) as Boolean functions \(\beta (M_{0})\). In contrast to the explicit representation of the state space, the implicit representation using Boolean functions allows a large number of states to be stored and explored efficiently. \(\beta (M_{0})\) will be used as input to the CDNF learning algorithm as the target language. The second step aims to generate an appropriate assumption \(S_{I_{i}}\), which needs to abstract the behaviour of the original component \(M_{0}\). In our approach, we use the CDNF learning algorithm to generate the assumptions automatically. The second step starts by calling the CDNF learning algorithm with \(\beta (M_{0})\) as input. At each iteration, the CDNF learns a new assumption \(\beta (I_{i})\), represented as Boolean functions. For the first iteration (\(i=0\)), the CDNF generates true as the assumption; in this case, we generate a special initial assumption \(S_{I_{0}}\). For iterations \(i \ge 1\), we convert the generated assumption \(\beta (I_{i})\) to an MTBDD \(S_{I_{i}}\), then refine the initial assumption \(S_{I_{0}}\) using \(S_{I_{i}}\). We use the symbolic probabilistic model checking algorithm (SPMC) to verify whether \(S_{I_{0}} \parallel S_{M_{1}}\) satisfies the probabilistic safety property \(P_{\le P}[\psi ]\). If \(SPMC(S_{I_{0}}, S_{M_{1}})\) returns true, then we can conclude that \(M_{0} \parallel M_{1} \models P_{\le P}[\psi ]\), i.e. \(M_{0} \parallel M_{1}\) satisfies \(P_{\le P}[\psi ]\); otherwise, we generate a counterexample Ctx illustrating why \(P_{\le P}[\psi ]\) is violated. Ctx can be a real counterexample of the system \(M_{0} \parallel M_{1}\) or a spurious counterexample due to the generated assumption. Thus, at each iteration, we analyse whether Ctx is real.
If Ctx is real, then we can conclude that \(M_{0} \parallel M_{1} \nvDash P_{\le P}[\psi ]\), i.e. the system \(M_{0} \parallel M_{1}\) does not satisfy \(P_{\le P}[\psi ]\); otherwise, we return Ctx to the CDNF to generate a new assumption. Our compositional verification process is sound and complete; this is guaranteed by the use of an assume-guarantee reasoning rule. All steps of our approach are described in detail in the next sections.

Fig. 3.

An overview of our approach (ACVuIL).

4.1 Encoding MDP Using Boolean Functions

An MDP can be encoded implicitly as Boolean functions; we denote by \(\beta (M_{0})\) the Boolean functions encoding the MDP \(M_{0}\). The encoding of an MDP as Boolean functions aims to reduce the size of the state space. Indeed, many works, such as [7, 16, 27, 28], show that the implicit representation is often more efficient than the explicit one. In addition, these Boolean functions will be used as input to the CDNF learning algorithm. In this section, we describe the process of encoding an MDP as Boolean functions.

Definition 6

\(\beta (M_{0})=(Init_{M_{0}}, f_{M_{0}}\) \((yxx'e^{l}e^{u}))\) is a pair of Boolean functions encoding the MDP \(M_{0}\), where \(Init_{M_{0}}\) is a predicate encoding the initial state \(s_{0}^{M_{0}}\) over the set X, and \(f_{M_{0}}(yxx'e^{l}e^{u})\) is a transition predicate over \(Y\cup X\cup X' \cup E\), where \(y,x,x',e^{l},e^{u}\) are valuations of, respectively, \(Y,X,X' \) and E. Y, X, \(X'\) and E are finite ordered sets of Boolean variables with \( Y \cap X \cap X' \cap E= \emptyset \). The set X encodes the states of \(M_{0}\), \(X'\) the next states, Y the actions and E the probability values.

More concretely, let \(M_{0} = (States_{M_{0}},s_{0}^{M_{0}},\varSigma _{M_{0}},\delta _{M_{0}})\) be an MDP. Let \(n = |States_{M_{0}}|\), \(m=|\varSigma _{M_{0}}|\) and \(k = \lceil log_{2}(n) \rceil \). We can view \(\delta _{M_{0}}\) as a function of the form \(States_{M_{0}} \times \varSigma _{M_{0}} \times \{1,2,...,r\}\times States_{M_{0}} \rightarrow [0,1]\), where r is the number of non-deterministic choices of a transition. We use a function \(enc: States_{M_{0}} \rightarrow \{0,1\}^{k}\) over \(X=\langle x_{1},\) \(x_{2},\) \(...,x_{k} \rangle \) to encode the states in \(States_{M_{0}}\), and \(X'=\langle x'_{1},x'_{2},...,x'_{k}\rangle \) to encode the next states. We also use \(Y=\langle \) \(y_{1},\) \(y_{2},\) \(...,y_{m}\) \(\rangle \) to encode the actions, and we represent the probability values using \(E= \langle e_{1}^{l},e_{1}^{u},e_{2}^{l},e_{2}^{u},...,e_{t}^{l},e_{t}^{u} \rangle \), where t is the number of distinct probability values in \(\delta _{M_{0}}\). \(f_{M_{0}}\) \((yxx'e^{l}e^{u})\) encodes the probabilistic transition relation \(\delta _{M_{0}}\) as a disjunction over a set of transition formulae, where each formula encodes a transition between two states. Given a transition \(s \xrightarrow []{a,p} s'\), we encode the state s, the next state \(s'\) and the action a using, respectively, enc(s), \(enc(s')\) and enc(a), where enc is a function that encodes: (i) states s over the Boolean variable set X, (ii) next states \(s'\) over the Boolean variable set \(X'\), and (iii) actions over the Boolean variable set Y. In addition, to encode the probability value p, we use the Boolean variables \(e^{l}\) and \(e^{u}\), where \(e^{l}\) and \(e^{u}\) encode predicates of the form \(p \ge \mu (s,s')\) and \(p \le \mu (s,s')\), respectively. Thus, a transition of the form \(s \xrightarrow []{a,p} s'\) can be encoded as \(enc(a) \wedge enc(s) \wedge enc(s') \wedge e^{l} \wedge e^{u} \).

Example 1

To illustrate how we encode an MDP as Boolean functions, we consider the MDP \(M_{0}\) (Fig. 1). \(M_{0}\) contains the set of states \(States_{M_{0}}=\{s_{0},\) \(s_{1},\) \(s_{2},\) \(s_{3}\}\) and the set of actions \(\varSigma _{M_{0}} = \{open_{M_{0}},\) \(send_{M_{0}},\) \(check_{M_{0}},\) done, \(fail\}\). We use \(X=\langle x_{0},x_{1}\rangle \) to encode the set of states in \(States_{M_{0}}\) as: \(enc(s_{0})\) \(=\lnot x_{0} \wedge \lnot x_{1}\), \(enc(s_{1})\) \(=\lnot x_{0} \wedge x_{1}\), \(enc(s_{2})\) \(= x_{0} \wedge \lnot x_{1}\), \(enc(s_{3})\) \(= x_{0} \wedge x_{1}\); and we use the set \(Y=\langle o, s, c, d, f \rangle \) to encode the actions \(\{open_{M_{0}},\) \(send_{M_{0}},\) \(check_{M_{0}},\) done, \(fail\}\), respectively. Table 1 summarizes the encoding of the transition function \(\delta _{M_{0}}\). \(\beta (M_{0})=(Init_{M_{0}}, f_{M_{0}}\) \((yxx'e^{l}e^{u}))\) encoding \(M_{0}\) is given by \(Init_{M_{0}} = \lnot x_{0} \wedge \lnot x_{1}\) and

$$\begin{aligned} f_{M_{0}} (yxx'e^{l}e^{u})=\begin{array}{c}((s \wedge \lnot x_{0} \wedge \lnot x_{1} \wedge \lnot x'_{0} \wedge x'_{1} \wedge e_{3}^{l} \wedge e_{3}^{u})\\ \vee (s \wedge \lnot x_{0} \wedge \lnot x_{1} \wedge x'_{0} \wedge \lnot x'_{1} \wedge e_{2}^{l} \wedge e_{2}^{u})\\ \vee (o \wedge \lnot x_{0} \wedge \lnot x_{1} \wedge x'_{0} \wedge \lnot x'_{1} \wedge e_{5}^{l} \wedge e_{5}^{u}) \\ \vee (c \wedge \lnot x_{0} \wedge x_{1} \wedge x'_{0} \wedge \lnot x'_{1} \wedge e_{4}^{l} \wedge e_{4}^{u}) \\ \vee (c \wedge \lnot x_{0} \wedge x_{1} \wedge x'_{0} \wedge x'_{1} \wedge e_{1}^{l} \wedge e_{1}^{u}) \\ \vee (d \wedge x_{0} \wedge \lnot x_{1} \wedge x'_{0} \wedge \lnot x'_{1} \wedge e_{5}^{l} \wedge e_{5}^{u})\\ \vee (f \wedge x_{0} \wedge x_{1} \wedge x'_{0} \wedge x'_{1} \wedge e_{5}^{l} \wedge e_{5}^{u})) \end{array} \end{aligned}$$
Table 1. Encoding the set of states and the probability values of MDP \(M_{0}\) (Fig. 1).
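The encoding above can be sketched mechanically. The literal-set representation and the variable names (`x0`, `xp0`, `e3l`, …) are our own illustrative choices; the sketch reproduces the first disjunct of \(f_{M_{0}}\) (\(s \wedge \lnot x_{0} \wedge \lnot x_{1} \wedge \lnot x'_{0} \wedge x'_{1} \wedge e_{3}^{l} \wedge e_{3}^{u}\)).

```python
from math import ceil, log2

# A state index is encoded over k = ceil(log2(n)) Boolean variables; each
# transition becomes one disjunct of f_M, represented here as a set of
# literals (variable name, polarity).
def enc(index, k, prefix):
    bits = format(index, f"0{k}b")
    return {(f"{prefix}{i}", b == "1") for i, b in enumerate(bits)}

def encode_transition(s, action_var, t, prob_id, k):
    """Literals for: action AND enc(s) over X AND enc(t) over X' AND e^l AND e^u."""
    return ({(action_var, True)}
            | enc(s, k, "x") | enc(t, k, "xp")
            | {(f"e{prob_id}l", True), (f"e{prob_id}u", True)})

k = ceil(log2(4))                          # 4 states of M0 -> 2 variables
term = encode_transition(0, "s", 1, 3, k)  # s0 --send--> s1 with value e_3
```
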

4.2 Encoding MDP Using MTBDD

In this paper, we also consider the implicit representation using MTBDD. MTBDD are used to encode the component \(M_{1}\) and to perform the probabilistic model checking. In Definition 7, we introduce Symbolic MDP (SMDP) and provide the different data structures used to encode an MDP. We denote by \(S_{M_{1}}\) the SMDP encoding the MDP \(M_{1}\).

Definition 7

A Symbolic MDP (SMDP) is a tuple \(S_{M}=(X,\) \(Init_{M},\) \(Y,f_{S_{M}}\) \((yxx'))\) where X, \(X'\) and Y are finite ordered sets of Boolean variables with \(X\cap X'\cap Y = \emptyset \). \(Init_{M}\) is a BDD encoding the initial state and \(f_{S_{M}}(yxx')\) is an MTBDD encoding the transition relation. The sets X, \(X'\) and Y are used to encode, respectively, the set of states, the next states and the set of actions of M, and \(y,x,x'\) are valuations of, respectively, \(Y,X,X'\).

The encoding of MDP as SMDP follows the same process as the encoding using Boolean functions.

Example 2

We consider the MDP \(M_{1}\) (Fig. 1) to illustrate the encoding of an MDP as an SMDP. \(M_{1}\) contains the set of states \(States_{M_{1}}=\{t_{0},\) \(t_{1},\) \(t_{2},\) \(t_{3}\}\) and the set of actions \(\varSigma _{M_{1}} = \{open_{M_{1}},\) \(send_{M_{1}},\) \(check_{M_{1}},\) done, \(fail\}\). We use the set \(X=\langle x_{0},x_{1}\rangle \) to encode the set of states \(States_{M_{1}}\) as: \(enc(t_{0})\) \(=(00)\), \(enc(t_{1})\) \(=(01)\), \(enc(t_{2})=(10)\), \(enc(t_{3})=(11)\); and we use the set \(Y=\langle s, o, c, d, f\rangle \) to encode the actions \(\{open_{M_{1}},\) \(send_{M_{1}},\) \(check_{M_{1}},\) done, \(fail\}\), respectively.

Following the same process used to encode an MDP implicitly as an SMDP, we can encode an Interval MDP as an SIMDP.

Definition 8

A Symbolic Interval MDP (SIMDP) is a tuple \(S_{I}=(X,\) \(Init_{I},\) \(Y,f^{l}_{S_{I}}\) \((yxx'),\) \( f^{u}_{S_{I}}\) \((yxx'))\) where X, \(X'\) and Y are finite ordered sets of Boolean variables with \(X\cap X'\cap Y = \emptyset \). \(Init_{I}\) is a BDD encoding the initial state, and \(f^{l}_{S_{I}}(yxx')\) and \(f^{u}_{S_{I}}(yxx')\) are MTBDD encoding the transition relation over \(Y \cup X \cup X'\): the MTBDD \(f^{l}_{S_{I}}(yxx')\) encodes the lower probability bound and \(f^{u}_{S_{I}}(yxx')\) encodes the upper. The sets X, \(X'\) and Y encode, respectively, the set of states, the next states and the set of actions, and \(y,x,x'\) are valuations of, respectively, \(Y,X,X'\).

4.3 Symbolic Assume-Guarantee Reasoning Rule

To establish the compositional verification process, we propose an assume-guarantee reasoning proof rule in which assumptions are represented using IMDP. As described before, the compositional verification aims to generate a symbolic assumption \(I_{i}\), represented as an IMDP, such that \(M_{0}\) is embedded in \(I_{i}\) (\(M_{0} \preceq I_{i}\)).

Definition 9

Let \(M_{0}=(States_{M_{0}},s_{0}^{M_{0}},\varSigma _{M_{0}},\delta _{M_{0}})\) and \(I_{i}=(States_{I_{i}}, s_{0}^{I_{i}},\varSigma _{I_{i}}\), \(P^{l},P^{u})\) be an MDP and an IMDP, respectively. We say \(M_{0}\) is embedded in \(I_{i}\), written \(M_{0} \preceq I_{i}\), if and only if: (1) \(States_{M_{0}} = States_{I_{i}}\), (2) \(s_{0}^{M_{0}} = s_{0}^{I_{i}}\), (3) \(\varSigma _{M_{0}} = \varSigma _{I_{i}}\), and (4) \(P^{l}(s,a)(s') \le \mu (s,a)(s') \le P^{u}(s,a)(s')\) for every \(s,s' \in States_{M_{0}}\) and \(a \in \varSigma _{M_{0}}\).
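The embedding check of Definition 9 can be sketched directly on toy data. The dictionary layout is illustrative; missing interval entries are read as the empty interval [0, 0].

```python
# Toy MDP and IMDP over the same states, initial state and alphabet.
M = {"states": ["s0", "s1"], "init": "s0", "actions": ["a"],
     "delta": {("s0", "a"): {"s1": 0.7, "s0": 0.3}}}
I = {"states": ["s0", "s1"], "init": "s0", "actions": ["a"],
     "P": {("s0", "a", "s1"): (0.0, 1.0), ("s0", "a", "s0"): (0.0, 1.0)}}

def embedded(mdp, imdp):
    """Definition 9: identical structure, and every MDP probability lies
    inside the corresponding IMDP interval."""
    if (mdp["states"] != imdp["states"] or mdp["init"] != imdp["init"]
            or mdp["actions"] != imdp["actions"]):
        return False
    return all(lo <= p <= hi
               for (s, a), mu in mdp["delta"].items()
               for t, p in mu.items()
               for lo, hi in [imdp["P"].get((s, a, t), (0.0, 0.0))])
```
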

Example 3

Consider the MDP \(M_{0}\) shown in Fig. 1 and the IMDP I shown in Fig. 2. They have the same number of states, identical initial states \((s_{0}\), \(i_{0})\) and the same set of actions \(\{open_{M_{0}},\) \(send_{M_{0}},\) \(check_{M_{0}},\) done, \(fail\}\). In addition, the transition probability between any two states in \(M_{0}\) lies within the corresponding transition probability interval in I for the same action. For example, the transition probability between \(s_{0}\) and \(s_{1}\) is \(s_{0} \xrightarrow []{send_{M_{0}},0.7} s_{1}\), which falls into the interval [0, 1] labelling the transition \(i_{0} \xrightarrow []{send_{M_{0}},[0,1]} i_{1}\) in I. Thus, we have \( M_{0} \preceq I\) (\(M_{0}\) is embedded in I).

Theorem 1

Symbolic assume-guarantee reasoning rule

Let \(M_{0}, M_{1}\) be MDP and \(P_{\le P}[\psi ]\) a probabilistic safety property; then the following proof rule is sound and complete: if \( M_{0} \preceq I \) and \( I \parallel M_{1} \models P_{\le P}[\psi ]\) then \( M_{0} \parallel M_{1} \models P_{\le P}[\psi ] \). This proof rule means that, for a system composed of two components \(M_{0}\) and \(M_{1}\), we can check the correctness of a probabilistic safety property \(P_{\le P}[\psi ]\) over \(M_{0} \parallel M_{1}\) without constructing and verifying the full state space. Instead, we first generate an appropriate assumption I, where I is an IMDP, and then check whether this assumption can be used to verify \(M_{0} \parallel M_{1}\) by checking the two premises: (i) check that \(M_{0}\) is embedded in I, \(M_{0} \preceq I\), and (ii) check that \(I \parallel M_{1}\) satisfies the probabilistic safety property, \( I \parallel M_{1} \models P_{\le P}[\psi ]\). If the two premises are satisfied, we can conclude that \(M_{0} \parallel M_{1}\) satisfies \(P_{\le P}[\psi ]\).

Proof (Soundness). Let \(M_{0}\) and \(M_{1}\) be MDP, where \(M_{0}=(States_{M_{0}},\) \(s_{0}^{M_{0}},\) \(\varSigma _{M_{0}},\) \(\delta _{M_{0}})\) and \(M_{1}=(States_{M_{1}},s_{0}^{M_{1}},\varSigma _{M_{1}},\delta _{M_{1}})\), and let I be an IMDP, \(I=(States_{I},s_{0}^{I},\varSigma _{I},P^{l},P^{u})\). If \( M_{0} \preceq I \), then based on Definition 9 we have \(States_{M_{0}} = States_{I}\), \(s_{0}^{M_{0}} = s_{0}^{I}\), \(\varSigma _{M_{0}} = \varSigma _{I}\), and \(P^{l}(s,a)(s') \le \mu (s,a)(s') \le P^{u}(s,a)(s')\) for every \(s,s' \in States_{M_{0}}\) and \(a \in \varSigma _{M_{0}}\). Based on Definitions 3 and 9, \(M_{0} \parallel M_{1}\) and \(I \parallel M_{1}\) have the same state space, initial state and actions. Since \(P^{l}(s,a)(s') \le \mu (s,a)(s') \le P^{u}(s,a)(s')\), and the transition probability of \(M_{0} \parallel M_{1}\) is \(\mu _{M_{0}\parallel M_{1}}((s_{i},s_{j}),a)(s'_{i},s'_{j}) = \mu _{M_{0}}(s_{i},a)(s'_{i})\times \) \(\mu _{M_{1}}(s_{j},a)(s'_{j})\) for any states \(s_{i},s'_{i} \in States_{M_{0}}\) and \(s_{j},s'_{j} \in States_{M_{1}}\), we obtain \(P^{l}((s_{i},s_{j}),a)(s'_{i},s'_{j}) \le \mu _{M_{0}\parallel M_{1}}((s_{i},s_{j}),a)\) \((s'_{i},s'_{j}) \le P^{u}((s_{i},s_{j}),a)(s'_{i},s'_{j})\) for the probability between the two states \((s_{i},s_{j})\) and \((s'_{i},s'_{j})\). In \(I \parallel M_{1}\), the probability interval between any two states \((s_{i},s_{j})\) and \((s'_{i},s'_{j})\) is restricted by the interval \([P^{l}(s_{i},a)(s'_{i}) \times \mu _{M_{1}}(s_{j},a)(s'_{j}),\) \( P^{u}(s_{i},a)(s'_{i}) \times \mu _{M_{1}}(s_{j},a)(s'_{j})]\). This implies that if \( M_{0} \preceq I \) and \( I \parallel M_{1} \models P_{\le P}[\psi ]\), then \( M_{0} \parallel M_{1} \models P_{\le P}[\psi ] \) is guaranteed.

Proof (Completeness). The completeness of our approach is guaranteed because we always generate a new assumption to refine the initial one. In the worst case, the CDNF will learn a final assumption equivalent to the original component.

4.4 CDNF Learning Algorithm

The CDNF learning algorithm [6] is an exact learning algorithm for Boolean functions. It learns a Boolean formula in conjunctive disjunctive normal form (CDNF) for a target Boolean function over a fixed set of Boolean variables x. In this paper, we use this algorithm to learn the symbolic assumptions I for MDP represented by Boolean functions. During the learning process, the CDNF learning algorithm interacts with a Teacher through two types of queries: (i) membership queries and (ii) equivalence queries. Membership queries check whether a valuation v over the Boolean variables x satisfies the target function. Equivalence queries check whether a conjectured Boolean function is equivalent to the target function.
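The Teacher interface can be sketched as follows. The target function is a stand-in for \(\beta(M_{0})\), and the exhaustive equivalence check is only feasible for the tiny illustrative variable set used here; a real Teacher answers equivalence queries via model checking.

```python
from itertools import product

VARS = ["x0", "x1"]

def target(v):                      # stand-in for beta(M0)
    return v["x0"] and not v["x1"]

def membership(v):
    """Membership query: does valuation v satisfy the target function?"""
    return target(v)

def equivalence(conjecture):
    """Equivalence query: return None if the conjecture matches the target
    on every valuation, otherwise a counterexample valuation."""
    for bits in product([False, True], repeat=len(VARS)):
        v = dict(zip(VARS, bits))
        if conjecture(v) != target(v):
            return v
    return None
```
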

4.5 ACVuIL: Automatic Compositional Verification Using Implicit Learning Algorithm

Algorithm ACVuIL highlights the main steps of our approach. ACVuIL accepts the system components, the MDP \(M_{0}\) and \(M_{1}\), and the probabilistic safety property \(\varphi = P_{\le P}[\psi ]\) as input. ACVuIL starts by encoding \(M_{0}\) using Boolean functions and \(M_{1}\) as an SMDP. Then, it calls the CDNF learning algorithm to learn the initial assumption \(I_{0}\). In the first iteration, the CDNF learns true as the initial assumption; ACVuIL therefore calls the function GenerateInitialAssumption to generate \(S_{I_{0}}\). The process of generating the SIMDP \(S_{I_{0}}\) is described in the next section.

[Algorithm 1: ACVuIL]

4.6 Generate Initial Assumption

ACVuIL calls the function GenerateInitialAssumption to generate the initial assumption \(S_{I_{0}}\). This function accepts the MDP \(M_{0}\) and the Boolean functions \(I_{0}\) as inputs, and returns the SIMDP \(S_{I_{0}}\). The process of generating \(S_{I_{0}}\) is described in Algorithm 2.

[Algorithm 2: GenerateInitialAssumption]

GenerateInitialAssumption creates a new IMDP \(Initial\_I_{0}\) equivalent to \(M_{0}\), with the transition interval [0, 1] between all states and the full set of actions attached to each transition. It then encodes the IMDP \(Initial\_I_{0}\) as an SIMDP. The aim behind generating \(S_{I_{0}}\) with the interval [0, 1] on every transition is to reduce the size of the implicit representation of the state space. Indeed, for large probabilistic systems, using uniform probabilities (0 and 1 in our case) reduces the number of terminal as well as non-terminal nodes. Adding transitions between all states keeps our assume-guarantee rule satisfied for the initial assumption, since \(M_{0}\) is embedded in \(Initial\_I_{0}\); in addition, this helps reduce the size of the implicit representation of \(Initial\_I_{0}\) by combining isomorphic sub-trees into a single tree and eliminating nodes whose left and right children are isomorphic.
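A minimal sketch of GenerateInitialAssumption, assuming the dictionary-based IMDP layout used for illustration throughout (the actual algorithm produces a symbolic SIMDP):

```python
from itertools import product

def generate_initial_assumption(mdp):
    """Copy the state space of M0 and attach the full interval [0, 1],
    with every action, between every pair of states, so that M0 is
    trivially embedded in Initial_I0."""
    return {
        "states": list(mdp["states"]),
        "init": mdp["init"],
        "actions": list(mdp["actions"]),
        "P": {(s, a, t): (0.0, 1.0)
              for s, a, t in product(mdp["states"], mdp["actions"],
                                     mdp["states"])},
    }

M0 = {"states": ["s0", "s1", "s2", "s3"], "init": "s0",
      "actions": ["open", "send", "check", "done", "fail"], "delta": {}}
I0 = generate_initial_assumption(M0)
```
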

Example 4

To illustrate our approach, we consider the verification of \(M_{0} \parallel M_{1}\) (Fig. 1) against the probabilistic safety property \(P_{\le 0.0195}[\diamondsuit ``err\)”], where “err” stands for the state (\(s_{3},t_{3}\)). This property means that the maximum probability, over all possible adversaries, that the system \(M_{0} \parallel M_{1}\) fails is at most 0.0195. ACVuIL starts by encoding \(M_{0}\) as Boolean functions \(\beta (M_{0})\); this encoding is illustrated in Sect. 4.1. The encoding of \(M_{1}\) as an SMDP is illustrated in Sect. 4.2. After encoding the system components implicitly, ACVuIL calls the function GenerateInitialAssumption to generate the initial assumption. The explicit representation of the initial assumption \(Initial\_I_{0}\) is illustrated in Fig. 2.

Symbolic Probabilistic Model Checking (SPCV). In lines 8 and 19, ACVuIL calls the function symbolic probabilistic model checking (SPCV). To model check \(S_{I_{i}}\) \(\parallel \) \(S_{M_{1}} \vDash \) \(P_{\le P}[\psi ]\), SPCV computes the parallel composition \(S_{I_{i}}\) \(\parallel \) \(S_{M_{1}}\), whose result is an SIMDP, because \(S_{I_{i}}\) is an SIMDP. Model checking algorithms for IMDP were considered in [4, 8], where it was shown that the verification of IMDP is often more costly, in time as well as in space, than the verification of MDP. In this work, our ultimate goal is to reduce the size of the state space; therefore, the verification of IMDP needs to be avoided. Thus, rather than verifying the SIMDP \(S_{I_{i}} \parallel S_{M_{1}}\), we propose to verify only a restricted SMDP RI, which is an MTBDD containing the upper probability value of the interval associated with each transition of \(S_{I_{i}}\). This can be done by taking the MTBDD \(f^{u}_{S_{I_{i}}}\) of \(S_{I_{i}}\). The verification of \(RI \parallel S_{M_{1}}\) can then be done using the standard probabilistic model checking proposed in [19]. The symbolic probabilistic model checking used in this work was proposed in [28].
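The restriction step can be sketched as follows, assuming an interval map in place of the MTBDD \(f^{u}_{S_{I_{i}}}\) (data and names are illustrative):

```python
def restrict_to_upper(imdp):
    """Keep only the upper bound of each interval, yielding an ordinary
    transition function RI that a standard model checker can handle."""
    return {(s, a, t): hi for (s, a, t), (_, hi) in imdp["P"].items()}

# Toy SIMDP stand-in with two interval-valued transitions
I = {"P": {("i0", "send", "i1"): (0.0, 1.0),
           ("i1", "check", "i3"): (0.1, 0.4)}}
RI = restrict_to_upper(I)
```
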

Example 5

To analyse whether \(S_{I_{0}}\) could be used to establish the compositional verification, ACVuIL calls the symbolic model checking function (SPCV) to check whether \(S_{I_{0}} \parallel S_{M_{1}} \models P_{\le 0.0195}[\diamondsuit ``err\)”], which returns false. In practice, to verify \(S_{I_{0}} \parallel S_{M_{1}} \models P_{\le 0.0195}[\diamondsuit ``err\)”] we used the model checker PRISM with the “MTBDD” engine [22].

Generate Probabilistic Counterexamples. Probabilistic counterexamples are generated when a probabilistic property \(\varphi \) is not satisfied. They provide valuable feedback about the reason why \(\varphi \) is violated.

Definition 10

The probabilistic property \(\varphi = P_{\le \rho }[\psi ]\) is refuted when the probability mass of the paths satisfying \(\psi \) exceeds the bound \(\rho \). Therefore, a counterexample can be formed as a set of paths satisfying \(\psi \) whose combined measure exceeds \(\rho \).

As stated in Definition 10, a probabilistic counterexample is a set of finite paths; for example, the property “a fail state is reached with probability at most 0.01” is refuted by a set of paths whose total probability exceeds 0.01. The main techniques for the generation of counterexamples are described in [21]. Probabilistic counterexamples are a crucial ingredient in our approach, since they are used to analyse and refine the conjectured symbolic assumptions. Thus, we need to find the most indicative counterexample, i.e. the minimal counterexample (the one with the least number of paths). A recent work [12] proposed to use causality in order to generate small counterexamples.
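Definition 10 can be checked mechanically. The sketch below uses the Ctx path of Example 6 (probabilities 1, 0.7 and 0.2) with an illustrative path representation; each path's measure is the product of its transition probabilities.

```python
from math import prod, isclose

def path_measure(path):
    """path: list of (state, action, probability, next_state) steps."""
    return prod(p for _, _, p, _ in path)

def is_counterexample(paths, rho):
    """A set of paths refutes P_{<=rho}[psi] when its combined mass exceeds rho."""
    return sum(path_measure(p) for p in paths) > rho

# The single path of Ctx from Example 6 (state names as in the paper)
ctx = [[("(s0,t0)", "open", 1.0, "(s3,t0)"),
        ("(s3,t0)", "send", 0.7, "(s3,t1)"),
        ("(s3,t1)", "check", 0.2, "(s3,t3)")]]
```
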

Example 6

Since \(SPCV(S_{I_{0}} \parallel S_{M_{1}} \models P_{\le 0.0195}[\diamondsuit ``err''])\) returns false, ACVuIL calls the function GenerateCounterexample to generate Ctx, which shows the reason why \(P_{\le 0.0195}[\diamondsuit ``err'']\) is violated. Ctx will also be used to check whether it is a real counterexample or not. In practice, we used the tool DiPro to generate counterexamples. This returns \(Ctx = \{ (s_{0},t_{0}) \xrightarrow []{open_{M_{1}},1} (s_{3},t_{0}) \xrightarrow []{send_{M_{2}},0.7} (s_{3},t_{1}) \xrightarrow []{check_{M_{2}},0.2} (s_{3},t_{3}) \}\).

Generate Sub-MDP and Analyse the Probabilistic Counterexamples. To analyse whether the counterexample Ctx is real or not, ACVuIL generates a sub-MDP \(SubM_{0}\), a fragment of the MDP \(M_{0}\) that contains only the transitions present in Ctx. The fragment \(SubM_{0}\) is obtained by removing from \(M_{0}\) all states and transitions not appearing in any path of the set Ctx. Since we use symbolic data structures to encode the state space, we encode the fragment \(SubM_{0}\) as an SMDP (following the same process used to encode an MDP). The function GenerateSubMDP is described in Algorithm 3.

[Algorithm 3: the GenerateSubMDP procedure]
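A possible reading of this construction, again over an assumed dictionary encoding rather than the paper's SMDP/MTBDD one (identifiers are illustrative):

```python
def generate_sub_mdp(mdp, ctx_paths):
    """Sketch of GenerateSubMDP: keep only the states and transitions
    of mdp that appear in some path of the counterexample Ctx.
    mdp is encoded as {(state, action): {successor: prob}}; each path
    in ctx_paths is a list of (state, action, successor) triples."""
    used = {(s, a, t) for path in ctx_paths for (s, a, t) in path}
    sub = {}
    for (s, a), succs in mdp.items():
        kept = {t: p for t, p in succs.items() if (s, a, t) in used}
        if kept:                       # drop choices not touched by Ctx
            sub[(s, a)] = kept
    return sub

# Toy MDP and a one-path counterexample:
mdp = {("s0", "open"): {"s1": 1.0},
       ("s1", "send"): {"s2": 0.7, "s1": 0.3}}
ctx = [[("s0", "open", "s1"), ("s1", "send", "s2")]]
sub = generate_sub_mdp(mdp, ctx)
# sub == {("s0", "open"): {"s1": 1.0}, ("s1", "send"): {"s2": 0.7}}
```

Note that the self-loop \(s_{1} \xrightarrow []{send,0.3} s_{1}\) is removed because it occurs in no path of Ctx.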

Then ACVuIL calls the function AnalyseCounterexample, which checks whether the probabilistic counterexample Ctx is real or not. Ctx is a real counterexample of the system \( M_{0} \parallel M_{1} \models P_{\le \rho }[\psi ] \) if and only if \(SubM_{0} \parallel S_{M_{1}} \models P_{\le \rho }[\psi ]\) does not hold, i.e. AnalyseCounterexample returns true if and only if the symbolic probabilistic model checking of \(SubM_{0} \parallel S_{M_{1}} \models P_{\le \rho }[\psi ]\) returns false, and false otherwise.

Example 7

To analyse the counterexample, ACVuIL generates a sub-MDP containing only the states and transitions that exist in Ctx. For our example, the set Ctx contains the transition \(s_{0} \xrightarrow []{open_{M_{1}},1} s_{3}\), which is not present in \(M_{0}\). Thus, AnalyseCounterexample returns false, since no sub-MDP could be generated for this counterexample. After a few iterations, ACVuIL returns a final assumption \(S_{I_{f}}\) equivalent to the original component \(M_{0}\). In this example, ACVuIL was not able to generate a final assumption more compact than the original component; indeed, in the worst case, ACVuIL returns the original component as the final assumption.

If the probabilistic counterexample Ctx is not real, then ACVuIL returns false to the CDNF learning algorithm, which means that the generated assumption is not equivalent to the target Boolean functions. Thus, CDNF generates a new assumption \(\beta (I_{i})\) (\(i \ge 1\)). In line 18, ACVuIL calls the function \(Refine\_S_{I_{i}}\) to refine the initial assumption. The function \(Refine\_S_{I_{i}}\) is described in the next section (Sect. 4.6).
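The resulting interaction with the learner can be summarised by the following control-flow sketch; all callables are stand-ins for the paper's SPCV, GenerateCounterexample and GenerateSubMDP functions, not the actual implementation:

```python
def acvuil_iteration(check, gen_ctx, gen_sub, assumption):
    """One iteration of ACVuIL's answer to the CDNF learner.
    check(model) plays the role of SPCV on model || S_M1,
    gen_ctx produces the counterexample Ctx, and gen_sub builds
    SubM0 from Ctx (returning None when no fragment exists)."""
    if check(assumption):
        return "holds"                  # the rule premise is discharged
    ctx = gen_ctx(assumption)
    sub = gen_sub(ctx)
    if sub is not None and not check(sub):
        return "violated"               # Ctx is a real counterexample
    return "refine"                     # answer false to CDNF, refine S_Ii

# Toy run: the assumption fails the check, but the counterexample
# is spurious (no sub-MDP can be built), so the loop continues.
result = acvuil_iteration(
    check=lambda m: m == "good",
    gen_ctx=lambda m: "ctx",
    gen_sub=lambda c: None,
    assumption="bad",
)
# result == "refine"
```

This mirrors the three possible outcomes of each iteration: the property is established, a real counterexample disproves it, or the conjectured assumption is refined and learning continues.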

Refinement Process of the Conjecture Symbolic Assumption \(S_{I_{i}}\). At each iteration of ACVuIL, the generated assumption \(S_{I_{i}}\) converges towards the target Boolean functions (\(\beta (M_{0})\)). The function \(Refine\_S_{I_{i}}\) refines the initial assumption \(S_{I_{0}}\) using the newly generated assumption: for every pair of states present in the new assumption, it removes from the initial assumption all transitions between these states, and adds the transitions of the original component between them.

[Figure: the \(Refine\_S_{I_{i}}\) procedure]
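Under the same illustrative dictionary encoding used above, the refinement step might look as follows (a sketch under assumed data structures, not the paper's MTBDD-level implementation):

```python
def refine_assumption(init_assumption, new_assumption, original):
    """Sketch of Refine_S_Ii: for every pair of states present in the
    newly learned assumption, drop the initial assumption's transitions
    between them and reinstate the original component's transitions.
    All models are encoded as {(state, action): {successor: prob}}."""
    new_states = {s for (s, _) in new_assumption} | {
        t for succs in new_assumption.values() for t in succs}
    refined = {}
    for (s, a), succs in init_assumption.items():
        refined[(s, a)] = {t: p for t, p in succs.items()
                           if not (s in new_states and t in new_states)}
    for (s, a), succs in original.items():
        if s in new_states:
            for t, p in succs.items():
                if t in new_states:
                    refined.setdefault((s, a), {})[t] = p
    return refined

# s0 and s1 appear in the new assumption, so the abstract transition
# s0 -a-> s1 (0.4) is replaced by the original one (0.9); the
# transition to s2 is left untouched.
init = {("s0", "a"): {"s1": 0.4, "s2": 0.6}}
new = {("s0", "b"): {"s1": 1.0}}
orig = {("s0", "a"): {"s1": 0.9, "s2": 0.1}}
refined = refine_assumption(init, new, orig)
# refined[("s0", "a")] == {"s2": 0.6, "s1": 0.9}
```

In the worst case every transition of the initial assumption is refined this way, which is exactly the situation of Example 7, where the final assumption coincides with the original component.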

5 Implementation and Experimental Results

We have implemented a prototype tool to evaluate our approach. Our tool accepts an MDP specified in the PRISM language and a probabilistic safety property as input, and returns either true if the MDP satisfies the property, or false and a counterexample otherwise. To implement our tool, we used the library BULLFootnote 1, which implements the CDNF learning algorithm, and the tool DiProFootnote 2 to generate counterexamples. In this section, we give the results obtained by applying our approach to several case studies derived from the PRISM benchmarkFootnote 3. For each case study, we check the model against a probabilistic safety property using: (i) symbolic monolithic probabilistic model checking and (ii) compositional verification (our approach). The tests were carried out on a personal computer running Linux, with a 2.30 GHz Intel i5 CPU and 4 GB of RAM.

For each case study, we compare the size of the original component \(M_{0}\) with that of the final assumption \(I_{f}\), by considering the number of clauses (\(\#Clauses\)) and the number of MTBDD nodes. In addition, we compare the symbolic non-compositional verification (SMV) with our approach ACVuIL. For SMV, we report the size (number of MTBDD nodes) and the time for model construction (T4MC) of the model \(S_{M_{0}} \parallel S_{M_{1}}\). For ACVuIL, we report the number of iterations the ACVuIL algorithm needs to learn the final assumption \(S_{I_{f}}\) (\(\#ite.\)), the total time to generate \(S_{I_{f}}\) (T. Gen. \(S_{I_{f}}\)), as well as the size and T4MC for model checking \(S_{I_{f}} \parallel S_{M_{1}}\).

The results are reported in Table 2. The case studies considered in our experimental results are:

(i) Randomized dining philosophers [13, 25]: we check the property \(\varphi _{1} = \) “the probability that philosophers do not obtain their shared resource simultaneously is at most 0.1”, formally \(P_{\le 0.1} [ \lozenge ``err'']\), where the label “err” stands for every state satisfying \( [(s_{N} \ge 8) \& (s_{N} \le 9)]\) and N is the component number. (ii) Israeli and Jalfon's [20] solution for the randomized self-stabilising algorithm: we check the system against the property \(\varphi _{2} = \) “the probability to reach a stable configuration is at most 0.999”. (iii) A variant of the client-server model from [29]: it models a server and N clients. The server can grant or deny a client's request to use a common resource; once a client receives permission to access the resource, it can either use it or cancel the reservation. Failures may occur with a certain probability in one or more clients, causing a violation of the mutual exclusion property (i.e. a conflict between clients in using the resource). For this case study, we consider the property \(\varphi _{3} = \) “the probability that a failure state is reached is at most 0.98”.

Table 2. Experimental results for the case studies randomized dining philosophers, randomized Self stabilising algorithm and Client-server

The overall results show that ACVuIL successfully generates assumptions for all case studies. As shown in Table 2, CDNF learns an assumption \(\beta ({I_{f}})\) smaller than the original component \(\beta ({M_{0}})\). For the case studies R.D. Philos and Client-Server, the implicit (MTBDD) representation of the final assumption is more compact than that of the original components. For R.S. Stab., however, the sizes are equal; this is because ACVuIL refined all transitions of the initial assumption, so the final assumption coincides with the original component. Regarding verification time, the symbolic monolithic (non-compositional) verification verifies the system faster than our approach ACVuIL. Indeed, our approach takes additional time to generate and refine the assumptions, as well as to generate counterexamples at each iteration.

6 Conclusion and Future Works

In this paper, we proposed a fully-automated probabilistic symbolic compositional verification approach to verify probabilistic systems where each component is an MDP. Our approach ACVuIL is based on a sound and complete symbolic assume-guarantee reasoning rule. The first step encodes the system components using compact data structures such as Boolean functions and MTBDDs; we then use compositional verification to model check the system against a probabilistic safety property. In addition, we use the CDNF learning algorithm to automatically learn the assumptions used in the verification process. We evaluated our approach using three case studies derived from the PRISM benchmark: R.D. Philos, R.S. Stab. and Client-Server. The overall results show that our approach successfully generates assumptions. For two of the listed case studies, CDNF learns an assumption whose implicit representation is smaller than that of the original component. For future work, we plan to propose other assume-guarantee reasoning rules, such as asymmetric or circular rules, to handle larger and more complex systems. In addition, the research presented in this paper can be extended to verify other probabilistic properties such as liveness. Furthermore, we plan to evaluate our approach on real-life complex systems, such as the verification of the composition of inter-organisational workflows [5].