Keywords

1 Introduction and Related Work

Model-checking is an effective, powerful, and successful formal verification technique for concurrent and distributed systems that has also been extended to quantitative logics and models. It consists of an efficient procedure that, given a model \(\mathcal{M}\) of the system, typically composed of system states and related transitions, decides whether \(\mathcal{M}\) satisfies a logical formula \(\varPhi \), typically drawn from a temporal logic. Traditionally, model-checking approaches are divided into two broad categories: global approaches and local approaches.

In global model-checking approaches, the procedure determines the set of all states in \(\mathcal{M}\) that satisfy \(\varPhi \). Global model-checking algorithms are popular because of their computational efficiency and can be found in many model-checkers, both in a qualitative and in a probabilistic setting (see e.g. [2, 3, 11, 16, 36]). The set of states that satisfy a formula is constructed recursively in a bottom-up fashion following the syntactic structure of the formula. Moreover, for stochastic model-checking, the global model-checking algorithm relies on existing and well-known algorithms for Markov Chains, such as those for transient and steady-state analysis (see e.g. [2]). Despite their success, the scalability of model-checking algorithms has always remained a concern due to the potential combinatorial explosion of the state space that needs to be searched.

This is unfortunate since current trends in information technology, like the Internet of Things (IoT), include specifically systems composed of a large number of components, often acting collectively and adapting to changing conditions, the so called Collective Adaptive Systems Footnote 1 (CAS), like, for instance gossip protocols, self-organised collective decision making, computer epidemic and smart urban transportation systems and decentralised control strategies for smart grids [4, 10, 12, 49]. Given that large portions of the IoT are intrinsically (part of) critical infrastructures, with safety, security, and, in general, high dependability requirements, it is of great importance that system designers have the possibility to perform formal analysis before developing and deploying them.

In order to mitigate the state space explosion problem, in the qualitative analysis domain, local model-checking algorithms have been proposed that, given a state s in \(\mathcal{M}\), determine whether s satisfies \(\varPhi \). Local model-checking approaches use the so called ‘on-the-fly’ paradigm (see e.g. [5, 17, 26, 32]) and follow a top-down approach that does not require global knowledge of the complete state space. For each state that is encountered, starting from a given state, the outgoing transitions are followed to adjacent states, constructing step by step local knowledge of the state space until it is possible to decide whether the given state satisfies the formula. For qualitative model-checking, local model-checking algorithms have been shown to have the same worst-case complexity as the best existing global procedures for the above mentioned logics. However, in practice, they have better performance when only a subset of the system states need to be analysed to determine whether a system satisfies a formula. Furthermore, local model-checking may still provide some results in case of systems with a very large or even infinite state space where global model-checking approaches would be impossible to use.

In the context of probabilistic model-checking several on-the-fly approaches have been proposed, among which [21, 29, 40]. In [21], a probabilistic model-checker is shown for the time bounded fragment of the Probabilistic Computation Tree Logic (PCTL) [30]. An on-the-fly approach for full PCTL model-checking is proposed in [40] where, actually, a specific instantiation is presented of an algorithm which is parametric with respect to the specific probabilistic processes modelling language and logic, and their specific semantics. Finally, in [29] an on-the-fly approach is used for detecting a maximal relevant search depth in an infinite state space and then a global model-checking approach is used for verifying bounded Continuous Stochastic Logic (CSL) [1, 2] formulas in a continuous time setting on the selected subset of states.

An on-the-fly approach by itself however, does not solve the challenging scalability problems that arise in truly large parallel systems, such as CAS. To address this type of scalability challenges in probabilistic model-checking, recently, several approaches have been proposed. In [28, 31] approximate probabilistic model-checking is introduced. This is a form of statistical model-checking that consists in the generation of random executions of an a priori established maximal length [37]. On each execution the property of interest is checked and statistics are performed over the outcomes. The number of executions required for a reliable result depends on the maximal error-margin of interest. The approach relies on the analysis of individual execution traces rather than a full state space exploration and is therefore memory-efficient. However, the number of execution traces that may be required to reach a desired accuracy may be large and therefore time-consuming. The approach works for general models, i.e. models where stochastic behaviour can also be non Markovian and that do not necessarily model populations of similar objects. On the other hand, the approach is not independent from the number of objects involved.

In [38], we presented a scalable model-checking algorithm, based on mean-field approximation, for the verification of time bounded PCTL properties of an individualFootnote 2 in the context of a system consisting of a large number of interacting objects. Also this algorithm is actually an instantiation of the above mentioned parametric algorithm for (exact) probabilistic model-checking [40]. In this case, the parametric algorithm is instantiated on (time bounded PCTL and) the approximate, mean-field, semantics of a population process modelling language. The approach is based on the idea of fast simulation, as introduced in [46]. More specifically, the behaviour of a generic agent with S states in a clock-synchronous system with a large number N of instances of the agent at given step (i.e. time) t is approximated by \(\mathbf {K}(\varvec{\mu }(t))\) where \(\mathbf {K}({\mathbf m})\) is the \(S \times S\) probability transition matrix of a (inhomogeneous) DTMC and \(\varvec{\mu }(t)\) is a vector of size S approximating the mean behaviour of the global system at t; each element of \(\varvec{\mu }(t)\) is associated with a distinct state of the agent, say C, and gives an approximation of (the average of) the fraction of the instances of the agent that are in state C in the global system, at step t. Note that such an approximation is deterministic, i.e. \(\varvec{\mu }\) is a function of the step t, computed iteratively, using (again) matrix \(\mathbf {K}({\mathbf m})\); the exact behaviour of the rest of the system would instead be a large DTMC in turn. Note furthermore, that \(\mathbf {K}({\mathbf m})\) does not depend on N; in other words, the cost of the analysis is independent from the number of objects involved, but only depends on the number of states of the single individual object. Our work is based on mean-field approximation in the discrete time setting; approximated mean-field model-checking in the continuous time setting has been presented in the literature as well. In the latter case, the deterministic approximation of the global system behaviour is formalised as an initial value problem using a set of differential equations. Preliminary ideas on the exploitation of mean-field convergence in continuous time for model-checking were informally sketched in a presentation at QAPL 2012Footnote 3 [34], but no model-checking algorithms were presented. Follow-up work on the above mentioned approach can be found in [33] which relies on earlier results on fluid model-checking by Bortolussi and Hillston [7], later published in [8]. Bortolussi and Hillston propose a global CSL model-checking procedure for the verification of properties of a selection of individuals in a population. The procedure relies on mean-field convergence and fast simulation results in a continuous time setting (see also [9, 18, 25] and references therein). The approach in [7, 8] is based on an interleaving model of computation, rather than a clock-synchronous one. Furthermore, a global model-checking approach, rather than an on-the-fly approach is adopted; it is also worth noting that the treatment of nested formulas, whose truth value may change over time, turns out to be much more difficult in the interleaving, continuous time, global model-checking approach than in the clock-synchronous, discrete time, on-the-fly one.

We conclude this brief overview on related work by mentioning the approach of using techniques and tools developed for continuous signal monitoring as means for performing approximated global model checking of probabilistic models. In this approach, a deterministic, continuous, approximation of a population system model is first computed [9], and then monitoring techniques are applied on the resulting function of continuous time [23, 24]. Recently, this approach has been extended in order to include also spatial features [50], as originally proposed in [13].

We finally note that one should keep in mind that mean-field/fluid procedures are based on approximations of the global behaviour of a system. Consequently, the techniques should be considered as complementary to other, more accurate analysis techniques for CAS, primarily those based on stochastic simulation, like for example statistical model-checking. In practice, given the high computational cost of simulation based techniques, especially when compared with the very low cost of the mean-field based techniques, the latter are more suitable for getting first ideas on the main features of the models at hand and a first screening thereof. Then, when only a few options are left, more detailed analyses could be performed and more accurate techniques would be recommended.

In this paper, we present a survey of work we have carried out recently within the context of the EU project QUANTICOLFootnote 4 in the area of mean-field approximated probabilistic model-checking. We start with a brief description, in Sect. 2, of FlyFast, the on-the-fly model checker which implements the procedure we proposed in [38, 42, 43]. Then, in Sect. 3 we show a complete example of use of FlyFast in the context of Collective Adaptive Systems, taken from [39]. In Sect. 4 we discuss two additional interesting front-ends for FlyFast; the first one, originally presented in [41], is a translation to FlyFast from CTMC-based population models and (a fragment of) CSL that allows for approximate probabilistic model-checking in the continuous stochastic time setting; the second one, originally presented in [15], is a translation to FlyFast from a predicate-based process interaction language that allows for probabilistic model-checking of bounded PCTL formulas on models based on components equipped both with behaviour and with attributes; component interaction takes place via communication primitives using predicates over attributes for expressing the set of partners in multi-cast communication. We finally draw some conclusions in Sect. 5.

2 A Brief Overview of FlyFast

In this section we recall the main features of FlyFast. Footnote 5 The reader interested in details is referred to [38, 42, 43]. A FlyFast model specification characterises a system consisting of the clock-synchronous product of a (large) number of instances of a probabilistic process. Systems with several distinct types of processes can be specified as well; here we consider only the case of a single process type for the sake of simplicity. The size of the system is assumed constant during system evolution; FlyFast does not support explicit dynamic creation/deletion of processes. The behaviour of the probabilistic process is specified by a set \(\varDelta {}\) of state and action probability definitions. A state definition has the following syntax: \(\mathbf {state} \; {\texttt {C}} \; \{{\texttt {a}}_1. {\texttt {C}}_1 + \ldots + {\texttt {a}}_r . {\texttt {C}}_r\}\) where \({\texttt {a}}_i \in {\mathcal{A}}_{}\)—the set of FlyFast actions\({\texttt {C}},\texttt {C}_i \in {}\)—the set of FlyFast states—and, for \(i,j=1,\ldots ,r\) \({\texttt {a}}_i\not =\texttt {a}_j\) if \(i\not =j\); note that \({\texttt {C}}_i=\texttt {C}_j\) with \(i\not =j\) is allowed instead. The informal meaning of the above definition is that, when in state \({\texttt {C}}\), the process can jump to state \({\texttt {C}}_1\), by executing (atomic) action \({\texttt {a}}_1\), or to state \({\texttt {C}}_2\), by executing (atomic) action \({\texttt {a}}_2\), and so on. Each action has a probability assigned by means of an action probability definition of the form \(\mathbf {action}\; {\texttt {a}}: exp\) where exp is an expression involving constants and \(\mathbf {frc}\,({\texttt {C}})\) terms. Constants are floating point values or names associated to such values using the construct \(\mathbf {const}\; \texttt {name} \, = \, \texttt {value}\)—we let A denote the set of such auxiliary definitions; \(\mathbf {frc}\,({\texttt {C}})\) denotes the element associated to state \({\texttt {C}}\) in the current occupancy measure vector—o.m.v. in the sequel—that is a vector with as many elements as the number of states of the individual process; the element associated to a specific state gives the fraction of the subpopulation currently in that state over the size of the overall population; the o.m.v. is a compact abstract representation of the system global state, where process identity is lost. Thus, the probability of executing a transition of a process in the system may depend on the global distribution of the processes in their local states within the system; process interaction is thus probabilistic and indirect, via transition probabilities, i.e. functions of the o.m.v.. Note that, whenever the exit probability p of a state is smaller than 1, FlyFast implicitly inserts a self-loop in the state, associated with the residual probability \(1-p\). The initial state of the system is specified by means of the \(\mathbf {system}\) construct, followed by the name of the system model, and the vector \(\mathbf {C_0}\) of the names of the initial state of all other instances, which implicitly specifies also the size N of the system. By convention, the first element \(\mathbf {C_0}[1]\) of vector \(\mathbf {C_0}\) refers to the individual process to analyse. In general more then one process can be specified for analysis; here we consider only the case of the single process for the sake of simplicity.

Fig. 1.
figure 1

A FlyFast specification of an epidemic model

Example 1

(An epidemic system model). In Fig. 1 the FlyFast specification of the epidemic model discussed in [38] is shown. The system is composed of 2000 instances of a process with four states; when in state \(\texttt {S}\) (susceptible) the process can become exposed (state \(\texttt {E}\)) either via an external infection, with probability \(\texttt {alpha\_e}\), or via internal infection, with a probability that is proportional to the fraction of processes in the system that are already infected, i.e. \(\texttt {alpha\_i}*\mathbf {frc}\,(\texttt {I})\). The infection activates in an exposed process with probability \(\texttt {alpha\_a}\), leading to state \(\texttt {I}\). An infected process may recover with probability \(\texttt {alpha\_r}\) and then loose immunity with probability \(\texttt {alpha\_l}\). Initially, all 2000 instances are in state \(\texttt {S}\) Footnote 6.

Given specification \({\langle \varDelta {},A,\mathbf {C_0} \rangle }\) for a system model of size N, FlyFast generates a transition probability matrix \(\mathbf {K}({\mathbf m})\) such that \(\mathbf {K}({\mathbf m})_{c,c'}\) is the probability for the (individual) probabilistic process to jump from state \({\texttt {C}}\) to \(\texttt {C}'\), given the current o.m.v. \({\mathbf m}\). Thus, \(\mathbf {K}({\mathbf m})\) is a function of the o.m.v. \({\mathbf m}\); strictly speaking, \(\varDelta \) characterises an inhomogeneous DTMC. In [38, 42] the details of the formal operational semantics definition for the model specification language are provided as well as the procedure for generating \(\mathbf {K}({\mathbf m})\); in the sequel we recall only the main steps. Let \({\varDelta }\) be the set of states defined in \(\varDelta {}\), with \(|{\varDelta }|=S\), \(\mathcal{U}^S = \{(m_1,\ldots ,m_S) \in [0,1]^S| m_1+ \ldots + m_S=1\}\) denote the unit simplex of dimension S, and \(\mathcal{I}:{\varDelta } \rightarrow \{1,\ldots ,S\}\) be a bijection. For state \({\texttt {C}} \in {\varDelta }\), with \(\mathcal{I}({\texttt {C}})=c\), the interpretation \([\![\mathbf {frc}\,({\texttt {C}})]\!]_{{\mathbf m}}\) of \(\mathbf {frc}\,({\texttt {C}})\) in the current o.m.v. \({\mathbf m}=(m_1,\ldots ,m_S)\) is defined as expected: \([\![\mathbf {frc}\,({\texttt {C}})]\!]_{{\mathbf m}}=m_c\), i.e. \(\mathbf {frc}\,({\texttt {C}})\) is the fraction of the subpopulation currently in state \({\texttt {C}}\) over the size of the overall population, which, by definition of the o.m.v., is exactly the element \(m_c\) of \({\mathbf m}\). The probability associated with an action \({\texttt {a}}\) by action probability definition \(\mathbf {action}\; {\texttt {a}}: E\) is a function \(\pi _{{\texttt {a}}}({\mathbf m})\) of the o.m.v. \({\mathbf m}\), defined as \(\pi _{{\texttt {a}}}({\mathbf m})=[\![E]\!]_{{\mathbf m}}\), where the interpretation function \([\![\cdot ]\!]\) is defined recursively on arithmetic expressions E involving \(\mathbf {frc}\,\) and constants, in the obvious way. More precisely, letting \({\texttt {C}} \mathop {\rightarrowtail }\limits ^{{\texttt {a}}} {\texttt {C}}'\) represent an \({\texttt {a}}\)-labelled transition in the operational semantics of the FlyFast modelling language and assuming \(c=\mathcal{I}({\texttt {C}})\not =c'=\mathcal{I}({\texttt {C}}')\), the probability matrix function \(\mathbf {K}: \mathcal{U}^S \times \{1,\ldots ,S\} \times \{1,\ldots ,S\} \rightarrow [0,1]\) is defined as follows: \(\mathbf {K}({\mathbf m})_{c,c'} =\sum _{{\texttt {a}}: {\texttt {C}} \mathop {\rightarrowtail }\limits ^{{\texttt {a}}} {\texttt {C}}'} \pi _{{\texttt {a}}}({\mathbf m})\) and \(\mathbf {K}({\mathbf m})_{c,c} = 1 - \sum _{j \in \{1,\ldots ,S\} \setminus \{c\}} \mathbf {K}({\mathbf m})_{c,j}\). In other words, \(\mathbf {K}({\mathbf m})_{c,c'}\) is the cumulative probability of jumping from \({\texttt {C}}\) to \(\texttt {C}'\), abstracting from the specific action performed by the process in the jump; this abstraction choice is typical of probabilistic, PCTL/DTMC-based approaches. Finally, note that, by construction, \(\mathbf {K}({\mathbf m})\) does not depend on N.

Example 2

It is easy to see that, for the model of Example 1, the resulting matrix is the following one, with \({\mathbf m} = (m_s,m_e,m_i,m_r)\) where \(m_s\) is the fraction of processes in state \(\texttt {S}\), \(m_e\) is the fraction in state \(\texttt {E}\), \(m_i\) is the fraction in state \(\texttt {I}\), and \(m_r\) is the fraction in state \(\texttt {R}\):

$$ \mathbf {K}(m_s,m_e,m_i,m_r)= \left( \begin{array}{c c c c} 1- (0.1 + 0.2m_i) &{} 0.1 + 0.2m_i &{} 0 &{} 0\\ 0 &{} 0.6 &{} 0.4 &{}0\\ 0 &{} 0 &{} 0.8 &{} 0.2\\ 0.1 &{} 0 &{} 0 &{} 0.9 \end{array} \right) $$

The exact probabilistic semantics of the complete system model is easily given as product of N instances of \(\mathbf {K}\) with appropriate o.m.v. parameter and argument states. In other words, the transitions of different processes are intended as stochastically independent Footnote 7. More precisely, for global system state \(\mathbf {C} \in {\varDelta }^N\), let the associated o.m.v. \(\mathbf {M} (\mathbf {C})\) be defined as \(\mathbf {M} (\mathbf {C}) = (M_1,\ldots ,M_S)\) with \( M_{i} = \frac{1}{N}\sum _{n=1}^{N} \mathbf {1}_{\{\mathbf {C}_{[n]} = \mathcal{I}^{-1}(i)\}} \) where \(\mathbf {1}_{\{\alpha = \beta \}}\) is 1, if \(\alpha = \beta \), and 0 otherwise. The probabilistic semantics of the system is the DTMC \(X^{(N)}(t)\) with one-step transition probability \(S^N \times S^N\) matrix \(\mathbf {P}\) with \( \mathbf {P}_{\mathbf {C}, \mathbf {C}'}=\prod _{n=1}^N \mathbf {K}(\mathbf {M} (\mathbf {C}))_{\mathcal{I}(\mathbf {C}_{[n]}),\mathcal{I}(\mathbf {C}_{[n]}')} \) and initial probability mass all in \(\mathbf {C}_0\). FlyFast provides a standard stochastic simulation functionality based on the exact probabilistic semantics, namely matrix \(\mathbf {P}\). In particular one can execute single runs or get averages of a user-specified number of runs. The output is given in the form of traces of the o.m.v. DTMC \(\mathbf {M}^{(N)}(t)=\mathbf {M}(X^{(N)}(t))\). In addition, the tool can perform exact, on-the-fly (full) PCTL model checking using \(\mathbf {P}\). FlyFast accepts state formulas built out of atomic propositions, negations, disjunctions and probabilistic quantification over path-formulas; the latter are next and until formulas. Of course, as opposed to approximate model-checking, exact PCTL model-checking of a formula can be used only if the portion of the state-space which needs to be generated and analysed for deciding satisfaction of the formula is not too large.

Example 3

For the epidemic model of Example 1, but with constant N set to 8, for a system with only 8 processes, we consider the following properties, where \(\mathtt {tt}\) stands for true, \(\texttt {LowInf}\) is defined, using the \(\mathbf {formula}\) construct of FlyFast, as follows: \(\mathbf {formula}{}\; \texttt {LowInf} \, : (\mathbf {frc}\,\texttt {I} )< 0.25\), and \(\texttt {I}\) (\(\texttt {E}\), respectively) labels all system states the first element of which is process state \(\texttt {I}\) (\(\texttt {E}\), respectively):

  1. P1

    the worm will be active in the first component within k steps with a probability that is at most p: \( {\mathcal{P}}_{\le p}(\mathtt {tt}~\, {\mathcal{U}}^{\le k} \,~\texttt {I}) \);

  2. P2

    the probability that the first component is infected, but latent, in the next k steps while the worm is active on less then \(25\%\) of the components is at most p: \( {\mathcal{P}}_{\le p}(\texttt {LowInf} \, {\mathcal{U}}^{\le k} \,~\texttt {E}) \);

  3. P3

    the probability to reach, within k steps, a configuration where the first component is not infected but the worm will be activated with probability greater than 0.3 within 5 steps is at most p:

    $$ {\mathcal{P}}_{\le p}(\mathtt {tt}~\, {\mathcal{U}}^{\le k} \, ( !\texttt {E}\, \wedge \, !\texttt {I}\wedge {\mathcal{P}}_{> 0.3}(\mathtt {tt}~\, {\mathcal{U}}^{\le 5} \,~\texttt {I}))). $$

In Fig. 2 the result of exact PCTL model-checking of Example 1 is reported. On the left the probability of the set of paths that satisfy the path-formulae used in the three formulae above is shown for k from 0 to 70. On the right the time needed to perform the analysis using PRISM [36] and using FlyFast exact PCTL model checking are presentedFootnote 8.

Fig. 2.
figure 2

Exact model-checking results (left) and verification time (right).

More interestingly, FlyFast can compute the deterministic limit of the o.m.v. DTMC, for \(N \rightarrow \infty \), and execute time bounded PCTL model-checking using such a deterministic approximation. The approach has been inspired by Fast Simulation, proposed in [46] and is based on Theorem 4.1 of [46], actually on a simplified version of the theorem, thanks to the specific syntax of the expressions used in FlyFast action probability definitions. Informally, let \(\mathbf {C_0}^{(N)}\) be the initial state of the FlyFast specification of a system with N processes and assume there exists \(\varvec{\mu _0} \in \mathcal{U}^S\) such that almost surely \(\lim _{N\rightarrow \infty } \mathbf {M}(\mathbf {C_0}^{(N)}) = \varvec{\mu _0}\). Let function \(\varvec{\mu }(t)\) be defined as follows: \(\varvec{\mu }(0)=\varvec{\mu _0}\) and \(\varvec{\mu }(t+1)=\varvec{\mu }(t)^T \cdot \mathbf {K}(\varvec{\mu }(t))\), where, as usual, \({\mathbf m}^T\) is the transpose of vector \({\mathbf m}\). Then, for any fixed time \(\tau \), almost surely \( \lim _{N\rightarrow \infty } {\mathbf M}^{(N)}(\tau ) = \varvec{\mu }(\tau ) \)—cfr. Theorem 4.1 of [46]. So, the matrix \( \mathbf {K}({\mathbf m})\) generated by FlyFast can be conveniently (re)used also for approximating the o.m.v., which, we recall, is an abstract representation of the global system state; it is important to stress here that the o.m.v. \(\mathbf {M}^{(N)}(t)\) is a stochastic process, whereas the approximation we use, \(\varvec{\mu }(t)\), is deterministic, i.e. just a function of the step (time) t. We consider now the stochastic process \(\mathcal{H}(t)\) the generic state of which, at time t, is a pair \(({\texttt {C}},\varvec{\mu }(t))\). The first component \({\texttt {C}}\) is the current state of the selected process in the system we are interested in, and the second component \(\varvec{\mu }(t)\) represents the current global system state. It is easy to see that \(\mathcal{H}(t)\) is a DTMC and that the probability of a jump from state \(({\texttt {C}},\varvec{\mu }(t))\) to \(({\texttt {C}}',\varvec{\mu }(t+1))\) is \(\mathbf {K}(\varvec{\mu }(t))_{\mathcal{I}({\texttt {C}}),\mathcal{I}(\texttt {C}')}\). \(\mathcal{H}(t)\) is the approximated probabilistic semantics of the system model. By performing on-the-fly model-checking on \(\mathcal{H}(t)\)—where state labels of the selected process are exported to pair states \(({\texttt {C}},\varvec{\mu })\)FlyFast provides an approximated, mean-field based, efficient time bounded PCTL model-checking functionality. In other words, for any fixed time \(\tau \), sample \(\mathbf {C}\) of X(t) at time \(\tau \) and safe formulaFootnote 9 \(\varPhi \) the following holds: \( \mathbf {C} \models _{X(t)} \varPhi \) if and only if \( (\mathbf {C}[1],\varvec{\mu }(\tau )) \models _{\mathcal{H}(t)} \varPhi . \) In the case of mean-field model-checking, the set of atomic propositions is the set of states of the single agent or assertions on the components of the o.m.v.; in addition one can assign a name to a formula and use it in larger formulas. Finally, note that FlyFast can provide, as a by-product, the plot of \(\varvec{\mu }(\tau )\) for \(\tau \) ranging in a user-specified range.

Example 4

Figure 3 shows the result of mean-field, approximated model-checking by FlyFast on the model of Example 1 with formulas as in Example 3, for the first object of a large population of 2000 objects, each initially in state \(\texttt {S}\). In Fig. 3 (left) the same properties are considered as in Example 3. The analysis takes less than a second and is insensitive to the total population size. Figure 3 (right) shows how the probability measure of the set of paths satisfying the formula \(\mathtt {tt}~\, {\mathcal{U}}^{\le k} \, ( !\texttt {E} \; \wedge !\texttt {I}\wedge {\mathcal{P}}_{> 0.3}(\mathtt {tt}~\, {\mathcal{U}}^{\le 5} \,~\texttt {I}))\) of property P3 on page 8, (for \(k=3\)), changes for initial time t0 varying from 0 to 10.

Fig. 3.
figure 3

Mean-field model-checking results.

We close this section by stressing that the exact full PCTL model-checker and the approximated mean-field time bounded PCTL model-checker are both instances of the same parametric implementation of an on-the-fly model-checking algorithm. Furthermore, the computation of the set of states to be analysed at the next step is a key operation of the on-the-fly procedure and, for exact model-checking, in the worst case the step returns \(S^N\) states, whereas for mean-field model-checking, the number of states returned in the worst case drops dramatically to S.

3 Predator-Prey Model of Lotka-Volterra in FlyFast

The next example we consider is a widely studied model for ecological competition, first independently investigated by the biophysicist Alfred Lotka and the Italian mathematician and physicist Vito Volterra in the twenties of the 19th century [48, 51]. Since then, the model has been studied extensively by numerous other scientists and some of its elements are still at the basis of many population models that have been developed in the course of time, both in continuous time, e.g. [19, 27] and references therein, and in discrete time settings, e.g. [22]. In its simplest form the model can be interpreted as a simplified and idealised description of two species in an ecosystem, often indicated as predator and prey, or foxes and rabbits for a concrete example.

In the variant we consider here we assume that each element of the two species can be in one of two states; it is either alive, or it is somehow ‘dormant’ waiting to get born again. We do this because the language we use does not provide explicit constructs for the dynamic creation of objects and is implicitly assuming that the total population size of all species remains constant. If we choose the size of the ‘dormant’ part of the population of each species large enough, this should not have any effect on the part of the population that is alive, since there are always enough dormant rabbits and foxes to get born.

As in the original version, we assume that the model depends on four parameters:

  • The net probability ‘a’ of an increase in the size of the rabbit population which is the difference between the natural birth and death probabilities.

  • The probability ‘b’ of rabbits that die because they are eaten by foxes

  • The probability ‘e’ of extra foxes being born and surviving because they eat rabbits (efficiency).

  • The net probability ‘c’ of the natural decrease in the population of foxes. Since the life of a fox depends on the availability of rabbits, there is a natural tendency of foxes to die when there a few or no rabbits.

A model in terms of difference equations of the populations of foxes and rabbits can then be given by:

$$\begin{aligned} RD(t+1)= & {} RD(t) + b\cdot h\cdot RL(t)\cdot FL(t) - a\cdot h\cdot RL(t) \nonumber \\ RL(t+1)= & {} RL(t) + a\cdot h\cdot RL(t) - b\cdot h\cdot RL(t)\cdot FL(t)\\ FD(t+1)= & {} FD(t) - e\cdot h\cdot RL(t)\cdot FL(t) + c\cdot h\cdot FL(t)\nonumber \\ FL(t+1)= & {} FL(t) + e\cdot h\cdot RL(t)\cdot FL(t) - c\cdot h\cdot FL(t)\nonumber \end{aligned}$$
(1)

where t ranges over the set of the natural numbers, RD and RL are the fractions of ‘dormant’ and ‘alive’ rabbits, respectively, and FD and FL the fractions of ‘dormant’ and ‘alive’ foxes, respectively. The factor h is a rescaling factor for the duration of steps and \(0< h <1\). The smaller the value of h the smaller the relative probabilities of the different events and the more accurate the results, but at the cost of an increase of the number of steps per time-unit in the model-checking procedure, which takes more time. For the model in this section we chose \(h=0.125\). Note that when this discrete model is interpreted as an approximation of the well-known continuous time model, i.e. in terms of differential equations, this approximation is not perfect, in the sense that the solution of the differential equations would give a perfect oscillating behaviour, whereas the solution of the difference equations will result in a small error in each step. This error has a cumulative effect resulting in oscillations with ever higher peaks, as can easily be observed in the results. A better approximation of the continuous model could be reached by using a more sophisticated integration method instead of the Euler method that is used implicitly in this case study.

Fig. 4.
figure 4

A FlyFast specification of the Lotka-Volterra model

The FlyFast specification of the Lotka-Volterra model is shown in Fig. 4. Assuming \(\mathcal{I}(\texttt {RD})=1\), \(\mathcal{I}(\texttt {RL})=2\), \(\mathcal{I}(\texttt {FD})=3\), \(\mathcal{I}(\texttt {FL})=4\), the \(4 \times 4\) matrix \(\mathbf {K}: \mathcal{U}^4 \times \{1,\ldots ,4\} \times \{1,\ldots ,4\} \rightarrow [0,1]\) generated by FlyFast is shown below, noting that the matrix is stochastic for the time interval of interest (and in particular \(m_1\not =0\not =m_3\)):

$$ \mathbf {K}(m_1, m_2, m_3, m_4) = \left( \begin{array}{c c c c} 1- a\!\cdot \!h\!\cdot \!\frac{m_2}{m_1} &{} a\!\cdot \!h\!\cdot \!\frac{m_2}{m_1} &{} 0 &{} 0\\ b\!\cdot \!h\!\cdot \!m_4 &{} 1- b\!\cdot \!h\!\cdot \!m_4 &{} 0 &{} 0\\ 0 &{} 0 &{} 1- e\!\cdot \!h\!\cdot \!m_2\!\cdot \!\frac{m_4}{m_3} &{} e\!\cdot \!h\!\cdot \!m_2\frac{m_4}{m_3}\\ 0 &{} 0 &{} c\!\cdot \!h &{} 1- c\!\cdot \!h \end{array} \right) $$

It is easy to see that by computing \(\varvec{\mu }(t+1)\) as \(\varvec{\mu }(t+1)=\varvec{\mu }(t)^T \cdot \mathbf {K}(\varvec{\mu }(t))\), where \(\varvec{\mu }(t) = (\mu _1(t), \mu _2(t), \mu _3(t), \mu _4(t))\), one obtains again the difference Eq. (1) of page 11, where, of course, \(\mu _1(t)\) stands for RD(t), \(\mu _2(t)\) for RL(t), \(\mu _3(t)\) for FD(t), and \(\mu _4(t)\) for FL(t).

As it is well known, the global behaviour of the (idealised) model shows oscillations in the populations of rabbits and foxes for certain values of the model parameters. In fact, the model has very interesting behaviour and is therefore widely studied, but in this paper we focus mainly on the illustration of the application of fast mean field model checking of an individual rabbit or fox in the context of the overall oscillating behaviour. For example, for the values of the parameters and initial state as in Fig. 4, we obtain the results for the occupancy measure varying over time shown in Fig. 5, which is the plot of the limit o.m.v. \(\varvec{\mu }(\tau )\) produded by FlyFast.

Fig. 5.
figure 5

Fraction of rabbit and fox populations.

In the predator-prey model one could furthermore be interested to know what is the probability that a rabbit survives for a certain amount of time, and how this probability changes over time with the oscillation of the population of foxes. Figure 6 shows the probability that a fox gets born or dies within time bound t ranging from 0 to 3000 time steps. It also shows the results for a rabbit getting born or dying. The formula for the probability that a rabbit gets born within t time steps is \({\mathcal{P}}_{= ?}(\texttt {RD}~\, {\mathcal{U}}^{\le t} \,~\texttt {RL})\). The other formulas are similar. Figure 6 shows that both foxes and rabbits eventually get born and die when given enough time and starting from the initial state of the overall system. The curves also reflect the oscillations in the populations over time and consequently the change in probability to get born or die.

Fig. 6.
figure 6

Probability that a fox (rabbit) gets born or dies within time bound t ranging from 0 to 3000 time steps.

Figure 7 shows the time-dependent probability of a fox and a rabbit to get born or die in the next 10 time steps, starting from initial times ranging from 0 to 5000. The probability that a rabbit dies within 10 time units is obtained by evaluating the property \({\mathcal{P}}_{= ?}(\texttt {RL}~\, {\mathcal{U}}^{\le 10} \,~\texttt {RD})\), for different initial times. The other formulas are similar. In this oscillating system the time-dependence of these probabilities can be observed very well. The probabilities of a rabbit getting born (dying respectively) within 10 time units and a fox getting born follow closely the oscillations in the respective population sizes. The probability that a fox dies in this model is constant. The amplitude of the oscillations is slowly increasing. This is likely due to the accumulation of small errors in the computation due to the constant step size used in the computations. In fact it is well-known that a mean-field approximation may become less accurate on the longer run, in a discrete time setting. See e.g. [19] for a study of this aspect of the Lotka-Volterra model in the continuous time setting. We will come back to these issues in Sect. 4.1.

Fig. 7.
figure 7

Time dependent probability to get born or die in the next 10 time steps for different initial times from 0 to 5000.

Finally, Fig. 8 shows the time dependent probability of reaching a state, within 100 time steps, in which the probability of an individual rabbit to die within 10 time steps is higher than 0.2. This probability is shown for different initial times ranging from 0 to 5000. This is a typical example of a ‘nested’ formula involving two occurrences of the until operator. The formula is:

$$\begin{aligned} {\mathcal{P}}_{= ?}(\mathtt {tt}~\, {\mathcal{U}}^{\le 100} \,~(~\texttt {RL}~\wedge ~{\mathcal{P}}_{> 0.2}(~\texttt {RL}~\, {\mathcal{U}}^{\le 10} \,~\texttt {RD}))) \end{aligned}$$

The figure shows that there are indeed relatively short periods in which such states can be reached within 100 time steps. Nested formulas are relatively easy to handle due to the iterative and recursive way in which the FlyFast model-checker works.

Fig. 8.
figure 8

Time dependent probability of reaching a state, within 100 time steps, in which the probability of an individual rabbit to die within 10 time steps is less than 0.2 for different initial times ranging from 0 to 5000.

4 Extending the Applicability of FlyFast

In the previous sections we have shown examples of the expected use of FlyFast, namely the development of a probabilistic, discrete time, population model of the system of interest and its analysis, mainly via bounded PCTL model-checking based on mean-field semantics. In this section we briefly describe two extensions of the applicability of the tool, both designed as additional front-ends for FlyFast, so that no modifications are required of the tool itself. The first extension concerns on-the-fly fluid CSL model-checking of continuous time population models; the second extends the FlyFast modelling language, and its underlying agent interaction paradigm, by adding predicate based communication primitives. Details on the first extension can be found in [41] while the second front end is described in detail in [15, 44].

4.1 FlyFast Front-end for Fluid Model-checking of Continuous Time Population Models

Fluid model checking [7, 8, 33] relies on a global model checking approach for time-inhomogeneous Continuous Time Markov Chains (ICTMC) representing an individual object in the context of a large CTMC population model. The rates of the individual may depend on the fraction of the population that is in a particular state. The algorithm relies on the deterministic approximation of the average stochastic behaviour of the system in continuous time, i.e. a fluid approximation [9, 35]. Although the technical and mathematical foundations of the continuous time case are obviously different from those in the discrete case, at the intuitive/conceptual level, the two cases are similar.

Suppose you have system of N agents, each modelled by a ICTMC with states in \(\{1,\ldots , S\}\), and \(S \times S\) infinitesimal generator matrix \(\mathbf {Q}^{(N)}(\varvec{x})\) that may depend on the current o.m.v. \(\varvec{x}\in \mathcal{U}^S\); the o.m.v. process is a CTMC on the space \([0,1]^S\) with initial state \(\varvec{x}_0^{(N)}\) equal to the fraction of agents in each local state, in the initial global state. The average infinitesimal variation of the o.m.v. process, given that it is in state \(\varvec{x}\) is \(F^{(N)}(\varvec{x}) = \varvec{x}^T\cdot \mathbf {Q}^{(N)}(\varvec{x})\). If, for \(N \rightarrow \infty \), \(\mathbf {Q}^{(N)}(\varvec{x})\) converges uniformly to the Lipschitz continuous generator matrix \(\mathbf {Q}(\varvec{x})\), and \(\varvec{x}^{(N)}_0\) to \(\varvec{x}_0\), and, furthermore, if \(\varvec{x}(t)\) is the solution of the ODE \(\frac{d\varvec{x}}{dt}=F(\varvec{x})=\varvec{x}^T\cdot \mathbf {Q}(\varvec{x})\) for initial condition \(\varvec{x}(0)=\varvec{x}_0\), then, almost surely, in the limit, the o.m.v. process behaves the same as \(\varvec{x}(t)\), for any finite time horizon T [18, 35].

This fundamental result has given rise to a fast simulation approach also in the continuous case. Assuming, again by convention and without loss of generality, that we are interested in the first of the N agents, let \(Z^{(N)}(t)\) be the ICTMC on space \(\{1,\ldots , S\}\) modelling the behaviour of such an agent. Let us furthermore consider the ICTMC on \(\{1,\ldots , S\}\) z(t) such that \(\Pr \{z(t+dt)=j | z(t)=i\}= q_{i,j}(\varvec{x}(t))dt\), and let \(\mathbf {Q}_z(\varvec{x}(t)) = (q_{i,j}(\varvec{x}(t)))\). We then have that for any finite horizon T and \(t\le T\) the behaviour of the single object \(Z^{(N)}(t)\) tends to the behaviour of the object that senses the rest of the system only through its limit behaviour given by \(\varvec{x}\), i.e. z(t). On the basis of these results, in [7, 8] a model-checking algorithm has been proposed for CSL robustFootnote 10 formulas.

In [41] we took an alternative approach, showing that, under suitable convergence and scaling assumptionsFootnote 11, and for models that are not too stiffFootnote 12, fluid model checking can be performed exploiting on-the-fly mean field model checking. In particular, in [41] a mechanical translation is defined which derives a time-inhomogeneous DTMC and a bounded PCTL formula from the input ICTMC model and bounded CSL formula. FlyFast can then be used for performing on-the-fly mean-field model-checking of the derived formula on the derived IDTMC.

Our approach starts from the idea that we can interpret the difference equations obtained from a discrete time population model as an instance of the Euler forward method for approximating the solution of a set of ODEs. The set of ODEs we are interested in solving are those of a corresponding continuous population model. This, in turn, means that we need to derive suitable values for the probabilities from the rates in the continuous model. What we are actually interested in is to transform an ICTMC model of an individual (from which the ODEs can be derived) into an IDTMC model, with the same local states and jump structure as the ICTMC; from this IDTMC we get the set of difference equations that can be used to approximate the solution of the ODEs. Intuitively, the IDTMC is obtained from the ICTMC using an approach which is similar to CTMC uniformisationFootnote 13; we define a probability matrix \(\mathbf {K}\) such that \(\mathbf {K}=\mathbf {I}+\frac{1}{q}\cdot \mathbf {Q}\), where \(\mathbf {Q}\) is the infinitesimal rate matrix—which is a function of \(\varvec{x}(t)\)—and q must not only satisfy the standard requirements for uniformisation, but also be such that absolute stability of the method as well as acceptable accuracy are guaranteed [47]. This procedure produces a discretisation of the continuous-time model; of course, also the logical formulas must be translated by consistently discretising them—and in particular the time bound of the bounded until operator.

We refer to [41] for the detailed definition of the translations and their correctness proof. Here we point out that such global fluid model checking algorithms, as described in [7, 8], require the a priori calculation of discontinuity points, i.e. points in time in which the truth values of time-dependent (sub)-formulas of an until formula change. This is a non-trivial task and consists in finding all zeros of an analytic function. In the on-the-fly setting, instead, such points are detected automatically during the computation of the probabilities, up to a difference that is in the order of a small discrete step size; moreover, on-the-fly approaches are particularly efficient when verifying conditional reachability properties because in that case much fewer states need to be generated.

On the other hand, our approach is ultimately based on an Euler forward method to solve differential equations. This poses certain limitations on the continuous time models that can be analysed efficiently this way, in particular they should not be too stiff. For non stiff models the results are promising as shown in [41] for the available benchmark models for which also some results for global fluid model checking and statistical model checking are available in the literature.

4.2 FlyFast Front-End for Predicate-Based Coordination

Recent proposals for CAS modelling and programming languages, like [6, 20], typically assume any such a system be composed of a set of independent components where a component is a process equipped also with a set of attributes describing features of the component. Attributes can be used in predicates appearing in the language input/output primitives. Predicate-based output/input multicast, originally proposed in [45], forms the basis of interaction schemes in languages like SCEL [20] and Carma [6]. In [15] we proposed PiFF—Predicate-based Interaction for FlyFast—a front-end modelling language for FlyFast inspired by Carma, that provides predicate-based input/output multicast actions.

In PiFF, each component consists of a behaviour—modelled, like in FlyFast, as a DTMC-like agent—and a set of attributes. The attribute name-value correspondence is kept in the current store of the component. Associated to each action there is also an (atomic) probabilistic store-update. For instance, assume components have an attribute named \(\mathsf {loc}\) which takes values in the set of points of a space, thus recording the current location of the component. The following action models a multi-cast via channel \(\alpha \) to all components in the same location as the sender, making the latter change its location randomly: \(\alpha ^*[\mathsf {loc}=\mathbf {my}.\mathsf {loc}]{\langle \rangle }{} \texttt {Jump}\). Here \(\texttt {Jump}\) is assumed to randomly update the store and, in particular attribute \(\mathsf {loc}\). The computational model of PiFF is clock-synchronous, as in FlyFast, but at the component level. In addition, each component is equipped with a local outbox. The effect of an output action \(\alpha ^*[\pi _r]{\langle \rangle }\sigma \) is to deliver output label \(\alpha {\langle \rangle }\) to the local outbox, together with the predicate \(\pi _r\), which (the store of) the receiver components will be required to satisfy, as well as the current store of the component executing the action; the current store is then updated according to update \(\sigma \). Note that output actions are non-blocking and that successive output actions of the same component overwrite its outbox. An input action \(\alpha ^*[\pi _s]()\sigma \) by a component will be executed with a probability which is proportional to the fraction of all those components whose outboxes currently contain the label \(\alpha {\langle \rangle }\), a predicate \(\pi _r\) which is satisfied by the component, and a store which satisfies predicate \(\pi _s\) in turn. If such a fraction is zero, then the input action will not take place (input is blocking), otherwise the action takes place, the store of the component is updated via \(\sigma \), and its outbox cleared.

A PiFF model specification is compiled into a FlyFast model specification by means of a (purely mechanical) translation and related bounded PCTL formulas are mechanically translated as well. For the sake of simplicity, we do not describe the translation here; the interested reader can find its definition in [15], where the formal stochastic semantics of PiFF are also given and the translation is shown correct with respect to such semantics; optimisation of the translation is dealt with in [44]. In particular, in [44], a bisimilarity based state-reduction strategy for the target model specification is proposed.

5 Conclusions

Model-checking has proven to be an effective and successful formal verification technique. Initially developed for qualitative models and logics, it has been extended also to quantitative models and logics such as DTMCs and PCTL as well as CTMCs and CSL. It is well known that model-checking suffers from the state-space explosion problem, which makes the technique non-scalable and thus poorly applicable to large scale systems. On the other hand, current trends in information technology, like the Internet of Things, include systems composed of a large number of components, often acting collectively and adapting to changing conditions, the so called Collective Adaptive Systems. In this paper we have briefly described the work we have been carrying out in the area of approximated bounded PCTL model-checking of Population DTMC models. In particular we have given an introductory description of FlyFast, a mean-field, on-the-fly bounded PCTL model-checker, including an overview of its theoretical foundation, its main functionalities and a detailed example of application. A couple of extensions of the applicability of the tool have been shown as well, in the form of specific additional front-ends to the original tool; thus, the tool applicability is extended without actually modifying the tool.

There are several lines of future work of our interest. First of all, following approaches similar to those presented in [46], we plan to investigate the extension of the model-checking technique to systems with memory/rewards. Space and the spatial distribution of agents play a major role in CAS and, consequently, it should be a “first class” component of the modelling language and the underlying framework. For this reason, we have investigated Closure Spaces, a generalisation of Topological Spaces that includes discrete, graph-like, space structures, for which we have developed the Spatial Logic for Closure Spaces, SLCS and a specific model-checking algorithm [13, 14]. A subject for future research is thus to incorporate a notion of space in the FlyFast modelling language and to integrate FlyFast and topochecker, the spatial model-checker for SLCS and its extensions. The investigation of different classes of interaction probability specifications in the FlyFast modelling language and of their implications on issues like model-reduction (see e.g. [44]) is also a promising subject for future research.