1 Introduction

Runtime verification (RV), is a lightweight verification technique that may be used to determine whether the current system run respects a correctness property. Two requirements are crucial for the adoption of this technique. First, runtime monitor overheads need to be kept to a minimum so as not to degrade system performance. Second, instrumented monitors need to form part of the trusted computing base of the verification setup by adhering to an agreed notion of monitor correctness; amongst other things, this normally includes a guarantee that runtime checking corresponds (in some sense) to the property being checked for. Monitor overheads and correctness are occasionally conflicting concerns. For instance, in order to lower monitoring overheads, engineers increasingly use concurrent monitors [16, 29, 35] so as to benefit from the underlying parallel and distributed architectures. However concurrent monitors are also more susceptible to elusive errors such as non-deterministic behaviour, deadlocks or livelocks which may, in turn, affect their correctness.

Ensuring monitor correctness is, in general, non-trivial. One prominent obstacle is the fact that system properties are typically specified using one formalism, e.g. a high-level logic, whereas the respective monitors that check these properties are described using another formalism, e.g. a programming language. This, in turn, makes it hard to ascertain the semantic correspondence between the two descriptions. Automated monitor synthesis can mitigate this problem by standardising the translation from the property logic to the monitor formalism. It also gives more scope for a formal treatment of monitor correctness.

In this work, we investigate the correctness of synthesised monitors in a concurrent setting, whereby (i) the system being verified executes concurrently with the synthesised monitor (ii) the system and the monitor themselves consist of concurrent sub-components. The correctness of monitor synthesis has been studied previously by the seminal work of Geilen, [23], and (more formally) by subsequent work such as that of Sen et al. [34], and Bauer et al. [5]. Our approach differs from these studies in a number of respects. First, the aforementioned work abstracts away from the internal working of a system, representing it as a string of events/states (execution trace); this complicates reasoning about how monitors are instrumented wrt. an executing system. It also focusses on a logic that is readily amenable to runtime analysis, namely linear temporal logic (LTL) [13]. Moreover, it expresses synthesis in terms of abstract or single-threaded monitors—using either pseudocode [23] or automata [5, 34] —executing wrt. such trace. By contrast, we strive towards a more intensional formal definition of online correctness for synthesised concurrent monitors whereby, for arbitrary property \(\varphi \), the synthesised monitor \(M_\varphi \) running concurrently wrt. some system S (denoted as \(S\parallel M_\varphi \)) respects the following condition:

$$\begin{aligned}&\text {Whenever } S \parallel M_\varphi \text { executes to some }\; S'\parallel M' \qquad \text { then } \nonumber \\&\quad \left( \begin{array}{l} \text {If the execution from } S \text { to } S' \text { violates } \varphi \; \text {then} \\ \qquad \qquad \qquad \,M' \text { should consistently flag the violation}\\ \qquad \qquad \qquad \qquad \text { and }\\ \text {If } M' \text { flags a violation} \;\text {then} \;\text {the execution from } S \text { to } S' \text { should violate } \varphi \end{array}\right) \end{aligned}$$
(1)

The setting of (1) brings to the fore additional issues concerning monitor correctness:

(i):

A property logic semantics may be defined over other computational entities apart from traces, i.e. the current execution in (1), such as the entire computational tree of a program. As a result, the logic semantics may not readily lend itself to the formulation of monitor correctness outlined in condition (1) above, which only requires monitor detection whenever a violation occurs. In general, a monitored system that violates a property according to the original logic semantics, may do so along one computational path but not along another; this is often the case for concurrent systems with multiple execution paths as a result of different thread interleavings scheduled at runtime.

(ii):

Concurrent monitors may also have multiple execution paths. Condition (1) thus requires stronger guarantees than those for single-threaded monitors so as to ensure that all these paths correspond to an appropriate runtime check of system property being monitored. Stated otherwise, correct concurrent monitors must always/consistently detect violations and flag them, irrespective of their runtime interleaving.

(iii):

Apart from the formal semantics of the source logic (specifying property \(\varphi \)), we also require a formal semantics for the target languages of both the system and the monitor executing in side-by-side, i.e. \(S\parallel M_\varphi \). In most cases, the latter may not be available.

(iv):

Online monitor correctness needs to ensure that monitor execution cannot be interfered with by the system, and viceversa. Whereas adequate monitor instrumentation typically prevents direct interferences, condition (1) must consider indirect interferences such as system divergences [25, 32], i.e. infinite internal looping making the system unresponsive, which may prevent the monitors from progressing.

(v):

Ensuring correctness along the lines of (1) can be quite onerous because every execution path of the monitor running concurrently with the monitored system, \(S\!\parallel \! M_\varphi \), needs to be analysed to ensure consistent detections along every thread interleaving. Consequently, one needs to devise scalable techniques facilitating monitor correctness analysis.

We conduct our study in terms of actor-based [26] concurrent monitors written in Erlang [3, 11], an industry-strength language for constructing fault-tolerant systems. To alleviate our technical development, we also restrict monitoring to systems written in the same language. We limit ourselves to a logic that describes reactive properties, i.e. system interactions with the environment, and focus on the synthesis of asynchronous monitors, performing runtime analysis through the Erlang virtual machine (EVM)’s tracing mechanism. Despite the typical drawbacks associated with asynchrony, e.g. late detections, our monitoring setup is in line with the asynchrony advocated by the actor concurrency model, which facilitates scalable coding techniques such as fail-fast design patterns [11]. Asynchronous monitoring has also been used in other RV tools, e.g. [14, 17], and has proved to be less intrusive and easier to instrument than synchronous monitoring setups. It is also the more feasible alternative when monitoring distributed systems [16, 20]. More importantly, though, it still allows us to investigate the main issues arising from the correctness setup outlined in (1), and we expect most of the issues investigated to carry over in a straightforward fashion to purely synchronous settings.

As an expository logic for describing reactive properties, we consider an adaptation of sHML [1]—a syntactic subset of the more expressive \(\mu \)-calculus logic—describing safety, i.e. monitorable [28], properties. Our choice for this logic was, in part, motivated by the fact that the full \(\mu \)-calculus had already been adapted to describe concurrent Erlang program behaviour in [22], albeit for model-checking purposes. Given the usual drawbacks associated with full-blown model checking, our work contributes towards an investigation of lightweight verification techniques for \(\mu \)-calculus properties of Erlang programs. It also allows us to investigate how to extend runtime verification techniques to logics that were not originally intended for this verification setup. More precisely, sHML is a syntactic subset of a larger logic used to specify branching-time properties for concurrent systems [2]; our work can thus be used as a first step towards a broader investigation of which other subsets of this branching-time logic can actually be verified at runtime. Crucially, however, sHML acts as an adequate vehicle to study the monitor correctness issues set out in (1) above.

The rest of the paper is structured as follows. Section 2 discusses the formal semantics of our systems and monitor target language. Section 3 discusses reformulations to the logic facilitating the formulation of monitor correctness, discussed later in Sect. 4. Section 5 describes a synthesis algorithm for the logic and a tool built using the algorithm. Subsequently, Sect. 6 proves the correctness of this monitor synthesis. Section 8 concludes.

2 The language

We require a formal semantics for both our monitor-synthesis target language, and the systems we intend to monitor. We partially address this problem by expressing both monitors and systems in terms of the same language, i.e. Erlang, thus only requiring one semantics. However, we still need to describe the Erlang tracing semantics we intend to use for our asynchronous monitoring. Although Erlang semantic formalisations exist, e.g. [8, 22, 36], none describe this tracing mechanism. We therefore define a calculus—following [22, 36]—for modelling the tracing semantics of a (Turing-complete) subset of the Erlang language (we leave out distribution, process linking and fault-trapping mechanisms).

Fig. 1
figure 1

Erlang syntax

Figure 1 outlines the language syntax, assuming disjoint denumerable sets of process/actor identifiers \(i,j,h \in {\textsc {Pid}} \), atoms \(a,b \in {\textsc {Atom}} \), and variables \({\textit{x}} ,{\textit{y}} ,{\textit{z}} \in \textsc {Var} \). An executing Erlang program is made up of a system of actors, Actr, composed in parallel, \(A \parallel B \), where some identifiers are local (scoped) to subsystems of actors, and thus not known to the environment, e.g. \(i\) in a system \(A \!\parallel \!(\upnu \, i) B \). Individual actors, denoted as \(i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m}\), are uniquely identified by an identifier, \(i\), and consist of an expression, \(e\), executing wrt. a local mailbox, \(q\) (denoted as a list of values); as we explain later, m denotes the actor monitoring modality. Actor expressions typically consist of a sequence of variable binding \(x_i \,{\textsf {=}}\, {e _i}{\textsf {,}}\,\) terminated by an expression, \(e _{\text {final}}\):

$$\begin{aligned} x_1 \,{\textsf {=}}\, e _1{\textsf {,}}\,\;\ldots \textsf {,}\;x_n \,{\textsf {=}}\, e _n{\textsf {,}}\,\;e _{\text {final}} \end{aligned}$$

An expression \(e _i\) in a binding \(x_i \,{\textsf {=}}\, e _i{\textsf {,}}\,e _{i+1}\) is expected to evaluate to a value, v, which is then bound to \(x_i\) in the continuation expression \(e _{i+1}\). When instead \(e _i\) generates an exception, \({\textsf {exit}}\), it aborts subsequent computationsFootnote 1 in \(e _{i+k}\) for \(1 \le k \le (n-i) \). Apart from bindings, expressions may also consist of self references (to the actor’s own identifier), \({\textsf {self}}\), outputs to other actors, \(e _1 {\textsf {!}} e _2\), pattern-matching inputs from the mailbox, \({\textsf {rcv}}\, g \,{\textsf {end}}\), or pattern-matching for case-branches, \({\textsf {case}}\, e \,{\textsf {of}}\, g \,{\textsf {end}}\) (where \(g\) is a list of expressions guarded by patterns, \(p_i \,{{\rightarrow }}\, e _i\)), function applications, \( e _1(e _2)\), and actor-spawning, \({\textsf {spw}}\, e \), amongst others. Values consist of variables, \({\textit{x}} \), process ids, \(i\), recursive functions, \(\upmu {\textit{y}} .\uplambda x.e \) (where the preceding \(\upmu y\) denotes the binder for function self-reference), tuples \({\textsf {\{}}v _1,\ldots ,v _n{\textsf {\}}}\) and lists, \(l\), amongst others.

Remark 1

The functions \({{\mathrm{fv}}}(A)\) and \({{\mathrm{fId}}}(A)\) return the free variables and free process identifiers of \(A\) resp. and are defined in standard fashion. We write \(\uplambda {\textit{x}} .e \) and \(d {\textsf {,}}\,e \) for \(\upmu {\textit{y}} .\uplambda {\textit{x}} .e \) and \({\textit{y}} \,{\textsf {=}}\, d {\textsf {,}}\,e \) resp. when \({\textit{y}} \not \in {{\mathrm{fv}}}(e)\). In \(p \,{{\rightarrow }}\, e \), we replace \({\textit{x}} \) in \(p\) with _ whenever \({\textit{x}} \not \in {{\mathrm{fv}}}(e)\). We use standard shorthand for lists of binders and write \(\upmu {\textit{y}} .\uplambda ({\textit{x}} _1,\ldots ,{\textit{x}} _n).e \) and \((\upnu \, i _1,\ldots ,i _n) \,A \) resp. for \(\upmu {\textit{y}} .\uplambda {\textit{x}} _1.\ldots \uplambda {\textit{x}} _n.e \) and \((\upnu \, i _1) \ldots {(\upnu \, i _n) \,A }\). We sometimes elide mailboxes and write \(i {\textsf {[}} e {\textsf {]}}\), when these are empty, \(i {\textsf {[}} e \,\triangleleft \, \epsilon {\textsf {]}}_{}^{}\), or when they do not change in the transition rules that follow.

Specific to our formalisation, we also subject each individual actor, \(i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m}\), to a monitoring-modality, \(m,n \in \{\circ ,\bullet ,* \}\), where \(\circ \), \(\bullet \) and \(*\) denote monitored, unmonitored and tracing actors resp. Modalities play a crucial role in our language semantics, defined as a labelled transition system over systems, \(A \xrightarrow {\;\gamma \;} B \), where actions \(\gamma \in \textsc {Act} _{\tau }\), include bound output labels, \((\tilde{h})i \mathtt {!}v \), and input labels, \(i \mathtt {?}v \) and a distinguished internal label, \(\tau \). In line with the reactive properties we consider later, our formalisation only traces system interactions with the environment (send and receive messages) from monitored actors. Thus, whereas unmonitored, \(\bullet \), and tracing, \(*\), actors have standard input and output transition rules

$$\begin{aligned} \textsc {SndU}\frac{m \in \{\bullet ,* \}}{ j {\textsf {[}} i {\textsf {!}} v \,\triangleleft \, q {\textsf {]}}_{}^{m} \;\xrightarrow {\;i \mathtt {!}v \;}\; j {\textsf {[}} v \,\triangleleft \, q {\textsf {]}}_{}^{m} } \qquad \textsc {RcvU} \frac{m \in \{\bullet ,* \} }{ i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m} \;\xrightarrow {\;i \mathtt {?}v \;}\; i {\textsf {[}} e \,\triangleleft \, q \!\mathop {:}\!v {\textsf {]}}_{}^{m} } \end{aligned}$$

actors with a monitored modality, \(\circ \), i.e. actors \(j\) and \(i\) in rules SndM and RcvM below, produce a residual message reporting the send and receive interactions (\({\textsf {\{}}{\textsf {sd}},i,v {\textsf {\}}}\) and \({\textsf {\{}}{\textsf {rv}},i,v {\textsf {\}}}\) resp.) at the tracer’s mailbox i.e. actor \(h\) with modality \(*\) in the rules below; this models closely the tracing mechanism offered by the Erlang virtual machine (EVM) [11]. In our target language, the list of report messages at the tracer’s mailbox constitutes the system trace to be used for asynchronous monitoring.

$$\begin{aligned}&\textsc {SndM} \frac{}{ j {\textsf {[}} i {\textsf {!}} v \,\triangleleft \, q {\textsf {]}}_{}^{\circ } \parallel h {\textsf {[}} d \,\triangleleft \, r {\textsf {]}}_{}^{*} \;\xrightarrow {\;i \mathtt {!}v \;}\; j {\textsf {[}} v \,\triangleleft \, q {\textsf {]}}_{}^{\circ } \parallel h {\textsf {[}} d \,\triangleleft \, r \!\mathop {:}\!{\textsf {\{}}{\textsf {sd}},i,v {\textsf {\}}} {\textsf {]}}_{}^{*}}\\&\textsc {RcvM} \frac{ }{ i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{\circ } \parallel h {\textsf {[}} d \,\triangleleft \, r {\textsf {]}}_{}^{*} \;\xrightarrow {\;i \mathtt {?}v \;}\; i {\textsf {[}} e \,\triangleleft \, q \!\mathop {:}\!v {\textsf {]}}_{}^{\circ } \parallel h {\textsf {[}} d \,\triangleleft \, r \!\mathop {:}\!{\textsf {\{}}{\textsf {rv}},i,v {\textsf {\}}} {\textsf {]}}_{}^{*}} \end{aligned}$$
Fig. 2
figure 2

Erlang Semantics for Actor Systems

Our LTS semantics assumes well-formed actor systems, whereby every actor identifier is unique; it is termed to be a tracing semantics because a distinguished tracer actor, identified by the monitoring modality \(*\), receives messages recording external communication events by monitored actors. Formally, we write \(A \xrightarrow {\;\gamma \;} B \) in lieu of \(\langle A, \gamma , B \rangle \in \longrightarrow \), the least ternary relation satisfying the rules in Fig. 2. These rules employ evaluation contexts, denoted as \(\mathcal {C}\) (described below) specifying which sub-expressions are active. For instance, an expression is only evaluated when at the top level variable binding, \(x \,{\textsf {=}}\, \mathcal {C}{\textsf {,}}\,e \) or when the expression denoting the destination of an output has evaluated to a value, \(v {\textsf {!}} \mathcal {C}\); the other cases are also fairly standard.Footnote 2 We denote the application of a context \(\mathcal {C}\) to an expression \(e\) as \(\mathcal {C}[e ]\).

$$\begin{aligned} \mathcal {C}&\;\; {:{:}}= \;\quad [-] \;\,|\;\, \mathcal {C} {\textsf {!}} e \;\,|\;\, v {\textsf {!}} \mathcal {C} \;\,|\;\, \mathcal {C}(e) \;\,|\;\, v (\mathcal {C})\;\,|\;\, {\textsf {case}}\, \mathcal {C} \,{\textsf {of}}\, g \,{\textsf {end}} \;\,|\;\, x \,{\textsf {=}}\, \mathcal {C}{\textsf {,}}\,e \;\,|\;\, \cdots \end{aligned}$$

Communication in actor systems happens in two stages: an actor receives messages, keeping them in order in its mailbox, and then selectively reads them at a later stage using pattern-matching—rules \(\textsc {Rd1}\) and \(\textsc {Rd2}\) describe how mailbox messages are traversed in order to find the first one matching a pattern in the pattern list \(g\), releasing the respective guarded expression \(e\) as a result. We choose only to record external communication at tracer processes, i.e. between the system and the environment, and do not trace internally communication between actors within the system, irrespective of their modality (see Com); structural equivalence rules, \(A \equiv B \), are employed to simplify the presentation of our rules—see rule Str and the corresponding structural rules. In Par, the side-condition enforces the single-receiver property, inherent to actor systems; for instance, it prevents a transition with an action \(j \mathtt {!}v \) when actor \(j\) is part of the actor system \(B\). We note that in rule \(\textsc {Spw}\), spawned actors inherit monitorability when launched by a monitored actor, but are launched as unmonitored otherwise.

Mailbox reading—defined by the rules Rd1 and Rd2 in Fig. 2—includes pattern-matching functionality, allowing the actor to selectively choose which messages to read first from its mailbox whenever the first pattern \(p \,{{\rightarrow }}\, e \) from the pattern list \(g\) is matched, returning \(e \sigma \), where \(\sigma \) substitutes free variables in \(e\) for value binding resulting from the pattern-match; when no pattern is matched, mailbox reading blocks - see Definition 1.

Definition 1

(Pattern-Matching) We define \({{\mathrm{mtch}}}: {\textsc {PLst}} \times \textsc {Val} \rightarrow \textsc {Exp} _{\bot }\) and \({{\mathrm{vmtch}}}: {\textsc {Pat}} \times \textsc {Val} \rightarrow \textsc {Sub} _{\bot }\) as follows:

$$\begin{aligned} {{\mathrm{mtch}}}(g,v)&\mathop {=}\limits ^\mathbf{ def }{\left\{ \begin{array}{ll} e \sigma &{}\quad \text {if } g = p \,{{\rightarrow }}\, e \mathop {:}f, {{\mathrm{vmtch}}}(p,v)=\sigma \\ d &{}\quad \text {if }g = p \,{{\rightarrow }}\, e \mathop {:}f, {{\mathrm{vmtch}}}(p,v)=\bot , {{\mathrm{mtch}}}(f,v) = d \\ \bot &{}\quad \text {otherwise} \end{array}\right. }\\ {{\mathrm{vmtch}}}(p,v)&\mathop {=}\limits ^\mathbf{ def }{\left\{ \begin{array}{ll} \emptyset &{}\quad \text {if }\; p =v \text { (whenever }\; p \text { is } a,\; i \text { or } {\textsf {nil}})\\ \{v/x\} &{}\quad \text {if }\; p =x\\ \biguplus _{i=1}^n \sigma _i &{}\quad \text {if } p ={\textsf {\{}}p _1,\ldots ,p _n{\textsf {\}}}, v = {\textsf {\{}}v _1,\ldots ,v _n{\textsf {\}}} \text { where } \\ &{}\quad {{\mathrm{vmtch}}}(p _i, v _i) = \sigma _i\\ \sigma \uplus \{l/x\} &{}\quad \text {if }\; p =o \mathop {:}x, v =u \mathop {:}l \text { where } {{\mathrm{vmtch}}}(o, u) = \sigma \\ \bot &{}\quad \text {otherwise} \end{array}\right. }\\ \sigma _1 \uplus \sigma _2&\mathop {=}\limits ^\mathbf{ def }{\left\{ \begin{array}{ll} \sigma _1 \cup \sigma _2 &{} \text {if }\; {{\mathrm{dom}}}(\sigma _1) \cap {{\mathrm{dom}}}(\sigma _2) = \emptyset \\ \sigma _1 \cup \sigma _2 &{} \text {if } \; \forall v \in {{\mathrm{dom}}}(\sigma _1) \cap {{\mathrm{dom}}}(\sigma _2) . \sigma _1(v) = \sigma _2(v) \\ \bot &{} \text {if }\; \sigma _1 = \bot \text { or } \sigma _2 = \bot \\ \bot &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

Branching for mailbox pattern-matching differs from pattern-match branching for the case construct, described by the rules Cs1 and Cs2 in Fig. 2: similar to the mailbox read construct, it matches a value to the first appropriate pattern in the pattern list, launching the respective guarded expression with the appropriate variable bindings resulting from the pattern-match; if, however, no match is found it generates an exception, \({\textsf {exit}}\), which aborts subsequent computation, Ext. The rest of the transition rules, such as \(\textsc {App}\) for function application and \(\textsc {Slf}\) for retreiving the actor name, are fairly standard.

Remark 2

Our tracing semantics sits at a higher level of abstraction than that offered by the EVM [11] because trace entries typically contain more information. For instance, the EVM records internal communication between monitored actors, as an output trace entry immediately followed by the corresponding input trace entry; we here describe sanitised traces whereby consecutive matching trace entries are filtered out.

Example 1

(Non-deterministic behaviour) Our systems exhibit non-deterministic behaviour through either internal or external choices [25, 30]. Consider the actor system:

$$\begin{aligned} A \;\triangleq \; (\upnu \, j _1,j _2,h) \bigl (\,i {\textsf {[}} {\textsf {rcv}}\, {\textit{x}} \,{{\rightarrow }}\, {\textsf {obs}} {\textsf {!}} {\textit{x}} \,{\textsf {end}} \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\circ } \;\parallel \; j _1{\textsf {[}} i {\textsf {!}} v {\textsf {]}}^{\circ } \;\parallel \; j _2{\textsf {[}} i {\textsf {!}} u {\textsf {]}}^{\circ } \;\parallel \; h {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*} \,\bigr ) \end{aligned}$$

Actors \(j _1, j _2\) and \(h\) are local, i.e. scoped through the construct \((\upnu \, j _1,j _2,h) (\ldots )\), thus not visible to the environment. The monitored actor \(i\) may receive value \(v\) internally from actor \(j _1\), (2) by rule Com, read it from its mailbox, (3) by Rd1, and then output it to some environment actor \({\textsf {obs}}\), (4) by SndM, while recording this external output at \(h\) ’s mailbox (the tracer).

$$\begin{aligned} A&\; \;\xrightarrow {\;\;\;\tau \;\;\;}\; \; (\upnu \, j _1,j _2,h) \bigl (\,i {\textsf {[}} {\textsf {rcv}}\, {\textit{x}} \,{{\rightarrow }}\, {\textsf {obs}} {\textsf {!}} {\textit{x}} \,{\textsf {end}} \,\triangleleft \, v {\textsf {]}}_{}^{\circ } \;\parallel \; j _1{\textsf {[}} v {\textsf {]}}^{\circ } \;\parallel \; j _2{\textsf {[}} i {\textsf {!}} u {\textsf {]}}^{\circ } \;\parallel \; h {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*} \,\bigr ) \end{aligned}$$
(2)
$$\begin{aligned}&\;\;\xrightarrow {\;\;\;\tau \;\;\;}\; \; (\upnu \, j _1,j _2,h) \bigl (\,i {\textsf {[}} {\textsf {obs}} {\textsf {!}} v \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\circ } \;\parallel \; j _1{\textsf {[}} v {\textsf {]}}^{\circ } \;\parallel \; j _2{\textsf {[}} i {\textsf {!}} u {\textsf {]}}^{\circ } \;\parallel \; h {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*} \,\bigr ) \end{aligned}$$
(3)
$$\begin{aligned}&\;\;\xrightarrow {\;{\textsf {obs}} \mathtt {!}v \;}\; \; (\upnu \, j _1,j _2,h) \bigl (\,i {\textsf {[}} v \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\circ } \,\parallel \, j _1{\textsf {[}} v {\textsf {]}} \,\parallel \, j _2{\textsf {[}} i {\textsf {!}} u {\textsf {]}} \,\parallel \, h {\textsf {[}} e \,\triangleleft \, q \mathop {:}{\textsf {\{}}{\textsf {sd}},{\textsf {obs}},v {\textsf {\}}} {\textsf {]}}_{}^{*}\,\bigr ) \end{aligned}$$
(4)

But if actor \(j _2\) sends its value to \(i\) before \(j _1\), we observe a different external behaviour

$$\begin{aligned}&A \; \xrightarrow {\;\;\tau \;\;}\cdot \xrightarrow {\;\;\tau \;\;}\cdot \xrightarrow {\;{\textsf {obs}} \mathtt {!}u \;} \; (\upnu \, j _1,j _2,h) \\&\quad \times \bigl (\,i {\textsf {[}} u \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\circ } \,\parallel \, j _1{\textsf {[}} i {\textsf {!}} v {\textsf {]}} \,\parallel \, j _2{\textsf {[}} u {\textsf {]}} \,\parallel \, h {\textsf {[}} e \,\triangleleft \, q \mathop {:}{\textsf {\{}}{\textsf {sd}},{\textsf {obs}},u {\textsf {\}}} {\textsf {]}}_{}^{*}\,\bigr ) \end{aligned}$$

i.e. \(A\) eventually outputs \(u\) instead of \(v\) to \({\textsf {obs}}\) (accordingly monitor \(h\) would hold the entry \({\textsf {\{}}{\textsf {sd}},{\textsf {obs}},u {\textsf {\}}}\) instead); these behaviours amount to an internal choice.

External choice results when \(A\) receives different external inputs: we can derive

$$\begin{aligned}&A \xrightarrow {\;i \mathtt {?}v _1\;} B _1 \triangleq (\upnu \, \!j _1,j _2,h) \\&\quad \times {\bigl (i {\textsf {[}} {\textsf {rcv}}\, {\textit{x}} \,{{\rightarrow }}\, {\textsf {obs}} {\textsf {!}} {\textit{x}} \,{\textsf {end}} \,\triangleleft \, v _1 {\textsf {]}}_{}^{\circ } \parallel j _1{\textsf {[}} i {\textsf {!}} v {\textsf {]}}^{\circ } \parallel j _2{\textsf {[}} i {\textsf {!}} u {\textsf {]}}^{\circ } \parallel h {\textsf {[}} e \,\triangleleft \, q \mathop {:}{\textsf {\{}}{\textsf {rv}},i,v _1{\textsf {\}}} {\textsf {]}}_{}^{*} \bigr )} \end{aligned}$$

but also \(A \xrightarrow {\;i \mathtt {?}v _2\;} B _2\) for some appropriate \(B _2\). Subsequently, \(B _1\) can only produce the output \(B _1 \xrightarrow {\;\tau \;}^*\cdot \xrightarrow {\;{\textsf {obs}} \mathtt {!}v _1\;} B _3\) whereas from \(B _2\) can only produce \(B _2 \xrightarrow {\;\tau \;}^*\cdot \xrightarrow {\;{\textsf {obs}} \mathtt {!}v _2\;} B _4\). Note that, in the first case, \(h\) ’s mailbox in \(B _3\) is appended by entries \({\textsf {\{}}{\textsf {rv}},i,v _1{\textsf {\}}}\mathop {:}{\textsf {\{}}{\textsf {sd}},{\textsf {obs}},v _1{\textsf {\}}}\) whereas, in the second case, \(h\) ’s mailbox in \(B _4\) is appended by \({\textsf {\{}}{\textsf {rv}},i,v _2{\textsf {\}}}\mathop {:}{\textsf {\{}}{\textsf {sd}},{\textsf {obs}},v _2{\textsf {\}}}\). \(\square \)

Example 2

(Infinite Behaviour) Our systems may exhibit infinite behaviour. Actor \(A\) (below) may produce an infinite number of output actions \(j \mathtt {!}v \) (for \(v \ne {\textsf {exit}}\) ) through repeated sequences of function applications, (5) by rule App, outputs, (6) by rule SndU and variable assignments, (7) by rule Ass.

$$\begin{aligned} A \triangleq i {\textsf {[}} \bigl (\upmu y.\uplambda x.z \,{\textsf {=}}\, j {\textsf {!}} x{\textsf {,}}\, y(z)\bigr )(v) \,\triangleleft \, q {\textsf {]}}_{}^{\bullet }&\;\xrightarrow {\;\;\;\tau \;\;\;}\; i {\textsf {[}} z \,{\textsf {=}}\, j {\textsf {!}} v {\textsf {,}}\, \bigl (\upmu y.\uplambda x.z' \,{\textsf {=}}\, j {\textsf {!}} x{\textsf {,}}\, y(z')\bigr )(z) \,\triangleleft \, q {\textsf {]}}_{}^{\bullet }\end{aligned}$$
(5)
$$\begin{aligned}&\;\xrightarrow {\;j \mathtt {!}v \;}\; i {\textsf {[}} z \,{\textsf {=}}\, v {\textsf {,}}\, \bigl (\upmu y.\uplambda x.z' \,{\textsf {=}}\, j {\textsf {!}} x{\textsf {,}}\, y(z')\bigr )(z) \,\triangleleft \, q {\textsf {]}}_{}^{\bullet }\end{aligned}$$
(6)
$$\begin{aligned}&\;\xrightarrow {\;\;\;\tau \;\;\;}\; i {\textsf {[}} \bigl (\upmu y.\uplambda x.z' \,{\textsf {=}}\, j {\textsf {!}} x{\textsf {,}}\, y(z')\bigr )(v) \,\triangleleft \, q {\textsf {]}}_{}^{\bullet } \;=\; A \end{aligned}$$
(7)

Systems with infinite behaviour may also transition to a different system with each computational step. System \(B\) (below) is a slight modification to \(A\) that delegates the output to a newly spawned actor at each iteration. Using a similar sequence of transitions to (5), (6) and (7) together with rule Spw we are able to obtain the computation below, whereby the number of actors grow with each iteration.

$$\begin{aligned} B \;\triangleq \;&i {\textsf {[}} \bigl (\upmu y.\uplambda x.z \,{\textsf {=}}\, {\textsf {spw}}\, j {\textsf {!}} x {\textsf {,}}\, y(x)\bigr )(v) \,\triangleleft \, q {\textsf {]}}_{}^{\bullet } \;\bigl (\xrightarrow {\;\;\tau \;\;}\bigr )^3\cdot \xrightarrow {\;j \mathtt {!}v \;}\; (\upnu \, h) \bigl ( B \parallel h {\textsf {[}} v \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\bullet }\bigr ) \\&\qquad \qquad \qquad \;\bigl (\xrightarrow {\;\;\tau \;\;}\bigr )^3\cdot \xrightarrow {\;j \mathtt {!}v \;}\; (\upnu \, h, h ') \bigl ( B \parallel h {\textsf {[}} v \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\bullet } \parallel h '{\textsf {[}} v \,\triangleleft \, \epsilon {\textsf {]}}_{}^{\bullet }\bigr ) \;\bigl (\xrightarrow {\;\;\tau \;\;}\bigr )^3\cdot \xrightarrow {\;j \mathtt {!}v \;}\; \ldots \end{aligned}$$

Such infinite behaviour together with the non-deterministic behaviour discussed in Example 1 typically lead to state-explosion problems when systems are verified exhaustively. \(\square \)

3 The logic

In order to specify reactive properties for the actor systems discussed in Sect. 2, we consider an adaptation of SafeHML [1] (sHML), a sub-logic of the Hennessy–Milner Logic (HML) with recursion—the latter logic has been shown to be a reformulation of the expressive \(\mu \)-calculus [27]. It assumes a denumerable set of formula variables, \(X,Y \in \textsc {LVar} \), and is inductively defined by the following grammar:

$$\begin{aligned} \varphi ,\psi \in {\textsc {sHML}}&\; {:{:}}= \;\;{\textsf {ff}} \; \; | \;\; \varphi \mathbf {\wedge } \psi \; \; | \;\; \mathbf {[}\alpha \mathbf {]}\varphi \; \; | \;\; X \; \; | \;\;{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)} \end{aligned}$$

The formulas for falsity, \({\textsf {ff}}\), conjunction, \(\varphi \mathbf {\wedge } \psi \), and action necessity, \(\mathbf {[}\alpha \mathbf {]}\varphi \), are inherited from HML[25], whereas variables X and the recursion construct \({\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}\) are used to define maximal fixpoints; as expected, \({\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}\) is a binder for the free variables X in \(\varphi \), inducing standard notions of open and closed formulas. We only depart from the logic of [1] by limiting formulas to basic actions \(\alpha ,\beta \in \textsc {BAct} \), including just input, \(i \mathtt {?}v \), and unbound outputs, \(i \mathtt {!}v \), so as to keep our technical development manageable.

Remark 3

The handling of bound output actions, \((\tilde{h})i \mathtt {!}v \), is well understood [31] and does not pose problems to monitoring, apart from making action pattern-matching cumbersome (consult [24] for an example of how bound values can be matched); it also complicates instrumentation (see Sects. 4 and 5). Certain (silent) \(\tau \) labels can also be monitored using minor adaptations (see, for instance, Remark 2); they however increase substantially the size of the traces recorded, unnecessarily cluttering the tracing semantics of Sect. 2.

The semantics of our logic is defined for closed formulas, using the operation \(\varphi \{\psi /X\}\), which substitutes free occurrences of X in \(\varphi \) with \(\psi \) without introducing any variable capture. It is specified as the satisfaction relation of Definition 2 (adapted from [1]). In what follows, we write weak transitions \(A \Longrightarrow {\quad } B \) and \(A \mathop {\Longrightarrow }\limits ^{\;\alpha \;} B \), for \(A \xrightarrow {\tau }^*B \) and \(A \xrightarrow {\tau }^*\cdot \xrightarrow {\alpha }\cdot \xrightarrow {\tau }^*B \) resp. We let \(s,t \in (\textsc {BAct})^*\) range over lists of basic actions and write sequences of weak actions \(A \mathop {\Longrightarrow }\limits ^{\alpha _1}\cdots \mathop {\Longrightarrow }\limits ^{\alpha _n} B \), where \(s =\alpha _1,\ldots ,\alpha _n\), as \(A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;}B \) (or as \(A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\) when \(B\) is unimportant).

Definition 2

(Satisfiability) A relation \(\mathcal R\in {\textsc {Actr}} \times {\textsc {sHML}} \) is a satisfaction relation iff:

$$\begin{aligned} (A, {\textsf {ff}}) \in \mathcal R&\quad \text {never} \\ (A, \varphi \mathbf {\wedge } \psi ) \in \mathcal R&\quad \text {implies } (A, \varphi ) \in \mathcal R\text { and }(A, \psi ) \in \mathcal R\\ (A, \mathbf {[}\alpha \mathbf {]}\varphi ) \in \mathcal R&\quad \text {implies } (B, \varphi ) \in \mathcal R\text { whenever } A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \\ (A, {\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}) \in \mathcal R&\quad \text {implies } (A, \varphi \{{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}/X \}) \in \mathcal R\end{aligned}$$

Satisfiability, \( \models _{\text {s}} \), is the largest satisfaction relation; we write \(A \models _{\text {s}} \varphi \) for \({(A,\varphi )\in \ \models _{\text {s}} }\). It follows from standard fixed-point theory that the implications of satisfaction relation are bi-implications for Satisfiability.

Example 3

(Satisfiability) Consider the safety formula

$$\begin{aligned} \varphi _{\text {safe}}\,&\triangleq {\textsf {max}}\mathbf {(}X,\,\mathbf {[}\alpha \mathbf {]}\mathbf {[}\alpha \mathbf {]}\mathbf {[}\beta \mathbf {]}{\textsf {ff}} \, \mathbf {\wedge } \,\mathbf {[}\alpha \mathbf {]}X \,\mathbf {)} \end{aligned}$$
(8)

stating that a satisfying actor system should never perform a sequence of two external actions \(\alpha \) followed by the external action \(\beta \) (through the subformula \(\mathbf {[}\alpha \mathbf {]}\mathbf {[}\alpha \mathbf {]}\mathbf {[}\beta \mathbf {]}{\textsf {ff}} \)), and that this needs to hold after every \(\alpha \) action (through \(\mathbf {[}\alpha \mathbf {]}X \)); effectively the formula states that trace sequences of \(\alpha \)-actions greater than two cannot be followed by a \(\beta \)-action.

A system \(A _1\) exhibiting just the external behaviour \(A _1 \mathop {\Longrightarrow }\limits ^{\;\alpha \beta \;}\) satisfies \(\varphi _{\text {safe}}\) , as would a system \(A _2\) with just the infinite behaviour \(A _2 \mathop {\Longrightarrow }\limits ^{\;\alpha \;} A _2\). System \(A _3\), with a trace \(A _3 \mathop {\Longrightarrow }\limits ^{\;\alpha \alpha \beta \;}\), does not satisfy this property, \(A _3 \not \models _{\text {s}} \varphi _{\text {safe}}\,\), according to Definition 2. However, this system may be capable of producing other external traces at runtime (e.g. as a result of some internal choice). In fact, if \(A_3\) exhibits the alternate behaviour \(A _3 \mathop {\Longrightarrow }\limits ^{\;\beta \;}\), one would be unable to observe any violation from \(A_3\) at runtime since the external trace \(\beta \) is allowed by \(\varphi _{\text {safe}}\) .

Since actors may violate a property along one execution but satisfy it along another, the inverse of \( \models _{\text {s}} \), i.e. \(A \not \models _{\text {s}} \varphi \), is too coarse to be used for a definition of monitor correctness along the lines of (1) discussed earlier. We thus define a violation relation, Definition 3, characterising actors violating a property along a specific execution trace.

Definition 3

(Violation) The violation relation, denoted as \(\models _{\text {v}}\), is the least relation of the form \(({\textsc {Actr}} \times \textsc {BAct} ^*\times {\textsc {sHML}})\) satisfying the following rules:

$$\begin{aligned} A, s \models _{\text {v}}{\textsf {ff}}&\quad \text {always} \\ A, s \models _{\text {v}}\varphi \mathbf {\wedge } \psi&\quad \text {if } A, s \models _{\text {v}}\varphi \text { or } A, s \models _{\text {v}}\psi \\ A, \alpha s \models _{\text {v}}\mathbf {[}\alpha \mathbf {]}\varphi&\quad \text {if } A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \text { and } B, s \models _{\text {v}}\varphi \\ A, s \models _{\text {v}}{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}&\quad \text {if } A, s \models _{\text {v}}\varphi \{{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}/X \} \end{aligned}$$

We write \(A, s \models _{\text {v}}\varphi \) in lieu of \((A,s,\varphi ) \in \ \models _{\text {v}}\). It also follows from standard fixed-point theory that the constraints of the violation relation are bi-implications.

Example 4

(Violation) Recall the safety formula \(\varphi _{\text {safe}}\) defined in (8). Actor \(A _3\), from Example 3, together with the witness violating trace \(\alpha \alpha \beta \) violate \(\varphi _{\text {safe}}\) , i.e. \((A _3, \alpha \alpha \beta ) \models _{\text {v}}\varphi _{\text {safe}}\,\). However, \(A _3\) together with trace \(\beta \) do not violate \(\varphi _{\text {safe}}\) , i.e. \((A _3, \beta ) \not \models _{\text {v}}\varphi _{\text {safe}}\,\). Definition 3 relates a violating trace with an actor only when that trace leads the actor to a violation: if \(A _3\) cannot perform the trace \(\alpha \alpha \alpha \beta \), by Definition 3, we have \((A _3, \alpha \alpha \alpha \beta ) \not \models _{\text {v}}\varphi _{\text {safe}}\,\), even though the trace is prohibited by \(\varphi _{\text {safe}}\) . A violating trace may also lead a system to a violation before its end, e.g. \((A _3, \alpha \alpha \beta \alpha ) \models _{\text {v}}\varphi _{\text {safe}}\,\) according to Definition 3. \(\square \)

Despite the technical discrepancies between Definition 2 and Definition 3—e.g. the use of maximal versus minimal fixpoints and a differing model—we show that Definition 3 corresponds, in some sense, to the dual of Definition 2.

Theorem 1

(Correspondence) \(\exists s. (A, s) \models _{\text {v}}\varphi \quad \text {iff} \quad A \not \models _{\text {s}} \varphi \)

Proof

For the if case we prove the contrapositive, i.e. that \(\forall s. A, s \not \models _{\text {v}}\varphi \;\text {implies}\; A \models _{\text {s}} \varphi \) by co-inductively showing that \(\mathcal R=\{(A,\varphi )\;|\; \forall s. A, s \not \models _{\text {v}}\varphi \}\) is a satisfaction relation. The proof is by induction on the structure of \(\varphi \). We outline two cases (see [19] for the rest):

\(\varphi \mathbf {\wedge } \psi \)::

From the definition of \(\mathcal R\) we know that \(\forall s. A, s \not \models _{\text {v}}\varphi \mathbf {\wedge } \psi \) and, by Definition 3, this implies that \(\forall s. \bigl (A, s \not \models _{\text {v}}\varphi \text { and } A, s \not \models _{\text {v}}\psi \bigr )\). Distributing the universal quantification yields \(\forall s. A, s \not \models _{\text {v}}\varphi \) and \(\forall s. A, s \not \models _{\text {v}}\psi \), and by the definition of \(\mathcal R\)we obtain \((A,\varphi )\in \mathcal R\) and \((A,\psi )\in \mathcal R\), as required for satisfiability relations by Definition 2.

\(\mathbf {[}\alpha \mathbf {]}\varphi \)::

From the definition of \(\mathcal R\) we know that \(\forall s. A, s \not \models _{\text {v}}\mathbf {[}\alpha \mathbf {]}\varphi \). In particular, for all \(s =\alpha t \), we know that \(\forall t. A, \alpha t \not \models _{\text {v}}\mathbf {[}\alpha \mathbf {]}\varphi \). From Definition 3 it must be the case that whenever \(A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \) we have that \(\forall t. B, t \not \models _{\text {v}}\varphi \), which in turn implies that \((B,\varphi ) \in \mathcal R\) (from the definition of \(\mathcal R\)); this is the implication required by Definition 2.

For the only-if case we prove \(\exists s. A, s \models _{\text {v}}\varphi \;\text {implies}\; A \not \models _{\text {s}} \varphi \) by rule induction on \(A, s \models _{\text {v}}\varphi \). Note that \(A \not \models _{\text {s}} \varphi \) means that there does not exist any satisfiability relation including the pair \((A,\varphi )\). Again we outline the main cases (see [19] for the rest):

  • \(A, s \models _{\text {v}}\mathbf {[}\alpha \mathbf {]}\varphi \) because \(s =\alpha s ', A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \text { and } B, s ' \models _{\text {v}}\varphi \): By \(B, s ' \models _{\text {v}}\varphi \) and I.H. we obtain \(B \not \models _{\text {s}} \varphi \), and subsequently, by \(A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \), we conclude that \(A \not \models _{\text {s}} \mathbf {[}\alpha \mathbf {]}\varphi \).

  • \(A, s \models _{\text {v}}{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)} \) because \(A, s \models _{\text {v}}\varphi \{{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}/X \}\): By \(A, s \models _{\text {v}}\varphi \{{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}/X \}\) and I.H. we obtain \(A \not \models _{\text {s}} \varphi \){\({\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}\)/\(X\) } which, in turn, implies that \(A \not \models _{\text {s}} {\textsf {max}}\mathbf {(}X,\varphi \mathbf {)} \).\(\square \)

Definition 3 and Theorem 1 allows us to show that every formula \(\varphi \in {\textsc {sHML}} \) denotes a safety language, as defined in [12, 28].Footnote 3 Informally, this states that whenever we determine that \(A \not \models _{\text {s}} \varphi \) along a particular trace \(s\), such a judgement is preserved for all extensions of that trace. This property, in turn, implies that all formulas in sHML are monitorable [5]. We formally prove this result in Theorem 2, assuming the standard notion of trace prefixes i.e. \(s \le t \) iff \(\exists . s ' \text { such that } t = s s '\).

Theorem 2

(Safety Properties and sHML) \(A, s \models _{\text {v}}\varphi \) and \(s \le t \) implies \(A, t \models _{\text {v}}\varphi \)

Proof

By rule induction on \(A, s \models _{\text {v}}\varphi \). We outline the main cases leaving the rest for [19]:

  • \(A, s \models _{\text {v}}\varphi \mathbf {\wedge } \psi \) because \(A, s \models _{\text {v}}\varphi \): By \(A, s \models _{\text {v}}\varphi \), \(s \le t \) and I.H. we obtain \(A, t \models _{\text {v}}\varphi \) which, by the same rule, implies \(A, t \models _{\text {v}}\varphi \mathbf {\wedge } \psi \).

  • \(A, s \models _{\text {v}}\mathbf {[}\alpha \mathbf {]}\varphi \) because \(s =\alpha s ', A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \text { and } B, s ' \models _{\text {v}}\varphi \): From \(s =\alpha s '\) and \(s \le t \) we know that \(s ' \le t '\) for some \(t '\) where \(t = \alpha t '\). By \(s ' \le t '\), \(B, s ' \models _{\text {v}}\varphi \) and I.H. we obtain \(B, t ' \models _{\text {v}}\varphi \) and by \(A \mathop {\Longrightarrow }\limits ^{\;\alpha \,\;} B \) and \(t = \alpha t '\) we derive \(A, t \models _{\text {v}}\mathbf {[}\alpha \mathbf {]}\varphi \). \(\square \)

4 Correctness

Specifying online monitor correctness is complicated by the fact that, in general, we have limited control over the behaviour of the systems being monitored. For starters, a system that does not satisfy a property may still exhibit runtime behaviour that does not violate it, as discussed earlier in the case of system \(A _3\) of Example 3 and Example 4. We deal with system non-determinism by only requiring monitor detection when the system performs a violating execution: this can be expressed through the violation relation of Definition 3.

At runtime, a system may also interfere with the execution of monitors. Appropriate instrumentation can limit system effects on the monitors. In our asynchronous actor setting, direct interferences from the system to the monitors can be precluded by (i) locating the monitors at process identifiers not known to the system (ii) preventing the monitors from communicating these identifiers to the system. These measures inhibit the system’s ability to send messages to the monitors.

A system may also interfere with monitor executions indirectly by diverging, i.e. infinite internal computation (\(\tau \)-transitions) without external actions. This can prevent the monitors from progressing during their execution and thus postpone indefinitely violation detections [32]. In our case, divergence is handled, in part, by the EVM itself, which guarantees fair executions for concurrent actors [11]. In settings where fair executions may be assumed, it suffices to require a weaker property for monitors, reminiscent of the condition in fair/should-testing [33]. Definition 4 states that, for an arbitrary basic action \(\alpha \), an actor system \(A\) satisfies the predicate should-\(\alpha \) if, for all sequences of internal actions leading to some system \(B\), there always exists an execution from \(B\) that can produce the action \(\alpha \); in the case of monitors, the external should-action is set to a reserved violation-detection action, e.g., \({\textsf {{fail}}} {\textsf {!}} \).

Definition 4

(Should- \(\alpha \)) \(\;A \Downarrow _{\alpha } \;\;\mathop {=}\limits ^\mathbf{ def }\;\; \bigl (\; A \;\mathop {\Longrightarrow }\limits ^{\quad } \; B \quad \text { implies }\quad B \mathop {\Longrightarrow }\limits ^{\;\;\alpha \;\;} \;\bigr )\)

We limit monitoring to monitorable systems, where all actors are subject to a monitorable modality, i.e. modality \(\circ \).

$$\begin{aligned} A \equiv (\upnu \, \tilde{h}) \bigl (i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m} \parallel B \bigr ) \quad \text {implies}\quad m =\circ \end{aligned}$$

This guarantees that (i) they can be composed with a tracer actor (ii) all the basic actions produced by the system are recorded as trace entries at the tracer’s mailbox.Footnote 4 Monitor correctness is defined for (unmonitored) basic systems, satisfying the condition:

$$\begin{aligned} A \equiv (\upnu \, \tilde{h}) \bigl (i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m} \parallel B \bigr ) \quad \text {implies}\quad m = \bullet \end{aligned}$$

which are instrumented to execute in parallel with the monitor. Our instrumentation is defined through the operation \(\lceil - \rceil \), Definition 5, converting basic systems to monitorable ones using trace/2 and set_on_spawn Erlang commands [11]; see Lemma 1. Importantly, instrumentation does not affect the visible behaviour of a basic system; see Lemma 2.

Definition 5

(Instrumentation) \(\lceil - \rceil :{:}{\textsc {Actr}} \rightarrow {\textsc {Actr}} \) is defined inductively as:

$$\begin{aligned} \lceil i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m} \rceil&\;\mathop {=}\limits ^\mathbf{ def }\; i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{\circ }&\lceil B \parallel C \rceil&\;\mathop {=}\limits ^\mathbf{ def }\; \lceil B \rceil \parallel \lceil C \rceil&\lceil (\upnu \, i) B \rceil&\;\mathop {=}\limits ^\mathbf{ def }\; (\upnu \, i) \lceil B \rceil \end{aligned}$$

Lemma 1

If \(A\) is a basic system then \(\lceil A \rceil \) is monitorable.

Lemma 2

For all basic actors \(A\) where \(i \not \in {{\mathrm{fId}}}(A)\):

$$\begin{aligned} A \xrightarrow {\;\alpha \;}B&\text {iff} {\left\{ \begin{array}{ll} (\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*}\bigr ) \xrightarrow {\;\;j \mathtt {!}v \;\;} (\upnu \, i) \bigl (\lceil B \rceil \parallel i {\textsf {[}} e \,\triangleleft \, q \mathop {:}{\textsf {\{}}{\textsf {sd}},j,v {\textsf {\}}} {\textsf {]}}_{}^{*}\bigr ) &{}\quad \text {if}\; \alpha =j \mathtt {!}v \\ (\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*}\bigr ) \xrightarrow {\;\;j \mathtt {?}v \;} (\upnu \, i) \bigl (\lceil B \rceil \parallel i {\textsf {[}} e \,\triangleleft \, q \mathop {:}{\textsf {\{}}{\textsf {rv}},j,v {\textsf {\}}} {\textsf {]}}_{}^{*}\bigr ) &{}\quad \text {if}\; \alpha \!=\!j \mathtt {?}v \\ (\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*}\bigr ) \xrightarrow {\;\;\tau \;\;} (\upnu \, i) \bigl (\lceil B \rceil \parallel i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*}\bigr ) &{}\quad \text {if}\; \alpha =\tau \end{array}\right. } \end{aligned}$$

We are now in a position to state monitor correctness, for some predefined violation-detection monitor action \({\textsf {{fail}}} {\textsf {!}} \), Definition 6; in what follows, \({\textsf {fail}}\) is always assumed to be fresh. We restrict our definition to expressions \(e\) located at a fresh scoped location \(i\) (not used by the system, i.e. \(i \not \in {{\mathrm{fId}}}(A)\)) with an empty mailbox, \(\epsilon \); expression \(e\) may then spawn concurrent submonitors while executing. The definition can be extended to generic concurrent monitors, i.e. multiple expressions, in straightforward fashion.

Definition 6

(Correctness) \(e \in \textsc {Exp} \) is a correct monitor for \(\varphi \in {\textsc {sHML}} \) iff for any basic actors \(A \in {\textsc {Actr}} \), \(i \not \in {{\mathrm{fId}}}(A)\), and execution traces \(s \in \bigl (\textsc {Act} \setminus \{ {\textsf {{fail}}} {\textsf {!}} \}\bigr )^*\):

$$\begin{aligned} (\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} e {\textsf {]}}^{*}\bigr ) \;\mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\; B \quad \text {implies}\quad \bigl (A, s \models _{\text {v}}\varphi \quad \text {iff}\quad B \Downarrow _{{\textsf {{fail}}} {\textsf {!}} } \bigr ) \end{aligned}$$

Definition 6 states that \(e\) correctly monitors property \(\varphi \) whenever, for any trace of environment interactions, \(s\), of a monitored system, \((\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} e \,\triangleleft \, \epsilon {\textsf {]}}_{}^{*}\bigr )\), yielding system \(B\), if \(s\) leads \(A\) to a violation of \(\varphi \), then system \(B\) should always detect it, and viceversa. It formalises the definition of monitor correctness outlined earlier in (1) from Sect. 1.

5 Automated monitor synthesis

We define a translation from sHML formulas to Erlang monitors that asynchronously analyse a system and flag an alert whenever they detect violations by the current system execution (for the respective sHML formula). This translation describes the core algorithm for a tool automating monitor synthesis from sHML formulas [21].

Despite its relative simplicity, sHML still provides opportunities for performing concurrent monitoring. The most obvious case is the translation of conjunction formulas, \(\varphi _1 \mathbf {\wedge } \varphi _2\), whereby the resulting code needs to check both sub-formulas \(\varphi _1\) and \(\varphi _2\) so as to ensure that neither is violated; since conjunctions are prevalent in many monitoring logics, we conjecture that the concepts discussed here extend in straightforward fashion to similar runtime verification settings with asynchronous concurrent monitors. More specifically, a translation in terms of two concurrent (sub)monitors, each analysing different parts of the trace so as to ensure the observation of its respective sub-formula, constitutes a natural synthesis of the conjunction formula in our target language: it adheres to recommended Erlang practices advocating for concurrency wherever possible [11], but also allows us to benefit from the advantages of concurrent monitors discussed in Sect. 1.

Example 5

(Conjunction Formulas) Consider the two sHML formulas

$$\begin{aligned} \varphi _\mathrm{{no}\_\mathrm{{dup}}\_\mathrm {ans}}&\triangleq \mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\bigl (\,{\textsf {max}}\mathbf {(}X,\, \,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,{\textsf {ff}} \; \mathbf {\wedge } \;\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,X \mathbf {)} \,\bigr ) \\ \varphi _{\text {react}\_\mathrm{ans}}&\triangleq {\textsf {max}}\mathbf {(}Y,\, \,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,{\textsf {ff}} \; \mathbf {\wedge } \;\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,Y \,\mathbf {)} \end{aligned}$$

Formula \(\varphi _\mathrm{{no}\_\mathrm{{dup}}\_\mathrm {ans}}\) requires that call actions \(\alpha _{\text {call}}\) are at most serviced by a single answer action \(\beta _{\text {ans}}\), and that this condition is invariant for any sequence of \((\alpha _{\text {call}}, \beta _{\text {ans}})\) pairs. On the other hand, formula \(\varphi _{\text {react}\_\mathrm{ans}}\) requires that answer actions are only produced in response to call actions; again this is required to be invariant for sequences of \((\alpha _{\text {call}}, \beta _{\text {ans}})\) pairs. Although one can rephrase the conjunction of the two formulas as a formula without a top-level conjunction, it is more straightforward to use two concurrent monitors executing in parallel (one for each sub-formula in \(\varphi _\mathrm{{no}\_\mathrm{{dup}}\_\mathrm {ans}} \mathbf {\wedge } \varphi _{\text {react}\_\mathrm{ans}} \)). There are also other reasons why it would be beneficial to keep the sub-formulas separate: for instance, keeping the formulas disentangled improves maintainability and separation of concerns when subformulas originate from distinct requirement specifications.Footnote 5 \(\square \)

Multiple conjunctions also arise indirectly when used under fix-point operators. When synthesising concurrent monitors analysing separate branches of such recursive properties, it is important to generate monitors that can dynamic spawn further submonitors themselves as required at runtime, so as to keep the monitoring overheads to a minimum.

Fig. 3
figure 3

Monitor Combinator generation for \(\varphi _{\text {safe}}\) of Ex. 3. a Denotation of \(\varphi _{\text {safe}}\) defined in (8). b Constructed concurrent monitors for \(\varphi _{\text {safe}}\) where \(\varphi = \mathbf {[}\alpha \mathbf {]}\mathbf {[}\alpha \mathbf {]}\mathbf {[}\beta \mathbf {]}{\textsf {ff}} \mathbf {\wedge } \mathbf {[}\alpha \mathbf {]}X \). c First expansion of the constructed monitor for \(\varphi _{\text {safe}}\) 

Example 6

(Conjunctions and Fixpoints) Recall \(\varphi _{\text {safe}}\) , from (8) in Example 3.

$$\begin{aligned} \varphi _{\text {safe}}\,&\triangleq {\textsf {max}}\mathbf {(}X,\,\mathbf {[}\alpha \mathbf {]}\mathbf {[}\alpha \mathbf {]}\mathbf {[}\beta \mathbf {]}{\textsf {ff}} \, \mathbf {\wedge } \,\mathbf {[}\alpha \mathbf {]}X \,\mathbf {)} \end{aligned}$$

Semantically, the formula represents the infinite-depth tree with an infinite number of conjunctions, as depicted in Fig. 3a. Although in practice, we cannot generate an infinite number of concurrent monitors, \(\varphi _{\text {safe}}\) will translate into possibly more than two concurrent monitors executing in parallel. The same applies for the conjunctions used in formulas \(\varphi _\mathrm{{no}\_\mathrm{{dup}}\_\mathrm {ans}}\) and \(\varphi _{\text {react}\_\mathrm{ans}}\) from Example 5. \(\square \)

Our monitor synthesis, \([\![-]\!]^\mathbf {m}:{:} {\textsc {sHML}} \rightarrow \textsc {Exp} \), given in Definition 7, takes a closed, guarded Footnote 6 sHML formula and returns an Erlang function. This function takes a map as an argument (encoded as a list of tuples from formula variables to other synthesised monitors of the same form) and releases an expression that performs the monitoring.

The map encodes an environment that maps formula variables to logical formula, introduced by the binding in the construct \({\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}\); it is used for lazy recursive unrolling of formulas so as to minimize monitoring overhead. For instance, when synthesising formula \(\varphi _{\text {safe}}\) from Example 3, the algorithm initially spawns only two concurrent submonitors, one checking for the subformula \(\mathbf {[}\alpha \mathbf {]}\mathbf {[}\alpha \mathbf {]}\mathbf {[}\beta \mathbf {]}{\textsf {ff}} \), and another one checking for the formula \(\mathbf {[}\alpha \mathbf {]}X \), as is depicted in Fig. 3b. Whenever the rightmost submonitor in Fig. 3b observes the action \(\alpha \) and reaches \(X \), it consults its environment and retrieves the respective formula bound to \(X \); this allows it to unfolds \(X \) and spawns an additonal submonitor as depicted in Fig. 3c, thereby increasing monitor overheads incrementally as needed.

As discussed earlier in Sect. 2, traces are encoded as messages ordered inside the respective mailbox of the monitor. Monitoring thus involves reading these messages from the mailbox in order, analysing them, and determining what action to take. In a system of hierarchically-organised concurrent monitors, monitor actions can either flag a violation, forward messages to other sub-monitors, or spawn new monitors.

Definition 7

(Synthesis) \([\![-]\!]^\mathbf {m}\) is defined on the structure of the sHML formula:

$$\begin{aligned}{}[\![{\textsf {ff}} ]\!]^\mathbf {m}&\;\mathop {=}\limits ^\mathbf{ def }\; \uplambda {\textit{x}} _{{\text {env}}}.{\textsf {{fail}}} {\textsf {!}} \\ [\![\varphi _1 \mathbf {\wedge } \varphi _2 ]\!]^\mathbf {m}&\;\mathop {=}\limits ^\mathbf{ def }\; {\left\{ \begin{array}{ll} \uplambda {\textit{x}} _{{\text {env}}}.\,{\textit{y}} _{\text {pid1}} \,{\textsf {=}}\, {\textsf {spw}}\, \bigl ( [\![\varphi _1]\!]^\mathbf {m}({\textit{x}} _{{\text {env}}})\bigr ) {\textsf {,}}\,\\ \qquad \, \;{\textit{y}} _{\text {pid2}} \,{\textsf {=}}\, {\textsf {spw}}\, \bigl ( [\![\varphi _2]\!]^\mathbf {m}({\textit{x}} _{{\text {env}}})\bigr ) {\textsf {,}}\,\\ \qquad \; {\textsf {fork}}({\textit{y}} _{\text {pid1}},{\textit{y}} _{\text {pid2}}) \end{array}\right. }\\ [\![\mathbf {[}\alpha \mathbf {]}\varphi ]\!]^\mathbf {m}&\;\mathop {=}\limits ^\mathbf{ def }\; {\left\{ \begin{array}{ll} \uplambda {\textit{x}} _{{\text {env}}}.{\textsf {rcv}}\\ \qquad \quad \;\;\; {{\mathrm{tr}}}(\alpha )\, \,{{\rightarrow }}\, \, [\![\varphi ]\!]^\mathbf {m}({\textit{x}} _{{\text {env}}}); \\ \qquad \quad \;\;\;\_\, \,{{\rightarrow }}\, \,{\textsf {stop}} \\ \qquad \;{\textsf {end}}\end{array}\right. }\\ [\![{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)} ]\!]^\mathbf {m}&\;\mathop {=}\limits ^\mathbf{ def }\; \uplambda {\textit{x}} _{{\text {env}}}.\,{\textit{y}} _{\text {mon}} \,{\textsf {=}}\, [\![\varphi ]\!]^\mathbf {m}{\textsf {,}}\, {\textit{y}} _{\text {mon}}({\textsf {\{}}'X',{\textit{y}} _{\text {mon}}{\textsf {\}}}\mathop {:}{\textit{x}} _{{\text {env}}})\\ [\![X ]\!]^\mathbf {m}&\;\mathop {=}\limits ^\mathbf{ def }\; \uplambda {\textit{x}} _{{\text {env}}}.\,{\textit{y}} _{\text {mon}} \,{\textsf {=}}\, {\textsf {lookUp}}('X', {\textit{x}} _{{\text {env}}}){\textsf {,}}\,\; {\textit{y}} _{\text {mon}}({\textit{x}} _{{\text {env}}}) \end{aligned}$$

Auxiliary function definitions and meta-operators:

$$\begin{aligned} {\textsf {fork}}&\;\mathop {=}\limits ^\mathbf{ def }\; \upmu {\textit{y}} _{\text {rec}}.\uplambda ({\textit{x}} _{\text {pid1}},{\textit{x}} _{\text {pid2}}).{\textsf {rcv}}\, {\textit{z}} \,{{\rightarrow }}\, \bigl ({\textit{x}} _{\text {pid1}} {\textsf {!}} {\textit{z}} {\textsf {,}}\,\,{\textit{x}} _{\text {pid2}} {\textsf {!}} {\textit{z}} \bigr ) \,{\textsf {end}}{\textsf {,}}\,\, {{\textit{y}} _{\text {rec}}}({\textit{x}} _{\text {pid1}},{\textit{x}} _{\text {pid2}})\\ {\textsf {lookUp}}&\;\mathop {=}\limits ^\mathbf{ def }\; {\left\{ \begin{array}{ll} \upmu {\textit{y}} _{\text {rec}}.\uplambda ({\textit{x}} _{\text {var}}, {\textit{x}} _{\text {map}}).{\textsf {case}}~ {\textit{x}} _{\text {map}} ~{\textsf {of}}\; ({\textsf {\{}}{\textit{x}} _{\text {var}}, {\textit{z}} _{\text {mon}}{\textsf {\}}}\mathop {:}\_)\, \,{{\rightarrow }}\, \,{\textit{z}} _{\text {mon}} \, \\ \qquad \qquad \qquad \qquad \qquad \qquad \quad \_\mathop {:}{\textit{z}} _{\text {tl}}\, \,{{\rightarrow }}\, \, {\textit{y}} _{\text {rec}}({\textit{x}} _{\text {var}}, {\textit{z}} _{\text {tl}}) \\ \qquad \qquad \qquad \qquad \qquad \qquad \quad {\textsf {nil}}\, \,{{\rightarrow }}\, \,{\textsf {exit}}\\ \qquad \qquad \qquad \quad \;{\textsf {end}}\end{array}\right. } \end{aligned}$$

In Definition 7, the synthesised monitor for formula \({\textsf {ff}}\) immediately reports a violation to some supervisor actor handling the violation; we assume that the supervisor actor is identified by the name \({\textsf {fail}}\). Conjunction formulas, \(\varphi _1 \mathbf {\wedge } \varphi _2\), translate into the spawning of the respective monitors for \(\varphi _1\) and \(\varphi _2\), i.e. the command \({\textit{y}} _{{\text {pid}}i}={{\textsf {spw}}\, \bigl ( [\![\varphi _i]\!]^\mathbf {m}({\textit{x}} _{{\text {env}}})\bigr ) }{}\) in Definition 7, and the subsequent forwarding of trace messages to these spawned monitors through the auxiliary function fork. The translated monitor for \(\mathbf {[}\alpha \mathbf {]}\varphi \) behaves as the monitor translation for \(\varphi \) once it receives a trace message encoding the occurrence of action \(\alpha \), i.e. the guarded expression \({{\mathrm{tr}}}(\alpha )\, \,{{\rightarrow }}\, \, [\![\varphi ]\!]^\mathbf {m}({\textit{x}} _{{\text {env}}})\) in Definition 7, which uses the meta-function \({{\mathrm{tr}}}(-)\) defined below:

$$\begin{aligned} {{\mathrm{tr}}}(i \mathtt {?}v) \mathop {=}\limits ^\mathbf{ def }{\textsf {\{}}{\textsf {rv}},i,v {\textsf {\}}} \quad {{\mathrm{tr}}}( i \mathtt {!}v) \mathop {=}\limits ^\mathbf{ def }{\textsf {\{}}{\textsf {sd}},i,v {\textsf {\}}} \end{aligned}$$

Importantly, the monitor for \(\mathbf {[}\alpha \mathbf {]}\varphi \) terminates if the trace message does not correspond to \(\alpha \), i.e. the guarded expression \(\_\, \,{{\rightarrow }}\, \,{\textsf {stop}} \) in Definition 7.

The translations for formulas \({\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}\) and \(X\) are best understood together. The monitor for \({\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}\) behaves like that for \(\varphi \), under the extended environment where \(X\) is mapped to the monitor for \(\varphi \), i.e. the operation \({\textsf {\{}}'X',{\textit{y}} _{\text {mon}}{\textsf {\}}}\mathop {:}{\textit{x}} _{{\text {env}}}\) in Definition 7; this effectively models the formula unrolling \(\varphi \{{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}/X \}\) from Definition 3. The monitor for \(X\) retrieves the respective monitor translation bound to \(X\) in the map using the auxiliary function lookUp, and then behaves like the monitor retrieved from the environment; the retrieved monitor is launched by applying it to the environment itself, i.e. expression \( {\textit{y}} _{\text {mon}}({\textit{x}} _{{\text {env}}})\) in Definition 7. Closed formulas ensure that map entries for the formula variables used are always present in the respective environment generated, whereas guarded formulas guarantee that formula variables, \(X\), are guarded by necessity conditions, \(\mathbf {[}\alpha \mathbf {]}\varphi \); this implements the lazy recursive unrolling of formulas and prevents infinite bound-variable expansions.

Monitor instrumentation, performed through the function Mon (defined below), spawns the synthesised function initialised to the empty environment, i.e. a \({\textsf {nil}}\) list, and then acts as a message forwarder to the spawned process, through the function mLoop (defined below), for any trace messages it receives through the tracing semantics discussed in Sect. 2.

$$\begin{aligned} {\textsf {Mon}}&\;\mathop {=}\limits ^\mathbf{ def }\; \uplambda {\textit{x}} _{\text {frm}}.\;{\textit{z}} _{\text {pid}} \,{\textsf {=}}\, {\textsf {spw}}\, \bigl ( [\![{\textit{x}} _{\text {frm}}]\!]^\mathbf {m}({\textsf {nil}})\bigr ) {\textsf {,}}\,\; {\textsf {mLoop}} ({\textit{z}} _{\text {pid}})\\ {\textsf {mLoop}}&\;\mathop {=}\limits ^\mathbf{ def }\; \upmu {\textit{y}} _{\text {rec}}.\uplambda {\textit{x}} _{\text {pid}}.\;{\textsf {rcv}}\, {\textit{z}} _{\text {msg}} \,{{\rightarrow }}\, \bigl ({\textit{x}} _{\text {pid}} {\textsf {!}} {\textit{z}} _{\text {msg}}\bigr ) \,{\textsf {end}}{\textsf {,}}\,\; {\textit{y}} _{\text {rec}}({\textit{x}} _{\text {pid}}) \end{aligned}$$

Example 7

(Synthesised monitor) Recall \(\varphi _{\text {react}\_\mathrm{ans}}\) from Example 5:

$$\begin{aligned} \varphi _{\text {react}\_\mathrm{ans}}&\triangleq {\textsf {max}}\mathbf {(}Y,\, \,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,{\textsf {ff}} \; \mathbf {\wedge } \;\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,Y \,\mathbf {)} \end{aligned}$$

According to Definition 7, its respective monitor translation is the one described below.

Once the translated function above is applied to the function Mon (defined above), it is spawned as a concurrent actor where variable \({\textit{x}} _{{\text {env}}}\) (below) is instantiated to the empty environment, \({\textsf {nil}}\). In turn, the spawned function launches two further concurrent actors that monitor for the subformulas \(\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,{\textsf {ff}} \) and \(\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,Y \), binding their respective pIds to variables \({\textit{y}} _{\text {pid1}}\) and \({\textit{y}} _{\text {pid2}}\) below. Note that these two actors are instantiated with the extended environment \({\textsf {\{}}'Y ', e _{\text {unf}}{\textsf {\}}}\mathop {:}{\textsf {nil}}\), mapping the formula variable \(Y\) to the unfolding function \(e _{\text {unf}}\) (labelled below). This function is retrieved and executed by the actor monitoring for the subformula \(\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,Y \), after reading two consecutive messages from its mailbox of the form \(\alpha _{\text {call}}\) and \(\beta _{\text {ans}}\). In this setup, the residual expressions of Mon and \([\![\varphi _{\text {react}\_\mathrm{ans}} ]\!]^\mathbf {m}\) (after their respective actor spawnings) behave as trace forwarders to the two concurrent actors monitoring for subformula \(\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,{\textsf {ff}} \) and subformula \(\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,Y \).

$$\begin{aligned}&[\![\varphi _{\text {react}\_\mathrm{ans}} ]\!]^\mathbf {m} = [\![{\textsf {max}}\mathbf {(}Y,\, \,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,{\textsf {ff}} \; \mathbf {\wedge } \;\mathbf {[}\alpha _{\text {call}} \mathbf {]}\,\mathbf {[}\beta _{\text {ans}} \mathbf {]}\,Y \,\mathbf {)} ]\!] =\\&\uplambda {\textit{x}} _{{\text {env}}}.\\&\;\;{\textit{y}} _{\text {mon}}=\\&\quad \left. {\left( \begin{array}{l} \uplambda {\textit{x}} _{{\text {env}}}'.\,\\ \;\;{\textit{y}} _{\text {pid1}} \,{\textsf {=}}\, {\textsf {spw}}{}\\ \qquad \left( \left( \begin{array}{l} \uplambda {\textit{x}} _{{\text {env}}}''.\\ \quad {\textsf {rcv}}{{\mathrm{tr}}}(\beta _{\text {ans}})\, \,{{\rightarrow }}\, \, \uplambda {\textit{x}} _{{\text {env}}}'''.{\textsf {{fail}}} {\textsf {!}} ({\textit{x}} _{{\text {env}}}''); \\ \qquad \quad \_\, \,{{\rightarrow }}\, \,{\textsf {stop}} \\ \quad {\textsf {end}}\end{array}\right) \left( {\textit{x}} _{{\text {env}}}'\right) \right) {\textsf {,}}\,\\ \;\;{\textit{y}} _{\text {pid2}} \,{\textsf {=}}\, {\textsf {spw}}{}\\ \qquad \left( \left( \begin{array}{l} \uplambda {\textit{z}} _{{\text {env}}}.\\ \quad {\textsf {rcv}}{{\mathrm{tr}}}(\alpha _{\text {call}})\, \,{{\rightarrow }}\, \\ \qquad \left( \begin{array}{l} \uplambda {\textit{z}} _{{\text {env}}}'.\\ \;{\textsf {rcv}}{{\mathrm{tr}}}(\beta _{\text {ans}})\, \,{{\rightarrow }}\, \\ \;\;\qquad \uplambda {\textit{z}} _{{\text {env}}}''. \left( \begin{array}{l} {\textit{y}} _{\text {mon}} \,{\textsf {=}}\, {\textsf {lookUp}}('Y ', {\textit{z}} _{{\text {env}}}''){\textsf {,}}\,\\ {\textit{y}} _{\text {mon}}({\textit{z}} _{{\text {env}}}'') \end{array}\right) ({\textit{z}} _{{\text {env}}}')\\ \;\qquad \_\, \,{{\rightarrow }}\, \,{\textsf {stop}} \\ \;\; {\textsf {end}}\end{array}\right) \left( {\textit{z}} _{{\text {env}}}\right) ;\\ \qquad \_\, \,{{\rightarrow }}\, \,{\textsf {stop}} \\ \quad {\textsf {end}}\end{array}\right) \left( {\textit{x}} _{{\text {env}}}'\right) \right) {\textsf {,}}\,\\ \; \; {\textsf {fork}}({\textit{y}} _{\text {pid1}},{\textit{y}} _{\text {pid2}})\end{array} \right) } \quad \right\} e _{\text {unf}}\\&\, {\textit{y}} _{\text {mon}}({\textsf {\{}}'Y ',{\textit{y}} _{\text {mon}}{\textsf {\}}}\mathop {:}{\textit{x}} _{{\text {env}}}) \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \end{aligned}$$

\(\square \)

5.1 Tool implementation

We have constructed a tool called detectEr [21], that implements the monitor synthesis of Definition 7: given an sHML formula it generates a monitor that can be instrumented with minimal changes to the system (actual Erlang code), as discussed earlier in Sect. 4. Despite not being the main focus of this work, we conducted preliminary empirical experiments evaluating the performance of our synthesised monitors. This was carried out using a simulated server that launches individual workers to handle a series of requests from individual clients; we also injected faults making certain workers non-deterministically behave erratically. We synthesised monitors to check that each worker respects the no-duplicate-reply property from Example 5:

$$\begin{aligned} \varphi _{wrkr}&\;\triangleq \; \mathbf {[}{\textit{wrk}}\mathtt {?}{\textit{req}}\mathbf {]}\,\bigl (\,{\textsf {max}}\mathbf {(}X,\, \,\mathbf {[}{\textit{clnt}}\mathtt {!}{\textit{rply}}\mathbf {]}\,\mathbf {[}{\textit{clnt}}\mathtt {!}{\textit{rply}}\mathbf {]}\,{\textsf {ff}} \; \mathbf {\wedge } \;\mathbf {[}{\textit{clnt}}\mathtt {!}{\textit{rply}}\mathbf {]}\,\mathbf {[}{\textit{wrk}}\mathtt {?}{\textit{req}}\mathbf {]}\,X \mathbf {)} \,\bigr ) \end{aligned}$$

and calculated the overheads incurred for varying number of client requests (i.e. concurrent workers); we also compared this with the performance a monitor that checks for property violations in sequential fashion. Tests were carried out on an Intel Core i7 processor with 8GB of RAM, running Microsoft Windows 8 and EVM version R15B02. The results, summarised in the table below, show that our synthesised concurrent monitoring yields acceptable overheads that are consistently lower than those of a sequential monitor. We conjecture that this discrepancy can be increased further when monitoring for recursive properties with longer chains of necessity formulas. For more extensive empirical results relating to the tool detectEr, see [9].

No. of reqs.

Unmonitored

Sequential

Concurrent

 

Time (\(\mu \)s)

Time (\(\mu \)s)

Ovhd (%)

Time (\(\mu \)s)

Ovhd. (%)

Improv. (%)

250

117.813

121.667

3.27

118.293

0.40

2.86

350

185.232

202.500

9.32

194.793

5.16

4.16

450

237.606

248.333

4.51

242.380

2.01

2.51

550

286.461

319.167

11.42

308.853

7.82

3.60

650

345.543

372.232

7.72

354.333

2.54

5.18

6 Proving correctness

The preliminary results obtained in Sect. 5 advocate for the feasibility of using concurrent monitors. We however still need to show that the monitors synthesised are correct. Definition 6 allows us to state one of the main results of the paper, Theorem 3.

Theorem 3

(Correctness) For all \(\varphi \in {\textsc {sHML}} \), \( {\textsf {Mon}} (\varphi )\) is a correct monitor for \(\varphi \).

Proving Theorem 3 directly can be an arduous task: for any sHML formula, it requires reasoning about all the possible execution paths of any monitored system in parallel with the instrumented monitor. We propose a formal technique for alleviating the task of ascertaining the monitor correctness of Definition 6 by teasing apart three separate (weaker) monitor-conditions: they are referred to as Violation Detectability, Detection Preservation and Monitor Separability.

These conditions are important properties in their own right—for instance, detection preservation requires the monitor to behave deterministically wrt. violation detections, whereas Monitor Separability requires that the monitor computation does not affect the execution of the monitored system. Moreover, the three conditions pose advantages to the checking of monitor correctness: since these conditions are independent to one another, they can be checked in parallel by distinct analysing entities; alternatively, the conditions that are inexpensive to check may be carried out before the more expensive ones, thus acting as vetting phases that abort early and keep the analysis cost to a minimum. More importantly though, the three conditions together imply our original monitor-correctness criteria.

The first sub-property is Violation Detectability, Lemma 3, guaranteeing that every violating trace \(s\) of formula \(\varphi \) is detectable by the respective synthesised monitor,Footnote 7 (the only-if case) and that there are no false detections (the if case). This property is easier to verify than Theorem 3 since it requires us to consider the execution of the monitor in isolation and, more importantly, requires us to verify the existence of an execution path that detects the violation; concurrent monitors typically have multiple execution paths and Theorem 3 requires us to prove this property for all of the possible monitor execution paths.

Lemma 3

(Violation Detectability) For basic \(A \in {\textsc {Actr}} \) and \(i \not \in {{\mathrm{fId}}}(A)\), \(A \mathop {\Longrightarrow }\limits ^{\;s \;}\) implies:

$$\begin{aligned} A, s \models _{\text {v}}\varphi \quad \text {iff}\quad i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$

Detection preservation (Lemma 4), the second sub-property, is not concerned with relating detections to the violations specified by our logic semantics, \(A, s \models _{\text {v}}\varphi \). Instead it guarantees that if a monitor can potentially detect a violation, further reductions do not exclude the possibility of this detection. In the case where monitors always have a finite reduction wrt. their mailbox contents (as it turns out to be the case for monitors synthesised by Definition 7) this condition guarantees that the monitor will detect violations deterministically. More generally, however, in a setting that guarantees fair actor executions, Lemma 4 ensures that detection will always eventually occur, even when monitors execute in parallel with other, potentially divergent, systems.

Lemma 4

(Detection Preservation) For all \(\varphi \in {\textsc {sHML}} \), \(q \in \textsc {Val} ^*\)

$$\begin{aligned} \Bigl (i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \;\text { and }\; i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\quad } B \Bigl )\quad \text { implies }\quad B \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$

The third sub-property is Separability, Lemma 5, which implies that the behaviour of a (monitored) system is independent of the monitor and, dually, the behaviour of the monitor depends, at most, on the trace generated by the system.

Lemma 5

(Monitor Separability) For all basic \(A \in {\textsc {Actr}}, i \not \in {{\mathrm{fId}}}(A), \varphi \in {\textsc {sHML}},\) and \(s \in \bigl (\textsc {Act} \setminus \{ {\textsf {{fail}}} {\textsf {!}} \}\bigr )^*\),

$$\begin{aligned}&(\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} B \text { implies } \exists B ',B '' {\textit{s.t.}} \\&\qquad B \equiv (\upnu \, i) \bigl (B '\parallel B ''\bigr ) \quad \text {and}\quad A \mathop {\Longrightarrow }\limits ^{\;s \;} A ' \;{\textit{s.t.}} \; B ' = \lceil A ' \rceil \quad \text {and}\quad i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\quad } B '' \end{aligned}$$

These three properties suffice to show monitor correctness.

Theorem 3 (Correctness). For all \(\varphi \in {\textsc {sHML}} \), \( {\textsf {Mon}} (\varphi )\) is a correct monitor for \(\varphi \).

Proof

According to Definition 6 we have to show:

$$\begin{aligned} (\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \;\mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\; B \quad \text {implies}\quad \bigl (A, s \models _{\text {v}}\varphi \quad \text {iff}\quad B \Downarrow _{{\textsf {{fail}}} {\textsf {!}} } \bigr ) \end{aligned}$$

We assume that \((\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \;\mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\; B \) holds and consider the two sides of the bi-implication separately. For the only-if case, we assume

$$\begin{aligned}&(\upnu \, i) \bigl (A \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \;\mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\; B \end{aligned}$$
(9)
$$\begin{aligned}&A, s \models _{\text {v}}\varphi \end{aligned}$$
(10)

In order to show \(B \Downarrow _{{\textsf {{fail}}} {\textsf {!}} }\), we use Definition 4 (Should- \(\alpha \) ) to expand \(B \Downarrow _{{\textsf {{fail}}} {\textsf {!}} }\). We thus also assume \(B \mathop {\Longrightarrow }\limits ^{\quad } B '\), for arbitrary \(B '\), and then be required to prove that \(B ' \mathop {\Longrightarrow }\limits ^{{\textsf {{fail}}} {\textsf {!}} }\). From (9), \(B \mathop {\Longrightarrow }\limits ^{\quad } B '\) and Lemma 5 (Monitor Separability) we know

$$\begin{aligned} \exists B '',B ''' {\textit{s.t.}} \;&B ' \equiv (\upnu \, i) \bigl (B ''\parallel B '''\bigr ) \end{aligned}$$
(11)
$$\begin{aligned}&A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} A ' \text { for some }A '\text { where } \lceil A ' \rceil = B ''\end{aligned}$$
(12)
$$\begin{aligned}&i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} B ''' \end{aligned}$$
(13)

From (12), (10) and Lemma 3 (Violation Detectability) we obtain

$$\begin{aligned}&\,i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{{\textsf {{fail}}} {\textsf {!}} } \end{aligned}$$
(14)

and from (13) , (14) and Lemma 4 (Detection Preservation) we get \(B ''' \mathop {\Longrightarrow }\limits ^{{\textsf {{fail}}} {\textsf {!}} }\). Hence, by (11), and standard transition rules for parallel composition and scoping, Par and Scp, we can reconstruct \(B ' \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\), as required.

For the if case we assume:

$$\begin{aligned}&(\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} B \end{aligned}$$
(15)
$$\begin{aligned}&B \Downarrow _{{\textsf {{fail}}} {\textsf {!}} } \end{aligned}$$
(16)

and have to prove \(A, s \models _{\text {v}}\varphi \). From (16) we know \(B \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\). Together with (15), this implies that there exists a sequence of reductions (\(\tau \)-transitions) from \((\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr )\) leading to a (monitored) system that flags a violation:

$$\begin{aligned} \exists B '\;{\textit{s.t.}} \;(\upnu \, i) \bigl (\lceil A \rceil \parallel i {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} B '\xrightarrow {\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$
(17)

From Lemma 5 (Monitor Separability) and (17) we obtain

$$\begin{aligned} \exists B '',B ''' {\textit{s.t.}} \;&B '=(\upnu \, i) \bigl (B ''\parallel B '''\bigr ) \end{aligned}$$
(18)
$$\begin{aligned}&A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} A ' \text { for some }A ' \text { where } \lceil A ' \rceil = B ''\end{aligned}$$
(19)
$$\begin{aligned}&i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\quad } B ''' \end{aligned}$$
(20)

From (17), (18) and the assumption that action \({\textsf {{fail}}} {\textsf {!}} \) is fresh to \(A\), we deduce that it can only be \(B '''\) that is capable of flagging the violation:

$$\begin{aligned} B ''' {\mathop {\longrightarrow }^{\textsf {fail!}}}. \end{aligned}$$
(21)

Thus, by (20) and (21), we get

$$\begin{aligned} i {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{} \mathop {\Longrightarrow }\limits ^{{\textsf {{fail}}} {\textsf {!}} } \end{aligned}$$
(22)

Therefore, by (19) , (22) and Lemma 3 (Violation Detectability) we obtain \(A, s \models _{\text {v}}\varphi \). \(\square \)

7 Proving the monitor sub-properties

The proof of monitor correctness given in Sect. 6 hinges on the three monitor sub-properties discussed in the same section. We now consider the proofs for these three monitor properties for the monitor synthesis presented in Sect. 5. In what follows, we assume that the tracer, i.e. where the synthesised monitor is placed in the instrumentation of Definition 6, is located at the process identifier \(i_{\text {mtr}}\). Moreover sequences of identifiers are denoted as \(\tilde{h}\); for instance, \((\upnu \, \tilde{j}) A \) is used to denote the system \((\upnu \, j _1) \ldots (\upnu \, j _n) A \) whenever the scoped identifiers are unimportant.

These proofs rely on an encoding of formula substitutions, \({\theta :{:} \textsc {LVar} \rightharpoonup {\textsc {sHML}}}\), partial maps from formula variables to (possibly open) formulas, to lists of tuples containing a string representation of the variable and the respective monitor translation of the formula as defined in Definition 7. Formula substitutions are denoted as lists of individual substitutions, \(\{\varphi _1/X _1\}\ldots \{\varphi _n/X _n\}\) where every \(X _i\) is distinct, and empty substitutions are denoted as \(\epsilon \).

Definition 8

(Formula Substitution Encoding)

$$\begin{aligned} {{\mathrm{enc}}}(\theta )&\mathop {=}\limits ^\mathbf{ def }{\left\{ \begin{array}{ll} {\textsf {nil}}&{} \text {when } \theta = \epsilon \\ {\textsf {\{}}'X ', [\![\varphi ]\!]^\mathbf {m}{\textsf {\}}} \mathop {:}{{\mathrm{enc}}}(\theta ') &{} \text {if } \theta = \{{\textsf {max}}\mathbf {(}X,\varphi \mathbf {)}/X \}\theta ' \end{array}\right. } \end{aligned}$$

Our monitor lookup function of Definition 7 models variable substitution, Lemma 6. We can also show that different representations of the same formula substitution do not affect the outcome of the execution of lookUp on the respective encoding, Lemma 7, which justifies the abuse of notation that assume a unique representation of a formula substitution.

Lemma 6

If \(\;\theta (X) = \varphi \;\) then \(\;i {\textsf {[}} {\textsf {lookUp}}('X ', {{\mathrm{enc}}}(\theta )) \,\triangleleft \, q {\textsf {]}}_{}^{m} \mathop {\Longrightarrow }\limits ^{\;\quad \;} i {\textsf {[}} [\![\varphi ]\!]^\mathbf {m} \,\triangleleft \, q {\textsf {]}}_{}^{m}\)

Proof

By induction on the number of mappings \(\{\varphi _1/X _1\}\ldots \{\varphi _n/X _n\}\) in \(\theta \). The base case, i.e. when we have zero mappings, is trivial since it contradicts the left side of the implication. The inductive case has two subcases and follows from the pattern-matching branches of the lookUp code, defined in Definition 7. \(\square \)

Lemma 7

If \(\;\theta (X) = \varphi \;\) then \(\;i {\textsf {[}} {\textsf {lookUp}}('X ', {{\mathrm{enc}}}(\theta ')) \,\triangleleft \, q {\textsf {]}}_{}^{m} \mathop {\Longrightarrow }\limits ^{\;\quad \;} i {\textsf {[}} [\![\varphi ]\!]^\mathbf {m} \,\triangleleft \, q {\textsf {]}}_{}^{m}\) whenever \(\theta \) and \(\theta '\) denote the same substitution.

Proof

By induction on the number of mappings \(\{\varphi _1/X _1\}\ldots \{\varphi _n/X _n\}\) in \(\theta \), analogously to the proof for Lemma 6. \(\square \)

Our proofs use another technical result, Lemma 8, stating that silent actions are, in some sense, preserved when actor-mailbox contents of a free actor are increased; note that the lemma only applies for cases where the mailbox at this free actor decreases in size or remains unaffected by the \(\tau \)-action, specified through the sublist condition \(q '\le q \).

Lemma 8

(Mailbox Increase) \((\upnu \, \tilde{h}) (i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m} \parallel A) \xrightarrow {\;\;\tau \;\;} (\upnu \, \tilde{j}) (i {\textsf {[}} e ' \,\triangleleft \, q ' {\textsf {]}}_{}^{m} \parallel B) \) where \(i \not \in \tilde{h}\) and \(q '\le q \) implies \( (\upnu \, \tilde{h}) (i {\textsf {[}} e \,\triangleleft \, q \mathop {:}v {\textsf {]}}_{}^{m} \parallel A) \xrightarrow {\;\;\tau \;\;} (\upnu \, \tilde{j}) (i {\textsf {[}} e ' \,\triangleleft \, q '\mathop {:}v {\textsf {]}}_{}^{m} \parallel B) \)

Proof

By rule induction on \((\upnu \, \tilde{h}) (i {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{m} \parallel A) \xrightarrow {\;\;\tau \;\;} (\upnu \, \tilde{j}) (i {\textsf {[}} e ' \,\triangleleft \, q ' {\textsf {]}}_{}^{m} \parallel B) \). \(\square \)

7.1 Violation detection

In one direction, Lemma 3 (Violation Detection) from Sect. 6 relies on Lemma 13 (see Appendix A.1) in order to establish the correspondence between violations and the possibility of detections. In the other direction, Lemma 3 relies on Lemma 16 (see Appendix A.1), which establishes a correspondence between violation detections and actual violations, as stated in Definition 3. We recall that Lemma 3 was stated wrt. closed sHML formulas.

Lemma 3 (Violation Detection). Whenever \(A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\) then :

$$\begin{aligned} A, s \models _{\text {v}}\varphi \quad iff \quad i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$

Proof

For the only-if case, we assume \(A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\) and \(A, s \models _{\text {v}}\varphi \) and are required to prove \(i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\). We recall from Sect. 5 that Mon was defined as

$$\begin{aligned} \uplambda {\textit{x}} _{\text {frm}}.{\textit{z}} _{\text {pid}} \,{\textsf {=}}\, {\textsf {spw}}\, \bigl ( [\![{\textit{x}} _{\text {frm}}]\!]^\mathbf {m}({\textsf {nil}})\bigr ) {\textsf {,}}\, {\textsf {mLoop}} ({\textit{z}} _{\text {pid}}). \end{aligned}$$
(23)

and as a result we can deduce (using rules such as App, Spw and Par) that

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} (\upnu \, i) \bigl (i_{\text {mtr}} {\textsf {[}} {\textsf {mLoop}} (i) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \parallel i {\textsf {[}} [\![\varphi ]\!]^\mathbf {m}({\textsf {nil}}) {\textsf {]}}^{\bullet }\bigr ) \end{aligned}$$
(24)

Assumption \(A, s \models _{\text {v}}\varphi \) can be rewritten as \(A, s \models _{\text {v}}\varphi \theta \) for \(\theta =\epsilon \), and thus, by Definition 8 we know \({\textsf {nil}}={{\mathrm{enc}}}(\theta )\). By Lemma 13 we obtain

$$\begin{aligned} (\upnu \, i) \bigl (i_{\text {mtr}} {\textsf {[}} {\textsf {mLoop}} (i) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \parallel i {\textsf {[}} [\![\varphi ]\!]^\mathbf {m}({\textsf {nil}}) {\textsf {]}}^{\bullet }\bigr ) \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$
(25)

and the result thus follows from (24) and (25).

For the if case, we assume \(A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;}\) and \(i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\) and are required to prove \(A, s \models _{\text {v}}\varphi \).

Since \(\varphi \) is closed, we can assume the empty list of substitutions \(\theta =\epsilon \) where, by default, \({{\mathrm{fv}}}(\varphi )\subseteq {{\mathrm{dom}}}(\theta )\) and, by Definition 8, \({\textsf {nil}}={{\mathrm{enc}}}(\theta )\). By (23) we can decompose the transition sequence \(i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\) as

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} (\xrightarrow {\;\;\tau \;\;})^3 (\upnu \, i) \bigl (i_{\text {mtr}} {\textsf {[}} {\textsf {mLoop}} (i) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \parallel i {\textsf {[}} [\![\varphi ]\!]^\mathbf {m}({\textsf {nil}}) {\textsf {]}}^{\bullet }\bigr ) \mathop {\Longrightarrow }\limits ^{{\textsf {{fail}}} {\textsf {!}} } \end{aligned}$$
(26)

The result, i.e. \(A, s \models _{\text {v}}\varphi \), follows from (26) and Lemma 16. \(\square \)

7.2 Detection preservation

In order to prove Lemma 4 from Sect. 6, we are able to require a stronger guarantee, i.e. confluence (Definition 9) under weak transitions for the concurrent monitors described in Definition 7; this is formalised as Lemma 18 in Appendix A.2.

Definition 9

(Confluence modulo Inputs with Identical Recipients)

$$\begin{aligned} {{\mathrm{cnf}}}(A) \;\mathop {=}\limits ^\mathbf{ def }\;\; A \xrightarrow {\gamma _1}A '\text { and } A \xrightarrow {\gamma _2}A '' \;\text { implies }\; {\left\{ \begin{array}{ll} \gamma _1=i \mathtt {?}v _1, \gamma _2=i \mathtt {?}v _2 \;\text { or;} \\ \gamma _1=\gamma _2, A '=A '' \;\text { or;} \\ A '\xrightarrow {\gamma _2}A ''', A ''\xrightarrow {\gamma _1}A ''' \text { for some }A ''' \end{array}\right. } \end{aligned}$$

In Definition 9, a system is deemed confluent if, whenever it can perform two separate actions \(\gamma _1\) and \(\gamma _2\) that are not input actions at the same actor, both residual systems can resp. still perform the other action to reach the same system.

Lemma 18 (Appendix A.2) allows us to prove Lemma 9, and subsequently Lemma 10; the latter Lemma implies Detection Preservation, Lemma 4, used by Theorem 3 of Sect. 6.

Lemma 9

For all \(\varphi \in {\textsc {sHML}} \), \(q \in \textsc {Val} ^*\)

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} A, A \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \text { and } A \xrightarrow {\;\tau \;} B \;\text { implies }\; B \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$

Proof

From \(i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} A \) and Lemma 18 we know that \({{\mathrm{cnf}}}(A)\). The proof is by induction on \(A (\xrightarrow {\;\,\tau \,\;})^n\cdot \xrightarrow {\;{\textsf {{fail}}} {\textsf {!}} \;}\).

\(n=0\)::

We have \(A \xrightarrow {\;{\textsf {{fail}}} {\textsf {!}} \;}A '\) (for some \(A '\)). By \(A \xrightarrow {\;\tau \;} B \) and \({{\mathrm{cnf}}}(A)\) we obtain \(B \xrightarrow {\;{\textsf {{fail}}} {\textsf {!}} \;} B '\) for some \(B '\) where \(A '\xrightarrow {\;\tau \;} B '\).

\(n=k+1\)::

We have \(A \xrightarrow {\;\,\tau \,\;}A '(\xrightarrow {\;\,\tau \,\;})^k\cdot \xrightarrow {\;{\textsf {{fail}}} {\textsf {!}} \;}\) (for some \(A '\)). By \(A \xrightarrow {\;\tau \;}A '\), \(A \xrightarrow {\;\tau \;} B \) and \({{\mathrm{cnf}}}(A)\) we either know that \(B =A '\), in which case the result follows immediately, or else obtain

$$\begin{aligned}&B \xrightarrow {\;\tau \;}A ''\end{aligned}$$
(27)
$$\begin{aligned}&A '\xrightarrow {\;\tau \;}A '' \quad \text { for some }A '' \end{aligned}$$
(28)

In such a case, by \(A \xrightarrow {\;\,\tau \,\;}A '\) and \(i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} A \) we deduce that

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} A ', \end{aligned}$$

and subsequently, by (28), \(A '(\xrightarrow {\;\,\tau \,\;})^k\cdot \xrightarrow {\;{\textsf {{fail}}} {\textsf {!}} \;}\) and I.H. we obtain \(A ''\mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\); the required result then follows from (27). \(\square \)

Lemma 10

(Detection Confluence) For all \(\varphi \in {\textsc {sHML}} \), \(q \in \textsc {Val} ^*\)

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} A, A \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \text { and } A \mathop {\Longrightarrow }\limits ^{\;\quad \;} B \;\text { implies }\; B \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$

Proof

By induction on \(A (\xrightarrow {\;\,\tau \,\;})^n B \) and Lemma 9. \(\square \)

We are now in a position to prove Lemma 4 of Sect. 6.

Lemma 4 (Detection Preservation). For all \(\varphi \in {\textsc {sHML}} \), \(q \in \textsc {Val} ^*\)

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;}\quad and\quad i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} B \;\quad implies \quad B \mathop {\Longrightarrow }\limits ^{\;{\textsf {{fail}}} {\textsf {!}} \;} \end{aligned}$$

Proof

From Lemma 10, for the case where \(i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\;\quad \;} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, q {\textsf {]}}_{}^{*}\). \(\square \)

7.3 Monitor separability

For the proof for Lemma 5 of Sect. 6, we make use of Lemma 2, relating the behaviour of a monitored system to the same system when unmonitored, Lemma 8 delineating behaviour preservation after extending mailbox contents at specific actors, and Lemma 11 (Appendix A), so as to reason about the structure and generic behaviour of synthesised monitors.

Lemma 5 (Monitor Separability). For all basic actors \(\varphi \in {\textsc {sHML}},A \in {\textsc {Actr}} \) where \(i_{\text {mtr}}\) is fresh to \(A\), and \( s \in \bigl (\textsc {Act} \setminus \{ {\textsf {{fail}}} {\textsf {!}} \}\bigr )^*\),

$$\begin{aligned} (\upnu \, i_{\text {mtr}}) \bigl (\lceil A \rceil \parallel i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \mathop {\Longrightarrow }\limits ^{\;s \;} B \text { implies } \exists B ',B '' {\textit{s.t.}} {\left\{ \begin{array}{ll} B \equiv (\upnu \, i_{\text {mtr}}) \bigl (B '\parallel B ''\bigr ) \\ A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} A ' \;{\textit{s.t.}} \; B ' = \lceil A ' \rceil \\ i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\quad } B '' \end{array}\right. } \end{aligned}$$

Proof

By induction on n in \((\upnu \, i_{\text {mtr}}) \bigl (\lceil A \rceil \parallel i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) (\;\xrightarrow {\;{\gamma _n}\;}\;)^n B \), the length of the sequence of actions:

\(n=0\)::

Since \(s =\epsilon \) and \(A =(\upnu \, i_{\text {mtr}}) \bigl (\lceil A \rceil \parallel i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) \), the conditions hold trivially.

\(n=k+1\)::

We have \((\upnu \, i_{\text {mtr}}) \bigl (\lceil A \rceil \parallel i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) {\textsf {]}}^{*}\bigr ) (\;\xrightarrow {\;{\gamma _k}\;}\;)^kC \;\xrightarrow {\;\gamma \;}\;B \). By I.H. we know that

$$\begin{aligned}&C \equiv (\upnu \, i_{\text {mtr}}) \bigl (C '\parallel C ''\bigr ) \end{aligned}$$
(29)
$$\begin{aligned}&A \mathop {\Longrightarrow }\limits ^{\;\;t \;\;} A '' \;{\textit{s.t.}} \; C ' = \lceil A '' \rceil \end{aligned}$$
(30)
$$\begin{aligned}&i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(t) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\quad } C ''\end{aligned}$$
(31)
$$\begin{aligned}&\gamma =\tau \text { implies } t =s \quad \text {and} \quad \gamma =\alpha \text { implies } t \alpha =s \end{aligned}$$
(32)

and by (31) and Lemma 11 we know that

$$\begin{aligned}&C ''\equiv (\upnu \, \mathbf {h}) \bigl (i_{\text {mtr}} {\textsf {[}} e \,\triangleleft \, q {\textsf {]}}_{}^{*}\parallel C '''\bigr ) \end{aligned}$$
(33)
$$\begin{aligned}&{{\mathrm{fId}}}(C '')=\{i_{\text {mtr}} \} \end{aligned}$$
(34)

We proceed by considering the two possible subcases for the structure of \(\gamma \):

\(\gamma =\alpha \)::

By (32) we know that \(s =t \alpha \). By (34) and (33), it must be the case that \(C \equiv (\upnu \, i_{\text {mtr}}) \bigl (C '\parallel C ''\bigr ) \;\xrightarrow {\;\alpha \;}\;B \) happens because

$$\begin{aligned} \text { for some }B '&\; C '\;\xrightarrow {\;\alpha \;}\;B '\end{aligned}$$
(35)
$$\begin{aligned}&B \equiv (\upnu \, i_{\text {mtr}}) \bigl (B '\parallel (\upnu \, {\mathbf {h}}) \bigl (i_{\text {mtr}} {\textsf {[}} e \,\triangleleft \, q \mathop {:}{{\mathrm{tr}}}(\alpha ) {\textsf {]}}_{}^{*}\parallel C '''\bigr ) \bigr ) \end{aligned}$$
(36)

By (35), (30) and Lemma 2 we know that \(\exists A '\) such that \(\lceil A ' \rceil =B '\) and that \(A ''\;\xrightarrow {\;\alpha \;}\;A '\). Thus by (30) and \(s =t \alpha \) we obtain

$$\begin{aligned}&A \mathop {\Longrightarrow }\limits ^{\;\;s \;\;} A ' \;{\textit{s.t.}} \; B ' = \lceil A ' \rceil \end{aligned}$$

By (31), (33) and repeated applications of Lemma 8 we also know that

$$\begin{aligned} i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(t)\mathop {:}{{\mathrm{tr}}}(\alpha ) {\textsf {]}}_{}^{*}=i_{\text {mtr}} {\textsf {[}} {\textsf {Mon}} (\varphi ) \,\triangleleft \, {{\mathrm{tr}}}(s) {\textsf {]}}_{}^{*} \mathop {\Longrightarrow }\limits ^{\quad }\\ (\upnu \, {\mathbf {h}}) \bigl (i_{\text {mtr}} {\textsf {[}} e \,\triangleleft \, q \mathop {:}{{\mathrm{tr}}}(\alpha ) {\textsf {]}}_{}^{*}\parallel C '''\bigr ) = B '' \end{aligned}$$

The result then follows from (36).

\(\gamma =\tau \)::

Analogous to the other case, where we also have the case that the reduction is instigated by \(C ''\), in which case the results follows immediately. \(\square \)

8 Conclusion

We have studied a more intensional notion of correctness for monitor synthesis in a concurrent online setting; we worked close to the actual implementation level of abstraction so as to enhance our confidence in the correctness of our instrumented monitors. More precisely, we have identified a number of additional issues raised when proving monitor correctness in this concurrent setting, and built a tool [21], automating monitor synthesis from a reactive property logic (sHML) to asynchronous monitors in a concurrent language (Erlang), illustrating these issues. The specific contributions of the paper, in order of importance, are:

  1. 1.

    A novel formal definition of monitor correctness, Definition 6, dealing with issues such as system non-determinism and system interference.

  2. 2.

    A proof technique teasing apart aspects of the monitor correctness definition, Lemma 3, Lemma 4 and Lemma 5, allowing us to prove correctness in stages. We subsequently apply this technique to prove the correctness of our tool, Theorem 3.

  3. 3.

    An alternative violation characterisation of the logic, sHML, that is more amenable to runtime analysis and reasoning about monitor correctness, together with a proof of correspondence for this reformulation, Theorem 1.

  4. 4.

    An extension of a formalisation for Erlang describing its tracing semantics, Sect. 2.

  5. 5.

    A formal monitor synthesis definition from sHML formulas to Erlang code, Definition 7.

8.1 Related work

The aforementioned work, [5, 23, 34], discusses monitor synthesis from a different logic, namely LTL, to either pseudocode, automata or Büchi automata; none of this work considers online concurrent monitoring, circumventing issues associated with concurrency and system interference. There is considerable work on runtime monitoring of web services, e.g. [7, 18] verifying the correctness of reactive (communication) properties, similar to those expressed through sHML; to the best of our knowledge, none of this work tackles correct monitor synthesis from a specified logic. In [14], Colombo et al. develop an Erlang RV tool using the EVM tracing mechanism but do not consider the issue of correct monitor generation. Fredlund [22] adapted a variant of HML to specify correctness properties in Erlang, albeit for model checking purposes.

There is also work relating HML formulas with tests, namely [1], but also [10]. Our monitors differ from tests, as in [1], in a number of ways: (i) they are defined in terms of concurrent actors, as opposed to sequential CCS processes; (ii) they analyse systems asynchronously, acting on traces, whereas tests interact with the system directly, forcing certain system behaviour; (iii) they are expected to always detect violations when they occur whereas tests are only required to have one possible execution that detects violations.

In recent work, Bocchi et al. [6] studied monitors that enforce multi-party session types at runtime for high-level specifications of message-passing programs, expressed using a distributed \(\pi \)-calculus. Their methodology differs from ours in a number of ways. In particular, (i) they give a direct operational semantics to their session specifications in terms of an LTS, which allows them to interact directly with the processes that they monitor; by contrast, we synthesise monitors from our specification formulas as programs in the host language, and focus on proving the correctness of this synthesis; (ii) the parallelisation criteria for their session projections is based on the participants being monitored whereas we parallelise on the basis of the structure of the formula, namely across formula conjunctions; (iii) they work at the level of an abstract model, namely a distributed \(\pi \)-calculus, whereas we work at a level of abstraction that is close to actual Erlang code that can be compiled and executed.

There is other work on the decomposition of monitor synthesis. In [4], they synthesise a dedicated monitor from an LTL specification for each synchronous component executing in a system, thereby localising monitoring to a component level. By contrast, our monitor synthesis is agnostic to the internal structure of the monitored system, and decomposition is purely based on the structure of the correctness formulas. In [35], they define a distributed logic for specifying correctness properties of distributed systems and provide a distributed monitor synthesis algorithm for the logic, implemented as actor-based tool called DiAna. Their setting is however different from ours: their systems do not assume a global clock and monitoring work over partially ordered traces (one for each location). Monitoring in the absence of global clocks is also considered in [20], where they develop bisimulation-based coinductive techniques to reason about monitored systems. Crucially, none of these works considers issues relating to the correctness of monitor synthesis studied in this paper.

8.2 Future work

The monitoring semantics of Sect. 2 can be used as a basis to prove the correctness of existing Erlang monitoring tools such as [14, 15]. sHML can also be extended to handle limited, monitorable forms of liveness properties (often termed co-safety properties [28]); the work carried out in [10] provides an ideal starting point. It is also worth exploring mechanisms for synchronous monitoring, as opposed to asynchronous variant studied in this paper. Erlang also facilitates monitor distribution which can be used to lower monitoring overheads [16, 35]. Distributed monitoring can also be used to increase the expressivity of our tool so as to handle correctness properties for distributed programs. However, this poses a departure from our setting because the unique trace described by our framework would be replaced by separate independent traces at each location, where the lack of a total ordering of events may prohibit the detection of certain violations [20].